deploy 5 CockroachDB nodes from RSS #3450

Merged: 16 commits merged from dap/more-crdb into main on Jul 3, 2023
Conversation

@davepacheco (Collaborator) commented Jun 29, 2023:

Depends on #3446. Replaces #2956. Fixes #727.

This changes RSS to deploy 5 CockroachDB nodes and sets the default replication factor to 5. Quoting the CockroachDB production checklist:

> When deploying in a single availability zone: ... To be able to tolerate 2 simultaneous node failures, use at least 5 nodes and increase the default replication factor for user data to 5. The replication factor for important internal data is 5 by default, so no adjustments are needed for internal data. In this case, if 2 nodes fail at the same time, each range retains 3 of its 5 replicas, a majority.
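(A quick check of the arithmetic behind that recommendation, not from the checklist itself: a Raft range with n replicas keeps a write quorum as long as a majority of replicas survives, so tolerating f simultaneous node failures requires n ≥ 2f + 1.)

$$
f = 2 \;\Rightarrow\; n \ge 2f + 1 = 5, \qquad \text{surviving replicas} = 5 - 2 = 3 = \left\lfloor 5/2 \right\rfloor + 1 \;\text{(a majority)}
$$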

@davepacheco requested a review from @smklein on June 29, 2023 04:45
@davepacheco marked this pull request as draft on June 29, 2023 05:15
@davepacheco (Collaborator, Author) commented:

This needs a bit more work. Sled Agent makes some assumptions that either need to be relaxed or worked around (e.g., that there will only be one CockroachDB dataset in a given zpool).

@smklein (Collaborator) commented Jun 29, 2023:

> This needs a bit more work. Sled Agent makes some assumptions that either need to be relaxed or worked around (e.g., that there will only be one CockroachDB dataset in a given zpool).

Is this still an issue? I saw you bumped the number of zpools in the simulated environment from three to five, which looks like the right call to me.

@davepacheco (Collaborator, Author) commented:

> > This needs a bit more work. Sled Agent makes some assumptions that either need to be relaxed or worked around (e.g., that there will only be one CockroachDB dataset in a given zpool).
>
> Is this still an issue? I saw you bumped the number of zpools in the simulated environment from three to five, which looks like the right call to me.

No, this is fixed with a change to the dependent PR (#3446) to allocate from separate zpools when possible, plus a change in this PR to set up more virtual zpools.
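To illustrate the kind of allocation change #3446 describes, a minimal sketch (hypothetical names, not the actual implementation) of picking a distinct zpool for each CockroachDB dataset:

```rust
// Hypothetical sketch, not the actual #3446 change: choose a distinct
// zpool for each CockroachDB dataset so no two nodes share a pool.
fn choose_distinct_zpools(
    available: &[String], // assumed: zpool names like "oxp_<uuid>"
    count: usize,
) -> Result<Vec<String>, String> {
    if available.len() < count {
        return Err(format!(
            "need {} distinct zpools for CockroachDB, found {}",
            count,
            available.len()
        ));
    }
    // Take the first `count` pools; a real allocator might also weigh
    // free space or existing dataset placement.
    Ok(available.iter().take(count).cloned().collect())
}
```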

@davepacheco marked this pull request as ready for review on June 30, 2023 21:47
@@ -47,7 +47,7 @@ function ensure_zpools {
   echo "Zpool: [$ZPOOL]"
   VDEV_PATH="$OMICRON_TOP/$ZPOOL.vdev"
   if ! [[ -f "$VDEV_PATH" ]]; then
-    dd if=/dev/zero of="$VDEV_PATH" bs=1 count=0 seek=10G
+    dd if=/dev/zero of="$VDEV_PATH" bs=1 count=0 seek=6G
@davepacheco (Collaborator, Author):

I tried to keep the total space used by the zpools the same: it used to be 3x10G (30G total); now it's 5x6G (30G total). (With count=0 and seek, dd creates a sparse file, so this sets a size cap rather than allocating space up front.)

Comment on lines +36 to +37
"oxp_14b4dc87-ab46-49fb-a4b4-d361ae214c03",
"oxp_24b4dc87-ab46-49fb-a4b4-d361ae214c03",
@davepacheco (Collaborator, Author):

We need two more pools in the system in order to deploy a 5-node CockroachDB cluster.

@@ -240,7 +240,7 @@ pub async fn build_client() -> Result<oxide_client::Client> {
   })
 },
 &Duration::from_secs(1),
-&Duration::from_secs(300),
+&Duration::from_secs(600),
@davepacheco (Collaborator, Author):

We're deploying two extra CockroachDB nodes, so it can take a little longer for the system to come up now.
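For context, the two Durations passed above appear to be a poll interval and an overall timeout. A minimal sketch of that wait-until-ready pattern, with illustrative names rather than the helper this code actually calls:

```rust
use std::time::{Duration, Instant};

// Illustrative sketch of a poll-until-ready loop: check a condition every
// `poll_interval`, giving up once `max_wait` has elapsed.
fn wait_for<F>(
    mut condition: F,
    poll_interval: Duration,
    max_wait: Duration,
) -> Result<(), String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + max_wait;
    while Instant::now() < deadline {
        if condition() {
            return Ok(());
        }
        std::thread::sleep(poll_interval);
    }
    Err(format!("condition not met within {:?}", max_wait))
}
```

Doubling the second argument from 300 to 600 seconds gives the larger cluster twice as long to become reachable without changing how often it is polled.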

@smklein (Collaborator):

I can look into parallelizing service bringup on the sled-agent side, if that would help mitigate this?

@davepacheco (Collaborator, Author):

I expect it would. No pressure on my part in terms of urgency.

@smklein (Collaborator):

Sounds good; let's not block this PR, but I can do this as a follow-up.
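A hedged sketch of what that follow-up might look like (illustrative types and names, not sled-agent's actual API): starting services concurrently, so total bringup time approaches the slowest single service rather than the sum of all of them.

```rust
use futures::future::try_join_all;

// Illustrative stand-in for whatever identifies one service to bring up.
struct ServiceConfig {
    name: String,
}

// Stand-in for the real per-service bringup, which runs sequentially today.
async fn start_service(svc: ServiceConfig) -> Result<(), String> {
    println!("starting {}", svc.name);
    Ok(())
}

// try_join_all polls all the futures concurrently and fails fast if any
// single service fails to start.
async fn bring_up_services(services: Vec<ServiceConfig>) -> Result<(), String> {
    try_join_all(services.into_iter().map(start_service)).await?;
    Ok(())
}
```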

Base automatically changed from dap/multi-crdb to main June 30, 2023 22:31
@smklein (Collaborator) left a comment:
Looks great!


* Configure a replication factor of 5 to ensure that the system can maintain
* availability in the face of any two node failures.
*/
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
@smklein (Collaborator):

For local CRDB testing -- e.g., with the omicron-dev tool -- what does this do?

@davepacheco (Collaborator, Author):

Good question. I expect it will have no practical impact. The default replication factor is 3, which already exceeds the number of nodes we ever have today. The impact is that CockroachDB considers all of its data under-replicated: it reports it as such (e.g., here), and if new nodes show up, it will replicate data until nothing is under-replicated.

One could imagine that this would instead disallow you from writing any data to the cluster when it can't satisfy this constraint. That doesn't seem to be how it works. If it did, the tests wouldn't run at all, I think.
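If one wants to observe that under-replication from SQL on a local cluster, a hedged sketch (the crdb_internal query follows CockroachDB's troubleshooting docs; the connection string and the use of tokio_postgres here are illustrative, not how omicron connects):

```rust
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    // Connection details are illustrative; point this at wherever the
    // local test cluster is listening.
    let (client, connection) =
        tokio_postgres::connect("host=localhost port=26257 user=root", NoTls).await?;
    // The connection object performs the actual I/O; drive it in the background.
    tokio::spawn(connection);

    // Query adapted from CockroachDB's troubleshooting docs for counting
    // under-replicated ranges (assumption: available in this version).
    let row = client
        .query_one(
            "SELECT sum((metrics->>'ranges.underreplicated')::DECIMAL)::INT8 \
             FROM crdb_internal.kv_store_status",
            &[],
        )
        .await?;
    let underreplicated: Option<i64> = row.get(0);
    // On a single-node dev cluster with num_replicas = 5, expect this to be
    // nonzero: the data is under-replicated but still writable.
    println!("under-replicated ranges: {:?}", underreplicated);
    Ok(())
}
```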

@davepacheco merged commit 92258b4 into main on Jul 3, 2023
@davepacheco deleted the dap/more-crdb branch on July 3, 2023 17:03
Successfully merging this pull request may close these issues:

- [sled-agent] Initialize multi-node CockroachDB, rather than single-node (#727)