Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gemini 3h guide updates #401

Merged
merged 24 commits into from
Jan 31, 2024
Merged

Gemini 3h guide updates #401

merged 24 commits into from
Jan 31, 2024

Conversation

EmilFattakhov
Copy link
Contributor

@EmilFattakhov EmilFattakhov commented Jan 16, 2024

This PR reflects the changes introduced in:

  1. PR 2377
  2. PR 2408
  3. PR 2412
  4. PR 2374
  5. PR 2392
  6. PR 2405
  7. PR2414
  8. Subnomicon PR 16

This will require some dogfooding and testing before mering, especially on the operators side (key generation, starting an operator node, starting an operator+farmer node).

For now, requesting a review for any obvious missed flags or inconsistencies. Thanks!

Edit:

Also reflected the PR 2452

Copy link

netlify bot commented Jan 16, 2024

Deploy Preview for subspace-docs-staging ready!

Name Link
🔨 Latest commit 674dd34
🔍 Latest deploy log https://app.netlify.com/sites/subspace-docs-staging/deploys/65ba5e7f6af49b000846a015
😎 Deploy Preview https://deploy-preview-401--subspace-docs-staging.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@EmilFattakhov EmilFattakhov added documentation Improvements or additions to documentation enhancement New feature or request do not merge and removed enhancement New feature or request labels Jan 16, 2024
@EmilFattakhov EmilFattakhov marked this pull request as ready for review January 18, 2024 21:36
@EmilFattakhov EmilFattakhov changed the title WIP Gemini 3h guide updates Gemini 3h guide updates Jan 18, 2024
Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked through everything briefly and left a few comments

@@ -80,3 +80,15 @@ You can use any file system with default settings, we advise against their use f
In order to disable **CoW** on **BTRFS**, run the command `chattr +C /your_farmer_directory/`.

In case of **ZFS**, you can try adjusting the **recordsize** by running `zfs set recordsize=128K tank/dataset`, or creating a separate **dataset** for the farmer with adjusted **recordsize** parameters.


### Folder structure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How important is this information to users? They can see the structure when they run software anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Unless we add some explanation of what the folders are for this is pretty much self-documenting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important in the context of keypair location for operators, but I agree that they look a bit out-of-context here. Removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not important in the context of keypair location because farmers will not need to know or care about keypair location anymore. There is no --keypair-path anywhere now.

Comment on lines 192 to 194
--bootstrap-nodes /ip4/3.87.28.170/tcp/40333/p2p/12D3KooWGHtULvhdKMZtzigSK1438uWXPj9rBQHVzTaKMWv1WRXp \`
--bootstrap-nodes /ip4/140.82.45.36/tcp/40333/p2p/12D3KooWJxB2PVzyNqZywoCWAMLGhKr2LCbFMXbAPzUPQZon7yt1 \`
--bootstrap-nodes /ip4/167.179.116.135/tcp/40333/p2p/12D3KooWHsRae3ip2uQAUPtpMcStHGGw777SJT3nu9maFmtsGry7 \`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really need to replace this mess of IP addresses with readable domain names for 3h. Also note that these can be in the chain spec and we should add them for 3h such that for domains created by us specifying bootstrap nodes will not be required, but have "advanced" section about adding these for custom domains (bootstrap nodes for which we will probably not run at all).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who can help us with that? Is it @DaMandal0rian?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, he should create it on infrastructure level and someone needs to update chain spec afterwards.

Copy link
Contributor

@DaMandal0rian DaMandal0rian Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jim-counter these will be in the format:

"--bootstrap-nodes", "/dns/bootstrap-0.nova.gemini-3h.subspace.network/tcp/30334/p2p/ID"
"--bootstrap-nodes", "/dns/bootstrap-1.nova.gemini-3h.subspace.network/tcp/30334/p2p/ID"
"--bootstrap-nodes", "/dns/bootstrap-2.nova.gemini-3h.subspace.network/tcp/30334/p2p/ID"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much Muhammad!

"--base-path", "/var/subspace",
"--port", "30333",
"--listen-on", "30333",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in multiple other places syntax is wrong, it is not just port anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the --port has been replaced by --listen-on. Perhaps some github quirk, just double checked that all --port commands were indeed replaced.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking about value, not the key name. The value is wrong.

"--dsn-listen-on", "/ip4/0.0.0.0/udp/30433/quic-v1",
"--dsn-listen-on", "/ip4/0.0.0.0/tcp/30433",
"--rpc-cors", "all",
"--rpc-methods", "unsafe",
"--rpc-external",
"--rpc-listen-on",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is and in many other places syntax is wrong, this is not a boolean option

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what we should replace --rpc-external with in this case? I thought it could be used with or without specifying the port, as it has a default value unless specified explicitly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not. --rpc-listen-on replaces a bunch of options, it is not a simple renaming. --help explains the syntax of each option.

--rpc-listen-on is not a port, it is IP:port. If you want to listen on all IPs then you specify 0.0.0.0 as IP explicitly. As to port - you specify the port you want, probably 9944, so the value is 0.0.0.0:9944 here. In future we'll make RPC optional (Substrate currently always starts it unconditionally) and will not run at all unless this option is specified.

--chain gemini-3g \
./subspace-node-ubuntu-x86_64-skylake-gemini-3g-2024-jan-08 run \
--base-path path_to_node \
--chain gemini-3h \
--blocks-pruning 256 \
--state-pruning archive-canonical \
--no-private-ipv4 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this option has been removed or changed as it no longer works as a CLI parameter.

cc @nazar-pc

@@ -18,7 +18,7 @@ An operator needs a key pair to participate in bundle production.
You can create a key using the following command:

```bash
target/production/subspace-node key generate --scheme sr25519
target/production/subspace-node run key generate --scheme sr25519
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nazar-pc should this work? When I try to ./subspace-node run key ... I get an error: unexpected argument 'key' found.

Copy link
Member

@nazar-pc nazar-pc Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, run x doesn't make sense, that command is for running exclusively. All key-related commands are gone except importing, but now I see that we need to bring generation back. I'll start a Slack thread about this to figure out what the interface should look like.

@EmilFattakhov
Copy link
Contributor Author

EmilFattakhov commented Jan 23, 2024

I will take this evening to test the commands with the 3h build and push the update either later today or tomorrow. Appreciate your feedback everyone!

@@ -33,7 +33,7 @@ After downloading both files that suit your system, start a node using your pref
You can do this with the following command:

```bash
./your_subspace_node_path --dev --alice --rpc-port 9444 -- --domain-id 3 --dev --rpc-port 8545
./your_subspace_node_path run --dev --rpc-listen-on 9444 -- --domain-id 3 --dev --rpc-listen-on 8545
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format for --rpc-listen-on is 127.0.0.1:9944, not just the port.

Copy link
Member

@jim-counter jim-counter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested some changes. I have not made them in version-latest but I expect they need to be there too.


If you were running a node previously, and want to switch to a new network, please perform these steps and then follow the guide again:
```bash
# Replace `FARMER_FILE_NAME` with the name of the farmer file you downloaded from releases
./FARMER_FILE_NAME wipe PATH_TO_FARM
# Replace `NODE_FILE_NAME` with the name of the node file you downloaded from releases
./NODE_FILE_NAME purge-chain --chain gemini-3g
./NODE_FILE_NAME wipe PATH_TO_NODE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is PATH_TO_NODE the value we use for --base-path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is the same value, just positional argument in this case rather than named option

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's named inconsistently in different places. And PATH_TO_NODE_DATA or even NODE_DATA_PATH feels like a better explanation.

Just checked and we use the latter elsewhere. I think we should standardise on that.

Previously plotting/replotting thread pools were created for each farm separately even though only a configured number of them can be used at a time (by default just one).
With the introduction of NUMA support, the farmer application has a thread pool manager that will create a necessary number of thread pools that will be allocated to currently plotting/replotting farms.
When a thread pool is created, it is assigned to a set of CPU cores and will only be able to use those cores. Pinning doesn’t pin threads to cores 1:1, instead, OS is free to move threads between cores, but only within CPU cores allocated for thread pool. This will ensure plotting for a particular sector only happens on a particular CPU/NUMA node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When a thread pool is created, it is assigned to a set of CPU cores and will only be able to use those cores. Pinning doesn’t pin threads to cores 1:1, instead, OS is free to move threads between cores, but only within CPU cores allocated for thread pool. This will ensure plotting for a particular sector only happens on a particular CPU/NUMA node.
When a thread pool is created, it is assigned to a set of CPU cores and will only be able to use those cores. Pinning doesn’t pin threads to cores 1:1, instead, the OS is free to move threads between cores, but only within CPU cores allocated for the thread pool. This will ensure plotting for a particular sector only happens on a particular CPU/NUMA node.

#### Enabling NUMA on Windows/Linux machines

On Linux and Windows, the farmer will detect NUMA systems and create a number of thread pools that correspond to a number of NUMA nodes. This means the default behavior will change for large CPUs and consume more memory as a result, but that can be changed to the previous behavior with CLI options if desired.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
On Linux and Windows, the farmer will detect NUMA systems and create a number of thread pools that correspond to a number of NUMA nodes. This means the default behavior will change for large CPUs and consume more memory as a result, but that can be changed to the previous behavior with CLI options if desired.
On Linux and Windows, the farmer will detect NUMA systems and create a number of thread pools that correspond to the number of NUMA nodes. This means the default behavior will change for large CPUs and consume more memory as a result, but that can be changed to the previous behavior with CLI options if desired.

To use NUMA, you need to enable it via the BIOS of your motherboard for the farmer to know it exists. This option is present in motherboards for **Threadripper/Epyc processors** but might exist in others too. If you don’t enable it, both OS and farmer will think you have a single UMA processor and will not be able to apply optimizations.

To read more about NUMA support and benefits it provides for large CPUs read [this forum post](https://forum.subspace.network/t/numa-support-is-coming/2299?u=nazar-pc).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To read more about NUMA support and benefits it provides for large CPUs read [this forum post](https://forum.subspace.network/t/numa-support-is-coming/2299?u=nazar-pc).
To read more about NUMA support and the benefits it provides for large CPUs read [this forum post](https://forum.subspace.network/t/numa-support-is-coming/2299?u=nazar-pc).


**You have successfully generated an operator key, congratulations!**

The keys were generated in the subfolder of **PATH_TO_NODE** under `/domains/domainID/keystore`. If you don't see the generated keypair in the subfolder, something went wrong.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we include a warning about how important it is that users record the seed phrase output to the CLI?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, though CLI will already print a warning on its own. Also you can always rotate the key, it is just important to not leak it or else someone can get operator slashed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a short note below.


#### Insert key to Keystore:
The key generated above needs to be added to the Keystore so that the operator node can use it to participate in bundle production.
This might be useful if you decicded to switch domains and already have **the secret phrase**. Read more about switching domain in the next section.
Copy link
Member

@jim-counter jim-counter Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This might be useful if you decicded to switch domains and already have **the secret phrase**. Read more about switching domain in the next section.
This might be useful if you decided to switch domains or already have **the secret phrase**. Read more about switching domain in the next section.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "or already have" because keys can be generated externally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "or already have" because keys can be generated externally.

Updated.

docs/farming-&-staking/staking/staking.md Outdated Show resolved Hide resolved

If you were running a node previously, and want to switch to a new network, please perform these steps and then follow the guide again:
```bash
# Replace `FARMER_FILE_NAME` with the name of the farmer file you downloaded from releases
./FARMER_FILE_NAME wipe PATH_TO_FARM
# Replace `NODE_FILE_NAME` with the name of the node file you downloaded from releases
./NODE_FILE_NAME purge-chain --chain gemini-3g
./NODE_FILE_NAME wipe PATH_TO_NODE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is the same value, just positional argument in this case rather than named option


**You have successfully generated an operator key, congratulations!**

The keys were generated in the subfolder of **PATH_TO_NODE** under `/domains/domainID/keystore`. If you don't see the generated keypair in the subfolder, something went wrong.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, though CLI will already print a warning on its own. Also you can always rotate the key, it is just important to not leak it or else someone can get operator slashed.


#### Insert key to Keystore:
The key generated above needs to be added to the Keystore so that the operator node can use it to participate in bundle production.
This might be useful if you decicded to switch domains and already have **the secret phrase**. Read more about switching domain in the next section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "or already have" because keys can be generated externally.

@EmilFattakhov
Copy link
Contributor Author

Everything should be updated now with all comments resolved. Thanks again for the careful reviews, everyone!

Copy link
Member

@jim-counter jim-counter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small items and there's an existing unresolved conversation.

@EmilFattakhov
Copy link
Contributor Author

Thanks @jim-counter, hopefully a last ping from me 😅

Copy link
Member

@jim-counter jim-counter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@EmilFattakhov EmilFattakhov merged commit 5e4ed11 into main Jan 31, 2024
5 checks passed
@EmilFattakhov EmilFattakhov deleted the 3h_guide_updates branch January 31, 2024 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do not merge documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants