Only one/few collators building blocks if parachain is underscheduled #6667

Open
alindima opened this issue Nov 27, 2024 · 6 comments
Labels
I5-enhancement An additional feature request. T9-cumulus This PR/Issue is related to cumulus.

Comments

@alindima
Contributor

Assume using the slot-based collator and a parachain with slot duration of 2 seconds.
In order to fully take advantage of elastic scaling, the parachain should have 3 cores scheduled at all times.
But what happens if it doesn't, for some reason (by accident or just the parachain decides to scale down based on lower demand)?

In this case, only one collator will be authoring blocks, which is not good.
This is because:

1. Collator A builds the first block.
2. Collator B's slot kicks in, but collator A already built a block for this core, so B goes back to sleep.
3. Collator C's slot kicks in, but collator A already built a block for this core, so C goes back to sleep.
4. Collator A's slot kicks in again; there's a new relay chain block and a new opportunity to build a block.

In the end, only collator A will build blocks.
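
The sequence above can be reproduced with a small simulation. This is only a sketch of the scenario, not the slot-based collator's actual logic: it assumes a 6-second relay chain block time, round-robin slot ownership, and that a slot owner builds only while the current relay block still has an unused scheduled core. The function name `authored_blocks` and its parameters are hypothetical.

```python
RELAY_BLOCK_MS = 6000  # assumed relay chain block time (6 s)


def authored_blocks(slot_ms, collators, scheduled_cores, relay_blocks):
    """Count blocks per collator under the simplified model described above."""
    slots_per_relay_block = RELAY_BLOCK_MS // slot_ms
    counts = {name: 0 for name in collators}
    for relay_block in range(relay_blocks):
        cores_left = scheduled_cores  # cores available for this relay block
        for offset in range(slots_per_relay_block):
            slot = relay_block * slots_per_relay_block + offset
            owner = collators[slot % len(collators)]  # round-robin slot owner
            if cores_left > 0:
                counts[owner] += 1
                cores_left -= 1
            # otherwise every core is already used, so the owner goes back to sleep
    return counts


underscheduled = authored_blocks(2000, ["A", "B", "C"], scheduled_cores=1, relay_blocks=30)
fully_scheduled = authored_blocks(2000, ["A", "B", "C"], scheduled_cores=3, relay_blocks=30)
print(underscheduled)   # {'A': 30, 'B': 0, 'C': 0}
print(fully_scheduled)  # {'A': 30, 'B': 30, 'C': 30}
```

With one scheduled core, A's slot always lines up with the start of a relay block, so B and C never get a free core.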

This can be fixed by having some dynamic slot duration based on the amount of scheduled cores.
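
One possible reading of this idea, purely as a sketch: derive the slot duration from the relay block time divided by the number of cores scheduled, so that there is exactly one slot per core. The formula, the 6-second relay block time, the running slot counter, and the names `dynamic_slot_ms` and `authors_per_relay_block` are all assumptions for illustration; the issue does not specify how the dynamic duration would be computed or how slot numbers would be kept consistent.

```python
RELAY_BLOCK_MS = 6000  # assumed relay chain block time (6 s)


def dynamic_slot_ms(scheduled_cores):
    """Hypothetical rule: one slot per scheduled core within a relay block."""
    return RELAY_BLOCK_MS // max(scheduled_cores, 1)


def authors_per_relay_block(collators, cores_per_relay_block):
    """List the slot owners for each relay block when every core gets its own slot."""
    slot = 0  # simplified running slot counter
    schedule = []
    for cores in cores_per_relay_block:
        owners = [collators[(slot + i) % len(collators)] for i in range(cores)]
        schedule.append(owners)
        slot += cores  # the next relay block continues the rotation
    return schedule


print(dynamic_slot_ms(3))  # 2000
print(dynamic_slot_ms(1))  # 6000
print(authors_per_relay_block(["A", "B", "C"], [1, 1, 1, 3]))
# [['A'], ['B'], ['C'], ['A', 'B', 'C']]
```

Under this assumption, scaling down to one core stretches the slot to a full relay block and the authoring rotates across all three collators.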

CC: @sandreim @skunert

@alindima alindima added I5-enhancement An additional feature request. T9-cumulus This PR/Issue is related to cumulus. labels Nov 27, 2024
@skunert
Contributor

skunert commented Nov 27, 2024

This does not sound like something we need to solve. Your problem statement assumes that we only have 3 collators. As soon as the slot_duration / core_count / collator_count does not match up anymore, someone else will author (e.g. in your example, if we have collator D and E, someone else will author).

Same could happen if we have no elastic scaling but are only scheduled on a core every x blocks.

@alindima
Contributor Author

alindima commented Nov 27, 2024

> This does not sound like something we need to solve. Your problem statement assumes that we only have 3 collators. As soon as the slot_duration / core_count / collator_count does not match up anymore, someone else will author (e.g. in your example, if we have collator D and E, someone else will author).

Indeed it depends on this correlation. But we don't have any restriction AFAIK on the exact number of collators a parachain should have.
Another example would be slot duration of 3 seconds, 1 scheduled core, 4 collators. Only A and C will author.

So basically it seems a problem if the number of expected cores is a divisor of the number of collators
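
A quick way to check which collators end up authoring, assuming one scheduled core, round-robin slot ownership, and that only the first slot of each relay block finds a free core. The function `active_authors` is hypothetical. Under this simplified model, the number of active authors works out to the collator count divided by the greatest common divisor of the slots per relay block and the collator count.

```python
from math import gcd


def active_authors(slots_per_relay_block, collators):
    """Collators that own the first slot of some relay block, in order of appearance."""
    n = len(collators)
    seen = []
    slot = 0
    while collators[slot % n] not in seen:
        seen.append(collators[slot % n])
        slot += slots_per_relay_block  # jump to the first slot of the next relay block
    return seen


four = ["A", "B", "C", "D"]
print(active_authors(2, four))                       # ['A', 'C']  (3 s slots, 4 collators)
print(active_authors(3, ["A", "B", "C"]))            # ['A']       (2 s slots, 3 collators)
print(active_authors(3, ["A", "B", "C", "D", "E"]))  # ['A', 'D', 'B', 'E', 'C']
print(len(active_authors(2, four)) == len(four) // gcd(2, len(four)))  # True
```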

> Same could happen if we have no elastic scaling but are only scheduled on a core every x blocks.

Agreed, but in my view this is another reason why this should be solved. It doesn't sound too good if a core-sharing parachain that has one assignment every 5 blocks and 5 collators only uses one of the collators to author (which is a realistic scenario, right?). What if that one collator is down? Even if it's not, it will be reaping all of the block authoring rewards or could pick which transactions to include.
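
The core-sharing case can be sketched the same way. The assumptions here are that there is one slot per relay block (slot duration equal to the relay block time), that slot ownership is round-robin, and that the core assignment lands on relay blocks whose number is a multiple of the period. The function `core_sharing_authors` is hypothetical.

```python
def core_sharing_authors(collators, assignment_period, relay_blocks):
    """Count blocks per collator when the core is assigned every `assignment_period` relay blocks."""
    counts = {name: 0 for name in collators}
    for relay_block in range(0, relay_blocks, assignment_period):
        owner = collators[relay_block % len(collators)]  # one slot per relay block
        counts[owner] += 1
    return counts


five = ["A", "B", "C", "D", "E"]
print(core_sharing_authors(five, assignment_period=5, relay_blocks=100))
# {'A': 20, 'B': 0, 'C': 0, 'D': 0, 'E': 0}
print(core_sharing_authors(five, assignment_period=4, relay_blocks=100))
# {'A': 5, 'B': 5, 'C': 5, 'D': 5, 'E': 5}
```

When the assignment period and the collator count share no common factor, the rotation spreads evenly; when they match, a single collator authors everything.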

@skunert
Contributor

skunert commented Nov 27, 2024

Fair enough, I think we can keep the issue open but I don't see it as high priority for now.

@sandreim
Contributor

sandreim commented Nov 27, 2024

Yeah, low priority for now, but it will become higher once we have implemented a transaction streaming solution.

As for the solution, I think some flavor of @eskimor's proposal in #4813 is a good fix for the problem. The gist of it relevant to this issue here is in the Limitations/Consequences section.

@eskimor
Member

eskimor commented Nov 27, 2024

Good catch and indeed one more reason to not decrease slot times too much, but instead let one collator build all the blocks for a given relay block.

@alindima
Contributor Author

> Good catch and indeed one more reason to not decrease slot times too much, but instead let one collator build all the blocks for a given relay block.

What about the case with core-sharing? I think this needs a separate solution.

4 participants