refactor: Move Data Layer provisioning from Runner to Coordinator #805

Merged
merged 13 commits into main from feat/handle-provisioning-in-coordinator on Jun 19, 2024

Conversation

morgsmccauley (Collaborator) commented Jun 17, 2024

This PR moves Data Layer provisioning from Runner to Coordinator, removing the implicit provisioning step from Runner. The provisioning work itself is still executed within Runner, but Coordinator now triggers and monitors it. Functionally, provisioning is unchanged, but there are some subtle differences in how it is handled:

  • Provisioning will now happen as soon as the Indexer is registered, rather than when the first matched block is executed.
  • Block Streams/Executors will not be started until provisioning succeeds; neither will start while provisioning is pending or after it has failed.

A provisioned_state enum has been added to the Indexer state within Redis. This is used to persist which stage of provisioning the Data Layer is at, and to ensure we only provision once.
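As a rough illustration (the type, variant, and field names below are assumptions, not the actual Coordinator definitions), the state persisted in Redis could look something like:

use serde::{Deserialize, Serialize};

// Hypothetical sketch of the per-Indexer state stored in Redis.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum ProvisionedState {
    Unprovisioned,
    Provisioning,
    Provisioned,
    Failed,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct IndexerState {
    pub account_id: String,
    pub function_name: String,
    pub enabled: bool,
    // Tracks which stage of provisioning the Data Layer is at, and ensures
    // provisioning is only ever triggered once per Indexer.
    pub provisioned_state: ProvisionedState,
}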

Concerns with current implementation

Overall, I'm not happy with the implementation here, but I feel it is the best we can do given the current structure of Coordinator. So as not to block this feature I decided to go forward with this approach, and will create a ticket to refactor/update it later.

Provisioning is triggered within the "new" handler, and then polled within the "existing" handler, which seems a little awkward. The separated handling is necessary since no operation within the control loop (i.e. the Synchroniser) should block, as that would block synchronisation for all other Indexers. So we need to trigger provisioning initially, and then poll for completion on each subsequent pass of the control loop.
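In pseudo-Rust, the split looks roughly like this (a sketch only; the handler and helper names are hypothetical, not the actual Synchroniser code):

// Inside the "new" handler: fire off provisioning without awaiting
// completion, so the shared control loop is never blocked.
self.data_layer_handler.start_provisioning_task(&config).await?;
self.state_manager.set_provisioning(&config).await?;

// Inside the "existing" handler: poll the task on each pass of the loop.
match state.provisioned_state {
    ProvisionedState::Provisioning => self.check_provisioning_task_status(&config).await?,
    // Neither the Block Stream nor the Executor starts while provisioning
    // is pending or after it has failed.
    ProvisionedState::Unprovisioned | ProvisionedState::Failed => return Ok(()),
    // Only once provisioned do we continue with normal synchronisation.
    ProvisionedState::Provisioned => self.sync_block_stream_and_executor(&config, &state).await?,
}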

I feel we have outgrown the current control loop, and I'm planning to refactor it later. Rather than having a single control loop for all Indexers, I'm thinking we can have a dedicated loop for each of them. We could spawn a new task for each Indexer, which then manages its own lifecycle. Each Indexer would then be free to wait as long as it wants without impacting other Indexers, which would let us handle the blocking provisioning step much more elegantly.
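For illustration, that could take roughly the following shape (all names here are hypothetical; nothing like this exists in the codebase yet):

// Hypothetical sketch of the proposed refactor: one spawned task per Indexer.
async fn synchronise(registry: Registry, ctx: Context) -> anyhow::Result<()> {
    for config in registry.fetch_all_indexers().await? {
        // Each Indexer manages its own lifecycle on a dedicated Tokio task,
        // so a slow step only delays that Indexer.
        tokio::spawn(indexer_lifecycle(config, ctx.clone()));
    }
    Ok(())
}

async fn indexer_lifecycle(config: IndexerConfig, ctx: Context) -> anyhow::Result<()> {
    // Waiting here is fine: no other Indexer shares this task, so the
    // blocking provisioning step no longer needs the trigger/poll split.
    ctx.data_layer.provision(&config).await?;
    ctx.block_streams.start(&config).await?;
    ctx.executors.start(&config).await?;

    loop {
        // ...continue monitoring/synchronising this Indexer only.
        tokio::time::sleep(std::time::Duration::from_secs(10)).await;
    }
}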

morgsmccauley changed the title Feat/handle provisioning in coordinator feat: Provisioning Data Layer from Coordinator Jun 17, 2024
morgsmccauley linked an issue Jun 17, 2024 that may be closed by this pull request
morgsmccauley changed the title feat: Provisioning Data Layer from Coordinator feat: Provision Data Layer from Coordinator Jun 17, 2024
morgsmccauley changed the title feat: Provision Data Layer from Coordinator feat: Move Data Layer provisioning from Runner to Coordinator Jun 17, 2024
@@ -126,6 +126,8 @@ describe('Indexer integration', () => {
}
);

await provisioner.provisionUserApi(indexerConfig);
morgsmccauley (Collaborator, Author) commented:
Need to provision manually since it's no longer called within execute

if self.redis_client.indexer_states_set_exists().await? {
    return Ok(());
}
pub async fn migrate(&self) -> anyhow::Result<()> {
morgsmccauley (Collaborator, Author) commented:

Migration to add the provisioned_state enum, defaulting to Provisioned. There is some risk here if an Indexer hasn't actually been provisioned, but I believe that's unlikely, and we can handle it manually if so.
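A rough sketch of how that backfill could look (the helper names are hypothetical; the existence check shown in the snippet above presumably guards against running the migration twice):

// Hypothetical sketch: default existing Indexer states to Provisioned,
// since Runner has already provisioned them implicitly.
pub async fn migrate(&self) -> anyhow::Result<()> {
    for mut state in self.fetch_all_states().await? {
        state.provisioned_state = ProvisionedState::Provisioned;
        self.persist_state(&state).await?;
    }

    Ok(())
}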

morgsmccauley marked this pull request as ready for review June 17, 2024 23:08
morgsmccauley requested a review from a team as a code owner June 17, 2024 23:08
morgsmccauley changed the title feat: Move Data Layer provisioning from Runner to Coordinator refactor: Move Data Layer provisioning from Runner to Coordinator Jun 18, 2024
morgsmccauley merged commit 973efd6 into main Jun 19, 2024
7 checks passed
morgsmccauley deleted the feat/handle-provisioning-in-coordinator branch June 19, 2024 20:35
morgsmccauley added a commit that referenced this pull request Jun 20, 2024
This PR removes Data Layer resources on Indexer Delete. To achieve this,
the following has been added:
- `Provisioner.deprovision()` method which removes: schema, cron jobs,
and if necessary, Hasura source, database, and role
- `DataLayerService.StartDeprovisioningTask` gRPC method
- Calling the above from Coordinator within the delete lifecycle hook

In addition to the above, I've slightly refactored DataLayerService to
better accommodate the addition of de-provisioning:
- `StartDeprovisioningTask` and `StartProvisioningTask` now return
opaque task IDs rather than being keyed by `accountId`/`functionName`,
to avoid conflicts with each other
- There is a single `GetTaskStatus` method which is used for both;
previously it was specific to provisioning

As mentioned in #805, the Coordinator implementation is a little awkward
due to the shared/non-blocking Control Loop. I'll look to refactor and
improve this later.
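To illustrate the task-based flow described above (only the gRPC method names come from this commit; the Rust-side names below are hypothetical), the Coordinator delete hook would trigger de-provisioning on one pass of the Control Loop and poll it on subsequent passes:

// Triggered once when the Indexer is deleted.
async fn start_deprovisioning(&self, config: &IndexerConfig) -> anyhow::Result<()> {
    // StartDeprovisioningTask returns an opaque task ID, so it cannot
    // conflict with a provisioning task for the same accountId/functionName.
    let task_id = self.data_layer_client.start_deprovisioning_task(config).await?;
    self.state_manager.set_deprovisioning(config, task_id).await
}

// Checked on later passes, using the single GetTaskStatus method that is
// now shared by provisioning and de-provisioning tasks.
async fn check_deprovisioning(&self, task_id: &str) -> anyhow::Result<bool> {
    match self.data_layer_client.get_task_status(task_id).await? {
        TaskStatus::Complete => Ok(true),
        TaskStatus::Failed => anyhow::bail!("de-provisioning failed"),
        TaskStatus::Pending => Ok(false),
    }
}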
Successfully merging this pull request may close these issues: Handle provisioning within Coordinator