[BUG] - 2024.9.1 Upgrade Bug with Auto-Creation of Group Directory Role #2766
Comments
@kenafoster were you able to upgrade to 2024.9.1 using the workaround? @viniciusdc do you have any ideas on how to fix this?
Thanks, @kenafoster, for the fantastic attention to detail in this issue. That's awesome. It looks like the upgrade path assumes the role already exists before it proceeds with the rest of the logic. That's a flaw on my end from when I tested it. Hopefully, things like this will soon be caught by our updates to release testing. To answer your and @marcelovilla's questions, the way to address this is to create the role prior to the actual deployment. This can be done with an extra check in the upgrade command, but I am worried that Terraform might later complain about the role's existence while applying the changes. I will run a test this afternoon and follow up here; if that all succeeds, then the idea above is the way to go.
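The "extra check in the upgrade command" idea could look roughly like the sketch below. This is not Nebari's actual upgrade code: `FakeKeycloakAdmin`, `ensure_role`, and `assign_role_to_groups` are hypothetical names, and an in-memory stand-in replaces the real Keycloak client so the logic can be run in isolation. It does not address the Terraform concern raised above.

```python
ROLE_NAME = "allow-group-directory-creation-role"


class FakeKeycloakAdmin:
    """In-memory stand-in for a Keycloak admin client (hypothetical)."""

    def __init__(self, groups):
        self.roles = set()
        self.group_roles = {group: set() for group in groups}

    def role_exists(self, name):
        return name in self.roles

    def create_role(self, name):
        self.roles.add(name)

    def assign_role(self, group, name):
        if name not in self.roles:
            # Mirrors the "404: Could not find role" failure from the report.
            raise LookupError(f"Could not find role: {name}")
        self.group_roles[group].add(name)


def ensure_role(admin, name):
    """Create the role only if it is missing; return True if it was created."""
    if admin.role_exists(name):
        return False
    admin.create_role(name)
    return True


def assign_role_to_groups(admin, name):
    """Make sure the role exists, then assign it to every current group."""
    created = ensure_role(admin, name)
    for group in admin.group_roles:
        admin.assign_role(group, name)
    return created


admin = FakeKeycloakAdmin(["admin", "analyst", "developer"])
created = assign_role_to_groups(admin, ROLE_NAME)
```

Because `ensure_role` is a no-op when the role is already present, running the step a second time does not fail, which is the property the upgrade path currently lacks.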
@marcelovilla FYI the 2024.9.1 deploy fails if you have previously created the role manually. While I guess you could try to manually import it into Terraform state and then re-run the deploy, I found it easiest to manually delete the role, run the deploy (which then succeeded), and then re-create the role and manually assign it to groups. Given that the upgrade auto-assign step doesn't work if the client role doesn't exist, and the subsequent deploy step doesn't work if the client role does exist, I don't think there's a path where this feature is functional in this implementation.
Describe the bug
When doing `nebari upgrade` from 2024.7.1 to 2024.9.1, there is a step which asks "Would you like Nebari to assign the corresponding role to all of your current groups automatically? [y/N] (N): ". This is because in 2024.9.1, Keycloak groups will not automatically get JupyterHub shared directories created/mounted for them UNLESS the Keycloak group is manually assigned a JupyterHub Client Role (three groups get directories by default: admin, analyst, developer).
There are two issues:
First, if you choose "Y", Nebari then has to interact directly with the Keycloak REST API using the credentials in nebari-config. If you have changed the Keycloak root password to something outside of the config (a good practice, especially if you're committing the config file to a repo for CI/CD), then it uses an invalid password. The way to work around this is to retrieve the valid password and temporarily make it the value of `security.keycloak.initial_root_password`... just be careful not to commit the real password. Maybe there's a fix for this? The Keycloak Terraform stages are able to interact with the API even with the root password not stored in plaintext config... I haven't looked into exactly how that happens. In any case, even if using the nebari-config file value is the intended behavior or the only possible solution, maybe some help text would aid users in troubleshooting (or at the very least I hope they come across this issue!).
Second, once you have a valid Keycloak credential, the next problem is that "allow-group-directory-creation-role" doesn't exist. The commit 6a16cb8 that adds this role isn't present in 2024.7.1. So you have a chicken-and-egg problem... you can't get the role until the upgrade, and you can't upgrade without the role.
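For the first issue, one possible shape for a fix is to let the upgrade step take the root password from somewhere other than the config file, and to print a hint when authentication fails. The sketch below is only an illustration under stated assumptions: the environment variable name `KEYCLOAK_ROOT_PASSWORD_OVERRIDE` and both function names are hypothetical and are not existing Nebari settings or APIs.

```python
import os

# Hypothetical variable name, not a documented Nebari setting.
OVERRIDE_ENV_VAR = "KEYCLOAK_ROOT_PASSWORD_OVERRIDE"


def resolve_root_password(config, env=None):
    """Prefer an environment override; fall back to the value in the config.

    Returns a (password, source) tuple so a caller can report where the
    credential came from if Keycloak rejects it.
    """
    env = os.environ if env is None else env
    override = env.get(OVERRIDE_ENV_VAR)
    if override:
        return override, "environment"
    password = config["security"]["keycloak"]["initial_root_password"]
    return password, "config"


def auth_failure_hint(source):
    """Help text to show after a 401 from Keycloak."""
    if source == "config":
        return (
            "Keycloak rejected security.keycloak.initial_root_password. "
            "If the root password was changed after the initial deployment, "
            f"supply the current one via {OVERRIDE_ENV_VAR}."
        )
    return f"Keycloak rejected the password supplied via {OVERRIDE_ENV_VAR}."


config = {"security": {"keycloak": {"initial_root_password": "stale-password"}}}
password, source = resolve_root_password(
    config, env={OVERRIDE_ENV_VAR: "current-password"}
)
```

Keeping the real password in the environment rather than in the file avoids the risk of committing it by accident.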
The workaround is to manually create the role in Keycloak. I'm currently in the process of finishing the upgrade via CI/CD... once 2024.9.1 actually deploys to AWS, I'll see whether this creates any errors.
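The role can be created in the Keycloak admin console; for anyone who prefers to script it, the sketch below builds the two requests involved using the Keycloak Admin REST API (a password-grant token request against the `admin-cli` client, then a `POST` to the client's `roles` endpoint). The base URL, realm name, client UUID, and token shown are placeholders and assumptions, not values taken from this issue. The requests are only constructed here, not sent, and the client UUID is Keycloak's internal ID for the client rather than its display name.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_token_request(base_url, username, password):
    """Request for an admin access token from the master realm."""
    body = urlencode(
        {
            "grant_type": "password",
            "client_id": "admin-cli",
            "username": username,
            "password": password,
        }
    ).encode()
    url = f"{base_url}/realms/master/protocol/openid-connect/token"
    return Request(url, data=body, method="POST")


def build_create_role_request(base_url, realm, client_uuid, role_name, token):
    """Request that creates a client role under the given client."""
    url = (
        f"{base_url}/admin/realms/{quote(realm)}"
        f"/clients/{quote(client_uuid)}/roles"
    )
    payload = json.dumps({"name": role_name}).encode()
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=payload, method="POST", headers=headers)


# Placeholder values; substitute the ones for your own deployment.
BASE_URL = "https://nebari.example.com/auth"
role_request = build_create_role_request(
    BASE_URL,
    realm="nebari",
    client_uuid="00000000-0000-0000-0000-000000000000",
    role_name="allow-group-directory-creation-role",
    token="dummy-token",
)
```

Passing either request to `urllib.request.urlopen` would send it. Note the later comment in this thread: a role created this way may conflict with the subsequent Terraform deploy.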
Expected behavior
If the user enters "Y", the process of creating and assigning the role to current groups should succeed.
OS and architecture in which you are running Nebari
MacOS Sequoia 15.0.1 ARM (apple silicon)
How to Reproduce the problem?
Begin with a Nebari 2024.7.1 deployment and a corresponding
Upgrade your Nebari CLI to 2024.9.1
Run `nebari upgrade -c nebari-config.yaml`
You'll encounter the first problem (401: Invalid User Credentials) if you have set your Keycloak root password to something other than what is in your config file
You'll encounter the second issue (404: Could not find role) once you have gotten past the first issue (fix the value of `security.keycloak.initial_root_password` if needed).
Command output