Deploy and manage Superset with CI/CD #95

Open
MeltyBot opened this issue Apr 22, 2022 · 2 comments
Migrated from GitLab: https://gitlab.com/meltano/squared/-/issues/105

Originally created by @pnadolny13 on 2022-04-22 20:23:43


According to this article, Superset's import/export features have been updated to support CI/CD.
https://preset.io/blog/version-control-superset-charts-dashboards-superset/

Findings:

  • The Preset CLI (which also works with standalone Superset instances) is a nice way to manage these imports/exports. Previously you had to run the Superset CLI from inside the container and unzip the output manually. It would also be a good way to interact with Preset's cloud SaaS version of Superset (if users choose to use it), since that's its primary use: it would simplify API interactions, auth, etc. with the SaaS for us. I'm imagining the case where Meltano manages the assets but interfaces with Preset's API to deploy.
  • Superset now stores exports as YAML, which is much easier to diff than the previous JSON.
  • I was able to import our legacy dashboards and then export them in the new YAML format, so we do have a path forward for migrating.
  • There's a follow-on article coming about syncing dbt with Superset in both directions, i.e. pulling Superset assets into dbt as exposures and pushing dbt models to Superset, presumably as auto-imported datasets. I tried those commands in the CLI but couldn't get anything working.

Blockers:

  • When I import our whole set of dashboards and charts, they get assigned new chart IDs, presumably because the load order differs. This causes diffs when nothing has changed. The chart ID is only used in the file name, so we would want filenames without IDs (i.e. <Chart_name>.yml vs. the current <Chart_name>_1.yml). Then even if the ID changes behind the scenes, the assets don't update.
  • Sometimes when running an export multiple times against the same running instance, without changing anything, the order of the charts list in a dashboard asset differs, leading to noisy diffs when nothing actually changed.
  • Exporting a chart, renaming it, and exporting it again creates a brand new chart asset, so the diff ends up being an entirely new file with the new name, and the dashboard assets also have all their chart ID references updated. Ideally the chart name would be decoupled from the ID, and the ID would not be instance-specific, i.e. when I import, the ID should stay the same as in the file so future exports don't have diffs.
  • If I delete a chart, I would want it removed from my assets folder during the export process, but right now it just gets left behind. We could either require users to edit the files themselves, which isn't a bad process (i.e. UI-based edits plus small manual YAML file updates), or have the export process truncate the entire assets directory so it holds a full, up-to-date snapshot of what's in your local instance. The latter could be difficult to scale if importing/exporting the whole project starts getting slow at some point.
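
Until Superset handles these natively, the blockers above could in principle be worked around with a post-processing step over the export directory. A minimal sketch, assuming exported files are named `<Chart_name>_<id>.yml` (or `.yaml`) and that each chart reference carries a stable `uuid` key; the helper names are hypothetical and not part of Superset or the Preset CLI:

```python
import re

# Assumed filename shape: "<Chart_name>_<numeric id>.yml" (or .yaml).
_ID_SUFFIX = re.compile(r"^(?P<stem>.+)_\d+(?P<ext>\.ya?ml)$")


def strip_id_suffix(filename: str) -> str:
    """Drop the instance-specific numeric ID from an exported asset filename."""
    match = _ID_SUFFIX.match(filename)
    if match is None:
        return filename  # no ID suffix, leave unchanged
    return match.group("stem") + match.group("ext")


def stable_chart_order(charts: list[dict]) -> list[dict]:
    """Sort chart references by uuid so repeated exports produce identical lists."""
    return sorted(charts, key=lambda chart: chart["uuid"])


def stale_assets(on_disk: set[str], exported: set[str]) -> set[str]:
    """Files left over from charts that no longer exist in the instance."""
    return on_disk - exported
```

Two caveats with this approach: a chart whose name itself ends in an underscore and digits would lose that suffix, and two charts with the same name would collide once IDs are stripped, so a real implementation would likely need a stable identifier in the filename rather than none at all.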

Challenges:

  • Importing/exporting everything could take a while. Ideally we would want a way to import/export a single dashboard and its dependencies.
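
Exporting a single dashboard with its dependencies amounts to walking a dependency graph starting from that dashboard. A rough sketch, assuming a hypothetical `depends_on` mapping from each asset to the assets it references (dashboard to charts, chart to dataset, dataset to database); how that mapping would be built from a real instance is not covered here:

```python
def collect_dependencies(root: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return root plus everything it transitively references, in visit order."""
    seen: list[str] = []
    stack = [root]
    while stack:
        asset = stack.pop()
        if asset in seen:
            continue  # shared dependency already collected
        seen.append(asset)
        # reversed() so dependencies are visited in their listed order
        stack.extend(reversed(depends_on.get(asset, [])))
    return seen
```

Only the assets returned for the chosen dashboard would then need to be imported or exported, rather than the whole project.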

Summary:

Superset doesn't yet have the capabilities to let us use it in a CI pipeline the way we would want to. It does seem very close, though: with some seemingly minor changes, like making sure list ordering is stable and handling IDs across instances better, it would be ready to implement. I offered to give more detailed feedback and help contribute if that's what's needed, but nobody has taken me up on that yet 😄.

cc @aaronsteers @tayloramurphy

@labelsync-manager labelsync-manager bot added the kind/Feature New feature or request label Jun 23, 2022
DouweM commented Jun 24, 2022

Related to preset-io/backend-sdk#16

@pnadolny13 pnadolny13 self-assigned this Aug 22, 2022
@pnadolny13 pnadolny13 moved this to Backlog in Data Team Aug 22, 2022
@pnadolny13 pnadolny13 moved this from Triage to Backlog in Data Team Aug 23, 2022