Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIP-58] Proposal to move superset-ui to this repo #13013

Closed
7 tasks
MatanBobi opened this issue Feb 8, 2021 · 14 comments
Closed
7 tasks

[SIP-58] Proposal to move superset-ui to this repo #13013

MatanBobi opened this issue Feb 8, 2021 · 14 comments
Assignees
Labels
sip Superset Improvement Proposal

Comments

@MatanBobi
Copy link
Contributor

MatanBobi commented Feb 8, 2021

[SIP] Proposal to move superset-ui to this repo

Motivation

At the moment, working with superset-ui in order to develop plugins or packages to superset comes with a very bad developer experience.
In case you want to edit or add a plugin, you need to manually link it using npm link, this causes a higher barrier for new developers to the project. Working on multiple packages in superset-ui will need a developer to manually link all the packages.
It's important to note that not all packages and plugins will move back to this repo so we'll still need to use npm link from time to time.
Using lerna in our new monorepo will also ensure that a superset-frontend version will also have the latest versions of all the plugins and packages.
This will also let us align the tech stack between superset-frontend and the packages and plugins used in it.

Proposed Change

The whole superset-ui repo will be transferred to superset-frontend and superset-ui's repo will be archived.
The stack will be the same as the one used in superset-ui, using lerna and yarn workspaces to manage the monorepo.
The release flow will also be from this repo using github actions as superset-ui works.
superest-frontend will become a package within the new monorepo with "private": true flag, this means that no package will be deployed to npm for that package and the docker image build process will remain as is.
The whole monorepo will have one storybook, it will include all the plugins.
Testing the plugins and packages will remain as it is in superset-ui, using jest. Support for RTL will be added (since it's used in superset-frontend and not in superset-ui.

Regarding storybook, we can take two approaches:

  • We can keep the superset-ui-demo package where we define all the stories.
  • We can create a main storybook folder and write each story in it's own plugin. The main storybook will collect all stories from every plugin/package.
    We should rethink later on if we want to rename the scope from @superset-ui to @superset or keep it that way.

New folders structure (WIP):
image

New or Changed Public Interfaces

None. We will use the same package names.

New dependencies

lerna - To manage the monorepo.
This is already being used in superset-ui so I believe licensing isn't an issue here.

Migration Plan and Compatibility

No database migrations.

Rejected Alternatives

The current solution where these packages that are tightly related sit under different repositories create a bad developer experience.

Open issues

At the moment, superset-ui's lerna configuration sets the same version for all packages, this means that the superset-frontend version will be aligned with the plugins and packages versions.
There's also the option to set the versioning as independent but we need to define what's the behavior we'd like to see here :)

Todo List:

  • Move the core packages from superset-ui
  • Align all tooling between core ui and plugins/packages (ts, babel, jest configs etc.)
  • Verify build and publish process works with commands
  • Integrate build and publish process to github actions
  • Storybook locally should work
  • All workflows (Chromatic, storybook) preview deployments should work.
  • Move all other plugins, one by one

CC: @amitmiran137 @ktmud @rusackas

@junlincc junlincc added the sip Superset Improvement Proposal label Feb 8, 2021
@ktmud
Copy link
Member

ktmud commented Feb 9, 2021

Adding some historical context to make sure we made a conscious decision after considering all pros and cons:

  1. The initial discussion on moving packages out of the main repo: [SIP-4] [Embeddable Charts] Create Superset JS Client  #5667 (comment)
  2. My question about the developer experience on working across multiple repos: [wip] superset-ui-plugins linking tool for local development #8638 (comment)
  3. A couple of historical PRs to address the pain raised from npm link and diverging build scripts, etc: [wip] superset-ui-plugins linking tool for local development #8638 Build: optimize frontend build configs to improve superset-ui-plugin dev experience #9326
  4. When we merge superset-ui-plugins to superset-ui: build: start merging superset-ui-plugins apache-superset/superset-ui#333
  5. When we merge all core superset-ui packages: refactor: merge core superset-ui packages apache-superset/superset-ui#768

Note in the original proposal to move superset-ui to npm packages, we did consider keeping packages under one monorepo, but the decision at the time was that pros outweighs cons with separate repos.

A question to be answered in this SIP: What has changed? The biggest argument I can see is that after migrating to separate repos, and starting to use TypeScript, we realized a couple of new facts:

  1. Maintaining diverging build configs between repos is not easy.
  2. npm link is often problematic, especially with TypeScript.
  3. The two PRs + waiting for npm release workflow severely hinders developer velocity and it turned out to be larger problem than we thought it to be as number of Superset contributors grow.
  4. There may be other solutions to the cons or the cons were not that important than we original thought.

Ultimately, moving to either direction (monorepo vs separate rpos) is a big tradeoff. By moving to monorepo, we will immediately lose these current benefits (unless more work is done on making sure relevant tools are properly configured ):

  1. PR preview Storybook via Zeit and Chromatic UI (it's more difficult to request new Github Apps under the apache org, which hosts all other Apache projects, than in apache-superset, which most Superset PMC members have direct admin access).
  2. Reduced build time from smaller codebase and not having to share the GitHub Action runner pool with other Apache projects (Can be mitigated by conditionally triggered workflows)
  3. Automatic NPM release (need to configure NPM_SECRET and the required Workflow)
  4. 100% test coverage guarantee for core packages (need to update codecov config)

@kristw
Copy link
Contributor

kristw commented Feb 10, 2021

When we moved to the separated repos, the goal was also to raise code quality bar, increase development velocity, and produce components that can be natively embedded in other applications (therefore the npm packages).
However, as @ktmud mentioned, it has been 2 years since then and many things were changed.


History-1: The quality of the code in apache/(incubator-)superset was not great. Some parts were well tested. Some not. Lints were enabled only for certain parts. superset-ui was created to be stricter: requiring 100% test coverage and TypeScript. We rewrote many core parts including massive clean up of all charts, then replaced legacy code in the main repo with these newer components.

Current: 🟢 Main repo has more TypeScript + lints and overall code quality has improved.


History-2: CI jobs in apache/(incubator-)superset took long time to run and was not stable (flaky tests and master often had issues at that time).

Current: 🟢 CI and master are more stable these days. 🟡 The jobs still take time to run. Probably can mitigate by skipping the unnecessary ones.


History-3: Better tooling and flexibility. We experimented and adopted bots, conventional commits, PR previews (Netlify then later Zeit/Vercel), semi-automatic npm packages release via github actions. These are difficult to experiment and configure on repos under apache org which we don't have admin rights.

Current: 🔴 We still don't have admin rights on apache/superset and some apps are not allowed.


History-4: There were needs to embed charts in other apps. Plugins system and publishing the packages on npm was for this.
Current: 🟢 Embeddable chart is less of a focus now. There are some internal apps using the core package to call Superset APIs, but I believe they are fairly stable.


History-5: Some legacy chart plugins were added by community and caused maintenance problems for committers over time. The plugins system and publishing the packages on npm hoped to decentralize this a bit. Each org/person can have repo of their custom plugins with repo structure forked from superset-ui-plugins.

Current: 🟡 Can still have boilerplate of standalone plugin repo for 3rd party who wants to create plugin, but it will still have issues: keep build config in sync and npm link. If the main developers are no longer dogfooding this approach, it will likely be broken after a while.

Or we could change the process for creating custom plugins to (1) fork apache/superset (2) dev on that and (3) only publish the flagged plugins. However, the forked repo can be heavy and need to keep all parts to be able to git rebase from upstream.


Overall, it seems things have changed in favorable direction.

I agree with the observations (@ktmud 's 1-4 above) that the workflow has caused new issues with more contributors joining the project. The key questions to be addressed before merging back are also good and I want to add a bit more on top of that.

Topic Things to consider before proceeding
Repo tools: PR preview Storybook via Zeit and Chromatic UI (it's more difficult to request new Github Apps under the apache org, which hosts all other Apache projects, than in apache-superset, which most Superset PMC members have direct admin access). Do some feasibility study. Check with Apache admin to know what are allowed. Are there workaround and are we OK losing some if not allowed? Better avoid surprises.
Reduced build time from smaller codebase and not having to share the GitHub Action runner pool with other Apache projects Verify that conditionally triggered workflows can help and make this task a requirement before the repo merge, not after. Make sure we won't save time from npm link (fix cost) only to spend more time staring at CI instead (recurring).
Revise automatic NPM release Should be able to set NPM_SECRET via JIRA to Apache admins. Have to reconsider publishing flow. Currently, releases are triggered by committers committing a PR that bumps version in package.json and CI took care of the rest. Direct git push to master is not allowed in apache/superset.
Figure out npm packages release cadence: How often will the plugins be published? Currently it is being forced because the main app refer to plugins via package and version in package.json. (1) If the npm publishing is still a requirement, you still need 2-3 PRs in apache/superset. One PR up to the point where plugin is completed. Potentially another PR to bump version and force npm publishing. Another PR is to update package version in superset-frontend after new packages have been published. (2) The 2-3 PRs process sounds tedious. After moving to the same repo, one might be interested in pointing to plugin src directly and bypass the npm publishing at all. This could leave the npm packages outdated and someone has to publish at one point, when will that be? This is not a problem for the main app, but downstream apps or 3rd party chart plugin dev may be using outdated @superset-ui/xxx.
Maintain quality bar e.g. 100% test coverage guarantee for core packages Sounds straightforward (update codecov config)
3rd party chart plugins: How would someone develop chart plugins that are only useful for his/her org (without dropping code in apache/superset)? Worth spend sometimes thinking how to provide flexibility while avoid accumulating maintenance cost for apache/superset. Please refer to the comment above about History-5.

To sum up, thank you for writing the SIP and thinking more about this problem. I think this could work and would welcome overall improvement to developer experience. Just want to make sure important issues are brought to attention and some thoughts are put into them before voting and executing.

@rusackas
Copy link
Member

A few additional points building on all the helpful background/issues from @ktmud and @kristw:

  • SIP-57, in favor of Semantic Versioning, is still in the VOTE process. This should be considered here. The current NPM publishing scripts bump ALL touched packages in parallel. We may need to pay more attention to semantically bumping the version numbers of individual plugins, or establishing some process to make sure that ALL plugins are bumped to the highest/safest level (i.e. a breaking change on one package means they all get a major number bump).
  • I really prefer the idea of using the src path for packages, rather than having to jump through the hoops of publishing/bumping packages internally in Superset. Removal of this overhead is the main motivation for the homecoming, in my opinion. We should indeed address the cadence of when we publish new plugin versions. My off-the-cuff suggestion would be to make it part of the Superset OSS release procedure, so they're in tighter lock-step. We could even consider aligning version numbers between Superset and the internal packages.
  • Regarding the viz plugins, we would still need to keep some of them external, on superset-ui. One motivation to externalize them was around the licensing of dependencies being incompatible with Apache. I believe this is likely to remain the case, so only the "safe" ones can come home. We'll need to support npm link, Storybook, release actions, etc on superset-ui to support those remaining packages.

@villebro
Copy link
Member

I think it would be nice if the superset-ui npm packages would be synced with the official release versions; when a new Superset version is released, the all superset-ui packages would be released with the same version on npm. If we stay true to semantic versioning (the SIP has now passed), people would know that if their external applications is using e.g. v. 1.1.0 of superset-ui, it will be compatible with a backend running 1.5.0.

@MatanBobi
Copy link
Contributor Author

  • SIP-57, in favor of Semantic Versioning, is still in the VOTE process. This should be considered here. The current NPM publishing scripts bump ALL touched packages in parallel. We may need to pay more attention to semantically bumping the version numbers of individual plugins, or establishing some process to make sure that ALL plugins are bumped to the highest/safest level (i.e. a breaking change on one package means they all get a major number bump).
  • I really prefer the idea of using the src path for packages, rather than having to jump through the hoops of publishing/bumping packages internally in Superset. Removal of this overhead is the main motivation for the homecoming, in my opinion. We should indeed address the cadence of when we publish new plugin versions. My off-the-cuff suggestion would be to make it part of the Superset OSS release procedure, so they're in tighter lock-step. We could even consider aligning version numbers between Superset and the internal packages.
  • Regarding the viz plugins, we would still need to keep some of them external, on superset-ui. One motivation to externalize them was around the licensing of dependencies being incompatible with Apache. I believe this is likely to remain the case, so only the "safe" ones can come home. We'll need to support npm link, Storybook, release actions, etc on superset-ui to support those remaining packages.

Thanks everyone for all the helpful inputs!

@rusackas

  1. IMO we should use lerna independent versioning, I don't think that a breaking change in one plugin should cause a major version bump in another version.
  2. If we decide to align the version numbers we'll get a problem with semantic versioning and major version bumps you raised on the first point wouldn't we?
  3. That's a good point. Is there a list of the plugins/packages we can move based on the license? Since I couldn't find any licensing on the plugins, just on the whole superset-ui repo.

I think it would be nice if the superset-ui npm packages would be synced with the official release versions; when a new Superset version is released, the all superset-ui packages would be released with the same version on npm. If we stay true to semantic versioning (the SIP has now passed), people would know that if their external applications is using e.g. v. 1.1.0 of superset-ui, it will be compatible with a backend running 1.5.0.

@villebro I do agree that it can be nice, but I'm not sure it is really helpful. This means that whenever you want to release a patch to a plugin you'll also bump all the packages and the superset-ui core module, sounds redundant to me but I'm not familiar with the whole release mechanism so I might be missing some stuff.

@kristw thanks for the helpful inputs!

History-1: The quality of the code in apache/(incubator-)superset was not great. Some parts were well tested. Some not. Lints were enabled only for certain parts. superset-ui was created to be stricter: requiring 100% test coverage and TypeScript. We rewrote many core parts including massive clean up of all charts, then replaced legacy code in the main repo with these newer components.

Current: 🟢 Main repo has more TypeScript + lints and overall code quality has improved.

I believe the issue we have now is the opposite one.. superset-ui doesn't have tests almost at all. The coverage configs are all on 0.

History-3: Better tooling and flexibility. We experimented and adopted bots, conventional commits, PR previews (Netlify then later Zeit/Vercel), semi-automatic npm packages release via github actions. These are difficult to experiment and configure on repos under apache org which we don't have admin rights.

Current: 🔴 We still don't have admin rights on apache/superset and some apps are not allowed.

That's an important one, the question is, is that a blocker?

History-5: Some legacy chart plugins were added by community and caused maintenance problems for committers over time. The plugins system and publishing the packages on npm hoped to decentralize this a bit. Each org/person can have repo of their custom plugins with repo structure forked from superset-ui-plugins.

Current: 🟡 Can still have boilerplate of standalone plugin repo for 3rd party who wants to create plugin, but it will still have issues: keep build config in sync and npm link. If the main developers are no longer dogfooding this approach, it will likely be broken after a while.

I think we should still maintain that approach, having developers create PR's on the main repo might be a serious bottleneck.
As long as there are still plugins left outside on superset-ui due to licensing I guess we'll keep maintaining it, but even after, we need to maintain a doc page explaining how to do it and also not to break that mechanism.

Regrading the open issues:

  • Repo tools: PR preview Storybook via Zeit and Chromatic UI - Who can check with Apache admins what are we allowed to do?
  • Reduced build time from smaller codebase and not having to share the GitHub Action runner pool with other Apache projects - I'll work on this one as part of this SIP.
  • Revise automatic NPM release - We need to talk about this one since IMO we need to define a way to publish plugins which is unrelated to the main repo. Also, since the FE will be a part of the lerna monorepo, we should know that a new version will be released once it has changes (though we might not rebuild the docker image of-course).
  • Figure out npm packages release cadence: How often will the plugins be published? - IMO this should be automatic even if we decide to use the src folder instead of the npm package. Every change will be released using lerna and github actions.
  • Maintain quality bar e.g. 100% test coverage guarantee for core packages - As I stated before, the current status of superset-ui isn't as good as we thought it is..
  • 3rd party chart plugins: How would someone develop chart plugins that are only useful for his/her org (without dropping code in apache/superset)? - This doesn't sound like something which should be coupled to this SIP but I do agree that we need to think about it. At the moment, one already needs to edit code in superset to add plugins. We might need to figure out a way to extend the plugins without editing any code.

Thanks everyone for the valuable inputs! I'm starting to work on this one but I think we should setup some milestones we want to see. I'll try to update the PR and SIP with specific milestones.

@ktmud
Copy link
Member

ktmud commented Mar 2, 2021

I believe the issue we have now is the opposite one.. superset-ui doesn't have tests almost at all. The coverage configs are all on 0.

This is not true. superset-ui core packages (@superset-ui/core and @superset-ui/chart-controls) do enforce 100% test coverage at least on non TSX/JSX files (the goal for TSX/JSX is 50%).

@MatanBobi
Copy link
Contributor Author

I believe the issue we have now is the opposite one.. superset-ui doesn't have tests almost at all. The coverage configs are all on 0.

This is not true. superset-ui core packages (@superset-ui/core and @superset-ui/chart-controls) do enforce 100% test coverage at least on non TSX/JSX files (the goal for TSX/JSX is 50%).

That's my bad, sorry. I was looking at this inside the jest configs.

@villebro
Copy link
Member

villebro commented Mar 3, 2021

Thanks so much for the comprehensive post and keeping this SIP alive!

I think it would be nice if the superset-ui npm packages would be synced with the official release versions; when a new Superset version is released, the all superset-ui packages would be released with the same version on npm. If we stay true to semantic versioning (the SIP has now passed), people would know that if their external applications is using e.g. v. 1.1.0 of superset-ui, it will be compatible with a backend running 1.5.0.

@villebro I do agree that it can be nice, but I'm not sure it is really helpful. This means that whenever you want to release a patch to a plugin you'll also bump all the packages and the superset-ui core module, sounds redundant to me but I'm not familiar with the whole release mechanism so I might be missing some stuff.

One consequence of moving superset-ui to the apache/superset repo is we're going to be effectively changing the ownership of the code to the Apache Software Foundation (ASF). That in itself isn't a problem, but the ASF is very strict about the protocol surrounding releases and publishing code from an ASF repo. In practice we won't be able to do non-ASF sanctioned releases of the superset-ui packages without going through the full ASF voting process, which requires 72 h voting time and sign-off from a minimum of 3 PMC members (there's some lenience wrt to latest builds from master). In practice this means that anything we release to npm will need to go through the normal voting process. There's no problem with doing different product releases under the Superset umbrella, even following different version numbers (=we could do superset-ui releases separately from apache/superset), but they'll still need to go through the regular voting flow before being published.

@rusackas
Copy link
Member

Shall we put this up for a VOTE @MatanBobi ?

@MatanBobi
Copy link
Contributor Author

Just updating here as I'll probably start working on it this week.
After consulting with @amitmiran137 and @rusackas, we'll probably start with building the monorepo approach and only move the core package. Then, we'll start moving the plugins one by one.

@villebro
Copy link
Member

villebro commented Nov 2, 2021

Hi all,

after carefully considering all points brought up in this discussion, we have done some prerequisite work required by the monorepo effort to more closely align the superset-ui repo with the main Superset repo. Some notable PRs:

Based on initial testing, both repos should now be in a state that should make the monorepo transition possible. In the following doc we summarize some of the key motivations for migrating the superset-ui code to the main Superset repo, consider pros/cons of the various alternatives proposed in this SIP and present a proposed solution for performing the migration:

Spike_-_Superset_Monorepo.pdf

A vote for the SIP will be initiated shortly.

@john-bodley
Copy link
Member

Out of interest @villebro what are the plans for the other repos in the https://github.com/apache-superset organization?

@villebro
Copy link
Member

villebro commented Nov 3, 2021

@john-bodley that's a good question, and TBH I haven't thought that far ahead yet. IMO apache-superset can be thought of as a kind of incubator org for stuff that's loosely related to the main repo, but not necessarily ready for inclusion in the main repo just yet/ever. For instance, the cherrytree repo is used to make the release process easier, but doesn't necessarily need to be a part of the main repo as it can just as easily be used to bake releases for any other GitHub project. So I think the org may well be good to have around even in the future, but we should probably consider at least archiving the ones that are no longer relevant, as many of them don't appear to be relevant anymore.

This was referenced Dec 10, 2021
@villebro
Copy link
Member

With the majority of this work done, this SIP can now be considered done, hence closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sip Superset Improvement Proposal
Projects
Status: Implemented / Done
Development

Successfully merging a pull request may close this issue.

8 participants