-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SIP-57] Proposal for Semantic Versioning #12566
Comments
Thank you for the comphrensive writeup! I learned a lot from reading this. Some thoughts on a first pass:
We should never allow a new flag to change the default. A feature flag is like an A/B test, it should only add or change features when the flag is set to true, and the changes should not start as the default.
Would love to see an expansion on this, i.e., how the UI may break. For example,
We should also start thinking about breaking changes for the plugin systems, e.g., how the controls are defined, how query objects are built, and how charts are rendered... Currently it's in a "use at your own risk" state, but as the API matures, we may want to define any updates to |
Thanks for getting this doc out, I think it aligns with many of my hopes and dreams for a stable Superset and predictable version bumps. A few notes: There's no reference to function renaming or class refactoring in the breaking change section. I believe at this point we have some modules of code (specifically the Security Manager and the presto/hive jinja template code, although there might be others) where renaming functions/classes could be considered a breaking change to a public API. I'm not sure the best way to handle this, apart from adding CI checks on certain files to ensure that the structure of classes and functions remain the same in non-major versions. Note that this might also apply when changes to the TS plugins are made too. When discussing database migrations, the main consideration mentioned here is "time to run", but it doesn't address backwards compatibility of database migrations or push safety of db migrations. I'd recommend some additional guidelines for non-major changes:
This is mainly to ensure that when deploying db migrations in non-major versions, it can be done so without any downtime The note about putting breaking security changes into a minor release makes sense, but I'm a bit concerned that if someone is pinning their Superset release to the most recent minor release and automatically upgrading, then they won't notice the change. I might propose instead patching the current minor release with the breaking security fix and releasing it with a new major version number, then following up again with another major version. That way we maintain the meaning of semantic versions, even when security issues show up. Finally, I'm a bit concerned about the ability to keep PRs open and up to date with breaking changes in between the major versions. This might not be an issue if we expect a new major release every month or 2, but could be if there's 3-6 months in between major releases. Have we considered merging breaking changes into a seperate release branch once reviewed and having a release manager rebase this regularly? This would also allow people to stack multiple breaking changes on top of others, something that might be tricky with a bunch of PRs left open. Thanks for putting this SIP together, and I look forward to working with you and the rest of the community to enforce it and provide stable releases out of the box for our community! |
Thanks for the feedback @ktmud. I removed
Yeah, I was thinking about NPM plugins for example, and I noticed that we didn't have any peer dependencies, so that didn't seem to be an issue. Other than maybe node and NPM itself, I didn't find any other conflicts with the system that was running Superset. We don't have any specific node or npm/nvm version requirements other than to use the "latest", so there's no obvious way to know if it's breaking. Everything else in the npm modules should be self-contained within the package-lock file. Other than that, I'm not familiar with any breaking changes with superset-ui. As long as we are properly bumping the dependency in superset, is there anything else that the user should be aware of? |
Great, thanks for the feedback @etr2460! I'm going to respond a bit out of order just for simplicity while I think about some of the other points..
Yeah, I agree on this possibility. One concern that I'm hearing is that pushing out a major version will take time, and that releasing a security patch should be addressed immediately. What I hear you suggesting is an alternative which would be for example if you're on 1.4.3, to push out 2.0.0 with just the security fix on top of 1.4.3, and then follow immediately with 3.0.0 with the latest master including updated breaking changes/removed deprecations, etc? I think that could be a safer solution, if anyone else has opinions on that. I think in this example a 2.0.0 would take less time than the 3.0, but would it be more time than a 1.4.4? Maybe not. It would mean two quick major versions back to back, but I don't think it should happen often either. |
What I meant is more about breaking changes for chart plugins, because they depend on certain API in I don't think this is a real concern right now since there are still a lot of work we need to do before announcing official support for dynamic plugins, but it's something to think about. |
Re: breaking security fix releases.
If someone pinned their Superset deployment to the most recent minor release and chose to ignore (or wait for maturity of) a new major release, they might not notice the change, either. So maybe in addition to publishing the breaking security fix as a major version, we should also release a minor version that adds a deprecation warning. I think (or hope) in the (ideal) future, Superset would have much more frequent release cycles, and pushing out a new version would be much less work than it is now. So I'd vote for releasing breaking security fixes as new major versions. In any case, this should be rare, so a back-to-back major version updates may not be that big of a concern. We can always choose which big version updates to heavily market on. We also don't have to immediately release another major version if the features in the work were not breaking changes. |
Yeah, I agree. After thinking about it more, if we released the security fix as a patch on the previous version with a major bump, I don't think the next version will necessarily need to be another major version. It could be a minor.
That's a good point. So your preference would be to go with the original proposal which is to push a patch fix out ( I think it could likely be a patch instead of a minor if we only cherry-pick the security fix and that should get picked up with pinned minor versions) and then fast follow with the major? Or to go straight to the major version, with the expectation that that pushing out a new version will be less work in the future? |
I think I was confused on what is "pinning at the latest minor version". I assumed it was like using I was proposing to go straight to the major version, and release a minor/patch version that adds a deprecate warning, so that the release doesn't break things for users with autoupdates enabled and they would also know an upgrade is needed. If the fix is going to be a breaking change, we may not want it to be either a minor or patch release. Because according to Semantic Versioning, minor versions should always be backward compatible. The major version could either be straight from the |
Thanks for this point @etr2460. @betodealmeida followed up with #12635, and I'll add notes to this doc. Great callout! |
Ideally we would find solutions that allow us to continue to roll out code regularly and not have to hold on to breaking changes. Putting changes behind a feature flag for example. If there was a situation where we absolutely couldn't push out the code and had to manage it separately until the next release, then yes, regular rebases managed by a release manager would make sense, albeit it would be a complicated process as you noted! I feel that if all possible we should always encourage backward compatibility over feature branches, though, and use major releases as a point in time for cleanup of the deprecated features. |
Ok, I got you.. you're saying that security fixes with a breaking change should always go into the next release (similar to what @etr2460 suggested) as a major version. Security fixes that are backward compatible should go with the next release or as a patch on the last release. If the fix is backward compatible and the next release is a major version (as if the security issue was detected as we were building the next major version), then we should also backport the security fix to the last release as well. Did I get that correctly? |
Forgive me if I am interjecting a novice question here. Since breaking changes are critical to the proposed semantic versioning scheme (major/minor version distinction), does it make sense to add some additional structure / formalization around how breaking changes are identified and documented? As I understand it, the current process relies on individual contributors adding information to UPDATING.md when merging code that results in a breaking change, and that information lives nowhere else outside the mind of the contributor and the results of existing regression testing. Edit: Maybe PR labels (like "risk:breaking-change"), could do the job with a script to run before each major release that scoops all PRs with breaking changes for automated addition to UPDATING.md. More information could be appended to that document by hand, but at least we wouldn't run as large a risk of breaking changes slipping through undocumented. |
You're correct, @garden-of-delete.. we haven't come to a decision, but I left two options in the sip for now about marking PR as breaking, either as a title prefix or a tag. The downside to a tag is that only committers can add them, but I think they may be easier to work with when trying to extract prs that are breaking. We could also do both the title prefix and the tag, and have a committer add the tag before merging if needed. I was hoping for a discussion from some people, and you're the perfect person to ask about how we are going to extract this information into the release notes and changelog. It's also the responsibility of reviewers to look for breaking changes. Oftentimes breaking changes are caught by reviewers/committers that the author didn't even know of. The other consideration to add is that sometimes an author may mark the PR as breaking, and it would be the responsibility of the reviewer to encourage and guide them to make it backward compatible if possible. |
Would be happy to contribute the automation i'm proposing above as well. re: "Oftentimes breaking changes are caught by reviewers/committers that the author didn't even know of." Somewhat tangential, but is there a way enable contributors to ask the label bot to add a label from a defined set of "temporary/tenative labels" that are removed at merge time? That would allow us to keep everything in tags, but still allow contributors to identify their own open PR as potentially breaking during the initial submission or review process. |
NOTE: I just added this as an extra bullet point. I thought it was helpful to clearly define an intentional breaking change vs a bug:
|
That is correct. Hopefully the backport wouldn't be too much work unless we are doing some crazy major new version. |
💯 Even though the plugin API is not yet mature, it is a very important one so I think it would be wise to bump Superset's major version if we make breaking changes to that interface. We also need more thorough documentation of what exactly is included in the plugin interface in order to really determine if a change to the interface is breaking for plugins or not. Just because a change doesn't happen to break our builtin plugins doesn't mean it won't break some third party code somewhere, so we should start being clearer on what third party code can expect from Superset. |
Thanks @etr2460 and @ktmud for the feedback on this point. I've updated the issue. |
Awesome. I'll reach out.
Cool.. I think we'll need a way for the author to flag that the PR has a breaking change, and then a committer to flag or label it as well. I'll update the doc to suggest that we do both, and we can go from there.
I was thinking something similar, where if the title prefix |
Thanks @suddjian and @ktmud. Do you have suggestions on what wording for this interface? I'll start with @ktmud's wording.
|
@eschutho I think we can revisit this after this part is done:
However, IMHO, the plugin system still need some major refinement (e.g. simplify the registry, async loading for controls and |
After a discussion in #12754 I updated the section on the prefix to show that according to "conventional commits" a |
Thanks @ktmud I updated the sip. It sounds like we'll need to take these plugin breaking changes on an ad-hoc basis until the system has been updated, correct? |
@etr2460 I agree. I think it would be safest to also add that any migrations that drop a table or column should be in the major release version, in case people bump two minors at once. Even if people skip a minor version and go straight to the major, they'll still run into the same downtime issues. Therefore I'd also recommend that if there is a destructive db migration in a major version we encourage people to upgrade at least to the last minor version, run the migrations, and then upgrade to the major version. Do you think that's doable? I updated the doc with the recommendations if you would like to see the suggestion I proposed. |
I agree that it needs refinement and should avoid creating obstacles for ourselves on something this unpolished. But if we can start with a narrow definition of what is public and what constitutes a major change, rather than an anything goes approach, it will give us a basis to move forward and clear reasoning for including most changes in a minor release. This gives plugin authors the confidence they need, and I believe will help us to better expand the API. For now, I think this could be as limited as the That is just my opinion, I also don't mind leaving the public plugin interface undefined for the time being. |
@suddjian I'm OK with a loosely defined spec and agree it would help us solidify the API. Just hope we'd make sure to add a disclaimer about its unstableness if we do decide to put it in a public doc. The |
Sounds good. Thanks for the updates, @eschutho ! |
Looking really good! Re: mergeability of PRs, it sounds like PRs that introduce a breaking change will be kept open until such time as there is a major version release branch, unless the change can be put behind a feature flag. Is that the intention? Will the same process apply to new features introduced for minor versions, or will these be mergeable to master prior to a release? Separately, I noticed the DB migration sections don't include anything about modifying the data in a table in a backwards-incompatible way. For example, a JSON blog being modified. |
The majority of changes will be able to be backward compatible, but if something just can't be made backward compatible, then yes, we should plan to either make the next release a major version or hold on to the PR until we can create a major version release. Another aspect is in planning. If we know in advance that a feature is not going to be backward compatible, it's good to keep that in mind and plan to build it at a time when we know we can do a major version release.
Any minor version feature should be able to be merged into master. Patch bumps usually are cherry-picks onto an existing release, so we shouldn't ever need to worry about not merging backward compatible features to master.
Ok, good point. So something like a JSON blob has a schema, and we run a script during the migration that updates the data that changes the schema, and then we push out the code that reads/writes the data according to the new schema, but between the script/migration and the code update, the application will throw errors? |
Yes, exactly. Changing the format or shape of the data within a JSON blob in a non-backwards compatible way. Minor nit. Re: the release strategy, I'm a bit concerned about the feasibility of cherry-picking from master into release branches at the current speed master is changing, but let's cross that bridge when we come to it. No blockers from me on the SIP. |
Sorry, my bad, I updated my comment. I meant to say that we should only need to cherry pick for patch releases, like we're doing for 1.0.1. |
@robdiciuccio I added more to the DB migration bullet points. LMK if that looks ok to you:
|
LGTM, thanks! |
The official vote to adopt this SIP PASSED with 11 binding +1, 3 non-binding +1, 0 neutral, and 0 negative votes. |
Motivation
As Superset moves to version 1.0, we want to keep this and future versions stable for users, while also communicating clearly to users when upgrading if they should expect any breaking changes, and how to prepare for new versions in updating their own code or usage of the application.
Proposed Change
The principles of Semantic Versioning are outlined in this document by Tom Preston-Werner co-founder of GitHub. The proposal is to follow these guidelines with regard to public releases of Superset. The definitions for what is a major/minor and patch release are described in the document, but in this document, we should clearly define what a breaking change on Superset is and then how to communicate it.
We also want to set up guidelines for deprecating features and announcing those deprecations in the changelog and release notes, and then fully removing the feature in a breaking change at some later date with a major version release. The goal is to have fewer breaking changes and to aim for changes to be backward compatible and then grouped together in a major release.
What is a breaking change?
ChartPlugin
class and its constructor arguments that break an existing visual pluginBest practices to make changes backward compatible:
How to announce a breaking change in a PR
According to https://www.conventionalcommits.org/en/v1.0.0/#specification to announcing a breaking change in a commit message an
!
is added to the prefix just before the colon, such asfeat!: replace old charts with different ones
. Reviewers should only allow this to be merged when the next release planned is a breaking change or if the author was able to make sufficient changes after feedback from reviewers to make the PR backward compatible. Also arisk:breaking-change
tag should be added to the PR. This change should be notated in the changelog.PR reviewers should also be diligent when looking at the code to identify if there are any breaking changes in the PR and make suggestions and recommendations to the author to label the PR as breaking if they feel it is warranted or even better, to make the code backward compatible.
A “TODO” should be added in the code with a greppable tag of “TODO: DEPRECATION” with any helpful instructions for what code needs to be changed or removed in order to remove the functionality at the next major version. Keep in mind that someone else other than the author may be removing the functionality at a later date, so clear instructions are helpful.
UPDATING.MD should be updated with a “Deprecated” section for each release, and for major bumps, a “Breaking Change” section should be added. This document adds more context than what we can get from the PR commit message, and will be partially copied into the changelog. The full format of the changelog with deprecation and breaking change notifications is TBD.
How to announce deprecations to users
The changelog should contain information about features that are deprecated, and clearly notate when it is going to be permanently removed. I.e., “The feature flag “ANIMATED_WINDOWS” has been deprecated and will be removed in the 2.0 release.
We should be able to automatically pull this from UPDATING.MD into the changelog.
When possible, it is encouraged to log a message to users when they are using a feature that is deprecated, and let them know that the feature will be removed soon. CLI logs are a good example.
We should try to give users a sufficient amount of time between deprecating and removing functionality, to allow them to update their integrations.
How to announce breaking changes to contributors
The changelog for the major version bumps should contain information about any necessary changes that users will need to make to their codebase, integrations, databases, or charts before upgrading so that there are no interruptions in their usage of the platform. We can continue to point contributors to UPDATING.MD for more specific information on how to make the update. The release notes should also mention deprecations and breaking changes, but likely with less technical information, so it may be a subset of what is communicated in the changelog.
How to implement breaking changes upon a major version bump
Just before a major version branch/rc branch is cut, a commit (or multiple commits) should be merged into master that remove(s) any deprecated features/flags/backward incompatible changes. This sweep through the code and changelog for deprecated features ensures that we keep our code clean and are diligent about removing deprecated functionality on each major version bump. If the changelog is well documented with deprecation notes, we should be able to find what needs to be removed from that list.
What if something is inadvertently added to a minor or patch release that is a breaking change?
If a breaking change is released, a patch fix should be released as soon as possible correcting/reverting the breaking change and a note for the affected version in the changelog should highlight that a breaking change is in that release, and encourage users to take the next patched version.
What if a security issue requires a change that is a breaking change?
Security fixes with a breaking change should always go into the next release as a major version. If necessary, this can happen quickly if it is just cherry picked onto the last version, with a major version bump. Security fixes that are backward compatible should go with the next release or as a patch on the last release. If the fix is backward compatible and the next release is a major version (as if the security issue was detected as we were building the next major version), then we should also backport the security fix to the last release as well so that we can fix the issue for people who have Superset pinned to the last major (or minor) version, and don't notice that there is a security fix/issue.
What if a breaking change gets introduced into a release as a bug?
If a breaking change is introduced into a minor or patch release that is considered a bug, fix the bug as soon as possible in the next patch or minor release. You do not need to do anything to make that release a major version.
Rejected Alternatives
It was also considered to have the major versions remain as product announcements, i.e., version 2.0 would represent a release of major features, or a brand new design. It was decided that Semantic Versioning would benefit the community when updating, and help users to be aware of breaking changes in their integrations.
Backporting security fixes to older versions was also considered, but it was decided for now that we do not have a strong need to support older versions, and instead focus on maintaining a stable "latest" release.
Changes that dropped tables or broke the structure of the metadata database was considered as a breaking change originally, but rejected with the thought that an admin should be prepared to do some changes to metadata queries if they perform a minor upgrade (likened to internal changes in the codebase/classes).
It was considered to only put security-related breaking changes in a major version, but due to the time involved in rolling out a major version, the suggestion of a minor with a fast-follow major version was added.
Adding an additional tag for security fixes i.e., with a CVSS scores 9-10 was considered to be outside of the scope of this proposal.
The text was updated successfully, but these errors were encountered: