JupyterLab Development Cycle RFC #54

Draft
wants to merge 15 commits into
base: main

Member

@saulshanabrook saulshanabrook commented May 1, 2020

Goals:

  • Formalize our process around releases and git branches
  • Automate the release process as much as possible
  • Spread out any required human labor for a release into the PRs that make up the release
  • Decrease amount of time necessary to do a release

Non goals:

  • Change the user or extension developer expectations around our current releases
  • Switch from using SemVer
  • Change the organization of our code

I was originally writing a comment on jupyterlab/jupyterlab#8195, but as it grew longer I realized that an issue comment maybe wasn't the best format to get feedback.

TODO:

  • Come to agreement on how to bump JS versions not in sync with Python versions
  • Switch the master branch to x.0.0 (I wanted to do this originally for consistency, but felt that it wasn't worth the upheaval. However, with all the discussion around removing master anyway, I think this might make sense; there is a similar proposal in Python).
  • Think about and add a strategy for dealing with docs, with the only goal being to keep master up to date with all changelogs.

Contributor

@blink1073 blink1073 left a comment

Excellent writeup @saulshanabrook, thanks for tackling this thorny issue!


## Open questions

1. How do we deal with changelogs properly? Where do we deploy them from?
Contributor

Is there not prior art from matplotlib on this as well?

Member Author

@saulshanabrook saulshanabrook May 2, 2020

I should ask them... Thinking about how the changelogs work is one of the most confusing parts for me.

I'll spell out how I am thinking about it here, and hopefully I am missing something that will make it simpler.

Let's say I have two branches 1.2.x and 1.x.0. I merge a patch into master, and this is backported and merged into those two branches.

Now, if I do a new patch release next on the 1.2.x branch, I assume that changelog entry should show up in that patch release, but not in the next minor release.

However, if I do the next minor release first, I should show the changelog entry for the patch PR in that release! And then if I do a patch release later, it should also show it in this as well, I believe?


Another changelog question: for each RC or alpha release, do we make a separate changelog entry? For the final release, do we show what has changed since the last pre-release, or since the last final release?


If we tied changelog entries to commit messages or PR descriptions instead of to files, we could have more flexibility here in building dynamic changelogs for different scenarios. At the most flexible end, I could even imagine an option where you select two releases and it shows you the changelog entries between them.

But this would negatively impact our ability to curate changelogs and merge together multiple PRs into one entry.

@jasongrout you do a lot of changelog work! Maybe you also have opinions here?

Member Author

We had a good conversation about this with @telamonian and @vidartf at the weekly meeting:

  • When users go to read the docs and click on changelog, they should see all changes, regardless of what branch they were made on.
  • Doing a "forward port" PR of the changelog entries added on backport releases to the most recent releases isn't a crazy way to do this.

I will work on updating this document with these principles in mind. Thank you all for the continued feedback.

Member Author

@saulshanabrook saulshanabrook Jun 24, 2020

Thinking about these changelog issues more, I reluctantly am coming to terms with the fact that we might benefit from a richer semantic model of changelog entries. Ignoring the "Which branch are these in??" question at first, I have started thinking about the changelog as a number of entries, each with:

  • PR(s): At least one, but maybe multiple PRs which this text summarizes
  • Description: Either a single sentence or a richer multiple node (images, paragraphs) description of the feature
  • Category (optional): A heading to put this under, like "Developer Changes" or "User Changes" or "Backwards Incompatible extension changes"

I believe this would cover the breadth of our existing changelog, although I would love @jasongrout's input on this since he has been running point on the more in depth ones lately.

Pairing this information with knowledge of what release(s) a certain PR was first present in (maybe multiple because of backports), we can create a diff of entries given some number of source releases and some target release. For example, we could ask "What are all the changes added to 3.0.0 starting from 2.0.0 and 2.2.2?" This would mostly be additions, but there could conceptually be some removals, say for entries that were targeted only at some 2.x release and were not included in the 3.0.0 release.

Given that, we could create some text description, in Markdown or RST. We could be specific about what the release we are comparing against for each release.

Now for our "Changelog" file in the docs, we would basically run that function a number of times and collect the various responses together.
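A minimal sketch of that entry model and diff, in Python. Everything here is an assumption made for illustration: the names `ChangelogEntry`, `changelog_diff`, and `render_markdown` are hypothetical, and `pr_releases` is assumed to map each PR number to the set of every release that contains it.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class ChangelogEntry:
    prs: FrozenSet[int]             # at least one PR that this text summarizes
    description: str                # a single sentence or richer text
    category: Optional[str] = None  # optional heading, e.g. "User Changes"


def changelog_diff(
    entries: Iterable[ChangelogEntry],
    pr_releases: Dict[int, Set[str]],  # hypothetical: PR number -> releases containing it
    sources: Iterable[str],
    target: str,
) -> Tuple[List[ChangelogEntry], List[ChangelogEntry]]:
    """Return (added, removed) entries when moving from `sources` to `target`."""
    sources = list(sources)

    def present(entry: ChangelogEntry, release: str) -> bool:
        # An entry counts as present if any of its PRs shipped in the release.
        return any(release in pr_releases.get(pr, set()) for pr in entry.prs)

    added, removed = [], []
    for entry in entries:
        in_sources = any(present(entry, source) for source in sources)
        if present(entry, target) and not in_sources:
            added.append(entry)
        elif in_sources and not present(entry, target):
            removed.append(entry)
    return added, removed


def render_markdown(added: Iterable[ChangelogEntry]) -> str:
    """Group entries under their category heading and emit Markdown."""
    sections: Dict[str, List[ChangelogEntry]] = {}
    for entry in added:
        sections.setdefault(entry.category or "Other Changes", []).append(entry)
    lines: List[str] = []
    for heading, group in sections.items():
        lines.append(f"### {heading}")
        lines.append("")
        for entry in group:
            refs = ", ".join(f"#{pr}" for pr in sorted(entry.prs))
            lines.append(f"- {entry.description} ({refs})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Building the full changelog file would then amount to calling `changelog_diff` once per release and concatenating the rendered sections.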

One open question I have is "What is the base release(s) for some given changelog entry?" For example, for the changelog for 3.0.0, we could say diff from the last pre-release of 3.0.0. However, what about for the first pre-release of 3.0.0? From a git perspective, the last release on that branch (the x.0.0/master branch) would have been 2.0.0. However, showing all changes since 2.0.0 is not what we are doing currently. What we are doing currently is probably showing all changes from the last release of 2.x before the first pre-release?

So some possible version of rules here (these likely need some massaging):

  1. Patch versions {x}.{y}.x branches: For final and pre release patch versions, base the changes off of the previous pre-release/patch. So for say 3.2.1 base it off of the previous release in the 3.2.x branch.
  2. Minor Versions {x}.x.0 branches: Base off of the last release on this branch. Also base on the last patch release of the previous minor release if that is more recent than the latest release on this branch, i.e. a 3.2.0 release should include changes since 3.1.4, not since 3.1.0, if 3.1.4 was released before 3.2.0.
  3. Major versions x.0.0 branch: Base off of the last release on this branch + the last minor release of the previous major release + the last patch release of the last minor release of the previous major release, if any of those were released after the last release on this branch.
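One possible reading of these three rules, sketched in Python. Several simplifications are assumptions of this sketch rather than part of the proposal: pre-releases are ignored, a release's position in the `history` list stands in for its release date, the `{x}.0.0` release is treated as the starting point of the `{x}.x.0` line, and the function name `base_releases` is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Version = Tuple[int, int, int]


def _last(history: List[Version], keep: Callable[[Version], bool]) -> Optional[Tuple[int, Version]]:
    """Return (position, version) of the most recent release passing `keep`."""
    for index in range(len(history) - 1, -1, -1):
        if keep(history[index]):
            return index, history[index]
    return None


def base_releases(target: Version, history: List[Version]) -> List[Version]:
    """Pick the releases a changelog for `target` is diffed against.

    `history` lists earlier final releases, oldest first.
    """
    major, minor, patch = target
    if patch > 0:
        # Rule 1: previous release in the same {x}.{y}.x line.
        found = _last(history, lambda v: v[:2] == (major, minor))
        return [found[1]] if found else []
    if minor > 0:
        # Rule 2: last release on the {x}.x.0 line, plus the last patch of
        # the previous minor if it came out later.
        on_branch = _last(history, lambda v: v[0] == major and v[2] == 0)
        extras = [_last(history, lambda v: v[:2] == (major, minor - 1) and v[2] > 0)]
    else:
        # Rule 3: last release on x.0.0, plus the last minor of the previous
        # major and the last patch of that minor, if they came out later.
        on_branch = _last(history, lambda v: v[1] == 0 and v[2] == 0)
        last_minor = _last(history, lambda v: v[0] == major - 1 and v[1] > 0 and v[2] == 0)
        line = last_minor[1][:2] if last_minor else (major - 1, 0)
        extras = [last_minor, _last(history, lambda v: v[:2] == line and v[2] > 0)]
    cutoff = on_branch[0] if on_branch else -1
    bases = [on_branch[1]] if on_branch else []
    bases += [extra[1] for extra in extras if extra and extra[0] > cutoff]
    return bases
```

With this reading, a 3.0.0 changelog after 2.0.0, 2.1.0, 2.1.1, 2.2.0, and 2.2.2 would be diffed against 2.0.0, 2.2.0, and 2.2.2.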

@saulshanabrook
Member Author

Thank you @blink1073 for the comments and review here! I will update with your suggestions. I am glad it made some sense out of my own head :)

Any other feedback on how to make it more readable is very much appreciated.

@saulshanabrook
Member Author

@blink1073 I updated it to use branch merges to release. One other thing I added is the ability to test out the full release, of JS and Python packages, in the branch PR without actually releasing anything, by publishing the packages to a proxy NPM server. It also adds the built packages as artifacts to that PR action.

Then, once it has been merged, all that needs to be done is grab those artifacts and actually publish them.

Do you think this would work?

@blink1073
Contributor

JupyterLab Git has some dry run logic for both pypi and npm. We don't necessarily need to pull from the test servers, we could have two release targets.

@saulshanabrook
Member Author

We don't necessarily need to pull from the test servers, we could have two release targets.

Yeah I meant we would pull from the github target artifacts.


The last two mean you have to totally finish a release on one branch before you start on the next one. For example, it would be illegal to have releases in this order: `1.2.0a0`, `1.3.0a0`, `1.2.0`, then `1.3.0`. However, you could have `1.2.0a0`, `1.1.1a0`, `1.2.0`, then `1.1.1`, because these would be on different branches, `1.x.0` and `1.1.x`.
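The ordering constraint described in this paragraph can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, the version pattern accepts only `a`, `b`, and `rc` pre-release suffixes, and the branch mapping (`{x}.{y}.x` for patches, `{x}.x.0` for minors, `x.0.0` for majors) is taken from the branch names used in this discussion.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

Version = Tuple[int, int, int]

_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)((?:a|b|rc)\d+)?$")


def parse(release: str) -> Tuple[Version, bool]:
    """Split '1.2.0a0' into ((1, 2, 0), True); the flag marks a pre-release."""
    match = _VERSION.match(release)
    if match is None:
        raise ValueError(f"unsupported version string: {release!r}")
    major, minor, patch, pre = match.groups()
    return (int(major), int(minor), int(patch)), pre is not None


def branch_for(version: Version) -> str:
    """Map a version to the branch it would be released from."""
    major, minor, patch = version
    if patch > 0:
        return f"{major}.{minor}.x"
    if minor > 0:
        return f"{major}.x.0"
    return "x.0.0"


def first_violation(releases: Iterable[str]) -> Optional[str]:
    """Return the first release that starts a new version on a branch
    whose current version has not reached its final release yet."""
    in_progress: Dict[str, Optional[Version]] = {}
    for release in releases:
        version, is_prerelease = parse(release)
        branch = branch_for(version)
        started = in_progress.get(branch)
        if started is not None and started != version:
            return release
        # A pre-release leaves the version open; a final release closes it.
        in_progress[branch] = version if is_prerelease else None
    return None
```

Under these assumptions, the first example order is rejected at `1.3.0a0`, while the second order passes because `1.2.0` and `1.1.1` live on different branches.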

**JavaScript versions In Sync** We are OK always keeping the JS version bumps in sync. Meaning that if we do a major release of the Python package we also do a major release of all JS versions.
Member

My major concern with this point is that it will lead to a lot of dead/outdated extensions. I've made a script to profile the state of all extensions with the keyword "jupyterlab-extension" on npm, and got these results:

Processed 589 extensions:

Up to date (193)

Outdated (379):
  Support ends at v2.x: 3
  Support ends at v1.x: 173
  Support ends at v0.x: 203
Deprecated (4)

Unclassified (13)

(the "Support ends at v2.x" means someone pinned dependency to e.g. "~2.0", so they don't support latest, but support something in 2.x).

I'll do a separate PR to share the script, which should ideally be run regularly so we can track the development over time, especially following new releases.

Contributor

Nice work @vidartf! Those kinds of metrics are important for administrators (and users) looking to decide when to upgrade.

Member

@vidartf vidartf May 13, 2020

@blink1073 Thanks! Any suggestions on where I should put the code?

Member Author

My major concern with this point is that it will lead to a lot of dead/outdated extensions.

I share this concern. I just want to be clear that this PR tries to turn our existing way of doing versioning into a process. It changes a few branch names, and would let us release more often, but shouldn't substantially change the story for extension authors.

I chose to do this because I thought it would be easier to agree on than having a larger discussion around how to change our release process in general, although I think that would also be useful!

Talking with @kgryte about this, we did bat around some of the ideas from the Node release processes.

Member Author

Also yeah, nice analysis! This is really helpful to have...

Member

It just feels like this would formalize something that, in my impression, is not something "we are OK" with.

Suggested change
**JavaScript versions In Sync** We are OK always keeping the JS version bumps in sync. Meaning that if we do a major release of the Python package we also do a major release of all JS versions.

Member Author

Yeah, I agree! I would love to not include this part as well. I remember arguing strongly against it in previous meetings, because it forces extension authors to upgrade.

However, dropping it is untenable, theoretically, if we want to keep doing backport versioning. Or if we use a practical hack, like bumping the new minor version by +10 on the next major version, it definitely makes things more confusing.


In turn, we could actually version each package independently, which I do think would be ideal for downstream extension authors.

From the PR standpoint, we could conceptually have some way of doing this, having three labels per package. notebook-extension-minor or something.

However, when I started thinking through how we maintain backports for this kind of thing, it starts to get a bit... wild. I haven't totally thought this through, but I ended up with a branch for basically every combination of packages and their next release. So maybe like:

notebook-extension-2.1.x-and-notebook-2.1.x-and...

IDK I haven't totally figured it out, as I said, but could explore it more to see if anything reasonable could come out of it, though I am doubtful.

Member Author

Couple of good notes from our meeting here as well:

We would hope to see JupyterLab becoming more stable over time, so locking down a process like the one documented here, which forces extension authors to release a new version every time we have a major release, is pretty harmful.

So we should figure out a way to not do major version bumps when we don't have to. I don't know how this will work at the moment.

Member Author

One way to actually solve this would be to move out of a monorepo, so that each package could have its own repo and be versioned independently. However, this is a non starter.

@saulshanabrook
Member Author

One big blocker for this is figuring out a plan if we want to allow not all NPM package versions to be bumped when we have a major release. Last we chatted, it was preferred to allow this behavior.

However, I am wondering if in our next major release, 3.0.0, we plan on selectively bumping major versions or just bumping all of them. If it is the latter, I wonder if we could consider this plan to at least automate our current process, even if it does force us to bump everything by a major version.

@saulshanabrook
Member Author

saulshanabrook commented Jun 17, 2020

To @vidartf's comment, yes this would be formalizing something we are not OK with.

However, if we are doing it anyway, even if we aren't OK with it, then formalizing it is just being honest with ourselves and the community about the current state of how things work.

I know @blink1073 had a strategy in the past about bumping all minor versions up by ten or something, as a sort of hacky way around this. We could also formalize that kind of workaround instead, if we prefer that. EDIT: I feel like this hack is maybe a bad idea? If people don't pin exactly, and we are releasing things on two strands basically of minor versions for a single package, I could see this getting very confusing..

@blink1073
Contributor

I prefer "very hacky and pragmatic" 😄

@saulshanabrook
Member Author

i.e. to expand on my edit above:

Let's say we release version 3.0.0 and package a currently is at 2.0.0. My understanding of @blink1073's "hacky" approach (lmk if this is correct or not) is to bump a to 2.10.0 for the 3.0 release. Then we can continue having parallel minor releases of package a in the 2.x branch as well as in the new 3.x branch. Is that right?

If so, then let's say we release 2.11.0 as a new minor version of a on 3.0. Then, couldn't extensions built for the 2.x release pull this in if they use the ^2.0.0 version spec? Wouldn't that sorta be bad? Conceptually they would just wanna pull in 2.0.0 up to but less than 2.9.0, because that would be the last minor that could be on the 2 release?
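To make the concern concrete, here is a small Python sketch of how a caret range such as `^2.0.0` is matched (at least the base version, same major). The helper `satisfies_caret` is hypothetical and only covers final versions with a major of 1 or higher; it is not npm's implementation.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def satisfies_caret(version: Version, base: Version) -> bool:
    """Check `version` against a caret range `^base` for major >= 1."""
    if base[0] < 1:
        raise ValueError("this sketch only covers caret ranges with major >= 1")
    # Same major, and not older than the base version.
    return version[0] == base[0] and version >= base
```

So `satisfies_caret((2, 11, 0), (2, 0, 0))` is true: an extension declaring `^2.0.0` would accept a 2.11.0 that was published as part of the 3.0 line.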

@jasongrout
Contributor

If so, then let's say we release 2.11.0 as a new minor version of a on 3.0. Then, couldn't extensions built for the 2.x release pull this in if they use the ^2.0.0 version spec? Wouldn't that sorta be bad? Conceptually they would just wanna pull in 2.0.0 up to but less than 2.9.0, because that would be the last minor that could be on the 2 release?

The whole point here is that some extensions would not need to change across major jlab releases, so indeed a 2.11 extension from jlab 3 could be used in jlab 2. The idea here is that we bump on actual backwards incompatible changes, so if there were no backwards incompatibilities for a particular extension, it doesn't get bumped.

@saulshanabrook
Member Author

The whole point here is that some extensions would not need to change across major jlab releases, so indeed a 2.11 extension from jlab 3 could be used in jlab 2.

Would we release new minor versions though of the previous packages? We currently do this for patch releases, so maybe an example there is more illustrative.

Let's say we have an NPM package at version 2.0.0, and we release a new patch version of it for the 3.0.0 release, so 2.0.10 is the NPM version released with Python 3.0.0.

Now we do some backports into the 2 branch and release a 2.0.1 NPM release of that package.

However, there is now a 2.0.10 version as well, that has the patches we added for the 3.0.0 release. This will now be installed on old users who have the 2.0.0 Python package installed, since it is ahead of 2.0.1 in SemVer terms. This is the issue I mean above. Does that make sense?
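The ordering issue can be illustrated with a short Python sketch. It assumes plain `major.minor.patch` strings and uses a hypothetical helper, `newest_in_line`, that picks the newest published version within one `major.minor` line, which is roughly what a `~2.0.0` range would resolve to.

```python
from typing import Iterable, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '2.0.10' into (2, 0, 10) so that comparison is numeric."""
    return tuple(int(part) for part in version.split("."))


def newest_in_line(published: Iterable[str], line: Tuple[int, int]) -> Optional[str]:
    """Newest published version whose (major, minor) equals `line`."""
    matching = [v for v in published if parse(v)[:2] == line]
    return max(matching, key=parse) if matching else None
```

Given published versions 2.0.0, 2.0.10, and 2.0.1, `newest_in_line(..., (2, 0))` returns `"2.0.10"`, so the backported 2.0.1 would not be the version that gets selected.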

@blink1073
Contributor

If we adopted an "only move forward approach" where there is no 2.x minor release after a 3.x release, then we could avoid the hacky approach. The hacky approach was adopted to enable that capability.

@jasongrout
Contributor

jasongrout commented Jun 17, 2020

If we adopted an "only move forward approach" where there is no 2.x minor release after a 3.x release, then we could avoid the hacky approach. The hacky approach was adopted to enable that capability.

We still have the problem if we release alphas/betas of 3.0 while releasing a new minor 2.x release. For example, if we release an alpha of 3.0, then release 2.2, we still need to handle there being two parallel versions out in the wild.

(so if by "a 3.x release" you mean "any 3.x release, including prereleases", then yes...)

@saulshanabrook
Member Author

saulshanabrook commented Jun 22, 2020

I think I have lost some of the thread of our current discussion around the "hacky approach".

Goal

To step back, the goal here is to articulate a strategy that works for not having to do a major version bump of all packages for each major python release, only those with breaking changes. We still want to do releases of the previous major version while the next major version is in the process of being released and has been released.

Issue

The root issue is that if we do the minimum bump of each package we can during each release, and we then do later releases on the older branch, we have two separate git branches to release the same release line of packages (like 1.2.x or 1.x.0).

Solution

One way to make it safe to keep doing releases on older branches is to make sure they can only upgrade the packages that won't possibly have a duplicate version. The rule would be like this:

If we are doing a major Python release, the next minor or patch versions for NPM packages of the previous major Python version are limited as so:

  1. If the NPM package has a major version bump in the next major Python release, then it can have subsequent minor or patch releases in the previous major Python version.
  2. If the NPM package has a minor version bump in the next major Python release, then it can have patch releases in the previous major Python version
  3. If the NPM package has a patch release, or no release, in the next major Python release, then it cannot have any more releases in the previous major Python version.

If we are doing a minor Python release, the next patch versions for NPM packages of the previous minor Python version are limited as so:

  1. If the NPM package has a minor version bump, it can have subsequent patch releases
  2. If the NPM package has a patch or no version bump, it cannot have any subsequent releases on the old minor Python version.
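The two rule lists above amount to a small lookup table. A Python sketch follows, with hypothetical names (`ALLOWED_ON_PREVIOUS`, `allowed_on_previous`); combinations the rules do not mention, such as a major NPM bump in a minor Python release, are deliberately rejected instead of guessed.

```python
from typing import Dict, FrozenSet

# Kind of Python release -> NPM bump in that release -> kinds of release the
# same NPM package may still get on the previous Python version.
ALLOWED_ON_PREVIOUS: Dict[str, Dict[str, FrozenSet[str]]] = {
    "major": {
        "major": frozenset({"minor", "patch"}),
        "minor": frozenset({"patch"}),
        "patch": frozenset(),
        "none": frozenset(),
    },
    "minor": {
        "minor": frozenset({"patch"}),
        "patch": frozenset(),
        "none": frozenset(),
    },
}


def allowed_on_previous(python_release: str, npm_bump: str) -> FrozenSet[str]:
    """Look up which releases a package may still have on the older line."""
    try:
        return ALLOWED_ON_PREVIOUS[python_release][npm_bump]
    except KeyError:
        raise ValueError(
            f"no rule for NPM bump {npm_bump!r} in a {python_release!r} Python release"
        ) from None
```

For example, `allowed_on_previous("major", "minor")` yields only `"patch"`, matching rule 2 of the first list.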

These seem like the maximally flexible rules that still preserve version consistency. We could opt to not use all this flexibility, however, and trade off some preciseness in our SemVer version bumps for the information we need to persist across time. For example, we could use the strategy:

On every major release, bump all packages a minor release at a minimum. Any packages that have breaking changes, bump as a major version. This means we are free in the old major version to keep making as many patch releases as we like, and minor releases for any package that we bumped a major version.

On every minor release, bump all packages a minor release. This allows us to make subsequent patch releases on any NPM packages on the previous minor release.

In this strategy, we would only have to care about what package was having a major version bump, for patch and minor, they would be global.
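A sketch of this simplified strategy in Python. The function names `plan_bumps` and `apply_bump` and the package names in the usage note are hypothetical; the only input that varies per package is whether it has breaking changes.

```python
from typing import Dict, Iterable, Tuple

Version = Tuple[int, int, int]


def plan_bumps(
    python_release: str,
    packages: Iterable[str],
    breaking: Iterable[str] = (),
) -> Dict[str, str]:
    """Decide the bump for every NPM package in a Python release."""
    breaking = set(breaking)
    if python_release == "major":
        # Everything moves at least a minor; breaking packages move a major.
        return {name: ("major" if name in breaking else "minor") for name in packages}
    if python_release == "minor":
        # Every package moves a minor, regardless of what changed.
        return {name: "minor" for name in packages}
    raise ValueError(f"unsupported Python release kind: {python_release!r}")


def apply_bump(version: Version, bump: str) -> Version:
    """Return the version that results from applying `bump`."""
    major, minor, patch = version
    if bump == "major":
        return (major + 1, 0, 0)
    if bump == "minor":
        return (major, minor + 1, 0)
    if bump == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown bump: {bump!r}")
```

For a major release with packages `a` and `b` where only `b` has breaking changes, `plan_bumps("major", ["a", "b"], {"b"})` gives `a` a minor bump and `b` a major bump, so the old line keeps patch releases for both and minor releases for `b`.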

Otherwise, if we wanted to do the least invasive version bumps necessary, we could track for each PR what version bump it has on every package it touches. And then, we would track on each next release which bumps happen and limit the older releases on the other branches according to the above rules. I.e. if a package had just a patch release in the major python version, then we could not make any more releases of that package in the previous python version. However, if it had a minor, we could make only patch releases. And if it had a major, we could make either minor or patch releases.


Does this reasoning make sense @jasongrout @blink1073 @vidartf @telamonian? If not, I think it might be helpful to arrange a call around this if any of you are interested. It's quite confusing and nuanced...

@saulshanabrook saulshanabrook marked this pull request as draft June 22, 2020 15:20
@telamonian
Member

On every major release, bump all packages a minor release at a minimum. Any packages that have breaking changes, bump as a major version. This means we are free in the old major version to keep making as many patch releases as we like, and minor releases for any package that we bumped a major version.

On every minor release, bump all packages a minor release. This allows us to make subsequent patch releases on any NPM packages on the previous minor release.

This sounds like the most sane (and humane) of the alternatives you present, Saul. Otherwise, I say we just give up and start versioning everything with (pypi_ver, npm_ver) tuples

@saulshanabrook saulshanabrook changed the title Add JupyterLab Development Cycle RFC JupyterLab Development Cycle RFC Jun 22, 2020
@blink1073
Contributor

I agree with @telamonian's assessment.

@saulshanabrook
Member Author

A further question... Should we allow any minor releases on previous major versions? By "previous" I mean like if we have a 3.0.0-alpha0 then the 2 major version is now "previous".

  • Yes: We allow continued minor Python releases on previous versions. However, only the NPM packages which had a major version bump in the first release of the current major Python version (3.0.0-alpha0 in this example) can have minor releases. Those with a minor bump can only have patch releases. This means that when we do our first major version alpha, any packages which we decide not to major version bump at that time (note we can still major version bump them in subsequent major releases, say 3.0.0-alpha1) cannot have ongoing minor releases in the previous major. One way to think about this is that each release line can only exist in one branch at a time. There can only ever be one branch that can have the next release in a certain release line. So if a package gets a minor version bump for the next major, then the next major branch (master/x.0.0) now owns that minor release line and the previous major version can no longer release on it, only allowing patch releases.
  • No: We only allow patch releases in previous Python minor releases. This means that after we release the first alpha of 3.0.0, we cannot release another minor release in the 2.x line. This makes it simpler to track, since we don't have to care which NPM packages had minor/major versions in the next major release, but does limit our ability to keep doing minor releases on the previous major release.

@blink1073
Contributor

As an example, we're using JupyterLab 1.x at AWS, and planning to wait until 3.0 is released to upgrade, since we decided that the 2.0 release did not have enough incentives for users to upgrade and would be too disruptive. I am considering making a 1.3 release to backport a few features while we are waiting.

@saulshanabrook
Member Author

I am considering making a 1.3 release to backport a few features while we are waiting.

OK so you are saying we should allow minor releases on older major versions?

@blink1073
Contributor

OK so you are saying we should allow minor releases on older major versions?

Yep, that's what I'm advocating for.

@saulshanabrook
Member Author

Sounds good. Well then my next step would be to try to put together a plan for how we can achieve this, given the constraints under the "yes" answer above and type that up in the doc.

5 participants