Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

De-duplicate builds #7123

Merged
merged 12 commits into from
Jun 8, 2020
Merged

De-duplicate builds #7123

merged 12 commits into from
Jun 8, 2020

Conversation

humitos
Copy link
Member

@humitos humitos commented May 25, 2020

When a build is triggered, it could be marked as to be skipped by builders if:

  • there is already a build running/queued for the same commit

    Multiple builds of the same commit will lead to the same results. So, we skip
    it if there is one already running/queued.

  • there is already a build queued for the same version

    When building a version without specifying the commit, the last commit is
    built. In this case, we can only skip the new triggered build if there is one
    build queued because both builds will just pick the same commit.

A different approach of #7031

@humitos humitos requested a review from ericholscher May 25, 2020 11:32
When a build is triggered, it could be marked as to be skipped by builders if:

* there is already a build running/queued for the same commit

  Multiple builds of the same commit will lead to the same results. So, we skip
  it if there is one already running/queued.

* there is already a build queued for the same version

  When building a version without specifying the commit, the last commit is
  built. In this case, we can only skip the new triggered build if there is one
  build queued because both builds will just pick the same commit.
@humitos humitos force-pushed the humitos/de-duplicate-builds branch from 3f072f5 to e76a948 Compare May 25, 2020 11:35
@humitos humitos requested a review from a team May 27, 2020 09:44
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good approach. 👍

I think we're hitting issues where we don't have enough build states, and we should probably think more about this. @agjohnson do you have thoughts via the redesign? It seems like "rebuild this commit" and "waiting for concurrency limit" and "cancelled because a commit is already being built" are mostly new states that we don't have a great way to present to users, but we should.

commit,
)
build.error = DuplicatedBuildError.message
build.success = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should make this nullable and set it to None here? It didn't succeed or fail, so it kinda makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to me. However, I don't think we can change it now without changing the UI (knockout code) that renders in the frontend; which I'm not sure if we want to touch at this point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I just worry this will break the badge display.

readthedocs/core/utils/__init__.py Show resolved Hide resolved
@agjohnson
Copy link
Contributor

I'd like to see build state eventually decoupled from build status (or whatever you want to call this concept). How we do this is mostly a matter of modeling, but I see state and status as two competing concepts -- at least as it will apply to our new UX.

I'd describe build state as where in the build process the build is. This allows us to show progression to the user in the form of a progress bar or in the build listing. Our steps now are fairly broad:

  • Queued
  • Cloning
  • Installing
  • Building
  • Uploading
  • Finished

I would describe status as why the build is in a particular state. This is helpful for communicating more details about state to the user, but it doesn't help describe progression. Queued is an important state description, but "Queued" vs "Queue due to build concurrency" is not helpful in describing how far along the build is. Build status would include why a build is queued, why a build was rejected, and error states like why a build failed. Currently, these are all weak concepts in our modeling/UI.

I've long wanted something like a status field on Build with an identifier that maps to a list of build status/exception classes. We would store metadata around the status/exception on builds -- for example save build.status_code = Build.STATUS_TIMEOUT -- not just store the error as a character model field or use some obtuse template logic to display copy that exists in templates only.

I'd propose we stop adding to the list of build states -- that is, describing build flow progression -- and move all reporting of why a build is in a particular state to more flexible modeling.

So, for this particular case, I'd suggest:

  • build.state = 'finished'
  • build.status_code = Build.STATUS_SKIPPED_DUPLICATE.status_code

From there, we have many more options with class attributes/etc to help describe the build in templates.

@humitos
Copy link
Member Author

humitos commented May 28, 2020

@agjohnson super interesting. I'm aligned with what you are saying here.

build.status_code = Build.STATUS_SKIPPED_DUPLICATE.status_code

We don't have a status_code currently. We have a exit_code, but I don't think it means the same. Are you suggesting to add a new field? If that's the case, should we change the UI with this change as well, or you are just thinking to use this new field in the redesign?

@ericholscher
Copy link
Member

I think this makes sense generally.

I also think we need to think more about "success" in this case as well. Here, the build doesn't fail or succeed, but is mostly just a no-op. I think we have weird edge cases here already with badges, where we show failed badges more often than we should because of this issue.

@agjohnson
Copy link
Contributor

Are you suggesting to add a new field?

Sorry, yeah my example was adding a new field for this, which would be separate from exit_code and would have special meaning internally. I'm not sure if we have a good middle ground though, and don't want to hold up this PR on a larger overhaul of error/status reporting in the UI. However, I did remove a lot of the complex template logic used to display build various state blocks from the build detail UI in anticipation of having stronger error/status reporting.

If the the simple, temporary solution is re-using the error field, then I think that is fine. We can prioritize adding some nicer error/status reporting before I get to a more mature release with the templates. I don't think we're talking about a lot of work though, but I'd probably only support better error/status modeling on the new templates

I also think we need to think more about "success" in this case as well

This is a great point as well. We might want to turn build.success from a boolean into a select field of failed, success, skipped, and maybe even unknown. I'd like to see skipped builds as a different state than builds that succeeded or failed.

@humitos
Copy link
Member Author

humitos commented Jun 3, 2020

I made some improvements here and did some more testing locally. It's ready for a re-review.

it's easier to test with multiple single process builders (see readthedocs/common#61 (comment))

Build output's page looks like this:

Screenshot_2020-06-03_17-50-17

If we want to change how the error is displayed this is the code generates the Error section.

  • "Error" title is hardcoded
  • Build.error is what it's shown to the user

The Build list's page shows "Passed" and "Failed" for each build. If we want to change that this is the code that generates that

  • it's based on state='finished' and success
  • we could use status_code instead

In both cases, we could inject these Exception classes the template context to share the string/error/message to display.

I'm still not sure if we want to do these template changes in this PR or not (cc @agjohnson). We could keep everything the same and make it behaves different if build.status_code != None to keep backward compatibility.

@humitos humitos marked this pull request as ready for review June 3, 2020 16:02
@humitos humitos requested review from ericholscher and agjohnson June 3, 2020 16:02
@@ -59,6 +59,8 @@ class BuildMaxConcurrencyError(BuildEnvironmentError):

class DuplicatedBuildError(BuildEnvironmentError):
message = ugettext_noop('Duplicated build.')
exit_code = 1
status_code = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a constant probably, and something other than 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an Enum for this. I don't think it's a problem if it's 0. Why did you think it's better to have it different than 0? and what value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because 0 to me means "normal" status -- we want a status that represents the issue of "duplicated build", which should be assigned a specific number.

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach seems reasonable, so I'm 👍 on shipping it, but there is definitely more work to be done here. I think fixing the modeling would be the first thing, and then we can fix the display in the new UI.

@humitos humitos closed this Jun 8, 2020
@humitos humitos reopened this Jun 8, 2020
@humitos
Copy link
Member Author

humitos commented Jun 8, 2020

Hrm... I'm not sure what happened with Travis 🤷

humitos added 2 commits June 8, 2020 14:07
Django does not convert Enum automatically to its value :/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants