Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More logic changes to error reporting, cleanup #3310

Merged
merged 9 commits into from
Dec 7, 2017

Conversation

agjohnson
Copy link
Contributor

  • Alter project.task logic and include multiple points of error reporting during
    build task
  • Build task returns true or false, not project data, which is unused anyways
  • New pattern for error messages
  • Localize error messages?
  • Make more specific error messages for repository error and project error
  • Remove unnecessary git/hg/bzr/svn exit codes and superfluous command data from
    user errors
  • Add handling of git prompt for git 2.3+
  • Add errors for docker connection
  • Handle reporting on not able to connect to docker

@agjohnson agjohnson added the PR: work in progress Pull request is not ready for full review label Nov 23, 2017
@agjohnson
Copy link
Contributor Author

agjohnson commented Nov 23, 2017

Marked WIP, because:

  • Regression in reporting build finished in run_setup()
  • Not unittested yet
  • Need to play with exception translation pattern -- ultimately not worried about this yet, it will be an eventual improvement however.
  • Need to do more to catch an API exception with docker on memory issues and report this
  • Should actually write unittests around all of this

@agjohnson
Copy link
Contributor Author

Also, I'm thinking that we need to move the vcs execution into a build environment, the error reporting with git stderr is sloppy, but there's no reason we can't just show this step to users. This will be a separate ticket.

except DockerAPIError:
# Catch direct failures from Docker API, but also requests exceptions
# with the HTTP request
except (DockerAPIError, ConnectionError):
log.error(LOG_TEMPLATE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it better here to use log.exception ?


pass
class ProjectConfigurationError(BuildEnvironmentError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class is missing the get_default_message method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be inheriting this from BuildEnvironmentException

@@ -58,10 +59,8 @@ def repo_exists(self):
def fetch(self):
code, _, err = self.run('git', 'fetch', '--tags', '--prune')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err could be _ now that's not used

@@ -79,16 +78,7 @@ def clone(self):
code, _, err = self.run('git', 'clone', '--recursive', '--quiet',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err could be _ now that's not used

sterr=err
)
)
raise RepositoryError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too much simpler that what we had, but what happen if there is another problem different than a private repository. I mean, the error message is very generic and we are not checking a specific error code:

For example:

[humitos@julia:tmp]$ GIT_TERMINAL_PROMPT=0 git clone https://github.com/readthedocs/readthedocs-corporate
Clonando en 'readthedocs-corporate'...
fatal: could not read Username for 'https://github.com': terminal prompts disabled
[humitos@julia:tmp]$  echo $?
128

Ha, it doesn't seem to have a different error code:

[humitos@julia:tmp]$ LANG=C git clone [email protected]:rtfd/readthedocs
Cloning into 'readthedocs'...
ERROR: Repository not found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
[humitos@julia:tmp]$ echo $?
128

Anyway, I think this is not really important if we don't have history/logs failing at these points because something different than a private repository.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but this would be out of scope for this work, and would be best served by surfacing the commands we execute to users. Currently, the checkout step is hidden because it uses it's own shell execution. This could be easily moved into a BuildCommand executed through the setup environment. This way, users can actually see the output in a meaningful way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your proposal of moving it inside a BuildCommand and I agree it could be done in another PR.

@@ -127,24 +126,40 @@ def run(self, pk, version_pk=None, build_pk=None, record=True,
self.config = None

setup_successful = self.run_setup(record=record)
if setup_successful:
self.run_build(record=record, docker=docker)
if not setup_successful:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is something missing here. If the build fails because of the setup at this point: https://github.com/rtfd/readthedocs.org/pull/3310/files#diff-b9399e1d3499066c5564f98a620e8881R198

we are not raising any exception, so run_setup method just returns False and the task return False but nobody is updating the self.setup_env.build['error'] as we do when unhandled exception is catched at https://github.com/rtfd/readthedocs.org/pull/3310/files#diff-b9399e1d3499066c5564f98a620e8881R137 or https://github.com/rtfd/readthedocs.org/pull/3310/files#diff-b9399e1d3499066c5564f98a620e8881R154

Also, I think that's better to move this line https://github.com/rtfd/readthedocs.org/pull/3310/files#diff-b9399e1d3499066c5564f98a620e8881R207 inside this if, so we try to update the build state in not so many places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Links above have disconnected here, so hard to follow what's happening, but i think i understand this feedback.

So update of the build now always happens on build env __exit__, so we can just return false from the task.

In the case that another exception is raised outside the env, we still run update_build from the exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this logic ^^ documented somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, on the environment class. I can expand more.

humitos added a commit that referenced this pull request Nov 27, 2017
I'm using BuildEnvironmentError for now, but we should use
ProjectConfigurationError once the PR got merged:

#3310
@agjohnson agjohnson force-pushed the agj/more-build-error-reporting branch from ccc6fa6 to 95b94b1 Compare November 27, 2017 22:44
@ericholscher
Copy link
Member

Feel like this is a biggish change, which might conflict over time -- should try and get tests passing and work done, and then I'll review it.

* Alter project.task logic and include multiple points of error reporting during
  build task
* Build task returns true or false, not project data, which is unused anyways
* New pattern for error messages
* Localize error messages?
* Make more specific error messages for repository error and project error
* Remove unnecessary git/hg/bzr/svn exit codes and superfluous command data from
  user errors
* Add handling of git prompt for git 2.3+
* Add errors for docker connection
* Handle reporting on not able to connect to docker
* Add finalize argument to environments
@agjohnson agjohnson force-pushed the agj/more-build-error-reporting branch from 95b94b1 to 47df9ee Compare November 29, 2017 22:28
@agjohnson
Copy link
Contributor Author

agjohnson commented Nov 29, 2017

Test failure was unrelated. Work is rebased to get CI fix

Our old tests weren't correctly testing the api calls at all. This expands our
environment tests to actually test for build state updates.
@agjohnson
Copy link
Contributor Author

Unittests around doc build environments have been made working, and more have been added.

a9c2a22 is lint clean up, review that separately.

@agjohnson agjohnson added PR: ready for review and removed PR: work in progress Pull request is not ready for full review labels Nov 30, 2017
@ericholscher
Copy link
Member

How do I review it seperately?

@humitos
Copy link
Member

humitos commented Nov 30, 2017

You can select a range of commits to review from the top left dropdown: https://github.com/rtfd/readthedocs.org/pull/3310/files/0fd836669a2a3f87866634c534d3a32c7dff462b

(just the 2 commits to review with logic)

@ericholscher
Copy link
Member

@humitos thanks!

@@ -294,7 +297,7 @@ def __enter__(self):

def __exit__(self, exc_type, exc_value, tb):
ret = self.handle_exception(exc_type, exc_value, tb)
self.build['state'] = BUILD_STATE_FINISHED
self.update_build(BUILD_STATE_FINISHED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this post that status to the API? Seems like we don't want to publish here, but only below if finalize is False.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalize check is in update_build, so it should only post if finalize is True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah ok. Lost that in the review.

except BuildEnvironmentError:
# There may have been a problem connecting to Docker altogether, or
# some other handled exception here.
self.__exit__(*sys.exc_info())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this not automatically call __exit__?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't, because this is raised in __enter__. Errors raised in __enter__ raise to the parent context, so we have to explicitly clean up here.


pass
class ProjectConfigurationError(BuildEnvironmentError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we have a BuildEnvironmentException and a BuildEnvironmentError, what is the difference? Seems confusing..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception is the base class, Error is the specific class that kills a build and reports to a user, Warning also is absed on Exception and doesn't kill the build.

PRIVATE_REPO = _(
'There was a problem connecting to your repository, '
'ensure that your repository URL is correct.'
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a private repo, shouldn't we warn them here? Why are these messages the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't thoroughly testing if this was a private repo or not, we'd have to grep the git output. @humitos raised a similar question and the best answer is to execute checkout in a build environment and report the commands, not parse them to determine what the error was.

Messages aren't the same, but usage could be clearer. I should probably rename these constants, as PRIVATE_REPO is raised if we support private repositories -- that is, we tell the user "check your url, because it wasn't the repository privacy that caused the problem. PUBLIC_REPO error is "check your url or your project privacy, because we don't support private repositories."

In fact, I'll update the copy here to be more explicit about repo privacy.

@@ -127,24 +126,40 @@ def run(self, pk, version_pk=None, build_pk=None, record=True,
self.config = None

setup_successful = self.run_setup(record=record)
if setup_successful:
self.run_build(record=record, docker=docker)
if not setup_successful:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this logic ^^ documented somewhere?

'Please include the build id ({build_id}) in any bug reports.'.format(
build_id=build_pk
))
self.build_env.update_build(BUILD_STATE_FINISHED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place we're updating build finished..?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is what we want, no? This is the fallback that was added in case there are any exceptions outside of the build environment __enter__, normal context, or in __exit__, such as in run_build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I got lost when reviewing this for some reason, and didn't realize where everything was getting called from.

'setup': u'',
'output': u'',
'state': u'finished',
'builder': u'foo'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these listings of massive argument lists seem brittle (it will break whenever we add/change/remove/ an argument). Can we just check for the actual value we care about being called, presumably error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test are failing because of this, in fact:

'length': <ANY>

Besdies, it's complicated to find the differences when it fails. In case you want to leave as it is now, I'd suggest to use assertDictEqual

"Failed to get code from '%s' (git fetch): %s\n\nStderr:\n\n%s\n\n" % (
self.repo_url, code, err)
)
raise RepositoryError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to lose the specifics of these errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think surfacing an error that is helpful provides more user guidance -- that is, showing a failed checkout command isn't clear to the user that the problem is their configuration. I think the ultimate fix here is to do both and show the command execution as a BuildCommand. I removed this for now, for consistency on errors and because displaying the stderr output like this looked sloppy. I think we can pretty quickly move towards combining execution environments though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that having both in the future is better.

I like the idea of communicate the "common problem" to the user in a very simplified way, but also show the output (that could be used when debugging something and could help the user to understand it better -also if the problem is something different that we are reporting as simplified way)

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to take a look at finalize new attribute since I think that doesn't work as you expect.

I'm not sure to understand how the test works, but since I think that attribute breaks the intermediate state of the building I would sugget to write a test to check that.

trace = sys.exc_info()[2]
six.reraise(ProjectImportError('Conf file not found'), None, trace)
six.reraise(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this six.reraise? It's not the same than just raise ProjectImportError(...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure either, the docs aren't clear to me:
https://pythonhosted.org/six/#six.reraise

I'll leave it, assuming someone with more knowledge of py2/3 compat did this :)

@@ -275,15 +276,17 @@ class BuildEnvironment(object):
:param build: Build instance
:param record: Record status of build object
:param environment: shell environment variables
:param finalize: finalize the build by setting a finished state on exit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sentence is True only if the update_build method is called with the BUILD_STATE_FINISHED status, otherwise that method will just update the build, but not finalize it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah true. I didn't like finalize either. I'll rename this commit to match the Django usage. Would this change make more sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the commit should be in the update_build method to avoid the other problems that I mentioned

try:
client.kill(self.container_id)
except DockerAPIError:
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about logging something here (DEBUG level, maybe)?

log.info('Removing container %s', self.container_id)
client.remove_container(self.container_id)
# Catch direct failures from Docker API, but also requests exceptions
# with the HTTP request. These should not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"These should not" ?

),
)
self.container = None
except BuildEnvironmentError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following this. I think it's not possible to BuildEnvironmentError be raised in the try: block here. What could be the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_client raises a BuildEnvironmentError on failure here. I'll rethink refactoring this as well though, perhaps a more specific exception, or allowing the docker exception to bubble up makes more sense.

if unhandled_failure:
self.build_env.build['error'] = unhandled_failure
self.build_env.update_build(BUILD_STATE_FINISHED)
self.setup_env.update_build(BUILD_STATE_FINISHED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since self.setup_env was instantiated with finalize=False this won't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good catch!

@@ -297,19 +313,10 @@ def setup_vcs(self):
self.setup_env.update_build(state=BUILD_STATE_CLONING)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a problem here. The build won't update with this status since it's finalize=False.

* Removes some redundant calls to update build objects
* Updates some docs
* Logic behind update_build is improved to update in more cases
* Tests for incremental updates
@agjohnson
Copy link
Contributor Author

Okay! There is even more logic now. I believe I've addressed all of the feedback and have dropped in a bunch more docs while this is fresh. One last pass would be a good idea

@humitos
Copy link
Member

humitos commented Dec 1, 2017 via email

@agjohnson
Copy link
Contributor Author

Pinging @humitos ☎️

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to block this PR. I think we can continue from here in another PR later.

I left some comments that can be done in this PR and some, though. Like typos, messages in logs or docstrings.


# Attempt to stop unicode errors on build reporting
for key, val in list(self.build.items()):
if isinstance(val, six.binary_type):
self.build[key] = val.decode('utf-8', 'ignore')

if self.finalize:
# We are selective about when we update the build object here
update_build = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of any is clearer here, instead of ors


We only update the build through the API in one of three cases:

* The build is not done and needs incremental builds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is an incremental build?


:param project: Project that is being built
:param version: Project version that is being built
:param build: Build instance
:param record: Record status of build object
:param environment: shell environment variables
:param commit: update the build object via API if the build was successful
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still confusing.

When commit=True (the default) and we are in the middle of the build, the build won't be updated. So, I think this should have a better name (if that is possible) like last_command, since I think it's more representative of what it measn.

Anyway, I don't see the best solution very clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably the most explicit we can is update_on_success

for the build)
This step is skipped if we aren't recording the build. To avoid
recording successful builds yet (for instance, running setup commands for
the build), set the ``commit`` argument on environment instantiation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set the commit argument to False

try:
api_v2.build(self.build['id']).put(self.build)
except HttpClientError as e:
log.error("Unable to post a new build: %s", e.content)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to update the build: and maybe add the build id also to the log

client.kill(self.container_id)
except DockerAPIError:
log.exception(
'Unable to remove container: id=%s',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to kill the container

LOG_TEMPLATE.format(
project=self.project.slug,
version=self.version.slug,
msg='Could not connection to Docker API',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could not connect

msg=e,
),
)
raise BuildEnvironmentError('There was a problem connecting to Docker')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the problem is similar than https://github.com/rtfd/readthedocs.org/pull/3310/files#diff-ca52b098301dd315a834b3556ab9a7d5R638 but in this case we are communicating the Docker error to the user.

@agjohnson agjohnson force-pushed the agj/more-build-error-reporting branch from 14899cc to 3bf4b9c Compare December 7, 2017 16:11
@agjohnson
Copy link
Contributor Author

@humitos Thanks for the thorough feedback! I'm just waiting on a final CI test here. If changes look good, lets merge!

@agjohnson agjohnson merged commit ba20191 into master Dec 7, 2017
@agjohnson agjohnson deleted the agj/more-build-error-reporting branch December 7, 2017 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants