Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve validation error message for missing data type #478

Merged
merged 3 commits into from
Dec 2, 2020

Conversation

dsleiter
Copy link
Contributor

@dsleiter dsleiter commented Nov 26, 2020

Motivation

Fix #284 by improving the validation error message shown when a missing data type has a name.

How to test the behavior?

from hdmf.spec import GroupSpec, SpecCatalog, SpecNamespace
from hdmf.build import GroupBuilder
from hdmf.validate import ValidatorMap

def get_vmap(specs):
    spec_catalog = SpecCatalog()
    for spec in specs:
        spec_catalog.register_spec(spec, 'test.yaml')
    namespace = SpecNamespace(
        'a test namespace', 'test_namespace', [{'source': 'test.yaml'}], version='0.1.0', catalog=spec_catalog)
    return ValidatorMap(namespace)

def build_specs():
    bar = GroupSpec('A test group specification with a data type', data_type_def='Bar')
    foo = GroupSpec('A test group that contains a data type', data_type_def='Foo',
                groups=[GroupSpec('A Bar group for Foos', name='my_bar', data_type_inc='Bar')])
    return (bar, foo)

foo_builder = GroupBuilder('my_foo', attributes={'data_type': 'Foo'})
vmap = get_vmap(build_specs())
vmap.validate(foo_builder)

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

* Fix hdmf-dev#284
* Add optional parameter `missing_dt_name` to validation error MissingDataType
* Add tests for bug fix
@rly rly closed this Nov 30, 2020
@rly rly reopened this Nov 30, 2020
@rly
Copy link
Contributor

rly commented Dec 1, 2020

Thanks for the PR, @dsleiter ! Your changes look great.

Normally the codecov app leaves a comment with code coverage information for the PR, but for reasons below, that is not being triggered by CI. Here it is for this PR: https://app.codecov.io/gh/hdmf-dev/hdmf/compare/478/diff It's often helpful for us to look at codecov to make sure that the changes are actually tested.

The change on line 475 of validate.py is not hit by tests, and further inspection reveals that it is not even possible to reach lines 469-477. I raised this in #479. No need to take action about this.

Just to document a couple meta notes from my Slack discussion with @dsleiter :

@rly
Copy link
Contributor

rly commented Dec 1, 2020

I'll merge this after the other developers have a chance to look this over in case they would like to.

@rly
Copy link
Contributor

rly commented Dec 1, 2020

Update: it looks like I can edit files in the PR and commit to the branch, so the fix to allow edits from maintainers worked.

Copy link
Contributor

@oruebel oruebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@dsleiter
Copy link
Contributor Author

dsleiter commented Dec 1, 2020

Thanks for the reviews @rly and @oruebel!

Good catch on unreachable code block @rly. I was looking at that last week while working on #197 and trying to understand the validator code, but switched to this task and then forgot to ask. Using coverage diff seems like a much better way to catch this kind of thing. For now, I'll check the coverage diff locally before making the PR.

In terms of getting my PRs to trigger the CI jobs, I don't think I can push branches to this repo - I believe I'm a member of the organization, but not the repo. If you'd like to give me write access, then could you please verify the permissions?

But this issue is likely to pop up again with other contributors, so it might be better just to build this into the CI pipeline, what do you think? It might just be a matter of adding a manual approval step to the circleci config (type: approval) triggered on PRs forked repos. I'm not certain that gets around all of the security concerns alone or if we'll have to implement something else, but I can take a look tomorrow and open up an issue with a suggestion if I find something that could work.

@dsleiter
Copy link
Contributor Author

dsleiter commented Dec 2, 2020

Looking deeper into the CircleCI issue, it looks like you could enable "Build forked bull requests" but keep "Pass secrets to builds from forked pull requests" disabled. None of the currently configured jobs that build on a branch/PR seem to require any secrets. I was able to actually to run all jobs on agencyenterprise CircleCI account from a forked PR without configuring any environment variables, and it even pushed coverage directly to our codecov account without configuring any tokens.

The only jobs which seem like they would fail are deploy-dev and deploy-release, neither of which would trigger from a forked PR under normal circumstances anyway.

If in the future you wanted to add steps which should not run on forked PRs, you could add a step like early_return_for_forked_pull_requests here: https://circleci.com/blog/managing-secrets-when-you-have-pull-requests-from-outside-contributors/

In terms of the azure builds, it seemed like they were being triggered properly from a forked PR, although I never checked the details to make sure they actually completed.

Do you think we could enable "Build forked bull requests" to give it a test?


As an aside, I did notice that when our CirleCI was triggered on dev, the final status of job deploy-dev was a success even though it wasn't able to publish (clearly). publish_github_release printed

skipping: A token is expected.

Specify the --token parameter or set GITHUB_TOKEN environment variable.
See https://help.github.com/articles/creating-an-access-token-for-command-line-use/

so it wasn't getting the token, but it returned with an exit code of 0, so CircleCI still marked it as success. Maybe this isn't worth spending time fixing since it shouldn't ever run that job on from a forked PR, but there might be other reasons why publish_github_release could fail without failing the job, so I thought I'd mention it.

@rly
Copy link
Contributor

rly commented Dec 2, 2020

Looking deeper into the CircleCI issue, it looks like you could enable "Build forked bull requests" but keep "Pass secrets to builds from forked pull requests" disabled.

That's how it is currently configured. The CircleCI docs also say:

Note: If a user submits a pull request to your repository from a fork, but no pipeline is triggered, then the user most likely is following a project fork on their personal account rather than the project itself of CircleCi, causing the jobs to trigger under the user’s personal account and not the organization account. To resolve this issue, have the user unfollow their fork of the project on CircleCI and instead follow the source project. This will trigger their jobs to run under the organization when they submit pull requests.

Can you see if this is the case for you for the next PR that you create?

None of the currently configured jobs that build on a branch/PR seem to require any secrets.

The way that we have been calling codecov requires passing a private token, at least for Azure and I thought for Circle as well. You can see this log with a codecov upload error from your earlier commit on this PR. We can easily get around that in several ways, and I propose one in #481.

The only jobs which seem like they would fail are deploy-dev and deploy-release, neither of which would trigger from a forked PR under normal circumstances anyway.

Good point.

there might be other reasons why publish_github_release could fail without failing the job, so I thought I'd mention it.

Thanks for letting us know! And thanks for looking into these issues and possible solutions in depth.

@rly
Copy link
Contributor

rly commented Dec 2, 2020

re: publish_github_release returning exit code 0 when the token is not found, it looks like we have set the flag --exit-success-if-missing-token. If you have a second to test this, can you remove that line from .circleci/config.yml in a PR from your fork and see if the CI job still succeeds?

@rly rly merged commit 2924f4f into hdmf-dev:dev Dec 2, 2020
dsleiter added a commit to agencyenterprise/hdmf that referenced this pull request Dec 2, 2020
@dsleiter
Copy link
Contributor Author

dsleiter commented Dec 2, 2020

@rly gotcha, I didn't realize it was already configured to build forked PRs. Good catch on the documentation about following the project, seems like that could have been the cause. I unfollowed our fork and started followed hdmf-dev/hdmf, so hopefully it will work on the next PR.

For codecov, I did notice in this documentation that at least for the bash uploader, an upload token may not be required: https://app.circleci.com/pipelines/github/agencyenterprise/hdmf

Not Required for some CI Providers

If you have a public project on TravisCI, CircleCI, AppVeyor, Azure Pipelines, or GitHub Actions an upload token is not required.

And during my test, it did upload coverage to agencyenterprise codecov without me configuring a token, so maybe it will work (at least from circleci). Let's first get circleci running, and then see if it runs codecov.

I also tried to remove --exit-success-if-missing-token as you suggested on the dev build on my fork, and the job then failed as expected. I'll make a PR here with that in order to test everything here.

@dsleiter dsleiter deleted the bug/missing_data_type_message branch December 2, 2020 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Validator says missing data type when dataset name is incorrect
3 participants