Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCHEMA] Add metadata term files #774

Merged
merged 76 commits into from
Jul 13, 2021
Merged

[SCHEMA] Add metadata term files #774

merged 76 commits into from
Jul 13, 2021

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Apr 13, 2021

References #699. Continues from #762.

Changes proposed:

  • Take the metadata JSON files from the bids-validator and convert them to YAML format.
  • Add "description" and "key_name" fields to the metadata YAML files.
  • Add functions to render the metadata information in tables within the schema. These functions require the specification text to include the requirement levels.
  • Create files for a number of metadata terms that aren't present in the validator, but are present in specification tables.

To do:

Reviewing

I think the best approach to reviewing this PR is to (1) review the proposed metadata schema structure defined in the schema README and a handful or randomly chosen example terms, and (2) compare the rendered metadata tables against the hardcoded ones in the generated specification. These tasks can be split across multiple reviewers.

  • 02-common-principles
  • 03-modality-agnostic-files
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Common Metadata Fields
    • Scanner Hardware
    • Sequence Specifics
    • remaining sections
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Anatomy imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Task (including resting state) imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Diffusion imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Arterial Spin Labeling perfusion data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Fieldmap data
  • 04-modality-specific-files/02-magnetoencephalography
  • 04-modality-specific-files/03-electroencephalography
  • 04-modality-specific-files/04-intracranial-electroencephalography
  • 04-modality-specific-files/08-genetic-descriptor
  • 09-positron-emission-tomography
  • 05-derivatives/02-common-data-types
  • 05-derivatives/03-imaging
  • 99-appendices/11-qmri
  • schema/README.md : The README describing the schema structure.
  • The concept of path "formats". I tried to break down the types of paths that are possible and give each one its own name, but that was before [ENH] Major update to pointing to files within, outside of, or remote to a current BIDS dataset #820. If possible, could someone who is familiar with [ENH] Major update to pointing to files within, outside of, or remote to a current BIDS dataset #820 take a look at these? They're discussed in the README file.

* Draft a handful of metadata term files.

* Add example with specific possible values.

* Match the validator schemas better.

* Fix formatting.

* Fix formatting again!

* Draft semi-functional rendering functions.

* Use unit abbreviations.

* Add more fields.

* Get macro working.

* Add terms from first table.

* Add AnatomicalLandmarkCoordinateSystem.

* fMRI task information table.

* More terms.

* More terms.

* More terms.

* EchoTime and FlipAngle

* Add tables.

* More tables.

* More terms.

* More terms.

* More terms.

* Fix spacing.

* Clean things up.

* More terms.

* More terms!

* More terms.

* Some iEEG terms.

* More iEEG terms.

* Add ASL labeling terms.

* Next batch.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* Last terms.

* Reference yamls.

* Fix mistakes.

* Change format of coordinate system files.

* Use degree in associated files.

* Some of the missing terms.

* A few more terms.

* Fix typos.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Fix link.

* Fix internal links.

* Fix links for real.

* Derivative terms.

* Fix up code link.

* Use backslashes for continued strings.

* Replace $ref with file contents.

Also support plural datatype strings and all manner of newlines in descriptions.

* Fix genetics.

* Describe the structure of metadata YAML files.

* Make metadatatype function recursive.

* Improve search function.

* Start adding PET fields.

* Add some fields.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* More terms.

* Replace InstitutionDepartmentName with existing InstitutionalDepartmentName.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Add unit format for strings.

Unused for now but could be useful later.

* Add dataset_relative and participant_relative string formats.

* Update READMEs.

* Fix formats in README.

* Support table-specific metadata description extensions.

* Employ description extensions with IntendedFor.

* Remove explicit defaults from YAML files.

* Replace Minimum with minimum.

* Replace inclusiveMaximum with maximum.

* Replace implicit links with explicit ones.

* Rename key_name to name.

* Rename "Unit" to "Units"
@tsalo
Copy link
Member Author

tsalo commented Apr 13, 2021

Now that these changes are on a branch of the main repository, it should be easier for other folks to contribute directly. The main thing that I could use help with is generalizing the base metadata definitions when necessary and appending section-specific information to those base definitions in the macro calls.

@tsalo tsalo added the schema Issues related to the YAML schema representation of the specification. Patch version release. label Apr 13, 2021
@tsalo
Copy link
Member Author

tsalo commented Apr 26, 2021

Okay, I think this is finally ready! @bids-standard/maintainers if any of you have the time, I'd appreciate your review.

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started by reviewing the code. Have not yet looked through the schema files or checked all of the rendered tables.

One thing I worry about, after moving to this, is that minor bugs in the code may result in visible changes to a table without being obviously part of the review process. Do we have a strategy to note when a change to the schema code produces a change in the generated markdown?

tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
tools/schemacode/schema.py Outdated Show resolved Hide resolved
Thanks @effigies!

Co-authored-by: Chris Markiewicz <[email protected]>
@effigies
Copy link
Collaborator

Quick note: I ran across yaml.safe_load which has been around forever as a convenience function equivalent to partial(yaml.load, loader=yaml.SafeLoader).

Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finished my run through the remaining MRI sections

src/schema/metadata/CASLType.yaml Outdated Show resolved Hide resolved
Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now finished going through all changes and provided that my remaining comments are addressed, I think we can merge this 👍

This is a great piece of work @tsalo, thanks a lot.

@tsalo
Copy link
Member Author

tsalo commented Jul 7, 2021

@Remi-Gau @effigies In yesterday's maintainers call, we decided that it would be unrealistic to request a full review from anyone else, so instead of reviewing all of the content we were wondering if you'd be willing to do more of a conceptual review or a code review. We should have plenty of time before the next release for folks to identify any issues with the content, so any small content issues can be resolved after this is merged. WDYT?

@Tokazama
Copy link
Member

Tokazama commented Jul 7, 2021

Is there any reason that all the metadata files are separate instead of consolidated into one file (or even just a few)?

@tsalo
Copy link
Member Author

tsalo commented Jul 8, 2021

The biggest reason is that I find individual files easier to work with than one big file. While the idea of a few files is appealing, I don't think it would work well for metadata. The obvious delineation is based on data type, but unfortunately many metadata terms can be used for multiple data types.

@Remi-Gau
Copy link
Collaborator

Remi-Gau commented Jul 9, 2021

@Remi-Gau @effigies In yesterday's maintainers call, we decided that it would be unrealistic to request a full review from anyone else, so instead of reviewing all of the content we were wondering if you'd be willing to do more of a conceptual review or a code review. We should have plenty of time before the next release for folks to identify any issues with the content, so any small content issues can be resolved after this is merged. WDYT?

Good with me though I will only be back "on duty" later this month (26th) so don't let my silence hold you back on the merge anyway (plus most of my comments will be on formatting and missing comas most likely).

Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsalo could you please update this PR with respect to

Then I'd be fine with removing the hard-coded tables and merging, as Chris and Remi are currently unavailable and we didn't plan on having a "deep" review anyhow (that's already happened).

@tsalo
Copy link
Member Author

tsalo commented Jul 13, 2021

Awesome! I incorporated the changes for the two PRs, and looked at the rendered docs. The docs look good, so I'm going to remove the hardcoded tables now.

@tsalo
Copy link
Member Author

tsalo commented Jul 13, 2021

Okay the tables have been removed and the CI is passing.... should I merge? Is it time?!

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a code review. Only one suggestion, really. I haven't tested, so you may want to test before merging. Or reject if it seems less clear.

tools/schemacode/schema.py Outdated Show resolved Hide resolved
Co-authored-by: Chris Markiewicz <[email protected]>
@tsalo tsalo mentioned this pull request Jul 13, 2021
3 tasks
@tsalo
Copy link
Member Author

tsalo commented Jul 13, 2021

The rendered docs look good with the new changes, so I'm merging!

@effigies
Copy link
Collaborator

Seems to work. If you're happy to merge, fire away!

@tsalo tsalo merged commit 5de7cfc into master Jul 13, 2021
@tsalo tsalo deleted the metadata-schema branch July 13, 2021 16:47
@sappelhoff
Copy link
Member

😎 nice work! Very happy to see this in

@tsalo tsalo added the schema-code Updates or changes to the code used to parse, filter, and render the schema. label Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema Issues related to the YAML schema representation of the specification. Patch version release. schema-code Updates or changes to the code used to parse, filter, and render the schema.
Projects
Development

Successfully merging this pull request may close these issues.

7 participants