deploy: 1c8a039
stuchalk committed Jan 31, 2024
1 parent 4d4178a commit 83adaab
Showing 40 changed files with 280 additions and 450 deletions.
58 changes: 42 additions & 16 deletions _sources/about_cookbook.md
@@ -1,20 +1,46 @@
# About this Cookbook

The IUPAC FAIR Chemistry Cookbook is intended to be an open, collaborative, community focused resource on working
with FAIR data in chemistry. This community resource aims to provide a range of practical and re-usable training
materials that demonstrate how to manage digital data files and content. Our goal is to get more practical tools
& tips in the hands of practicing chemists and others working with digital chemical data, to lower barriers and
smooth the adoption of best practices for sharing and reusing FAIR chemical data. The content primarily consists
of actionable recipes for a range of tasks to prepare and deposit FAIR machine-enabled chemical data, identify
and extract chemically relevant metadata, and compile and validate chemical data files using online tools.
This resource was initially formulated as an output of the [IUPAC WorldFAIR Chemistry project](https://iupac.org/project/2022-028-1-024/), for the
[WorldFAIR initiative](https://worldfair-project.eu/) (see below). The IUPAC FAIR Chemistry Cookbook is intended to support the broader
community in understanding how to work with machine-readable chemical data and implement the FAIR data principles.
The site is designed to be a living community resource through the addition of new content as strategies evolve
and the sharing and reuse of FAIR chemical data continues to increase. Feedback and contributions are welcome.

FAIR data are findable, accessible, interoperable, and reusable for machine processing {cite:p}`Wilkinson2016`.
FAIR chemical data need to be machine-readable, and this can be an unfamiliar scenario for many researchers
and other stakeholders involved with publishing and managing experimental data. This cookbook aims to support
best practices for sharing and reusing chemical data aligned with the technical criteria for FAIR
machine-readable data. Practical, interactive tutorials based on common workflows and readily accessible
online tools for working with digital content augment broader guidance.
## Project contributors

The IUPAC FAIR Chemistry Cookbook is designed to be an evolving resource for the chemistry community. It is
supported by the International Union of Pure and Applied Chemistry (IUPAC) as part of the WorldFAIR
Initiative (see About this project).
- Stuart Chalk (Project Lead), University of North Florida
- Ann-Christin Andres, Johannes Gutenberg University Mainz
- Simon Coles, University of Southampton
- Jordi Cuadros, IQS Universitat Ramon Llull
- Sonja Herres-Pawlis, RWTH Aachen University
- John Jolliffe, Johannes Gutenberg University Mainz
- Sunghwan Kim, National Center for Biotechnology Information, National Institutes of Health
- Nicola Knight, University of Southampton
- Ken Kroenlein, Citrine Informatics
- Ye Li, Massachusetts Institute of Technology
- Leah McEwen, Cornell University
- Samuel Munday, University of Southampton
- Fatima Mustafa, Texas A&M San Antonio
- Vincent F. Scalfani, University of Alabama

## Cookbook development

This Cookbook is created online with Jupyter Book. Content is generated locally and managed through a GitHub repository.
This infrastructure enables the following requirements at the content creation level:
- Open and FAIR development and deployment
- Support for a diverse set of user personas
- Agile development and adaptation to user needs
- Community engagement for long term stability
- Documentation at user, contributor and administrator levels

## WorldFAIR Chemistry

The Committee on Data of the International Science Council ([CODATA](https://codata.org/)) and the Research Data
Alliance ([RDA](https://rd-alliance.org/)) launched the [WorldFAIR Initiative](https://worldfair-project.eu/) in
2022 to advance implementation of the [FAIR data principles](https://force11.org/info/the-fair-data-principles/)
within and across research domains. The International Union of Pure and Applied Chemistry ([IUPAC](https://iupac.org/)),
known as the world authority on chemical nomenclature, terminology, and standardized methods of measurement, hosts
the WorldFAIR Chemistry project in a concerted effort to support broader data sharing of chemical data through
collaboration with related disciplines and data science communities. The goal of
[WorldFAIR Chemistry](https://iupac.org/project/2022-012-1-024) is to support the use of chemical data standards in
research workflows to enable downstream data reuse through practical direction and resources.
22 changes: 14 additions & 8 deletions _sources/contributions.md
@@ -1,10 +1,16 @@
# Contributions to the Cookbook
# Contribute to the Cookbook!

We would like to thank all the contributors who have made the IUPAC FAIR Chemistry Cookbook possible.
The Cookbook is an open, collaborative, community-focused resource built as a Jupyter Book, a broadly accessible
online platform that supports the development and publication of executable content.
The content covers a range of activities for working with FAIR chemical data likely to be encountered by researchers,
data scientists and many other stakeholders engaged in publishing and re-using chemical data. If you regularly
work with digital chemical data and have useful approaches that could be demonstrated through a Jupyter Notebook,
please consider contributing. Best practices for using standards and tools are emphasized and instructions for
how to contribute materials are provided.

- Stuart Chalk
- Jordi Cuadros
- Sunghwan Kim
- Nicola Knight
- Sam Munday
- Vin Scalfani
More information is available on how to contribute to the Cookbook in the [documentation wiki](https://github.com/IUPAC/WFChemCookbook/wiki):
- [What is the IUPAC FAIR Chemistry Cookbook?](https://github.com/IUPAC/WFChemCookbook/wiki/What-is-the-IUPAC-FAIR-Chemistry-Cookbook%3F)
- [How was the Cookbook developed?](https://github.com/IUPAC/WFChemCookbook/wiki/How-was-the-Cookbook-developed%3F)
- [How to create content for the Cookbook](https://github.com/IUPAC/WFChemCookbook/wiki/How-to-create-content-for-the-Cookbook)
- [How to submit a contribution to the Cookbook](https://github.com/IUPAC/WFChemCookbook/wiki/How-to-submit-a-contribution-to-the-Cookbook)
- [Benefits of contributing to the Cookbook](https://github.com/IUPAC/WFChemCookbook/wiki/Benefits-of-contributing-to-the-Cookbook)
66 changes: 4 additions & 62 deletions _sources/cooking.md
@@ -1,64 +1,6 @@
# The Joy of Cooking (FAIR data)

This cookbook provides a range of example protocols developed by active community members. These recipes target
different tasks across a range of possible use cases for working with machine-readable chemical data (i.e., FAIR data).
The aim is to present all materials with relevant chemistry examples, point to external content that is of high quality
where available, reference IUPAC and community digital standards where appropriate, and engage the chemistry community
in order to broaden the understanding of FAIR in chemistry.
# The Joy of Cooking

Working with FAIR data in a digital environment means working with machine-readable data, which involves
different activities, different steps to handle the data. This section provides a brief background on machine-readable
data and the FAIR data principles in the context of chemistry, what you can do with machine-readable chemical data and
the importance of preparing data to be FAIR and discoverable in domain repositories.

In data science, a recipe describes a series of steps applied to a data set to prepare it for data analysis in a
systematic way. Recipes can describe all the steps taken in a project from data ingestion to transformation to
analysis to automate processes and share work with others. With recipes, you can prepare your dataset in a systematic
and repeatable way. Recipes can cover many aspects of data preparation including normalization and joining multiple
datasets. The recipes in this cookbook demonstrate actions…
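A recipe in the sense described above can be sketched as an ordered list of small, repeatable steps. The following is a minimal illustration only, not code from the Cookbook: the function names, the sample records and the lookup table are all hypothetical, and only the Python standard library is assumed.

```python
from typing import Callable

Record = dict[str, object]
Step = Callable[[list[Record]], list[Record]]

# Hypothetical input data with inconsistent names and mixed temperature units.
measurements: list[Record] = [
    {"name": " Ethanol ", "boiling_point": 78.37, "unit": "C"},
    {"name": "water", "boiling_point": 373.15, "unit": "K"},
]

# Hypothetical second dataset to join on: compound name -> InChIKey.
identifiers = {
    "ethanol": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    "water": "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
}


def normalize_names(records: list[Record]) -> list[Record]:
    """Normalization step: trim whitespace and lowercase compound names."""
    return [{**r, "name": str(r["name"]).strip().lower()} for r in records]


def to_kelvin(records: list[Record]) -> list[Record]:
    """Normalization step: express every boiling point in kelvin."""
    converted = []
    for r in records:
        if r["unit"] == "C":
            kelvin = round(float(r["boiling_point"]) + 273.15, 2)
            r = {**r, "boiling_point": kelvin, "unit": "K"}
        converted.append(r)
    return converted


def join_identifiers(records: list[Record]) -> list[Record]:
    """Join step: attach an InChIKey from the second dataset, if one is known."""
    return [{**r, "inchikey": identifiers.get(str(r["name"]))} for r in records]


def run_recipe(records: list[Record], steps: list[Step]) -> list[Record]:
    """Apply each step in order; the inputs are copied, never modified in place."""
    for step in steps:
        records = step(records)
    return records


RECIPE: list[Step] = [normalize_names, to_kelvin, join_identifiers]
prepared = run_recipe(measurements, RECIPE)

for record in prepared:
    print(record)
```

Keeping the recipe as a plain list of functions makes the sequence easy to read, share and rerun on a new dataset, which is the property the paragraph above emphasizes.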

[Pointing to other sources: ELIXIR FAIR cookbook, NFDI4Chem KnowledgeBase, other resources]

## Themes
- FAIR describes attributes of machine-readable data that enable them to be reusable
- Structure and consistency are important but there is no one rigid best way
- Standards are designed to encapsulate multiple attributes into convenient motifs and methods
- [Application of motifs/workflows enables FAIR attributes of data/metadata]
- Common motifs in machine-readable chemical data (also in Culinary School)
- Chemical identification and structure representation
- Standard file formats for different data types
- Chemical metadata description
- What can you do with machine-readable chemical data?
- Enhance discovery
- Compile data
- Programmatically query for data
- Making chemical data FAIR (more practicals in later sections)
- Data files
- Data processing
- Data description
- Data sharing
- Absolute “minimum” (meaning that would enable data to be discoverable and a reuser can then try to do something with
it, even if not as efficiently as desirable)

- Basic concepts – (also include concepts in glossary for specific linking)
- [setting the stage, the vernacular, this is what is happening here and you will come across that]
- machine-readable
- programmatic access/reuse
- Data exchange
- Data export/import formats (e.g., JSON)
- Metadata
- Languages (e.g. python, R, markdown?)
- Platforms
- Workspaces
- Workflows
- Provenance
- PIDs
- How FAIR works? – (and other sections as appropriate)
- Chemistry particulars – (and other sections as appropriate)
- Data types
- Motifs (identifiers, representations, schema, formats, ontologies)
- Standards
- Organizing principles of data resources
- There should also be something in here about FAIR is a scale and that anything you can do to improve the FAIRness
of your data is a good thing (with comments on the benefits of doing even the lowest level improvements). SJC 12/5/23
- RIPE as a sequence of considerations…
different activities, different steps to handle the data. This section will provide a brief background on
machine-readable data and the FAIR data principles in the context of chemistry, what you can do with machine-readable
chemical data and the importance of preparing data to be FAIR and discoverable in domain repositories.
4 changes: 2 additions & 2 deletions _sources/data_sources.md
@@ -1,5 +1,5 @@
# Sources of FAIR Chemical Data

Recipes in this section review accessible online sources of reliable FAIR chemical data, including research data
repositories and other aggregated sources. Materials include brief descriptions of content and available documentation,
and provide tutorials and demos of API protocols for searching and retrieving various types of data.
repositories and other aggregated sources. Materials include brief descriptions of content and available
documentation, and provide tutorials and demos of API protocols for searching and retrieving various types of data.
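As a rough illustration of what searching and retrieving data through an API can look like, the sketch below builds a request URL and unpacks a JSON response. It uses PubChem's PUG REST service purely as an example of a public chemical data API; the text above does not name any specific source, so that choice, the helper function names and the sample payload are assumptions. Only the Python standard library is used, and no network request is made unless `fetch_properties` is called.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of PubChem PUG REST (chosen here only as an illustrative API).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name: str, properties: list[str]) -> str:
    """Build a PUG REST URL requesting selected properties for a compound name."""
    return (
        f"{PUG_REST}/compound/name/{quote(name, safe='')}"
        f"/property/{','.join(properties)}/JSON"
    )


def parse_properties(payload: dict) -> list[dict]:
    """Extract the list of property records from a PUG REST property response."""
    return payload["PropertyTable"]["Properties"]


def fetch_properties(name: str, properties: list[str]) -> list[dict]:
    """Query the service over the network and return the parsed records."""
    with urlopen(property_url(name, properties), timeout=30) as response:
        return parse_properties(json.load(response))


# Illustrative payload in the shape of a property response, so the parsing
# step can be tried offline.
SAMPLE_RESPONSE = {
    "PropertyTable": {
        "Properties": [
            {
                "CID": 2244,
                "MolecularFormula": "C9H8O4",
                "InChIKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
            }
        ]
    }
}

print(property_url("aspirin", ["MolecularFormula", "InChIKey"]))
for record in parse_properties(SAMPLE_RESPONSE):
    print(record["CID"], record["InChIKey"])
```

Separating URL construction, retrieval and parsing keeps each part testable on its own. Calling `fetch_properties("aspirin", ["InChIKey"])` would perform the live request, subject to the service's usage policies.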
2 changes: 1 addition & 1 deletion _sources/manipulations.md
@@ -3,4 +3,4 @@
FAIR data are Findable, Accessible, Interoperable and Reusable by both humans and machines and "Fully AI-Ready". The
recipes in this section demonstrate atomized workflows for specific tasks related to managing machine-readable chemical
data. Interactive examples are included as demonstrations that can be copied and executed through a Jupyter notebook or
other Python based code.
other Python based code.
74 changes: 6 additions & 68 deletions _sources/techniques.md
@@ -1,69 +1,7 @@
# FAIR Techniques for Chemical Data
# FAIR Techniques

Making your science FAIRer: This section details how you can make your work FAIRer by carrying out common
activities in a FAIR-enabled way.

This material will be informed by the WorldFAIR Chemistry D3.1 project related to Reporting Guidance for FAIR chemical
data and other community resources, including the NFDI4Chem Knowledgebase and the ELIXIR FAIR Cookbook. In addition,
efforts to inform best practice, such as the IUPAC FAIR Spec Project, will be highlighted.

Test Kitchen / Checking your chemical data/metadata: This section will review protocols for confirming the completeness
and consistency of chemical data and metadata files, for example the checkCIF service for Crystallographic Information
Files (CIF). This material will also be informed by the WorldFAIR Chemistry D3.3 project related to Protocol Services
for standardized programmatic access to chemical data, and other community resources.

Ingredients / Data standards and formats for Chemistry: This section will provide descriptions of standard notation
and file formats available for sharing and reusing chemical data that are referred to in recipes throughout this book.
- Chemical structures
- Chemical properties
- Chemical terminology
- Other useful formats

This category introduces basic techniques on how to work with machine-readable data with particular emphasis on
chemical data nuances and ways chemical data can be made more FAIR, when it is initially shared and for reusing
data that are not fully FAIR. Techniques should be relatively easy to implement into common workflow(s) and give
tangible results/improvements.

- Overview of good FAIR practices
- You’ve got to manage your data [files, structures, description, etc.]
- You’ve got to get the data shared and licensed and citable
- Identifying things of import (people, instrument, samples)
- Critical stages of data processing (raw, processed, derived)
- ***Example of how InChI supports F-A-I-R
- Overview of working with FAIR chemical data
- Queries
- Matching on chemical identifiers/representations
- APIs (what is an API and then link to some of the recipes that demo these for tools and resources)
- Chemical data standards!
- Safety/watch-outs
- Syntax: character sets, units
- Semantics: valence models, units, temperature scale, date format
- Normalization
- Validation
- Clean up
- Examples of unFAIR data
- (condensed chemical formula is not fully interoperable/identifiable)
- Data values without reference (or other provenance, conditions)
- Using chemical data standards
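Two of the watch-outs listed above, syntax validation and the ambiguity of condensed formulas, can be illustrated with a short sketch. This is a hypothetical example rather than a Cookbook recipe: the records are made up for the demonstration (one identifier is deliberately truncated), and the check only tests the character layout of a standard InChIKey (14 letters, 10 letters, 1 letter, separated by hyphens). It does not confirm that the key corresponds to a real structure.

```python
import re
from collections import defaultdict

# Layout of an InChIKey: 14 uppercase letters, 10 uppercase letters, 1 uppercase letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(value: str) -> bool:
    """Syntax check only: does the string have the InChIKey character layout?"""
    return INCHIKEY_PATTERN.fullmatch(value.strip()) is not None


# Hypothetical records; the last InChIKey is deliberately truncated.
records = [
    {"name": "ethanol", "formula": "C2H6O", "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"},
    {"name": "dimethyl ether", "formula": "C2H6O", "inchikey": "LCGLNKUTAGEVQW-UHFFFAOYSA-N"},
    {"name": "water", "formula": "H2O", "inchikey": "XLYOFNOQVPJJNP-UHFFFAOYSA"},
]

# Validation: flag records whose identifier fails the syntax check.
invalid = [r["name"] for r in records if not looks_like_inchikey(r["inchikey"])]

# Ambiguity: group names by formula to show that a formula alone
# does not identify a single compound.
by_formula: dict[str, list[str]] = defaultdict(list)
for r in records:
    by_formula[r["formula"]].append(r["name"])
ambiguous = {formula: names for formula, names in by_formula.items() if len(names) > 1}

print("Failed syntax check:", invalid)
print("Formulas shared by several compounds:", ambiguous)
```

Here the formula C2H6O matches both ethanol and dimethyl ether, while their InChIKeys differ, which is why a structure-based identifier is generally preferred for matching records.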

- General resources on FAIR for chemistry
- FAIR Data Principles | NFDI4Chem Knowledge Base
- Elixir FAIR Cookbook
- Elixir DRM…
- General topics about how to improve working with chemical data, for example…
- Basic data management
- Assigning unique identifiers (especially for chemicals)
- File naming conventions


Given that most users of the cookbook will not be cheminformatics/data science experts, there needs to be some content
that provides background material to users. Generally this would mean content about chemistry information and data
needed by someone with a computer science/data science background AND computer science information needed by a chemistry professional
or student. Some of this material will be available externally and linked in pages, but other content might be best
discussed in the context of computer science or chemistry to communicate how they relate.
- Basic data manipulation stuff
- APIs, spreadsheets, languages
- What happens when you have unFAIR data
- Basic chemistry issues? And how do you manage these?
- Chemical description
- Chemistry data standards
The cookbook is meant to provide practical approaches to different data tasks to inspire others to improve their own
data practices. This section will introduce basic techniques on how to work with machine-readable data with particular
emphasis on chemical data nuances and ways chemical data can be made more FAIR, when it is initially shared and for
reusing data that are not fully FAIR. Techniques should be relatively easy to implement into common workflow(s) and
give tangible results/improvements.
2 changes: 1 addition & 1 deletion _sources/tools.md
@@ -1,5 +1,5 @@
# Tools for working with FAIR Chemical Data

Recipes in this section highlight online cheminformatics tools and web services that data researchers in the chemical
sciences should know about. Material includes brief explanations, tutorials and demos of what can be done, and indicates
sciences should know about. Material includes brief explanations, tutorials and demos of what can be done, and indicates
scenarios where the tool might be used to manipulate machine-readable chemical data - both by humans and machines.
15 changes: 15 additions & 0 deletions _sources/use_cookbook.md
@@ -1,2 +1,17 @@
# How to use the Cookbook

This cookbook provides a range of protocols developed by active community members. These recipes target different
tasks across a range of possible use cases for working with machine-readable chemical data (i.e., FAIR data).
The aim is to present all materials with relevant chemistry examples, point to external content that is of high
quality where available, reference IUPAC and community digital standards where appropriate, and engage the chemistry
community in order to broaden the understanding of FAIR in chemistry.

The cookbook presents a collection of annotated code snippets and workflows for specific tasks in manipulating
machine-readable chemical data and metadata.

- Many of the recipes on this site take advantage of Jupyter Notebooks to run Python code in the browser for an
interactive (and educational) feel for the user.
- Information on how, what and when a recipe might be useful is available in the collapsible 'header' below the
title of the recipe.
- The header also includes bullets for skills and learning objectives
- Ideas to further characterize the applicability of recipes are welcome (see feedback)!