This document describes the steps to convert raw MEG/EEG data from the CBU MEG scanner to BIDS format. BIDS is a standard for organizing and describing neuroimaging and behavioural data, designed to make data sharing and analysis easier. It is described in detail at bids.neuroimaging.io. The extension for MEG data is described in the bids-specification.
The conversion process detailed below uses the mne-bids Python package, which provides tools for converting raw MEG/EEG data to BIDS format. The package is described in detail at mne-bids.
If you have any questions about this tutorial, please contact Máté Aller.
- This tutorial assumes that you are running the conversion process on the CBU linux cluster.
- Simply download the scripts in this repository to your local folder.
- Dependencies:
- Python 3.7 or later
- MNE-Python 1.5 or later
- MNE-BIDS 0.12 or later
- dcm2niix
- All the required packages are installed on the CBU cluster under the mne1.6.1_0 conda environment in /imaging/local/software/mne_python/. You can activate this environment by typing the following command in a terminal:
conda activate /imaging/local/software/mne_python/mne1.6.1_0
After downloading the scripts in this repository, you will need to do the following to convert your raw MEG data to BIDS format:
- Update the config.py file with the appropriate project specific path information.
- Update the subject_info.json file with the appropriate information for each subject.
- Update the event_info.json file with information on the event values (triggers) and their labels saved with your raw MEG data.
- Run meg_bids_data_conversion.py.
You can find detailed instructions on each of these steps below.
The raw data from the CBU MEG scanner are stored at /megdata/cbu/[project_code]/[meg_id]/[yymmdd]/, where project_code is the unique identifier of your MEG project, meg_id is the participant's unique identifier and yymmdd is the date of the scan. You can see all your participant scan directories by typing a command like this in a terminal (replace [project_code] with your project code):
ls -d /megdata/cbu/[project_code]/*/*
As a standard practice, empty room recordings are saved along with the corresponding raw data in the BIDS repository. The best practice is to ask your operator to save your own empty room recording at the beginning of each recording session, so that your empty room file is stored along with the raw files for each session.
Alternatively, you can use the empty room recordings made by the MEG operators at the beginning of the day of your MEG recording sessions. For each MEG recording day, you can find these empty room recordings in /megdata/cbu/camtest/no_name/[yymmdd]/, where yymmdd is the date of the scan.
Finally, if you don't wish to save the empty room recordings with the MEG data, you can just leave them out of the subject_info.json file.
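The directory layout described above can be handled programmatically. The following is a minimal sketch using only the standard library; the helper names (`parse_scan_dir`, `operator_emptyroom_dir`) and the example project code and participant identifier are hypothetical and are not part of the conversion scripts.

```python
from datetime import datetime
from pathlib import Path

MEG_ROOT = Path("/megdata/cbu")  # root of the raw MEG data, as quoted in the text


def parse_scan_dir(scan_dir):
    """Split .../[project_code]/[meg_id]/[yymmdd] into its three parts."""
    project_code, meg_id, yymmdd = Path(scan_dir).parts[-3:]
    # Raises ValueError if the last folder name is not a valid yymmdd date.
    datetime.strptime(yymmdd, "%y%m%d")
    return {"project_code": project_code, "meg_id": meg_id, "meg_date": yymmdd}


def operator_emptyroom_dir(yymmdd, meg_root=MEG_ROOT):
    """Folder holding the operators' empty room recording for a given day."""
    return Path(meg_root) / "camtest" / "no_name" / yymmdd


# Hypothetical scan directory, for illustration only.
info = parse_scan_dir("/megdata/cbu/my_project/meg23_001/230215/")
print(info)
print(operator_emptyroom_dir(info["meg_date"]))
```

The returned values map directly onto the meg_id, meg_date and meg_emptyroom_dir fields of subject_info.json.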
Please refer to this GitHub page on how to find the MRI scans belonging to your MEG participant. You will need what's referred to as subject code and project code on that page to be able to locate a particular participant. These could be structural scans which you collected, or you could reuse existing structural scans of your participants if they have been scanned at the CBU before. In the latter case, ask the MRI administrator to locate these scans for you.
- Update the project_root variable so that it points to the root folder of your project.
- By default, the following folder structure is assumed:
project_root
├── data
│ ├── rawdata
│ ├── sourcedata
- Please update the data_root, bids_raw_root and sourcedata_root variables if you wish to set up a different folder structure. sourcedata_root is the folder where the temporary MEG and MRI data are saved during the conversion process. It is deleted after the conversion is complete. You can change this behaviour by setting the delete_source variable to False.
- Please update the event_info_path and subject_info_path variables if you wish to save the event_info.json and subject_info.json files in a different location.
- Please update the meg_system variable depending on which MEG system was used to collect the data. This should be a string. The default value is "triux" for the new system. Alternatively, it can be "vectorview" for the old system.
- Please update the event_channels variable to specify the channels that contain the event triggers.
  - This should be a list of the names of the binary STI channels ("STI001", "STI002", etc.) or a dict of the form dict(stim=stim_channel_list, resp=resp_channel_list), where the values are the names of the binary STI channels used for stimulus triggers and responses, respectively.
  - If a list is provided, the channel values are converted to decimal values and summed. If a dictionary is provided, the values in the 'stim' channels are converted to decimal and summed, whereas the values in the 'resp' channels are converted to decimal but not summed.
  - By specifying separate stimulus and response channels as a dict, you can differentiate between stimulus and response triggers, so that you get up to 255 unique event values for stimuli (1-255) and, independently of these, up to 8 trigger values for response button presses (256, 512, 1024, 2048, 4096, 8192, 16384, 32768).
  - By default this is set to ["STI101"], which takes into account all the trigger channels in the MEG data (STI001 - STI016), but overlapping stimulus and response triggers will be summed, so you can get unexpected trigger values (see the common issues wiki page referenced below).
  - Make sure the event channels you specify are actually present in the raw data. If any of the specified event channels are not present in the raw data of a given subject, the script will first try to use the default STI101 channel, and will raise an error only if STI101 is not present either.
  - Please refer to this and this wiki page for more information on how triggers are encoded by the CBU MEG system and the common issues and pitfalls regarding reading events from STI channels.
- Set the adjust_event_times variable to True if you want to adjust the event times to account for the audio and visual latencies. Current (as of 02/2023) auditory and visual latency values are given in config.py. If you use this functionality, make sure to set the visual_event_values and auditory_event_values lists in config.py to the desired values based on what is defined in event_info.json, see below. By default, event times are not adjusted.
- Please update the auditory_event_values and visual_event_values lists if you wish to adjust the event times to account for the audio and visual latencies. auditory_event_values and visual_event_values are the event values (triggers) for auditory and visual events, respectively. It is recommended to look up the event values from the event_info.json file, but they can also be hard coded in config.py. These should be lists of integers. If either is an empty list, the corresponding event times will not be adjusted.
- Set the process_structural variable to True if you would like to process the structural MRI data. By default, structural MRI data are not processed.
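To make the event_channels arithmetic described above concrete, here is a minimal sketch of how binary STI channels translate into decimal event values, and how a list differs from a stim/resp dict. It is an illustration under stated assumptions, not the code used in meg_bids_data_conversion.py: the function names are hypothetical, it only handles the binary channels STI001 to STI016 (not the composite STI101 channel), and it assumes STI001 to STI008 carry stimulus triggers while STI009 to STI016 carry button presses.

```python
def channel_weight(name):
    """Decimal weight of a binary STI channel: STI001 -> 1, STI002 -> 2, STI009 -> 256."""
    return 2 ** (int(name[3:]) - 1)


def decode_triggers(active, event_channels):
    """Return the event values produced at one time point.

    active: set of STI channel names that are high at that time point.
    event_channels: a list, or a dict with 'stim' and 'resp' lists.
    """
    if isinstance(event_channels, dict):
        # Stimulus channels are summed into one value, responses are kept apart.
        stim = sum(channel_weight(ch) for ch in event_channels["stim"] if ch in active)
        resp = [channel_weight(ch) for ch in event_channels["resp"] if ch in active]
        return ([stim] if stim else []) + resp
    # A plain list: every listed channel is summed into a single value.
    total = sum(channel_weight(ch) for ch in event_channels if ch in active)
    return [total] if total else []


stim_channels = [f"STI{i:03d}" for i in range(1, 9)]   # STI001-STI008 (assumed stimuli)
resp_channels = [f"STI{i:03d}" for i in range(9, 17)]  # STI009-STI016 (assumed responses)

# Stimulus trigger 5 (STI001 + STI003) overlapping a button press on STI009.
active = {"STI001", "STI003", "STI009"}

summed = decode_triggers(active, stim_channels + resp_channels)
separate = decode_triggers(active, dict(stim=stim_channels, resp=resp_channels))
print(summed)    # [261]
print(separate)  # [5, 256]
```

With a single summed channel the overlap yields 261, a value that matches neither the stimulus nor the response code. With separate stim and resp channels both events remain recoverable.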
- The subject_info.json file contains information about each participant. This information is used in the BIDS conversion process.
- Each top level JSON object (or dictionary) in subject_info.json should have a key consisting of a number represented as a string (usually the serial order in which the data were collected) and a value containing the following entries:
  - bids_id: The BIDS identifier for the participant, corresponding to the 'sub' BIDS entity. This should be a string. If the value is null, the participant will be skipped during the conversion process.
  - meg_id: The unique identifier for the participant in the MEG data repository.
  - meg_date: The date of the MEG recording session in the format yymmdd.
  - meg_raw_dir: The path to the directory containing the raw MEG data for the participant.
  - meg_emptyroom_dir: The path to the directory containing the empty room recording for the participant. This could be the same as meg_raw_dir or different. If you don't wish to specify the empty room recording, set this to null.
  - meg_raw_files: A list of dictionaries containing information about the raw MEG files for the participant. Each dictionary should have the following key-value pairs:
    - run: The run number for the raw MEG file, corresponding to the 'run' BIDS entity. This should be a string. For the empty room recording, set this to "emptyroom".
    - task: The name of the task performed during the run, corresponding to the 'task' BIDS entity. This should be a string, except for the empty room recording, where it should be set to null.
    - file: The name of the raw MEG file. This should be a string with the file extension included.
  - meg_bad_channels: A list of bad channels for the participant as noted by the MEG operator during the recording session. This should be a list of strings in the format MEG#### or EEG###, where # represents channel numbers. If there were no bad channels, this should be an empty list, [].
  - mri_id: The unique identifier for the participant's MRI scan. This should be a string.
  - mri_date: The date of the MRI scan in the format yymmdd.
  - mri_dcm_dir: The path to the directory containing the DICOM files for the participant's structural MRI scan. This should be a string. If you specify this, the script will convert the DICOM files to NIfTI format and write the structural MRI into the BIDS dataset. If you don't wish to process structural MRI data for a given subject, set this to null.
- Simply copy over the example JSON object in subject_info.json as many times as you have subjects in your dataset and update the values for each key as appropriate.
- Note: currently it is assumed that the MEG data were collected in a single session. If you have multiple MEG recording sessions for a participant, you will need to add the 'ses' BIDS entity to the meg_raw_files dictionaries and have a separate meg_date for each session. You'll also need to modify the code in meg_bids_data_conversion.py to handle these changes.
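As an illustration of the structure described above, the sketch below builds a two-subject subject_info dictionary in Python, serialises it to JSON and selects the subjects that would be converted. All values (identifiers, dates, paths, file names) are made up for the example, and the completeness check in `subjects_to_convert` is this sketch's own assumption rather than the behaviour of the conversion script.

```python
import json

REQUIRED_KEYS = {
    "bids_id", "meg_id", "meg_date", "meg_raw_dir", "meg_emptyroom_dir",
    "meg_raw_files", "meg_bad_channels", "mri_id", "mri_date", "mri_dcm_dir",
}

# Hypothetical entry: every value below is a placeholder.
example_entry = {
    "bids_id": "01",
    "meg_id": "meg23_001",
    "meg_date": "230215",
    "meg_raw_dir": "/megdata/cbu/my_project/meg23_001/230215",
    "meg_emptyroom_dir": "/megdata/cbu/camtest/no_name/230215",
    "meg_raw_files": [
        {"run": "01", "task": "mytask", "file": "run01_raw.fif"},
        {"run": "emptyroom", "task": None, "file": "emptyroom_raw.fif"},
    ],
    "meg_bad_channels": ["MEG1234", "EEG012"],
    "mri_id": "MRI_ID_PLACEHOLDER",
    "mri_date": "230110",
    "mri_dcm_dir": None,  # None becomes null in JSON: no structural MRI for this subject
}

subject_info = {
    "1": example_entry,
    "2": dict(example_entry, bids_id=None),  # null bids_id: subject is skipped
}


def subjects_to_convert(subject_info):
    """Return the entries with a non-null bids_id, after a completeness check."""
    selected = {}
    for key, entry in subject_info.items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise KeyError(f"subject {key} is missing: {sorted(missing)}")
        if entry["bids_id"] is not None:
            selected[key] = entry
    return selected


as_json = json.dumps(subject_info, indent=4)
selected = subjects_to_convert(json.loads(as_json))
print(list(selected))  # ['1']
```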
- The event_info.json file contains information about the event values (triggers) and their labels saved with the raw MEG data. This information is used in the BIDS conversion process.
- Each key-value pair in the JSON object is a mapping of the event label to its value (the trigger recorded in the MEG file). The key should be the label as a string and the value should be the event value (trigger) as an integer.
- Refer to this mne-python tutorial on how to best map event IDs to trial descriptors.
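The label-to-value mapping can also be used to look up the auditory and visual event values instead of hard coding them. The following is a minimal sketch with made-up labels and trigger values. The "/"-separated labels follow the MNE-Python convention for hierarchical event names, and the "auditory/" and "visual/" prefixes are assumptions of this example.

```python
import json

# Hypothetical content of event_info.json: label -> trigger value.
event_info_text = """
{
    "auditory/word": 1,
    "auditory/noise": 2,
    "visual/word": 3,
    "button/left": 256
}
"""
event_info = json.loads(event_info_text)

# Every value must be an integer and must map back to exactly one label.
if not all(isinstance(value, int) for value in event_info.values()):
    raise TypeError("event values must be integers")
value_to_label = {value: label for label, value in event_info.items()}
if len(value_to_label) != len(event_info):
    raise ValueError("two labels share the same trigger value")

# Look up the values by label prefix rather than hard coding them.
auditory_event_values = [v for k, v in event_info.items() if k.startswith("auditory/")]
visual_event_values = [v for k, v in event_info.items() if k.startswith("visual/")]

print(auditory_event_values)  # [1, 2]
print(visual_event_values)    # [3]
print(value_to_label[256])    # button/left
```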
- The meg_bids_data_conversion.py script is the main script that does the conversion of the raw MEG data to BIDS format.
- The script can be run from the command line as follows (make sure to activate the conda environment containing the required packages before running the script, see the installation section for how to do this):
cd /path/to/your/folder/containing/the/scripts
python meg_bids_data_conversion.py
- The script takes the following command line arguments:
  - --keep_existing_folders: If specified, the existing BIDS folders are kept before conversion. By default they are purged to avoid any conflicts. Purging is recommended, but be careful not to delete important files.
  - --keep_source_data: If specified, the temporary MEG and MRI data saved during the conversion process are kept. By default the source data are deleted after the conversion is complete to save disk space.
  - --config: Absolute import path to the configuration file. This makes it possible to select different config files programmatically and is mainly used for testing purposes. The default is 'config'. It should be in the form of 'module_name' or 'package.module_name'. For more details see: https://docs.python.org/3/library/importlib.html#importlib.import_module
- Example usage:
  - With default options:
    python meg_bids_data_conversion.py
  - With keeping the source data:
    python meg_bids_data_conversion.py --keep_source_data
- The script will convert the raw MEG data for all subjects specified in your subject_info.json file to BIDS format. The BIDS data will be saved in the bids_raw_root folder specified in the config.py file.
- The script also fixes EEG channel locations if the data were collected using the old Vectorview system. With the old Vectorview system, for EEG channels > 60, the EEG channel locations obtained from the Polhemus digitiser were not copied properly to the Neuromag acquisition software. Therefore you must apply mne_check_eeg_locations to your data. Do this as early as possible in the processing pipeline. There is no harm in applying this function even if the EEG locations are already correct. Read more about this here. This step is not necessary for the new Triux system.
- Make sure to keep meg_bids_data_conversion.py, config.py, subject_info.json and event_info.json in the same directory.
- For larger datasets, the conversion process can take a long time, so it is recommended to run the script on a compute node using the slurm job scheduler. You can use the slurm_meg_bids_conversion_job_script.sh script to submit a job to the slurm scheduler. You can submit the job by navigating into your project root folder in a terminal and executing the following command:
sbatch slurm_meg_bids_conversion_job_script.sh
  - Please update the slurm_meg_bids_conversion_job_script.sh file with the appropriate project specific path information.
  - You can set the number of nodes, number of tasks and time limit for the job at the top of slurm_meg_bids_conversion_job_script.sh in the lines starting with ##SBATCH, by changing the values of the --nodes, --ntasks and --time flags, respectively. You can read more about the slurm job scheduler on the CBU intranet.
  - Note: currently the script does not support parallel execution. This feature will be added in the future.
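The command line arguments listed above, and the way --config resolves a module by its import path, can be sketched as follows. This is an assumption about how such an interface is typically written with argparse and importlib.import_module, not a copy of the code in meg_bids_data_conversion.py. The module name my_test_config and its content are hypothetical.

```python
import argparse
import importlib
import sys
import tempfile
from pathlib import Path


def parse_args(argv=None):
    """Parse the three options described in the text above."""
    parser = argparse.ArgumentParser(description="Convert raw MEG data to BIDS.")
    parser.add_argument("--keep_existing_folders", action="store_true",
                        help="keep the existing BIDS folders instead of purging them")
    parser.add_argument("--keep_source_data", action="store_true",
                        help="keep the temporary MEG and MRI source data")
    parser.add_argument("--config", default="config",
                        help="import path: 'module_name' or 'package.module_name'")
    return parser.parse_args(argv)


# Write a throwaway config module so the example runs anywhere.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "my_test_config.py").write_text('meg_system = "triux"\n')
sys.path.insert(0, tmp_dir)  # make the module importable by name

args = parse_args(["--keep_source_data", "--config", "my_test_config"])
cfg = importlib.import_module(args.config)  # same lookup as `import my_test_config`

print(cfg.meg_system, args.keep_source_data, args.keep_existing_folders)
```

Because the value is an import path rather than a file path, the alternative config module has to be importable from the directory in which the script is run (or from elsewhere on sys.path).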
You can add a dataset description file to your BIDS repository. mne_bids.make_dataset_description() provides a convenient way of generating this file. You can see how it is done at the end of this mne-bids tutorial.
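For orientation, the sketch below writes a minimal dataset_description.json by hand with the standard library, to show what the file contains. BIDS requires the Name and BIDSVersion fields. The helper name, the dataset name, the author and the version string are placeholders for this example. In practice, mne_bids.make_dataset_description() is the more convenient route.

```python
import json
import tempfile
from pathlib import Path


def write_dataset_description(bids_root, name, authors=(), bids_version="1.8.0"):
    """Write a minimal dataset_description.json into the BIDS root folder."""
    description = {
        "Name": name,                 # required by BIDS
        "BIDSVersion": bids_version,  # required by BIDS (placeholder version here)
        "DatasetType": "raw",
    }
    if authors:
        description["Authors"] = list(authors)
    out_path = Path(bids_root) / "dataset_description.json"
    out_path.write_text(json.dumps(description, indent=4) + "\n", encoding="utf-8")
    return out_path


# A temporary folder stands in for the bids_raw_root folder from config.py.
bids_raw_root = Path(tempfile.mkdtemp()) / "rawdata"
bids_raw_root.mkdir()

path = write_dataset_description(bids_raw_root, "My MEG study", ["A. Researcher"])
print(path.read_text(encoding="utf-8"))
```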
As is, the BIDS repository generated by the conversion process is sufficient for you to work on, but it will contain information which, in theory, could be used along with other pieces of information obtained elsewhere to uniquely identify a participant. One such piece of information is the date of the recording, which is used in the BIDS dataset to link the MEG raw data with the corresponding empty room recording.
If you wish to share your data outside the CBU, you will need to anonymize the BIDS repository. Please read this mne-bids tutorial to learn more about anonymizing a BIDS repository. The package provides a convenient function, mne_bids.anonymize_dataset(), to do the heavy lifting for you. Remember to double check your anonymized dataset for any such pieces of information left in it. Ultimately it is your responsibility to make sure that the dataset you're sharing is GDPR compliant. Please email the methods group if you have any questions about this.
The BIDS standard specifies the 'derivatives' folder where you can save the results of your subsequent analyses. You can read more about this here. Generally, the rules of file naming and folder structure are more relaxed in the derivatives folder, but it is still a good idea to follow the BIDS standard as closely as possible. A suggested folder structure for the derivatives would be:
project_root
├── data
│ ├── rawdata
│ ├── sourcedata
│ ├── derivatives
│ │ ├── analysis_1
│ │ │ ├── sub-01
│ │ │ ├── etc...
│ │ ├── analysis_2
│ │ │ ├── etc...
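If you want to create the suggested derivatives layout up front, a few lines of pathlib are enough. This is a minimal sketch. The function name, analysis names and subject labels are placeholders, and a temporary folder stands in for your project root. It only creates empty folders and does not replace BIDSPath for naming files.

```python
import tempfile
from pathlib import Path


def make_derivatives_dirs(project_root, analyses, subjects):
    """Create data/derivatives/<analysis>/sub-<label> folders and return them."""
    derivatives_root = Path(project_root) / "data" / "derivatives"
    created = []
    for analysis in analyses:
        for sub in subjects:
            sub_dir = derivatives_root / analysis / f"sub-{sub}"
            sub_dir.mkdir(parents=True, exist_ok=True)  # safe to re-run
            created.append(sub_dir)
    return created


project_root = Path(tempfile.mkdtemp())  # stand-in for the real project root
dirs = make_derivatives_dirs(project_root, ["analysis_1", "analysis_2"], ["01", "02"])
for d in dirs:
    print(d.relative_to(project_root).as_posix())
```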
For setting up the folder structure and managing paths when accessing or saving files in your analyses, we recommend using the BIDSPath class provided by the mne-bids package. You can read more about using BIDSPath objects in this tutorial and see meg_bids_data_conversion.py for examples of how to use it. Pro tip: set check=False when creating a BIDSPath object to avoid imposing strict BIDS compliance on file naming in your derivatives folder.