Bgc ocean new tool (#148)
* Add ingestor tool

* Add test data files

* fix lint rm directory

* fix test

* fix test

* Rm useless test data file

* Improve id, name, description help

* Add new tool on bgc ocean

* Delete tools/ocean/bgc_harmonizer.xml

* Delete tools/ocean/test-data/D6901758_001.nc

* Delete tools/ocean/test-data/D6901758_002.nc

* Delete tools/ocean/test-data/D6901758_003.nc

* Delete tools/ocean/test-data/D6901758_004.nc

* Delete tools/ocean/test-data/D6901758_005.nc

* Create .shed.yml

* rm useless if
Marie59 authored Dec 5, 2024
1 parent fa49586 commit 124fc52
Showing 7 changed files with 141 additions and 0 deletions.
15 changes: 15 additions & 0 deletions tools/bgc_ocean/.shed.yml
@@ -0,0 +1,15 @@
categories:
  - Ecology
owner: ecology
remote_repository_url: https://github.com/galaxyecology/tools-ecology/tree/master/tools/bgc_ocean
homepage_url: https://github.com/Marie59/FE-ft-ESG/tree/main/ocean
long_description: |
  Process in-situ and biogeochemical oceanographic Argo or Glider data for the Earth System
type: unrestricted
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "Wrapper for ocean biogeochemical data tool: {{ tool_name }}."
suite:
  name: "bgc_ocean_suite"
  description: "A suite of tools for the ocean biogeochemical compartment of the Earth System"
  type: unrestricted
126 changes: 126 additions & 0 deletions tools/bgc_ocean/bgc_harmonizer.xml
@@ -0,0 +1,126 @@
<tool id="harmonize_insitu_to_netcdf" name="QCV harmonizer" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.01" license="MIT">
<description>and aggregator of insitu marine physical and biogeochemical data</description>
<macros>
<token name="@TOOL_VERSION@">1.1</token>
<token name="@VERSION_SUFFIX@">0</token>
</macros>
<requirements>
<container type="docker">pokapok/qcv_ingester:@TOOL_VERSION@</container>
</requirements>
<command detect_errors="exit_code"><![CDATA[
export HOME=\$PWD &&
#for $i, $infile in enumerate($infiles):
cp '$infile' '/runtime/data/original_data/work/ga_la_xy/${infile.element_identifier}' &&
#end for
/app/launchers/start-app.sh GALAXY &&
cp /runtime/data/harmonized_data/work/ga_la_xy_harm_agg.nc '$output_net'
]]></command>
<inputs>
<param name="infiles" type="data" format="netcdf" multiple="true" label="Input the netcdf data files" help="These files can be raw netCDF Argo or Glider data files following the CMEMS convention."/>
</inputs>
<outputs>
<data name="output_net" format="netcdf" from_work_dir="/runtime/data/harmonized_data/work/*.nc" label="${tool.name} netcdf data" />
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="infiles" value="D6901758_001.nc,D6901758_002.nc,D6901758_003.nc,D6901758_004.nc,D6901758_005.nc"/>
<output name="output_net">
<assert_contents>
<has_size value="427535" delta="0"/>
</assert_contents>
</output>
</test>
</tests>
<help><![CDATA[
.. class:: infomark
**What it does**

General presentation

The cerb_harmonizer tool aggregates and harmonizes marine in-situ data to meet the needs of UseCase 2.1-BCG (Fair-Ease). It converts files of individual or already aggregated data profiles into a single concatenated file that uses the harmonized vocabulary needed for the project.

Profiles are concatenated along the time dimension in the order given by the listing: if BGC data precede CORE profiles in the listing, all BGC data will precede CORE data in the output. Use the cycle_number variable to associate them again.

Vocabulary translations are listed below. On the left is the name of each variable as written in the output file; on the right are all the source names found during data exploration.

Coordinates and variables:
"lon" : ["longitude", "LONGITUDE", "LON", "lon"],
"lat" : ["latitude", "LATITUDE", "LAT", "lat"],
"time" : ["date", "time", "TIME", "JULD"],
"time_qc" : ["date_qc", "time_qc", "TIME_QC", "JULD_QC"],
"depth" : ["DEPTH", "depth"],
"cycle_number" : ["CYCLE_NUMBER"],
"ref_time" : ["REFERENCE_DATE_TIME"],
"pos_qc" : ["POSITION_QC",]
--
"temperature" : ["TEMP", "TEMPERATURE"],
"salinity" : ["PSAL", "PRACTICAL_SALINITY"],
"oxygen" : ["DOXY"],
"pressure" : ["PRES"],
"chlorophylle" : ["CHLA"],
"nitrate" : ["NO3", "NITRATE", "n_an"],
"bbp700" : ["BBP700"],
All variables are tagged with a suffix to indicate the state of the data (_raw, _dmadjusted, _rtadjusted)
--
Variables extracted from meta files are:
LAUNCH_DATE
PLATFORM_TYPE
PI_NAME
Meta variables are stored as metadata.
--
Units :
"degree_east" : ["degree_east"],
"degree_north" : ["degree_north"],
"degree_celsius" : ["degree_celsius", "degree_Celsius"], # temperature
"psu" : ["psu", "practical_salinity_unit", "PSU"], # salinity
"micromol/l" : ["micromole.l-1", "micromole_per_liter", "μmol.l-1", "μmol/l", "μmol/L", "μmol.L-1"], # concentration liters
"mg/m3" : ["mg/m3",],
"micromole/kg" : ["micromole/kg",],
"m-1" : ["m-1"],
"decibar" : ["decibar", "dbar"], # pressure
By convention:
dates are written following: "seconds since 1950-01-01T00:00:00 in julian calendar"
longitude is set between -180° and 180°
latitude is set between -90° and 90°

WARNING: This application works on one platform at a time. For example, a whole Argo trajectory can be aggregated and harmonized, but only one trajectory per run. To process two Argo trajectories, run the tool twice.

**Input**

A list of files in a text file named cerb_listing.txt, containing only file names without paths. The tool will find the files automatically. An example listing:

BD6901580_003.nc BD6901580_004.nc BD6901580_005.nc BD6901580_006.nc D6901580_003.nc D6901580_004.nc D6901580_005.nc D6901580_006.nc 6901580_meta.nc

The paths (volumes) that contain the configurations, listings and data:

config path: the folder that contains the listing file cerb_listing.txt
data_path: the highest-level folder that contains all the files named in the listing

**Output**

A concatenated and harmonized netCDF file
]]></help>
<citations>
<citation type="bibtex">
@Manual{,
title = {QCV Ingester},
author = {Pokapok},
year = {2024},
note = {https://gitlab.com/pokapok-projects/PKP8-OGI-BGC/qcv_ingester}
}
</citation>
</citations>
</tool>
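
The help text in bgc_harmonizer.xml lists, for each output variable, the source names it may appear under. A minimal Python sketch of how such a table could drive renaming follows. It is not the QCV Ingester's actual code: the function names `build_alias_index` and `harmonize_names` are hypothetical, only a subset of the table is reproduced, and leaving unknown names unchanged is an assumption.

```python
# Canonical output name -> source aliases (subset copied from the tool help).
VOCABULARY = {
    "lon": ["longitude", "LONGITUDE", "LON", "lon"],
    "lat": ["latitude", "LATITUDE", "LAT", "lat"],
    "time": ["date", "time", "TIME", "JULD"],
    "temperature": ["TEMP", "TEMPERATURE"],
    "salinity": ["PSAL", "PRACTICAL_SALINITY"],
    "nitrate": ["NO3", "NITRATE", "n_an"],
}


def build_alias_index(vocabulary):
    """Invert the table so that each alias points to its canonical name."""
    index = {}
    for canonical, aliases in vocabulary.items():
        for alias in aliases:
            # Guard against an alias claimed by two different canonical names.
            if index.get(alias, canonical) != canonical:
                raise ValueError(f"ambiguous alias: {alias!r}")
            index[alias] = canonical
    return index


def harmonize_names(names, vocabulary=VOCABULARY):
    """Return a {source_name: output_name} mapping.

    Names that are not in the table are kept unchanged (an assumption;
    the real tool may drop or reject them instead).
    """
    index = build_alias_index(vocabulary)
    return {name: index.get(name, name) for name in names}


if __name__ == "__main__":
    print(harmonize_names(["JULD", "PSAL", "LONGITUDE", "UNKNOWN_VAR"]))
```

Inverting the table once keeps each lookup constant-time and makes conflicting aliases fail early instead of being silently overwritten.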
Binary file added tools/bgc_ocean/test-data/D6901758_001.nc
Binary file not shown.
Binary file added tools/bgc_ocean/test-data/D6901758_002.nc
Binary file not shown.
Binary file added tools/bgc_ocean/test-data/D6901758_003.nc
Binary file not shown.
Binary file added tools/bgc_ocean/test-data/D6901758_004.nc
Binary file not shown.
Binary file added tools/bgc_ocean/test-data/D6901758_005.nc
Binary file not shown.
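
The help text of bgc_harmonizer.xml states that dates are written as seconds since 1950-01-01T00:00:00 and that longitude is kept between -180° and 180°. The Python sketch below illustrates those two conventions under stated assumptions: the helper names `wrap_longitude` and `seconds_since_1950` are hypothetical, the interval is taken as half-open [-180, 180), naive datetimes are treated as UTC, and Python's proleptic Gregorian calendar is used, which may differ from the calendar handling in the actual tool.

```python
from datetime import datetime, timezone

# Reference epoch named in the tool help.
REFERENCE = datetime(1950, 1, 1, tzinfo=timezone.utc)


def wrap_longitude(lon_deg):
    """Map any longitude in degrees onto the half-open range [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


def seconds_since_1950(moment):
    """Return the number of seconds elapsed between the 1950 epoch and moment."""
    if moment.tzinfo is None:
        # Assumption: timestamps without a timezone are already in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return (moment - REFERENCE).total_seconds()


if __name__ == "__main__":
    print(wrap_longitude(190.0))
    print(seconds_since_1950(datetime(1950, 1, 2)))
```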
