idr0035-caie-drugresponse S-BIAD847 #639
Looks like I had made some progress on this conversion in the same way as #649, using the following script:

```python
import os

sourcedir = "/uod/idr/filesets/idr0035-caie-drugresponse/images"
targetdir = '/data/idr0035/sources/'
zarrdir = "/data/idr0035/zarr/"
bf2raw_exec = "/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw"
aws_exec = "~/venv/bin/aws"

with open('idr0035.HTD', 'r') as f:
    htd_template = f.read()

plates = os.listdir(sourcedir)
commands = []
for plate in plates:
    source_plate = os.path.join(sourcedir, plate)
    target_plate = os.path.join(targetdir, plate)
    os.makedirs(target_plate)
    tiffs = os.listdir(source_plate)
    for tiff in tiffs:
        os.symlink(os.path.join(source_plate, tiff), os.path.join(target_plate, tiff))
    htd_file = os.path.join(target_plate, "_".join(tiffs[0].split('_')[:2]) + ".HTD")
    with open(htd_file, 'w') as f:
        f.write(htd_template.format(plate=plate))
    zarr_plate = os.path.join(zarrdir, plate + ".zarr")
    s3_zarr_plate = f"s3://idr0035/zarr/{plate}.zarr"
    bf2raw_cmd = " ".join([bf2raw_exec, htd_file, zarr_plate, "-p"])
    aws_cmd = " ".join(
        [aws_exec, "--profile ebi", "--endpoint-url https://uk1s3.embassy.ebi.ac.uk",
         "s3 cp --recursive", zarr_plate, s3_zarr_plate])
    rm_cmd = " ".join(["rm", "-r", zarr_plate])
    commands.append(" && ".join([bf2raw_cmd, aws_cmd, rm_cmd]))

with open("idr0035_commands", 'w') as f:
    for command in commands:
        f.write(command)
        f.write('\n')
```

and the following MetaXpress file.
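As an aside, a minimal sketch (my own helper, not from this thread) of how the generated `idr0035_commands` file could be consumed, one conversion/upload/cleanup pipeline per line:

```python
# Hypothetical runner for the "idr0035_commands" file generated above;
# each line is "bioformats2raw ... && aws s3 cp ... && rm -r ...".
import subprocess

with open("idr0035_commands") as f:
    for line in f:
        cmd = line.strip()
        if not cmd:
            continue
        # shell=True is required so the "&&" chaining is honoured;
        # check=True stops on the first failed plate.
        subprocess.run(cmd, shell=True, check=True)
```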
A first plate has been uploaded to the

Looks good in vizarr: Import test plate on idr0125-pilot...

Update symlinks to s3 plate with chunks...

Looks good:
Any reason why this got moved to

Note the conversion is primarily waiting on the decision of how to zip the NGFF datasets (with or without the top-level directory) in preparation of the upload to the BioImage Archive.
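For illustration only (the helper below is hypothetical, not from the thread), the two zip layouts under discussion differ in whether archive entries are prefixed with the `<plate>.zarr` directory:

```python
import os
import zipfile

def zip_plate(zarr_plate, include_top_level=True):
    """Zip a converted plate; NGFF chunks are already compressed, so store entries as-is."""
    plate_name = os.path.basename(zarr_plate)  # e.g. "plate1.zarr"
    zip_path = zarr_plate + ".zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for root, _, files in os.walk(zarr_plate):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, zarr_plate)
                # With the top-level directory, entries look like "plate1.zarr/0/...";
                # without it, they start at the group root: "0/...".
                arcname = os.path.join(plate_name, rel) if include_top_level else rel
                zf.write(full, arcname)
    return zip_path
```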
@sbesson Sorry, yep - don't know why I did that.
For the full study conversion, I used the following script to generate the symlinks, the HTD files, and the commands to execute:

```python
import os

sourcedir = "/uod/idr/filesets/idr0035-caie-drugresponse/images"
targetdir = '/data/idr0035/sources/'
zarrdir = "/data/idr0035/zarr/"
bf2raw_exec = "/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw"

with open('idr0035.HTD', 'r') as f:
    htd_template = f.read()

plates = os.listdir(sourcedir)
plates.sort()
commands = []
for plate in plates:
    source_plate = os.path.join(sourcedir, plate)
    target_plate = os.path.join(targetdir, plate)
    os.makedirs(target_plate)
    tiffs = os.listdir(source_plate)
    tiffs.sort()
    firsttiff = tiffs[0]
    if firsttiff.startswith("B02"):
        platename = os.path.basename(source_plate)
        prefix = platename + "_"
    else:
        platename = firsttiff[0:firsttiff.index("_B02")]
        prefix = ""
    for tiff in tiffs:
        os.symlink(os.path.join(source_plate, tiff), os.path.join(target_plate, prefix + tiff))
    htd_file = os.path.join(target_plate, platename + ".HTD")
    with open(htd_file, 'w') as f:
        f.write(htd_template.format(plate=plate))
    zarr_plate = os.path.join(zarrdir, plate + ".zarr")
    s3_zarr_plate = f"s3://idr0035/zarr/{plate}.zarr"  # kept from the earlier script; not used in the commands below
    bf2raw_cmd = " ".join([bf2raw_exec, htd_file, zarr_plate, "-p"])
    commands.append(bf2raw_cmd)

with open("idr0035_commands", 'w') as f:
    for command in commands:
        f.write(command)
        f.write('\n')
```

The prefix handling is required for the plates of

All 55 plates have been successfully converted into OME-NGFF (105G in total). @will-moore, is the next step to zip these Zarr folders in preparation of the upload, or do we want them uploaded as such to the
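To make the prefix handling above concrete, here is a small self-contained illustration with hypothetical filenames (some plates ship TIFFs named `<plate>_B02_...`, others just `B02_...`):

```python
import os

def plate_and_prefix(firsttiff, source_plate):
    # Mirrors the branch in the script above (illustrative helper, not part of it).
    if firsttiff.startswith("B02"):
        # No plate name embedded in the TIFF names: take it from the directory
        # and prepend it so the symlinked files share one naming scheme.
        platename = os.path.basename(source_plate)
        return platename, platename + "_"
    # Plate name already embedded: strip everything from "_B02" onwards.
    return firsttiff[:firsttiff.index("_B02")], ""

assert plate_and_prefix("B02_s1_w1.tif", "/data/plate1") == ("plate1", "plate1_")
assert plate_and_prefix("plate1_B02_s1_w1.tif", "/data/plate1") == ("plate1", "")
```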
Thanks @sbesson - let's upload 1 or 2 representative plates to the idr0035 temporary bucket for validation, import etc. But we can also start zipping them all, ready for BioStudies upload. Zipping the outer
Note the generated datasets only have

It's possible that some steps of the workflow might need tweaking to accommodate that naming, but I don't expect that to be a problem.
Since it was only 100G in total, all 55 plates have been uploaded to the S3 bucket for validation. Also zipped them in-place using

Ready for the next phase.
Data look good in vizarr and validator:

Delete zips dir...
We currently have 46 out of 55 images viewable at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD847.html, but I am going to try and replace the filesets for ALL of them, as described in IDR/omero-mkngff#2 (without chunks - testing IDR/omero-mkngff#5). NB: needed to edit

This took about

Takes 1 or 2 secs for each sql.
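A sketch, assuming omero-mkngff emitted one `.sql` file per fileset (the directory layout and connection details below are hypothetical), of applying those statements with `psql`:

```python
import glob
import subprocess

# Hypothetical directory of per-fileset SQL files produced by omero-mkngff.
for sql_file in sorted(glob.glob("/tmp/mkngff_sql/*.sql")):
    # Each file takes only a second or two to apply, per the comment above.
    subprocess.run(
        ["psql", "-h", "localhost", "-U", "omero", "-d", "idr", "-f", sql_file],
        check=True,
    )
```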
Test on

Started mkngff 16:32... ...completed 22:56 (6.5 hours).

Load image from first Plate: http://localhost:1080/webclient/?show=image-3426101

Memo file generation appears twice in the logs for that fileset...

3710577 ms is about 62 minutes, i.e. roughly 1 hour.
idr0035-caie-drugresponse
Sample plate conversion failed with:
That looks unrelated to IDR/bioformats#29, doesn't it @sbesson?