idr0004-thorpe-rad52 S-BIAD867 #637

Open
will-moore opened this issue Feb 22, 2023 · 44 comments
Comments

@will-moore
Member

idr0004-thorpe-rad52

@dominikl
Member

Fails with:

2023-02-27 11:54:05,356 [main] WARN  loci.formats.FormatHandler - Ignoring extra series for well #95
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.esotericsoftware.kryo.util.UnsafeUtil (file:/home/dlindner/bioformats2raw-0.7.0-SNAPSHOT/lib/kryo-2.24.0.jar) to constructor java.nio.DirectByteBuffer(long,int,java.lang.Object)
WARNING: Please consider reporting this to the maintainers of com.esotericsoftware.kryo.util.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2023-02-27 11:54:35,469 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 91
java.lang.IllegalArgumentException: Invalid series: 91  index=91
        at loci.formats.FormatReader.seriesToCoreIndex(FormatReader.java:1267)
        at loci.formats.FormatReader.setSeries(FormatReader.java:928)
        at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
        at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
        at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
        at com.glencoesoftware.bioformats2raw.Converter.lambda$write$1(Converte

@sbesson
Member

sbesson commented Feb 27, 2023

The error rings a bell: I recall there were issues with some of the data from this historical submission.
Does this happen for all plates? If not, do you have a path to a representative sample?

@dominikl dominikl added the bug label Mar 6, 2023
@will-moore
Member Author

will-moore commented Jul 11, 2023

Going to use idr-ftp.openmicroscopy.org for conversion...

$ ssh -A idr-ftp.openmicroscopy.org
$ cd /data
$ sudo mkdir ngff && sudo chown wmoore ngff && cd ngff

$ conda create -n bioformats2raw python=3.9
$ conda activate bioformats2raw
$ conda install -c ome bioformats2raw

$ wget https://github.com/IDR/bioformats2raw/releases/download/v0.6.0-24/bioformats2raw-0.6.0-24.zip
$ unzip bioformats2raw-0.6.0-24.zip

$ sudo -Es git clone git@github.com:IDR/idr-metadata.git

$ mkdir idr0004
$ screen -S idr0004_ngff
$ conda activate bioformats2raw

(bioformats2raw) [wmoore@idrftp-ftp ngff]$ ./bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ./memo idr-metadata/idr0004-thorpe-rad52/screens/P101.screen idr0004/P101.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp5470699920951085994/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@16150369): java.io.FileNotFoundException: /uod/idr/filesets/idr0004-thorpe-rad52/Rad52_old/Rad52/P101/a2 (No such file or directory)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
        at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
        at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
        at picocli.CommandLine.call(CommandLine.java:2761)
        at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2192)
Caused by: java.io.FileNotFoundException: /uod/idr/filesets/idr0004-thorpe-rad52/Rad52_old/Rad52/P101/a2 (No such file or directory)
        at java.base/java.io.RandomAccessFile.open0(Native Method)
        at java.base/java.io.RandomAccessFile.open(RandomAccessFile.java:345)
        at java.base/java.io.RandomAccessFile.<init>(RandomAccessFile.java:259)

Of course - idr-ftp doesn't have /uod/idr/filesets/ mounted!

@will-moore
Member Author

Size estimate of data...
uint16 (2 bytes per pixel), z:10 c:2, single field, 96 wells per plate (some plates have missing wells), 47 plates

2 * 672 * 510 * 10 * 2 * 96 * 47 / 1000000000

61.8 GB
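
The same estimate as a small Python sketch. The function and parameter names are made up for illustration; the numbers are the ones quoted above. Since some plates have missing wells, this is an upper bound rather than an exact size.

```python
def estimate_size_gb(bytes_per_pixel, size_x, size_y, size_z, size_c,
                     wells_per_plate, plates):
    """Uncompressed size in GB (10^9 bytes), assuming every well is filled."""
    total_bytes = (bytes_per_pixel * size_x * size_y * size_z * size_c
                   * wells_per_plate * plates)
    return total_bytes / 1_000_000_000


# uint16, 672 x 510 pixels, z:10, c:2, 96 wells, 47 plates
size_gb = estimate_size_gb(2, 672, 510, 10, 2, 96, 47)
print(f"{size_gb:.2f} GB")
```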

@will-moore will-moore assigned will-moore and dominikl and unassigned will-moore Jul 13, 2023
@joshmoore
Member

cc: @dgault in case it rings a bell for him as well.

@joshmoore
Member

@melissalinkert and I did some dredging of old synapses re: rad52 during the formats meeting:

  • This is one of the original JCB datasets and used the original ScreenReader functionality (pre-IDR).
  • We assume that we tested the IDR-ScreenReader functionality against it and that's what it should be using today. (@dominikl: you do have the IDR version, right?)
  • If that's all the case, then it's unlikely that we want to try to touch the underlying format to fix this and additionally, the format won't have any metadata to speak of.
  • ergo: candidate for omero-cli-zarr?

@dominikl
Member

dominikl commented Aug 7, 2023

I just used bioformats2raw, I didn't use any specific IDR reader version. Yes, probably easiest to use cli-zarr export. Let me try...

@dominikl
Member

dominikl commented Aug 7, 2023

Unfortunately doesn't work either:

Exporting to P101.ome.zarr (0.4)
Traceback (most recent call last):
  File "/home/dlindner/miniconda3/envs/myenv/bin/omero", line 11, in <module>
    sys.exit(main())
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/main.py", line 125, in main
    rv = omero.cli.argv()
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1784, in argv
    cli.invoke(args[1:])
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1222, in invoke
    stop = self.onecmd(line, previous_args)
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1299, in onecmd
    self.execute(line, previous_args)
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1381, in execute
    args.func(args)
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/cli.py", line 125, in _wrapper
    return func(self, *args, **kwargs)
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/cli.py", line 345, in export
    plate_to_zarr(plate, args)
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/raw_pixels.py", line 311, in plate_to_zarr
    write_plate_metadata(
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/ome_zarr/writer.py", line 378, in write_plate_metadata
    "wells": _validate_plate_wells(wells, rows, columns, fmt=fmt),
  File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/ome_zarr/writer.py", line 157, in _validate_plate_wells
    raise ValueError("Empty wells list")
ValueError: Empty wells list

@dominikl
Member

dominikl commented Aug 7, 2023

Tried some other plates again with bioformats2raw, all failed with errors like

...
2023-08-07 13:49:46,548 [main] WARN  loci.formats.FormatHandler - Ignoring extra series for well #94
2023-08-07 13:49:46,586 [main] WARN  loci.formats.FormatHandler - Ignoring extra series for well #95
2023-08-07 13:50:11,914 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 82
java.lang.IllegalArgumentException: Invalid series: 82  index=82
	at loci.formats.FormatReader.seriesToCoreIndex(FormatReader.java:1267)
	at loci.formats.FormatReader.setSeries(FormatReader.java:928)
	at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
	at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
	at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
	at com.glencoesoftware.bioformats2raw.Converter.lambda$write$1(Converter.
...

@joshmoore
Member

ValueError: Empty wells list

ah, at least that might be a straight-forward one to fix in Python land.

@will-moore
Member Author

The issue is that the first Well is empty, so when we write metadata after the first Well, we have no Wells to write (also found that print of progress/eta fails).

I just pushed a fix to ome/omero-cli-zarr@1d72626, which is the branch we're using (you'll want to use the --name_by name option for this export too).
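
A minimal sketch of the failure mode described above, not the actual change in that commit. It assumes plate metadata is rewritten after each Well is processed, and the helper name and the `counts` data are hypothetical. The guard skips the metadata write until at least one non-empty Well has been seen, which is what avoids "Empty wells list" when the first Well is empty.

```python
def metadata_updates(well_image_counts):
    """Yield the list of well paths to write after each processed well.

    Nothing is yielded while the list is still empty, so a plate whose
    first well has no images never triggers a write with zero wells.
    """
    seen = []
    for path, count in well_image_counts.items():
        if count > 0:
            seen.append(path)
        if seen:  # skip the write until at least one well exists
            yield list(seen)


# Hypothetical plate: wells A/1 and A/3 are empty
counts = {"A/1": 0, "A/2": 1, "A/3": 0, "A/4": 1}
updates = list(metadata_updates(counts))
print(updates)
```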

@joshmoore
Member

Awesome, thanks, @will-moore. How far away from a release on those changes do you think we are?

@will-moore
Member Author

@joshmoore There are some issues at ome/omero-cli-zarr#147 like whether to support removal of .pattern from the plate name, and how to handle names with spaces.

Probably I'll remove the .pattern renaming since that can be handled via renaming on cli after export, as I've done at #638
Don't know how to handle whitespace in names. Tried a couple of times to create zarrs with whitespace in names but don't remember what I tried now.

@will-moore
Member Author

@joshmoore Added more testing to ome/omero-cli-zarr#147 about writing zarrs with various names. Turns out that whitespace isn't the issue but [] being recognised as regex is the problem.
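
To illustrate this class of problem (square brackets in a name being treated as a pattern), here is a small sketch using only the Python standard library. It uses shell-style glob patterns via `fnmatch` as a stand-in; whether the zarr stack hits exactly this mechanism is an assumption, and the plate name is made up.

```python
import fnmatch
import glob

# Hypothetical plate name containing square brackets
name = "plate [1].ome.zarr"

# Used as a pattern, "[1]" is a character class matching the single
# character "1", so the name does not even match itself.
matches_itself = fnmatch.fnmatchcase(name, name)

# glob.escape() wraps the special characters so they match literally.
escaped = glob.escape(name)
matches_escaped = fnmatch.fnmatchcase(name, escaped)

print(matches_itself, escaped, matches_escaped)
```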

@dominikl
Member

dominikl commented Aug 8, 2023

👍 Thanks Will, seems to work now. I'll carry on converting idr0004 then and upload to biostudies.

@dominikl
Member

dominikl commented Aug 9, 2023

Converted and uploaded to Biostudies.

@dominikl dominikl moved this from test convert to BioStudies Submission in NGFF conversion Aug 9, 2023
@dominikl dominikl removed their assignment Aug 9, 2023
@dominikl dominikl removed the bug label Aug 17, 2023
@will-moore will-moore assigned francesw and will-moore and unassigned francesw Aug 17, 2023
@will-moore
Member Author

We seem to be missing a zip: I only see 46 .zip files on the page at https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0004 but we need 47, see https://idr.openmicroscopy.org/webclient/?show=screen-202

Used this JS code on the submissions page above to load names from IDR and compare:

let url = "https://idr.openmicroscopy.org/webclient/api/plates/?id=202"
let idr_plates = await fetch(url).then(rsp => rsp.json());
let idr_names = idr_plates.plates.map(p => p.name);
let names = [];
[].forEach.call(document.querySelectorAll("div [role='row'] .ag-cell[col-id='name']"), function(div) {
  names.push(div.innerHTML.trim().replace(".ome.zarr.zip", ""));
});
idr_names.forEach(n => {if (names.indexOf(n) == -1) {console.log(n)}; });

It doesn't find any idr_names that are missing from this page, even though the idr_names.length is 47 and names.length is 46.
It turns out that there are 2 plates named P132!
https://idr.openmicroscopy.org/webclient/?show=plate-1774
https://idr.openmicroscopy.org/webclient/?show=plate-1966

They both appear to have the same Wells, but in different positions!
This is going to need some thought as our current workflow relies on different-named Filesets
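
A short Python sketch of why the name comparison above reports nothing missing even though the counts differ: a membership check cannot see duplicates, whereas counting names can. The name lists here are a hypothetical subset, not the full 47 plates.

```python
from collections import Counter


def duplicated_names(names):
    """Return the names that occur more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical subset of plate names from IDR and from the zip listing
idr_names = ["P101", "P132", "P132", "P133"]
zip_names = ["P101", "P132", "P133"]

# Same check as the JS snippet: every IDR name is present, so nothing is logged
missing = [n for n in idr_names if n not in zip_names]
dupes = duplicated_names(idr_names)
print(missing, dupes)
```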

cc @sbesson

@will-moore
Member Author

As before, the query (Well<0)&(Plate==1966) now finds the 9 empty wells for Plate-1966:
http://localhost:12345/webclient/omero_table/48436502/?query=(Well%3C0)%26(Plate==1966)

C4, C8, C10, E9, F10, F11, G5, H3, H11, which corresponds to #637 (comment)

@will-moore will-moore moved this from Data on Embassy s3 to create new Filesets in idr-next in NGFF conversion Sep 6, 2023
@will-moore
Member Author

will-moore commented Sep 6, 2023

Testing mkngff on idr0125-pilot...

idr0004/P170.ome.zarr,S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3,12953
idr0004/P144.ome.zarr,S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5,12945
idr0004/P145.ome.zarr,S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14,12947
idr0004/P111.ome.zarr,S-BIAD867/0bb5992f-e8d8-45b1-9e5d-d0fb8325aabb,12917
idr0004/P120.ome.zarr,S-BIAD867/0d3e6be1-0c0a-42ef-8775-e3557c359b2d,12990
idr0004/P101.ome.zarr,S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1,12909
idr0004/P105.ome.zarr,S-BIAD867/1d37d3c1-08f2-42a9-8c61-97fde7f221dd,12911
idr0004/P142.ome.zarr,S-BIAD867/264695a5-9c23-4204-aa46-dd7b709fe137,12943
idr0004/P149.ome.zarr,S-BIAD867/2b2aab5c-6d70-4e1d-8336-11e7d018759d,12951
idr0004/P138.ome.zarr,S-BIAD867/2bff1cee-81c0-467b-98f5-01db06f4d042,12939

Only let it run for 3 filesets... - took about 3 minutes each...

Found prefix demo_2/2015-10/01 // 08-49-38.885 for fileset 12953
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885_mkngff/00d88a93-8d21-4a50-b8b5-60f11bcae0d3.zarr -> /bia-integrator-data/S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3/00d88a93-8d21-4a50-b8b5-60f11bcae0d3.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2015-10/01 // 08-34-48.864 for fileset 12945
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864_mkngff/02c5d63f-36f5-4862-9682-ec3a2702a1e5.zarr -> /bia-integrator-data/S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5/02c5d63f-36f5-4862-9682-ec3a2702a1e5.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2015-10/01 // 08-37-19.100 for fileset 12947
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100_mkngff/06e3fba2-825a-441d-a3cb-2084515b1b14.zarr -> /bia-integrator-data/S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14/06e3fba2-825a-441d-a3cb-2084515b1b14.zarr
bash-4.2$ for r in $(cat $IDRID.csv); do
>   fsid=$(echo $r | cut -d',' -f3)
>   psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
> done
UPDATE 95
BEGIN
 mkngff_fileset 
----------------
        5287570
(1 row)
COMMIT
UPDATE 98
BEGIN
 mkngff_fileset 
----------------
        5287571
(1 row)
COMMIT
UPDATE 95
BEGIN
 mkngff_fileset 
----------------
        5287572
(1 row)
COMMIT
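
For reference, the shell loop above takes the third comma-separated column of each CSV row as the Fileset ID and runs the matching `<id>.sql` file. A minimal Python sketch of the same parsing, using two of the rows listed earlier; the function name and tuple layout are illustrative only.

```python
import csv
import io


def parse_mkngff_rows(text):
    """Parse 'zarr name,BIA path,fileset id' rows into tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # tolerate blank lines
            continue
        zarr_name, bia_path, fileset_id = row
        rows.append((zarr_name, bia_path, int(fileset_id)))
    return rows


sample = """\
idr0004/P170.ome.zarr,S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3,12953
idr0004/P144.ome.zarr,S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5,12945
"""

rows = parse_mkngff_rows(sample)
sql_files = [f"{fileset_id}.sql" for _, _, fileset_id in rows]
print(sql_files)
```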

@will-moore
Member Author

Waiting on memo regeneration, e.g. http://localhost:1080/webclient/?show=image-698736

@will-moore
Member Author

Looks good - thumbnails updated..

Screenshot 2023-09-06 at 12 05 44

@will-moore
Member Author

will-moore commented Sep 11, 2023

To test mkngff on all 46 Plates on idr-testing... NB - plate to be deleted above will be ignored by mkngff based on table below...

https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD867.html

idr0004/P170.ome.zarr,S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3,12953
idr0004/P144.ome.zarr,S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5,12945
idr0004/P145.ome.zarr,S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14,12947
idr0004/P111.ome.zarr,S-BIAD867/0bb5992f-e8d8-45b1-9e5d-d0fb8325aabb,12917
idr0004/P120.ome.zarr,S-BIAD867/0d3e6be1-0c0a-42ef-8775-e3557c359b2d,12990
idr0004/P101.ome.zarr,S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1,12909
idr0004/P105.ome.zarr,S-BIAD867/1d37d3c1-08f2-42a9-8c61-97fde7f221dd,12911
idr0004/P142.ome.zarr,S-BIAD867/264695a5-9c23-4204-aa46-dd7b709fe137,12943
idr0004/P149.ome.zarr,S-BIAD867/2b2aab5c-6d70-4e1d-8336-11e7d018759d,12951
idr0004/P138.ome.zarr,S-BIAD867/2bff1cee-81c0-467b-98f5-01db06f4d042,12939
idr0004/P128.ome.zarr,S-BIAD867/2f42ce30-0ac3-4056-98d8-9ef1608dc019,13124
idr0004/P115.ome.zarr,S-BIAD867/35cfc0db-7795-497c-aed5-1ae591b2d9f1,12919
idr0004/P132.ome.zarr,S-BIAD867/3df515f9-75c8-4cc4-8c29-480ae9817880,13125
idr0004/P109.ome.zarr,S-BIAD867/430b28a8-a2c9-4d71-8e54-5b45ca051c51,12915
idr0004/P112.ome.zarr,S-BIAD867/4f4a0699-9f42-4272-a929-9e0139ec3857,12918
idr0004/P121.ome.zarr,S-BIAD867/59ef0f42-3306-411c-a168-8977911fe63c,12923
idr0004/P117.ome.zarr,S-BIAD867/66925bed-7857-461d-9c56-f6a9c5b7dd69,12920
idr0004/P133.ome.zarr,S-BIAD867/66f4e3df-441c-4003-be2b-1b9f6984543a,12934
idr0004/P140.ome.zarr,S-BIAD867/685006c9-293e-43f1-819b-2669c5916add,12941
idr0004/P110.ome.zarr,S-BIAD867/686fa20f-c678-4d3c-8367-6572dc5aca4d,12916
idr0004/P129.ome.zarr,S-BIAD867/74005c12-4837-48f5-b6f2-e6eb71a89ac3,12929
idr0004/P134.ome.zarr,S-BIAD867/74e0533d-d06b-4a46-bb52-b155190e3c8d,12935
idr0004/P139.ome.zarr,S-BIAD867/7ac3742c-2c37-45b2-8d04-3ef62daeeb8d,12940
idr0004/P146.ome.zarr,S-BIAD867/7c35875b-1f46-46a4-95db-a6efceba0ae9,12948
idr0004/P148.ome.zarr,S-BIAD867/857896cf-5b33-40e4-8fbb-56b0aa15decd,12950
idr0004/P130.ome.zarr,S-BIAD867/881a57a6-6da1-4306-bd8b-db0da2c8a076,12930
idr0004/P119.ome.zarr,S-BIAD867/8c9e1c63-61cc-4264-81ff-15534e962fcb,12922
idr0004/P150.ome.zarr,S-BIAD867/904d4a80-a3d3-4af9-af47-8a0f6e1776b7,12952
idr0004/P118.ome.zarr,S-BIAD867/a0394bf7-13c9-44c4-812d-c29e3b765bc0,12921
idr0004/P126.ome.zarr,S-BIAD867/a3ff8e5d-f665-4c18-a0eb-543730ca7b12,12927
idr0004/P135.ome.zarr,S-BIAD867/af83f2a1-2d0d-4d16-bcc3-6c83bf4e6b98,12936
idr0004/P136.ome.zarr,S-BIAD867/bc7ea09d-9a23-4d7a-88a4-af48459bd9ee,12937
idr0004/P147.ome.zarr,S-BIAD867/bf30beb6-72f5-4235-8fdc-12c946636951,12949
idr0004/P108.ome.zarr,S-BIAD867/c310fd5e-d6d9-49bb-840e-4cbb7b275b81,12914
idr0004/P131.ome.zarr,S-BIAD867/d408a260-f019-45e9-8e46-d09a874bcbc5,12932
idr0004/P107.ome.zarr,S-BIAD867/d98c3997-7b71-440d-a815-4f9bc70b8b22,12913
idr0004/P143.ome.zarr,S-BIAD867/dcd2adf1-10c1-4960-b01c-1426e1b46f6b,12944
idr0004/P125.ome.zarr,S-BIAD867/dff92046-c53b-4948-95be-cea12e577e9e,12926
idr0004/P171.ome.zarr,S-BIAD867/e3283a6a-d25b-41e1-8ab7-1837b89e3a6e,12954
idr0004/P123.ome.zarr,S-BIAD867/e6a5a8ba-3cdb-425d-b701-c1c0382a3eeb,12924
idr0004/P102.ome.zarr,S-BIAD867/ee396ed4-07f1-4351-ac9e-5956dd92000b,12910
idr0004/P124.ome.zarr,S-BIAD867/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540,12925
idr0004/P141.ome.zarr,S-BIAD867/ef42a819-34ad-4fbb-b7b9-3e2eb0f4fd17,12942
idr0004/P137.ome.zarr,S-BIAD867/f5ce45be-0b8c-4539-ae29-66978555f0ec,12938
idr0004/P106.ome.zarr,S-BIAD867/f791de8d-cd01-4303-8b08-67cbbbb45b64,12912
idr0004/P127.ome.zarr,S-BIAD867/fcc7c91a-f0e6-43f4-93a6-220e9224eda5,12928

Started mkngff 12:24. "about 3 mins each" x 46 = 2.5 hours...

Viewing images on first Plate P101...

$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "185_mkngff"

2023-09-11 20:22:49,697 DEBUG [                   loci.formats.Memoizer] (l.Server-6) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-25-30.185_mkngff/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/..zattrs.bfmemo (171077 bytes)
2023-09-11 20:22:49,697 DEBUG [                   loci.formats.Memoizer] (l.Server-6) start[1694463707838] time[61858] tag[loci.formats.Memoizer.setId]
2023-09-11 20:22:49,697 INFO  [                ome.io.nio.PixelsService] (l.Server-6) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-25-30.185_mkngff/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/.zattrs Series: 0

SetId took about a minute.
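
The "about a minute" figure can be read from the `time[...]` field of the Memoizer line above. A small sketch for pulling that out of a log line, assuming the value is in milliseconds (which is consistent with the `start[...]` value and the line's own timestamp); the regex and function name are illustrative.

```python
import re

# Timing line copied from the Blitz log excerpt above
LINE = ("2023-09-11 20:22:49,697 DEBUG [                   "
        "loci.formats.Memoizer] (l.Server-6) start[1694463707838] "
        "time[61858] tag[loci.formats.Memoizer.setId]")

TIMING = re.compile(r"start\[(\d+)\] time\[(\d+)\] tag\[([\w.]+)\]")


def parse_timing(line):
    """Return (tag, elapsed seconds) or None if the line has no timing."""
    match = TIMING.search(line)
    if match is None:
        return None
    _start_ms, elapsed_ms, tag = match.groups()
    return tag, int(elapsed_ms) / 1000.0  # assumes milliseconds


timing = parse_timing(LINE)
print(timing)
```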

@will-moore
Member Author

Results of check_pixels...

Checking 50 images from each plate: IDR/idr-utils#55 (comment)

@will-moore
Member Author

Plate P132, where check_pixels found mismatches, was previously duplicated in IDR (and 1 of the duplicates was deleted in the last release). It looks like we have the wrong NGFF plate for the remaining IDR plate.
Need to re-export...

The image at C7 on P115 with "missing chunks" at IDR/idr-utils#55 (comment) looks OK at https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/35cfc0db-7795-497c-aed5-1ae591b2d9f1/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr/C/7/0/

Image

The rendered image is clearly corrupted: http://localhost:1080/webclient/render_image/692975/0/0/?c=1|586:2652$FFFFFF,2|257:1400$00FF00

Image

@will-moore
Member Author

will-moore commented Nov 28, 2023

Dealing with the duplicate plates issue:

Re-export on idr-ftp /data/idr0004/

conda activate omero_zarr_export
omero zarr export Plate:1966 --name_by=name
...

Deleted P132.ome.zarr.zip on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0004 and replaced:

(base) [wmoore@idrftp-ftp ~]$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/idr0004/idr0004/ [email protected]:5f/xxx-xxx-xx-x-xxxx
P132.ome.zarr.zip                                                                                                                                                                                                     100%  364MB  377Mb/s    00:07    
Completed: 372995K bytes transferred in 7 seconds
 (417463K bits/sec), in 1 file, 1 directory.

@will-moore
Member Author

Checking image above:

$ python check_pixels.py Image:692975 --max-planes=sizeC
0/1 Check Image:692975 P115 [Well C7, Field 1]
ERROR:omero.gateway:Failed to getPlane() or getTile() from rawPixelsStore
Traceback (most recent call last):
  File "/Users/wmoore/Desktop/PY/omero-py/target/omero/gateway/__init__.py", line 7542, in getTiles
    convertedPlane = unpack(convertType, rawPlane)
struct.error: unpack requires a buffer of 614400 bytes
Error: Image:692975 unpack requires a buffer of 614400 bytes
End: 2023-11-29 15:47:14.657066

If we print the size of the bytes right before the error above, we can see that the buffer we get is too large:

                convertType = '>%d%s' % (
                    (planeY*planeX), pixelTypes[pixelType][0])
                print("rawPlane", len(rawPlane), convertType)
                if isinstance(rawPlane, bytes):
                    convertedPlane = unpack(convertType, rawPlane)

prints

rawPlane 685440 >307200H

685440 is what is returned when we expect 614400 bytes (2 bytes per pixel, 307200 pixels is 480 * 640).
685440 / 2 is 342720 pixels or 672 x 510.
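
The arithmetic above can be checked with the standard `struct` module. This is a standalone sketch that reproduces the same kind of failure with a zero-filled buffer; it does not call omero-py, and `plane_bytes` is a helper made up for the example.

```python
import struct


def plane_bytes(size_x, size_y, bytes_per_pixel=2):
    """Size in bytes of one uint16 plane."""
    return size_x * size_y * bytes_per_pixel


fmt = ">%dH" % (640 * 480)       # format built from the expected 640 x 480 plane
expected = struct.calcsize(fmt)  # bytes that unpack() requires
actual = plane_bytes(672, 510)   # bytes actually returned for a 672 x 510 plane

try:
    struct.unpack(fmt, bytes(actual))  # zero-filled buffer of the wrong size
    error = None
except struct.error as exc:
    error = str(exc)

print(fmt, expected, actual, error)
```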

When rendering resolution level 1 (half size) we get 336 x 255 if we ask for a big Tile:

Screenshot 2023-11-29 at 16 00 26

but we don't see this when requesting level 0 (full size):

Screenshot 2023-11-29 at 16 01 44

Requesting tile=1,0,0,320,240 crops to those dimensions so we get the image without spacers.

@will-moore
Member Author

The size mismatch is because ZarrReader is assuming that all images in the Plate are the same size.
So, for the 2 affected Filesets, we need to disable this by adding zarrreader.quick_read=false to the bfoptions file. As omero-server user on idr-testing:omeroreadwrite...

Plate P115:

vi /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-46-42.965_mkngff/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr.bfoptions

Plate P124:

vi /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-57-40.271_mkngff/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540.zarr.bfoptions

Now they look like this:

omezarr.list_pixels=false
zarrreader.quick_read=false
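
If more Filesets turn out to need this, the manual `vi` edit could be scripted. A minimal sketch, assuming the bfoptions file is plain `key=value` lines as shown above; `set_option` is a hypothetical helper, and the demo writes to a temporary file rather than the ManagedRepository.

```python
import tempfile
from pathlib import Path


def set_option(bfoptions_path, key, value):
    """Set key=value in a bfoptions file, replacing any existing entry."""
    path = Path(bfoptions_path)
    lines = path.read_text().splitlines() if path.exists() else []
    kept = [line for line in lines if not line.startswith(key + "=")]
    kept.append(f"{key}={value}")
    path.write_text("\n".join(kept) + "\n")
    return kept


# Demo on a throwaway file with a made-up name
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "example.zarr.bfoptions"
    demo.write_text("omezarr.list_pixels=false\n")
    result = set_option(demo, "zarrreader.quick_read", "false")
    content = demo.read_text()

print(content)
```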

Now delete the existing memo files...

(venv3) bash-4.2$ rm /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-57-40.271_mkngff/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540.zarr/..zattrs.bfmemo 
(venv3) bash-4.2$ rm /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-46-42.965_mkngff/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr/..zattrs.bfmemo 
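
The memo paths above follow the pattern seen in the Blitz log earlier in this thread: the cache directory, then the full path of the directory containing `.zattrs`, then the file name prefixed with `.` and suffixed with `.bfmemo`. A sketch that derives the path under that inferred pattern (it is read off the paths in this thread, not taken from Bio-Formats documentation):

```python
import posixpath


def memo_path(cache_dir, target_file):
    """Build the memo file path for target_file under cache_dir."""
    directory, name = posixpath.split(target_file)
    return posixpath.join(cache_dir + directory, "." + name + ".bfmemo")


zattrs = ("/data/OMERO/ManagedRepository/demo_2/2015-10/01/"
          "07-46-42.965_mkngff/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr/.zattrs")
memo = memo_path("/data/OMERO/BioFormatsCache", zattrs)
print(memo)
```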

Try to view images again...

@will-moore
Member Author

Image looks good now:

Screenshot 2023-12-01 at 13 48 04

python check_pixels.py Image:692975 --max-planes=sizeC

Start: 2023-12-01 13:50:59.158362
Checking Image:692975
max_planes: sizeC
max_images: 0
0/1 Check Image:692975 P115 [Well C7, Field 1]
End: 2023-12-01 13:51:18.206637

@will-moore
Member Author

P132 plate has been updated on BioStudies.

https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr
now matches https://idr.openmicroscopy.org/webclient/?show=image-797783 (Fileset ID: 13125)

On idr0125-pilot...

$ omero mkngff sql 13125 --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" > "idr0004/13125.sql"

$ cat idr0004/13125.sql | wc
    719    2855  212416

Added to IDR/mkngff_upgrade_scripts@db9efb8

@will-moore
Member Author

The sql above was generated when logged in to idr.openmicroscopy.org, so we have the original Fileset ID etc.
But we don't have a test server to test that, as they've all had the idr0004 Plate P132 updated with mkngff already.

So let's generate fresh sql from the updated plate on idr0125-pilot...

as omero-server... logged in to localhost

omero mkngff sql 5287668 --secret=$SECRET --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" > "idr0004/5287668.sql"

$ psql -U omero -d idr -h $DBHOST -f idr0004/5287668.sql 
UPDATE 91
BEGIN
 mkngff_fileset 
----------------
        5289226
(1 row)
COMMIT

$ omero mkngff symlink /data/OMERO/ManagedRepository 5287668 "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" --bfoptions
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr -> /bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr.bfoptions

Viewing in webclient looks good - after memo regenerated...

python check_pixels.py Plate:1966 --max-planes=sizeC > /tmp/check_pix_20231219_plate1966.log

...
82/87 Check Image:797812 P132 [Well F8, Field 1]
83/87 Check Image:797813 P132 [Well A8, Field 1]
84/87 Check Image:797814 P132 [Well F12, Field 1]
85/87 Check Image:797815 P132 [Well A7, Field 1]
86/87 Check Image:797816 P132 [Well H10, Field 1]
End: 2023-12-19 16:33:26.819467

(base) bash-4.2$ grep Error !$
grep Error /tmp/check_pix_20231219_plate1966.log

@will-moore
Member Author

On idr-next, as omero-server...

cd
git clone https://github.com/IDR/mkngff_upgrade_scripts.git

cd mkngff_upgrade_scripts/ngff_filesets/idr0004
sed -i 's/SECRETUUID/e3e1ac30-7b69-473b-98b2-428780578b1c/g' 13125.sql

$ psql -U omero -d idr -h $DBHOST -f 13125.sql 
UPDATE 91
BEGIN
 mkngff_fileset 
----------------
        6314437
(1 row)
COMMIT

$ omero mkngff symlink /data/OMERO/ManagedRepository 13125 "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" --bfoptions

Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr -> /bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr.bfoptions
python check_pixels.py Plate:1966 --max-planes=sizeC > /tmp/check_pix_20231219_plate1966.log
...
83/87 Check Image:797813 P132 [Well A8, Field 1]
84/87 Check Image:797814 P132 [Well F12, Field 1]
85/87 Check Image:797815 P132 [Well A7, Field 1]
86/87 Check Image:797816 P132 [Well H10, Field 1]
End: 2023-12-19 17:01:26.115466

grep Error /tmp/check_pix_20231219_plate1966.log

@will-moore will-moore moved this from check_pixels to check_pixels in progress in NGFF conversion Jan 3, 2024
@will-moore will-moore moved this from check_pixels in progress to pixels validated in NGFF conversion Jan 4, 2024
@will-moore
Member Author

Checking Fileset IDs still valid:

(base) Williams-MacBook-Pro:ngff_filesets wmoore$ pwd
/Users/wmoore/Desktop/IDR/mkngff_upgrade_scripts/ngff_filesets
(base) Williams-MacBook-Pro:ngff_filesets wmoore$ python parse_bia_uuids.py idr0004
46 filesets matched

@will-moore will-moore moved this from pixels validated to Round 2 - psql fileset IDs checked in NGFF conversion Mar 18, 2024
@will-moore will-moore moved this from Other issues (not studies) to NGFF studies in NGFF conversion May 21, 2024
@will-moore will-moore mentioned this issue May 21, 2024
15 tasks
Status: NGFF studies

5 participants