Add section about hdf5 in the headless mode doc. #142

Merged
merged 1 commit into from
Oct 16, 2020

jjerphan
Contributor

This change adds a section of interest for Ilastik's headless mode documentation.

See discussions on image.sc forum:

@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/notable-memory-usage-difference-when-running-ilastik-in-headless-mode-on-different-machines/41144/6

Contributor

@k-dominik k-dominik left a comment

Hi @jjerphan,

thank you very much for the contribution! The text is on point. One thing that is missing is maybe the complete list of methods to convert your data to hdf5 as mentioned in the FAQ.

The only thing I am afraid of is that users find this information too late. When you're in the headless stage you have already undergone quite the suffering. Would you think this could be more findable in the Data Selection documentation? Would you, from a user perspective even read it?

@jjerphan
Contributor Author

jjerphan commented Aug 11, 2020

Hi @k-dominik

One thing that is missing is maybe the complete list of methods to convert your data to hdf5 as mentioned in the FAQ.

That's true. On my side, I am using a simple python script to convert TIF stacks to hdf5 which can be boiled down to:

#! /usr/bin/env python

import argparse
import os
import h5py

from skimage import io

def main():
    parser = argparse.ArgumentParser("TIF Stack to hdf5 converter")

    parser.add_argument("in_tif", help="Input TIF stack (3D image)")
    parser.add_argument("out_folder", help="Output folder")

    args = parser.parse_args()

    tif_stack_file = args.in_tif
    data = io.imread(tif_stack_file)

    os.makedirs(args.out_folder, exist_ok=True)

    # Convert a path like '/path/to/file.name.ext' to 'file.name'
    basename = os.path.splitext(os.path.basename(tif_stack_file))[0]

    file_name = os.path.join(args.out_folder, f"{basename}.h5")
    # chunks=True lets h5py pick a chunk shape, for better 3D access later
    with h5py.File(file_name, "w") as hf:
        hf.create_dataset("dataset", data=data, chunks=True)

if __name__ == "__main__":
    main()

This example might be handy for people who are more comfortable using scripts than the plugin, for example, or who would like to automate some processing. To be honest, I don't know what the best way to list the methods is. Do you have any idea? 🙂
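
For anyone adapting the script, the output-path step can also be written with `pathlib`. This is only a minimal sketch: the helper name `output_path` is hypothetical and not part of the script above, and it assumes the input is a single file path.

```python
from pathlib import Path


def output_path(in_tif: str, out_folder: str) -> Path:
    """Hypothetical helper: build '<out_folder>/<input name minus last extension>.h5'."""
    # Path.stem drops only the final suffix: 'file.name.ext' -> 'file.name'
    return Path(out_folder) / f"{Path(in_tif).stem}.h5"


if __name__ == "__main__":
    # Example input taken from the comment in the script above
    print(output_path("/path/to/file.name.ext", "converted"))
```

`Path.stem` also leaves a file name without any extension intact, so an input called `stack` would map to `stack.h5`.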

Would you think this could be more findable in the Data Selection documentation? Would you, from a user perspective even read it?

I don't really know: I am the kind of reader who only reads the documentation when I have a problem. I don't remember having read that section, for example. From a user perspective, I mainly rely on the indications given by the GUI or CLI, or on explicit warnings (in logs, for example). If I have any trouble, I first search the forum, then the issues, and then the docs with specific keywords.

But I think that this might really depend on the background of users.

@k-dominik
Contributor

Hi @jjerphan :)

That's true. On my side, I am using a simple python script to convert TIF stacks to hdf5 which can be boiled down to:

I'd think that people who are capable of writing their own scripts to convert data probably don't need any hints. So I'd probably just add those from the FAQ (which would be the only change I'd suggest for this PR).

But looking at your nice script (Thanks for sharing!) makes me think that we need a place to put things like these... -> #143

As a follow up on this PR I've opened #144 to maybe make all those performance tips more findable.

@jjerphan
Contributor Author

Hi @k-dominik

I'd think that people who are capable of writing their own scripts to convert data probably don't need any hints. So I'd probably just add those from the FAQ (which would be the only change I'd suggest for this PR).

Which scripts are you referring to? So that I can add them to this PR. 🙂

@k-dominik
Contributor

Hi @k-dominik

I'd think that people who are capable of writing their own scripts to convert data probably don't need any hints. So I'd probably just add those from the FAQ (which would be the only change I'd suggest for this PR).

Which scripts are you referring to? So that I can add them to this PR. 🙂

I'd propose to keep this separate. I meant the script you have shared here for TIF stack to hdf5 conversion. I wanted to collect some more ideas/opinions on where to put such scripts in #143.

@jjerphan
Contributor Author

OK I see. Shall I modify or add something to this PR? 🙂

@jjerphan
Contributor Author

Up @k-dominik.

@k-dominik
Contributor

thank you very much for your contribution @jjerphan !

@k-dominik k-dominik merged commit 34347f4 into ilastik:master Oct 16, 2020
@jjerphan jjerphan deleted the headless_hdf5_section branch October 16, 2020 21:24