Factor stream resource producing code into a separate class #290

skarakuzu · 2024-05-08T18:21:36Z

Closes #173

danielballan · 2024-05-09T13:57:23Z

src/ophyd_async/epics/areadetector/writers/general_hdffile.py

-from ._hdfdataset import _HDFDataset
+
+@dataclass
+class _HDFDataset:


Per @coretl:

This should be slimmed to only what's needed to describe a file, not what's needed to write it.

The pattern detector is different from the others because it actually writes the HDF5 file, rather than describing a file written by an external program (IOC). For the pattern detector, a separate dataclass should extend this one to add maxshape and fillvalue.

With that separation, we can make all of these fields required.

Some of this information is used to construct the EventDescriptor (shape, dtype, name?, multiplier). Other information is used to construct the StreamResource: path (meaning the "path" of the dataset inside the layout of the HDF5 file, as distinct from the path of the file itself) and, optionally, swmr.

Add dtype_str, the numpy-style dtype like "<i4" or "<f8".

This should also include the uri. (For old-style StreamResource we can convert this into root and resource_path by ignoring the hostname, hard-coding root to "/" as discussed, and using the rest of the path as resource_path.)

https://github.com/bluesky/bluesky/blob/bc8222be2e099a00baafcede1d761d61b213bf19/src/bluesky/callbacks/tiled_writer.py#L167-L178
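The uri-to-(root, resource_path) conversion described above can be sketched with the standard library alone. The function name `split_uri_for_legacy_resource` and the example URI are hypothetical, chosen only for illustration; the logic just follows the rule stated in the comment (ignore the hostname, hard-code root to "/", use the rest of the path as resource_path) and is not taken from ophyd-async or bluesky.

```python
from urllib.parse import urlparse


def split_uri_for_legacy_resource(uri: str) -> tuple[str, str]:
    """Split a file URI into (root, resource_path).

    Hypothetical helper: the hostname is ignored, root is hard-coded
    to "/", and the remainder of the path becomes resource_path.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"Expected a file:// URI, got {uri!r}")
    root = "/"
    # Drop the leading slash so that root + resource_path rebuilds the path.
    resource_path = parsed.path.lstrip("/")
    return root, resource_path


# Example URI is made up for this sketch.
root, resource_path = split_uri_for_legacy_resource(
    "file://localhost/data/2024/scan_0001.h5"
)
print(root, resource_path)
```

Percent-encoded characters in the path are left untouched here; a real implementation would probably need to decode them.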

pyproject.toml

src/ophyd_async/epics/areadetector/writers/general_hdffile.py

coretl · 2024-06-10T16:07:06Z

src/ophyd_async/epics/areadetector/writers/general_hdffile.py

+                bundler_composer(
+                    mimetype="application/x-hdf5",
+                    uri=uri,
+                    data_key=ds.name.replace("/", "_"),


Which would make this:

Suggested change

-                    data_key=ds.name.replace("/", "_"),
+                    data_key=ds.data_key,

I would make data_keys with / in them an error, not a silent replacement.

skarakuzu · 2024-06-12

I have a question about removing replace("/", "_"). If I remove it, I have to replace DATA_PATH = /entry/data/data with DATA_PATH = _entry_data_data in pattern_generator.py, and so on; otherwise the Bluesky RunEngine raises a "regex not matching" error. I have tentatively changed the paths this way in the code, but would welcome suggestions.

coretl · 2024-06-14

They are different things: data_key is the name within the descriptor, and dataset is the path within the HDF file. So for the pattern generator I think we want something like:

_HDFDataset(data_key=f"{detector_name}-data", dataset=DATA_PATH)

I'm not sure quite how you pass the detector name through and into the pattern generator, though...

skarakuzu · 2024-06-17

In pattern_generator.py, the PATH parameters were assigned to name (now data_key). What I meant is that deleting replace("/", "_") resulted in errors, so I had to modify the PATH parameters. Also, I am not sure I understand passing the detector name to _HDFDataset, since I am not experienced with how this class is used. Is a device going to assign the parameters in this class? I would be happy to have more context.
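To make the data_key versus dataset distinction in this thread concrete, here is a minimal sketch. `HDFDatasetSketch` is a hypothetical stand-in, not the real `_HDFDataset`, and the detector name is invented. It keeps the HDF5 path in `dataset`, builds `data_key` from the device name, and raises on a "/" in `data_key` instead of silently replacing it.

```python
from dataclasses import dataclass


@dataclass
class HDFDatasetSketch:
    """Illustrative stand-in for _HDFDataset (not the real class)."""

    #: Name of the data_key within the Descriptor document
    data_key: str
    #: Path of the dataset within the HDF5 file
    dataset: str

    def __post_init__(self) -> None:
        # Fail loudly rather than silently rewriting "/" to "_".
        if "/" in self.data_key:
            raise ValueError(
                f"data_key {self.data_key!r} must not contain '/'; "
                "put the HDF5 path in 'dataset' instead"
            )


DATA_PATH = "/entry/data/data"
detector_name = "pattern_det"  # hypothetical device name

ds = HDFDatasetSketch(data_key=f"{detector_name}-data", dataset=DATA_PATH)
print(ds.data_key, ds.dataset)

try:
    # Passing the HDF5 path as the data_key is rejected.
    HDFDatasetSketch(data_key=DATA_PATH, dataset=DATA_PATH)
except ValueError as exc:
    print(f"rejected: {exc}")
```

With this split, DATA_PATH can stay as "/entry/data/data" because it never becomes a data_key. How the detector name reaches the pattern generator is left open here, as it is in the thread.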

coretl · 2024-06-10T16:11:07Z

src/ophyd_async/epics/areadetector/writers/general_hdffile.py

+                    uri=uri,
+                    data_key=ds.name.replace("/", "_"),
+                    parameters={
+                        "path": ds.path,


And this would be:

Suggested change

-                        "path": ds.path,
+                        "dataset": ds.dataset,
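Taken together, the two suggestions in these threads would give a call whose arguments look roughly like the sketch below. `DatasetInfo` and `stream_resource_kwargs` are hypothetical names used only for illustration; the dictionary simply mirrors the keyword arguments visible in the diff (mimetype, uri, data_key, parameters) and is not the event-model API. Putting swmr into the parameters is an assumption based on the earlier remark that it is optionally part of the StreamResource.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class DatasetInfo:
    """Hypothetical, trimmed-down stand-in for _HDFDataset."""

    data_key: str  # name within the descriptor
    dataset: str  # path within the HDF5 file
    swmr: bool = False


def stream_resource_kwargs(uri: str, ds: DatasetInfo) -> Dict[str, Any]:
    """Build keyword arguments shaped like the bundler_composer call."""
    return {
        "mimetype": "application/x-hdf5",
        "uri": uri,
        "data_key": ds.data_key,
        # Assumption: swmr travels alongside the dataset path.
        "parameters": {"dataset": ds.dataset, "swmr": ds.swmr},
    }


kwargs = stream_resource_kwargs(
    "file://localhost/data/scan_0001.h5",  # made-up URI
    DatasetInfo(data_key="det-sum", dataset="/entry/sum"),
)
print(kwargs)
```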

DiamondJoseph · 2024-06-20T12:18:16Z

src/ophyd_async/epics/areadetector/writers/general_hdffile.py

+    #: Name of the data_key within the Descriptor document
+    data_key: str
+    dtype_numpy: Optional[str] = None
+    swmr: bool = False
+    shape: Optional[List[int]] = None
+    multiplier: Optional[int] = 1
+    #: Name of the dataset within the HDF file
+    dataset: Optional[str] = None
+    device_name: Optional[str] = None
+    block: Optional[str] = None
+    maxshape: tuple[Any, ...] = (None,)
+    dtype: Optional[Any] = None
+    fillvalue: Optional[int] = None


Suggested change

-    #: Name of the data_key within the Descriptor document
-    data_key: str
-    dtype_numpy: Optional[str] = None
-    swmr: bool = False
-    shape: Optional[List[int]] = None
-    multiplier: Optional[int] = 1
-    #: Name of the dataset within the HDF file
-    dataset: Optional[str] = None
-    device_name: Optional[str] = None
-    block: Optional[str] = None
-    maxshape: tuple[Any, ...] = (None,)
-    dtype: Optional[Any] = None
-    fillvalue: Optional[int] = None
+    #: Name of the data_key within the Descriptor document
+    data_key: str
+    # Numpy dtype representation of the type
+    dtype_numpy: str
+    # Is the h5 file written in SingleWriter, MultipleRead mode
+    swmr: bool = False
+    # Shape of a frame of the h5 file
+    shape: List[int] = []
+    # How many frames per Stream Datum index
+    multiplier: int = 1
+    #: Name of the dataset within the HDF file
+    dataset: str = None

coretl · 2024-06-20

That looks right to me.

subinsaji · 2024-06-20

Thanks for explaining!
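One practical caveat when trying the suggested field list as written: Python dataclasses reject a bare `[]` default and raise ValueError at class-definition time, so `shape: List[int] = []` would need `field(default_factory=list)`. The sketch below shows a runnable variant under that assumption. The class name `HDFDatasetSketch` is hypothetical, and typing `dataset` as `Optional[str]` is an illustrative choice to match its `None` default, not something decided in the thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HDFDatasetSketch:
    """Illustrative version of the slimmed-down dataset description."""

    #: Name of the data_key within the Descriptor document
    data_key: str
    #: Numpy dtype representation of the type, e.g. "<i4" or "<f8"
    dtype_numpy: str
    #: Whether the file is written in Single Writer Multiple Reader mode
    swmr: bool = False
    #: Shape of a frame; default_factory gives each instance its own list
    shape: List[int] = field(default_factory=list)
    #: How many frames per Stream Datum index
    multiplier: int = 1
    #: Name of the dataset within the HDF file
    dataset: Optional[str] = None


a = HDFDatasetSketch(data_key="det-data", dtype_numpy="<f8")
b = HDFDatasetSketch(data_key="det-sum", dtype_numpy="<i4", shape=[1])
a.shape.append(240)  # does not leak into other instances
print(a, b)
```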

DiamondJoseph · 2024-06-20T12:21:14Z

src/ophyd_async/sim/pattern_generator.py


 # pixel sum path
-SUM_PATH = "/entry/sum"
+SUM_PATH = "_entry_sum"

 MAX_UINT8_VALUE = np.iinfo(np.uint8).max

 SLICE_NAME = "AD_HDF5_SWMR_SLICE"



Suggested change

+@dataclass
+class PatternDataset(_HDFDataset):
+    maxshape: tuple[Any, ...] = (None,)
+    fillvalue: Optional[int] = None
+    dtype: Optional[type] = None  # Or whatever this was on the previous class
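A self-contained sketch of how the suggested subclass could sit on top of a describe-only base class follows. `BaseDataset` is a hypothetical stand-in for `_HDFDataset`, and the field values are made up. The point is only that the writer-specific fields (maxshape, fillvalue, dtype) live in the subclass while the base carries what is needed to describe the file.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class BaseDataset:
    """Stand-in for _HDFDataset: only what describes the file."""

    data_key: str
    dtype_numpy: str
    shape: List[int] = field(default_factory=list)


@dataclass
class PatternDataset(BaseDataset):
    """Adds what the pattern generator needs to write the file itself."""

    maxshape: Tuple[Any, ...] = (None,)
    fillvalue: Optional[int] = None
    dtype: Optional[type] = None


raw = PatternDataset(
    data_key="pattern_det-data",  # hypothetical data_key
    dtype_numpy="|u1",
    shape=[240, 320],
    maxshape=(None, 240, 320),
    fillvalue=0,
    dtype=int,
)
# Code that only describes the file can accept the base type.
print(isinstance(raw, BaseDataset), raw.maxshape)
```

Because the base class already has a field with a default, the subclass fields must also carry defaults; making them required would need `kw_only=True`, which is available from Python 3.10.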

DiamondJoseph · 2024-06-20T12:21:26Z

src/ophyd_async/sim/pattern_generator.py

 def get_full_file_description(
-    datasets: List[DatasetConfig], outer_shape: tuple[int, ...]
+    datasets: List[_HDFDataset], outer_shape: tuple[int, ...]


Suggested change

-    datasets: List[_HDFDataset], outer_shape: tuple[int, ...]
+    datasets: List[PatternDataset], outer_shape: tuple[int, ...]

DiamondJoseph · 2024-06-20T12:22:49Z

src/ophyd_async/sim/pattern_generator.py

@@ -255,24 +173,24 @@ def _get_new_path(self, directory: DirectoryProvider) -> Path:
        new_path: Path = info.root / info.resource_dir / filename
        return new_path

-    def _get_datasets(self) -> List[DatasetConfig]:
-        raw_dataset = DatasetConfig(
+    def _get_datasets(self) -> List[_HDFDataset]:


Suggested change

-    def _get_datasets(self) -> List[_HDFDataset]:
+    def _get_datasets(self) -> List[PatternDataset]:

DiamondJoseph · 2024-06-20T12:23:07Z

src/ophyd_async/sim/pattern_generator.py

-    def _get_datasets(self) -> List[DatasetConfig]:
-        raw_dataset = DatasetConfig(
+    def _get_datasets(self) -> List[_HDFDataset]:
+        raw_dataset = _HDFDataset(


Suggested change

-        raw_dataset = _HDFDataset(
+        raw_dataset = PatternDataset(

danielballan reviewed May 9, 2024

danielballan added this to the 0.4 milestone Jun 4, 2024

skarakuzu force-pushed the Factor-StreamResource-producing-code-into-a-separate-class branch from 86d7f97 to 2ac7873 Compare June 4, 2024 20:05

skarakuzu marked this pull request as ready for review June 5, 2024 20:51

skarakuzu requested review from coretl and abbiemery June 6, 2024 15:32

danielballan reviewed Jun 6, 2024

coretl requested changes Jun 10, 2024

Seher Karakuzu added 7 commits June 11, 2024 10:02

initial attempt to refactoring _HDFFile class

2da8dbe

few more changes to merge _HDFFile classes

87f765c

function to distinguish event model version

9978024

event model version check: some more changes

02ef379

few more changes

872a14c

completed initial refactoring

5a59c21

try downloading event-model via https

2aaab09

skarakuzu force-pushed the Factor-StreamResource-producing-code-into-a-separate-class branch from 9f50cb7 to 2aaab09 Compare June 11, 2024 14:13

Seher Karakuzu and others added 9 commits June 11, 2024 10:19

addressed comments

81623e5

Add event-model-version to CI test matrix.

69d229e

Declare event-model-version.

c553804

Default to latest event-model

d5d21cc

Fix syntax

b8402d2

some fix for tests

0eb1eb0

some more test fixes

9c1cb91

some more test fixes

7516df5

some more test fixes

8c6750e

abbiemery removed this from the 0.4 milestone Jun 17, 2024

subinsaji changed the base branch from main to dev June 18, 2024 09:42

subinsaji changed the base branch from dev to main June 18, 2024 09:51

subinsaji mentioned this pull request Jun 18, 2024

Refactor stream resource producing code #397

Closed

DiamondJoseph reviewed Jun 20, 2024
