Is your feature request related to a problem? Please describe.
In many data processing tasks there are multiple distinct inputs, e.g. generating the smartspim template or training a model. We added minimal support for this in #1166 in the form of a list of input locations, but discussed a plan to add additional context explaining the selection of those inputs.
Describe the solution you'd like
class CompositeData(AindModel):
    """Description of a group of data assets used together"""

    data_assets: List[str] = Field(..., title="Data assets")
    shared_metadata: Optional[AindGenericType] = Field(
        default=None,
        title="Shared metadata",
        description="Common attributes that provide context for this grouping of assets",
    )
    curation_purpose: Optional[str] = Field(
        default=None,
        title="Curation purpose",
        description="Reason for grouping assets together for processing",
    )
Describe alternatives you've considered
The most obvious alternative is to create a separate DataProcess stage for "curation" or the like, with no inputs and the list of curated assets as an output parameter. Other contextual fields could be entered as notes or parameters of the process.
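For comparison, here is a rough sketch of what that alternative might look like. The `DataProcess` below is a simplified stand-in dataclass, not the real schema class, and its field names (`name`, `inputs`, `output_parameters`, `notes`) and the `curation_stage` helper are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DataProcess:
    """Simplified stand-in for the schema's DataProcess (hypothetical fields)."""

    name: str
    inputs: List[str] = field(default_factory=list)
    output_parameters: Dict[str, Any] = field(default_factory=dict)
    notes: Optional[str] = None


def curation_stage(assets: List[str], purpose: str) -> DataProcess:
    """Record a curated asset list as a process with no inputs."""
    return DataProcess(
        name="Curation",
        inputs=[],  # the curation stage itself consumes nothing
        output_parameters={"curated_assets": list(assets)},
        notes=purpose,  # context ends up in free text rather than typed fields
    )


# A downstream process then takes the curated list as its inputs.
curation = curation_stage(["asset_a", "asset_b"], "Inputs for template generation")
downstream = DataProcess(
    name="Template generation",
    inputs=curation.output_parameters["curated_assets"],
)
```

In this shape the grouping context lives in `notes` and an untyped parameter dictionary, which is the main trade-off against a dedicated object.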
Additional context
One advantage of a separate object for this is that it would be easier to write a script to run a new process on the same set of data assets as a previous processing result (could be a common use case for model evaluation etc).
Unclear if this would be reused/allowed anywhere other than the input field of DataProcess
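The reuse case mentioned above (running a new process on the same assets as an earlier result) could be sketched roughly as follows. This uses plain dataclasses as stand-ins for `AindModel` and `DataProcess`; the `input` field name, the `rerun_on_same_inputs` helper, and the asset names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CompositeData:
    """Stand-in for the proposed CompositeData model."""

    data_assets: List[str]
    shared_metadata: Optional[Dict[str, Any]] = None
    curation_purpose: Optional[str] = None


@dataclass
class DataProcess:
    """Simplified stand-in; the real class has many more fields."""

    name: str
    input: CompositeData


def rerun_on_same_inputs(previous: DataProcess, new_name: str) -> DataProcess:
    """Build a new process that reuses the previous process's asset group."""
    return DataProcess(name=new_name, input=previous.input)


group = CompositeData(
    data_assets=["asset_a", "asset_b"],  # hypothetical asset names
    shared_metadata={"modality": "SPIM"},
    curation_purpose="Training set for model v1",
)
training = DataProcess(name="Model training", input=group)
evaluation = rerun_on_same_inputs(training, "Model evaluation")
```

Because the group is a single object, the second process picks up the asset list together with its shared metadata and curation purpose, with nothing to copy by hand.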
Yes I guess we'd discussed limiting that to something like "Key shared attributes used to select this group of assets" - maybe the title could actually be "Defining metadata" or "Defining attributes"?
This is also related to the discussion #1148