Add initial draft of captions extension to semantic labels proposal #67
Hey @nvmkuruc, for example, this is what I would propose for accessibility information
This follows standard accessibility forms across multiple accessibility frameworks in multiple browsers and operating systems, where Label is the short description and alternate is a longer description should someone want that. Much like yours, I use I would also (in ours) encourage combining use with the proposed language schema so that you could do things like
I realize that the semantic description and the accessibility description might differ, but the object schema seems so similar that I wonder if we shouldn't combine forces. Without trying to be presumptuous, I think the accessibility schema could cover your needs too with
@dgovil Do you happen to have an example of allowed / preferred values for the
Probably just low, standard, high. It's not as common a metadata but it does help prioritize tokens for a system when someone asks with natural language for a description
def Xform "learning_robot" (
    apiSchemas = ["SemanticsCaptionsAPI:skills"]
) {
    string semantics:captions:skills.timeSamples = {
        0 : "The robot does not know how to dance",
        100 : "The robot is learning the box step",
        150 : "The robot knows the box step"
    }
}
As mentioned in @dgovil's proposal, we also discussed future consideration for time-based descriptions. Sometimes a relevant time sequence needs an "announcement" for assistive technology too, either tied to a transition (for example, between slide builds where the change is more important than either end state) or to a time code on the overall timeline, similar to closed captions or audio descriptions.
Potentially relevant: I'm working on a PR for VTT to add an ATTRIBUTES block, generally to disambiguate various types of metadata, but specifically because it's a prerequisite for using VTT to define time-based general flash data (seizure avoidance, etc.) in this follow-up VTT issue.
I'm curious if you see these timeSamples keys aligning with other timed-text formats like VTT.
If I understand VTT correctly, it specifies time code ranges, while OpenUSD holds an authored value until the next authored time sample and pulls from the first or last time sample when querying out of range. To describe that format in OpenUSD, you'd likely have to do something like this:
string semantics:captions:skills.timeSamples = {
    99.9999: "", # Suppress out of range queries
    100 : "This is some state.", # This is valid between time codes 100 and 150.
    150.00001: "" # Suppress out of range queries
}
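To make the held-value behavior described above concrete, here is a minimal pure-Python sketch. It does not use the OpenUSD API; `resolve_held` and `cues_to_samples` are hypothetical helper names, and the `epsilon` padding is an assumption that mirrors the hand-authored empty-string samples. It also assumes cues do not overlap and are separated by more than `2 * epsilon`.

```python
from bisect import bisect_right


def resolve_held(samples, time):
    """Resolve a value from {time: value} samples using held semantics.

    Mimics the behavior described in the comment: a value holds until the
    next authored sample, and out-of-range queries clamp to the first or
    last authored sample. Hypothetical helper, not an OpenUSD call.
    """
    if not samples:
        return None
    times = sorted(samples)
    # Index of the last authored time that is <= the query time.
    i = bisect_right(times, time) - 1
    # A query before the first sample clamps to the first sample.
    return samples[times[max(i, 0)]]


def cues_to_samples(cues, epsilon=1e-4):
    """Convert VTT-style (start, end, text) ranges to held time samples.

    Empty strings are authored just outside each range so the caption
    does not leak before its start or after its end. Assumes cues are
    non-overlapping and more than 2 * epsilon apart.
    """
    samples = {}
    for start, end, text in cues:
        samples[start - epsilon] = ""  # suppress queries before the cue
        samples[start] = text          # held from start through end
        samples[end + epsilon] = ""    # suppress queries after the cue
    return samples


samples = cues_to_samples([(100, 150, "This is some state.")])
for t in (50, 100, 125, 150, 151):
    print(t, repr(resolve_held(samples, t)))
```

Without the two empty-string samples, the single caption at time code 100 would resolve at every time code, which is why range-based formats like VTT do not map directly onto plain time samples.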
This is highly speculative, but I'm curious if there's a path to building something like VTT using the time series proposal as a starting point. It's currently designed for animation splines, but might provide a path for eventually describing more complicated time-based value resolution?
Time series have actually been removed from the design for animation splines, in favor of more simply leveraging timeSamples for all non-scalar, non-floating-point varying data.
Description of Proposal
The semantic labels schema will add support for labeling subgraphs with tokens from discrete taxonomies.
This proposes a peer schema in the UsdSemantics domain for semantics:captions to capture instances of natural language descriptions.
Link to Rendered Proposal
There is overlap with what the accessibility schema is looking to achieve. We're looking for feedback on if and how these two proposals should align.
Supporting Materials
Contributing