Add initial draft of captions extension to semantic labels proposal #67

nvmkuruc · 2024-06-28T21:43:28Z

Description of Proposal

The semantic labels schema will add support for labeling subgraphs with tokens from discrete taxonomies.

This proposes a peer schema in the UsdSemantics domain for semantics:captions to capture instances of natural language descriptions.

Link to Rendered Proposal

There is overlap with what the accessibility schema is looking to achieve. We're looking for feedback on if and how these two proposals should align.

Supporting Materials

Contributing

I agree to and accept the Supplemental Terms.

dgovil · 2024-07-02T17:56:30Z

Hey @nvmkuruc ,
We have a very similar proposal centered around Accessibility that I've unfortunately been slow to put up as we discuss internally. I wonder if we should discuss the overlap because I think there's applicability here to other use cases.

For example, this is what I would propose for accessibility information:

def Mesh "Cube" (
    prepend apiSchemas = ["AccessibilityAPI"]
)
{
    string accessibility:label = "A Cube"
    string accessibility:alternate:default = "This cube is a wonderful looking cube"
    token accessibility:importance:default = "Standard"
    
    string accessibility:alternate:size = "As big as a house"
    token accessibility:importance:size = "Low"
}

This follows the standard forms used by accessibility frameworks across multiple browsers and operating systems, where label is the short description and alternate is a longer description, should someone want one.

Much like yours, I use namespace:<label|alternate|importance>:<optional purpose>
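A minimal sketch of how a consumer might split names that follow this convention. The helper `parse_name` and the `ParsedName` type are hypothetical and not part of either proposal; the sketch assumes names have exactly two or three colon-separated parts, so it does not cover longer forms such as language suffixes.

```python
from typing import NamedTuple, Optional

# The three attribute kinds named in the convention above.
KINDS = {"label", "alternate", "importance"}


class ParsedName(NamedTuple):
    namespace: str
    kind: str
    purpose: Optional[str]  # None when no purpose suffix is authored


def parse_name(attr_name: str) -> ParsedName:
    """Split namespace:<label|alternate|importance>[:<purpose>] into parts."""
    parts = attr_name.split(":")
    if len(parts) not in (2, 3) or parts[1] not in KINDS:
        raise ValueError(f"unexpected attribute name: {attr_name!r}")
    purpose = parts[2] if len(parts) == 3 else None
    return ParsedName(parts[0], parts[1], purpose)
```

For instance, `parse_name("accessibility:alternate:size")` yields the namespace `accessibility`, the kind `alternate`, and the purpose `size`.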

I would also (in ours) encourage combining this with the proposed language schema, so that you could do things like:

#usda 1.0
(
    language = "en_ca"
)

def Mesh "Cube" (
    prepend apiSchemas = ["AccessibilityAPI"]
)
{
    string accessibility:label = "A Cube"
    string accessibility:alternate = "This cube is a wonderful looking cube"
    token accessibility:importance = "Standard"
    
    string accessibility:label:lang:fr = "Un cube"
    string accessibility:alternate:lang:fr = "Ce cube est un cube magnifique"
    string accessibility:alternate:lang:fr_ca = "Ce cube est un cube magnifique canadien"
}
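To illustrate how a reader of such data might pick a string for a requested language, here is a small sketch. The fallback order (region-specific, then language-only, then the unsuffixed attribute) is an assumption made for illustration; neither proposal specifies it here. The function `resolve_localized` and the plain dictionary standing in for a prim's attributes are hypothetical.

```python
from typing import Dict, Optional


def resolve_localized(attrs: Dict[str, str], base: str,
                      lang: Optional[str]) -> Optional[str]:
    """Pick the value of `base` for `lang`, trying progressively broader names.

    Assumed order: base:lang:<lang>, then base:lang:<language only>, then base.
    """
    candidates = []
    if lang:
        lang = lang.lower()
        candidates.append(f"{base}:lang:{lang}")
        if "_" in lang:
            # Drop the region part, e.g. "fr_ca" -> "fr".
            candidates.append(f"{base}:lang:{lang.split('_', 1)[0]}")
    candidates.append(base)
    for name in candidates:
        if name in attrs:
            return attrs[name]
    return None


# Attribute values copied from the example above.
cube = {
    "accessibility:label": "A Cube",
    "accessibility:alternate": "This cube is a wonderful looking cube",
    "accessibility:label:lang:fr": "Un cube",
    "accessibility:alternate:lang:fr": "Ce cube est un cube magnifique",
    "accessibility:alternate:lang:fr_ca": "Ce cube est un cube magnifique canadien",
}
```

Under these assumptions, a request for the `fr_ca` label falls back to the `fr` value because no `fr_ca` label is authored, while the `fr_ca` alternate description is found directly.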

I realize that the semantic description and the accessibility description might differ, but the schemas themselves seem so similar that I wonder if we shouldn't combine forces.

Without trying to be presumptuous, I think the accessibility schema could cover your needs too with accessibility:semantics:skills for example. Anyway happy to discuss more. I think it could be a meaningful change to USD.

nvmkuruc · 2024-07-02T20:23:43Z

@dgovil Do you happen to have an example of allowed / preferred values for the importance tokens?

dgovil · 2024-07-02T20:27:30Z

Probably just low, standard, high. It's not as common a piece of metadata, but it does help a system prioritize tokens when someone asks in natural language for a description.
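As a rough sketch of that prioritization, assuming the three values mentioned above and a simple "most important first" ordering (the ranking table and the function `order_by_importance` are hypothetical, not part of the proposal):

```python
from typing import List, Tuple

# Assumed ranking of the suggested importance tokens; lower sorts first.
IMPORTANCE_RANK = {"high": 0, "standard": 1, "low": 2}


def order_by_importance(descriptions: List[Tuple[str, str]]) -> List[str]:
    """Return description texts, most important first.

    `descriptions` is a list of (text, importance) pairs. Python's sort is
    stable, so entries with equal importance keep their authored order.
    """
    ranked = sorted(descriptions,
                    key=lambda pair: IMPORTANCE_RANK[pair[1].lower()])
    return [text for text, _ in ranked]
```

A system answering a natural-language request could then read descriptions from the front of this list until it has said enough.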

cookiecrook · 2024-07-03T20:39:42Z

proposals/semantic_schema/captions.md

+def Xform "learning_robot" (
+    apiSchemas = ["SemanticsCaptionsAPI:skills"]
+) {
+    string semantics:captions:skills.timeSamples = {
+        0 : "The robot does not know how to dance",
+        100 : "The robot is learning the box step",
+        150 : "The robot knows the box step"
+    }
+}


As mentioned in @dgovil's proposal, we also discussed future consideration for time-based descriptions. Sometimes a relevant time sequence needs an "announcement" for assistive technology, too, either tied to a transition (for example, between slide builds, where the change is more important than either end state) or to a time code on the overall timeline, similar to closed captions or audio descriptions.

Potentially relevant: I'm working on a PR for VTT to add an ATTRIBUTES block, generally to disambiguate various types of metadata, but specifically because it's a prerequisite for using VTT to define time-based general flash data (seizure avoidance, etc.) in this follow-up VTT issue.

I'm curious if you see these timeSamples keys aligning with other timed-text formats like VTT.

nvmkuruc · 2024-07-03

If I understand VTT correctly, it specifies time code ranges, while OpenUSD holds an authored value until the next authored time sample and pulls from the first or last time sample when querying out of range. To describe that format in OpenUSD, you'd likely have to do something like this:

string semantics:captions:skills.timeSamples = {
    99.9999: "",  # Suppress out of range queries
    100: "This is some state.",  # This is valid between time codes 100 and 150.
    150.00001: ""  # Suppress out of range queries
}
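The held-value behaviour described above can be sketched in plain Python. This is an illustration of the described semantics, not OpenUSD's API; the function `held_value` and the dictionary standing in for the time samples are hypothetical.

```python
from bisect import bisect_right
from typing import Dict


def held_value(samples: Dict[float, str], time: float) -> str:
    """Return the most recent sample at or before `time`.

    Queries before the first sample clamp to the first sample, and queries
    after the last sample return the last sample.
    """
    if not samples:
        raise ValueError("no time samples authored")
    times = sorted(samples)
    index = bisect_right(times, time) - 1
    return samples[times[max(index, 0)]]


# The same samples as the snippet above: empty strings bracket the range.
caption = {99.9999: "", 100: "This is some state.", 150.00001: ""}
```

With these samples, queries at time codes 100 through 150 return the caption, and queries before 99.9999 or after 150.00001 return the empty string, which is how the bracketing samples emulate a VTT-style range.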

This is highly speculative, but I'm curious if there's a path to building something like VTT using the time series proposal as a starting point. It's currently designed for animation splines, but might provide a path for eventually describing more complicated time-based value resolution?

spiffmon · 2024-07-04

Time series have actually been removed from the design for animation splines, in favor of more simply leveraging timeSamples for all non-scalar, non-floating-point varying data.

Add initial draft of captions extension to semantic labels proposal

3fea8fa

cookiecrook reviewed Jul 3, 2024

dgovil mentioned this pull request Jul 3, 2024

Accessibility Schema #69

Merged

1 task

dsyu-pixar added the semantic-captions label Jul 3, 2024

cookiecrook Jul 3, 2024

cookiecrook Jul 3, 2024

nvmkuruc Jul 3, 2024

spiffmon Jul 4, 2024
