Add guidance on newlines in tables #547
Comments
This is tough. Apparently there is no pure-markdown solution, only the escape-into-HTML So far we have not permitted Having no very clean solution to offer, I am forced to wonder to what extent unstructured tabular data -- with latent structure -- needs to be supported in OSCAL source.
More constructively, I agree we need to offer guidance. Not sure what guidance to offer beyond 'don't use tables' which might not be helpful. Sketching the limits here -- OSCAL tables don't support lists or sequences of lines, only single lines -- would potentially be helpful. This could be a work item for docs, at least.
@bradh At the moment, including a We are limited by what we can support in Markdown and what can be roundtripped in the conversion from Markdown -> HTML -> Markdown. We might be able to deal with We need to think about this one.
I've raised this before, but am going to make one more push here, since we'll have to live with the decision long term following the official 1.0.0 publication: I believe Markup-Multiline fields should always be a robust subset of HTML5, regardless of OSCAL format (XML, JSON, or YAML), even if that means our schema validation tools can't validate the content within the markup-multiline fields. Our OSCAL formats are intended to be read and manipulated by computers, not people. I believe it is fair to have tools insert the appropriate escape characters required for storing HTML5 in JSON. Our converters should be able to add/interpret the escape characters when converting between formats, and otherwise leave the content as-is. We have clear use cases for robust formatting, including the use of tables in control responses, and continue to see evidence that while Markdown is sufficient for some organizations, it is not sufficient for others. We are being artificially limited by the use of different markup-multiline formats in the different OSCAL formats. Or perhaps there can be a property on markup-multiline fields indicating the format (MD or HTML), so that organizations that prefer MD can continue to use it, but organizations that require the more robust HTML formatting are also able to do so. At least that is something we could add as a non-breaking change post 1.0.0 delivery. (The absence of such a property causes things to work as they do now.)
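To make the point about escape characters concrete, here is a minimal sketch using Python's standard `json` module. The `description` key and the HTML fragment are illustrative assumptions, not taken from an OSCAL schema; the sketch only shows that a serializer can add and remove the escaping without touching the markup itself.

```python
import json

# Hypothetical field value: an HTML table cell containing a line break.
html_fragment = '<table><tr><td class="note">line one<br/>line two</td></tr></table>'

# json.dumps escapes the double quotes inside the fragment (and would also
# escape backslashes and control characters if any were present).
encoded = json.dumps({"description": html_fragment})

# json.loads reverses the escaping, so the fragment comes back unchanged.
decoded = json.loads(encoded)

print(encoded)
print(decoded["description"] == html_fragment)
```

The markup is carried as an opaque string, which is also why a JSON schema validator cannot check what is inside it.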
@brianrufgsa @david-waltermire-nist I wonder if we shouldn't consider an entirely different approach to this requirement. Maybe instead of looking at markup-multiline we should consider permitting embedded HTML through the "any" construct (for which we have already stipulated nominal support, in some places).
<item>
  <title>A1</title>
  <description>
    <p>OSCAL description, and/or ...</p>
    <body xmlns="an-html-namespace"> ... near-HTML goes here ... <br/> ... and here ... </body>
  </description>
</item>
This would not address issues on the JSON side but there might be mitigations we could support, such as letting there be a link to out of line HTML on the JSON side. Indeed, along those lines, in either XML or JSON OSCAL, we could define a specialized
<item>
  <title>A1</title>
  <description>
    <link rel="include" href="fragment.html#a1"/>
  </description>
</item>
In my view both these mechanisms (literal HTML inline or using links to reference) would be easier and cleaner than Markdown-extension, for both developers, and organizations that have to take on burdens of rules definition and enforcement to whatever extent OSCAL says "anything goes". While they present problems in JSON/YAML representations, those are no worse than what we face extending the Markdown syntax to support (even some subset of) "office document semantics".
Perturbing factors to consider:
We follow CommonMark as a base specification for markup, which doesn't support tables. We use the GitHub Flavored Markdown table extension to support tables. CommonMark supports HTML blocks and inline raw HTML, which can be used to embed HTML in Markdown. The current HTML datatype support in OSCAL, however, does not support this. To move forward we need to either:
Option 1 is easy, requiring no extra work, but limiting functionality. For options 2 and 3, the XSLT implementation would need to be enhanced to support this. The liboscal-java implementation has support for full inline HTML, but is largely untested, so some aspects may not work. If support for all or a subset of inline HTML is desired, test content will need to be engineered to ensure proper implementation support.
In today's model review, there was a pretty active discussion on formatting of prose in OSCAL (specifically, |
Just noting we should couple any action on this with unit testing of bidirectional conversion of (wrapped and unwrapped) markup-line and markup-multiline. Indeed, due to the nature of Markdown (lack of a grammar), this is really the only way of validating it: converting Markdown to markup (in this case OSCAL XML) with a conformant engine, then converting back, then comparing. (Even this will not be enough for free-form kinds of Markdown.) This implies ensuring conformance, which is where the unit tests come in. XSpec that can provide a foundation for this was merged with usnistgov/metaschema#218.
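The convert, convert back, compare check described above can be sketched in a few lines of Python. The two converters below are toy stand-ins written for this illustration (they handle only bold and emphasis); they are not the OSCAL XSLT or liboscal-java converters, and a real test would call a conformant engine instead.

```python
import re

def md_to_html(markdown: str) -> str:
    """Toy converter: handles **bold**, *emphasis*, and _emphasis_ only."""
    html = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", markdown)
    html = re.sub(r"\*(.+?)\*", r"<em>\1</em>", html)
    return re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"<em>\1</em>", html)

def html_to_md(html: str) -> str:
    """Toy converter: always writes emphasis back using asterisks."""
    markdown = re.sub(r"<strong>(.+?)</strong>", r"**\1**", html)
    return re.sub(r"<em>(.+?)</em>", r"*\1*", markdown)

def round_trips(markdown: str) -> bool:
    """True when Markdown -> HTML -> Markdown reproduces the input exactly."""
    return html_to_md(md_to_html(markdown)) == markdown

# Normalized input survives the round trip.
print(round_trips("a **b** and *c*"))
# Free-form input does not: the underscore spelling is lost on the way back.
print(round_trips("an _alternate_ spelling"))
```

The second case illustrates why a string comparison alone falls short for free-form Markdown: two spellings map to the same markup, so only one of them can come back.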
Given that there are no strong opinions, I believe option #2 is potentially a good way forward to adopt additional HTML tagging over time. I agree with @wendellpiez that round-trip unit testing is needed here. Perhaps we could keep OSCAL as-is for now and explore this more after the OSCAL 1.1 release? Anyone have feedback on this proposed way forward?
+1 to maintain as-is for now.
@david-waltermire-nist I agree option #2 sounds like the best balance, and agree with @wendellpiez on the need to include any expanded HTML-in-MD tagging in unit testing. While I can confirm that not having a new-line ability within a table cell will block many FedRAMP SSPs from being faithfully converted to OSCAL without re-work of the content, I cannot say how much demand (if any) there is for people actually converting those SSPs at this time. I suspect this is not yet urgent, but at whatever point we start to see an up-tick in OSCAL adoption among systems with legacy SSPs, it will become urgent. So I think there is time. It would be nice to not wait until OSCAL 2.0, and I believe a sub-set can be implemented as a non-breaking change. Just my $0.02 on timing.
Given the feedback above, I think we should create and reference some issues around better testing of Markdown <-> HTML conversion and close this for now. Any concerns with this approach?
Given that there has been no additional feedback around adding support for specific tags based on option 2, I am going to close this issue. We can reopen it if community sentiment changes.
User Story:
As an OSCAL producer or consumer, I need to know if tables can contain newlines / line breaks, and if so, how to encode / interpret them.
Goals:
Add documentation to explain new line / breaks in tables.
The Australian ISM uses this representation:
My OSCAL for the same content (not from the HTML, but source Word document):
(note the use of literal breaks)
In Markdown:
(note the use of <br/> tags)
The conflict here is trying to keep close to the source document (ideally automatically extracting OSCAL from the Word document) while also being OSCAL compliant.
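For readers unfamiliar with the `<br/>` workaround mentioned above, the following Python sketch shows how multi-line cell text extracted from a source document could be flattened into a single GitHub Flavored Markdown table row. The function name `table_row` and the sample cell text are hypothetical, and whether OSCAL tooling accepts the resulting `<br/>` tags is exactly the open question in this issue.

```python
def table_row(cells):
    """Render one GFM table row, encoding line breaks inside a cell as <br/>."""
    rendered = []
    for cell in cells:
        # A literal pipe would end the cell, so GFM requires escaping it.
        text = cell.replace("|", "\\|")
        # A table row must stay on one line, so join the cell's lines with <br/>.
        text = "<br/>".join(line.strip() for line in text.splitlines())
        rendered.append(text)
    return "| " + " | ".join(rendered) + " |"

print(table_row(["Control", "first line\nsecond line"]))
```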
Dependencies:
None identified
Acceptance Criteria
{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}