Add guidance on newlines in tables #547

Closed
3 tasks
bradh opened this issue Nov 29, 2019 · 14 comments
Assignees
Labels
Community Feedback Needed Discussion Needed This issues needs to be reviewed by the OSCAL development team. enhancement Model Engineering An issue to be discussed during the bi-weekly Model Engineering Meeting Scope: Metaschema Issues targeted at the metaschema pipeline Scope: Modeling Issues targeted at development of OSCAL formats User Story
Milestone

Comments

@bradh
Contributor

bradh commented Nov 29, 2019

User Story:

As an OSCAL producer or consumer, I need to know if tables can contain newlines / line breaks, and if so, how to encode / interpret them.

Goals:

Add documentation to explain new line / breaks in tables.
The Australian ISM uses this representation:

<table>
	<thead style="background-color: #dbe5f1;">
		<tr>
			<td>
			<p><strong>Regular User Account</strong></p>
			</td>
			<td>
			<p><strong>Unprivileged Administration Account</strong></p>
			</td>
			<td>
			<p><strong>Privileged Administration Account</strong></p>
			</td>
		</tr>
	</thead>
	<tbody>
		<tr>
			<td>
			<p>Unprivileged account</p>
			</td>
			<td>
			<p>Unprivileged account</p>
			</td>
			<td>
			<p>Privileged account</p>
			</td>
		</tr>
		<tr>
			<td>
			<p>Used for web and email access</p>

			<p>Used for day-to-day non-administrative tasks</p>
			</td>
			<td>
			<p>Used for authentication to dedicated administrator workstation</p>

			<p>Used for authentication to jump server(s)</p>
			</td>
			<td>
			<p>Used for performance of administration tasks</p>
			</td>
		</tr>
		<tr>
			<td>
			<p> </p>
			</td>
			<td>
			<p>Different username and passphrase to regular user account</p>
			</td>
			<td>
			<p>Different username and passphrase to regular user account</p>
			</td>
		</tr>
	</tbody>
</table>

My OSCAL for the same content (not from the HTML, but source Word document):

                    <table>
                        <tr>
                            <th>Regular User Account</th>
                            <th>Unprivileged Administration Account</th>
                            <th>Privileged Administration Account</th>
                        </tr>
                        <tr>
                            <td>Unprivileged account</td>
                            <td>Unprivileged account</td>
                            <td>Privileged account</td>
                        </tr>
                        <tr>
                            <td>Used for web and email access
Used for day-to-day non-administrative tasks</td>
                            <td>Used for authentication to dedicated administrator workstation
Used for authentication to jump server(s)</td>
                            <td>Used for performance of administration tasks</td>
                        </tr>
                        <tr>
                            <td></td>
                            <td>Different username and passphrase to regular user account</td>
                            <td>Different username and passphrase to regular user account</td>
                        </tr>
                    </table>

(note the use of literal breaks)

In Markdown:

| Regular User Account                                                           | Unprivileged Administration Account                                                                          | Privileged Administration Account                         |
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------- |
| Unprivileged account                                                           | Unprivileged account                                                                                         | Privileged account                                        |
| Used for web and email access<br/>Used for day-to-day non-administrative tasks | Used for authentication to dedicated administrator workstation<br/>Used for authentication to jump server(s) | Used for performance of administration tasks              |
|                                                                                | Different username and passphrase to regular user account                                                    | Different username and passphrase to regular user account |

(note the use of <br/> tags)
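
To make the mapping between the two encodings concrete, here is a minimal Python sketch that turns a table written like the OSCAL sample above (literal line breaks inside td) into a Markdown table that uses <br/>. The function names `cell_to_markdown` and `table_to_markdown` are made up for illustration and are not part of any OSCAL tooling; the sketch assumes a simple table with tr/th/td children and no namespace.

```python
import xml.etree.ElementTree as ET


def cell_to_markdown(cell: ET.Element) -> str:
    """Flatten one th/td into a single Markdown cell.

    Literal line breaks inside the cell become <br/> tags.
    """
    text = "".join(cell.itertext())
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    # Escape pipes so they are not read as column separators.
    return "<br/>".join(line.replace("|", "\\|") for line in lines)


def table_to_markdown(xml_source: str) -> str:
    """Convert a simple <table> (first row is the header) to a pipe table."""
    table = ET.fromstring(xml_source)
    rows = [[cell_to_markdown(cell) for cell in tr] for tr in table.iter("tr")]
    header, body = rows[0], rows[1:]
    out = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    out += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(out)
```

This only shows the direction XML to Markdown; whether <br/> is acceptable in the resulting Markdown is exactly the open question of this issue.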

The conflict here is trying to keep close to the source document (ideally automatically extracting OSCAL from the Word document) while also being OSCAL compliant.

Dependencies:

None identified

Acceptance Criteria

  • All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}

@wendellpiez
Contributor

wendellpiez commented Dec 12, 2019

This is tough. Apparently there is no pure-markdown solution, only the escape-into-HTML <br>

So far we have not permitted br even though it is occasionally used as a necessary workaround to problems of this sort -- which are due to deeper modeling issues such as what are the proper contents of table cells, and specifically whether that is inline 'soup' or structured, or either, or a mix. HTML has no solution (effectively supporting a mix); Markdown notation for tables implies inline soup, but Markdown has no signal for <br/> in this context, hence the escape back into angle brackets.

Having no very clean solution to offer, I am forced to wonder to what extent unstructured tabular data -- with latent structure -- needs to be supported in OSCAL source.

@wendellpiez
Contributor

More constructively, I agree we need to offer guidance. Not sure what guidance to offer beyond 'don't use tables' which might not be helpful. Sketching the limits here -- OSCAL tables don't support lists or sequences of lines, only single lines -- would potentially be helpful.

This could be a work item for docs, at least.

@david-waltermire david-waltermire added the Discussion Needed This issues needs to be reviewed by the OSCAL development team. label Jan 9, 2020
@david-waltermire
Contributor

david-waltermire commented Jan 9, 2020

@bradh At the moment, including a <br/> will cause the OSCAL content to not validate.

We are limited by what we can support in Markdown and what can be roundtripped in the conversion from Markdown -> HTML -> Markdown. We might be able to deal with <br/>, but this will make implementation of this conversion process much more difficult.

We need to think about this one.

@brian-ruf
Contributor

I've raised this before, but am going to make one more push here, since we'll have to live with the decision long term following the official 1.0.0 publication:

I believe Markup-Multiline fields should always be a robust subset of HTML5, regardless of OSCAL format (XML, JSON, or YAML), even if that means our schema validation tools can't validate the content within the markup-multiline fields.

Our OSCAL formats are intended to be read and manipulated by computers, not people. I believe it is fair to have tools insert the appropriate escape characters required for storing HTML5 in JSON. Our converters should be able to add/interpret the escape characters when converting between formats, and otherwise leave the content as-is.

We have clear use cases for robust formatting, including the use of tables in control responses, and we continue to see evidence that while Markdown is sufficient for some organizations, it is not sufficient for others. We are being artificially limited by the use of different markup-multiline formats in the different OSCAL formats.

Or perhaps there can be a property on markup-multiline fields indicating the format (MD or HTML), so that organizations who prefer MD can continue to use it, but organizations who require the more robust HTML formatting are also able to do so. At least that is something we could add as a non-breaking change after the 1.0.0 delivery. (The absence of such a property causes things to work as they do now.)

@wendellpiez
Contributor

wendellpiez commented Nov 22, 2020

@brianrufgsa @david-waltermire-nist I wonder if we shouldn't consider an entirely different approach to this requirement. Maybe instead of looking at markup-multiline we should consider permitting embedded HTML through the "any" construct (for which we have already stipulated nominal support, in some places).

<item>
  <title>A1</title>
  <description>
    <p>OSCAL description, and/or ...</p>
    <body xmlns="an-html-namespace"> ... near-HTML goes here ... <br/> ... and here ... </body>
  </description>
</item>

This would not address issues on the JSON side but there might be mitigations we could support, such as letting there be a link to out of line HTML on the JSON side.

Indeed, along those lines, in either XML or JSON OSCAL, we could define a specialized link/@rel as an "include" mechanism, meaning any processor could pick up and expand to include the referenced content at the point of call. (Although I suppose there are now questions of appropriate MIME types, etc.)

<item>
  <title>A1</title>
  <description>
    <link rel="include" href="fragment.html#a1"/>
  </description>
</item>
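
As a rough illustration of the "include" idea, the following Python sketch expands link elements with rel="include" at the point of call. It is only a sketch under stated assumptions: `expand_includes` is a hypothetical helper, the referenced fragments are assumed to be already fetched and parsed into a dictionary keyed by href, and namespaces and MIME-type handling are ignored entirely.

```python
import copy
import xml.etree.ElementTree as ET


def expand_includes(root, fragments):
    """Replace each <link rel="include" href="..."/> with the referenced fragment.

    `fragments` maps an href string to an already-parsed element
    (an assumption of this sketch; no fetching is done here).
    """
    # Snapshot the tree first so inserted fragments are not themselves expanded.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "link" and child.get("rel") == "include":
                fragment = fragments.get(child.get("href"))
                if fragment is None:
                    continue  # leave unresolved links untouched
                replacement = copy.deepcopy(fragment)
                replacement.tail = child.tail  # keep surrounding whitespace
                parent[index] = replacement
    return root
```

Leaving unresolved links in place means a consumer that cannot reach the external document still sees the reference rather than silently losing content.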

In my view both these mechanisms (literal HTML inline or using links to reference) would be easier and cleaner than Markdown-extension, for both developers, and organizations that have to take on burdens of rules definition and enforcement to whatever extent OSCAL says "anything goes". While they present problems in JSON/YAML representations, those are no worse than what we face extending the Markdown syntax to support (even some subset of) "office document semantics".

@wendellpiez
Contributor

wendellpiez commented Aug 22, 2022

Perturbing factors to consider:

  • However we represent new lines or (related but different) multiple paragraphs inside td (and th?), such as a mix of p, ul, or even a nested table, it must have a graceful JSON/YAML/Markdown representation (right?)
    So one question to pose is how JSON or YAML consumers would like this data to look (how does CommonMark do it, etc.).
  • We can also provide alternative strategies for encoding particular use cases --
    • Often a clean structured representation in the data with explicit "tables" only in the representation (the display) is the way to go
    • Or information can be tagged out of line (in an external document) and referenced

@david-waltermire
Contributor

We follow CommonMark as a base specification for markup, which doesn't support tables. We use the GitHub Flavored Markdown table extension to support tables.

CommonMark supports HTML blocks and inline raw HTML, which can be used to embed HTML in Markdown. However, the current html datatype support in OSCAL does not allow this.

To move forward we need to either:

  1. Disallow inline HTML in markup.
  2. Allow a subset of HTML in markup (i.e. <br/>) to support newlines and similar use cases.
  3. Allow full support for inline HTML.

Option 1 is easy, requiring no extra work, but limiting functionality.

For options 2 and 3, the XSLT implementation would need to be enhanced to support this. The liboscal-java implementation has support for full inline HTML, but is largely untested so some aspects may not work.

If support for all or a subset of inline HTML is desired, test content will need to be engineered to ensure proper implementation support.
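
The three options differ mainly in which inline tags a processor would accept. The Python sketch below shows one way such an allowlist check could look. It is an illustration only: `disallowed_tags` is a hypothetical name, the regular expression is a simplification that does not understand Markdown code spans or comments, and it is not how the XSLT or liboscal-java implementations work.

```python
import re

# Matches an opening, closing, or self-closing tag such as <br>, <br/>, </span>.
# The lookahead keeps autolinks like <https://example.com> from being treated as tags.
TAG_PATTERN = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)(?=[\s/>])[^>]*>")


def disallowed_tags(markdown_text, allowed=frozenset({"br"})):
    """Return the sorted names of inline HTML tags not present in `allowed`.

    Option 1 corresponds to allowed=frozenset(), option 2 to a small
    allowlist such as {"br"}, and option 3 to skipping the check.
    """
    found = {match.group(1).lower() for match in TAG_PATTERN.finditer(markdown_text)}
    return sorted(found - set(allowed))
```

An empty result means the text passes under the chosen option; any names returned would be reported as validation errors.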

@david-waltermire david-waltermire added the Model Engineering An issue to be discussed during the bi-weekly Model Engineering Meeting label Sep 23, 2022
@david-waltermire david-waltermire moved this from Todo to In Progress in NIST OSCAL Work Board Sep 23, 2022
@david-waltermire david-waltermire moved this from In Progress to Under Review in NIST OSCAL Work Board Sep 27, 2022
@aj-stein-nist
Contributor

In today's model review, there was a pretty active discussion on formatting of prose in OSCAL (specifically, markup-multiline with complex structures around tables, lists, and whitespace management), but no registered interest in the particular <br/>/newline-in-table issue or in any similar use case. That leaves us open to review the above 3 options as we see fit for that particular use case, barring any other feedback in subsequent comments today or following the model review.

@wendellpiez
Contributor

wendellpiez commented Oct 3, 2022

Just noting we should couple any action on this with unit testing of bidirectional conversion of (wrapped and unwrapped) markup-line and markup-multiline.

Indeed due to the nature of Markdown (lack of a grammar) this is really the only way of validating it: converting Markdown to markup (in this case OSCAL XML) with a conformant engine, then converting back, then comparing. (Even this will not be enough for free-form kinds of Markdown.) This implies ensuring conformance, which is where the unit tests come in.
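
The convert, convert back, compare loop can be sketched in a few lines of Python. This is a toy stand-in for the XSpec-based tests, restricted to a single table cell whose only markup is a line break; `cell_markdown_to_xml`, `cell_xml_to_markdown`, and `round_trips` are hypothetical names, not functions from any OSCAL library.

```python
import re
import xml.etree.ElementTree as ET


def cell_markdown_to_xml(cell_md):
    """Build a <td> whose line breaks are <br/> children (accepts <br>, <br/>, <br />)."""
    td = ET.Element("td")
    parts = re.split(r"<br\s*/?>", cell_md)
    td.text = parts[0]
    for part in parts[1:]:
        br = ET.SubElement(td, "br")
        br.tail = part
    return td


def cell_xml_to_markdown(td):
    """Serialize the <td> back to Markdown cell text, always emitting <br/>."""
    out = [td.text or ""]
    for child in td:
        if child.tag != "br":
            raise ValueError("unsupported element in cell: " + child.tag)
        out.append("<br/>" + (child.tail or ""))
    return "".join(out)


def round_trips(cell_md):
    """True when Markdown -> XML -> Markdown reproduces the input exactly."""
    return cell_xml_to_markdown(cell_markdown_to_xml(cell_md)) == cell_md
```

Here `round_trips("a<br/>b")` holds, while `round_trips("a<br>b")` does not, because the break comes back in its canonical <br/> form. That is the kind of free-form variation a plain comparison cannot accept without first agreeing on a canonical spelling.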

XSpec that can provide a foundation for this was merged with usnistgov/metaschema#218.

@david-waltermire
Contributor

Given that there are no strong opinions, I believe option #2 is potentially a good way forward to adopt additional HTML tagging over time. I agree with @wendellpiez that round-trip unit testing is needed here. Perhaps we could keep OSCAL as-is for now and explore this more after the OSCAL 1.1 release?

Anyone have feedback on this proposed way forward?

@GaryGapinski

+1 to maintain as-is for now.

@brian-comply0

@david-waltermire-nist I agree option #2 sounds like the best balance, and agree with @wendellpiez on the need to include any expanded HTML-in-MD tagging in unit testing.

While I can confirm that not having a new-line ability within a table cell will block many FedRAMP SSPs from being faithfully converted to OSCAL without re-work of the content, I cannot say how much demand (if any) there is for people actually converting those SSPs at this time.

I suspect this is not yet urgent, but at whatever point we start to see an up-tick in OSCAL adoption among systems with legacy SSPs, it will become urgent. So I think there is time. It would be nice to not wait until OSCAL 2.0, and I believe a sub-set can be implemented as a non-breaking change. Just my $0.02 on timing.

@david-waltermire
Contributor

Given the feedback above, I think we should create and reference some issues around better testing of Markdown <-> HTML conversion and close this for now. Any concerns with this approach?

@david-waltermire
Contributor

Given that there has been no additional feedback around adding support for specific tags based on option 2, I am going to close this issue. We can reopen this if the community sentiment changes on this.

Repository owner moved this from Under Review to Done in NIST OSCAL Work Board Nov 22, 2022
Projects
Status: Done
Development

No branches or pull requests

7 participants