Merge pull request #55 from wendellpiez/issue51-xsltpipeline-fixup
Updated XSLT converter-generator pipelines to use new packaging XSLTs; readme and other improvements
wendellpiez authored Aug 11, 2023
2 parents 6b47074 + 946dec5 commit 10f72aa
Showing 4 changed files with 108 additions and 391 deletions.
58 changes: 43 additions & 15 deletions README.md
@@ -40,7 +40,7 @@ Additionally, we care about, but do not prioritize:

### Origins

Formerly housed in the Metaschema repository, this code base traces the history of development of the Metaschema concept in the context of the OSCAL project. It was originally conceived as a demonstration and proof of concept, providing a bridge enabling JSON- and XML-based development in parallel over common problem sets and common data. Success in this effort led to a determination that multiple implementations of a platform-independent specification were needed, at which point this implementation was carved out into its own repository.
Formerly housed in the [Metaschema repository](https://github.com/usnistgov/metaschema), this code base traces the history of development of the Metaschema concept in the context of the OSCAL project. It was originally conceived as a demonstration and proof of concept, providing a bridge enabling JSON- and XML-based development in parallel over common problem sets and common data. Success in this effort led to a determination that multiple implementations of a platform-independent specification were needed, at which point this implementation was carved out into its own repository.

### Project sunset

@@ -50,49 +50,77 @@ The best way to ensure long-term access to the code base is to clone or fork the

## Repository contents

`src` includes XSLT source code, with supporting infrastructure including ad-hoc testing
`bin` includes utility scripting and might be useful to have on your system path.

`support` includes dependent submodules with other static resources for configuration
`src` includes XSLT source code, with supporting infrastructure including ad-hoc testing.

## Installation and operation
`support` includes dependent submodules with other static resources.

To operate in trial, test or 'bare-bones' mode, scripts are offered to perform operations with no installation except Maven (with JDK as required) and `bash` as a command line environment.
## Installation and operation

The utilities are however designed for integration in a range of environments, and core functionalities are implemented in XSLT 3, which is supported across platforms including Java, node JS and C.
These utilities are designed for integration in a range of environments, and core functionalities are implemented in XSLT 3, which is supported across platforms including Java, Node.js and C. Please deconstruct and reverse engineer. (Consider proposing improvements as [contributions](CONTRIBUTING.md).)

The software is designed to be used in a range of ways:

- Directly, in development of metaschemas and Metaschema-based software and tools
- Within Metaschema-based builds, including under CI/CD, to generate artifacts or productions from metaschema source under controlled conditions

### To run

The following generalized services are provided by the tools in this repository, separately or in combination:

- XSD and JSON schema generation - [`src/schema-gen` folder](src/schema-gen)
- Converter XSLTs for metaschema-supported data - [`src/converter-gen` folder](src/converter-gen)
- Metaschema documentation production - [`src/document` folder](src/document)
- (*Forthcoming*) Schematron generation and more

### Using `make` utility

Currently we are supporting "smoke testing" and regression testing via `make`. See more details in [src/README.md](src/README.md).

**Work in progress. Please work with us.**

[`make`](https://www.gnu.org/software/make/) is helpful for providing a clean and versatile interface on the command line, with features supporting build management and process dispatching (parallelization). `make` comes pre-installed in many Linux distributions.

We recommend running `make` from a bash command line under Linux or WSL, and using `make help` for discovery of its features (from any subdirectory in the project):

```bash
$ make help
```

Note that depending on the subdirectory, the help offered will be different.

Run directly from script for more transparency, and see the next section for more details on available processes.

Scripts and stylesheets are documented in place using readmes and in line. Most scripts depend on Apache Maven supporting a Java runtime. Since XSLTs can call, import, include or read XSLTs from elsewhere in the repo, and sometimes do, keep the modules together: each folder on its own is *not* self-contained.
### Directly from script

Accordingly, a good place to start for further research is the `src` directory with [its `readme.md`](src/README.md).
The same scripts used by `make` can also be used directly for a more dynamic and versatile interface, for example for developers of new Metaschema instances who wish to generate artifacts or documentation for their metaschemas.

For testing, all XSpec scenarios (`*.xspec`) can be run in place to generate local test reports.
[bin/metaschema-xslt](bin/metaschema-xslt) is a top-level `bash` script that dispatches to lower-level scripts for the processes. With the `bin` directory on your path invoke it directly for more help:

Users are also expected to call resources in this repository from their own scripts. Do this either by cloning, copying and modifying scripts here; by writing your own; or by adapting code into the XML/XSLT processing framework or stack of your choice.
```
> bin/metaschema-xslt -h
```
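
Putting `bin` on the path can be sketched as follows for a single shell session. This is only an illustration: it assumes the commands are run from the root of a local clone, the variable name `repo_root` is made up for the example, and the only option used is the `-h` shown above.

```bash
# Assumption: the current directory is the root of a local clone.
repo_root="$(pwd)"

# Prepend the repository's bin directory for this shell session only.
export PATH="$repo_root/bin:$PATH"

# Ask the dispatcher script for its help message, if it can be found.
if command -v metaschema-xslt >/dev/null 2>&1; then
  metaschema-xslt -h
else
  echo "metaschema-xslt not found under $repo_root/bin" >&2
fi
```

To make the change permanent, the `export` line would normally go in a shell startup file, with an absolute path in place of `$repo_root`.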

In general, at least two invocations will be offered for each process, an XProc-based invocation and a pure-XSLT-based invocation. Either may be useful in different scenarios.
#### Dedicated scripts

A convention is used indicating that an XProc (`*.xpl` file) or XSLT (`*.xsl`) intended to be invoked directly (that is, not only to be used as a module or component) is given a name entirely or partly in `ALL-CAPITALS`. For example, `src/schema-gen/METASCHEMA-ALL-SCHEMAS.xpl` is such an XProc pipeline (a step definition intended to be used directly). The XSLTs that observe this convention are, additionally, higher-order transformations by virtue of using the `transform()` function; for all other resources the convention `lower-case-hyphenated` is followed.
See more details in the [src/README](src/README.md). Using the scripts directly provides more fine-grained access to the logic (for example, if only a single kind of schema output is wanted), while not always offering the same efficiencies.

### Dependencies

As a freely-available XSLT 3.0 engine, the Saxon XSLT processor can be regarded as a *de facto* dependency - while this XSLT-conformant code should in principle run in any processor implementing the language. Saxon-HE can be bundled using Maven or another Java packaging technology.
Within the Maven architecture, the software depends on two libraries:

- [**XML Calabash**](https://xmlcalabash.com/) XProc processor, by Norman Walsh
- **Saxon** XSLT processor from [Saxonica](https://saxonica.com/welcome/welcome.xml)

Note however that the underlying XSLT-conformant code should in principle run in any processor implementing the language (version 3.0).

The [POM file](support/pom.xml) for Java/Maven configuration indicates the current tested version of Saxon. At time of writing, Saxon versions 10 and 11 are known to work with this codebase. When reporting bugs please include the version of your processor.

Some processes are also configured to run using XProc, the XML Pipelining Language, for greater runtime efficiency and transparency (debuggability). XProc is supported by XML Calabash, which also includes Saxon as a dependency.

Developers interested in demonstrating the viability of these processes in different processors and environments are eagerly invited to participate in development of this tool or related tools.

Additional dependencies for some functionalities (XSLT libraries) are included as submodule repositories, in the [support](support) subdirectory.

### Git Client Setup

See more on Git setup on the [Contributing](CONTRIBUTING) page.
65 changes: 53 additions & 12 deletions src/README.md
@@ -1,61 +1,102 @@
# XSLT-M4 `src`
# Metaschema-XSLT `src`

An XSLT implementation of the [Metaschema](https://pages.nist.gov/metaschema) toolchain for generating schemas, converters, and model documentation.

Typically any of these operations will combine several lower-level operations in a defined sequence.

More details (produced by surveying the files) can be seen in [file-manifest.md](file-manifest.md). Note however that this file is not reliable if it is not more recent than the files described.
## To run

In addition to this readme, this folder contains XSLT transformations (`*.xsl`), and XProc pipelines (`xpl`). The XSLT provides stable runtimes to the supported operations as described below. The XProc provides optimized runtimes when producing multiple outputs (results) from single inputs.
Also see the [site README](../README.md) for background information.

Runtime support and dependency management are provided with [Apache Maven](https://maven.apache.org/). The included stylesheets (XSLT) and pipeline configurations (XProc) should also be portable to other environments and runtimes.

Please install Maven, configure its system paths, and test before proceeding.
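
One way to test the setup before proceeding is to check that the expected commands are on the path. The sketch below is not part of the repository; the helper name `have_tool` is hypothetical, and the list of tools (`mvn`, `java`, `bash`) is an assumption based on the requirements described in these readmes.

```bash
# Hypothetical helper: succeed if the named command is available on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Assumed tool list: Maven, a Java runtime, and bash.
for tool in mvn java bash; do
  if have_tool "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool" >&2
  fi
done
```

If Maven is found, `mvn --version` reports the Maven version together with the Java runtime it is using.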

### `bash` scripts

A bash script located in this distribution provides a single unified interface to functionalities provided by this library. Add [../bin](../bin) to your path, or invoke the script directly, using `-h` for help:

```
> path/to/bin/metaschema-xslt -h
```

The help message includes a list of the supported subcommands, indicating which processes are to be run on given inputs with a particular configuration. (If provided with no arguments, the script returns an error `Error: SUBCOMMAND not specified` along with the same help.) Typically scripts use Maven and rely on it for dependency management.

See each subdirectory README for more instructions.

#### Dedicated scripts

Within any of the subdirectories in `src`, recognize the scripts by their `.sh` file suffix. The scripts follow a naming convention: an initial segment identifies the primary executable invoked by the script (usually `mvn` for Maven), a final segment `xpl` or `xsl` indicates an XProc or XSLT entry point, and the intermediate segments indicate what the script produces.

For example, `mvn-xsd-schema-xsl.sh` can be run to produce an XSD schema from a metaschema, using an XSLT-based process (i.e., Saxon with an appropriate XSLT transformation), run under Maven.

Each script also requires arguments, typically the path to the metaschema source (input) file along with a name or keyword directing where to write results. Invoke the script without arguments to get help on its syntax requirements.
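
The naming convention can be made concrete with a small sketch that splits a script name into its segments. The function `describe_script` is hypothetical and not part of this distribution; it assumes only the convention as described here (initial segment, intermediate segments, final `xpl` or `xsl` segment).

```bash
# Hypothetical helper: split a script name that follows the naming
# convention into runner, product and entry-point segments.
describe_script() {
  local name="${1##*/}"        # drop any leading directory
  name="${name%.sh}"           # drop the .sh suffix
  local runner="${name%%-*}"   # initial segment, e.g. mvn
  local entry="${name##*-}"    # final segment: xpl or xsl
  local product="${name#*-}"   # strip the initial segment ...
  product="${product%-*}"      # ... and the final segment
  printf 'runner=%s product=%s entry=%s\n' "$runner" "$product" "$entry"
}

describe_script mvn-xsd-schema-xsl.sh
# prints: runner=mvn product=xsd-schema entry=xsl
```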

Scripts and stylesheets are also documented in place using readmes and in line. Since XSLTs can call, import, include or read XSLTs from elsewhere in the distribution, and sometimes do, keep the modules together: each folder on its own is *not* self-contained.

Users may also apply and use resources in this repository in their own scripts. Do this either by cloning, copying and modifying scripts here; by writing your own scripts or shells; or by adapting code into the XML/XSLT processing framework or stack of your choice.

A convention is used indicating that an XProc (`*.xpl` file) or XSLT (`*.xsl`) intended to be invoked directly (that is, not only to be used as a module or component) is given a name entirely or partly in `ALL-CAPITALS`. For example, `src/schema-gen/METASCHEMA-ALL-SCHEMAS.xpl` is such an XProc pipeline (a step definition intended to be used directly). The XSLTs that observe this convention are, additionally, higher-order transformations by virtue of using the `transform()` function; for all other resources the convention `lower-case-hyphenated` is followed.
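
The capitalization convention also makes the directly invocable resources easy to find from the command line. The following is a rough sketch, not a tool provided here: the function name `list_entry_points` is hypothetical, and treating "a run of two or more capital letters in the base name" as the marker is an assumption about how to detect the convention.

```bash
# Hypothetical helper: list *.xpl and *.xsl files whose base name contains
# a run of two or more capital letters (assumed marker of an entry point).
list_entry_points() {
  local root="${1:-.}"
  local path base
  find "$root" -type f \( -name '*.xpl' -o -name '*.xsl' \) |
    while IFS= read -r path; do
      base="${path##*/}"   # file name without directory
      base="${base%.*}"    # file name without extension
      if [[ "$base" =~ [[:upper:]]{2,} ]]; then
        printf '%s\n' "$path"
      fi
    done | sort
}
```

Calling `list_entry_points src` from the repository root would then print candidates such as `src/schema-gen/METASCHEMA-ALL-SCHEMAS.xpl`, while files named in `lower-case-hyphenated` style are skipped.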

### `make` support

Additionally, some subdirectories include `make` configurations. These are used for testing including regression testing, but may also be used to support processing.

To use `make`, confirm you have [`make`](https://www.gnu.org/software/make/), or install it. In any directory with a Makefile, including this one, test it:

```
> src/schema-gen make
```

The system returns a list of available (configured) targets, typically running tests.

## Subdirectories

## common
### common

XSLT and logic used as common modules by other utilities.

Moving or removing this directory will often break things.

## compose
### compose

Implements a metaschema composition pipeline - producing a unified single metaschema from a metaschema top-level module, by performing imports and linking references.

This subroutine is a dependency for most other metaschema processes, so like `common` this directory should be kept in place.

## converter-gen
### converter-gen

Logic to generate converter transformations (XSLT) capable of producing JSON from XML or XML from JSON, according to mappings defined by appropriate metaschema definitions, defining schemas to which the respective data sets are valid.

## document
### document

Logic to create HTML-based web-ready documentation of XML and JSON schemas based on a metaschema.

## metapath
### metapath

Provides support for parsing and mapping Metapath, the metaschema path language.

This directory is a dependency for logic in converter generation, which uses it to match JSON in conversion into XML, and schema generation, which uses it to implement path traversal in constraints definition and implementation.

## schema-gen
### schema-gen

Logic to provide schemas for validating XML or JSON according to definitions provided in a metaschema.

Generators for XSD and JSON Schema v7 are provided.

Additionally, a partial implementation of Metaschema constraints via a Schematron cast is offered, as a basis for future work.

## testing
### testing

Some testing artifacts.

Also find testing within each subdirectory, appropriate to its functionalities.

## util
### util

Miscellaneous utilities. Due for cleanup.

## validate
### validate

Provides support for *extra-schema validation* of Metaschema instances against constraints implicit in Metaschema semantics.
