This repository has been archived by the owner on Sep 16, 2024. It is now read-only.

Loading files

Rob Rudin edited this page Jan 25, 2024 · 8 revisions

ml-javaclient-util includes support for loading files from disk into a MarkLogic database, with specific support for loading modules and schemas.

For information specifically on modules, please see Loading modules.

Loading any kind of file

Support for loading any kind of file is handled by GenericFileLoader, which provides a flexible and extensible API for loading files from multiple directories, where permissions and collections can be specified on a per-file basis.

GenericFileLoader provides the following features:

  1. Files can be loaded from one or more paths.
  2. Files are loaded via a BatchWriter, which means they can be loaded via the REST API, DMSDK, or XCC.
  3. Zero or more FileFilter objects can be used to configure which files are loaded.
  4. Zero or more DocumentFileProcessor objects can be used to process each file as it's loaded. A processor can decide not to load the file, or it can modify the target URI, the collections, or the permissions (see more details below).
  5. A default set of permissions and default set of collections can be defined.
  6. Tokens can be replaced in the text of a file before it's loaded.
  7. GenericFileLoader is aware of a number of extensions that indicate a file should be loaded as a binary, and more extensions can be added.

Specifying collections and permissions

As of 2.13.0, you can specify collections and permissions for schemas. As of 3.0.0, this also works for modules and for any other kind of file. The feature was added specifically to make it easier to add ML9 redaction rulesets to specific collections. To use it, define either of the following files in any directory containing the files to be loaded:

  • collections.properties
  • permissions.properties

These are expected to have key/value pairs of filename=collection1,collection2 and filename=role,capability,role,capability.

For example, for a file named "my.ruleset", you could have the following in collections.properties:

my.ruleset=coll1,coll2

And in permissions.properties:

my.ruleset=rest-reader,read,rest-writer,update

And your directory would look like this:

collections.properties
my.ruleset
permissions.properties

Note that these special properties files will NOT be loaded into MarkLogic - they're just there to provide metadata for files that you do want to load.
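To make the two value formats concrete, here is a minimal sketch that reads the example entries above with java.util.Properties and splits the values. The class and method names (FileMetadataSketch, parseCollections, parsePermissions) are hypothetical and exist only for illustration; this is not the parsing code used by ml-javaclient-util.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration only; not part of ml-javaclient-util.
public class FileMetadataSketch {

    /** Splits a value like "coll1,coll2" into a list of collection names. */
    public static List<String> parseCollections(String value) {
        List<String> collections = new ArrayList<>();
        for (String token : value.split(",")) {
            if (!token.trim().isEmpty()) {
                collections.add(token.trim());
            }
        }
        return collections;
    }

    /** Splits "role,capability,role,capability" into [role, capability] pairs. */
    public static List<String[]> parsePermissions(String value) {
        String[] tokens = value.split(",");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException("Expected role,capability pairs: " + value);
        }
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < tokens.length; i += 2) {
            pairs.add(new String[] {tokens[i].trim(), tokens[i + 1].trim()});
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // Same entries as the collections.properties / permissions.properties examples.
        Properties collections = new Properties();
        collections.load(new StringReader("my.ruleset=coll1,coll2"));
        Properties permissions = new Properties();
        permissions.load(new StringReader("my.ruleset=rest-reader,read,rest-writer,update"));

        System.out.println(parseCollections(collections.getProperty("my.ruleset")));
        for (String[] pair : parsePermissions(permissions.getProperty("my.ruleset"))) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

The sketch rejects a permissions value with an odd number of tokens, since every role must be followed by a capability.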

New in release 3.11.0 - you can now specify collections for every file in a directory via a wildcard:

*=collection1,collection2

And likewise for permissions:

*=manage-user,read

These are additive too, so you can still include collections/permissions for specific files:

*=all-data
test.json=json-data
test.xml=xml-data

This feature will be present in version 3.13.0 of ml-app-deployer and ml-gradle.
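The additive behavior can be sketched as follows: the wildcard entry contributes its collections first, and any entry for the specific filename is added on top. WildcardCollectionsSketch and collectionsFor are hypothetical names, and the ordering of the result is an assumption of this sketch rather than documented library behavior.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of ml-javaclient-util.
public class WildcardCollectionsSketch {

    /** Combines the "*" entry (if any) with the entry for the given filename (if any). */
    public static Set<String> collectionsFor(String filename, Map<String, String> entries) {
        Set<String> result = new LinkedHashSet<>();
        addAll(result, entries.get("*"));
        addAll(result, entries.get(filename));
        return result;
    }

    private static void addAll(Set<String> target, String commaDelimited) {
        if (commaDelimited == null) {
            return;
        }
        for (String value : commaDelimited.split(",")) {
            if (!value.trim().isEmpty()) {
                target.add(value.trim());
            }
        }
    }

    public static void main(String[] args) {
        // Same entries as the collections.properties example above.
        Map<String, String> entries = Map.of(
            "*", "all-data",
            "test.json", "json-data",
            "test.xml", "xml-data");
        System.out.println(collectionsFor("test.json", entries));
        System.out.println(collectionsFor("other.txt", entries));
    }
}
```

In this sketch, test.json ends up in both all-data and json-data, while a file with no entry of its own receives only all-data.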

New in release 3.14.0 - when using ml-javaclient-util within a tool like ml-app-deployer or ml-gradle, tokens will be replaced within these files. For example, a permissions file can contain the following:

test.json=%%myRoleName%%,update

And in an ml-gradle context, the Gradle property myRoleName (if defined) will be substituted in for "%%myRoleName%%".
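As a rough illustration of this kind of substitution, the sketch below replaces %%name%% tokens using a plain map of property values. TokenReplacementSketch and replaceTokens are hypothetical names, and leaving undefined tokens untouched is an assumption of the sketch; the actual replacement is performed by the tooling, not by this code.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the token replacer used by ml-gradle.
public class TokenReplacementSketch {

    // Matches tokens of the form %%name%%.
    private static final Pattern TOKEN = Pattern.compile("%%(\\w+)%%");

    /** Replaces each %%name%% with its property value; undefined tokens are left as-is. */
    public static String replaceTokens(String text, Map<String, String> properties) {
        Matcher matcher = TOKEN.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = properties.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String line = "test.json=%%myRoleName%%,update";
        System.out.println(replaceTokens(line, Map.of("myRoleName", "my-role")));
    }
}
```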

New in release 4.6.0 - GenericFileLoader and its subclasses can now cascade the values in collections.properties and permissions.properties. Calling setCascadeCollections(true) and setCascadePermissions(true) will enable this feature.

New in release 4.7.0 - you can now use any glob expression to refer to many files instead of being limited to just *. For example, the following entry will be applied to all files ending in ".json":

*.json=rest-reader,read
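
To get a feel for how a glob key selects filenames, the sketch below uses the JDK's own glob support (java.nio.file.PathMatcher). GlobMatchSketch is a hypothetical name, and the glob syntax accepted by ml-javaclient-util may differ in details from the JDK's, so treat this as an approximation.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical illustration only; uses the JDK glob matcher, not the library's.
public class GlobMatchSketch {

    /** Returns true if the filename matches the glob expression. */
    public static boolean matches(String glob, String filename) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(filename));
    }

    public static void main(String[] args) {
        System.out.println(matches("*.json", "test.json")); // true
        System.out.println(matches("*.json", "test.xml"));  // false
        System.out.println(matches("*", "test.xml"));       // true
    }
}
```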