Loading data
The 3.13.0 release provides a new command for loading data via the support provided in ml-javaclient-util. The intent of this is to provide a simple mechanism for loading data that should be part of any deployment. The support for loading files from disk as documents is already used for loading modules and schemas, so this is just extending that support for loading arbitrary data.
For an ml-gradle project, it's likely you'll still have plenty of reasons for using mlcp to load data, as mlcp provides a number of useful options for parsing delimited data and loading data from zips. This feature is focused strictly on loading a directory of files into a database such that the documents in the database mirror those in the directory.
This feature has the following default behavior:
- `src/main/ml-data` is the default path for finding files to load.
- Any file in any data path will be loaded with a URI relative to the data path that it belongs to; e.g. the file `src/main/ml-data/my/data/test.json` will be loaded with a URI of `/my/data/test.json`.
- Collections and permissions can be specified via files in each directory.
- The files will be loaded via a `DatabaseClient` that uses the port defined by `appConfig.getRestPort()`. This can be overridden by specifying a database name to load files into, in which case `appConfig.getAppServicesPort()` will be used for making a connection.
The following properties are available for configuring this feature (all, unless otherwise noted, were introduced in 3.13.0):
Property | Description |
---|---|
mlDataBatchSize | The number of documents to include in each call to MarkLogic. Defaults to 100. |
mlDataCollections | Comma-delimited list of collection names assigned to each document. No default value. |
mlDataDatabaseName | Database to load documents into; if set, then ml-app-deployer will connect via the App-Services app server to load the documents. No default value. |
mlDataLoadingEnabled | Whether this feature is enabled. Defaults to true. |
mlDataLogUris | Whether the URI of every document inserted should be logged. Defaults to true. |
mlDataPaths | Comma-delimited list of data paths. Defaults to src/main/ml-data. |
mlDataPermissions | Comma-delimited list of permissions (role1,capability1,role2,capability2,etc) assigned to each document. No default value (which typically means you'll get rest-reader/read and rest-writer/update as permissions on each document). |
mlDataReplaceTokens | Whether tokens should be replaced in each document, where tokens are obtained from the custom tokens map on the AppConfig object. Defaults to true. |
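
As an illustration, the properties above could be set in a project's `gradle.properties` file. This is only a sketch: the property names come from the table, but every value shown (the second data path, the collection names, the batch size) is a hypothetical choice, not a recommendation.

```properties
# Hypothetical values; only the property names come from the table above.

# Load from the default path plus a second, made-up path
mlDataPaths=src/main/ml-data,src/test/ml-data

# Assign every loaded document to these (made-up) collections
mlDataCollections=reference-data,seed-data

# role,capability pairs applied to every loaded document
mlDataPermissions=rest-reader,read,rest-writer,update

# Send more documents per call than the default of 100
mlDataBatchSize=200

# Turn off per-document URI logging for large data sets
mlDataLogUris=false
```

Leaving `mlDataDatabaseName` unset, as here, means the documents are loaded through the REST port rather than the App-Services port.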
Starting in 3.15.0, the DataConfig object belonging to AppConfig defaults its fileFilter property to be an instance of DefaultFileFilter, which ignores every file starting with a "." or in a directory starting with ".".
As of 4.6.0, the properties `mlCascadeCollections` and `mlCascadePermissions` can be set to true so that the settings in `collections.properties` and `permissions.properties` will be applied to child directories, unless a child directory has its own `collections.properties` or `permissions.properties` file.
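
A per-directory `collections.properties` file might look like the sketch below. The format is an assumption based on how ml-javaclient-util handles these files for modules and schemas: each key is a file name in the same directory (or `*` for every file in it) and the value is a comma-delimited list of collections. The path, file name, and collection names are hypothetical.

```properties
# src/main/ml-data/my/data/collections.properties (hypothetical path)
# Assumed format: <file name or *>=<comma-delimited collections>
test.json=test-data,json-data
*=my-data
```

Under the same assumption, a `permissions.properties` file in that directory uses the same keys, with role,capability pairs as values, e.g. `*=rest-reader,read,rest-writer,update`. With `mlCascadeCollections=true` and `mlCascadePermissions=true`, these settings would also apply to subdirectories of `my/data` that lack their own files.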