Out-of-the-box support for deploying Entity Service models #123

Closed
jmakeig opened this issue Aug 16, 2016 · 9 comments

jmakeig commented Aug 16, 2016

We would probably want an ml-models folder as a sibling of ml-modules and ml-schemas, with configurable permissions and collections. Since models go in the content database, they should have *-admin visibility by default and should be segregated into their own directory, probably mirroring any sub-directories in the local config. They also need to be loaded into the http://marklogic.com/entity-services/models collection to be picked up by the default TDE.

This is specific to MarkLogic 9.
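
One way to read the requirements above, as a minimal sketch and not ml-gradle's actual behavior: walk a local models folder, mirror its sub-directories under a dedicated URI prefix, and attach the Entity Services collection plus a default admin permission. The `/entity-services/models/` URI prefix, the `myapp-admin` role name, and the `ModelLoadPlan` shape are assumptions made up for illustration; only the collection URI comes from this issue.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Collection named in this issue; models must be loaded into it.
MODELS_COLLECTION = "http://marklogic.com/entity-services/models"


@dataclass
class ModelLoadPlan:
    """Where and how one local model file would be loaded (hypothetical shape)."""
    local_path: Path
    uri: str
    collections: list = field(default_factory=list)
    permissions: dict = field(default_factory=dict)


def plan_model_loads(models_dir, uri_prefix="/entity-services/models/",
                     admin_role="myapp-admin", extra_collections=()):
    """Build a load plan for every JSON or XML model file under models_dir."""
    root = Path(models_dir)
    plans = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".json", ".xml"}:
            continue
        # Mirror the local sub-directory structure in the document URI.
        relative = path.relative_to(root).as_posix()
        plans.append(ModelLoadPlan(
            local_path=path,
            uri=uri_prefix + relative,
            collections=[MODELS_COLLECTION, *extra_collections],
            # Assumed default: only the admin role can read and update models.
            permissions={admin_role: ["read", "update"]},
        ))
    return plans
```

Under these assumptions, a file at `ml-models/race/simple-race.json` would be planned for the URI `/entity-services/models/race/simple-race.json`. The plan is kept separate from the actual load call so that permissions and collections stay configurable.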

Contributor

rjrudin commented Aug 16, 2016

@jmakeig Could you either email me or open up a PR that has sample model files in it? Should be easy to support this.

Author

jmakeig commented Aug 16, 2016

The core models are described in JSON or XML documents. These get distilled into triples once the model descriptor hits the database, but that’s not relevant to ml-gradle, other than the required collection, http://marklogic.com/entity-services/models, with which models need to be tagged.

Here’s an example from Early Access:

{
  "info": {
    "title": "Race",
    "version": "0.0.1",
    "baseUri": "http://grechaw.github.io/entity-types",
    "description":"This schema represents a Runner who runs Runs and has the potential of winning Races.  We'll start with this entity-type, then decide on and populate instances, tie the data in with an external RDF-based model, and query it.  There are interesting problems with the bare-bones approach in this entity-type."
  },
  "definitions": {
    "Race": {
        "properties": {
            "name": {
                "datatype": "string",
                "description":"The name of the race."
            },
            "comprisedOfRuns": {
                "datatype": "array",
                "description":"An array of Runs that comprise the race.",
                "items":{
                    "$ref": "#/definitions/Run"
                }
            },
            "wonByRunner": {
                "$ref":"#/definitions/Runner",
                "description":"The (single) winner of the race.  (rule) Should match the run of shortest duration."
            },
            "courseLength": {
                "datatype":"decimal",
                "description":"Length of the course in a scalar unit (decimal miles)"
            }
        },
        "primaryKey":"name"
    },
    "Run": {
        "properties": {
            "id": {
                "datatype": "string",
                "description":"A unique id for the run. maybe date/runByRunner (assumes one run per day per person)"
            },
            "date": {
                "datatype": "date",
                "description":"The date on which the run occurred."
            },
            "distance": {
                "datatype": "decimal",
                "description":"The distance covered, in a scalar value."
            },
            "distanceLabel": {
                "datatype": "string",
                "description":"The distance covered, in a conventional notation."
            },
            "duration": {
                "datatype": "dayTimeDuration",
                "description":"The duration of the run."
            },
            "runByRunner": {
                "$ref": "#/definitions/Runner"
            }
        },
        "primaryKey":"id",
        "required":["date","distance","duration","runByRunner"],
        "rangeIndex":["date", "distance", "duration", "runByRunner"]
    },
    "Runner": {
        "properties": {
            "name": {
                "datatype": "string",
                "description":"The name of the runner.  In this early model, unique and a PK."
            },
            "age": {
                "datatype": "int",
                "description":"age, in years."
            },
            "gender" : {
                "datatype" : "string",
                "description": "The gender of the runner (for the purposes of race categories.)"
            }
        },
        "primaryKey": "name",
        "wordLexicon": ["name"],
        "required": ["name", "age"]
    }
  }
}

Author

jmakeig commented Aug 16, 2016

Another related aspect is all of the scaffolding that Entity Services provides once you’ve defined a model. From the model, Entity Services can generate:

  • Transformation code
  • Search API options
  • Database indexing configuration
  • XML schemas
  • TDEs for Optic API queries

It would be nice to have some ml-gradle scaffolding that could generate these artifacts as well and put them in the right place. Here’s some XQuery that illustrates the idea:

xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services" at "/MarkLogic/entity-services/entity-services.xqy";
import module namespace esi = "http://marklogic.com/entity-services-impl" at "/MarkLogic/entity-services/entity-services-impl.xqy";

(: Local workspace root; adjust for your machine. :)
declare variable $PATH as xs:string := "/Users/jmakeig/Workspaces";

(: this script generates all of the artifacts supported in Entity Services for EA-2 :)
let $project-location := $PATH || "/xdmp-entity-services/entity-services-examples"
let $gen-prefix := $project-location || "/gen/"
(: Build the model from the descriptor document already loaded in the database. :)
let $d := es:entity-type-from-node(fn:doc("simple-race.json"))
return ($d,
  (: xdmp:save writes to the file system of the MarkLogic host. :)
  (: In order: conversion module, TDE extraction template, XML schema, database properties. :)
  xdmp:save($gen-prefix || "ml-modules/ext/Race-0.0.1.xqy", es:conversion-module-generate($d)),
  xdmp:save($gen-prefix || "ml-schemas/Race-0.0.1.tdex", es:extraction-template-generate($d)),
  xdmp:save($gen-prefix || "ml-schemas/Race-0.0.1.xsd", es:schema-generate($d)),
  xdmp:save($gen-prefix || "ml-config/databases/content-database.json", es:database-properties-generate($d))
)
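
The file names in the script above follow a `<title>-<version>` pattern taken from the model's `info` block. A rough sketch of how a client-side tool might derive the same target paths from a model descriptor, assuming the layout used in the script; the artifact kind keys (`conversion-module`, `schema`, and so on) and the `artifact_paths` helper are hypothetical names, not part of Entity Services or ml-gradle:

```python
import json
from pathlib import PurePosixPath

# Relative targets copied from the XQuery script; the dictionary keys are
# illustrative labels only.
ARTIFACT_LAYOUT = {
    "conversion-module": "ml-modules/ext/{name}.xqy",
    "extraction-template": "ml-schemas/{name}.tdex",
    "schema": "ml-schemas/{name}.xsd",
    "database-properties": "ml-config/databases/content-database.json",
}


def artifact_paths(model_json, gen_prefix="gen"):
    """Map each artifact kind to the path where it would be written."""
    info = json.loads(model_json)["info"]
    name = f'{info["title"]}-{info["version"]}'
    return {
        kind: str(PurePosixPath(gen_prefix) / template.format(name=name))
        for kind, template in ARTIFACT_LAYOUT.items()
    }
```

For the Race model shown earlier, this would give `gen/ml-schemas/Race-0.0.1.xsd` for the schema and `gen/ml-modules/ext/Race-0.0.1.xqy` for the conversion module.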

Contributor

rjrudin commented Aug 16, 2016

Interesting... so you're thinking ml-gradle (really ml-app-deployer, but ml-gradle will expose it) could do something like this:

  1. Load any models under ml-models (maybe ml-entity-models? or ml-entities? or ml-es-models? I think we need something more specific than ml-models)
  2. For each model, do e.g. a /v1/eval call with something like the code above. I'm thinking it would return a big XML document containing each of those pieces, and then ml-gradle could write them out to the correct place (since xdmp:save won't work if my ML instance is remote)
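
To make step 2 concrete, here is a rough sketch of the client side only: it takes an XML envelope like the one the eval call might return and writes each piece under the local project directory. The envelope format (`<artifacts>` containing `<artifact path="...">` elements whose text is the artifact body) and the `write_artifacts` helper are assumptions for illustration; nothing here is an existing ml-gradle or ml-app-deployer API.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_artifacts(envelope_xml, project_dir):
    """Write each <artifact path="..."> from the envelope under project_dir.

    Assumed (hypothetical) envelope shape:
      <artifacts><artifact path="ml-schemas/Race-0.0.1.xsd">...</artifact></artifacts>
    Returns the list of files written.
    """
    project_root = Path(project_dir).resolve()
    written = []
    for artifact in ET.fromstring(envelope_xml).iter("artifact"):
        relative = artifact.get("path")
        if not relative:
            raise ValueError("artifact element is missing its path attribute")
        target = (project_root / relative).resolve()
        # Refuse absolute paths or ".." segments that escape the project.
        if project_root not in target.parents:
            raise ValueError(f"refusing to write outside the project: {relative}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(artifact.text or "", encoding="utf-8")
        written.append(target)
    return written
```

Because the files land on the local file system, this works against a remote MarkLogic instance, and the generated artifacts can be checked into version control and deployed like any hand-written ones.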

Contributor

rjrudin commented Oct 13, 2016

@jmakeig Is the design in the above comment suitable?

Author

jmakeig commented Oct 13, 2016

Yeah, that sounds good, especially your delineation between loading models and generating artifacts from the model. I like entity-models for the directory name. (Why does it need to be prefixed with ml?)

To reiterate what you surmised in 2, the generated artifacts need to live on the local file system even though they're generated remotely. The key aspect here is version control. A generated artifact would then be deployed just like any other artifact of its type.

@rjrudin rjrudin self-assigned this Dec 10, 2016
@rjrudin rjrudin added this to the 2.5.0 milestone Dec 10, 2016
rjrudin added a commit that referenced this issue Dec 12, 2016
Contributor

rjrudin commented Dec 12, 2016

There's an examples/entity-services-project that shows how this works.

@rjrudin rjrudin closed this as completed Dec 12, 2016
Contributor

grechaw commented Feb 6, 2017

wow

Contributor

grechaw commented Feb 6, 2017

the future comes so quickly sometimes.
