The Cortex Profiles SDK is a collection of Java/Kotlin libraries, examples, and templates for utilizing Cortex Fabric in a Spark-based environment, either on your local instance or in a Cortex Fabric cluster.
These examples are structured in a step-by-step way to demonstrate the range of features currently available in the Cortex Profiles SDK.
The core of the Profiles SDK is a library that exposes an interface to Cortex for using Spark to perform custom processing of Profile-related data. The entry point to the Profiles SDK is the `CortexSession`, a session-based API around Spark and the `SparkSession`.
The Profiles SDK provides:
- An extensible dependency-injected platform that allows for process-, module-, and environment-specific (local vs. in a Cortex cluster) configuration
- Access to Cortex Catalog
- Access to Cortex Backend Storage (e.g. Managed Content and Profiles)
- Configurable provider for Cortex Secrets
- Stream and batch processing support for Cortex Connections
- Access to Cortex Fabric job flows for ingesting Data Sources and building Profiles
- Spark property-based configuration options
- A Cortex Skill Template with a spark-submit based launcher
The Cortex Profiles SDK consists of:
- The Profiles SDK jar file (`com.c12e.cortex.profiles:profiles-sdk`)
- The platform dependencies jar file (`com.c12e.cortex.profiles:platform-dependencies`)
- Example materials and templates located in this repo
The Profiles SDK jar files can be pulled from CognitiveScale's JFrog Artifactory if access has been shared with you. Follow the JFrog Artifactory Developer Setup.
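As a rough sketch, a `build.gradle.kts` consuming the SDK from Artifactory might look like the following. The repository URL, credential property names, and versions are placeholders, not the actual values — use the details from the JFrog Artifactory Developer Setup and `gradle.properties.template`:

```kotlin
repositories {
    mavenCentral()
    maven {
        // Placeholder URL -- substitute the repository from your Artifactory setup.
        url = uri("https://<artifactory-host>/artifactory/<repo>")
        credentials {
            // Property names are illustrative; see gradle.properties.template
            // for the keys expected by this repo's build.
            username = project.findProperty("artifactoryUser") as String?
            password = project.findProperty("artifactoryPassword") as String?
        }
    }
}

dependencies {
    // Versions are illustrative; match the SDK version shared with you.
    api("com.c12e.cortex.profiles:profiles-sdk:<version>")
    api("com.c12e.cortex.profiles:platform-dependencies:<version>")
}
```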
JVM settings can be set via the `GRADLE_OPTS` environment variable:

```shell
export GRADLE_OPTS="-Dorg.gradle.jvmargs='-Xmx2g -XX:MaxMetaspaceSize=512m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+OptimizeStringConcat'"
```

Alternatively, you can update the `$USER_HOME/.gradle/gradle.properties` file by adding the following line. Create the file if it does not already exist.

```properties
org.gradle.jvmargs=-Xmx2g -XX:MaxMetaspaceSize=512m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+OptimizeStringConcat
```
- Install Java 11 using the Resources section.
- Obtain JFrog Artifactory credentials (shared in LastPass with everyone in the Shared-Engineering folder).
- Install IntelliJ IDEA with the latest Kotlin plugin enabled (IntelliJ IDEA).
- Put the JFrog Artifactory credentials in the `$USER_HOME/.gradle/gradle.properties` file. (See `gradle.properties.template` for instructions.)
To work with a local (developer) installation of the Cortex Profiles SDK, see dev.md.
Examples are structured to build upon one another and grow in complexity. Each provides its own instructions for running it as well as additional context. The top-level main-app is a CLI wrapper around the other examples:
Sequence | Example | Description |
---|---|---|
1 | Using Local Cortex Clients | This example introduces working with the Cortex Profiles SDK in a local development environment. |
2 | Join Two Connections | This example is a CLI application for joining two Cortex Connections and saving the resulting dataset to another Cortex Connection. |
3 | Refresh a DataSource | This example is a CLI application for refreshing a Data Source by reading its Cortex Connection and writing the dataset to the Data Source. |
4 | Build Profiles | This example is a CLI application for building Cortex Profiles. |
5 | Streaming to a Data Source | This example contains a CLI application for refreshing a Data Source via streaming. |
6 | Using a CData Connection | This example is a CLI application for reading data from a JDBC CData Cortex Connection and writing that data to a separate Connection. |
7 | Reading From BigQuery | This example is a CLI application that writes data from a Google BigQuery table to the location of a Cortex Connection. It builds off of the Local Clients example for its initial setup. |
8 | Caching Profiles | This example is a CLI application that writes Profile data from a Delta table to Redis for real-time Profile fetches. It builds off of the Local Clients and Build Profiles examples for its initial setup. |
9 | Profiles Daemon for Realtime Query | This example is a Spring API server application that exposes APIs for real-time Profile fetches. It works in conjunction with the Caching Profiles example for its initial setup. |
10 | KPI Queries | This CLI application enables users to evaluate KPI expressions written in JavaScript through the profiles-sdk, similar to the KPI Dashboard. The goal is to provide an interface over Profiles to evaluate straightforward KPI expressions, or to define cohorts on the Profiles and write complex KPI expressions aggregated over a window duration within a timeframe. This example builds off of the Local Clients and Build Profiles examples for its initial setup. |
11 | Filter and Aggregate Query examples | This CLI application showcases filter and aggregate queries for the member-profile Profile Schema using the profiles-sdk. This example builds off of the Local Clients and Build Profiles examples for its initial setup. |
12 | Catalog Management | This example is a CLI application that uses a secondary configuration file, app-config.json, to define a number of Catalog entities to be managed during execution. The Connections, Data Sources, and Profile Schemas in app-config.json are created with the attributes defined in the configuration and are then available for use through the Profiles SDK. |
picocli is used by each example to create a minimal CLI application for running the example. Refer to the instructions in each example.
The examples are structured as a Gradle multi-project build.
To include a new project in the example Profiles application you will need to:
- Create a new Java module. Ensure the new project is included in the `settings.gradle.kts` file.
- Include `com.c12e.cortex.profiles:profiles-sdk` and `com.c12e.cortex.profiles:platform-dependencies` as `api` dependencies in your configuration. Refer to join-connections/build.gradle.kts for an example setup including the Profiles SDK, picocli, and JUnit dependencies.
- (Optional) Include a main CLI entrypoint in your module using picocli.
- Include your project in the main application:
  - Add your project as a dependency of `main-app`. In `main-app/build.gradle.kts`, add `implementation(project(":<your-project>"))` to `dependencies`.
  - Add your project's source to the `main-app` jar file. In `main-app/build.gradle.kts`, add `from(project(":<your-project>"))` to the `Jar` task (`tasks.withType<Jar>`).
  - (Optional) If you included a CLI entrypoint in your module, you can list it as a subcommand by importing the class in Application.java. Refer to the existing subcommands in Application.java for an example of how to include the class as a subcommand.
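The `main-app` wiring described above can be sketched as the following additions to `main-app/build.gradle.kts`, where `:my-example` is a placeholder for your project's name:

```kotlin
dependencies {
    // Placeholder project name -- replace ":my-example" with your module.
    implementation(project(":my-example"))
}

tasks.withType<Jar> {
    // Bundle the example project's output into the main-app jar.
    from(project(":my-example"))
}
```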
The Skill Template directory contains files for packaging the application as a Cortex Job Skill, where:
- The input to the Skill is a JSON payload with the path to a Spark configuration file.
- The output of the Skill is the Job execution logs.
- The Docker image for the `main-app` uses a spark-submit based wrapper to launch the Spark application. The resources for the `spark-submit` wrapper are in the main-app/src/main/resources/python/ directory and are necessary for packaging the Skill.

NOTE: The `ENTRYPOINT` for the Docker image is scuttle. When running the application in a Docker container locally, you should set the `--entrypoint` option.
- Before creating the Skill, you will need to:
  - Set the private registry URL accessible from Cortex as an environment variable: `export DOCKER_PREGISTRY_URL=...`.
  - Set the name of the Project in which to save the Skill, Action, and Types: `export PROJECT_NAME=xxxx`.
  - Set the Cortex Token for authenticating to Cortex: `export CORTEX_TOKEN=xxxx`.
  - Update the spark-conf.json with the CLI application command and other config options.
  - Verify the Skill's `payload.json` file refers to the above Spark configuration file (in the built container).
- Build the Skill:

  ```shell
  make all
  ```

  The final output should look similar to:

  ```
  docker build --build-arg base_img=c12e/spark-template:profile-jar-base-6.3.2-rc.2 -t profiles-example:latest -f ./main-app/build/resources/main/Dockerfile ./main-app/build
  [+] Building 1.5s (17/17) FINISHED
   => [internal] load build definition from Dockerfile  0.0s
   => => transferring dockerfile: 37B  0.0s
   => [internal] load .dockerignore  0.0s
   => => transferring context: 2B  0.0s
   => [internal] load metadata for docker.io/c12e/spark-template:profile-jar-base-6.3.2-rc.2  1.0s
   => [auth] c12e/spark-template:pull token for registry-1.docker.io  0.0s
   => FROM docker.io/redboxoss/scuttle:latest  0.4s
   => => resolve docker.io/redboxoss/scuttle:latest  0.4s
   => [internal] load build context  0.0s
   => => transferring context: 1.16kB  0.0s
   => [stage-0 1/9] FROM docker.io/c12e/spark-template:profile-jar-base-6.3.2-rc.2@sha256:331f93e1290442934adbd14e904740ef458d2ea012c3288d689608e9202899dd  0.0s
   => [auth] redboxoss/scuttle:pull token for registry-1.docker.io  0.0s
   => CACHED [stage-0 2/9] COPY --from=redboxoss/scuttle:latest /scuttle /bin/scuttle  0.0s
   => CACHED [stage-0 3/9] COPY ./resources/main/python/ .  0.0s
   => CACHED [stage-0 4/9] RUN pip3 install -r requirements.txt  0.0s
   => CACHED [stage-0 5/9] COPY ./libs/main-app-1.0.0-SNAPSHOT.jar /app/libs/app.jar  0.0s
   => CACHED [stage-0 6/9] COPY ./libs/main-app-1.0.0-SNAPSHOT.jar /opt/spark/jars  0.0s
   => CACHED [stage-0 7/9] COPY ./resources/main/spark-conf /opt/spark/conf  0.0s
   => CACHED [stage-0 8/9] COPY ./resources/main/lib/*.jar /opt/spark/jars/  0.0s
   => CACHED [stage-0 9/9] COPY ./resources/main/conf /app/conf  0.0s
   => exporting to image  0.0s
   => => exporting layers  0.0s
   => => writing image sha256:94dd0d11a31abcd3d23d97b003ffceb30e9eafc38c77dba9f3b92d6ea8633526  0.0s
   => => naming to docker.io/library/profiles-example:latest  0.0s
  docker tag profiles-example:latest private-registry.dci-dev.dev-eks.insights.ai/profiles-example:latest
  docker push private-registry.dci-dev.dev-eks.insights.ai/profiles-example:latest
  The push refers to repository [private-registry.dci-dev.dev-eks.insights.ai/profiles-example]
  12898f60c37f: Layer already exists
  ece5a5cb892e: Layer already exists
  4bc783276212: Layer already exists
  e310b2eec001: Layer already exists
  b995497ccd6e: Layer already exists
  4b156e5303b4: Layer already exists
  2012777427ee: Layer already exists
  4fae41f79235: Layer already exists
  ea16b97b6399: Layer already exists
  7628da35a3c9: Layer already exists
  5f70bf18a086: Layer already exists
  7723dc94285f: Layer already exists
  latest: digest: sha256:aa4c3a4dc42a4af55ab0eac6d5bdc3c226828b133013bdea21d297698b43471d size: 2830
  cortex types save -y templates/types.yaml --project testi-69257
  Type definition saved
  cortex actions deploy --actionName profiles-example --actionType job --docker private-registry.dci-dev.dev-eks.insights.ai/profiles-example:latest --project laguirre-testi-69257 --cmd '["scuttle", "python", "submit_job.py"]' --podspec ./templates/podspec.yaml
  {
    "success": true,
    "action": {
      "_isDeleted": false,
      "_projectId": "testi-69257",
      "_createdBy": "[email protected]",
      "name": "profiles-example",
      "description": "",
      "image": "private-registry.dci-dev.dev-eks.insights.ai/profiles-example:latest",
      "type": "job",
      "command": [ "scuttle", "python", "submit_job.py" ],
      "scaleCount": 1,
      "podSpec": "[{\"path\":\"/containers/0/imagePullPolicy\",\"value\":\"Always\"}]",
      "jobTimeout": 0,
      "k8sResources": [],
      "environmentVariables": null,
      "createdAt": "2022-07-15T22:33:46.250Z",
      "updatedAt": "2022-07-15T22:33:46.250Z",
      "_version": 8
    }
  }
  cortex skills save -y templates/skill.yaml --project testi-69257
  Skill saved: {"success":true,"version":8,"message":"Skill definition profiles-example saved."}
  ```
- Invoke the Skill:

  ```shell
  make invoke
  ```

  Example output:

  ```
  cortex skills invoke --params-file templates/payload.json profiles-example params --project laguirre-testi-69257
  {
    "success": true,
    "activationId": "115d1196-408a-47fa-91c4-6f8e8a391641"
  }
  ```
- Run `cortex agents get-activation <activation-id>` or `cortex tasks logs <task-name>` to view the logs of the Skill activation.
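For reference, the payload passed during `make invoke` (templates/payload.json) points the Skill at the Spark configuration file inside the built container. A minimal sketch is shown below; the key name is illustrative and the path assumes the configuration is copied to `/app/conf` as in the Docker build output, so consult the repo's templates for the actual contents:

```json
{
  "payload": {
    "config": "/app/conf/spark-conf.json"
  }
}
```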