diff --git a/.circleci/db-migration.sh b/.circleci/db-migration.sh
index ee70512e1e..6567d91876 100755
--- a/.circleci/db-migration.sh
+++ b/.circleci/db-migration.sh
@@ -13,7 +13,7 @@
 # Version of PostgreSQL
 readonly POSTGRES_VERSION="14"
 # Version of Marquez
-readonly MARQUEZ_VERSION=0.44.0
+readonly MARQUEZ_VERSION=0.45.0-rc.1
 # Build version of Marquez
 readonly MARQUEZ_BUILD_VERSION="$(git log --pretty=format:'%h' -n 1)" # SHA1
diff --git a/.env.example b/.env.example
index cff436e317..bd9a02db22 100644
--- a/.env.example
+++ b/.env.example
@@ -1,4 +1,4 @@
 API_PORT=5000
 API_ADMIN_PORT=5001
 WEB_PORT=3000
-TAG=0.44.0
+TAG=0.45.0-rc.1
diff --git a/chart/Chart.yaml b/chart/Chart.yaml
index 24a3d562ad..e38add6c4a 100644
--- a/chart/Chart.yaml
+++ b/chart/Chart.yaml
@@ -29,4 +29,4 @@ name: marquez
 sources:
   - https://github.com/MarquezProject/marquez
   - https://marquezproject.github.io/marquez/
-version: 0.44.0
+version: 0.45.0-rc.1
diff --git a/chart/values.yaml b/chart/values.yaml
index 0ce58e9d6d..5cda62dffb 100644
--- a/chart/values.yaml
+++ b/chart/values.yaml
@@ -17,7 +17,7 @@ marquez:
   image:
     registry: docker.io
     repository: marquezproject/marquez
-    tag: 0.44.0
+    tag: 0.45.0-rc.1
     pullPolicy: IfNotPresent
   ## Name of the existing secret containing credentials for the Marquez installation.
   ## When this is specified, it will take precedence over the values configured in the 'db' section.
@@ -75,7 +75,7 @@ web:
   image:
     registry: docker.io
     repository: marquezproject/marquez-web
-    tag: 0.44.0
+    tag: 0.45.0-rc.1
     pullPolicy: IfNotPresent
   ## Marquez website will run on this port
   ##
diff --git a/clients/java/README.md b/clients/java/README.md
index 3fe4e7501a..723bdf5c20 100644
--- a/clients/java/README.md
+++ b/clients/java/README.md
@@ -10,14 +10,14 @@ Maven:
 <dependency>
     <groupId>io.github.marquezproject</groupId>
     <artifactId>marquez-java</artifactId>
-    <version>0.44.0</version>
+    <version>0.45.0-rc.1</version>
 </dependency>
 ```
 
 or Gradle:
 
 ```groovy
-implementation 'io.github.marquezproject:marquez-java:0.44.0
+implementation 'io.github.marquezproject:marquez-java:0.45.0-rc.1
 ```
 
 ## Usage
diff --git a/docker/up.sh b/docker/up.sh
index 68f3787543..05b47a61bf 100755
--- a/docker/up.sh
+++ b/docker/up.sh
@@ -8,9 +8,9 @@
 set -e
 
 # Version of Marquez
-readonly VERSION=0.44.0
+readonly VERSION=0.45.0-rc.1
 # Build version of Marquez
-readonly BUILD_VERSION=0.44.0
+readonly BUILD_VERSION=0.45.0-rc.1
 
 title() {
   echo -e "\033[1m${1}\033[0m"
diff --git a/docs/openapi.html b/docs/openapi.html
index c6dfc526a2..52d55b11aa 100644
--- a/docs/openapi.html
+++ b/docs/openapi.html
@@ -13,21 +13,27 @@
 }
-

Marquez (0.44.0)

Download OpenAPI specification:Download

License: Apache 2.0

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.

-

Namespaces

Create a namespace

Creates a new namespace object. A namespace enables the contextual grouping of related jobs and datasets. Namespaces must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), dashes (-), colons (:), slashes (/), or dots (.). A namespace is case-insensitive with a maximum length of 1024 characters. Note jobs and datasets will be unique within a namespace, but not across namespaces.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
Request Body schema: application/json
ownerName
required
string

The owner of the namespace.

-
description
string

The description of the namespace.

-

Responses

Request samples

Content type
application/json
{
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Retrieve a namespace

Returns a namespace.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Deletes a namespace

Soft deletes a namespace, and every job and dataset inside. On next event containing this namespace, the namespace will be undeleted.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

List all namespaces

Returns a list of namespaces.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "namespaces": [
    ]
}

Events

List all received OpenLineage events.

Returns a list of OpenLineage events, sorted in direction of passed sort parameter. By default it is desc.

-
query Parameters
sortDirection
string
Example: sortDirection=name

Sorts the results of your query by indicated direction asc or desc.

-
before
string <date-time>
Example: before=2022-09-15T07:47:19Z

Returns events before passed date.

-
after
string <date-time>
Example: after=2022-09-15T07:47:19Z

Returns events after passed date.

-
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{}

Sources

Create a source Deprecated

Creates a new source object. A source is the physical location of a dataset such as a table in PostgreSQL, or topic in Kafka. A source enables the grouping of physical datasets to their physical source.

-
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

-
Request Body schema: application/json
type
required
string

The type of the source.

-
connectionUrl
required
string <URL>

The URL to the location of the source.

-
description
string

The description of the source.

-

Responses

Request samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Retrieve a source

Returns a source.

-
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

-

Responses

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

List all sources

Returns a list of sources.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "sources": [
    ]
}

Datasets

Create a dataset Deprecated

Creates a new dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
Request Body schema: application/json
Any of
type
required
string
Value: "DB_TABLE"

The type of the dataset.

-
physicalName
required
string

The physical name of the table.

-
sourceName
required
string

The name of the source associated with the table.

-
required
Array of objects[ items ]

The fields of the table.

-
tags
Array of strings

List of tags.

-
description
string

The description of the table.

-
runId
string

The ID associated with the run modifying the table.

-

Responses

Request samples

Content type
application/json
Example
{
  • "type": "DB_TABLE",
  • "physicalName": "public.mytable",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "description": "My first dataset!"
}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "upodatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a dataset

Returns a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "upodatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Soft deletes dataset.

Soft deletes dataset. It will be un-deleted if new OpenLineage event containing this dataset comes.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "upodatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a version for a dataset

Returns a version for a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "version": "d224dac0-35d7-4d9b-bbbe-6fff1a8485ad",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "description": "My first dataset!",
  • "createdByRun": {
    }
}

List all versions for a dataset

Returns a list of versions for a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

List all datasets

Returns a list of datasets.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "datasets": [
    ],
  • "totalCount": 0
}

Tag a dataset

Tag an existing dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
tag
required
string
Example: SENSITIVE

The name of the tag.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "upodatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Tag a field

Tag an existing field of a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
field
required
string
Example: my_field

The name of the field.

-
tag
required
string
Example: SENSITIVE

The name of the tag.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "upodatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Jobs

Create a job Deprecated

Creates a new job object. All job objects are immutable and are uniquely identified by a generated ID. Marquez will create a version of a job each time the contents of the object is modified. For example, the location of a job may change over time resulting in new versions. The accumulated versions can be listed, used to rerun a specific job version or possibly help debug a failed job run.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
Request Body schema: application/json
object

The ID of the job.

-
type
required
string (JobType)
Enum: "BATCH" "STREAM" "SERVICE"

The type of the job.

-
required
Array of objects (DatasetId) unique [ items ]

The set of input datasets.

-
required
Array of objects (DatasetId) unique [ items ]

The set of output datasets.

-
location
string <URL>

The URL of the job source code or artifact.

-
context
object
Deprecated

A key/value pair that must be of type string. A context can be used for getting additional details about the job.

-
description
string

The description of the job.

-
runId
string

An optional run ID used to associate a job version to an existing job run.

-

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a job

Retrieve a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Soft deletes job.

Soft deletes job. It will be un-deleted if new OpenLineage event containing this job comes.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

List all jobs

Returns a list of jobs.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "jobs": [
    ],
  • "totalCount": 0
}

Retrieve a version for a job

Returns a version for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "version": "56472c57-a2ef-4218-b7b7-d2af02a343fd",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "facets": { }
}

List all versions for a job

Returns a list of versions for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

Create a run Deprecated

Creates a new run object for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
Request Body schema: application/json
id
string <uuid>

An optional user-provided unique ID of the run. A run ID must be an UUID. If an ID for the run is not provided, a random UUID will be generated for the given run.

-
nominalStartTime
string <date-time>

An ISO-8601 timestamp representing the nominal start time of the run.

-
nominalEndTime
string <date-time>

An ISO-8601 timestamp representing the nominal end time of the run.

-
args
object

The arguments of the run.

-

Responses

Request samples

Content type
application/json
{
  • "args": {
    }
}

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

List all runs

Returns a list of runs for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Retrieve a run

Retrieve a run.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-

Responses

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Retrieve run or job facets for a run.

Retrieve run or job facets for a run.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
type
required
string
Enum: "run" "job"

Indicates if should return job or run facets.

-

Responses

Response samples

Content type
application/json

Start a run Deprecated

Marks the run as RUNNING.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Complete a run Deprecated

Marks the run as COMPLETED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Fail a run Deprecated

Marks the run as FAILED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Abort a run Deprecated

Marks the run as ABORTED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Lineage

Record a single lineage event

Receive, process, and store lineage metadata using the OpenLineage standard.

-
Request Body schema: application/json
any (LineageEvent)

Responses

Request samples

Content type
application/json
{}

Get a lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node. A node can either be a dataset node, a dataset field node or a job node. The format of nodeId for dataset is dataset:<namespace_of_dataset>:<name_of_the_dataset>, for dataset field is datasetField:<namespace_of_dataset>:<name_of_the_dataset>:<name_of_field>, and for job is job:<namespace_of_the_job>:<name_of_the_job>.

-
depth
integer
Default: 20

Depth of lineage graph to create.
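
The three `nodeId` formats described above can be sketched as small string builders. This is an illustration only: the function names are hypothetical and are not part of any Marquez client shown in this patch. Because namespaces may themselves contain colons, the sketch reads back only the leading node type and does not try to split the rest.

```python
NODE_TYPES = ("dataset", "datasetField", "job")


def dataset_node_id(namespace: str, name: str) -> str:
    """Build dataset:<namespace_of_dataset>:<name_of_the_dataset>."""
    return f"dataset:{namespace}:{name}"


def dataset_field_node_id(namespace: str, name: str, field: str) -> str:
    """Build datasetField:<namespace>:<dataset name>:<field name>."""
    return f"datasetField:{namespace}:{name}:{field}"


def job_node_id(namespace: str, name: str) -> str:
    """Build job:<namespace_of_the_job>:<name_of_the_job>."""
    return f"job:{namespace}:{name}"


def node_type(node_id: str) -> str:
    """Return the leading node type, or raise ValueError if it is not a documented one."""
    prefix, _, rest = node_id.partition(":")
    if prefix not in NODE_TYPES or not rest:
        raise ValueError(f"unrecognized nodeId: {node_id!r}")
    return prefix
```

With the documented example, `dataset_node_id("food_delivery", "public.delivery_7_days")` yields the same value shown for the `nodeId` parameter.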

-

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}

Get the upstream lineage for a given run

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Column lineage

Get a column lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node. A node can either be a dataset node, a dataset field node or a job node. The format of nodeId for dataset is dataset:<namespace_of_dataset>:<name_of_the_dataset>, for dataset field is datasetField:<namespace_of_dataset>:<name_of_the_dataset>:<name_of_field>, and for job is job:<namespace_of_the_job>:<name_of_the_job>.

-
depth
integer
Default: 20

Depth of lineage graph to create.

-
withDownstream
boolean
Default: false

Determines if downstream lineage should be returned.

-

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}

Tags

Create a tag

Creates a new tag object.

-
path Parameters
tag
required
string
Example: SENSITIVE

The name of the tag.

-
Request Body schema: application/json
description
string

The description of the tag.

-

Responses

Request samples

Content type
application/json
{
  • "description": "My first tag!"
}

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

List all tags

Returns a list of tags.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
offset
integer
Default: 0

The initial position from which to return results.

-

Responses

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

Search

Query all datasets and jobs

Returns one or more datasets and jobs of your query.

-
query Parameters
q
required
string
Example: q=my-dataset

Query containing pattern to match; datasets and jobs pattern matching is string based and case-insensitive. Use percent sign (%) to match any string of zero or more characters (my-job%), or an underscore (_) to match a single character (_job_).

-
filter
string
Example: filter=dataset

Filters the results of your query by dataset or job.

-
sort
string
Example: sort=name

Sorts the results of your query by name or updated_at.

-
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

-
namespace
string <= 1024 characters
Example: namespace=my-namespace

Match jobs or datasets within the given namespace.

-
before
stringYYYY-MM-DD
Example: before=2022-09-15

Match jobs or datasets before YYYY-MM-DD.

-
after
stringYYYY-MM-DD
Example: after=2022-09-15

Match jobs or datasets after YYYY-MM-DD.
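
The wildcard rules for `q` (`%` for zero or more characters, `_` for exactly one, case-insensitive) can be previewed locally with a small translation to a regular expression. This is a sketch under assumptions: the helper names are hypothetical, and whether the server anchors the pattern at both ends is not stated here, so the sketch assumes a full match.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a %/_ wildcard pattern into a case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # zero or more characters
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def matches(pattern: str, name: str) -> bool:
    """Return True if `name` matches the whole wildcard pattern (assumed anchoring)."""
    return like_to_regex(pattern).fullmatch(name) is not None
```

For example, `matches("my-job%", "MY-JOB-daily")` is true, while `matches("_job_", "job")` is false because each underscore must consume one character.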

-

Responses

Response samples

Content type
application/json
{
  • "totalCount": 1,
  • "results": [
    ]
}
+

Marquez (0.45.0-rc.1)

Download OpenAPI specification:Download

License: Apache 2.0

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.

+

Namespaces

Create a namespace

Creates a new namespace object. A namespace enables the contextual grouping of related jobs and datasets. Namespaces must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), dashes (-), colons (:), slashes (/), or dots (.). A namespace is case-insensitive with a maximum length of 1024 characters. Note jobs and datasets will be unique within a namespace, but not across namespaces.
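
A client could check a name against the rules above before sending a request. The following is a minimal sketch, assuming the character set and length limit exactly as stated in the description; Marquez's own server-side validation is not shown in this patch, and the function names are hypothetical.

```python
import re

# Assumption: mirrors the documented rule (letters, digits, _ - : / .).
NAMESPACE_PATTERN = re.compile(r"[A-Za-z0-9_\-:/.]+")
MAX_NAMESPACE_LENGTH = 1024


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` satisfies the documented character and length rules."""
    if not name or len(name) > MAX_NAMESPACE_LENGTH:
        return False
    return NAMESPACE_PATTERN.fullmatch(name) is not None


def same_namespace(a: str, b: str) -> bool:
    """Compare two names case-insensitively, since namespaces are described as case-insensitive."""
    return a.casefold() == b.casefold()
```

Under these assumptions `is_valid_namespace("my-namespace")` is true and `is_valid_namespace("my namespace")` is false, because a space is not in the allowed set.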

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
Request Body schema: application/json
ownerName
required
string

The owner of the namespace.

+
description
string

The description of the namespace.

+

Responses

Request samples

Content type
application/json
{
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Retrieve a namespace

Returns a namespace.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Deletes a namespace

Soft deletes a namespace, and every job and dataset inside. On next event containing this namespace, the namespace will be undeleted.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

List all namespaces

Returns a list of namespaces.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "namespaces": [
    ]
}

Events

List all received OpenLineage events.

Returns a list of OpenLineage events, sorted in direction of passed sort parameter. By default it is desc.

+
query Parameters
sortDirection
string
Example: sortDirection=name

Sorts the results of your query by indicated direction asc or desc.

+
before
string <date-time>
Example: before=2022-09-15T07:47:19Z

Returns events before passed date.

+
after
string <date-time>
Example: after=2022-09-15T07:47:19Z

Returns events after passed date.

+
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.
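
The query parameters above can be assembled into a query string as follows. This is a sketch, not a client: the function name is hypothetical, the endpoint path is deliberately left out because it does not appear in this text, and the `asc`/`desc` check follows the parameter description rather than the `sortDirection=name` example.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode


def _iso_utc(ts: datetime) -> str:
    """Format a timezone-aware datetime like the documented examples (2022-09-15T07:47:19Z)."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def events_query(
    sort_direction: str = "desc",
    before: Optional[datetime] = None,
    after: Optional[datetime] = None,
    limit: int = 100,
    offset: int = 0,
) -> str:
    """Build the query string for listing events; defaults follow the documented ones."""
    if sort_direction not in ("asc", "desc"):
        raise ValueError("sortDirection must be 'asc' or 'desc'")
    params = {"sortDirection": sort_direction}
    if before is not None:
        params["before"] = _iso_utc(before)
    if after is not None:
        params["after"] = _iso_utc(after)
    params["limit"] = limit
    params["offset"] = offset
    return urlencode(params)
```

Note that `urlencode` percent-encodes the colons in the timestamps, which is the usual form for a date-time value in a query string.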

+

Responses

Response samples

Content type
application/json
{}

Sources

Create a source Deprecated

Creates a new source object. A source is the physical location of a dataset such as a table in PostgreSQL, or topic in Kafka. A source enables the grouping of physical datasets to their physical source.

+
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

+
Request Body schema: application/json
type
required
string

The type of the source.

+
connectionUrl
required
string <URL>

The URL to the location of the source.

+
description
string

The description of the source.

+

Responses

Request samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}
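A request for this endpoint needs a source name in the path and a JSON body with the required `type` and `connectionUrl` fields. The sketch below builds both; the `/sources/{source}` path shape and the helper names are assumptions, since the path is not shown in this section.

```python
import json
from typing import Optional
from urllib.parse import quote


def source_path(name: str) -> str:
    """Return a URL path for a source (hypothetical path shape)."""
    if not name or len(name) > 1024:
        raise ValueError("source name must be 1 to 1024 characters")
    # Percent-encode everything, including '/', so the name stays one segment.
    return "/sources/" + quote(name, safe="")


def source_body(type_: str, connection_url: str,
                description: Optional[str] = None) -> str:
    """Serialize the request body; type and connectionUrl are required."""
    if not type_ or not connection_url:
        raise ValueError("type and connectionUrl are required")
    body = {"type": type_, "connectionUrl": connection_url}
    if description is not None:
        body["description"] = description
    return json.dumps(body)


path = source_path("my-source")
body = source_body("POSTGRESQL", "jdbc:postgresql://db.example.com/mydb",
                   "My first source!")
```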

Retrieve a source

Returns a source.

+
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

+

Responses

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

List all sources

Returns a list of sources.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "sources": [
    ]
}

Datasets

Create a dataset Deprecated

Creates a new dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
Request Body schema: application/json
Any of
type
required
string
Value: "DB_TABLE"

The type of the dataset.

+
physicalName
required
string

The physical name of the table.

+
sourceName
required
string

The name of the source associated with the table.

+
required
Array of objects

The fields of the table.

+
tags
Array of strings

List of tags.

+
description
string

The description of the table.

+
runId
string

The ID associated with the run modifying the table.

+

Responses

Request samples

Content type
application/json
Example
{
  • "type": "DB_TABLE",
  • "physicalName": "public.mytable",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "description": "My first dataset!"
}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a dataset

Returns a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Soft delete a dataset

Soft deletes a dataset. It is undeleted if a new OpenLineage event containing this dataset is received.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a version for a dataset

Returns a version for a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "version": "d224dac0-35d7-4d9b-bbbe-6fff1a8485ad",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "description": "My first dataset!",
  • "createdByRun": {
    }
}

List all versions for a dataset

Returns a list of versions for a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

List all datasets

Returns a list of datasets.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "datasets": [
    ],
  • "totalCount": 0
}
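Because the response carries `totalCount` next to the page of results, a client can advance `offset` until every item has been seen. The sketch below assumes that `totalCount` is the total number of matching datasets rather than the size of the current page. `iter_pages` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real HTTP call.

```python
from typing import Callable, Dict, Iterator


def iter_pages(fetch: Callable[[int, int], Dict], key: str,
               limit: int = 100) -> Iterator[dict]:
    """Yield every item by advancing offset until totalCount items are seen."""
    offset = 0
    while True:
        page = fetch(limit, offset)
        items = page[key]
        yield from items
        offset += len(items)
        # Stop on an empty page or once the reported total has been reached.
        if not items or offset >= page["totalCount"]:
            break


def fake_fetch(limit: int, offset: int) -> Dict:
    """Stand-in for an HTTP request: serves slices of an in-memory list."""
    data = [{"name": f"dataset-{i}"} for i in range(5)]
    return {"datasets": data[offset:offset + limit], "totalCount": len(data)}


names = [d["name"] for d in iter_pages(fake_fetch, "datasets", limit=2)]
```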

Tag a dataset

Tag an existing dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
tag
required
string
Example: SENSITIVE

The name of the tag.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Tag a field

Tag an existing field of a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
field
required
string
Example: my_field

The name of the field.

+
tag
required
string
Example: SENSITIVE

The name of the tag.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Jobs

Create a job Deprecated

Creates a new job object. All job objects are immutable and are uniquely identified by a generated ID. Marquez will create a version of a job each time the contents of the object are modified. For example, the location of a job may change over time, resulting in new versions. The accumulated versions can be listed, used to rerun a specific job version, or possibly help debug a failed job run.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
Request Body schema: application/json
object

The ID of the job.

+
type
required
string (JobType)
Enum: "BATCH" "STREAM" "SERVICE"

The type of the job.

+
required
Array of objects (DatasetId) unique

The set of input datasets.

+
required
Array of objects (DatasetId) unique

The set of output datasets.

+
location
string <URL>

The URL of the job source code or artifact.

+
context
object
Deprecated

A key/value pair that must be of type string. A context can be used for getting additional details about the job.

+
description
string

The description of the job.

+
runId
string

An optional run ID used to associate a job version to an existing job run.

+

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a job

Retrieve a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Soft delete a job

Soft deletes a job. It is undeleted if a new OpenLineage event containing this job is received.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

List all jobs

Returns a list of jobs.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "jobs": [
    ],
  • "totalCount": 0
}

Retrieve a version for a job

Returns a version for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "version": "56472c57-a2ef-4218-b7b7-d2af02a343fd",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "facets": { }
}

List all versions for a job

Returns a list of versions for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

Create a run Deprecated

Creates a new run object for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
Request Body schema: application/json
id
string <uuid>

An optional user-provided unique ID of the run. A run ID must be a UUID. If an ID for the run is not provided, a random UUID will be generated for the given run.

+
nominalStartTime
string <date-time>

An ISO-8601 timestamp representing the nominal start time of the run.

+
nominalEndTime
string <date-time>

An ISO-8601 timestamp representing the nominal end time of the run.

+
args
object

The arguments of the run.

+

Responses

Request samples

Content type
application/json
{
  • "args": {
    }
}

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}
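Every field in the run request body is optional, so a client only needs to validate what it sends. A minimal sketch, with the hypothetical helper `run_body`: a supplied `id` is checked to be a UUID, and an omitted `id` is left out so that one is generated for the run, as described above.

```python
import json
import uuid
from datetime import datetime
from typing import Optional


def run_body(run_id: Optional[str] = None,
             nominal_start: Optional[datetime] = None,
             nominal_end: Optional[datetime] = None,
             args: Optional[dict] = None) -> str:
    """Serialize a run request body, including only the fields provided."""
    body = {}
    if run_id is not None:
        # uuid.UUID raises ValueError when the string is not a valid UUID.
        body["id"] = str(uuid.UUID(run_id))
    if nominal_start is not None:
        body["nominalStartTime"] = nominal_start.isoformat()  # ISO-8601
    if nominal_end is not None:
        body["nominalEndTime"] = nominal_end.isoformat()  # ISO-8601
    if args:
        body["args"] = args
    return json.dumps(body)


empty = run_body()
with_id = run_body(run_id="870492da-ecfb-4be0-91b9-9a89ddd3db90")
```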

List all runs

Returns a list of runs for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Retrieve a run

Retrieve a run.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+

Responses

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Retrieve run or job facets for a run.

Retrieve run or job facets for a run.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
type
required
string
Enum: "run" "job"

Indicates whether job or run facets should be returned.

+

Responses

Response samples

Content type
application/json

Start a run Deprecated

Marks the run as RUNNING.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Complete a run Deprecated

Marks the run as COMPLETED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
Example
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Fail a run Deprecated

Marks the run as FAILED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Abort a run Deprecated

Marks the run as ABORTED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Lineage

Record a single lineage event

Receive, process, and store lineage metadata using the OpenLineage standard.

+
Request Body schema: application/json
any (LineageEvent)

Responses

Request samples

Content type
application/json
{}

Get a lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node. A node can either be a dataset node, a dataset field node or a job node. The format of nodeId for dataset is dataset:<namespace_of_dataset>:<name_of_the_dataset>, for dataset field is datasetField:<namespace_of_dataset>:<name_of_the_dataset>:<name_of_field>, and for job is job:<namespace_of_the_job>:<name_of_the_job>.

+
depth
integer
Default: 20

Depth of lineage graph to create.

+

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}
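The three `nodeId` formats described above are plain colon-separated strings, so they can be built with small helpers. The function names below are hypothetical; only the string formats come from the parameter description.

```python
from urllib.parse import urlencode


def dataset_node_id(namespace: str, name: str) -> str:
    return f"dataset:{namespace}:{name}"


def dataset_field_node_id(namespace: str, name: str, field: str) -> str:
    return f"datasetField:{namespace}:{name}:{field}"


def job_node_id(namespace: str, name: str) -> str:
    return f"job:{namespace}:{name}"


node_id = dataset_node_id("food_delivery", "public.delivery_7_days")
# urlencode percent-encodes the colons so the ID is safe in a query string.
query = urlencode({"nodeId": node_id, "depth": 20})
```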

Get the upstream lineage for a given run

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Column lineage

Get a column lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node. A node can either be a dataset node, a dataset field node or a job node. The format of nodeId for dataset is dataset:<namespace_of_dataset>:<name_of_the_dataset>, for dataset field is datasetField:<namespace_of_dataset>:<name_of_the_dataset>:<name_of_field>, and for job is job:<namespace_of_the_job>:<name_of_the_job>.

+
depth
integer
Default: 20

Depth of lineage graph to create.

+
withDownstream
boolean
Default: false

Determines if downstream lineage should be returned.

+

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}

Tags

Create a tag

Creates a new tag object.

+
path Parameters
tag
required
string
Example: SENSITIVE

The name of the tag.

+
Request Body schema: application/json
description
string

The description of the tag.

+

Responses

Request samples

Content type
application/json
{
  • "description": "My first tag!"
}

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

List all tags

Returns a list of tags.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

Search

Query all datasets and jobs

Returns the datasets and jobs that match your query.

+
query Parameters
q
required
string
Example: q=my-dataset

Query containing the pattern to match; pattern matching for datasets and jobs is string-based and case-insensitive. Use a percent sign (%) to match any string of zero or more characters (my-job%), or an underscore (_) to match a single character (_job_).

+
filter
string
Example: filter=dataset

Filters the results of your query by dataset or job.

+
sort
string
Example: sort=name

Sorts the results of your query by name or updated_at.

+
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset.

+
namespace
string <= 1024 characters
Example: namespace=my-namespace

Match jobs or datasets within the given namespace.

+
before
string <YYYY-MM-DD>
Example: before=2022-09-15

Match jobs or datasets before YYYY-MM-DD.

+
after
string <YYYY-MM-DD>
Example: after=2022-09-15

Match jobs or datasets after YYYY-MM-DD.

+

Responses

Response samples

Content type
application/json
{
  • "totalCount": 1,
  • "results": [
    ]
}
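The wildcard rules for the `q` parameter (`%` for zero or more characters, `_` for exactly one, case-insensitive) can be illustrated by translating a pattern into a regular expression. This sketch only demonstrates what the wildcards mean. It assumes the pattern must match the whole name, which this section does not state, and it is not how the server evaluates queries.

```python
import re


def like_to_regex(pattern: str):
    """Translate a %/_ wildcard pattern into a compiled, case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any string of zero or more characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def matches(pattern: str, name: str) -> bool:
    # Assumption: the pattern has to cover the entire name.
    return like_to_regex(pattern).fullmatch(name) is not None


prefix_hit = matches("my-job%", "My-Job-42")
single_hit = matches("_job_", "ajobs")
single_miss = matches("_job_", "job")
```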