feat: new documentation structure + other issues (#168)
* add headers and refactor

* remove unused images

* different structure

* add link to conduit platform

* tweak css

* style differently the first element

* fix menu items on small viewports

* disable until we finish so we have preview deploys

* fix flickering

* another iteration

* leave sidebar tidy

* structure pipeline pages

* update header nav bar

* document pipeline statuses

* document CLI flags

* structure finished

* add link to opencdc

* consistent formatting

* better urls

* update some links and rename pages

* update links

* fix more broken links

* fix redirect

* finish broken links

* remove conclusion

* fix statuses definition

* update again

* update links

* add some redirects

* tweak implication

* update specs

* sort redirects

* fix redirect

* fix broken links
raulb authored Oct 21, 2024
1 parent d64fd02 commit c0416f5
Showing 140 changed files with 526 additions and 438 deletions.
2 changes: 1 addition & 1 deletion changelog/2024-03-24-conduit-0-9-0-release.md
Original file line number Diff line number Diff line change
@@ -19,5 +19,5 @@ Revolutionize your data processing with [**Conduit v0.9**](https://github.com/Co
- **Getting Started Guide**: A user-friendly guide is available to help new users set up Conduit and explore the latest features quickly.

:::tip
-For an in-depth look at how the enhanced processors can transform your data processing workflows, check out our [blog post](https://meroxa.com/blog/introducing-conduit-0.9-revolutionizing-data-processing-with-enhanced-processors/), and visit our [Processors documentation page](/docs/processors).
+For an in-depth look at how the enhanced processors can transform your data processing workflows, check out our [blog post](https://meroxa.com/blog/introducing-conduit-0.9-revolutionizing-data-processing-with-enhanced-processors/), and visit our [Processors documentation page](/docs/using/processors/getting-started).
:::
2 changes: 1 addition & 1 deletion changelog/2024-08-19-conduit-0-11-0-release.md
@@ -16,5 +16,5 @@ We’re thrilled to announce the release of [**Conduit v0.11**](https://github.c
- **Enhanced Transformation Capabilities:** Easily transform data as it flows through your pipelines, making integration smoother and more efficient.

:::tip
-For an in-depth look at how these new features can elevate your data integration processes, check out our [blog post](https://meroxa.com/blog/conduit-v0.11-unveils-powerful-schema-support-for-enhanced-data-integration/), our [Schema Support documentation page](/docs/features/schema-support).
+For an in-depth look at how these new features can elevate your data integration processes, check out our [blog post](https://meroxa.com/blog/conduit-v0.11-unveils-powerful-schema-support-for-enhanced-data-integration/), our [Schema Support documentation page](/docs/using/other-features/schema-support).
:::
2 changes: 1 addition & 1 deletion changelog/2024-10-10-conduit-0-12-0-release.md
@@ -16,5 +16,5 @@ We’re excited to announce the release of [**Conduit v0.12.0**](https://github.
- **Smart Retry Management:** Limits on retries prevent indefinite restarts, keeping your pipelines efficient and reliable.

:::tip
-For a detailed overview of how Pipeline Recovery works and its benefits, check out our [blog post](https://meroxa.com/blog/unlocking-resilience:-conduit-v0.12.0-introduces-pipeline-recovery/), or our documentation for [Pipeline Recovery](/docs/features/pipeline-recovery) and learn how to make your data streaming experience smoother than ever!
+For a detailed overview of how Pipeline Recovery works and its benefits, check out our [blog post](https://meroxa.com/blog/unlocking-resilience:-conduit-v0.12.0-introduces-pipeline-recovery/), or our documentation for [Pipeline Recovery](/docs/using/other-features/pipeline-recovery) and learn how to make your data streaming experience smoother than ever!
:::
4 changes: 2 additions & 2 deletions changelog/2024-10-15-pipelines-exit-on-degraded.md
@@ -19,7 +19,7 @@ $ conduit --help
...
```
-If you were using a [Conduit Configuration file](/docs/features/configuration) this should look like:
+If you were using a [Conduit Configuration file](/docs/configuration#configuration-file) this should look like:
```yaml title="conduit.yaml"
# ...
@@ -28,7 +28,7 @@ pipelines:
# ...
```
-Previously, this functionality was handled by `pipelines.exit-on-error`. However, with the introduction of [Pipeline Recovery](/docs/features/pipeline-recovery), the old description no longer accurately reflected the behavior, as a pipeline may not necessarily exit even in the presence of an error.
+Previously, this functionality was handled by `pipelines.exit-on-error`. However, with the introduction of [Pipeline Recovery](/docs/using/other-features/pipeline-recovery), the old description no longer accurately reflected the behavior, as a pipeline may not necessarily exit even in the presence of an error.
:::warning
The previous flag `pipelines.exit-on-error` will still be valid but is now hidden. We encourage all users to transition to `pipelines.exit-on-degraded` for improved clarity and functionality.
28 changes: 14 additions & 14 deletions docs/introduction.mdx → docs/0-what-is/0-introduction.mdx
@@ -3,7 +3,7 @@ sidebar_position: 0
hide_title: true
title: 'Introduction'
sidebar_label: "Introduction"
-slug: /
+slug: '/'
---

<img
@@ -12,23 +12,23 @@ slug: /
src="/img/conduit/on-white-conduit-logo.png"
/>

-Conduit is a data integration tool for software engineers. Its purpose is to
+Conduit is a data integration tool for software engineers, powered by [Meroxa](https://meroxa.io). Its purpose is to
help you move data from A to B. You can use Conduit to send data from Kafka to
Postgres, between files and APIs,
-between [supported connectors](/docs/connectors/connector-list),
-and [any datastore you can build a plugin for](/docs/connectors/building-connectors/).
+between [supported connectors](/docs/using/connectors/list),
+and [any datastore you can build a plugin for](/docs/developing/connectors/).

It's written in [Go](https://go.dev/), compiles to a binary, and is designed to
-be easy to use and [deploy](/docs/getting-started/installing-and-running?option=binary).
+be easy to use and [deploy](/docs/installing-and-running?option=binary).

Out of the box, Conduit comes with:

- A UI
- Common connectors
- Processors
- Observability
- Schema Support

-In this getting started guide we'll use a pre-built binary, but Conduit can also be run using [Docker](/docs/getting-started/installing-and-running?option=docker).
+In this getting started guide we'll use a pre-built binary, but Conduit can also be run using [Docker](/docs/installing-and-running?option=docker).

## Some of its features

@@ -49,7 +49,7 @@ allows your data applications to act upon those changes in real-time.
Conduit connectors are plugins that communicate with Conduit via a gRPC
interface. This means that plugins can be written in any language as long as
they conform to the required interface. Check out
-our [connector docs](/docs/connectors)!
+our [connector docs](/docs/using/connectors/getting-started)!

## Installing

@@ -63,7 +63,7 @@ curl https://conduit.io/install.sh | bash

If you're not using macOS or Linux system, you can still install Conduit
following one of the different options provided
-in [our installation page](/docs/getting-started/installing-and-running).
+in [our installation page](/docs/installing-and-running).

## Starting Conduit
Now that we have Conduit installed let's start it up to see what happens.
@@ -116,7 +116,7 @@ Now that we have Conduit up and running you can now navigate to `http://localhos
![Conduit Pipeline](/img/conduit/pipeline.png)

## Building a pipeline
-While you can provision pipelines via Conduit's UI, the recommended way to do so is using a [pipeline configuation file](/docs/pipeline-configuration-files/getting-started).
+While you can provision pipelines via Conduit's UI, the recommended way to do so is using a [pipeline configuration file](/docs/using/pipelines/configuration-file).

For this example we'll create a pipeline that will move data from one file to another.

@@ -267,9 +267,9 @@ Congratulations! You've pushed data through your first Conduit pipeline.
Looking for more examples? Check out the examples in our [repo](https://github.com/ConduitIO/conduit/tree/main/examples).

Now that you've got the basics of running Conduit and creating a pipeline covered. Here are a few places to dive in deeper:
-- [Connectors](/docs/connectors/getting-started)
-- [Pipelines](/docs/pipeline-configuration-files/getting-started)
-- [Processors](/docs/processors/getting-started)
-- [Conduit Architecture](/docs/getting-started/architecture)
+- [Connectors](/docs/using/connectors/getting-started)
+- [Pipelines](/docs/using/pipelines/configuration-file)
+- [Processors](/docs/using/processors/getting-started)
+- [Conduit Architecture](/docs/core-concepts/architecture)

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
@@ -1,6 +1,6 @@
---
title: "Conduit Architecture"
-sidebar_position: 2
+slug: '/core-concepts/architecture'
---

Here is an overview of the internal Conduit Architecture.
@@ -93,14 +93,14 @@ as soon as possible without draining the pipeline.

This layer is used directly by the [Orchestration layer](#orchestration-layer) and indirectly by the [Core layer](#core-layer), and [Schema registry service](#schema-registry-service) (through stores) to persist data. It provides the functionality of creating transactions and storing, retrieving and deleting arbitrary data like configurations or state.

-More information on [storage](/docs/features/storage).
+More information on [storage](/docs/using/other-features/storage).

## Connector utility services

### Schema registry service

The schema service is responsible for managing the schema of the records that flow through the pipeline. It provides functionality to infer a schema from a record. The schema is stored in the schema store and can be referenced by connectors and processors. By default, Conduit provides a built-in schema registry, but this service can be run separately from Conduit.

-More information on [Schema Registry](/docs/features/schema-support#schema-registry).
+More information on [Schema Registry](/docs/using/other-features/schema-support#schema-registry).

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
@@ -1,6 +1,6 @@
---
title: "Pipeline Semantics"
-sidebar_position: 6
+slug: '/core-concepts/pipeline-semantics'
---

This document describes the inner workings of a Conduit pipeline, its structure, and behavior. It also describes a
3 changes: 3 additions & 0 deletions docs/0-what-is/1-core-concepts/_category_.json
@@ -0,0 +1,3 @@
{
"label": "Core concepts"
}
38 changes: 38 additions & 0 deletions docs/0-what-is/1-core-concepts/index.mdx
@@ -0,0 +1,38 @@
---
title: "Core concepts"
slug: '/core-concepts'
---

## Pipeline

A pipeline receives records from one or multiple source connectors, pushes them through zero
or multiple processors until they reach one or multiple destination connectors.

## Connector

A connector is the internal entity that communicates with a connector plugin and either pushes
records from the plugin into the pipeline (source connector) or the other way around
(destination connector).

## Connector plugin

Sometimes also referred to as "plugin", is an external process which communicates with Conduit
and knows how to read/write records from/to a data source/destination (e.g. a database).

## Processor

A component that executes an operation on a single record that flows through the pipeline.
It can either change the record or filter it out based on some criteria.
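The pipeline and processor semantics described above can be sketched in a few lines (an illustrative model only, not Conduit's actual Go API):

```python
def run_pipeline(source, processors, destination):
    # Records flow from the source, through each processor in order,
    # into the destination. A processor may transform a record or
    # filter it out (modeled here by returning None).
    for record in source:
        for process in processors:
            record = process(record)
            if record is None:  # record filtered out
                break
        else:
            destination.append(record)

out = []
run_pipeline(
    [1, 2, 3, 4],
    [lambda r: r * 10, lambda r: r if r >= 20 else None],
    out,
)
print(out)  # [20, 30, 40]
```

Real Conduit pipelines connect source and destination connectors rather than in-memory lists, but the shape of the data flow is the same.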

## OpenCDC Record

A record represents a single piece of data that flows through a pipeline (e.g. one database row).
[More info here](/docs/using/opencdc-record).

## Collection

A generic term used in Conduit to describe an entity in a 3rd party system from which records
are read from or to which records they are written to. Examples are: topics (in Kafka), tables
(in a database), indexes (in a search engine), collections (in NoSQL databases), etc.

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
@@ -1,7 +1,8 @@
---
-title: 'Getting Started with Pipeline Configuration Files'
+title: 'Getting Started'
+sidebar_label: "Getting Started"
-sidebar_position: 0
+slug: '/getting-started'
---

Pipeline configuration files give you the ability to define pipelines that are
@@ -13,7 +14,7 @@ configurations.

:::tip

-In our [Conduit repository](https://github.com/ConduitIO/conduit), you can find [more examples](https://github.com/ConduitIO/conduit/tree/main/examples/pipelines), but to ilustrate a simple use case we'll show a pipeline using a file as a source, and another file as a destination. Check out the different [specifications](/docs/pipeline-configuration-files/specifications) to see the different configuration options.
+In our [Conduit repository](https://github.com/ConduitIO/conduit), you can find [more examples](https://github.com/ConduitIO/conduit/tree/main/examples/pipelines), but to illustrate a simple use case we'll show a pipeline using a file as a source, and another file as a destination. Check out the different [specifications](/docs/using/pipelines/configuration-file) to see the different configuration options.

:::

@@ -1,7 +1,7 @@
---
title: "Installing and running"
-sidebar_position: 0
hide_table_of_contents: true
+slug: '/installing-and-running'
---

import Tabs from '@theme/Tabs';
@@ -155,11 +155,11 @@ You should now be able to interact with the Conduit UI and HTTP API on port 8080
## Next Steps

Now that you have Conduit installed you can
-learn [how to build a pipeline](/docs/how-to/build-generator-to-log-pipeline).
+learn [how to get started](/docs/getting-started).
You can also explore some other topics, such as:

-- [Pipelines](/docs/pipeline-configuration-files/getting-started)
-- [Connectors](/docs/connectors/getting-started)
-- [Processors](/docs/processors/getting-started)
+- [Pipelines](/docs/using/pipelines/configuration-file)
+- [Connectors](/docs/using/connectors/getting-started)
+- [Processors](/docs/using/processors/getting-started)

![scarf pixel conduit-site-docs-running](https://static.scarf.sh/a.png?x-pxid=db6468a8-7998-463e-800f-58a619edd9b3)
94 changes: 94 additions & 0 deletions docs/1-using/1-configuration.mdx
@@ -0,0 +1,94 @@
---
title: 'How to configure Conduit'
sidebar_label: 'Configuration'
slug: '/configuration'
---

Conduit accepts CLI flags, environment variables and a configuration file to
configure its behavior. Each CLI flag has a corresponding environment variable
and a corresponding field in the configuration file. Conduit uses the value for
each configuration option based on the following priorities:

## CLI flags

**CLI flags** (highest priority) - if a CLI flag is provided it will always be
respected, regardless of the environment variable or configuration file. To
see a full list of available flags run `conduit --help`:


```bash
$ conduit --help
Usage of conduit:
-api.enabled
enable HTTP and gRPC API (default true)
-config string
global config file (default "conduit.yaml")
-connectors.path string
path to standalone connectors' directory (default "./connectors")
-db.badger.path string
path to badger DB (default "conduit.db")
-db.postgres.connection-string string
postgres connection string, may be a database URL or in PostgreSQL keyword/value format
-db.postgres.table string
postgres table in which to store data (will be created if it does not exist) (default "conduit_kv_store")
-db.sqlite.path string
path to sqlite3 DB (default "conduit.db")
-db.sqlite.table string
sqlite3 table in which to store data (will be created if it does not exist) (default "conduit_kv_store")
-db.type string
database type; accepts badger,postgres,inmemory,sqlite (default "badger")
-grpc.address string
address for serving the gRPC API (default ":8084")
-http.address string
address for serving the HTTP API (default ":8080")
-log.format string
sets the format of the logging; accepts json, cli (default "cli")
-log.level string
sets logging level; accepts debug, info, warn, error, trace (default "info")
-pipelines.error-recovery.backoff-factor int
backoff factor applied to the last delay (default 2)
-pipelines.error-recovery.max-delay duration
maximum delay before restart (default 10m0s)
-pipelines.error-recovery.max-retries int
maximum number of retries (default -1)
-pipelines.error-recovery.max-retries-window duration
amount of time running without any errors after which a pipeline is considered healthy (default 5m0s)
-pipelines.error-recovery.min-delay duration
minimum delay before restart (default 1s)
-pipelines.exit-on-degraded
exit Conduit if a pipeline enters a degraded state
-pipelines.path string
path to the directory that has the yaml pipeline configuration files, or a single pipeline configuration file (default "./pipelines")
-processors.path string
path to standalone processors' directory (default "./processors")
-schema-registry.confluent.connection-string string
confluent schema registry connection string
-schema-registry.type string
schema registry type; accepts builtin,confluent (default "builtin")
-version
prints current Conduit version
```
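The `pipelines.error-recovery.*` flags above suggest an exponential backoff between pipeline restarts. A minimal sketch, assuming each retry multiplies the previous delay by `backoff-factor` up to `max-delay` (the exact formula Conduit uses may differ):

```python
def restart_delay(attempt, min_delay=1.0, max_delay=600.0, backoff_factor=2):
    # attempt 0 waits min-delay (1s by default); each further attempt
    # multiplies the previous delay by backoff-factor, capped at
    # max-delay (10m by default). Delays are in seconds.
    return min(max_delay, min_delay * backoff_factor ** attempt)

print([restart_delay(a) for a in range(5)])  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

With `max-retries` left at its default of -1, retries continue indefinitely, which is why `max-retries-window` exists to reset the counter once a pipeline has run cleanly for a while.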

## Environment variables

**Environment variables** (lower priority) - an environment variable is only
used if no CLI flag is provided for the same option. Environment variables
have the prefix `CONDUIT` and contain underscores instead of dots and
hyphens (e.g. the flag `-db.postgres.connection-string` corresponds
to `CONDUIT_DB_POSTGRES_CONNECTION_STRING`).
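That naming rule can be captured in a small helper (hypothetical, shown only to make the mapping concrete):

```python
def flag_to_env(flag: str) -> str:
    # Drop the leading dash, replace dots and hyphens with underscores,
    # uppercase, and add the CONDUIT_ prefix.
    name = flag.lstrip("-").replace(".", "_").replace("-", "_")
    return "CONDUIT_" + name.upper()

print(flag_to_env("-db.postgres.connection-string"))
# CONDUIT_DB_POSTGRES_CONNECTION_STRING
```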

## Configuration file

**Configuration file** (lowest priority) - Conduit by default loads the
file `conduit.yaml` placed in the same folder as Conduit. The path to the file
can be customized using the CLI flag `-config`. It is not required to provide
a configuration file and any value in the configuration file can be overridden
by an environment variable or a flag. The file content should be a YAML
document where keys can be hierarchically split on `.`. For example:

```yaml
db:
type: postgres # corresponds to flag -db.type and env variable CONDUIT_DB_TYPE
postgres:
connection-string: postgres://localhost:5432/conduitdb # -db.postgres.connection-string or CONDUIT_DB_POSTGRES_CONNECTION_STRING
```
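The precedence rules this page describes (flag over environment variable over configuration file over built-in default) can be summarized as an illustrative sketch:

```python
def resolve(flag=None, env=None, file=None, default=None):
    # A CLI flag beats an environment variable, which beats the
    # config file, which beats the built-in default.
    for value in (flag, env, file):
        if value is not None:
            return value
    return default

print(resolve(env="postgres", file="sqlite", default="badger"))  # postgres
```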
@@ -1,6 +1,5 @@
---
title: 'OpenCDC record'
-sidebar_position: 4
---

An OpenCDC record in Conduit aims to standardize the format of data records
@@ -130,7 +129,7 @@ the `opencdc.StructuredData` type.

The supported data types for values in `opencdc.StructuredData` depend on following:
- connector or processor type (built-in or standalone)
-- [schema support](/docs/features/schema-support) (enabled or disabled).
+- [schema support](/docs/using/other-features/schema-support) (enabled or disabled).

In built-in connectors, the field values can be of any Go type, given that
there's no (de)serialization involved.
@@ -357,7 +356,7 @@ The version of the destination plugin that has written the record.
```

### `conduit.dlq.nack.error`
-Contains the error that caused a record to be nacked and pushed to the [dead-letter queue (DLQ)](/docs/features/dead-letter-queue).
+Contains the error that caused a record to be nacked and pushed to the [dead-letter queue (DLQ)](/docs/using/other-features/dead-letter-queue).

### `conduit.dlq.nack.node.id`
The ID of the internal node that nacked the record.