doc: Change readme and standalone docker quick start #14002

Merged 3 commits on Apr 25, 2023
2 changes: 1 addition & 1 deletion .asf.yaml
@@ -16,7 +16,7 @@
#

github:
description: Apache DolphinScheduler is the modern data workflow orchestration platform with powerful user interface, dedicated to solving complex task dependencies in the data pipeline and providing various types of jobs available `out of the box`
description: Apache DolphinScheduler is the modern data orchestration platform. Agile to create high-performance workflows with low code
homepage: https://dolphinscheduler.apache.org/
labels:
- cloud-native
23 changes: 23 additions & 0 deletions CONTRIBUTING.md
@@ -1 +1,24 @@
Please refer to the contribution document [How to contribute](docs/docs/en/contribute/join/contribute.md)

## How to Build

```bash
./mvnw clean install -Prelease
```

### Build with different Zookeeper versions

The default Zookeeper Server version supported is 3.8.0.
```bash
# Default Zookeeper 3.8.0
./mvnw clean install -Prelease
# Build with support for Zookeeper 3.4.6+
./mvnw clean install -Prelease -Dzk-3.4
```
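
As a rough illustration of the two build variants above, the following sketch picks the Maven flag from a target Zookeeper server version. The helper name `zk_build_flag` and the `ZK_SERVER_VERSION` variable are hypothetical, and treating every 3.x release below 3.8 as needing `-Dzk-3.4` is an assumption; the text above only states that 3.8.0 is the default and that `-Dzk-3.4` targets 3.4.6+.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the extra Maven flag for a given Zookeeper
# server version, or nothing when the default build is assumed to apply.
zk_build_flag() {
  local version="$1"
  local major="${version%%.*}"   # text before the first dot
  local rest="${version#*.}"     # text after the first dot
  local minor="${rest%%.*}"      # second version component
  # Assumption: any 3.x release older than 3.8 uses the zk-3.4 build.
  if [ "$major" -eq 3 ] && [ "$minor" -lt 8 ]; then
    printf '%s\n' "-Dzk-3.4"
  fi
}

# Print the build command instead of running it, so the sketch is safe to try.
ZK_SERVER_VERSION="${ZK_SERVER_VERSION:-3.8.0}"
echo "./mvnw clean install -Prelease $(zk_build_flag "$ZK_SERVER_VERSION")"
```

Running it with `ZK_SERVER_VERSION=3.4.14` set in the environment should print the second command from the block above.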

Artifact:

```
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-bin.tar.gz: Binary package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-src.tar.gz: Source code package of DolphinScheduler
```
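
Because `${latest.release.version}` depends on the checked-out source, one way to locate the packages after a build is to glob for them. This is a minimal sketch, assuming the build has already been run from the repository root; `list_artifacts` is a hypothetical helper name, not part of the project.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the binary and source tarballs found in the
# dist target directory (defaults to the path named in the text above).
list_artifacts() {
  local target_dir="${1:-dolphinscheduler-dist/target}"
  local f
  for f in "$target_dir"/apache-dolphinscheduler-*-bin.tar.gz \
           "$target_dir"/apache-dolphinscheduler-*-src.tar.gz; do
    # An unmatched glob stays literal, so check that the file really exists.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}

list_artifacts
```

If nothing is printed, the build has probably not produced the packages yet, or the script was run from a different directory.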
118 changes: 40 additions & 78 deletions README.md
@@ -1,105 +1,70 @@
Dolphin Scheduler Official Website
[dolphinscheduler.apache.org](https://dolphinscheduler.apache.org)
==================================================================
# Apache DolphinScheduler

[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![codecov](https://codecov.io/gh/apache/dolphinscheduler/branch/dev/graph/badge.svg)]()
![codecov](https://codecov.io/gh/apache/dolphinscheduler/branch/dev/graph/badge.svg)
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=apache-dolphinscheduler&metric=alert_status)](https://sonarcloud.io/dashboard?id=apache-dolphinscheduler)
[![Twitter Follow](https://img.shields.io/twitter/follow/dolphinschedule.svg?style=social&label=Follow)](https://twitter.com/dolphinschedule)
[![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](https://s.apache.org/dolphinscheduler-slack)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)

## Features
## About

Apache DolphinScheduler is the modern data workflow orchestration platform with powerful user interface, dedicated to solving complex task dependencies in the data pipeline and providing various types of jobs available `out of the box`
Apache DolphinScheduler is a modern data orchestration platform for building high-performance workflows quickly with low code. It also provides a powerful user interface,
is dedicated to solving complex task dependencies in the data pipeline, and offers various types of jobs **out of the box**.

The key features of DolphinScheduler are as follows:

- Easy to deploy, we provide 4 ways to deploy, such as Standalone deployment, Cluster deployment, Docker / Kubernetes deployment and Rainbond deployment
- Easy to use, there are four ways to create workflows:

- Visually, create tasks by dragging and dropping tasks
- [PyDolphinScheduler](https://dolphinscheduler.apache.org/python/main/index.html), Creating workflows via Python API, aka workflow-as-code
- YAML definition, mapping YAML into a workflow (currently requires installing PyDolphinScheduler)
- Open API, Creating workflows

- Highly Reliable,
DolphinScheduler uses a decentralized multi-master and multi-worker architecture, which naturally supports horizontal scaling and high availability
- Easy to deploy, with four deployment options: Standalone, Cluster, Docker, and Kubernetes.
- Easy to use, workflows can be created and managed in four ways, including Web UI, [Python SDK](https://dolphinscheduler.apache.org/python/main/index.html), YAML file, and Open API
- Highly reliable and highly available, with a decentralized multi-master and multi-worker architecture that natively supports horizontal scaling.
- High performance, N times faster than other orchestration platforms, supporting tens of millions of tasks per day
- Supports multi-tenancy
- Supports various task types: Shell, MR, Spark, SQL (MySQL, OceanBase, PostgreSQL, Hive, Spark SQL), Python, Procedure, Sub_Workflow,
Http, K8s, Jupyter, MLflow, SageMaker, DVC, Pytorch, Amazon EMR, etc
- Orchestrating workflows and dependencies, you can pause/stop/recover task any time, failed tasks can be set to automatically retry
- Visualizing the running state of the task in real-time and seeing the task runtime log
- What you see is what you get when you edit the task on the UI
- Backfill can be operated on the UI directly
- Perfect project, resource, data source-level permission control
- Displaying workflow history in tree/Gantt chart, as well as statistical analysis on the task status & process status in each workflow
- Supports internationalization
- Cloud Native, DolphinScheduler supports orchestrating multi-cloud/data center workflow, and
supports custom task type
- More features waiting for partners to explore

## User Interface Screenshots

![dag](./images/en_US/dag.png)
<img width="1100" src="https://user-images.githubusercontent.com/15833811/197348110-1653ea32-ce07-436c-a0b8-6ac1af80aea5.png">
![data-source](./images/en_US/data-source.png)
![home](./images/en_US/home.png)
![master](./images/en_US/master.png)
![workflow-tree](./images/en_US/workflow-tree.png)
- Cloud Native, DolphinScheduler supports orchestrating multi-cloud/data center workflows and supports custom task types
- Versioning of both workflows and workflow instances (including tasks)
- Various state controls for workflows and tasks, with support for pausing/stopping/recovering them at any time
- Multi-tenancy support
- Others, such as backfill support (native in the Web UI) and permission control covering projects, resources, and data sources

## QuickStart in Docker
## QuickStart

Please refer the official website document: [QuickStart in Docker](https://dolphinscheduler.apache.org/en-us/docs/3.1.2/guide/start/docker)
- For a quick experience
  - [Start with standalone](https://dolphinscheduler.apache.org/en-us/docs/3.1.5/guide/installation/standalone)
  - [Start with Docker](https://dolphinscheduler.apache.org/en-us/docs/3.1.5/guide/start/docker)
- For Kubernetes
  - [Start with Kubernetes](https://dolphinscheduler.apache.org/en-us/docs/3.1.5/guide/installation/kubernetes)
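
For a sense of what the Docker quick start involves, here is a minimal sketch. The image name `apache/dolphinscheduler-standalone-server`, the published ports, and the container name are assumptions based on the 3.1.x Docker guide linked above, so check that guide before relying on them. `standalone_cmd` is a hypothetical helper that only prints the command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the docker command for a standalone server of
# the given version. Image name and ports are assumptions (see lead-in).
standalone_cmd() {
  local version="$1"
  printf 'docker run --name dolphinscheduler-standalone-server -p 12345:12345 -p 25333:25333 -d apache/dolphinscheduler-standalone-server:%s\n' "$version"
}

# Default to the version used in the links above; override via the environment.
DOLPHINSCHEDULER_VERSION="${DOLPHINSCHEDULER_VERSION:-3.1.5}"
standalone_cmd "$DOLPHINSCHEDULER_VERSION"
```

Printing the command first makes it easy to review the version tag and port mappings; once they look right, the printed line can be pasted into a shell to start the container.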

## QuickStart in Kubernetes
## User Interface Screenshots

Please refer to the official website document: [QuickStart in Kubernetes](https://dolphinscheduler.apache.org/en-us/docs/3.1.2/guide/installation/kubernetes)
* **Homepage:** Project and workflow overview, including the latest workflow instance and task instance status statistics.
![home](images/home.png)

## How to Build
* **Workflow Definition:** Create and manage workflows by drag and drop, making complex workflows easy to build and maintain, with a [wide range of task types](https://dolphinscheduler.apache.org/en-us/docs/3.1.5/introduction-to-functions_menu/task_menu) supported out of the box.
![workflow-definition](images/workflow-definition.png)

```bash
./mvnw clean install -Prelease
```
* **Workflow Tree View:** An abstract tree structure gives a clearer understanding of the relationships between tasks.
![workflow-tree](images/workflow-tree.png)

### Build with different Zookeeper versions
* **Data source:** Manage multiple external data sources and provide unified data access for MySQL, PostgreSQL, Hive, Trino, and others.
![data-source](images/data-source.png)

The default Zookeeper Server version supported is 3.8.0.
```bash
# Default Zookeeper 3.8.0
./mvnw clean install -Prelease
# Support to Zookeeper 3.4.6+
./mvnw clean install -Prelease -Dzk-3.4
```
* **Monitor:** View the status of the master, worker, and database in real time, including server resource usage and load, for a quick health check without logging in to the server.
![monitor](images/monitor.png)

Artifact:
## Suggestions & Bug Reports

```
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-bin.tar.gz: Binary package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-src.tar.gz: Source code package of DolphinScheduler
```
Follow [this guide](https://github.com/apache/dolphinscheduler/issues/new/choose) to report your suggestions or bugs.

## Get Help
## Contributing

1. Submit an [issue](https://github.com/apache/dolphinscheduler/issues/new/choose)
2. [Join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#general`
3. Send email to [email protected] or [email protected]
The community welcomes all contributions. See [How to contribute](docs/docs/en/contribute/join/contribute.md) to find out more;
if you are new to DolphinScheduler, you can find good first issues [here](https://github.com/apache/dolphinscheduler/contribute).

## Community

You are very welcome to communicate with the developers and users of Dolphin Scheduler. There are two ways to find them:
You are welcome to join the Apache DolphinScheduler community in the following ways:

1. Join the Slack channel [Slack](https://asf-dolphinscheduler.slack.com/)
2. Follow the [Twitter account of DolphinScheduler](https://twitter.com/dolphinschedule) and get the latest news on time

## How to Contribute

The community welcomes everyone to contribute, please refer to this page to find out more: [How to contribute](docs/docs/en/contribute/join/contribute.md).

## Thanks

DolphinScheduler is based on a lot of excellent open-source projects, such as Google guava, grpc, netty, quartz, and many open-source projects of Apache and so on.
We would like to express our deep gratitude to all the open-source projects used in DolphinScheduler. We hope that we are not only the beneficiaries of open-source, but also give back to the community. Besides, we hope everyone who have the same enthusiasm and passion for open source could join in and contribute to the open-source community
- Join the [DolphinScheduler Slack](https://s.apache.org/dolphinscheduler-slack) to keep in touch with the community
- Follow the [DolphinScheduler Twitter](https://twitter.com/dolphinschedule) and get the latest news
- Subscribe to the DolphinScheduler mailing lists: [email protected] for users and [email protected] for developers

# Landscapes

@@ -111,6 +76,3 @@ DolphinScheduler enriches the <a href="https://landscape.cncf.io/?landscape=obse

</p >

## License

Please refer to the [LICENSE](https://github.com/apache/dolphinscheduler/blob/dev/LICENSE) file
118 changes: 46 additions & 72 deletions README_zh_CN.md
@@ -1,102 +1,76 @@
Dolphin Scheduler Official Website
[dolphinscheduler.apache.org](https://dolphinscheduler.apache.org)
==================================================================
# Apache DolphinScheduler

[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![codecov](https://codecov.io/gh/apache/dolphinscheduler/branch/dev/graph/badge.svg)]()
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=apache-dolphinscheduler&metric=alert_status)](https://sonarcloud.io/dashboard?id=apache-dolphinscheduler)
[![Twitter Follow](https://img.shields.io/twitter/follow/dolphinschedule.svg?style=social&label=Follow)](https://twitter.com/dolphinschedule)
[![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](https://s.apache.org/dolphinscheduler-slack)

[![Stargazers over time](https://starchart.cc/apache/dolphinscheduler.svg)](https://starchart.cc/apache/dolphinscheduler)

[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)
[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)

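## Design Features
## About

A distributed, easily extensible, visual DAG workflow task scheduling system. It is dedicated to solving the intricate dependencies in data processing pipelines, so that the scheduling system works `out of the box` in data processing pipelines.

Its main goals are as follows:

- Associates tasks in a DAG according to their dependencies, with real-time visual monitoring of task running status
- Supports a rich set of task types: Shell, MR, Spark, SQL (mysql, oceanbase, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc.
- Supports scheduled, dependency-based, and manual workflow scheduling, manual pause/stop/recovery, as well as failure retry/alerting, recovery from a specified failed node, killing tasks, and other operations
- Supports workflow priority, task priority, task failover, and task timeout alerting/failure
- Supports workflow global parameters and node-level custom parameters
- Supports online upload/download and management of resource files, as well as online file creation and editing
- Supports online viewing and scrolling of task logs, online log download, etc.
- Implements cluster HA, decentralizing the Master cluster and the Worker cluster through Zookeeper
- Supports online viewing of `Master/Worker` cpu load, memory, and cpu
- Supports tree/Gantt chart display of workflow run history, plus task status statistics and process status statistics
- Supports backfill
- Supports multi-tenancy
- Supports internationalization
- More features waiting for partners to explore

## User Interface Screenshots
The key features of DolphinScheduler are as follows: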

![dag](./images/zh_CN/dag.png)
![data-source](./images/zh_CN/data-source.png)
![home](./images/zh_CN/home.png)
![master](./images/zh_CN/master.png)
![workflow-tree](./images/zh_CN/workflow-tree.png)
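- Easy to deploy, with four deployment options: Standalone, Cluster, Docker, and Kubernetes.
- Easy to use, workflows can be created and managed in four ways, including Web UI, [Python SDK](https://dolphinscheduler.apache.org/python/main/index.html), YAML file, and Open API
- Highly reliable and highly available, with a decentralized multi-master and multi-worker architecture that natively supports horizontal scaling.
- High performance, N times faster than other orchestration platforms, supporting tens of millions of tasks per day
- Cloud Native, DolphinScheduler supports orchestrating multi-cloud/data center workflows and supports custom task types
- Versioning of workflows and workflow instances (including tasks)
- Various state controls for workflows and tasks, with support for pausing/stopping/recovering them at any time
- Multi-tenancy support
- Others, such as backfill support (native in the Web UI) and permission control covering projects, resources, and data sources

## Recent R&D Plan

DolphinScheduler work plan: <a href="https://github.com/apache/dolphinscheduler/projects/1" target="_blank">R&D plan</a>, where the cards under In Develop are features in development and the TODO cards are pending items (including feature ideas)

## Contributing
## QuickStart

Everyone is very welcome to contribute. For the contribution process, please refer to:
[[Contributing](docs/docs/zh/contribute/join/contribute.md)]
- For a quick experience
  - [Start with standalone](https://dolphinscheduler.apache.org/zh-cn/docs/3.1.5/guide/installation/standalone)
  - [Start with Docker](https://dolphinscheduler.apache.org/zh-cn/docs/3.1.5/guide/start/docker)
- For Kubernetes deployment
  - [Deploy on Kubernetes](https://dolphinscheduler.apache.org/zh-cn/docs/3.1.5/guide/installation/kubernetes)

## QuickStart in Docker

Please refer to the official documentation: [QuickStart with Docker deployment](https://dolphinscheduler.apache.org/zh-cn/docs/3.1.2/guide/start/docker)

## QuickStart in Kubernetes
## User Interface Screenshots

Please refer to the official documentation: [QuickStart with Kubernetes deployment](https://dolphinscheduler.apache.org/zh-cn/docs/3.1.2/guide/installation/kubernetes)
* **Homepage:** Project and workflow overview, including the latest workflow instance and task instance status statistics.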
![home](images/home.png)

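## How to Build
* **Workflow Definition:** Create and manage workflows by drag and drop, making complex workflows easy to build and maintain.
![workflow-definition](images/workflow-definition.png)

```bash
./mvnw clean install -Prelease
```
* **Workflow Tree View:** An abstract tree structure gives a clearer understanding of the relationships between tasks.
![workflow-tree](images/workflow-tree.png)

### Build with different Zookeeper dependency versions
* **Data source:** Manage multiple external data sources, such as MySQL, PostgreSQL, Hive, and Trino, and provide unified data access capabilities.
![data-source](images/data-source.png)

Zookeeper Server 3.8.0 is supported by default.
```bash
# Default Zookeeper Client 3.8.0
./mvnw clean install -Prelease
# Build with support for Zookeeper 3.4.6+
./mvnw clean install -Prelease -Dzk-3.4
```
* **Monitor:** View the status of the master, worker, and database in real time, including server resource usage and load, for a quick health check without logging in to the server.
![monitor](images/monitor.png)

Artifact:
## Suggestions & Bug Reports

```
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-bin.tar.gz: Binary package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-src.tar.gz: Source code package of DolphinScheduler
```
Follow [these steps](https://github.com/apache/dolphinscheduler/issues/new/choose) to report your bugs or submit suggestions.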

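## Thanks
## Contributing

Dolphin Scheduler uses many excellent open-source projects, such as Google's guava, grpc, netty, quartz, and many Apache open-source projects.
It is only by standing on the shoulders of these projects that Dolphin Scheduler could come into being. We are very grateful to all the open-source software we use! We also hope to be not only beneficiaries of open source but also contributors to it, and we hope that partners with the same enthusiasm and belief in open source will join us and contribute to open source together!
The community welcomes contributions. See [How to contribute](docs/docs/zh/contribute/join/contribute.md) to learn more; good first issues can be found [here](https://github.com/apache/dolphinscheduler/contribute)
if you are contributing to DolphinScheduler for the first time.

## Get Help
## Community

1. Submit an [issue](https://github.com/apache/dolphinscheduler/issues/new/choose)
2. [Join the Slack group](https://s.apache.org/dolphinscheduler-slack) and ask your question in the `#troubleshooting` channel
You are welcome to join the community in the following ways:

## Community
- Join the [DolphinScheduler Slack](https://s.apache.org/dolphinscheduler-slack)
- Follow the [DolphinScheduler Twitter](https://twitter.com/dolphinschedule) to get the latest news
- Subscribe to the DolphinScheduler mailing lists: users subscribe to [email protected], developers subscribe to [email protected]

1. Join the Slack channel through [this application link](https://s.apache.org/dolphinscheduler-slack)
2. Follow the [Apache Dolphin Scheduler Twitter account](https://twitter.com/dolphinschedule) for real-time updates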
# Landscapes

## License
<p align="center">
<br/><br/>
<img src="https://landscape.cncf.io/images/left-logo.svg" width="150"/>&nbsp;&nbsp;<img src="https://landscape.cncf.io/images/right-logo.svg" width="200"/>
<br/><br/>
DolphinScheduler enriches the <a href="https://landscape.cncf.io/?landscape=observability-and-analysis&license=apache-license-2-0">CNCF CLOUD NATIVE Landscape.</a >

Please refer to the [LICENSE](https://github.com/apache/dolphinscheduler/blob/dev/LICENSE) file.
</p >