BitSail

Introduction

BitSail is ByteDance's open source data integration engine, built on a distributed architecture and designed for high performance. It supports data synchronization between multiple heterogeneous data sources and provides global data integration solutions for batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin and Toutiao, and synchronizes hundreds of trillions of data records every day.

Why Do We Use BitSail

BitSail has been widely used and supports traffic on the scale of hundreds of trillions of data records. It has also been verified in a range of environments, such as the cloud-native environment of Volcano Engine and on-premises private clouds.

We have accumulated a lot of experience and made a number of optimizations to improve data integration:

Global data integration, covering batch, streaming, and incremental scenarios
Distributed and cloud-native architecture, supporting horizontal scaling
High maturity in terms of accuracy, stability, and performance
Rich basic functions, such as type conversion, dirty data processing, flow control, data lake integration, and automatic parallelism calculation
Task running status monitoring, such as traffic, QPS, dirty data, and latency

BitSail Use Scenarios

Mass data synchronization between heterogeneous data sources
Integrated streaming and batch data processing
Integrated data lake and data warehouse processing
High-performance, highly reliable data synchronization
Distributed, cloud-native data integration engine

Features of BitSail

Low start-up cost and high flexibility
Stream-batch integration and data lake-warehouse integration architecture; one framework covers almost all data synchronization scenarios
High-performance, massive data processing capabilities
Automatic DDL synchronization
Type system for converting between the types of different data sources
Engine-independent reading and writing interfaces, low development cost
Real-time display of task progress (under development)
Real-time monitoring of task status
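
As a rough illustration of the type system item above, the sketch below converts a column type from one data source to another by going through a small set of common intermediate types. It is an assumption-laden sketch, not BitSail's implementation: the mapping table `SOURCE_TO_COMMON` and the functions `to_common_type` and `to_target_type` are hypothetical names, and the type entries are examples only.

```python
from typing import Dict

# Hypothetical mapping from source-specific column types to common
# intermediate types. The entries are illustrative, not BitSail's tables.
SOURCE_TO_COMMON: Dict[str, Dict[str, str]] = {
    "mysql": {"bigint": "long", "varchar": "string", "datetime": "timestamp"},
    "hive": {"bigint": "long", "string": "string", "timestamp": "timestamp"},
}


def to_common_type(source: str, column_type: str) -> str:
    """Map a source-specific type name to its common intermediate type."""
    try:
        return SOURCE_TO_COMMON[source][column_type.lower()]
    except KeyError:
        raise ValueError(
            f"unsupported type {column_type!r} for source {source!r}"
        ) from None


def to_target_type(target: str, common_type: str) -> str:
    """Map a common intermediate type to the first matching target type."""
    for native, common in SOURCE_TO_COMMON[target].items():
        if common == common_type:
            return native
    raise ValueError(f"no {target!r} type for common type {common_type!r}")


if __name__ == "__main__":
    # Convert a MySQL column type to its Hive counterpart.
    print(to_target_type("hive", to_common_type("mysql", "VARCHAR")))
```

Routing every conversion through an intermediate type means each data source needs only one mapping table, rather than one table per source-target pair.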

Architecture of BitSail

Source[Input Sources] -> Framework[Data Transmission] -> Sink[Output Sinks]

The data processing pipeline works as follows: Input Sources pull the source data, the intermediate framework layer processes it, and Output Sinks write it to the target.
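
The three stages can be pictured with a small, language-neutral sketch. It does not use BitSail's API: `list_source`, `ListSink`, and `run_pipeline` are hypothetical names chosen for illustration, and the framework layer is reduced to a single per-record function.

```python
from typing import Callable, Iterable, Iterator, List


def list_source(rows: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical input source: yields records one at a time."""
    for row in rows:
        yield row


class ListSink:
    """Hypothetical output sink: collects written records in memory."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def write(self, row: dict) -> None:
        self.rows.append(row)


def run_pipeline(
    source: Iterator[dict],
    transform: Callable[[dict], dict],
    sink: ListSink,
) -> int:
    """Pull from the source, process each record, write it to the sink.

    Returns the number of records written.
    """
    count = 0
    for record in source:
        sink.write(transform(record))
        count += 1
    return count


if __name__ == "__main__":
    sink = ListSink()
    written = run_pipeline(
        list_source([{"id": "1"}, {"id": "2"}]),
        lambda r: {"id": int(r["id"])},  # stand-in for framework processing
        sink,
    )
    print(written, sink.rows)
```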

The framework layer provides rich functions that apply to all synchronization scenarios, such as dirty data collection, automatic parallelism calculation, and task monitoring.
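
Dirty data collection can be sketched as follows, assuming that a "dirty" record is one that fails conversion and that a job should abort once too many have been seen. The function `convert_with_dirty_collection`, the `max_dirty` threshold, and the `DirtyDataLimitExceeded` exception are hypothetical names for illustration, not BitSail's API.

```python
from typing import Callable, Iterable, List, Tuple


class DirtyDataLimitExceeded(Exception):
    """Raised when more records fail conversion than the job tolerates."""


def convert_with_dirty_collection(
    records: Iterable[dict],
    convert: Callable[[dict], dict],
    max_dirty: int,
) -> Tuple[List[dict], List[Tuple[dict, str]]]:
    """Convert records, setting aside those that fail instead of crashing.

    Returns (clean_records, dirty_records), where each dirty entry pairs
    the original record with a description of the error.
    """
    clean: List[dict] = []
    dirty: List[Tuple[dict, str]] = []
    for record in records:
        try:
            clean.append(convert(record))
        except (KeyError, TypeError, ValueError) as err:
            dirty.append((record, repr(err)))
            # Abort once the number of dirty records passes the threshold.
            if len(dirty) > max_dirty:
                raise DirtyDataLimitExceeded(
                    f"more than {max_dirty} dirty records"
                ) from err
    return clean, dirty


if __name__ == "__main__":
    rows = [{"id": "1"}, {"id": "x"}, {"id": "3"}]
    clean, dirty = convert_with_dirty_collection(
        rows, lambda r: {"id": int(r["id"])}, max_dirty=1
    )
    print(clean)
    print(dirty)
```

Keeping the failed records alongside the error text lets them be reported by task monitoring or replayed later, rather than being silently dropped.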

The data synchronization scenarios covered are batch, streaming, and incremental.

The runtime layer supports multiple execution modes, such as YARN and local; Kubernetes (k8s) support is under development.

Supported Connectors

DataSource	Sub Modules	Reader	Writer
Hive	-	✅	✅
Hadoop	-	✅	✅
Hbase	-	✅	✅
Hudi	-	✅	✅
Kafka	-	✅	✅
RocketMQ	-		✅
Redis	-		✅
Doris	-		✅
MongoDB	-	✅	✅
Doris	-	✅
JDBC	MySQL, Oracle, PostgreSQL, SqlServer	✅	✅
Fake	-	✅
Print	-		✅

Documentation for Connectors.

Community Support

Slack

Join the BitSail Slack channel via this link.

Mailing List

Currently, the BitSail community uses Google Groups as its mailing list provider. You need to subscribe to the mailing list before starting a conversation.

Subscribe: Email to this address [email protected]

Start a conversation: Email to this address [email protected]

Unsubscribe: Email to this address [email protected]

WeChat Group

Scan this QR code to join the WeChat group chat.

Environment Setup

Link to Environment Setup.

Deployment Guide

Link to Deployment Guide.

BitSail Configuration

Link to Configuration Guide.

Contributing Guide

Link to Contributing Guide.

License

Apache 2.0 License.
