Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support docker for loader #530

Merged
merged 7 commits into from
Nov 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,16 @@
## Modules

- [hugegraph-loader](./hugegraph-loader): Loading datasets into the HugeGraph from multiple data sources.
- [hugegraph-hubble](./hugegraph-hubble): Online HugeGraph management and analysis dashboard (Include: data loading, schema management, graph traverser and display). We can use `docker run -itd --name=hubble -p 8088:8088 hugegraph/hubble` to quickly start [hubble](https://hub.docker.com/r/hugegraph/hubble) or we can follow [this](hugegraph-hubble/README.md#quick-start) to use docker-compose to start `hubble` with `server`.
- [hugegraph-hubble](./hugegraph-hubble): Online HugeGraph management and analysis dashboard (Include: data loading, schema management, graph traverser and display).
- [hugegraph-tools](./hugegraph-tools): Command line tool for deploying, managing and backing-up/restoring graphs from HugeGraph.
- [hugegraph-client](./hugegraph-client): A Java-written client for HugeGraph, providing `RESTful` APIs for accessing graph vertex/edge/schema/gremlin/variables and traversals etc.

## Usage

- [hugegraph-loader](./hugegraph-loader): We can use `docker run -itd --name loader hugegraph/loader` to quickly start [loader](https://hub.docker.com/r/hugegraph/loader) or we can follow [this](./hugegraph-loader/README.md#212-docker-compose) to use docker-compose to start `loader` with `server`. And we can find more details in the [doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-loader/).
- [hugegraph-hubble](./hugegraph-hubble): We can use `docker run -itd --name=hubble -p 8088:8088 hugegraph/hubble` to quickly start [hubble](https://hub.docker.com/r/hugegraph/hubble) or we can follow [this](hugegraph-hubble/README.md#quick-start) to use docker-compose to start `hubble` with `server`. And we can find more details in the [doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-hubble/).
- [hugegraph-client](./hugegraph-client): We can follow the [doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-client/) to learn how to quick start with `client`.

## Maven Dependencies

You could use import the dependencies in `maven` like this:
Expand Down
3 changes: 1 addition & 2 deletions hugegraph-hubble/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,7 @@ We can quickly start `hubble` in two ways:
1. We can use `docker run -itd --name=hubble -p 8088:8088 hugegraph/hubble` to quickly start [hubble](https://hub.docker.com/r/hugegraph/hubble).
2. Or we can use the `docker-compose.yml` to start `hubble` with `hugegraph-server`. If we set `PRELOAD=true`, we can preload the example graph when starting `hugegraph-server`:


```
```yaml
version: '3'
services:
server:
Expand Down
51 changes: 51 additions & 0 deletions hugegraph-loader/Dockerfile
imbajin marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

FROM maven:3.9.0-eclipse-temurin-11 AS build

COPY . /pkg
WORKDIR /pkg

RUN set -x \
&& mvn install -pl hugegraph-client,hugegraph-loader -am -Dmaven.javadoc.skip=true -DskipTests -ntp
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that in the center maven repo, the client-1.0.0 does not have some methods like istext(). Hence, we should install the jar locally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that in the center maven repo, the client-1.0.0 does not have some methods like istext(). Hence, we should install the jar locally.

seems we don't rely on fixed version? we use $revision in the project, so install client first is necessary (if we want build the latest image)



RUN set -x \
&& cd /pkg/hugegraph-loader/ \
&& echo "$(ls)" \
&& mvn clean package -DskipTests


FROM openjdk:11-slim

COPY --from=build /pkg/hugegraph-loader/apache-hugegraph-loader-incubating-*/ /loader
WORKDIR /loader/

RUN set -x \
&& apt-get -q update \
&& apt-get -q install -y --no-install-recommends --no-install-suggests \
dumb-init \
procps \
curl \
lsof \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

VOLUME /loader

ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["tail","-f","/dev/null"]
124 changes: 120 additions & 4 deletions hugegraph-loader/README.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider also add a new section (run with docker) in root (README) https://github.com/apache/incubator-hugegraph-toolchain/blob/master/README.md#modules

and keep the module section clear
image

Original file line number Diff line number Diff line change
Expand Up @@ -7,15 +7,131 @@

hugegraph-loader is a customizable command line utility for loading small to medium size graph datasets into the HugeGraph database from multiple data sources with various input formats.

## Features
## 1. Features

- Multiple data sources, such as local file(path), HDFS file(path), MySQL
- Various input formats, such as json, csv, and text with any delimiters.
- Diverse options, with which users can manage the data loading intuitively.
- Detecting schema from data automatically, reduce the complex work of schema management.
- Advanced customized operations with groovy script, users can configure how to construct vertices and edges by themselves.

## Building
## 2. Usage for Docker(Recommand)

- Run `loader` with Docker
- Docker run
- Docker-compose
- Load data in docker container `loader`

### 2.1 Start with Docker

#### 2.1.1 Docker run

Use the command `docker run -itd --name loader hugegraph/loader` to start loader.

If you want to load your data, you can mount the data folder like `-v /path/to/data/file:/loader/file`


#### 2.1.2 Docker-compose

The example `docker-compose.yml` is [here](./docker/example/docker-compose.yml)

If you want to load your data, you can mount the data folder like:
```yaml
volumes:
- /path/to/data/file:/loader/file
```

Use the command `docker-compose up -d` to deploy `loader` with `server` and `hubble`.

### 2.2 Load data with docker container

#### 2.2.1 load data with docker

> If the `loader` and `server` is in the same docker network (for example, you deploy `loader` and `server` with `docker-compose`), we can set `-h {server_container_name}`. In our example, the container name of `server` is `graph`
>
> If `loader` is deployed alone, the `-h` should be set to the ip of the host of `server`. Other parameter description is [here](https://hugegraph.apache.org/docs/quickstart/hugegraph-loader/#341-parameter-description)

```bash
docker exec -it loader bin/hugegraph-loader.sh -g hugegraph -f example/file/struct.json -s example/file/schema.groovy -h graph -p 8080
```

Then we can see the result.

```bash
HugeGraphLoader worked in NORMAL MODE
vertices/edges loaded this time : 8/6
--------------------------------------------------
count metrics
input read success : 14
input read failure : 0
vertex parse success : 8
vertex parse failure : 0
vertex insert success : 8
vertex insert failure : 0
edge parse success : 6
edge parse failure : 0
edge insert success : 6
edge insert failure : 0
--------------------------------------------------
meter metrics
total time : 0.199s
read time : 0.046s
load time : 0.153s
vertex load time : 0.077s
vertex load rate(vertices/s) : 103
edge load time : 0.112s
edge load rate(edges/s) : 53
```

Then you can use `curl` or `hubble` to see the result.

```bash
> curl "http://localhost:8080/graphs/hugegraph/graph/vertices" | gunzip
{"vertices":[{"id":1,"label":"software","type":"vertex","properties":{"name":"lop","lang":"java","price":328.0}},{"id":2,"label":"software","type":"vertex","properties":{"name":"ripple","lang":"java","price":199.0}},{"id":"1:tom","label":"person","type":"vertex","properties":{"name":"tom"}},{"id":"1:josh","label":"person","type":"vertex","properties":{"name":"josh","age":32,"city":"Beijing"}},{"id":"1:marko","label":"person","type":"vertex","properties":{"name":"marko","age":29,"city":"Beijing"}},{"id":"1:peter","label":"person","type":"vertex","properties":{"name":"peter","age":35,"city":"Shanghai"}},{"id":"1:vadas","label":"person","type":"vertex","properties":{"name":"vadas","age":27,"city":"Hongkong"}},{"id":"1:li,nary","label":"person","type":"vertex","properties":{"name":"li,nary","age":26,"city":"Wu,han"}}]}
```

If you want to check the edges, use `curl "http://localhost:8080/graphs/hugegraph/graph/edges" | gunzip`

#### 2.2.2 enter the docker container to load data

If you want to do some additional operation in the container, you can enter the container as follows:

```bash
docker exec -it loader bash
```

Then, you can load data as follows:

```bash
sh bin/hugegraph-loader.sh -g hugegraph -f example/file/struct.json -s example/file/schema.groovy -h graph -p 8080
```

The result is as same as above.

## 3. Use loader directly

> notice: currently, version is `1.0.0`

Download and unzip the compiled archive

```bash
wget https://downloads.apache.org/incubator/hugegraph/{version}/apache-hugegraph-toolchain-incubating-{version}.tar.gz
tar zxf *hugegraph*.tar.gz
```

Then, load data with example file:

```bash
cd apache-hugegraph-toolchain-incubating-{version}
cd apache-hugegraph-loader-incubating-{version}
sh bin/hugegraph-loader.sh -g hugegraph -f example/file/struct.json -s example/file/schema.groovy
```

More details is in the [doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-loader/)

## 4. Building

You can also build the `loader` by yourself.

Required:

Expand All @@ -34,10 +150,10 @@ To build with default tests:
mvn clean install
```

## Doc
## 5. Doc

The [loader homepage](https://hugegraph.apache.org/docs/quickstart/hugegraph-loader/) contains more information about it.

## License
## 6. License

hugegraph-loader is licensed under Apache 2.0 License.
37 changes: 37 additions & 0 deletions hugegraph-loader/docker/example/docker-compose.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

version: '3'
services:
server:
image: hugegraph/hugegraph
container_name: graph
ports:
- 8080:8080

hubble:
image: hugegraph/hubble
container_name: hubble
ports:
- 8088:8088

loader:
image: hugegraph/loader
container_name: loader
# mount your own data here
# volumes:
# - /path/to/data/file:/loader/file
Loading