This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

add supernode config
Signed-off-by: yunfeiyangbuaa <[email protected]>
yunfeiyanggzq committed Oct 22, 2019
1 parent bb999f2 commit 0381327
Showing 7 changed files with 360 additions and 0 deletions.
30 changes: 30 additions & 0 deletions docs/config/README.md
@@ -0,0 +1,30 @@
# About this folder

The documents in the `config` folder explain how to configure Dragonfly.

## How to configure Dragonfly

You can configure Dragonfly with either command-line flags or a YAML file when deploying the system.
This tutorial covers only the YAML file.
To configure Dragonfly from the command line, see the docs in the [cli_reference folder](https://github.com/dragonflyoss/Dragonfly/tree/master/docs/cli_reference).
The two approaches are similar, so this tutorial is useful background for the CLI as well.

## About the YAML file

Dragonfly consists of three components (supernode, dfget, and dfdaemon), and each one is configured separately.
See the three tutorials ([supernode](supernode_properties.md), [dfget](dfget_properties.md), [dfdaemon](dfdaemon_properties.md)) to write the YAML files and deploy.

## About deploying with Docker

When deploying with Docker, mount the configuration file at its default path with `-v` when starting the container.
For supernode, start the image with the following command.

```sh
docker run -d --name supernode --restart=always -p 8001:8001 -p 8002:8002 -v /etc/dragonfly/supernode.yml:/etc/dragonfly/supernode.yml dragonflyoss/supernode:0.4.3
```

For dfdaemon, you can start the image in the same way.

```sh
docker run -d --net=host --name dfclient -p 65001:65001 -v /etc/dragonfly/dfdaemon.yml:/etc/dragonfly/dfdaemon.yml -v /root/.small-dragonfly:/root/.small-dragonfly dragonflyoss/dfclient:0.4.3
```
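
One pitfall with the bind mounts above: if the host path given to `-v` does not exist, Docker creates it as an empty directory, so the container finds a directory where it expects a file. A minimal sketch of a pre-flight check follows. `check_conf` and the `CONF` variable are hypothetical names, not part of Dragonfly, and the script only prints the command so that it can be reviewed first.

```sh
#!/bin/sh
# check_conf is a hypothetical helper, not part of Dragonfly.
# It succeeds only when the given path is an existing regular file.
check_conf() {
    [ -f "$1" ]
}

# Path taken from the command above; override CONF to check another location.
CONF="${CONF:-/etc/dragonfly/supernode.yml}"

if check_conf "$CONF"; then
    # Print the command rather than running it, so it can be reviewed first.
    echo "docker run -d --name supernode --restart=always -p 8001:8001 -p 8002:8002 -v $CONF:/etc/dragonfly/supernode.yml dragonflyoss/supernode:0.4.3"
else
    echo "missing config file: $CONF" >&2
fi
```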
76 changes: 76 additions & 0 deletions docs/config/dfdaemon_config_template.yml
@@ -0,0 +1,76 @@
# This file is the template of dfdaemon configuration file.
# You can configure dfdaemon by changing the parameters according to your requirements.

# RegistryMirror configures the mirror of the official docker registry
registry_mirror:
# Remote url for the registry mirror, default is https://index.docker.io
  remote: https://index.docker.io
# whether to ignore https certificate errors
  insecure: false
# optional certificates if the remote server uses self-signed certificates
  certs: []

# Proxies is the list of rules for the transparent proxy. If no rules
# are provided, all requests will be proxied directly. Request will be
# proxied with the first matching rule.
proxies:
# proxy all http image layer download requests with dfget
  - regx: blobs/sha256.*
# change http requests to some-registry to https and proxy them with dfget
  - regx: some-registry/
    use_https: true
# proxy requests directly, without dfget
  - regx: no-proxy-reg
    direct: true

# HijackHTTPS is the list of hosts whose https requests should be hijacked
# by dfdaemon. Dfdaemon will be able to proxy requests from them with dfget
# if the url matches the proxy rules. The first matched rule will be used.
hijack_https:
# key pair used to hijack https requests
  cert: df.crt
  key: df.key
  hosts:
    - regx: mirror.aliyuncs.com:443 # regexp to match request hosts
# whether to ignore https certificate errors
    insecure: false
# optional certificates if the host uses self-signed certificates
    certs: []

# dfget properties
# node: specify the addresses
# ip: IP address that server will listen on
# port: port number that server will listen on
# expiretime: how long a cached file is kept without being accessed by any process (default 3min). When deploying with Docker, this param is supported after Dragonfly 0.4.3
# alivetime: how long the uploader stays alive without receiving any upload requests; after this period the uploader exits automatically (default 5m0s)
# f: filter some query params of URL, use char '&' to separate different params
dfget_flags: ["--node","192.168.33.21","--verbose","--ip","192.168.33.23",
"--port","15001","--expiretime","3m0s","--alivetime","5m0s",
"-f","filterParam1&filterParam2"]

# Specify the addresses (host:port) of supernodes. This option is kept only for compatibility with previous versions.
supernodes:
- 127.0.0.1
- 10.10.10.1

# Net speed limit, format: xxxM/K
ratelimit: 20M

# Temp output dir of dfdaemon. It must be an absolute path; the default value is `$HOME/.small-dragonfly/dfdaemon/data/`
localrepo: /home/admin/.small-dragonfly/dfdaemon/data/

# dfget path, which is the relative file path for the dfdaemon
dfpath: ./dfget

# https options
# port: 12001
# hostIp: 127.0.0.1
# certpem: ""
# keypem: ""

# Open detail info switch
verbose: false

# The maximum number of CPUs that the dfdaemon can use
maxprocs: 10
27 changes: 27 additions & 0 deletions docs/config/dfdaemon_properties.md
@@ -0,0 +1,27 @@
# Customize dfdaemon properties

This topic explains how to customize the Dragonfly dfdaemon startup parameters.

## Parameter instructions

The following startup parameters are supported for `dfdaemon`:

| Parameter | Description |
| ------------- | ------------- |
| dfget_flags | dfget properties |
| dfpath | dfget path |
| hijack_https | HijackHTTPS is the list of hosts whose https requests should be hijacked by dfdaemon. The first matched rule will be used |
| localrepo | Temp output dir of dfdaemon, by default `$HOME/.small-dragonfly/dfdaemon/data/` |
| maxprocs | The maximum number of CPUs that the dfdaemon can use |
| proxies | Proxies is the list of rules for the transparent proxy |
| ratelimit | Net speed limit, format: xxxM/K |
| registry_mirror | Registry mirror settings |
| supernodes | Specify the addresses (host:port) of supernodes. Kept only for compatibility with previous versions |
| verbose | Open detail info switch |

## Examples

Parameters are configured in `/etc/dragonfly/dfdaemon.yml`.
To get started, copy the [template](dfdaemon_config_template.yml) and modify it according to your requirements.

The file holds all configurable properties of dfdaemon, including the `dfget` properties. By default, Dragonfly configuration files are located in `/etc/dragonfly`. You can create `dfdaemon.yml` there to configure the dfdaemon startup parameters.
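
As a starting point, a small `dfdaemon.yml` might set only a few of the keys from the table. The sketch below writes such a file with values copied from the template. `CONF_DIR` is a hypothetical variable that defaults to a temporary directory so the script runs without root, and it is an assumption here that omitted keys fall back to their defaults.

```sh
#!/bin/sh
# CONF_DIR is a hypothetical variable; the docs place the file in /etc/dragonfly.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Keys and values are copied from dfdaemon_config_template.yml.
cat > "$CONF_DIR/dfdaemon.yml" <<'EOF'
registry_mirror:
  remote: https://index.docker.io
  insecure: false
proxies:
  - regx: blobs/sha256.*
ratelimit: 20M
verbose: false
EOF

echo "wrote $CONF_DIR/dfdaemon.yml"
```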
25 changes: 25 additions & 0 deletions docs/config/dfget_config_template.yml
@@ -0,0 +1,25 @@
# This file is the template of dfget configuration file.
# You can configure dfget by changing the parameters according to your requirements.

# Nodes specify supernodes.
nodes:
 - 127.0.0.1
 - 10.10.10.1

# LocalLimit is the rate limit for a single download task, format: G(B)/g/M(B)/m/K(B)/k/B
# A pure number is parsed as bytes.
localLimit: 20M

# MinRate is the minimal rate for a single download task, format: G(B)/g/M(B)/m/K(B)/k/B
# A pure number is parsed as bytes.
minRate: 512

# TotalLimit is the rate limit for the whole host, format: G(B)/g/M(B)/m/K(B)/k/B
# A pure number is parsed as bytes.
totalLimit: 40M

# ClientQueueSize is the size of client queue
# which controls the number of pieces that can be processed simultaneously.
# It is only useful when the Pattern equals "source".
# The default value is 6.
clientQueueSize: 6
22 changes: 22 additions & 0 deletions docs/config/dfget_properties.md
@@ -0,0 +1,22 @@
# Customize dfget properties

This topic explains how to customize the Dragonfly dfget startup parameters.

## Parameter instructions

The following startup parameters are supported for `dfget`:

| Parameter | Description |
| ------------- | ------------- |
| nodes | Nodes specifies the supernodes |
| localLimit | Rate limit for a single download task, format: 20M/m/K/k |
| minRate | Minimal rate for a single download task. Its type is integer. The `M/m/K/k` format will be supported soon |
| totalLimit | Rate limit for the whole host, format: 20M/m/K/k |
| clientQueueSize | Size of the client queue, which controls the number of pieces that can be processed simultaneously. It is only useful when the Pattern equals "source". The default value is 6 |

## Examples

Parameters are configured in `/etc/dragonfly/dfget.yml`.
To get started, copy the [template](dfget_config_template.yml) and modify it according to your requirements.

By default, Dragonfly configuration files are located in `/etc/dragonfly`. You can create `dfget.yml` in that path if you want to install dfget on a physical machine.
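
To make the rate format concrete, the sketch below converts a value such as `20M` into bytes. `to_bytes` is a hypothetical helper, not a Dragonfly command, and it assumes binary multiples (1K = 1024 bytes), which the docs above do not state.

```sh
#!/bin/sh
# to_bytes is a hypothetical helper that illustrates the rate format.
# Assumption: units are binary multiples (1K = 1024 bytes).
to_bytes() {
    value="${1%B}"            # drop a trailing "B": "20MB" -> "20M", "512B" -> "512"
    case "$value" in
        *[Gg]) mult=$((1024 * 1024 * 1024)) ;;
        *[Mm]) mult=$((1024 * 1024)) ;;
        *[Kk]) mult=1024 ;;
        *)     mult=1 ;;      # a pure number is treated as bytes
    esac
    num="${value%[GgMmKk]}"   # numeric part without the unit letter
    case "$num" in
        ''|*[!0-9]*) echo "invalid rate: $1" >&2; return 1 ;;
    esac
    echo $((num * mult))
}

to_bytes 20M   # localLimit from the template, prints 20971520
to_bytes 512   # minRate from the template, prints 512
```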
121 changes: 121 additions & 0 deletions docs/config/supernode_config_template.yml
@@ -0,0 +1,121 @@
# This file is the template of supernode configuration file.
# You can configure supernode by changing the parameters according to your requirements.

base:
# AdvertiseIP is used to set the ip that we advertise to other peer in the p2p-network.
# By default, the first non-loop address is advertised.
advertiseIP: 127.0.0.1

# ListenPort is the port supernode server listens on.
# default: 8002
listenPort: 8002

# DownloadPort is the port for download files from supernode.
# default: 8001
downloadPort: 8001

# HomeDir is working directory of supernode.
# default: /home/admin/supernode
homeDir: /home/admin/supernode

# the core pool size of ScheduledExecutorService.
# When a request starts a download task, supernode constructs a concurrent thread pool
# to download pieces of the source file and write them to the specified storage.
# Note: the source file is downloaded in pieces via the Range attribute set in the HTTP header.
# default: 10
schedulerCorePoolSize: 10

# DownloadPath specifies the path where to store downloaded files from source address.
# This path can be set beyond BaseDir, such as taking advantage of a different disk from BaseDir's.
# default: $BaseDir/downloads
downloadPath: /home/admin/supernode/downloads

# PeerUpLimit is the upload limit of a peer. When dfget acts as a peer,
# it can serve at most PeerUpLimit upload tasks from other peers.
# default: 5
peerUpLimit: 5

# PeerDownLimit is the download limit of a peer. When a peer starts to download a file/image,
# it downloads the file/image in pieces. PeerDownLimit means that a peer can run
# at most PeerDownLimit concurrent downloading tasks.
# default: 4
peerDownLimit: 4

# When a dfget node acts as a peer, it provides services for other peers
# to pull pieces. Each time it runs into an issue when serving a peer, its own failure count
# increases by 1. When the failure count reaches EliminationLimit, the peer marks itself
# as unhealthy and is no longer called by other peers.
# default: 5
eliminationLimit: 5

# FailureCountLimit is the failure count limit set in supernode for a dfget client.
# When a dfget client takes part in the peer network constructed by supernode,
# supernode commands the peer to start distribution tasks.
# Each time a dfget client fails to finish a distribution task, its failure count
# increases by 1. When the failure count reaches FailureCountLimit (default 5),
# the dfget client is moved to the supernode blacklist and stops acting as a peer.
# default: 5
failureCountLimit: 5

# LinkLimit is set for supernode to limit every piece download network speed.
# default: 20 MB, in format of G(B)/g/M(B)/m/K(B)/k/B, pure number will also be parsed as Byte.
linkLimit: 20M

# SystemReservedBandwidth is the network bandwidth reserved for system software.
# default: 20 MB, in format of G(B)/g/M(B)/m/K(B)/k/B, pure number will also be parsed as Byte.
systemReservedBandwidth: 20M

# MaxBandwidth is the network bandwidth that supernode can use.
# default: 200 MB, in format of G(B)/g/M(B)/m/K(B)/k/B, pure number will also be parsed as Byte.
maxBandwidth: 200M

# Whether to enable profiler
# default: false
enableProfiler: false

# Whether to open DEBUG level
# default: false
debug: false

# FailAccessInterval is the interval time after failed to access the URL.
# default: 3m
failAccessInterval: 3m

# gc related

# GCInitialDelay is the delay time from the start to the first GC execution.
# default: 6s
gcInitialDelay: 6s

# GCMetaInterval is the interval time to execute GC meta.
# default: 2m0s
gcMetaInterval: 2m

# TaskExpireTime: when a task is not accessed within TaskExpireTime,
# it is treated as expired.
# default: 3m0s
taskExpireTime: 3m

# PeerGCDelay is the delay before executing the GC after a peer has reported that it is offline.
# default: 3m0s
peerGCDelay: 3m

# GCDiskInterval is the interval time to execute GC disk.
# default: 15s
gcDiskInterval: 15s

# YoungGCThreshold: if the available disk space is more than YoungGCThreshold,
# there is no need to GC disk.
# default: 100GB
youngGCThreshold: 100G

# FullGCThreshold: if the available disk space is less than FullGCThreshold,
# the supernode should gc all task files which are not being used.
# default: 5GB
fullGCThreshold: 5G

# IntervalThreshold is the threshold of the interval at which the task file is accessed.
# default: 2h0m0s
IntervalThreshold: 2h
plugins: {}
storages: {}
59 changes: 59 additions & 0 deletions docs/config/supernode_properties.md
@@ -0,0 +1,59 @@
# Customize supernode properties

This topic explains how to customize the Dragonfly supernode startup parameters.

## Parameter instructions

### Configurable parameters

The following startup parameters are supported for `supernode`:

| Parameter | Default | Description |
| ------------- | ------------- | ------------- |
| listenPort | 8002 | the port that the supernode server listens on |
| downloadPort | 8001 | the port for downloading files from supernode |
| homeDir | /home/admin/supernode | the working directory of supernode |
| advertiseIP | the first non-loop address | the IP that supernode advertises to other peers in the p2p-network |
| schedulerCorePoolSize | 10 | the core pool size of ScheduledExecutorService (this parameter is deprecated) |
| downloadPath | /home/admin/supernode/repo/download | the path where files downloaded from the source address are stored |
| peerUpLimit | 5 | the limit on upload tasks that a peer can serve for other peers |
| peerDownLimit | 4 | the limit on concurrent download tasks that a peer can start |
| eliminationLimit | 5 | if a dfget fails to provide service for other peers up to eliminationLimit times, it will be isolated |
| failureCountLimit | 5 | when a dfget client fails to finish distribution tasks up to failureCountLimit times, supernode adds it to the blacklist |
| linkLimit | 20M | the network speed limit that supernode applies to every piece download |
| systemReservedBandwidth | 20M | the network rate reserved for the system |
| maxBandwidth | 200M | the network rate that supernode can use |
| enableProfiler | false | whether the supernode HTTP server sets up the profiler |
| debug | false | switch the daemon log level to DEBUG mode |
| failAccessInterval | 3m0s | the interval time after failing to access the URL |
| gcInitialDelay | 6s | the delay time from the start to the first GC execution |
| gcMetaInterval | 2m0s | the interval time to execute the GC meta |
| taskExpireTime | 3m0s | a task is treated as expired if it is not accessed within this time |
| peerGCDelay | 3m0s | the delay time to execute the GC after a peer has reported that it is offline |
| gcDiskInterval | 15s | the interval time to execute GC disk |
| youngGCThreshold | 100GB | if the available disk space is more than youngGCThreshold, there is no need to GC disk |
| fullGCThreshold | 5GB | if the available disk space is less than fullGCThreshold, the supernode should gc all task files which are not being used |
| IntervalThreshold | 2h0m0s | the threshold of the interval at which the task file is accessed |

### Some common configurations

Use `--config` to specify the path of the configuration file. The default value is `/etc/dragonfly/supernode.yml`.
In Dragonfly, supernode provides `listenPort` for dfget clients to connect to, and dfget downloads files from `downloadPort` instead of `listenPort`.
You can also configure the supernode's IP via `advertiseIP`.
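
For instance, a supernode that keeps the default ports but advertises a specific address could be configured as in the sketch below. It assumes that the properties nest under the `base` key, as the leading `base:` line of the template suggests. The address `192.168.33.21` and the `CONF_DIR` variable are placeholders.

```sh
#!/bin/sh
# CONF_DIR is a hypothetical variable; the default location is /etc/dragonfly.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Assumption: supernode properties nest under the `base` key.
cat > "$CONF_DIR/supernode.yml" <<'EOF'
base:
  # dfget clients connect to this port
  listenPort: 8002
  # dfget clients download files from this port
  downloadPort: 8001
  # placeholder address advertised to peers
  advertiseIP: 192.168.33.21
EOF

echo "wrote $CONF_DIR/supernode.yml"
```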

### About gc parameters

In supernode, gc begins `gcInitialDelay` after supernode starts.
Supernode then runs the peer-gc goroutine and the task-gc goroutine every `gcMetaInterval`.
If a task isn't accessed by any dfget within `taskExpireTime`, the task-gc goroutine will gc this task.
If a peer reports that it is offline and can't provide download service to other peers, the peer-gc goroutine will gc this peer after `peerGCDelay`.
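
The task-expiry rule can be sketched as a comparison of timestamps. This illustrates the rule as described above and is not supernode's actual implementation: `is_expired` and its arguments are hypothetical, and times are plain seconds.

```sh
#!/bin/sh
# is_expired is a hypothetical helper, not supernode code.
# Arguments: current time, last access time, expire time (all in seconds).
# Succeeds when the task has gone unaccessed for longer than the expire time.
is_expired() {
    [ $(( $1 - $2 )) -gt "$3" ]
}

TASK_EXPIRE=180   # taskExpireTime default of 3m0s, in seconds

# A task last accessed 200 seconds ago is past the 180-second limit.
if is_expired 1000 800 "$TASK_EXPIRE"; then
    echo "task expired: eligible for gc"
fi

# A task last accessed 60 seconds ago is still live.
if ! is_expired 1000 940 "$TASK_EXPIRE"; then
    echo "task still live"
fi
```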

## Examples

To get started, copy the [template](supernode_config_template.yml) and modify it according to your requirements.

When deploying on a physical machine, use `--config` to specify the location of the configuration file.

```sh
supernode --config /etc/dragonfly/supernode.yml
```
