As described in issue #13, we need to persist our container data using Docker volumes.
Docker Volume Documentation
Data that needs to be persisted
This should only affect the HDFS service of our Hadoop image. The three HDFS daemons that store data requiring persistence, and their associated properties in hdfs-site.xml, are:
HDFS NameNode: dfs.namenode.name.dir (default: file://${hadoop.tmp.dir}/dfs/name)
HDFS Secondary NameNode: dfs.namenode.checkpoint.dir (default: file://${hadoop.tmp.dir}/dfs/namesecondary)
HDFS DataNode: dfs.datanode.data.dir (default: file://${hadoop.tmp.dir}/dfs/data)
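If we relocate all three directories under a single root, one volume can cover them all. A minimal sketch of what that could look like in hdfs-site.xml, assuming a /hadoop/dfs root (the path is an assumption, not a decision):

```xml
<!-- hdfs-site.xml: group the three persistent HDFS directories under one root
     so that a single volume (/hadoop/dfs, assumed) covers all of them -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///hadoop/dfs/namesecondary</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///hadoop/dfs/data</value>
  </property>
</configuration>
```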
Other data
The other properties that use the hadoop.tmp.dir property as a variable:
core-site.xml:
io.seqfile.local.dir = ${hadoop.tmp.dir}/io/local
fs.s3.buffer.dir = ${hadoop.tmp.dir}/s3
fs.s3a.buffer.dir = ${hadoop.tmp.dir}/s3a
yarn-site.xml:
yarn.resourcemanager.fs.state-store.uri = ${hadoop.tmp.dir}/yarn/system/rmstore
yarn.nodemanager.local-dirs = ${hadoop.tmp.dir}/nm-local-dir
yarn.nodemanager.recovery.dir = ${hadoop.tmp.dir}/yarn-nm-recovery
yarn.timeline-service.leveldb-timeline-store.path = ${hadoop.tmp.dir}/yarn/timeline
mapred-site.xml:
mapreduce.cluster.local.dir = ${hadoop.tmp.dir}/mapred/local
mapreduce.jobtracker.system.dir = ${hadoop.tmp.dir}/mapred/system
mapreduce.jobtracker.staging.root.dir = ${hadoop.tmp.dir}/mapred/staging
mapreduce.cluster.temp.dir = ${hadoop.tmp.dir}/mapred/staging
mapreduce.jobhistory.recovery.store.fs.uri = ${hadoop.tmp.dir}/mapred/history/recoverystore
Most of these directories store temporary or intermediate data, or are related to the MapReduce framework. Since we won't support MapReduce in phase one, it is probably fine to keep the default values for these properties.
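For reference, all of the defaults above derive from a single knob, hadoop.tmp.dir, which is set in core-site.xml. A minimal illustration with its stock default value:

```xml
<!-- core-site.xml: the base path every default above is derived from -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
  </property>
</configuration>
```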
Approach 1: Using the VOLUME instruction in the Dockerfile
I've experimented a bit with the VOLUME instruction (in the Dockerfile), and here's how it works:
When the container is created, 2 read-write layers are created: 1 for the container data (as usual), and 1 for the volume.
Another container on the same host can share that container's volume by using docker run --volumes-from.
The original container can be safely destroyed, and the volume won't be deleted.
When all containers using the volume are deleted, then the volume becomes unavailable.
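A minimal sketch of this behaviour; the volume path (/hadoop/dfs), the image name (our-hadoop-image) and the container names are assumptions:

```dockerfile
# Dockerfile: declare the HDFS data root as a volume
VOLUME /hadoop/dfs
```

```bash
# creating a container from the image also creates an anonymous volume for /hadoop/dfs
docker run -d --name hdfs1 our-hadoop-image

# a second container on the same host can share that volume
docker run -d --name hdfs2 --volumes-from hdfs1 our-hadoop-image
```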
As stated in the docs, if one deletes the container and forgets to use docker rm -v, the volume data is NOT deleted:
"If you remove containers without using the -v option, you may end up with 'dangling' volumes; volumes that are no longer referenced by a container. Dangling volumes are difficult to get rid of and can take up a large amount of disk space. We're working on improving volume management and you can check progress on this in pull request moby/moby#8484."
To work around this limitation, Docker recommends the Data Volume Container pattern, which consists of creating a container for the sole purpose of keeping a reference to the volume layer.
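A sketch of that pattern, under the same assumed names and paths:

```bash
# a container whose only purpose is to hold a reference to the volume
docker create -v /hadoop/dfs --name hdfs-data our-hadoop-image

# service containers attach to the volume through it; hdfs-data itself never runs
docker run -d --name hdfs1 --volumes-from hdfs-data our-hadoop-image
```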
Approach 2: Mounting a volume from the host
An alternative is to skip the VOLUME instruction entirely and instead mount a volume from the host when creating the container, using docker run -v /path-in-the-host:/path-in-the-container.
With this approach, Docker does not create an additional layer, and the data can still be shared among containers on the same host. It is probably even slightly more efficient performance-wise.
The only benefit we lose compared to the first approach is that the VOLUME instruction is quite explicit: it helps inform users of the directories that need persistence.
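For example, with an assumed host directory of /data/hdfs:

```bash
# bind-mount a host directory instead of letting Docker manage a volume layer
docker run -d --name hdfs1 -v /data/hdfs:/hadoop/dfs our-hadoop-image
```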
Approach 3: best of both worlds
It turns out that we can use the VOLUME instruction in the Dockerfile AND override it at container creation time using the '-v' parameter of docker run. When doing so, Docker will NOT create a layer for the volume, since it can use the mount point from the host.
And a user wishing to use the Data Volume Container pattern from the first approach can still do so.
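A sketch of the combined approach, with the same assumed names and paths:

```bash
# the Dockerfile still declares "VOLUME /hadoop/dfs" as documentation;
# overriding it with a host mount at creation time avoids the extra volume layer
docker run -d --name hdfs1 -v /data/hdfs:/hadoop/dfs our-hadoop-image

# alternatively, the Data Volume Container pattern from the first approach still works
docker create -v /hadoop/dfs --name hdfs-data our-hadoop-image
docker run -d --name hdfs1 --volumes-from hdfs-data our-hadoop-image
```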
Proposal
To format the namenode (run this only once):
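A minimal sketch of what this step could look like using approach 3; the image name, mount paths and command are assumptions, not the final proposal:

```bash
docker run --rm -v /data/hdfs:/hadoop/dfs our-hadoop-image hdfs namenode -format
```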
To run the namenode:
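And, under the same assumptions:

```bash
docker run -d --name namenode -v /data/hdfs:/hadoop/dfs our-hadoop-image hdfs namenode
```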