
Persist data to a docker volume #14

Closed
davidonlaptop opened this issue Mar 21, 2015 · 2 comments

@davidonlaptop (Member) commented:

As described in issue #13, we need to persist our container data using docker volumes.

Docker Volume Documentation

Data that needs to be persisted

It should only affect the HDFS service of our Hadoop image. The three HDFS daemons that store data requiring persistence, and their associated properties in hdfs-site.xml, are:

  • HDFS Name Node : dfs.namenode.name.dir (default: file://${hadoop.tmp.dir}/dfs/name)
  • HDFS Secondary Name Node : dfs.namenode.checkpoint.dir (default: file://${hadoop.tmp.dir}/dfs/namesecondary)
  • HDFS Data Node : dfs.datanode.data.dir (default: file://${hadoop.tmp.dir}/dfs/data)

Other data

The other properties that use the hadoop.tmp.dir property as a variable:

  • in core-site.xml:
    • io.seqfile.local.dir = ${hadoop.tmp.dir}/io/local
    • fs.s3.buffer.dir = ${hadoop.tmp.dir}/s3
    • fs.s3a.buffer.dir = ${hadoop.tmp.dir}/s3a
  • in yarn-site.xml:
    • yarn.resourcemanager.fs.state-store.uri = ${hadoop.tmp.dir}/yarn/system/rmstore
    • yarn.nodemanager.local-dirs = ${hadoop.tmp.dir}/nm-local-dir
    • yarn.nodemanager.recovery.dir = ${hadoop.tmp.dir}/yarn-nm-recovery
    • yarn.timeline-service.leveldb-timeline-store.path = ${hadoop.tmp.dir}/yarn/timeline
  • in mapred-site.xml:
    • mapreduce.cluster.local.dir = ${hadoop.tmp.dir}/mapred/local
    • mapreduce.jobtracker.system.dir = ${hadoop.tmp.dir}/mapred/system
    • mapreduce.jobtracker.staging.root.dir = ${hadoop.tmp.dir}/mapred/staging
    • mapreduce.cluster.temp.dir = ${hadoop.tmp.dir}/mapred/staging
    • mapreduce.jobhistory.recovery.store.fs.uri = ${hadoop.tmp.dir}/mapred/history/recoverystore

Most of these directories store temporary intermediate data or are related to the MapReduce framework. Since we won't support MapReduce in phase one, it is probably OK to keep the default values for these properties.

Proposal

# in hadoop/Dockerfile
VOLUME /data

# in hadoop/hdfs-site.xml
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/dfs/name</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///data/dfs/namesecondary</value>
  </property>

To format the namenode (run this only once):

docker run --rm -v /hadoop-data:/data hadoop hdfs namenode -format

To run the namenode:

docker run --rm -v /hadoop-data:/data --name hdfs-namenode hadoop /usr/local/hadoop/sbin/hadoop-daemon.sh start namenode
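The remaining HDFS daemons could be started the same way against the same host directory (a sketch only, reusing the hadoop image name and /hadoop-data path from above; the container names are just examples):

# Secondary namenode and datanode, sharing the same host directory as the namenode
docker run --rm -v /hadoop-data:/data --name hdfs-secondarynamenode hadoop /usr/local/hadoop/sbin/hadoop-daemon.sh start secondarynamenode
docker run --rm -v /hadoop-data:/data --name hdfs-datanode hadoop /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode

Since all three daemons write under /data, everything that needs to survive a container rebuild would end up in /hadoop-data on the host.
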
@davidonlaptop (Member, Author) commented:

First approach: VOLUME instruction

I've experimented a bit with the VOLUME instruction (in the Dockerfile), and here's how it works:

  1. When the container is created, two read-write layers are created: one for the container data (as usual), and one for the volume.
  2. Another container on the same host can share that container's volume by using docker run --volumes-from.
  3. The original container can be safely destroyed, and the volume won't be deleted.
  4. When all containers using the volume are deleted, the volume becomes unavailable.

As stated in the docs, if one deletes the container and forgets to use docker rm -v, the volume data is NOT deleted:
If you remove containers without using the -v option, you may end up with "dangling" volumes; volumes that are no longer referenced by a container. Dangling volumes are difficult to get rid of and can take up a large amount of disk space. We're working on improving volume management and you can check progress on this in pull request moby/moby#8484

To work around this limitation, Docker recommends using the Data Volume Container pattern, which consists of creating a container for the sole purpose of keeping a reference to the volume layer.
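
For reference, a minimal sketch of that pattern applied to our image (the hdfs-data container name is hypothetical; it assumes the Dockerfile above with VOLUME /data):

# Container whose sole purpose is to keep a reference to the /data volume
docker create -v /data --name hdfs-data hadoop /bin/true
# Daemons then borrow the volume instead of declaring their own
docker run --rm --volumes-from hdfs-data --name hdfs-namenode hadoop /usr/local/hadoop/sbin/hadoop-daemon.sh start namenode

As long as the hdfs-data container exists, the volume stays referenced and never becomes dangling.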

Approach 2: Mounting a volume from the host

An alternative would be to skip the VOLUME instruction entirely and instead mount a volume from the host when creating the container, using docker run -v /path-on-the-host:/path-in-the-container.

When using this approach, Docker will not create an additional layer, and the data can still be shared among containers on the same host. It is probably also more efficient in terms of performance.

The only benefit we lose compared to the first approach is that the VOLUME instruction is quite explicit: it helps inform users which directories need persistence.
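
One nice property of the host mount is easy to check: the data survives the container (a quick sketch, assuming the proposal's layout where the namenode metadata lives under /data/dfs/name):

# Stop the namenode container (with --rm it also removes itself)
docker stop hdfs-namenode
# The formatted metadata is still on the host and will be reused by the next container
ls /hadoop-data/dfs/name/current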

Approach 3: best of both worlds

It turns out that we can use the VOLUME instruction in the Dockerfile AND override it at container creation time using the '-v' parameter of docker run. When doing so, Docker will NOT create a layer for the volume, since it can use the mount point from the host.

And the Data Volume Container pattern from the first approach remains available to users who want it.

I've validated my findings with docker inspect.
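
For anyone who wants to repeat the check, inspecting the container shows where each volume actually comes from (container name taken from the proposal above; the exact field names in the output depend on the Docker version):

# With -v /hadoop-data:/data the volume maps to the host path;
# without it, it maps to a directory managed by Docker under /var/lib/docker.
docker inspect hdfs-namenode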

@flangelier (Contributor) commented:

That's implemented, using /hdfs-data
