forked from cwensel/cascading
Thanks for using Cascading.

General Information:

  Project and contact information: http://www.cascading.org/

  This distribution includes four Cascading jar files:

  cascading-x.y.z.jar      - all relevant Cascading class files and libraries, with a 'lib' folder
  cascading-core-x.y.z.jar - all Cascading Core class files
  cascading-xml-x.y.z.jar  - all Cascading XML operations class files
  cascading-test-x.y.z.jar - all Cascading tests and test utilities

Building:

  To build Cascading:

  > cd <path to cascading>
  > ant -Dhadoop.home=<path to hadoop> compile

  To make all jars:

  > ant -Dhadoop.home=<path to hadoop> jar

  To run all tests:

  > ant -Dhadoop.home=<path to hadoop> test

  where <path to cascading> is the directory created after cloning or uncompressing the Cascading
  distribution, and <path to hadoop> is where you installed Hadoop.

  Note that ant will not interpret the ~ path; use ${user.home} instead. For example:

    -Dhadoop.home=${user.home}/hadoop
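
  One practical wrinkle, assuming a POSIX-style shell such as bash: the shell treats ${...} as
  its own variable syntax, so an unquoted ${user.home} is likely to be rejected by the shell
  before ant ever sees it. A minimal sketch of passing the text through literally with single
  quotes (the variable name 'arg' is only illustrative):

```shell
# Single quotes stop the shell from interpreting ${user.home}; the text
# reaches ant unchanged, and ant expands the property itself.
arg='-Dhadoop.home=${user.home}/hadoop'

# Show exactly what ant would receive as its argument.
printf '%s\n' "$arg"

# The build would then be invoked as, for example:
#   ant "$arg" compile
```

  Other shells and Windows command prompts have different quoting rules, so treat this as a
  starting point rather than a universal recipe.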

  Alternatively, you can set hadoop.home in the build.properties file in the cascading project directory.
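
  As a rough sketch of the build.properties approach, the following appends a hadoop.home
  entry to that file. The CASCADING_DIR and HADOOP_DIR variables and the $HOME/hadoop default
  are assumptions for illustration, not names used by the Cascading build:

```shell
# Hypothetical locations; adjust both to your own setup.
CASCADING_DIR="${CASCADING_DIR:-.}"
HADOOP_DIR="${HADOOP_DIR:-$HOME/hadoop}"

props_file="$CASCADING_DIR/build.properties"

# Append rather than overwrite, so any existing settings in the file survive.
printf 'hadoop.home=%s\n' "$HADOOP_DIR" >> "$props_file"

# Show the resulting entry.
grep '^hadoop.home=' "$props_file"
```

  With the entry in place, the -Dhadoop.home argument can be dropped from the ant commands above.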

Using:

  To use with Hadoop, we suggest placing the cascading-core and cascading-xml jar files, and all third-party libs,
  into the 'lib' folder of your job jar and executing via 'hadoop jar your.jar <your args>'.

  For example, your job jar would look like this (via: jar -tf your.jar)

    /<all your class and resource files>
    /lib/cascading-core-x.y.z.jar
    /lib/cascading-xml-x.y.z.jar
    /lib/<cascading third-party jar files>

  Hadoop will unpack the jar locally and remotely (in the cluster) and add any libraries in 'lib' to the classpath.
  This is a feature specific to Hadoop.
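
  The layout described above can be sketched by staging a directory tree before packing it.
  This is only an illustration: the empty files below are placeholders standing in for real
  class files and jars, and 'Main.class' is a hypothetical name.

```shell
# Stage a job-jar layout in a scratch directory.
stage="$(mktemp -d)"
mkdir -p "$stage/lib"

# Placeholders for your compiled classes and the Cascading jars.
touch "$stage/Main.class"
touch "$stage/lib/cascading-core-x.y.z.jar"
touch "$stage/lib/cascading-xml-x.y.z.jar"

# List the staged files relative to the staging root.
layout="$(cd "$stage" && find . -type f | sort)"
printf '%s\n' "$layout"

# With a JDK on the PATH, the staged tree could then be packed and inspected:
#   jar cf your.jar -C "$stage" .
#   jar tf your.jar
```

  Third-party jars from the Cascading 'lib' folder would be copied into "$stage/lib" in the same way.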

  The cascading-x.y.z.jar file is typically used with scripting languages and is completely self-contained.

  The following ant snippet builds such a job jar (you may need to override cascading.home):

  <property name="cascading.home" location="${basedir}/../cascading"/>
  <!-- ant properties are set once: if version.properties defines
       cascading.release.version, the x.y.z placeholder below is ignored -->
  <property file="${cascading.home}/version.properties"/>
  <property name="cascading.release.version" value="x.y.z"/>
  <property name="cascading.filename.core" value="cascading-core-${cascading.release.version}.jar"/>
  <property name="cascading.filename.xml" value="cascading-xml-${cascading.release.version}.jar"/>
  <property name="cascading.libs" value="${cascading.home}/lib"/>
  <property name="cascading.libs.core" value="${cascading.libs}"/>
  <property name="cascading.libs.xml" value="${cascading.libs}/xml"/>

  <!-- use cascading.home itself if the core jar is there, else its build directory -->
  <condition property="cascading.path" value="${cascading.home}/"
             else="${cascading.home}/build">
    <available file="${cascading.home}/${cascading.filename.core}"/>
  </condition>

  <property name="cascading.lib.core" value="${cascading.path}/${cascading.filename.core}"/>
  <property name="cascading.lib.xml" value="${cascading.path}/${cascading.filename.xml}"/>

  <target name="jar" depends="build" description="creates a Hadoop ready jar with all dependencies">

    <!-- copy Cascading classes and libraries -->
    <copy todir="${build.classes}/lib" file="${cascading.lib.core}"/>
    <copy todir="${build.classes}/lib" file="${cascading.lib.xml}"/>
    <copy todir="${build.classes}/lib">
      <fileset dir="${cascading.libs.core}" includes="*.jar"/>
      <fileset dir="${cascading.libs.xml}" includes="*.jar"/>
    </copy>

    <jar jarfile="${build.dir}/${ant.project.name}.jar">
      <fileset dir="${build.classes}"/>
      <fileset dir="${basedir}" includes="lib/"/>
      <manifest>
        <attribute name="Main-Class" value="${ant.project.name}/Main"/>
      </manifest>
    </jar>

  </target>
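
  To make the <condition> element in the snippet easier to follow, here is a hedged shell
  rendering of the same decision: use cascading.home itself when the core jar is found there,
  otherwise fall back to its build directory. The function name and the CASCADING_HOME variable
  are hypothetical and not part of the ant build.

```shell
# Shell equivalent of the <condition>/<available> pair in the ant snippet.
# $1 = cascading home directory, $2 = core jar file name.
resolve_cascading_path() {
  home="$1"
  core_jar="$2"
  if [ -e "$home/$core_jar" ]; then
    # Core jar sits in the home directory (an unpacked distribution).
    printf '%s/\n' "$home"
  else
    # Otherwise assume a source checkout whose jars land in build/.
    printf '%s/build\n' "$home"
  fi
}

# Example call with placeholder values.
resolve_cascading_path "${CASCADING_HOME:-../cascading}" "cascading-core-x.y.z.jar"
```

  In the ant build itself the same choice can be steered by overriding cascading.home, as noted above.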