Various helpful scripts to package and deploy an arbitrary Sqoop 2 repository to a CDH cluster (with an associated CM service).
Running these scripts under OS X 10.11 requires a few additional tools:
brew install md5sha1sum
brew install gnu-sed --with-default-names
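Before running anything, it can help to confirm that the tools are actually on the PATH. The following is a minimal sketch, not part of the repository: the function names `missing_tools` and `is_gnu_sed` are hypothetical, and it assumes the scripts rely on commands such as `md5sum`, `sha1sum`, and a GNU-flavoured `sed`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with these scripts.

# Print every command from the argument list that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  local status=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Succeed only when the sed on PATH identifies itself as GNU sed
# (BSD sed, the OS X default, does not accept --version).
is_gnu_sed() {
  sed --version 2>/dev/null | grep -q 'GNU'
}
```

A possible call would be `missing_tools md5sum sha1sum sed || echo "install the tools listed above"`, followed by `is_gnu_sed || echo "sed is not GNU sed"`.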
The general flow is as follows:
- Generate a parcel (= package) for the given repository and branch
- Upload (deploy) the generated parcel to the given CM instance
- Upload (deploy) the CSD to the given CM instance
- Create the Sqoop 2 service in the given CM instance
# Building parcels for upstream bits
./parcel.sh -r https://github.com/apache/sqoop.git -b sqoop2
./deploy-parcels.sh -i id_rsa -h cool.sever.somewhere.org
./deploy-csd.sh -i id_rsa -h cool.sever.somewhere.org
./deploy-service.sh -i id_rsa -h cool.sever.somewhere.org
You still need to use CM to deploy the Sqoop 2 service on the cluster, as this step hasn't been automated yet.
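The four commands above can be chained so that a failure in one step stops the rest. This is a sketch under assumptions rather than a script from the repository: `run_step`, `deploy_all`, and the `DRY_RUN` variable are hypothetical names, and the scripts are assumed to exit non-zero on failure.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the four scripts; not part of the repository.

# Run a command, or only print it when DRY_RUN=1 is set.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Arguments: repository URL, branch, SSH identity file, CM hostname.
# Each step runs only if the previous one succeeded.
deploy_all() {
  local repo="$1" branch="$2" key="$3" host="$4"
  run_step ./parcel.sh -r "$repo" -b "$branch" &&
    run_step ./deploy-parcels.sh -i "$key" -h "$host" &&
    run_step ./deploy-csd.sh -i "$key" -h "$host" &&
    run_step ./deploy-service.sh -i "$key" -h "$host"
}
```

Running `DRY_RUN=1 deploy_all https://github.com/apache/sqoop.git sqoop2 id_rsa cool.sever.somewhere.org` prints the four commands without executing them, which is a cheap way to check the arguments first.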
This section describes the individual scripts that are available in the repository.
Script parcel.sh is responsible for creating parcels with Sqoop 2 bits that can be installed into Cloudera Manager and subsequently distributed across the cluster. The script requires two arguments: -r with the Git repository URL (the repository will be cloned into the working directory) and -b with the branch name inside this repository. The script should work with both upstream (Apache) and downstream (Cloudera) repositories and branches (provided all dependent patches are available there).
# Building parcels for upstream bits
./parcel.sh -r https://github.com/apache/sqoop.git -b sqoop2
All parameters:
- -r: Repository URL (anything that git clone will accept)
- -b: Branch in the repository that we'll use to generate the parcels
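Since both -r and -b are required, a caller may want to validate them before starting a long build. The sketch below shows one way to do that with `getopts`. It is not parcel.sh's actual argument handling, and `parse_parcel_args` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check; parcel.sh itself may parse its options differently.

# Accepts -r <repository URL> and -b <branch>; both are mandatory.
# Prints the parsed values on success, returns 2 on a usage error.
parse_parcel_args() {
  local opt repo="" branch=""
  OPTIND=1  # reset so the function can be called more than once
  while getopts "r:b:" opt; do
    case "$opt" in
      r) repo="$OPTARG" ;;
      b) branch="$OPTARG" ;;
      *) return 2 ;;
    esac
  done
  if [ -z "$repo" ] || [ -z "$branch" ]; then
    echo "usage: -r <repository URL> -b <branch>" >&2
    return 2
  fi
  echo "repo=$repo branch=$branch"
}
```

For example, `parse_parcel_args -r https://github.com/apache/sqoop.git -b sqoop2 && ./parcel.sh -r https://github.com/apache/sqoop.git -b sqoop2` only starts the build when both options are present.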
Script deploy-parcels.sh takes the generated parcels (by default from target/parcel_repo, where parcel.sh writes its output) and uploads them to the given CM host. After the upload, the new parcel is distributed and activated.
# Deploy parcels to given CM instance
./deploy-parcels.sh -h cool.sever.somewhere.org
All parameters:
- -p: Local parcel repository (default is target/parcel_repo)
- -t: Target directory on CM server host where the parcel(s) should be uploaded (default is /opt/cloudera/parcel-repo)
- -u: Username for SSH access to CM server (default is root)
- -w: Password for SSH access to CM server (default is cloudera)
- -h: Hostname of CM server
- -c: Curl compatible login information for CM server (default is admin:admin)
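Before uploading, it may be worth checking that the local parcel repository actually contains something to upload. A minimal sketch follows. `list_parcels` is a hypothetical helper, and it assumes the generated files carry the `.parcel` extension.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes generated parcels are named *.parcel.

# List parcel files in a local parcel repository (default: target/parcel_repo).
# Returns non-zero when the directory holds no parcel at all.
list_parcels() {
  local repo="${1:-target/parcel_repo}"
  local found=1 f
  for f in "$repo"/*.parcel; do
    [ -e "$f" ] || continue  # unmatched glob stays literal; skip it
    echo "$f"
    found=0
  done
  return "$found"
}
```

Something like `list_parcels && ./deploy-parcels.sh -h cool.sever.somewhere.org` would then skip the upload when parcel.sh has not produced any output yet.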
Script deploy-csd.sh builds the CSD (Custom Service Descriptor, the code Cloudera Manager needs to actually manage the Sqoop 2 service) and deploys it to the target CM host. The script restarts CM to force it to load the CSD jar.
# Deploy CSD to given CM instance
./deploy-csd.sh -h cool.sever.somewhere.org
All parameters:
- -t: Target directory on CM server host where the CSD should be uploaded (default is /opt/cloudera/csd)
- -u: Username for SSH access to CM server (default is root)
- -w: Password for SSH access to CM server (default is cloudera)
- -h: Hostname of CM server
- -c: Curl compatible login information for CM server (default is admin:admin)
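Because deploy-csd.sh restarts CM, a follow-up step may need to wait until CM answers again. The sketch below polls with a generic retry loop. `wait_until` and `cm_is_up` are hypothetical names, and the port 7180 and the `/api/version` path are assumptions about the CM installation, not something these scripts document.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the repository.

# Retry a probe command until it succeeds or the attempts run out.
# Arguments: number of attempts, delay in seconds, then the probe command.
wait_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Probe CM over HTTP. Port 7180 and /api/version are assumptions.
cm_is_up() {
  local host="$1" creds="${2:-admin:admin}"
  curl --silent --fail --user "$creds" "http://$host:7180/api/version"
}
```

A call such as `wait_until 30 10 cm_is_up cool.sever.somewhere.org` would give CM up to roughly five minutes to come back before the next script runs.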
Script deploy-service.sh deploys the Sqoop 2 service to the given CM server. If a service with the given name already exists, it is dropped and re-created.
# Create service
./deploy-service.sh -h cool.sever.somewhere.org
All parameters:
- -u: Username for SSH access to CM server (default is root)
- -w: Password for SSH access to CM server (default is cloudera)
- -h: Hostname of CM server
- -c: Curl compatible login information for CM server (default is admin:admin)
- -n: Name for the deployed service (default is Sqoop-2-beta)
- -s: Hostname where the Sqoop 2 Server should be deployed (default is the same value as has been used for -h)
- -y: Name of YARN service that should be used as dependency for newly deployed service (default is 1st YARN service available on the cluster)
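Several of these options have defaults, so a wrapper only needs to pass the ones it wants to override. The sketch below assembles such a command line. `service_cmd` is a hypothetical helper, and the service, host, and YARN names in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper; prints a deploy-service.sh command line.

# Arguments: CM hostname, then optional service name, Sqoop 2 Server host,
# and YARN service name. Empty values are left out so that the script's
# own defaults apply.
service_cmd() {
  local host="$1" name="${2:-}" server="${3:-}" yarn="${4:-}"
  set -- ./deploy-service.sh -h "$host"
  if [ -n "$name" ]; then set -- "$@" -n "$name"; fi
  if [ -n "$server" ]; then set -- "$@" -s "$server"; fi
  if [ -n "$yarn" ]; then set -- "$@" -y "$yarn"; fi
  echo "$*"
}
```

For instance, `service_cmd cool.sever.somewhere.org Sqoop-2-test` yields a command that overrides only the service name and keeps the default server host and YARN dependency.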
Script get-config.sh downloads client configs for the given service. By default it looks for the first YARN service and downloads its configuration files.
# Get client configs
./get-configs.sh -h cool.sever.somewhere.org
All parameters:
- -u: Username for SSH access to CM server (default is root)
- -w: Password for SSH access to CM server (default is cloudera)
- -h: Hostname of CM server
- -y: Name of service for which we need client configs (default is 1st YARN service available on the cluster)