0.7 Openstack Platform Deployment
TAP recommends using Mirantis OpenStack 7.0 for deployments.
Note: TAP installation requires Internet connectivity.
Hardware recommendations:
- 1x Controller node: 2 CPUs with 6 cores, 24 GB of RAM, 1 TB RAID1
- 1x Storage node: 1 CPU with 4 physical cores, 12 GB of RAM, 500 GB RAID1
- 1x fuel server: Quad-core CPU, 4 GB RAM, 1 Gbps Ethernet, 128 GB SAS Disk, IPMI access through independent management network.
- 6x compute node (each): Dual-socket CPU with at least 4 physical cores per socket, 64 GB RAM, 256 GB SSD
For VLAN networking 2 NICs are recommended, when using VXLAN only 5 are recommended.
Additional prerequisites for Hybrid deployments can be found here: 0.7 Openstack Hybrid Prerequisites
Configuration recommendations:
- X-Auth-Token should be valid for 24h:
  - Log in to the controller node as root.
  - Edit /etc/keystone/keystone.conf.
  - Find the [token] section.
  - Change expiration = 3600 to expiration = 86400.
  - Restart the apache2 service (if your controller runs on Ubuntu) or the httpd service (if your controller runs on CentOS).
- Nova should use LVM-type storage for VMs. (See: Nova configuration.)
- The Cloudera instances flavor must be set to at least m1.large; deployment with m1.medium instances fails due to lack of RAM.
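The token lifetime change described above can also be scripted. The following is a minimal sketch, not an official TAP tool: the function name set_token_expiration is hypothetical, and it assumes a sed that supports -E and -i with a backup suffix (GNU and BSD sed both do). It rewrites only the expiration key inside the [token] section and leaves a .bak copy of the original file.

```shell
# Sketch: set the Keystone token lifetime to 24h (86400 seconds).
# Usage: set_token_expiration /path/to/keystone.conf
set_token_expiration() {
    conf_file=$1
    # Limit the substitution to the lines between "[token]" and the next
    # section header; an optional leading "#" (commented default) is accepted.
    sed -E -i.bak \
        '/^\[token\]/,/^\[/ s/^#?[[:space:]]*expiration[[:space:]]*=.*/expiration = 86400/' \
        "$conf_file"
}
```

On the controller this would be called as set_token_expiration /etc/keystone/keystone.conf, followed by the apache2 or httpd restart listed above. Trying it on a copy of the file first is a reasonable precaution.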
Create a Stack:
1. Download the Heat template for the stack. Use TAP-FullVM.yaml for a Full VM type install, or TAP-Hybrid.yaml for a Hybrid type install.
2. Log in to the OpenStack Horizon web UI as admin.
3. Create a new OpenStack project (Identity -> Projects -> Create Project) and set the quotas for Volumes, Vol Snapshots, Total size of Vols and Security Groups to "-1".
4. Create a new OpenStack user (Identity -> Users -> Create User) and grant it admin rights to the project just created.
5. Log out from Horizon and log in as the newly created user.
6. Switch the UI context to the project just created (top bar drop-down menu).
7. Import an SSH key pair (Project -> Compute -> Access & Security -> Key Pairs -> Import Key Pair).
8. Allocate and note down a Floating IP (Project -> Compute -> Access & Security -> Floating IPs -> Allocate IP To Project). Use it to register a wildcard DNS A record for the TAP Domain.
9. Note down the API URL (Project -> Compute -> Access & Security -> API Access -> Identity).
10. Launch a Stack (Project -> Orchestration -> Stacks -> Launch Stack):
    - Provide the template file as Template Source.
    - Increase the timeout to 300 minutes.
    - Set the OpenStack identity API URL to the API URL noted down earlier.
    - Set Public IP to the Floating IP noted down earlier.
    - If you are behind an HTTP proxy and your Floating IP is accessed directly, put the previously registered TAP Domain into the No Proxy list. Likewise, if your OpenStack Horizon address is accessed directly, put the Horizon IP (as in the API URL) into the No Proxy list.
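For the No Proxy entry, the Horizon IP can be derived from the API URL noted earlier. The sketch below is only an illustration: the function name no_proxy_list, the example domain and the example address are hypothetical, and the comma separator is an assumption about what the No Proxy field accepts.

```shell
# Sketch: build a No Proxy value from the TAP domain and the API URL host.
no_proxy_list() {
    tap_domain=$1
    api_url=$2
    host=${api_url#*://}    # drop the scheme, e.g. "http://"
    host=${host%%[:/]*}     # drop the port and the path
    printf '%s,%s\n' "$tap_domain" "$host"
}

# Placeholder values; substitute your own domain and API URL.
no_proxy_list "tap.example.com" "http://192.0.2.10:5000/v2.0"
# prints: tap.example.com,192.0.2.10
```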
- When the stack is created, log in to the Jump Box instance using SSH with the key you've chosen:
  ssh ubuntu@<jumpbox_server_ip> -i <ssh_key.pem>
- Run a shell script to finish the installation:
  - with Kerberos disabled:
    curl https://s3.amazonaws.com/trustedanalytics/tqd.sh | sudo -i bash
  - with Kerberos enabled:
    curl https://s3.amazonaws.com/trustedanalytics/tqd.sh | sudo -i KERBEROS_ENABLED=True bash
The whole deployment process should take from 2 to 5 hours. Once the process is complete (the script finishes without reporting a failure), you can access the TAP console at https://console.DOMAIN_NAME_YOU_CHOSE and log in with the username admin and the password shown in the Horizon UI for your OpenStack project (Project -> Orchestration -> Stacks -> (choose stack) -> Overview -> Outputs/password).
To access individual VMs, SSH into the Jump Box machine using the procedure from the "Accessing installation logs" section below.
Accessing installation logs:
- Go to the OpenStack Horizon UI Stacks tab (Project -> Orchestration -> Stacks -> (choose stack) -> Overview) and get the Jump Box IP address.
- SSH to the instance as the user ubuntu with the key provided during the installation:
  ssh ubuntu@<jumpbox_server_ip> -i <ssh_key.pem>
- Search /var/log/ansible.log for failed steps.
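Searching the log by hand can be tedious because the failing line is often far below the task header. A minimal sketch of one way to do it follows, assuming the log uses the "TASK [...]", "RUNNING HANDLER [...]" and "fatal: ... FAILED!" markers seen in Ansible output; the function name show_failed_steps is hypothetical.

```shell
# Sketch: print each failed Ansible step together with the task or handler
# that was running when it failed.
show_failed_steps() {
    log_file=${1:-/var/log/ansible.log}
    awk '
        # Remember the most recent task or handler header.
        /TASK \[|RUNNING HANDLER \[/ { header = $0 }
        # On a failure line, print the remembered header and the failure.
        /fatal: .*FAILED!/           { print header; print $0 }
    ' "$log_file"
}
```

Run as root on the Jump Box, show_failed_steps with no argument reads /var/log/ansible.log; a different path can be passed as the first argument.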
- Log in to the Jump Box instance using SSH with port forwarding set up to the cdh-master-2 machine:
  ssh ubuntu@<JumpBoxEIP> -L 7180:cdh-master-2:7180 -i <key.pem>
- You should be able to access the CDH Manager web UI via http://localhost:7180
- Check if OpenStack X-Auth-Token can be valid for 24h (see: Configuration recommendations)
- Check installation logs
- If installation failed on TASK [tap : command], the log should look like this:
2016-06-23 14:22:26,688 p=25908 u=root | TASK [tap : command] ***********************************************************
2016-06-23 14:22:27,302 p=25908 u=root | changed: [localhost] => (item=cf)
2016-06-23 14:22:27,314 p=25908 u=root | RUNNING HANDLER [tap : Deploy] *************************************************
2016-06-23 14:40:21,500 p=25908 u=root | fatal: [localhost]: FAILED! => {"changed": true, "cmd": "bosh --no-color -n deploy", "delta": "0:17:54.0847
02", "end": "2016-06-23 14:40:21.476447", "failed": true, "rc": 1, "start": "2016-06-23 14:22:27.391745", "stderr": "Acting as user 'admin' on deploy
ment 'cf' on 'WHOOP-8040-bosh'", "stdout": "Getting deployment properties from director...\nUnable to get properties list from director, trying witho
ut it...\nCannot get current deployment information from director, possibly a new deployment\n\nDeploying\n---------\n\nDirector task 5\n Started pr
eparing deployment > Preparing deployment. Done (00:00:00)\n\n Started preparing package compilation > Finding packages to compile. Done (00:00:00)\
n\n Started compiling packages\n Started compiling packages > ruby-2.1.6-intel/c10a92eb4684b9bbcd4b5eaa9b1c485ff10be0fd\n Started compiling packag
es > rootfs_cflinuxfs2/cbcda034c2bc785743c64ad4bbf689e5c96f09ed\n Started compiling packages > buildpack_binary/e0c8736b073d83c2459519851b5736c28831
1d92\n Started compiling packages > common-intel/7c774db615d36d85f4d905736833e7432524d567. Done (00:02:47)\n Started compiling packages > buildpack
_staticfile/f79dd915e8ee73b297ef3ae1d85b2895d5f5c106. Done (00:00:04)\n Started compiling packages > buildpack_php/a777948f80
125ad5ea3. Done (00:00:22)", " Started compiling packages > buildpack_go/9a0f49a47e179202fa04fe2ca39e5cb87110d570", " Done compiling packages >
buildpack_php/a777948f80667960f6bb8693253e59e888adf3e6 (00:00:40)", " Started compiling packages > buildpack_nodejs/a55b6669b5138c9d90720dd2dc678de4
8955560d. Done (00:00:37)", " Started compiling packages > buildpack_ruby/03b4c6236d1e663c05f5985fe964dd9262cdf2db", " Done compiling packages >
buildpack_go/9a0f49a47e179202fa04fe2ca39e5cb87110d570 (00:00:55)", " Started compiling packages > buildpack_java_offline/b13deaa98addc5d157885c8ec3
aad4df6640873f", " Done compiling packages > buildpack_ruby/03b4c6236d1e663c05f5985fe964dd9262cdf2db (00:00:36)", " Started compiling packages >
buildpack_java/b91bbdcc9fbe4d774ab47f4ded312151e741cb2a", " Done compiling packages > buildpack_java_offline/b13deaa98addc5d157885c8ec3aad4df664
0873f (00:01:12)", " Started compiling packages > nginx/bf3af6163e13887aacd230bbbc5eff90213ac6af", " Done compiling packages > buildpack_java/b9
1bbdcc9fbe4d774ab47f4ded312151e741cb2a (00:00:43)", " Started compiling packages > ruby-2.2.3/b1320e11c7ad997a68103042d3d1c38270309387", " Done
compiling packages > nginx/bf3af6163e13887aacd230bbbc5eff90213ac6af (00:00:33)", " Started compiling packages > mysqlclient-5.5/c97be6846302ac67d8ad
54ef08de4f741f8253ea. Done (00:00:04)", " Started compiling packages > libpq/e9383da451434bed183824a28693268596f7a578. Done (00:00:21)", " Started
compiling packages > postgres-9.4.2/ac1c8a521594f9459ffede25b9d7e0308811f139. Done (00:03:09)", " Started compiling packages > postgres/b63fe0176a93
609bd4ba44751ea490a3ee0f646c. Done (00:00:08)", " Started compiling packages > debian_nfs_server/aac05f22582b2f9faa6840da056084ed15772594. Done (00:
00:05)", " Started compiling packages > etcd-common/a5492fb0ad41a80d2fa083172c0430073213a296. Done (00:00:04)", " Started compiling packages > ruby
-2.1.7/c977026b967eab6fad2b03a820dca6f84a900f92", " Failed compiling packages > rootfs_cflinuxfs2/cbcda034c2bc785743c64ad4bbf689e5c96f09ed: Timed o
ut pinging to db7db92c-5289-4779-ab58-acbf81259b9e after 600 seconds (00:11:48)", " Done compiling packages > ruby-2.1.6-intel/c10a92eb4684b9bbcd
4b5eaa9b1c485ff10be0fd (00:14:39)", " Done compiling packages > ruby-2.2.3/b1320e11c7ad997a68103042d3d1c38270309387 (00:10:07)", " Done compi
ling packages > ruby-2.1.7/c977026b967eab6fad2b03a820dca6f84a900f92 (00:07:52)", "", "Error 450002: Timed out pinging to db7db92c-5289-4779-ab58-acbf
81259b9e after 600 seconds", "", "Task 5 error", "", "For a more detailed error report, run: bosh task 5 --debug"], "warnings": []}
2016-06-23 14:40:21,505 p=25908 u=root | NO MORE HOSTS LEFT *************************************************************
2016-06-23 14:40:21,509 p=25908 u=root | to retry, use: --limit @/root/.ansible/pull/jump-box.novalocal/local.retry
2016-06-23 14:40:21,509 p=25908 u=root | PLAY RECAP *********************************************************************
2016-06-23 14:40:21,510 p=25908 u=root | localhost : ok=80 changed=38 unreachable=0 failed=1
You can recover from this error by running the following commands on the Jump Box:
Note: You need to be logged in to the Jump Box as the root user (use sudo -i to switch from ubuntu to root).
Note: Run these commands under tmux or screen, as they can take a long time.
bosh deployment cf.yml
bosh -n deploy
ℹ️ Information: This task can take a long time; repeat in case of failure.
bosh deployment docker-broker.yml
bosh -n deploy
ℹ️ Information: This task can take a long time; repeat in case of failure.
/tmp/cloudfoundry.sh
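Since the deploy steps may need repeating, a small retry wrapper can save manual re-runs. This is a generic sketch, not part of TAP: the function name retry and the RETRY_DELAY variable are hypothetical, and the attempt count in the usage note is an arbitrary choice.

```shell
# Sketch: retry MAX CMD [ARGS...]
# Runs CMD until it succeeds, at most MAX times, pausing between attempts.
retry() {
    max=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Delay between attempts in seconds (default 30).
        sleep "${RETRY_DELAY:-30}"
    done
}
```

With this in place, the deploy command above could be run as retry 3 bosh -n deploy after selecting the deployment with bosh deployment cf.yml.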
- Log in to the Jump Box instance using SSH with port forwarding set up to the cdh-master-2 machine:
  ssh ubuntu@<jumpbox_server_ip> -i <ssh_key.pem> -L 7180:cdh-master-2:7180
- You should be able to access the CDH Manager web UI via http://localhost:7180
- Log in to the Jump Box and switch to the root account:
  ssh ubuntu@<jumpbox_server_ip> -i <ssh_key.pem>
  sudo -i
- Clear extra routes in the OpenStack router:
  router_id=$(awk -F = '/router_id/ { print $2 }' /etc/ansible/hosts)
  neutron --insecure --os-cloud TAP router-update ${router_id} --routes action=clear
- Delete the docker-broker deployment:
  bosh delete deployment docker-broker
- Delete the cf deployment:
  bosh delete deployment cf
  ℹ️ Information: This task can take a long time; repeat in case of failure.
- Delete the BOSH director:
  cd /root/<deployment-name>-bosh/
  bosh-init delete bosh.yml
- Log in to the Horizon UI:
  - Go to the Stacks list (Project -> Orchestration -> Stacks).
  - Delete your Stack (select it and click the red button in the upper right corner).
  - After the Stack is deleted, clean up any leftover volumes (Project -> Compute -> Volumes).
  - Remove BOSH stemcells: go to Project -> Compute -> Images and delete all images whose names start with BOSH.
- Optional (skip this step if in doubt):
  - Release the Floating IP (Project -> Compute -> Access & Security -> Floating IPs).
  - Delete your SSH key pair (Project -> Compute -> Access & Security -> Key Pairs).
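If there are many stemcell images, picking them out of a listing can be scripted. The sketch below only filters text: it reads "ID NAME" lines on standard input and prints the IDs whose name starts with BOSH. The function name bosh_image_ids is hypothetical, and the listing command in the comment assumes python-openstackclient is installed and configured, which this guide does not cover.

```shell
# Sketch: print the IDs of images whose name starts with "BOSH".
# Input: one image per line, ID first, then the name.
bosh_image_ids() {
    awk '$2 ~ /^BOSH/ { print $1 }'
}

# Assumed usage (not part of the original guide):
#   openstack image list -f value -c ID -c Name | bosh_image_ids
```

Review the printed IDs before deleting anything, since the filter matches on the name prefix only.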
- Log in to the Jump Box instance using SSH:
  ssh ubuntu@<JumpBoxEIP> -i <key.pem>
- Run a shell script to apply the update:
  - with Kerberos disabled:
    curl https://s3.amazonaws.com/trustedanalytics/update_v0.7.0_to_v0.7.1.sh | sudo -i bash
  - with Kerberos enabled:
    curl https://s3.amazonaws.com/trustedanalytics/update_v0.7.0_to_v0.7.1.sh | sudo -i KERBEROS_ENABLED=True bash
For details of what the upgrade contains, please refer to the 0.7.1 Release Notes.
Q: Environment Source? (File? Direct Input?)
A: Do not use this field.
Q: Stack Name? Are there recommendations for this?
A: Whatever suits you. A good practice is to use the same name as the tenant name.
Q: TAP Domain? Should this be the same as the API URL?
A: No. Register a wildcard domain and point it to the Floating IP allocated in step 8 of "Create a Stack".
Q: Where can I get an Ubuntu VM image?
A: You can use the upstream Ubuntu 14.04 image from here: https://cloud-images.ubuntu.com/trusty/current/ . Ask your OpenStack administrator how to import images.
Q: Where can I get a CentOS VM image?
A: Use: https://s3-us-west-1.amazonaws.com/openstack-images-dp2/centos-6-x86_64.qcow2 . Ask your OpenStack administrator how to import images.
Q: (Hybrid Install Only) Cloudera Servers CIDR? It asks for the subnet in CIDR format; an example would be nice.
A: Ex: 1.1.1.0/24
Recommended reading: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing
Q: (Hybrid Install Only) Cloudera Masters? This might be intuitive for some, but a bit more info and/or an example would be nice.
A: Ex: 1.1.1.1,1.1.1.2,1.1.1.3 <- a list of IPs of the servers dedicated to Cloudera Masters. The last one will also be the Manager.
Q: (Hybrid Install Only) Cloudera Workers? Same as Master.
A: Ex: 1.1.1.4,1.1.1.5,1.1.1.6 <- same as above, but for Workers.
Q: (Hybrid Install Only) Cloudera Storage Path? Same as Master.
A: If your servers have storage mounted as /tproot/data1,/tproot/data2,/tproot/data3, you should enter the storage paths like this (without the leading slash): tproot/data1,tproot/data2,tproot/data3
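The storage path conversion in the last answer amounts to stripping the leading slash from each entry. A minimal sketch, with the hypothetical function name to_storage_paths:

```shell
# Sketch: turn a comma-separated list of mount points into the form
# expected by the Cloudera Storage Path field (no leading "/").
to_storage_paths() {
    printf '%s\n' "$1" | sed -e 's|^/||' -e 's|,/|,|g'
}

to_storage_paths "/tproot/data1,/tproot/data2,/tproot/data3"
# prints: tproot/data1,tproot/data2,tproot/data3
```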