Ogs
Open Grid Scheduler (OGS), previously known as Sun Grid Engine, is an open source scheduler we can use to run user jobs on the cluster. The OGS documentation is straightforward and is the recommended primary source of information if you run into a problem with these directions.
Whereas we built RPMs for the other scheduler, OGS will be installed into a shared directory. Some prefer /software, some prefer /opt, and some choose something else. If you don't have a partition for software (we didn't create one for this guide), use /home/software. Putting it under /home means it is already exported to all the nodes, is backed up, and is easily available.
$ sudo yum groupinstall "development tools"
$ sudo yum install csh pam-devel libXt-devel openmotif-devel libXpm-devel ncurses-devel
$ sudo mkdir -p /home/software/ogs
Now create a system user for ogs and give it ownership of the install directory.
$ sudo adduser ogs --system --home-dir /home/software/ogs --no-create-home
$ sudo chown -R ogs:ogs /home/software/ogs/
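Before moving on, it can be worth confirming that the ownership change took effect. The following is a minimal sketch, assuming GNU `stat` (as shipped with coreutils on yum-based systems); the helper names `owner_of` and `check_owner` are made up for this illustration.

```shell
# Print the owning user and group of a path as "user:group".
# Assumes GNU stat; the -c format option is not available on BSD stat.
owner_of() {
    stat -c '%U:%G' "$1"
}

# Compare the actual owner of a path ($1) with an expected "user:group" ($2).
check_owner() {
    actual=$(owner_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 is owned by $2"
    else
        echo "mismatch: $1 is owned by $actual, expected $2"
        return 1
    fi
}

# Example call for this guide's layout:
#   check_owner /home/software/ogs ogs:ogs
```

If the check reports a mismatch, re-running the chown command above is the likely fix.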
As your regular user, download the latest version of OGS.
Create the Code directory (if not already created), extract the downloaded tar.gz, and change into the directory.
$ mkdir -p ~/Code/ && cd ~/Code
$ tar -zxvf ~/Download/GE2011.11p1.tar.gz
$ cd ~/Code/GE2011.11p1
Now we will just follow the directions as given on the website.
$ ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump -only-depend
$ ./scripts/zerodepend
$ ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump depend
$ ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump
Set SGE_ROOT in your own shell. export is a shell builtin, so it cannot be run through sudo.
$ export SGE_ROOT=/home/software/ogs
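SGE_ROOT has to be exported from the shell you are working in, because only exported variables are handed to child processes. The sketch below shows the difference between a plain assignment and an exported one; the helper name `show_child_sge_root` is made up for this illustration. Note that sudo typically resets the environment on top of this, so an exported variable still needs to be passed through explicitly.

```shell
# Print the value of SGE_ROOT as seen by a child process.
show_child_sge_root() {
    sh -c 'echo "${SGE_ROOT:-<unset>}"'
}

unset SGE_ROOT
SGE_ROOT=/home/software/ogs   # plain assignment: visible to this shell only
show_child_sge_root           # prints <unset>

export SGE_ROOT               # mark the variable for export to child processes
show_child_sge_root           # prints /home/software/ogs
```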
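Answer Y to the question asked by the following command. The -E is important: it passes your environment variables (including SGE_ROOT) through to sudo.
$ sudo -E scripts/distinst -all -local -noexit
The installer scripts are run from the SGE_ROOT directory.
$ cd $SGE_ROOT
$ sudo -E ./install_qmaster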
Hit Enter to continue.
Hit y, yes we want to use a system user.
Enter 'ogs' then enter again to confirm.
Hit enter again to accept the SGE_ROOT of /home/software/ogs.
Enter again to accept default port.
Enter again for sge_qmaster
Enter again to accept default port.
Enter again for sge_execd
Enter again to keep the default name (you don't want to change this unless you know why you are changing it).
Give your cluster a name: TestCluster
Enter to confirm.
Enter to keep the default qmaster spool directory.
Select whether you are going to have Windows execution hosts. For this example, no.
File permissions, no we didn't install from a package.
Yes, verify and set. Enter.
Are all nodes in the cluster the same domain? Yes.
Enable JMX MBean server? No. Enter.
Setup Spooling. Enter.
Group ID Range. Enter to accept default.
Basic cluster configuration: Enter to accept default.
Enter email of admin.
Change configuration? no.
Qmaster startup script. Yes.
Add Grid Engine hosts. These can be added, changed, or removed later. Use a file? n
Hosts: Add hosts like this: node01 node02 node03 node04. Then hit Enter. Twice.
Add shadow host now? No.
Create default queues? Enter for default.
Scheduler Tuning: Default unless you know you want/need another option.
Make a note of the path it gives and hit Enter.
Enter to go past the message.
Enter to leave setup.
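The installer expects the host list as space-separated names on one line. For clusters with more than a handful of nodes, that line can be generated instead of typed. This is a minimal sketch, assuming your nodes follow the two-digit naming pattern used in this guide (node01, node02, ...); the helper name `make_host_list` is made up for this illustration.

```shell
# Build a space-separated host list from a name prefix and a node count.
# Assumes zero-padded, two-digit host numbers (node01 ... node99).
make_host_list() {
    prefix=$1
    count=$2
    i=1
    list=""
    while [ "$i" -le "$count" ]; do
        list="$list$(printf '%s%02d' "$prefix" "$i") "
        i=$((i + 1))
    done
    # Strip the trailing space before printing.
    printf '%s\n' "${list% }"
}

make_host_list node 4   # prints: node01 node02 node03 node04
```

The output can be pasted at the "Hosts:" prompt.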
$ . /home/software/ogs/default/common/settings.sh
Now we verify that it works.
$ qconf -sh
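`qconf -sh` prints the administrative hosts, one per line. To confirm that every node you entered during the install made it into that list, the output can be compared with the names you expect. A minimal sketch follows; the helper name `missing_hosts` is made up for this illustration, and it assumes the names you pass match what qconf prints exactly (for example, short names versus fully qualified names).

```shell
# Read a host list (one name per line) on stdin and print every expected
# host, given as arguments, that does not appear in it.
missing_hosts() {
    listed=$(cat)
    for host in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        printf '%s\n' "$listed" | grep -qxF "$host" || printf '%s\n' "$host"
    done
}

# Intended use on the qmaster host:
#   qconf -sh | missing_hosts node01 node02 node03 node04
# Demonstration with canned input in place of qconf output:
printf 'node01\nnode03\n' | missing_hosts node01 node02 node03   # prints: node02
```

No output means all expected hosts are listed.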
Now the following needs to be run on the nodes. This can be done manually, but we can script this as well.
$ sudo -E ./install_execd
Enter to go past the opening banner.
Enter to select the directory. Enter to confirm.
Enter to use the default Engine cell. Enter for default port.
Check host is admin. Enter.
Configure other spool? No.
Create local config. Enter.
Create startup script? Yes. Enter.
Then after it starts, Enter again.
Add host to a queue? y
Enter to pass by the message.
Enter to confirm you saw the message.
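To script the per-node install, one option is to loop over the nodes with ssh. The sketch below only prints the commands so they can be reviewed first. It assumes the node names used earlier in this guide, that SGE_ROOT is /home/software/ogs on every node (it is shared through /home), and that you can ssh to each node and use sudo there. The helper name `execd_install_cmd` is made up for this illustration. `ssh -t` requests a terminal, which the interactive installer and sudo need.

```shell
# Install location, shared to all nodes through /home (assumption from this guide).
SGE_ROOT_DIR=/home/software/ogs

# Print the ssh command that would run the execd installer on one node.
execd_install_cmd() {
    printf 'ssh -t %s "export SGE_ROOT=%s && cd %s && sudo -E ./install_execd"\n' \
        "$1" "$SGE_ROOT_DIR" "$SGE_ROOT_DIR"
}

# Dry run: print one command per node instead of executing it.
for node in node01 node02 node03 node04; do
    execd_install_cmd "$node"
done
```

Each printed line can be run by hand, one node at a time, answering the prompts listed above.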