
Welcome to the Cluster Builder Manual!

This manual is licensed under the Creative Commons Attribution-ShareAlike CC BY-SA license

This guide is still under construction. If you run into a problem, please post an issue report and I will fix it as soon as I can. Thanks!

Why build a cluster?

Clusters are a broad subject. Some people need computational power for research or for business, while others might pick up this guide as a way to learn parallel coding, to learn administration, or just for fun. I enjoy building clusters for fun, research, and profit. This guide will walk you through the process of building a cluster from the ground up, including the basics of administration and management. I am in the process of expanding the notes to cover a variety of possibilities and configurations.

Pre-build process

Cluster design.

How you build a cluster can vary wildly depending on how much money you are willing to invest. This guide assumes a smaller cluster on a standard Gigabit network.

The frontend should have at least an 80GB hard drive, but the more room you have for your /home partition the better. Since your users will be logging into the frontend to compile, launch jobs, and do work, it really should have several GB of memory and multiple cores. A section on creating a Login Node for users will come later. The frontend should also have two network cards: the first has access to the public network or the internet, while the second has access to the private network that is reserved specifically for the nodes.

The compute nodes should be as beefy as possible. The right type of node depends entirely on the type of work. Jobs with large data sets may need more memory than processing power, while rendering jobs may need more GPUs than anything else. If you are just doing this to learn, then whatever hardware you have will work. My first personal cluster used ancient hardware because that was all I had access to. My current personal playground consists of clearance-sale refurb boxes from Newegg; they cost me very little and are surprisingly powerful. When building a cluster for a job, tailoring the compute nodes to the job is very important but completely dependent on the job type.

In the 'simple' setup, this guide assumes that you will be exporting your /home directory from the frontend via NFS. This is not your only option. It is not uncommon to have a SAN or NAS on which /home is stored; some build their own while others buy one. That is a more advanced topic and outside the scope of this guide at this time.
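For the simple setup, a minimal sketch of the frontend's /etc/exports might look like the following. The /24 netmask matches the example cluster described below; adjust the export options to taste.

# /etc/exports on the frontend: export /home to the private node network only
/home 10.10.10.0/24(rw,sync,no_root_squash)

$ sudo exportfs -ra        # re-read the export table after editing
$ sudo chkconfig nfs on    # SL6-era init commands
$ sudo service nfs start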

Reliance on Puppet

It used to be that I would pack as much as possible into the node PXE kickstart file so that everything was configured at node install. This led to three recurring problems:

  • A very unwieldy kickstart file.
  • If nodes drifted too far apart for some reason, I was practically forced to re-kickstart them in order to bring them back in line.
  • If I made a change to the cluster nodes but forgot to apply the change to the kickstart file, I would frequently find myself fighting a battle I had previously fought and conquered.

This is not a workable method. Enter Puppet. I don't really care which configuration management tool you use (Puppet, CFEngine, Chef, Salt, etc.), but managing your environment with one tool makes life SO much easier and better. I solve a problem, build a solution, put it into Puppet, and I don't worry about solving that problem again. It has made a world of difference. The guide started with minimal use of Puppet, but I am quickly using it more and more. I will be updating this guide to use Puppet because it is the tool I found most useful.
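To give a flavor of what this looks like, here is a minimal illustrative Puppet class; the class name, config file, and module path are made up for this sketch. It keeps ntpd installed, configured from a file the puppet master controls, and running.

# Illustrative only: keep ntpd installed, configured, and running on every node.
class cluster::ntp {
  package { 'ntp':
    ensure => installed,
  }
  file { '/etc/ntp.conf':
    ensure  => file,
    source  => 'puppet:///modules/cluster/ntp.conf',
    require => Package['ntp'],
  }
  service { 'ntpd':
    ensure    => running,
    enable    => true,
    subscribe => File['/etc/ntp.conf'],
  }
}

Change ntp.conf on the puppet master and every node picks it up on its next Puppet run; no re-kickstarting required.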

Guide example cluster

This guide assumes the following setup.

Internet <-> Frontend01 <-> nodes

The frontend is known as frontend01.cluster.domain and it will have a public side address of 192.168.1.201 and a private side address of 10.10.10.10.
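On a Red Hat style system the two interfaces might be configured like this. This is only a sketch: the device names and /24 netmasks are assumptions for the example cluster.

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- public side
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.201
NETMASK=255.255.255.0

# /etc/sysconfig/network-scripts/ifcfg-eth1 -- private side for the nodes
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.10.10.10
NETMASK=255.255.255.0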

The nodes are known as node01, node02, node03, and node04.
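A simple /etc/hosts on the frontend ties those names to private side addresses. The node addresses below are only an example scheme; use whatever fits your private network:

# /etc/hosts -- example private side addressing
10.10.10.10    frontend01.cluster.domain frontend01
10.10.10.101   node01.cluster.domain node01
10.10.10.102   node02.cluster.domain node02
10.10.10.103   node03.cluster.domain node03
10.10.10.104   node04.cluster.domain node04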

The example cluster that this guide uses is based on a 64-bit system, but the guide will attempt to mask the commands for those with 32-bit systems. Anytime the variable $ARCH appears, substitute either i386 or x86_64.
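If you would rather let the shell substitute it for you, set $ARCH once per session. Note that uname -m prints x86_64 on 64-bit systems, but 32-bit systems may report i686, in which case set it to i386 by hand:

$ ARCH=$(uname -m)
$ echo $ARCH
x86_64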

Disable Autoupdate

One of the more infuriating things about being a sysadmin is when an update that wasn't needed and isn't wanted gets installed without our explicit permission. Some genius upstream decided that yum autoupdate should be enabled by default. I frequently wonder if they have never been a sysadmin before, or if they have just been incredibly fortunate never to have an update break things. Throughout this guide there will be notes about disabling auto-update. How you handle this is up to you. I personally follow a few guidelines on this subject.

  • Leave auto-update in place but control the repo! Below are instructions for building a local repo. This guide configures it for cron job updates for ease of use. It is not uncommon for me to have two repos: one syncs every day from the official mirrors; the second updates from the first only when I tell it to. This way I know what is syncing, when it is syncing, and it only syncs when I have declared it safe. The nodes syncing off the second repo can then install and update whenever they want (a sketch of this two-stage setup appears after this list).
  • Testing. This is what determines when a system updates. I suggest a patch cycle of a week or two at the most. Of course this is completely dependent on the severity of the situation; I have rushed patches through in a matter of hours before. If the box is still running after the patch cycle, I allow those patches to be pushed into the main repo. If a box has issues, then I have time to document the problem and the fix, and can prepare for this on the important servers. As for how I test, that is very application-dependent too. Apache? Does it update without breaking the dev website? If so, great! Push it. The Linux kernel? That has a slew of things I have to check: our VMware environment, the Nvidia drivers, the SAN modules, the performance metrics, a few specialty applications... the list goes on for me. How a patch passes your test will depend entirely on your environment. The thing to note here is that you are never caught by surprise, coming into work one day to discover that an autoupdate has destroyed your systems, taken down every node in the cluster, and angered your users. Trust me, that is not the best way to start the day, and your day never gets better from there. Leave that pain for the rookies who don't know any better and leave auto-update on.
  • Manual updates. There are a handful of servers whose updates I watch manually; any more than that and servers would be forgotten. These are very special servers and I can't afford for them to break. I update a small number of packages at a time, with gaps between the updates, and I watch them like a hawk to ensure they update properly.
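As an example of the two-repo approach from the first bullet, here is a sketch; the repo id and paths are made up, and reposync and createrepo come from the yum-utils and createrepo packages:

# /etc/crontab entry on the repo server: mirror upstream into the staging repo nightly
30 2 * * * root reposync --repoid=sl-security --download_path=/srv/repos/staging && createrepo --update /srv/repos/staging/sl-security

# Run by hand, only once you have declared the staged packages safe:
$ sudo rsync -av /srv/repos/staging/sl-security/ /srv/repos/stable/sl-security/
$ sudo createrepo --update /srv/repos/stable/sl-security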

So do yourself a favor. Disable auto-update and set a recurring calendar event, or whatever you need to do to make it a habit, to test patches and then deploy them to your servers.

If it is so bad, why is it installed by default? Because there are a lot of terrible, terrible sysadmins who are far too lazy to properly manage their systems and shouldn't be doing it. These terrible sysadmins never patch their servers and provide vulnerable gateways for nefarious activity. For their incompetence, the rest of us have to suffer with auto-update and zombie servers that send spam, DDOS attacks, and other vileness at us.

How to disable Yum autoupdate

$ sudo vim /etc/sysconfig/yum-autoupdate
ENABLED="false"
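If you would rather script the change than open an editor (handy in a kickstart %post section, for example), a one-liner does the same thing:

$ sudo sed -i 's/^ENABLED=.*/ENABLED="false"/' /etc/sysconfig/yum-autoupdate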

Build a repo.

If you are building a cluster of any significant size, then you will be grabbing the same packages many times. This can be very time consuming for you over a slow internet connection and very load intensive on a community repository. In these situations it is often very useful to have a local repository from which you can pull your packages. Here is one way of building a local repository.
CreateRepo
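Once the repository exists, pointing the nodes at it is just a yum repo file. A sketch follows; the hostname and path are examples, and yum's built-in $basearch variable plays the same role as $ARCH above:

# /etc/yum.repos.d/local.repo on each node
[local-sl]
name=Local Scientific Linux mirror
baseurl=http://frontend01.cluster.domain/repos/sl/6/$basearch
enabled=1
# In production, set gpgcheck=1 and add a gpgkey= line pointing at your mirrored key.
gpgcheck=0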

Building a cluster.

Operating system.

Start with the installation of the frontend. The operating system you choose is very important. Many prefer Red Hat, CentOS, or Scientific Linux, but there are many good reasons for choosing a Debian-based system as well.
Scientific Linux
Debian (On the Todo list!)
Ubuntu (On the Todo list!)

Configuring software on the frontend

First login to configure the new installation:
On Scientific Linux

Verify network settings.

Configure Puppet

Configure network IP forwarding

Configure NFS for /home.

Configure a DHCP/TFTP server.

Node Kickstart File

Create a Kickstart file for the nodes.

Node Puppet resources

Create a node resource in Puppet.

Kickstarting the nodes

The base cluster is now almost completely configured. Now to add the most visible part of the cluster: the nodes.

Resource Management

Hardware resource manager: Torque
Job scheduler: Maui

-or-

Open Grid Scheduler

Parallel Computing

OpenMPI

Testing and benchmarking the cluster

Cbench

Administration of the cluster

Modules (coming soon)

Configuring users

Add users and push their logins to the nodes

Add a development user for creating packages for the cluster

Troubleshooting.

Things didn't go as planned, huh? I am truly sorry. Unfortunately, my fingers don't always type what my brain tells them to, and I typo something I shouldn't have. Chances are, that is what happened. The best place to start is at the beginning. Once we find the place where things are going wrong, we can narrow down the potential problems.

  • Does the DHCP server start?
    ** Verify your /etc/dnsmasq.conf file is typo-free.
  • When booting the node, does it get an IP address and the TFTP boot file?
    ** Is DNSMasq running properly?
    ** Watch the /var/log/messages file on the server.
    ** Verify your firewall settings. Try temporarily disabling the firewall to see if that helps (see the command sketch after this list). If so, fix your firewall rules and turn it back on.
    ** Verify your SELinux settings. Try temporarily disabling SELinux to see if that helps. If so, fix the SELinux permissions and turn it back on.
  • When installing the node, does it get a kickstart file?
    ** Check permissions on the http.cluster.domain server.
  • When installing the node, does it fail during the install?
    ** Verify your kickstart file is correctly configured. Knowing where in the install process it fails will also help narrow things down.
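A few quick commands for working through the checks above; the service names assume an SL6-style init system:

$ sudo service dnsmasq status        # is the DHCP/TFTP server running?
$ sudo tail -f /var/log/messages     # watch requests arrive as a node boots
$ sudo service iptables stop         # temporarily disable the firewall (turn it back on!)
$ getenforce                         # current SELinux mode
$ sudo setenforce 0                  # temporarily go permissive (fix the policy, then setenforce 1)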

Helpful Links

Helpful Links
