Building Your Own Super
Computer
Building Your Own Super
Computer ( 1 )
Why pay $10 million for a
supercomputer when this article can
show you how to build your own
supercomputer cluster with just a
handful of Windows or Linux
PCs?
The special effects crew on James
Cameron's movie Titanic couldn't
afford a supercomputer to do the
critical rendering, and anything
less would take far too long.
Like all high-end animators and
special effects houses, the Titanic
team had a slew of SGI Indigo
workstations (as well as a pile of
new Windows NT workstations for the
low-end jobs), but Titanic's romance
and tragedy were far more demanding
than most projects.
A
much greater degree of realism was
required than for the typical
science fiction epic, and realism is
expensive. Rendering the water
scenes was obviously a job for a
supercomputer, but with Titanic
already far over budget, a
$10,000,000 computer wasn't
realistic.
The
performance problem was solved by
assembling a cluster of DEC Alpha
based computers into a Linux
cluster, an instant supercomputer at
a small fraction of the cost, which
produced a large number of
extraordinarily challenging visual
effects for this demanding film.
In this article, although it is a
bit off topic, I will discuss how to
build a generic Linux or Windows
supercomputer using the clustered
computing concept. You will find out
just how easy it is to build a
supercomputer with Linux clusters.
We will limit our discussion to
building Linux and Windows clusters
to obtain supercomputer
computational power. How to solve
any particular computationally
intensive problem, and how to code
such algorithms for a cluster
architecture, is outside the scope
of this article.
Building Your Own Super Computer -
Definitions and Benefits of
Clustering ( 2 )
Greg
Pfister, in his wonderful book In
Search of Clusters, defines a
cluster as "a type of parallel or
distributed system that: consists of
a collection of interconnected whole
computers, and is used as a single,
unified computing resource".
Therefore, a cluster is a group of
computers bound together into a
common resource pool. A given task
can be executed on all computers or
on any specific computer in the
cluster. Let's look at the benefits
of clustering:
- Scientific applications:
  Enterprises running scientific
  applications on supercomputers
  can benefit from migrating to a
  more cost-effective Linux
  cluster.
- Large ISPs and e-commerce
  enterprises with a large
  database: Internet service
  providers and e-commerce web
  sites that require high
  availability, load balancing and
  scalability can get all three
  from a cluster.
- Graphics rendering and
  animation: Linux clusters have
  become important in the film
  industry for rendering
  high-quality graphics. In the
  movie Titanic, a Linux cluster
  was used to render the background
  in the ocean scenes. The same
  concept was used in the movies
  True Lies and Interview with the
  Vampire.
We can
also characterize clusters by their
function:
Building Your Own Super
Computer - Building Windows
Clusters ( 3 )
Hardware
Before starting, you should have the
following hardware and
software:
- At least two computers running
  Windows XP, Windows NT SP6 or
  Windows 2000, networked with some
  sort of LAN equipment (hub,
  switch, etc.).
- During Windows setup, ensure that
  TCP/IP and NetBEUI are installed,
  and that the network is started
  with all the network cards
  detected and the correct drivers
  installed.
We
will call these two computers a
Windows cluster. You now need
some sort of software that will help
you to develop, deploy and execute
applications over this cluster. This
software is the core of what makes a
Windows cluster possible.
Software
The Message Passing Interface (MPI)
is an evolving de facto standard for
supporting clustered computing based
on message passing. There are
several implementations of this
standard.
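To make the idea of message passing concrete, here is a minimal sketch in Python. It is not MPI and does not use mpich2: standard-library threads and queues stand in for cluster nodes and network messages, and the names `worker` and `run` are hypothetical. The point is only that workers cooperate by sending and receiving messages rather than by updating shared variables.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive a chunk of numbers, sum it, and send the partial sum back."""
    chunk = inbox.get()               # blocking "receive"
    outbox.put((rank, sum(chunk)))    # "send" the result to the coordinator


def run(numbers, workers=2):
    """Split `numbers` across workers and combine their partial sums."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(workers)]
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(workers)
    ]
    for t in threads:
        t.start()
    # Each worker gets every `workers`-th element, starting at its own rank.
    for rank, inbox in enumerate(inboxes):
        inbox.put(numbers[rank::workers])
    for t in threads:
        t.join()
    partials = [results.get() for _ in range(workers)]
    return sum(value for _, value in partials)


if __name__ == "__main__":
    print(run(list(range(1, 101))))
```

In a real cluster the queues would be replaced by the send and receive calls of an MPI implementation, and each worker would be a separate process on its own node.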
In this article, we will use mpich2,
which is freely available for
Windows clustering along with its
related documentation. Please read
the PDF documentation before
starting the following steps.
Step 1: Download and unzip mpich2
into any folder (the following steps
assume C:\MPICH2), and share this
folder with write permission.
Step 2: Copy all files with the .dll
extension from C:\MPICH2\lib to
the C:\Windows\system32 folder.
Step 3: Install the Cluster Manager
Service on each host you want to use
for remote execution of MPI
processes. To install it, start
rcluma-install.bat (located in the
C:\MPICH2\bin directory) by
double-clicking it from a local or
network drive. You must have
administrator rights on the hosts to
install this service.
Step 4: Repeat steps 1 and 2 on each
node in the cluster (we will call
each computer in the cluster a
node).
Step 5: Start RexecShell (from the
folder C:\MPICH2\bin) by
double-clicking it.
Open the configuration dialog by
pressing F2. The distribution
contains a precompiled example MPI
program named cpi.exe (located in
C:\MPICH2\bin). Choose it as the
actual program. Make sure that each
host can reach cpi.exe at the
specified path.
Choose ch_wsock as the active
plug-in. Select the hosts to compute
on. On the tab 'Account', enter your
username, domain and password, which
need to be valid on each host
chosen. Press OK to confirm your
selections. The Start button (in
the RexecShell window) is now
enabled and can be pressed to start
cpi.exe on all chosen hosts. The
output will be displayed in separate
windows.
Congratulations -- your
supercomputer (Windows cluster) is
ready to run MPI programs!
Building Your Own Super
Computer - Building a Linux
Cluster ( 4 )
Linux
clusters are generally more common,
robust, efficient and cost effective
than Windows clusters. We will now
look at the steps involved in
building a Linux cluster.
Step 1
Install a Linux distribution (I am
using Red Hat 7.1 and working with
two Linux boxes) on each computer in
your cluster. During the
installation process, assign a
hostname and, of course, a unique IP
address to each node in your
cluster.
Usually, one node is designated as
the master node (where you'll
control the cluster, write and run
programs, etc.) with all the other
nodes used as computational slaves.
We name one of our nodes as Master
and the other as Slave.
Our
cluster is private, so theoretically
we could assign any valid IP address
to our nodes as long as each has a
unique value. I have used IP address
192.168.0.190 for the master node
and 192.168.0.191 for the slave
node.
If
you already have Linux installed on
each node in your cluster, then you
don't have to make changes to your
IP addresses or hostnames unless you
want to. Changes (if needed) can be
made using a network configuration
program such as Linuxconf in Red
Hat.
Finally, create identical user
accounts on each node. In our
case, we create the user DevArticle
on each node in our cluster. You can
either create the identical user
accounts during installation, or you
can use the adduser command as root.
Step 2
We now need to configure rsh on each
node in our cluster. Create .rhosts
files in the home directories of the
common user and of root. Our .rhosts
files for the DevArticle user are as
follows:
Master DevArticle
Slave DevArticle
The .rhosts files for the root
user are as follows:
Master root
Slave root
Next, we create a hosts file in the
/etc directory. Below is our hosts
file for Master (the master node):
192.168.0.190 Master.home.net Master
127.0.0.1 localhost
192.168.0.191 Slave
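As a sanity check on a hosts file like the one above, the following Python sketch parses the text and reports missing or clashing entries. The function names `parse_hosts` and `check_cluster` are hypothetical helpers written for this illustration, and the sketch assumes the usual layout of one IP address followed by one or more names per line.

```python
import ipaddress

HOSTS_TEXT = """\
192.168.0.190 Master.home.net Master
127.0.0.1 localhost
192.168.0.191 Slave
"""


def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # ip_address() raises ValueError on a malformed address.
            table[name] = ipaddress.ip_address(ip)
    return table


def check_cluster(text, nodes):
    """Return a list of problems found for the given node names."""
    table = parse_hosts(text)
    problems = []
    if "localhost" not in table:
        problems.append("localhost entry is missing")
    seen = {}
    for node in nodes:
        ip = table.get(node)
        if ip is None:
            problems.append(f"{node} has no entry")
        elif ip in seen:
            problems.append(f"{node} shares {ip} with {seen[ip]}")
        else:
            seen[ip] = node
    return problems


if __name__ == "__main__":
    print(check_cluster(HOSTS_TEXT, ["Master", "Slave"]))
```

An empty list means every node has its own address and the localhost entry is present.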
Do not remove the 127.0.0.1
localhost line.
Step 3
The hosts.allow file on each node
was modified by adding ALL+ as the
only line in the file. This gives
anyone on any node permission to
connect to any other node in our
private cluster. To allow root users
to use rsh, I had to add the
following entries to the
/etc/securetty file: rsh, rlogin,
rexec, pts/0, pts/1. I also
modified the /etc/pam.d/rsh file:
#%PAM-1.0
# For root login to succeed here with pam_securetty, "rsh" must be
# listed in /etc/securetty.
auth       sufficient   /lib/security/pam_nologin.so
auth       optional     /lib/security/pam_securetty.so
auth       sufficient   /lib/security/pam_env.so
auth       sufficient   /lib/security/pam_rhosts_auth.so
account    sufficient   /lib/security/pam_stack.so service=system-auth
session    sufficient   /lib/security/pam_stack.so service=system-auth
Step 4
Rsh, rlogin, Telnet and rexec are
disabled in Red Hat 7.1 by default.
To change this, I navigated to the
/etc/xinetd.d directory and modified
each of the service files (rsh,
rlogin, telnet and rexec), changing
the disable = yes line to disable =
no.
Once the changes were made to each
file (and saved), I closed the
editor and issued the following
command to enable rsh, rlogin, etc.:
xinetd -restart
(on Red Hat, running
/etc/rc.d/init.d/xinetd restart
should have the same effect).
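The yes-to-no change in this step can also be scripted rather than made by hand in an editor. The Python sketch below works on the text of one service file and assumes the key is spelled `disable`, as in stock xinetd service files. The sample file content and the function name `enable_service` are illustrative assumptions, not copied from a real Red Hat 7.1 system.

```python
import re

# Illustrative service file; the real fields on your system may differ.
SAMPLE = """\
service shell
{
    socket_type = stream
    wait        = no
    user        = root
    server      = /usr/sbin/in.rshd
    disable     = yes
}
"""

# Match a whole "disable = yes" line, keeping its indentation and spacing.
DISABLE_YES = re.compile(r"^([ \t]*disable[ \t]*=[ \t]*)yes[ \t]*$", re.MULTILINE)


def enable_service(config_text):
    """Return the text with any 'disable = yes' line switched to 'no'."""
    return DISABLE_YES.sub(r"\g<1>no", config_text)


if __name__ == "__main__":
    print(enable_service(SAMPLE))
```

To apply it, you would read each of the four files in /etc/xinetd.d as root, pass the text through `enable_service`, write it back, and then restart xinetd as described above.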
Step 5
Next, download the latest version of
MPICH (UNIX, all flavors) to the
master node. Untar the file in
either the common user's home
directory (the identical user you
established on all nodes, DevArticle
on our cluster) or in the root
directory (if you want to run the
cluster as root).
Issue the following command:
tar zxfv mpich.tar.gz
Change into the newly created
mpich-1.2.2.3 directory. Type
./configure, and when the
configuration is complete and you
have a command prompt, type make.
The
make may take a few minutes,
depending on the speed of your
master computer. Once make has
finished, add the mpich-1.2.2.3/bin
and mpich-1.2.2.3/util directories
to your PATH in .bash_profile or
however you set your path
environment statement.
The
full root paths for the MPICH bin
and util directories on our master
node are /root/mpich-1.2.2.3/util
and /root/mpich-1.2.2.3/bin. For the
DevArticle user on our cluster,
/root is replaced with /home/DevArticle
in the path statements. Log out and
then log in to enable the modified
PATH containing your MPICH
directories.
Step 6
Next, make all of the example files
and the MPE graphic files. First,
navigate to the
mpich-1.2.2.3/examples/basic
directory and type make to make all
the basic example files.
When this process has finished, you
might as well change to the
mpich-1.2.2.3/mpe/contrib directory
and make some additional MPE example
files, especially if you want to
view graphics.
Within the mpe/contrib directory,
you should see several
subdirectories. The one we will be
interested in is the mandel
directory. Change into the mandel
directory and type make to create
the pmandel exec file. You are now
ready to test your cluster.
Building Your Own Super
Computer - Testing Your Linux
Cluster ( 5 )
The first program we will run is
cpilog. From within the
mpich-1.2.2.3/examples/basic
directory, copy the cpilog
executable (if this file isn't
present, run make again) to your
top-level directory. On our cluster,
this is either /root (if we are
logged in as root) or
/home/DevArticle (if we are logged
in as DevArticle), since we have
installed MPICH in both places.
Next, from your top directory, rcp
the cpilog file to each node in your
cluster, placing the file in the
corresponding directory on each
node. For example, if I am logged in
as DevArticle on the master node,
I'll issue rcp cpilog
Slave:/home/DevArticle to copy
cpilog to the
DevArticle directory on Slave. I'll
do the same for each node (if there
are more than two nodes). If I want
to run a program as root, then I'll
copy the cpilog file to the root
directories of all nodes on the
cluster.
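With more than a couple of nodes, typing one rcp command per node gets tedious. The Python sketch below only builds the commands described above, one per node. The helper name `rcp_commands` is hypothetical, and the mapping of user name to home directory (/root for root, /home/<user> otherwise) is an assumption that matches the layout used in this article.

```python
import shlex


def rcp_commands(filename, nodes, user):
    """Build one rcp command per node, targeting the user's home directory."""
    home = "/root" if user == "root" else f"/home/{user}"
    return [["rcp", filename, f"{node}:{home}"] for node in nodes]


if __name__ == "__main__":
    for cmd in rcp_commands("cpilog", ["Slave"], "DevArticle"):
        # Print the command; subprocess.run(cmd, check=True) would execute it.
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes them safe to hand to `subprocess.run` without any shell quoting.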
Congratulations, your supercomputer
(Linux cluster) is ready to run MPI
programs!
Once the files have been copied,
I'll type the following from the top
directory of my master node to test
my cluster:
mpirun -np 1 cpilog
This will run the cpilog program on
the master node to see if the
program works correctly. Some MPI
programs require at least two
processors (-np 2), but cpilog will
work with only one. The output looks
like this:
pi is approximately 3.1415926535899406, Error is 0.0000000000001474
Process 0 is running on Master.home.net
wall clock time = 0.360909
Now try both nodes (or however many
you want to try) by typing mpirun
-np 2 cpilog, and you'll see
something like this:
pi is approximately 3.1415926535899406, Error is 0.0000000000001474
Process 0 is running on Master.home.net
Process 1 is running on Slave.home.net
wall clock time = 0.0611228
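The source of cpilog is not shown in this article, so the following is only a sketch of the general approach such pi examples take: approximate the integral of 4/(1+x^2) over [0, 1] with the midpoint rule, give each process every Nth interval, and add up the partial results. It is written in plain Python without MPI; `partial_pi` and `estimate_pi` are hypothetical names, and the interval count is an arbitrary choice.

```python
import math


def partial_pi(rank, size, intervals):
    """Midpoint-rule share of the integral of 4/(1+x^2) on [0, 1] for one rank."""
    h = 1.0 / intervals
    total = 0.0
    # Rank r handles intervals r+1, r+1+size, r+1+2*size, ...
    for i in range(rank + 1, intervals + 1, size):
        x = h * (i - 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(size, intervals=10000):
    """Sum every rank's share, standing in for the reduction step."""
    return sum(partial_pi(rank, size, intervals) for rank in range(size))


if __name__ == "__main__":
    for size in (1, 2):
        approx = estimate_pi(size)
        print(size, approx, abs(approx - math.pi))
```

The result is the same whether one process or several do the work; only the share of the loop each one runs changes, which is why adding nodes can shorten the wall clock time.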
The
number following the -np parameter
corresponds to the number of
processors (nodes) you want to use
in running your program. This number
may not exceed the number of
machines listed in your
machines.LINUX file plus one (the
master node is not listed in the
machines.LINUX file).
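The limit described above is simple arithmetic: the number of machines listed, plus one for the master node. The Python sketch below applies that rule to the text of a machines file. It assumes one hostname per line, with blank lines and lines starting with # ignored; the function name `max_processes` is hypothetical.

```python
def max_processes(machines_text):
    """Apply the article's rule: listed machines plus one for the master node."""
    listed = [
        line.strip()
        for line in machines_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    return len(listed) + 1


if __name__ == "__main__":
    # A two-node cluster lists only the slave, so -np may be at most 2.
    print(max_processes("Slave\n"))
```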
To see some graphics, we must run
the pmandel program. Copy the
pmandel executable (from the
mpich-1.2.2.3/mpe/contrib/mandel
directory) to your top-level
directory and then to each node (as
you did for cpilog). Then, if X
isn't already running, issue a
startx command. From a command
console, type xhost + to allow any
node to use your X display, and then
set your DISPLAY variable as
follows:
DISPLAY=Server:0
(be sure to replace Server with the
hostname of your master node).
Setting the DISPLAY variable directs
all graphics output to your master
node. Run pmandel by typing:
mpirun -np 2 pmandel
The
pmandel program requires at least
two processors to run correctly. You
should see the Mandelbrot set
rendered on your master node.
Adding more processors (mpirun -np
10 pmandel) should increase the
rendering speed dramatically. The
Mandelbrot set graphic has been
partitioned into small rectangles
for rendering by the individual
nodes. You can actually see the
nodes working as the rectangles are
filled in. If one node is a bit
slow, then the rectangles from that
node will be the last to fill in. It
is quite fascinating to watch.
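The partitioning into rectangles can be sketched in a few lines of Python. This is not the pmandel source: the tile size, the view window and the round-robin assignment of tiles to workers are all assumptions chosen for simplicity, the function names are hypothetical, and the "workers" run one after another in a single process rather than on separate nodes.

```python
def make_tiles(width, height, tile):
    """Cut a width x height image into rectangles of at most tile x tile pixels."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def escape_count(c, limit=50):
    """Iterations before z = z*z + c leaves the radius-2 disc (limit if never)."""
    z = 0j
    for n in range(limit):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return limit


def render_tile(rect, width, height):
    """Compute escape counts for one rectangle of the view [-2, 1] x [-1.5, 1.5]."""
    x0, y0, w, h = rect
    return {
        (px, py): escape_count(
            complex(-2.0 + 3.0 * px / width, -1.5 + 3.0 * py / height)
        )
        for py in range(y0, y0 + h)
        for px in range(x0, x0 + w)
    }


def render(width, height, tile, workers):
    """Deal tiles to workers round-robin and merge the results."""
    tiles = make_tiles(width, height, tile)
    image = {}
    for rank in range(workers):
        for rect in tiles[rank::workers]:   # this worker's share of the tiles
            image.update(render_tile(rect, width, height))
    return image


if __name__ == "__main__":
    image = render(40, 30, 16, workers=3)
    print(len(image), "pixels computed")
```

Because every tile is independent, the picture is the same however many workers share the tiles. A slow worker simply finishes its rectangles last, which matches what you see on screen.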
This article was not written by Web
Street. One of our customers found
it in a news room. We tested it and
found it credible. We now wish to
share it with you. We take no
responsibility, credit, fee or
referral from this article.