Sunday, December 16, 2007

Building a Yellow Dog Linux PS3 Cluster

This is a modification of the Mueller group's protocol for building a PS3 cluster, using YellowDog Linux (YDL) instead of Fedora Core 7 (see README.fedora7).

Download a public copy of YellowDog Linux (currently version 5.0.2) and burn the ISO to a DVD.

Installation Guide

1. Install the Linux operating system (YDL 5.0.2) on the PS3. You will need an
HDMI cable. Only the default installation works: changing the partition scheme
from the default results in a bootloader installation failure. The DVD contains
the otheros bootloader image. The first time through, you will need the Sony
PlayStation controller to partition the hard drive, install the otheros
bootloader, and change the Default OS option in System Settings. Afterwards, the
PlayStation setup menu can be accessed with a USB keyboard.

At the kboot prompt during startup, you can boot into the PS3 game OS instead of
Linux by typing the following command: boot-game-os

2. Install the SDK prerequisites using YUM
yum install rsync sed tcl wget ypbind nfs rsh-server xinetd
(some of these may already be installed)

3. Mount the DVD with mount /dev/dvd /mnt, then cd /mnt/YellowDog/RPMS.
Install the elfspe binary rpm from the DVD: rpm -ivh elfspe*.rpm
Alternatively, you can install this and other RPMs via the GUI-based Software
Installer. Install the kernel source and kernel source headers at this time.

4. Update packages using the yum command: yum update.

5. Install the OpenMPI rpm: yum install openmpi-1.1.1-1yhpc.2

8. Add exclusions to the YUM configuration file /etc/yum.conf (append to the [main] section):
exclude=blas kernel numactl oprofile
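
A minimal sketch of making that edit from the shell, assuming the file has a [main] header and no exclude= line yet. The function name add_yum_exclude is made up for this example (it is not part of yum), and it takes the config path as an argument so it can be tried on a copy first.

```shell
# Hypothetical helper: insert an exclude= line right after the [main] header.
# Usage: add_yum_exclude /etc/yum.conf blas kernel numactl oprofile
add_yum_exclude() {
    conf="$1"
    shift
    # Refuse to add a second exclude= line; merge by hand in that case.
    if grep -q '^exclude=' "$conf"; then
        echo "$conf already has an exclude= line" >&2
        return 1
    fi
    # Copy every line through and emit the new line after [main].
    awk -v line="exclude=$*" '
        { print }
        /^\[main\]/ { print line }
    ' "$conf" > "$conf.tmp" && cat "$conf.tmp" > "$conf" && rm -f "$conf.tmp"
}
```

Writing back with cat keeps the original file's owner and permissions.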

9. Test the Cell SDK that came with YellowDog using the instructions from the following links:
http://www.ibm.com/developerworks/power/library/pa-linuxps3-1/
http://www.ibm.com/developerworks/power/library/pa-linuxps3-2/

If running the compiled factorial program from the second link fails
(i.e. ./factorial), then you forgot to install the elfspe rpm
that is required to run SPE programs directly.

10. Enable and disable the following services using services, chkconfig, or ntsysv:
chkconfig --level 345 autofs on
chkconfig --level 345 rsh on
chkconfig --level 345 ypbind on
chkconfig --level 345 nfslock on
chkconfig --level 345 nfs on
chkconfig --level 345 bluetooth off
chkconfig --level 12345 anacron off
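
With several nodes to set up, it can help to generate these commands instead of retyping them. This is only a sketch: the function name print_chkconfig_cmds and the ON_SERVICES variable are made up for the example, and the service names are the ones from this step.

```shell
# Services to enable in runlevels 3, 4 and 5 (names taken from this step).
ON_SERVICES="autofs rsh ypbind nfslock nfs"

# Print the chkconfig commands instead of running them, so they can be reviewed.
print_chkconfig_cmds() {
    for svc in $ON_SERVICES; do
        echo "chkconfig --level 345 $svc on"
    done
    echo "chkconfig --level 345 bluetooth off"
    echo "chkconfig --level 12345 anacron off"
}
```

Run print_chkconfig_cmds to review the list, pipe it to sh as root to apply it, and check the result with chkconfig --list.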

11. SELinux should already be disabled.

12. Create the huge TLB filesystem:
mkdir /huge

Create /etc/init.d/hugetlbfs with the following contents (it is needed early in
the boot sequence, while contiguous memory is still available):

#!/bin/csh
#
# This script will be executed *early*.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

set kernel=`uname -a | awk '{print $3}'`
if ( "$kernel" == "2.6.22-0.ydl.rc4" ) then
    echo 1 > /proc/sys/vm/nr_hugepages
    mount -t hugetlbfs nodev /huge
    chmod 755 /huge
endif

Link hugetlbfs to the startup scripts in rc3.d, rc4.d, and rc5.d, and make it
executable (shown here for rc3.d; repeat the ln for rc4.d and rc5.d):
ln -s /etc/init.d/hugetlbfs /etc/rc3.d/S07hugetlbfs
chmod 755 /etc/rc3.d/S07hugetlbfs
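
The three links can be created in one loop. This is a sketch, not part of the original protocol: link_hugetlbfs and hugepages_total are made-up names, and the optional root prefix exists only so the loop can be tried in a scratch directory first.

```shell
# Hypothetical helper: link the init script into runlevels 3, 4 and 5.
# The optional first argument is a root prefix for a dry run (e.g. /tmp/fakeroot).
link_hugetlbfs() {
    root="${1:-}"
    for lvl in 3 4 5; do
        ln -sf /etc/init.d/hugetlbfs "$root/etc/rc$lvl.d/S07hugetlbfs"
    done
}

# Report how many huge pages the kernel reserved; reads /proc/meminfo by default.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

After a reboot, hugepages_total should print 1 if the init script ran and the kernel version matched.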

13. Install netpbm via yum:
yum install netpbm-devel-10.34-1.ppc (there doesn't seem to be a ppc64 version with
YDL either on the CD or the repo)

I didn't do the following test because of the differences in the installation:

#matmult example
yum -y install netpbm-devel.ppc64 (cannot install due to failed dependencies. Go to
http://fedora.fastbull.org/development/ppc/os/Fedora/ and try it yourself.)

ln -s libnetpbm.so.10 libnetpbm.so
tar xf /opt/cell/sdk/src/demos_source.tar
cd demos
make INSTALL_DIR=bin
matrix_mul/matrix_mul -i 10 -m 128 -s 6 -v -1
-> not working: /huge problem, VM kernel error in /var/log/messages; turn off
SELinux, use the addon kernel, and it works
matrix_mul/matrix_mul -i 10 -m 128 -s 6 -v -1 -H
-> works

14. Set up rsh. Note: /etc/xinetd.d/rsh is already installed via yum.

Working /etc/xinetd.d/rsh:
# default: on
# description: The rshd server is the server for the rcmd(3) routine and, \
# consequently, for the rsh(1) program. The server provides \
# remote execution facilities with authentication based on \
# privileged port numbers from trusted hosts.
service shell
{
        disable         = no
        socket_type     = stream
        wait            = no
        user            = root
        log_on_success  += USERID
        log_on_failure  += USERID
        server          = /usr/sbin/in.rshd
}

Append the following line to /etc/securetty:
rsh

Insert this line as the first line of /etc/pam.d/rsh and remove the old pam_rhosts_auth auth line:
auth required pam_rhosts_auth.so promiscuous

Create /etc/hosts.equiv containing:
+ +

Create ~/.rhosts (chmod 600) containing the
hostname of the remote system (e.g. ps3)
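
Since these edits get repeated on every node, a small idempotent helper avoids duplicate lines when the step is rerun. This is a sketch; ensure_line is a made-up name, not a standard tool.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
# Usage: ensure_line /etc/securetty rsh
#        ensure_line /etc/hosts.equiv '+ +'
ensure_line() {
    file="$1"
    line="$2"
    # -x matches whole lines, -F treats the text literally (so '+ +' is safe).
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

The /etc/pam.d/rsh change is not covered by this helper, because that line has to go first in the file rather than at the end.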

Restart the xinetd and network services:
service xinetd restart
service network restart

Test using the following command: rsh remote_hostname cmd (e.g. rsh ps3 pwd)
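
To run the same test against every node, something like the following sketch can be used. It assumes a file with one hostname per line (the mpd.hosts file from the mpich step would do); check_rsh and the RSH override variable are made-up names for this example.

```shell
# Hypothetical helper: run `hostname` on every node listed in a hosts file.
# RSH can be overridden (e.g. RSH=ssh) without changing the loop.
check_rsh() {
    hosts_file="$1"
    rsh_cmd="${RSH:-rsh}"
    while read -r host; do
        [ -z "$host" ] && continue
        # stdin comes from /dev/null so the remote command cannot swallow the host list.
        if $rsh_cmd "$host" hostname < /dev/null > /dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
        fi
    done < "$hosts_file"
}
```

Any node reported as FAIL is a candidate for the troubleshooting notes in this step.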

If you get a "no route to host" error, flush the firewall rules and list what
remains by typing the following at the console:
iptables -F ; iptables -L

If you then get a "permission denied" error when you run the rsh test, check
/var/log/secure. If it says "rshd[4949]: pam_securetty(rsh:auth): access
denied: tty 'rsh' is not secure !", the fix that worked for me was to copy
/etc/securetty, /etc/pam.d/rsh, and /etc/pam.d/rlogin with scp from a working
node to the broken node.

rsh should now work. I permanently disabled the firewall using the GUI
Configuration program.

15. Set up mpich2:

yum install libXt-devel-1.0.0-2.2.ppc

The following command fails on YDL because the java-1.5.0-gcj-devel rpm has too
many failed dependencies:
yum -y install java-1.5.0-gcj-devel

Build the mpich2 rpm from the src rpm:
mkdir .kpackage
cd .kpackage
wget ftp://czar.eas.yorku.ca/pub/mpich2/mpich2-1.0.6-1.fc8.src.rpm
wget ftp://czar.eas.yorku.ca/pub/mpich2/mpich2.spec
rpm -Uvh mpich2-1.0.6-1.fc8.src.rpm
rpmbuild -bc -v /usr/src/yellowdog/SPECS/mpich2.spec
cd /usr/src/yellowdog/BUILD/mpich2-1.0.6
make install

create mpd.conf (http://www-cs.etsu.edu/hpc/ppga/MPICH2.doc):

For root:
echo "MPD_SECRETWORD=your_secretword" > /etc/mpd.conf
echo "#MPD_USE_ROOT_MPD=1" >> /etc/mpd.conf
chmod 600 /etc/mpd.conf

For regular users:
echo "MPD_SECRETWORD=your_secretword" > ~/.mpd.conf
echo "#MPD_USE_ROOT_MPD=1" >> ~/.mpd.conf
chmod 600 ~/.mpd.conf
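
The secret word is briefly readable by others if the file is created before the chmod. A sketch that closes that gap by setting the umask first; write_mpd_conf is a made-up name, and the file contents are the two lines shown in this step.

```shell
# Hypothetical helper: write an mpd.conf that only its owner can read.
# Usage: write_mpd_conf ~/.mpd.conf your_secretword
write_mpd_conf() {
    conf="$1"
    secret="$2"
    (
        umask 077    # files created in this subshell get mode 600
        {
            echo "MPD_SECRETWORD=$secret"
            echo "#MPD_USE_ROOT_MPD=1"
        } > "$conf"
    )
    chmod 600 "$conf"    # also tightens a file that already existed
}
```

For root the target would be /etc/mpd.conf, for a regular user ~/.mpd.conf.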

In /etc/hosts, remove the node's hostname alias (node0 here) from the 127.0.0.1 line:

Change:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost node0
192.168.1.5 node0.google.net node0

To:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.5 node0.google.net node0
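
The same change can be scripted for each node. This sketch prints the edited file instead of changing it in place, so the result can be inspected first; strip_loopback_alias is a made-up name, and it rewrites the 127.0.0.1 line with single spaces between fields.

```shell
# Hypothetical helper: print a hosts file with one name dropped from the
# 127.0.0.1 line; all other lines pass through unchanged.
# Usage: strip_loopback_alias /etc/hosts node0 > /tmp/hosts.new
strip_loopback_alias() {
    awk -v name="$2" '
        $1 == "127.0.0.1" {
            out = $1
            for (i = 2; i <= NF; i++)
                if ($i != name) out = out " " $i
            print out
            next
        }
        { print }
    ' "$1"
}
```

Compare /tmp/hosts.new with /etc/hosts using diff before copying it over.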

Create the mpd.hosts file in your home directory (root can create /etc/mpd.hosts),
listing only working nodes:
node0
node1
node2

Start mpich2 service (for two nodes):
mpdboot --totalnum=2 --ncpus=2 -v --ifhn=node0 --rsh=rsh --file=/etc/mpd.hosts

Output:
running mpdallexit on node0
LAUNCHED mpd on node0 via
RUNNING: mpd on node0
LAUNCHED mpd on node1 via node0
RUNNING: mpd on node1


Test the mpich2 installation:
From a console, type mpdtrace. One should see:

[root@node1 ~]# mpdtrace
node0
node1

Create pi.c:

#include "mpi.h"
#include <stdio.h>
#include <math.h>

double f( double );
double f( double a )
{
    return (4.0 / (1.0 + a*a));
}

int main( int argc, char *argv[])
{
    int done = 0, n, myid, numprocs, i;
    double PI25DT = 3.141592653589793238462643;
    double mypi, pi, h, sum, x;
    double startwtime = 0.0, endwtime;
    int namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);
    MPI_Get_processor_name(processor_name,&namelen);

    fprintf(stderr,"Process %d on %s\n",
            myid, processor_name);

    n = 0;
    while (!done)
    {
        if (myid == 0)
        {
/*
            printf("Enter the number of intervals: (0 quits) ");
            scanf("%d",&n);
*/
            if (n==0) n=100; else n=0;

            startwtime = MPI_Wtime();
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (n == 0)
            done = 1;
        else
        {
            h = 1.0 / (double) n;
            sum = 0.0;
            for (i = myid + 1; i <= n; i += numprocs)
            {
                x = h * ((double)i - 0.5);
                sum += f(x);
            }
            mypi = h * sum;

            MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

            if (myid == 0)
            {
                printf("pi is approximately %.16f, Error is %.16f\n",
                       pi, fabs(pi - PI25DT));
                endwtime = MPI_Wtime();
                printf("wall clock time = %f\n",
                       endwtime-startwtime);
            }
        }
    }
    MPI_Finalize();

    return 0;
}
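
For a quick sanity check that does not depend on MPI, the same midpoint-rule sum can be computed serially with awk. This is only a cross-check sketch, not part of the protocol; serial_pi is a made-up name.

```shell
# Serial midpoint-rule estimate of pi, mirroring the loop in pi.c:
#   h = 1/n, x_i = h * (i - 0.5), pi ~= h * sum of 4 / (1 + x_i^2)
# Usage: serial_pi 100
serial_pi() {
    awk -v n="${1:-100}" 'BEGIN {
        h = 1.0 / n
        sum = 0.0
        for (i = 1; i <= n; i++) {
            x = h * (i - 0.5)
            sum += 4.0 / (1.0 + x * x)
        }
        printf "%.10f\n", h * sum
    }'
}
```

The MPI program splits the same sum across processes (each rank takes every numprocs-th term), so its result for a given n should agree with the serial value up to floating-point rounding.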

Compile:
mpicc -O3 -o pi pi.c

Execute:
mpiexec -n 4 $HOME/pi (for single node)

rsync or scp the pi executable to the same path on both nodes, then

mpiexec -n 4 -host node0 $HOME/pi : -n 4 -host node1 $HOME/pi
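
With more than two nodes the colon-separated form gets long, so here is a sketch that builds it from a list of hosts. mpiexec_cmd is a made-up name, and the function only prints the command so it can be checked before running.

```shell
# Hypothetical helper: print an mpiexec command that starts the same number
# of processes of one program on each listed host.
# Usage: mpiexec_cmd 4 "$HOME/pi" node0 node1
mpiexec_cmd() {
    nprocs="$1"
    prog="$2"
    shift 2
    cmd="mpiexec"
    sep=""
    for host in "$@"; do
        # Each host gets its own "-n N -host H prog" group, joined by " :".
        cmd="$cmd$sep -n $nprocs -host $host $prog"
        sep=" :"
    done
    echo "$cmd"
}
```

The printed line can be pasted into the shell, or run directly with eval "$(mpiexec_cmd 4 "$HOME/pi" node0 node1)".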

Stop mpich2:
mpdallexit

One can stop the X server by switching to runlevel 3 with init 3. To make this
permanent, change the default runlevel in /etc/inittab, or set default=ydltext in
/etc/kboot.conf.
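
A sketch of the /etc/inittab change, assuming the file uses the usual SysV line of the form id:5:initdefault:. set_default_runlevel is a made-up name, and it prints the edited file instead of modifying it in place.

```shell
# Hypothetical helper: print an inittab with the default runlevel changed.
# Usage: set_default_runlevel /etc/inittab 3 > /tmp/inittab.new
set_default_runlevel() {
    sed "s/^id:[0-9]:initdefault:/id:$2:initdefault:/" "$1"
}
```

Check /tmp/inittab.new with diff against /etc/inittab before copying it into place; a broken inittab can keep the node from booting normally.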

Comments:
I modified pi to do a billion intervals:
if (n==0) n=100; else n=0;
becomes
if (n==0) n=1000000000; else n=0;

Various runs:

mpiexec -n 3 -host cray0 /root/pi : -n 3 -host cray1 /root/pi
Process 0 on cray0.verizon.net
Process 1 on cray0.verizon.net
Process 3 on cray1.verizon.net
Process 4 on cray1.verizon.net
Process 5 on cray1.verizon.net
Process 2 on cray0.verizon.net
pi is approximately 3.1415926535897052, Error is 0.0000000000000879
wall clock time = 22.509958

mpiexec -n 4 -host cray0 /root/pi : -n 4 -host cray1 /root/pi
Process 0 on cray0.verizon.net
Process 1 on cray0.verizon.net
Process 2 on cray0.verizon.net
Process 4 on cray1.verizon.net
Process 3 on cray0.verizon.net
Process 5 on cray1.verizon.net
Process 7 on cray1.verizon.net
Process 6 on cray1.verizon.net
pi is approximately 3.1415926535898278, Error is 0.0000000000000346
wall clock time = 25.332874

mpiexec -n 6 -host cray0 /root/pi : -n 6 -host cray1 /root/pi
Process 0 on cray0.verizon.net
Process 1 on cray0.verizon.net
Process 2 on cray0.verizon.net
Process 3 on cray0.verizon.net
Process 4 on cray0.verizon.net
Process 6 on cray1.verizon.net
Process 5 on cray0.verizon.net
Process 7 on cray1.verizon.net
Process 8 on cray1.verizon.net
Process 9 on cray1.verizon.net
Process 10 on cray1.verizon.net
Process 11 on cray1.verizon.net
pi is approximately 3.1415926535898397, Error is 0.0000000000000466
wall clock time = 22.525615

mpiexec -n 8 -host cray0 /root/pi : -n 8 -host cray1 /root/pi
Process 0 on cray0.verizon.net
Process 1 on cray0.verizon.net
Process 2 on cray0.verizon.net
Process 3 on cray0.verizon.net
Process 4 on cray0.verizon.net
Process 5 on cray0.verizon.net
Process 6 on cray0.verizon.net
Process 8 on cray1.verizon.net
Process 9 on cray1.verizon.net
Process 10 on cray1.verizon.net
Process 11 on cray1.verizon.net
Process 12 on cray1.verizon.net
Process 7 on cray0.verizon.net
Process 13 on cray1.verizon.net
Process 14 on cray1.verizon.net
Process 15 on cray1.verizon.net
pi is approximately 3.1415926535898451, Error is 0.0000000000000520
wall clock time = 26.761833

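More processes per host does not speed things up, and from 4 to 8 processes per host the error creeps up somewhat (although the 3-process run has the largest error of all).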
 
Ok...on the PS3 you gave me, how do I fix it to go straight to the game os?
 
Not sure if the last comment saved.

How do I stop the linux dump and get it to go straight to the game system?
 