Design and Construction of General Purpose Computing
Resources for Linux Based Computer Science Education
Authors:
- Richard Sharp, Assistant Professor of Computer Science
- Ed Harcourt, Associate Professor of Computer Science
Department of Mathematics, Computer Science, and Statistics
at St. Lawrence University
Abstract
For six years our computer science program had no dedicated
computing laboratories and limited influence on the software
that could be installed on university wide workstations.
In the Fall of 2009 we (the authors) built a computer science laboratory
for our CS program from scratch. This included:
- Physical plant (took over old biology laboratories)
- Hardware/software configurations of workstations and servers.
- Assembly of workstations from components (majority by students).
- A motivation to teach a parallel computing course.
The result is a medium scale computing resource (58 CPUs / GPUs; 232
general purpose computing cores and 12,528 GPU cores)
physically located in three classrooms (seating 25, 15, and 12)
and a server room. It is ideal for classroom use, student projects, and
single CPU or grid computing.
Go big or go home.
Motivation
- For six years, our computer science program
had no dedicated computing resources.
- Available computing resources were limited to
standard university terminals.
- We had limited influence on the kinds of software
that could be installed. (Software was Windows based and installed
only once a semester, with two months' advance notice.)
- These restrictions directly affected our ability
to teach computer science in nearly all our courses
except Theory of Computation.
- Regular workstation image.
- Source control.
- Computer graphics.
- Computer networking.
- Operating systems.
- Systems level programming in computer
organization.
Requirements
In designing the lab we considered the educational,
computational, and long term administrative requirements
of such a project.
Educational
As an undergraduate liberal arts institution, we teach an
introductory level CS service course and a major that
follows the traditional curriculum for a liberal arts degree
in CS. The CS department teaches 15 courses and supervises 5-10 senior
research projects and 1-3 summer student fellowships per
year.
We have an average of 10 majors per year (although recently
that's been up) and teach about 150 students through our
introductory service course. We felt that the design of
the lab should take both missions into account.
The computing environment should be exposed to the student
as much as reasonably possible for educational purposes.
Computer science is about the practical implementation
of the theoretical foundations of computation.
Thus, requirements boiled down to this:
- Design a classroom to hold the maximum number
of students. (Pack the service course.)
- Upper level courses should be taught in an open
and more collaborative environment and allow
students flexibility with the environment and
hardware.
Administrative
We (the authors), and perhaps you the audience,
have experienced the frustration of needing to install
software or configure OS settings for classes while being hampered
by bureaucratic processes in a rigid IT service model.
Thus, we wanted a service model that would allow us to make
fluid changes to the environment without a go-to person in
an IT department.
The only thing we didn't want to touch was maintaining login
information (consider a flux of nearly 200 students per year).
Our administrative requirements boiled down to this:
- Type and configuration of software should be decided
and implemented by the professor teaching the course.
- Environment should be flexible enough to install
software on the fly, during class if necessary.
- Backups should be solid and easily accessible.
- Hardware maintenance should be as painless as
possible.
- User access control should be maintained
by the central IT department.
- Students should have local CS filespace.
Physical Plant Construction
We were granted
"temporary" space in an abandoned section of hallway that once
housed part of the biology department.
Rooms were filled with garbage, raised laboratory tables,
leaking sinks, etc. A major concern in remodeling was the risk
of damaging the asbestos tiling.
Keeping our teaching needs in mind, we designed two classroom
layouts. One is for the service course, the other for upper level
courses:
Other physical plant issues we had to consider:
- Data network layout.
- Power needs for each room.
- Classroom stuff: Projector, projector display, podium, whiteboard,
furniture.
- After-hours classroom access.
- Air conditioning: 26 machines at 450W each, plus 26 humans,
emit around 40000 BTUs per hour. Compare that to my woodstove's
rating of 50000 BTUs per hour.
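The air conditioning estimate can be checked with a short calculation. This is a sketch under two assumptions that are not measurements from the lab: each machine dissipates its full 450W supply rating as heat (an upper bound), and each seated person gives off roughly 100W (a common rule of thumb).

```python
# Rough heat-load estimate for one classroom.
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W is about 3.412 BTU/h


def heat_load_btu_per_hour(machines, watts_per_machine, people, watts_per_person):
    """Total heat output in BTU/h for a room of machines and people."""
    total_watts = machines * watts_per_machine + people * watts_per_person
    return total_watts * WATTS_TO_BTU_PER_HOUR


# 26 machines at the full 450W supply rating (assumed worst case).
machines_only = heat_load_btu_per_hour(26, 450, 0, 0)

# Same room with 26 people at an assumed 100W each.
with_people = heat_load_btu_per_hour(26, 450, 26, 100)

print(round(machines_only))  # roughly 39,900 BTU/h
print(round(with_people))    # roughly 48,800 BTU/h
```

Under these assumptions the machines alone account for the quoted 40000 BTUs per hour, and a full room of people pushes the load close to the woodstove rating.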
Hardware Configuration and Construction
We wanted to assemble high quality workstations from components,
in part for the learning experience it gave the students, but also
to give us flexibility in handling eventual hardware
failures.
Workstation configuration:
- Intel i7 2.6GHz 4 core processor.
- 6 GB of RAM.
- NVIDIA GTX 260 graphics card.
- 100GB hard drive.
- Gigabit Ethernet.
- Two 17" LCD monitors.
- 450W power supply.
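The totals quoted in the abstract follow from this per-machine configuration. A quick check, assuming one CPU and one graphics card in each of the 58 machines; the per-card GPU core count is derived from the stated totals rather than taken from a spec sheet:

```python
# Figures stated in the abstract and the workstation configuration.
MACHINES = 58
CPU_CORES_PER_MACHINE = 4
TOTAL_GPU_CORES = 12_528

# General purpose cores across the whole lab.
total_cpu_cores = MACHINES * CPU_CORES_PER_MACHINE

# GPU cores per card implied by the stated total (assumes one card per machine).
gpu_cores_per_card, remainder = divmod(TOTAL_GPU_CORES, MACHINES)

print(total_cpu_cores)                # 232
print(gpu_cores_per_card, remainder)  # 216 0
```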
As part of the hardware specs, we also bought extra components
to replace failures over time. Our general rule of thumb was
"15 extra of each". (More on that later).
As part of the educational experience of building the labs
we taught a special topics course whose first three
weeks were spent having students assemble workstations.
Here's the guts of a workstation:
Software Configuration
We considered Windows and Linux based platforms for our
software configurations. We chose Linux because we felt
it was a better choice to teach computer science for
the following reasons:
- Linux has the potential to expose much of the computing
process to the user; ideal for teaching computer science.
- Advanced configurations of the operating system can be achieved,
albeit with difficulty. However, once a system is
configured, it can be superior in terms of flexibility
to a Windows installation.
Currently we are running Ubuntu 10.04 on the workstations
and servers.
- File server shares home directories primarily
through NFS, but also with Samba for those
students who want to mount home directories
on a Windows system.
- Access control is handled through a complicated
configuration: a Kerberos client on each workstation,
a Samba client to interface with the Windows AD, and
a modification to the workstation's PAM (pluggable
authentication modules) configuration.
- Software can be installed/uninstalled as needed
at any time. (More on that later).
Configuring the server/client software was the most difficult
part of the entire project.
Maintenance
- Daily monitoring:
- Software updates:
- Backups:
- Primary file server runs a 2 disk RAID-1 (mirror) on the home
folders.
- Primary file server runs an hourly rsync-like
backup tool called rsnapshot on the root home
folder, writing to a separate partition on a
2 disk RAID-1.
- Backup file server does hourly rsync of home
folders from primary. Data stored on a 2 disk
RAID-1.
- Backup also takes hourly backups using rsnapshot to
a 2 disk RAID-1 partition.
- Finally, the backup file server makes backups
of its backups to an offsite machine.
- If the primary fails, we can switch to the backup
file server in minutes. (We change
an entry in the /etc/fstab of each workstation
to point to the backup file server.)
- Hardware:
- In anticipation of hardware failure, we bought
15 extra of everything. (Power supplies,
motherboards, memory, processors, etc.)
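The failover step described under Backups (changing the /etc/fstab entry on each workstation) can be sketched as a small text rewrite. This is an illustration only: the host names `fileserver1` and `fileserver2`, the export path, and the mount options are hypothetical, and the function works on a string rather than touching the real file.

```python
def repoint_nfs_mounts(fstab_text, old_server, new_server):
    """Return fstab text with NFS mounts moved from old_server to new_server."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip comments and lines too short to be a mount entry.
        is_entry = len(fields) >= 3 and not line.lstrip().startswith("#")
        # Only touch NFS entries whose device field names the old server.
        if (is_entry
                and fields[2].startswith("nfs")
                and fields[0].startswith(old_server + ":")):
            line = line.replace(old_server + ":", new_server + ":", 1)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical workstation fstab; every name and option here is made up.
example = """\
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=1234 / ext4 defaults 0 1
fileserver1:/home /home nfs rw,hard 0 0
"""

print(repoint_nfs_mounts(example, "fileserver1", "fileserver2"))
```

After the entry changes, each workstation still has to remount the home directories before the switch takes effect.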
Final Results
Student Perceptions
Anecdotally, we have had a good reaction to the change in
computing environment.
- Students are obviously excited about their hand
in the construction of the labs. It also gives the
lower level students a bit of ownership.
- Higher level students like the switch to Linux
and can appreciate command line tricks.
- Introductory students don't have a problem
with the environment. (I use the terminal in my intro class.)
Long Term Observations
- Software management is easier than we imagined. Adding
a Python module, Eclipse plugin, or software package
has never been a problem.
- Hardware maintenance has been more difficult than expected
due to an initial bad batch of graphics cards from the manufacturer.
Dealing with this is a good job for IT.
- Never run upgrades in the middle of the semester, if ever.
- Server reliability has been good, as well as
monitoring. Recently a hard drive went flaky
on the backup file system. We got several emails about
it, pulled it, and installed another
in about 15 minutes. The operating system rebuilt the RAID-1
and carried on.
- Have used the workstations for several research projects
mostly involving parallel Monte Carlo simulations
using MPI and CUDA.
- Hardware failures:
- In anticipation of hardware failure, we bought
15 extra of everything. (Power supplies,
motherboards, memory, processors, etc.)
- After a year of operation, we have seen
3 LCD monitor failures, 2 motherboard
failures, more than 15 graphics card failures, and
1 hard disk failure.
- IT department handles returns.
- Have seen failure rates according to the
classic bathtub curve:
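The bathtub shape can be sketched numerically. The block below is an illustration only: it adds a decreasing Weibull hazard (early failures), a constant rate (random failures), and an increasing Weibull hazard (wear-out). Every parameter value, and the choice of years as the time unit, is invented for the example and is not fitted to the failure counts reported above.

```python
def weibull_hazard(t, shape, scale):
    """Hazard rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t, early=(0.5, 2.0), wear=(4.0, 6.0), random_rate=0.02):
    """Sum of early-failure, random, and wear-out hazard rates.

    early and wear are (shape, scale) pairs. A shape below 1 gives a
    falling rate and a shape above 1 gives a rising rate. All defaults
    are made-up illustration values.
    """
    return weibull_hazard(t, *early) + random_rate + weibull_hazard(t, *wear)


# High at the start, lowest in mid-life, rising again late in life.
for years in (0.1, 1.0, 2.0, 4.0, 6.0):
    print(f"{years:4.1f} years: hazard {bathtub_hazard(years):.3f}")
```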
Sources of Potential and Active Funding
- NSF MRI.
- NSF CCLI.
- Private foundations.
- Capital cycle.
Questions
- Happy to answer any questions.
- Would also enjoy hearing how your schools handle
their computing environments.