Second Forum on Parallel Computing Curricula
Sunday, June 22, 1997
Newport, RI
The integration of parallel processing into the undergraduate Computer Science
curriculum is a question of balance in many areas:
The Computer Science curriculum at Taylor University has been undergoing modifications to incorporate parallel processing for about 5 years. During this process, various approaches have been attempted, goals have been set, and various software and hardware platforms have been used in an effort to incorporate parallel processing into our curriculum in a natural way. While many of the decisions were based largely on local factors, the goals and evaluation will, hopefully, be beneficial to a wider audience.
The nature of the University and the CS program have, obviously, been influential. The University is a fairly small, but relatively selective, liberal arts school with an enrollment of approximately 1,850 students of whom approximately 110 are CS majors. The department also offers a BS in Systems degree which requires another approximately 200 students to take about 30 hours of courses in the department. Other determining factors in the role of parallel processing are the two fairly specialized tracks in the major: Intelligent Systems (Artificial Intelligence) and Graphics.
This paper will discuss the process that was used to establish the current curriculum, the philosophy behind the curriculum, the curriculum itself, typical assignments for various courses, an evaluation of the approach, and probable future directions.
In the process of introducing parallel processing some original goals were set [Tol95]. Modifications have been made based on experience. The goals are based on the following reasons:
Students need an exposure to the issues that arise in parallel computation that are not present in sequential processing. Students also need to realize that many of the same problems (and solutions) exist as for sequential processing. (How many of the problems/questions presented by students are not truly parallel concerns?) Students should be aware that issues of synchronization and protection are important in single processor operating systems and multi-user data bases and have similar solutions.
Based on these reasons, the following goals were established:
There are many ways to present parallel processing. Typically, on a given issue two opposite approaches are possible. Perhaps the biggest challenge in integrating parallel processing into a curriculum is discovering the proper balance based on the scales required by the departmental environment.
Typical trade-offs that must be balanced are discussed briefly below. Some of the items in the list appear to be redundant, but seem to the author to have at least different shades of meaning.
The study of parallel processing in a single course or a few dedicated courses has advantages such as coherent presentation and sequencing of material. The largest problem with integrating most of the material throughout the curriculum is that students do not always have the same background when new areas are approached. However, it is beneficial to present material in a context where students can see more immediate applicability to other areas of study.
Will parallel concepts be taught from a theoretical (maybe almost mathematical) perspective or will areas and algorithms where parallel processing has practical application be emphasized?
Will a course (or courses) be offered where students study about parallel processing (and do some as well) or will parallel processing be used only as a tool in courses where its use is appropriate? If students learn about parallel processing it is typically not very interesting if they can see no use for it. If use as a tool is predominant either students are not prepared to use the tool well or too much time is taken away from other concepts (such as image processing, neural networks, etc.) to discuss parallel issues.
Should programming concepts and models such as SIMD, shared memory, and distributed memory be emphasized or should programming of actual hardware to get answers be stressed? Programming actual machines is probably less beneficial in parallel processing than anywhere else in computer science because of the rapid changes in hardware, environments, and languages.
Simulators are cheap (usually free), typically easy to use, and often fairly realistic. However, they do not DO any parallel processing. It is difficult to convince students that a program has a given speedup on multiple processors because the simulator that took longer to run it says that it is faster.
This question overlaps the tool question to a large extent. Is the emphasis of our teaching parallel processing or are we teaching another area and using parallel processing to illustrate concepts and help the students explore larger, more realistic problems?
What is our definition of parallel processing? A very narrow definition would require special purpose hardware that is truly parallel by everyone's definition. A slightly broader definition would allow the use of networks of workstations and software such as PVM. A very broad definition would encompass the parallel activity inside a modern processor and distributed processing of databases.
A pleasant feature of the above questions is that they can all be answered "Yes" (or "No" depending on your outlook on life.) The key is to find a balance that fits with the other goals of a given curriculum. The answers that we have chosen (and modified) appear in the next section. It is important first to discuss another cornerstone of our approach to curriculum integration.
In order for students to derive the most benefit from discussions of parallel processing, they should be aware of the ways in which traditional sequential processing hardware and software operate in parallel. It is instructive to discuss many applications that are to some degree parallel. Such discussions reinforce the parallel concepts taught and strengthen understanding of concepts that are not narrowly parallel as well. An example would be the use of Microsoft SQL Server in our Information System Design and Database Concepts courses. While this application is more properly defined as distributed or client/server, comparisons can be made with more truly parallel approaches and with truly parallel database hardware. Perhaps this curriculum approach should be described as "parallel-aware."
Perhaps the approach could best be summarized by viewing computing ranging from a single user simple sequential processor system through massively parallel special purpose systems as a continuum with many stops along the way.
In the discussion of parallel applications in our curriculum appearing in the next section, most of our courses are listed. Many of the listed items fall into the category of broad definitions of parallelism.
Our curriculum modifications centered on the Intelligent Systems and Graphics tracks within our major. The approach chosen uses most of the concepts listed above, often both sides of a "balance" as appropriate and possible. The main course used to teach about parallel processing is the Algorithms course, which was increased by 1 credit hour to support the material included. The course is required of all CS majors. The Business Information Systems track, however, takes the course for only 2 hours credit, while the remainder take a 4 hour version of the course. This dual nature is accomplished by offering the course 4 days a week with the BIS students finishing at mid-term. Algorithmic topics and less theoretical areas are covered during the first half of the semester with the 2 hour course concluding with an introduction (and at least one programming assignment) to parallel processing.
The curriculum has been greatly strengthened by the purchase of 2 parallel processors through funding from an NSF grant (DUE-9552245). One of the processors is a Parsytec PowerXPlorer system. This machine consists of 8 nodes, each of which is an 80 MHz PowerPC with 8 MB of memory and a transputer for communications. The nodes are organized as a grid and use message passing either through the PARIX operating system or PVM. Decent debugging and profiling tools are available. The second machine is a more special purpose SIMD processor from Adaptive Solutions. This machine consists of 512 processors, each of which is essentially a digital signal processor (DSP). Available software includes ImageLNK (an image processing package) and a flexible window-based neural network system that supports training and statistical analysis of results as well as production use.
The parallel topics listed with the following courses have been used, are being used, or may be used in the future.
Assignments listed are those for which an html version was readily available.
Algorithms - sorting
Algorithms - PI by integration
Algorithms - life in parallel
Computer Organization - timing measurements
Language Structures - Occam
Computer Vision - image pre-processing and letter recognition
Additional information, including supporting materials used, is available at http://www.css.tayloru.edu by following the links to individual course pages.
ASSIGNED: September 11, 1996
DUE: September 24, 1996
The odd-even transposition sort is similar to the sequential bubble sort but modified to function in parallel. In each iteration, nearest neighbors are compared and, if necessary, exchanged. However, the comparisons and exchanges cannot all be done simultaneously, because each value would then be involved in two compares and potential exchanges at the same time.
To solve this problem, the comparisons and exchanges are done in two phases. The first phase compares each item with an odd index to the successor (even index) item and exchanges if necessary. The second phase compares each item with an even index to the successor (odd index) item and exchanges if necessary. A SIMD version of the algorithm is presented below.
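The SIMD listing referred to here did not survive in this copy of the assignment. As a stand-in, the following Python sketch simulates the two phases sequentially. Within one phase the compared pairs are disjoint, which is why a SIMD machine could perform them all at the same time. The function name and the use of 0-based indices are assumptions made for illustration; this is not the original handout code.

```python
def odd_even_transposition_sort(values):
    """Sequential simulation of the two-phase odd-even transposition sort."""
    data = list(values)
    n = len(data)
    # n rounds of both phases is a safe upper bound on the work needed.
    for _ in range(n):
        # Phase 1 starts at odd indices, phase 2 at even indices (0-based).
        for start in (1, 0):
            # Pairs (i, i + 1) within a phase never overlap, so on a SIMD
            # machine every pair could be handled by its own processor.
            for i in range(start, n - 1, 2):
                if data[i] > data[i + 1]:
                    data[i], data[i + 1] = data[i + 1], data[i]
    return data
```

Calling `odd_even_transposition_sort([5, 3, 8, 1, 9, 2])` returns a new sorted list and leaves the input untouched.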

The algorithm above assumes there are as many processors as there are data elements.
For P processors and N data items,
There are two possible solutions:
Assigned: September 27, 1996
Due: October 14, 1996
The value of pi may be obtained by doing the following integration:
In this assignment, the above integral will be done numerically on the PowerXPlorer network using PVM. A standard rectangular quadrature approach will be used, with the integration broken into N slices. These slices will be distributed over the n processors in a "farm" or "worker" approach. Thus, each processor will work on (N div n) slices, with the leftover (N mod n) slices assigned to the first (N mod n) processors.
The value of n may range from 1 to 8. NOTE that the node numbers range from 0 to n-1.
Block scheduling will be used which means that each processor will do (N div n) or (N div n + 1) consecutive slices beginning at a slice determined by node number.
The rectangular rule applied to a given slice is as follows:
The total area will be the sum of the partial areas generated by each of the n processors.
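The block scheduling described above can be sketched in Python without any message passing. The integrand and the exact form of the rectangular rule are not reproduced in this copy, so the sketch assumes the commonly used integral of 4/(1+x^2) over [0, 1] and evaluates each slice at its midpoint. The function names are hypothetical, and a loop over node numbers stands in for the n processors.

```python
import math

def block_range(node, num_slices, num_nodes):
    """Return (first_slice, slice_count) for one node under block scheduling."""
    base, extra = divmod(num_slices, num_nodes)
    # The first `extra` nodes each take one additional slice.
    first = node * base + min(node, extra)
    count = base + (1 if node < extra else 0)
    return first, count

def partial_area(node, num_slices, num_nodes):
    """Sum the rectangles owned by one node (midpoint rule, assumed integrand)."""
    width = 1.0 / num_slices
    first, count = block_range(node, num_slices, num_nodes)
    total = 0.0
    for i in range(first, first + count):
        x = (i + 0.5) * width          # midpoint of slice i
        total += 4.0 / (1.0 + x * x)   # assumed integrand: 4 / (1 + x^2)
    return total * width

def estimate_pi(num_slices, num_nodes):
    """Add up the partial areas of all nodes."""
    # In the assignment each node would report its partial area in a PVM
    # message; here the nodes are simply visited one after another.
    return sum(partial_area(k, num_slices, num_nodes) for k in range(num_nodes))
```

With N = 10 slices and n = 4 nodes, nodes 0 and 1 each receive 3 consecutive slices and nodes 2 and 3 each receive 2, matching the (N div n) and (N mod n) rule.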
The following communication and processing will be necessary:
This assignment will use an SPMD approach: the code should be the same for all processors. Therefore, the code must check to see which node it is and behave appropriately as outlined above.
One of the major purposes of this assignment is to investigate speedup on parallel processors. Therefore, you need to conduct the following experiments:
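The list of experiments is not reproduced in this copy. Whatever timings are collected, speedup is conventionally the one-processor run time divided by the n-processor run time, and efficiency is speedup divided by n. The helper names below are hypothetical; this is a small sketch of those two definitions, not part of the original handout.

```python
def speedup(time_one_node, time_n_nodes):
    """Speedup = run time on one node / run time on n nodes."""
    return time_one_node / time_n_nodes

def efficiency(time_one_node, time_n_nodes, num_nodes):
    """Efficiency = speedup / number of nodes (1.0 would be ideal)."""
    return speedup(time_one_node, time_n_nodes) / num_nodes
```

Both functions expect wall-clock times measured for the same number of slices N, so that only the number of nodes differs between runs.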
SEE THE PVM EXAMPLE CODE AVAILABLE FROM THE COURSE WEB PAGE.
Assigned: November 13, 1996
Due: December 3, 1996
Implement the game of life using either PVM message passing on the PowerXPlorer or the shared memory simulator.
The game of life was invented by the mathematician John Conway and has been used for years as a programming exercise. It is commonly classified as a cellular automaton. The problem is conceptually simple and inherently parallel.
The game models life in a society of organisms, usually represented as a rectangular array of cells. Each cell has 8 neighbors (except for cells on the boundary of the finite array). Births and deaths occur according to the following rules:
Typical values of num_to_be_born, die_from_isolation, and die_from_overcrowding are 3, 2, and 3.
In a bounded array, edge cells have fewer adjacent cells, but the rules do not change. Possible adaptations would include wrapping around the "universe", etc.
The design questions involve how to divide the array among the processors and where in the algorithm are the critical regions and synchronization points. For example, regardless of how the array is divided among processors, some of the cells being updated by a processor are neighbors for some other processor and cannot be updated until the other processor has examined them as neighbors.
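One common way around the update-ordering problem is to compute every generation into a fresh array, so that all neighbor counts are taken from the old generation. The Python sketch below shows one such sequential step on a bounded array. The rule list is not reproduced in this copy, so the sketch assumes the standard Conway interpretation of the three parameters: a dead cell is born with exactly num_to_be_born neighbors, and a live cell dies with fewer than die_from_isolation or more than die_from_overcrowding neighbors. Function names are hypothetical.

```python
def count_neighbors(grid, row, col):
    """Count live neighbors of a cell; cells outside the array count as dead."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # a cell is not its own neighbor
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols:
                total += grid[r][c]
    return total

def life_step(grid, num_to_be_born=3, die_from_isolation=2,
              die_from_overcrowding=3):
    """Return the next generation without modifying `grid`."""
    new_grid = []
    for r, row in enumerate(grid):
        new_row = []
        for c, alive in enumerate(row):
            n = count_neighbors(grid, r, c)  # always read the old generation
            if alive:
                survives = die_from_isolation <= n <= die_from_overcrowding
                new_row.append(1 if survives else 0)
            else:
                new_row.append(1 if n == num_to_be_born else 0)
        new_grid.append(new_row)
    return new_grid
```

In a parallel version each processor would own a block of rows and would need a copy of the old boundary rows of its neighbors before computing its own new rows, which is where the synchronization points arise.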
In addition to functioning code, a report should be submitted. This report should specify the way in which the array was partitioned among processors, why it was done this way, comments on the efficiency of this approach and suggested alternatives.
ASSIGNED: November 1, 1996
DUE: November 27, 1996
ESTIMATED COMPLETION TIME: 12 hours (without extra credit)
PURPOSE
The purpose of this experiment is to perform some quantitative
measurements to analyze the communication overhead in distributed
memory (and, optionally, SIMD) parallel architectures. Measurements
will be made on a commercial distributed memory system with a
dedicated, fast communication architecture and on a local area
network of Pentium processors running Linux.
BACKGROUND
The software overhead in message passing is so large that it dominates
other architectural differences such as network topology. (However,
there is a large difference between dedicated networks and distributed
machines on a local area network.) Differences in topology also
make some processors more difficult (or at least different) to
program than others. Therefore, various portable parallel programming
systems have been developed to allow the communication part of
the algorithm to be expressed in a processor independent manner
as long as there is an implementation of the necessary libraries
available for the given architecture. One of the more common
and popular systems is Parallel Virtual Machine (PVM). PVM was
developed by The Oak Ridge National Laboratory, the University
of Tennessee and Emory University. It is available for a variety
of parallel machines as well as various types of networks consisting
of almost any possible UNIX platform. For this experiment a version
of PVM tailored to the 8 node PowerXPlorer and a "generic"
version running over the network of Linux processors will be used.
There are minor differences in the PVM implementation and in time measurement between the systems. Therefore example code to spawn a child process on a different processor, send a message to it, and receive the echo back will be provided for each platform.
The basic design of the experiment is to measure message passing overhead. This overhead can be expressed in terms of the latency or startup cost of sending a message and in terms of the transfer rate at which data can be sent.
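A simple way to separate the two quantities is to assume a linear cost model, time = latency + size / rate, and fit a straight line to measured (message size, one-way time) pairs. The Python sketch below does this with an ordinary least-squares fit. The linear model, the function name, and the choice of bytes and seconds as units are assumptions for illustration; round-trip echo times would be halved before being passed in.

```python
def fit_latency_and_rate(sizes, times):
    """Fit time = latency + size / rate by least squares.

    sizes: message sizes in bytes; times: one-way transfer times in seconds.
    Returns (latency_seconds, rate_bytes_per_second).
    """
    count = len(sizes)
    mean_size = sum(sizes) / count
    mean_time = sum(times) / count
    covariance = sum((s - mean_size) * (t - mean_time)
                     for s, t in zip(sizes, times))
    variance = sum((s - mean_size) ** 2 for s in sizes)
    slope = covariance / variance            # seconds per byte
    latency = mean_time - slope * mean_size  # intercept at size 0
    return latency, 1.0 / slope
```

At least two distinct message sizes are needed, and a wide spread of sizes gives a more stable estimate of both the startup cost and the transfer rate.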
EXPERIMENTS
GUIDELINES
RESULTS
EXTRA CREDIT
Any combination of the extra credit options may be done with a
maximum of 100 points of extra credit available.
DATA PACKING - 25 POINTS
The required experiment makes the assumption that the data to
be transferred between systems is unpacked from the receive buffer
and re-packed into the send buffer before being echoed back to
the originating processor. Modify the required experiment for
both the Linux network and the PowerXPlorer to use the pvm_setsbuf()
function to allow the receive buffer to also be used as the send
buffer.
SIMD - 50 POINTS
A similar communication effect is visible on a SIMD machine that
is designed to support fast (parallel) communication to neighbors
and slow (sequential) communication to other processors. The
CNAPS processor from Adaptive Solutions is such a machine. The
processors are connected in a 1D grid where each processor is
connected to the one before it and after it with an interprocessor
bus in such a way that all processors may receive data from their
neighbor at the same time. Unfortunately, the C compiler does
not generate machine code to use this bus. There are, however,
inline assembly routines available to allow communication over
the interprocessor bus.
Design and conduct an experiment to measure:
As was done in the required part of the experiment, work with varying size data in an attempt to measure overhead and transfer rate.
PARIX - 50 POINTS
The Parsytec PowerXPlorer system has PARIX as its native operating
system. PARIX is able to effectively use the built in network
communication paths and transputer processors to produce lower
overhead and faster transfer rates for messages. Repeat the required
experiment using PARIX code on the PowerXPlorer system and note
differences.
HETEROGENEOUS PVM - 25 POINTS
(This project may not be possible. The necessary software is
not currently installed, but probably can be installed if there
is interest in pursuing this option.)
The version of PVM used on the PowerXPlorer differs from that
used in the Linux network because it uses a custom communication
layer that works only within the parallel processor and, therefore,
makes some assumptions that lead to faster communication. For
example, standard PVM is designed to pass messages between machines
of different architecture so that data conversions are performed
before and after message transfer so that the messages are sent
in a generic network data format. Repeat the PowerXPlorer
experiment using the heterogeneous version of PVM.
PROCESS GRANULARITY - 50 POINTS
The effect of message size as well as communication path noted
in the required experiment should have an effect on actual parallel
programs. For this project, design some meaningful
application where the processing granularity can be varied so
as to require different amounts of communication and different
size messages. Measure how well the results of the required experiment
are verified in an actual application.
BROADCAST - 50 POINTS
Both implementations of PVM support broadcast messages with pvm_mcast().
Design an experiment and gather data to determine how much
time is saved by using this feature for various message sizes
and numbers of recipients.
COS 382 - LANGUAGE STRUCTURES
OCCAM
ASSIGNED: May 9, 1997
DUE: May 16, 1997
SUBMIT IN: ~/382submit/occam
Keep a LOG of actual time needed to complete assignment
Using a pipeline sieve approach, write an occam program to generate a given number of primes. Treat 2 as a special case by simply printing it.
A generation process should generate odd numbers which are passed down a pipeline of filters. If a potential prime is divisible by the prime number assigned to the filter, the potential prime is removed from the stream. If the potential prime arrives at the end of the stream, it is a prime number and is assigned to the next filter. (HINT: A filter must assume that the first number it receives that is not stored in a previous filter is the prime to be assigned to it. A filter knows how many primes are stored in previous filters because it knows which filter it is.) An output process will print all primes that survive the filtering process.
An End-Of-Sequence value should be sent down the pipeline so that the filters know when to expect no more input. If the program is to generate the first N primes, how many odd numbers should the generator process send down the pipeline? (If we were to generate all the primes less than M, we would not know how many filters to create.) Perhaps the last filter could communicate with the generation process.
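Each number need only be tested against primes up to its square root. An efficient approach is to store the square of a prime in its filter and to perform the division only for numbers that are at least as large as that square. A smaller number that reaches the filter has no divisor among the earlier primes and can be passed along unchanged.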
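The structure of the pipeline can be tried out in Python before writing the occam version. The sketch below swaps in Python generators for occam processes and channels: each generator plays the role of one filter, and wrapping the stream in a new generator corresponds to assigning a prime to the next filter. Because the generator of odd numbers is unbounded and the consumer simply stops asking, no End-Of-Sequence value is modeled. All names are hypothetical.

```python
from itertools import count

def prime_filter(prime, source):
    """Pass on every candidate that this filter's prime does not divide."""
    square = prime * prime
    for candidate in source:
        # Candidates below the stored square need no division at all.
        if candidate < square or candidate % prime != 0:
            yield candidate

def first_primes(how_many):
    """Return the first `how_many` primes using a pipeline of filters."""
    primes = [2][:how_many]   # 2 is treated as a special case
    stream = count(3, 2)      # the generation process: 3, 5, 7, ...
    while len(primes) < how_many:
        prime = next(stream)  # whatever reaches the end of the pipeline is prime
        primes.append(prime)
        stream = prime_filter(prime, stream)  # append a new filter stage
    return primes
```

Each stage adds one level of generator nesting, so this sketch is meant for small counts such as `first_primes(10)`; it illustrates the data flow rather than an efficient sieve.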
ASSIGNED: October 11, 1995
DUE: November 8, 1995
Using the CNAPS Image Library and a grey scale image, use at least two operations to enhance an image. Compare the original image and the two modified images from the perspective of both human understanding and the potential for further computer processing.
ASSIGNED: September 22, 1995
DUE: October 9, 1995
Based on the tutorial supplied with the BuildNet neural net software on the CNAPS parallel processor, investigate the use of neural nets to identify characters.
Measuring the success of any curriculum modification is a difficult task. There are a few students in our program who do not like to hear the word "parallel" but the modifications have been generally well received. One of the early indications that our efforts were moving in a good direction was the award of the NSF grant. The proposal was based on what had been done and the potential uses in our Intelligent Systems and Graphics programs.
Perhaps the most positive evaluation is that several students have chosen to use some form of parallel processing for class projects or research projects.
The largest drawback with the approach taken is the lack of a textbook that fits this pattern. Ideally, if time were available, a parallel manual could be written which covered the hardware and software systems available on campus as well as summarizing the more conceptual material involved in the various courses.
We are constantly investigating new approaches. Several simulator environments have been produced in the last few years which have some advantages such as allowing more than one parallel model to be programmed with the same interface [Pea95] or providing a more theoretical PRAM-style approach [Sbp]. The Parallaxis simulator [Bra93] was used before the SIMD processor was available and has a much nicer programming and user environment than the actual hardware. The Southampton Portable Occam Compiler (SPOC) [Occ] converts parallel Occam code to sequential C and has been successfully used as an alternative to a single PC based transputer compiler. It is unlikely that we will use more simulators, however, as the students need to use our real hardware as tools as well as objects of study.
Most of the curriculum modifications in the near future will be attempts to actually do some of the things we have intended in various courses. The department has just completed some curriculum modifications that will affect the parallel effort to some degree, especially in that the Algorithm Design course has been changed from a 2 and 4 hour course to a 3 hour course required of all majors.
In addition, there will be continued effort to use parallel processing in research projects and in research areas tied to course projects.
The changes currently anticipated are listed below.
As the CS2 course (Data Structures) is modified to include a closed lab in the fall, a supplied parallel sort application will be used for quantitative measurements.
The advanced graphics courses will make use of a multiple processor Silicon Graphics system in the Spring 1998 semester.
A project will be designed for the algorithms course which will behave like a parallel data base join operation. Such an example should be a more meaningful SIMD application for the Business track CS majors.
Partial support for this work was provided by the National Science Foundation's Division of Undergraduate Education through grant DUE#9552245.
Computing facilities which may be used for parallel processing projects include:
There have been various projects where parallel processors have been used for image processing and neural network use because of the specialized software available.
Other examples include
The CSS department has recently begun a research project named TA-DAA (Taylor Animation - Driven by Autonomous Agents) which has the following definition:
The goal of TA-DAA is to produce a system to support the creation of graphical animations driven by autonomous agents. The system will operate in a distributed heterogeneous environment and will support flexible input, output and agent structures.
More information on this research project is available from the URL http://www.css.tayloru.edu/~tadaa
Several CS majors are involved with the Taylor University Space Research Program which is responsible for the design and data analysis for multiple satellite instruments. Some of the data involves image processing which will be done in parallel beginning in the fall of 1997. Additional information is available from the URL http://www.css.tayloru.edu/~physics/srtp/contract.html
NATIONAL SCIENCE FOUNDATION GRANT
Integrating Parallel Processing as a Tool Throughout the
Undergraduate Computer Science Curriculum
DUE-9552245
William E. Toll, Principal Investigator
Timothy C. Diller, Co-Investigator
Henry D. Voss, Co-Investigator
Most of the text of the grant proposal is included here.
The laboratory provided by this grant will allow the introduction of parallel processing into additional courses in a Computer Science curriculum where it is already established. Parallel processing will be used as a tool as well as an object of study. In addition, it will be used to support term projects and independent student research. Seven additional courses will include parallel processing bringing the total to 10 (48%) of the department's 21 major courses that include parallel processing. The department has a strong Artificial Intelligence track within the major and has recently begun a Graphics track. The lab provided by this grant will include a SIMD parallel processor which is ideally suited for many of the assignments and projects in areas such as image processing, speech and language processing and neural networks. In addition, a MIMD processor will be acquired which will be used in some of the AI courses, such as Natural Language Processing and Machine Learning, and in Graphics courses. The Computing and System Sciences department views term-long student projects and independent research as an integral part of the curriculum. The parallel processing lab will be used for these projects. Students often undertake research projects as part of an independent study or as part of a summer research program. This program provides funds for student wages while they conduct research projects under the direction of a faculty member. Some of the on-going projects are at a stage where the additional processing power provided by this lab is necessary. The Science Division has a Research Professor who has several projects involving the processing of images from satellite data. Students from other disciplines, such as physics and mathematics, will have access to the parallel processing lab for such research. This project will demonstrate the feasibility of using parallel processing on a large scale in the undergraduate computer science curriculum.
A. CURRENT SITUATION
B. DEVELOPMENT PLAN
C. EQUIPMENT
D. DEPARTMENTAL RESOURCES
E. FACULTY EXPERTISE
F. CURRICULAR NEEDS
G. DISSEMINATION AND EVALUATION
Institutional and Departmental Context
Taylor University is a private liberal arts college located in Upland, Indiana. The 1994 fall enrollment is 1825. The school has a stable enrollment pattern with the number of applicants more than twice the number that can be accepted.
The Computing and System Sciences department offers both majors and minors in Computer Science as well as a Systems program. The department enrolls about 100 CS majors and approximately 250 students who are receiving a BS in Systems with a variety of majors, including Business and Computer Science. The Computer Science major requires 64 hours and consists of a core of courses together with the choice of one of five tracks: Artificial Intelligence, Business Information Systems, Graphics, Scientific Computing or Integrated. The largest enrollments are in the AI and Business tracks although the relatively new Graphics track (1993) is attracting an increasing number of students. The Artificial Intelligence track in the major has been in existence since 1982 and has been well received by students and employers. The specialized courses are taught by knowledgeable faculty as listed below in the section titled Faculty Expertise.
The tracks primarily affected by this proposal are the AI and Graphics tracks. Assignments in AI courses do not require the use of parallel processing. Both tracks have courses requiring projects which include student research and/or application to solve new, practical problems. Many of these projects are developed in cooperation with industry. Courses that include term-long multi-student projects are Introduction to Artificial Intelligence, Computer Vision, Natural Language Processing, Knowledge Based Systems, Machine Learning, Advanced Computer Graphics and Directed Research. In addition, students often continue to work on projects or start new projects during independent studies. Taylor has also been able to fund a small number of students for summer research projects with the Summer Research Training Program. Many of the research projects involve image processing and/or neural networks.
Appendix 1 contains catalog descriptions and recent enrollments for the courses most affected by this project. Both the courses where parallel processing is currently taught and those where it will be added are listed. Appendix C contains current research projects that would be enhanced by this project. A listing of other recent projects is also included.
The Division of Natural Sciences hired a Research Professor in the fall of 1994. This individual has extensive experience with large grants and contracts from his previous employment with Lockheed and working relationships with such schools as Stanford University and the University of Chicago. He will be funded by continued and new contracts and will be able to provide research projects for several Taylor students in such majors as physics and mathematics as well as computer science. His current research emphasis is evaluation of satellite data that requires image processing.
The Computing and System Sciences Department has 7 full-time faculty plus a technical support person. One of the faculty is on leave each year in industry gaining current experience. The research professor mentioned above is also included in this proposal although not part of the Computing and System Sciences Department. His primary duties are conducting research, managing research projects and supervising student projects. Three of the CSS faculty have extensive experience managing student research projects. Several have experience managing large projects in industry.
While many parallel processing curricula emphasize more theoretical issues, the Taylor program also emphasizes practical applications. With the introduction of this project, parallel processing will be used as a tool throughout the curriculum besides being studied as a topic. Because of the present program and experience in AI, graphics and parallel processing, Taylor is in a unique position to demonstrate that a small college can afford and make good use of practical parallel processing facilities. As shown by the list of courses that follows, about 48% of the COS courses will use parallel processing either as a topic of study or as a tool as a result of the funding of this project.
The goals of the proposed parallel processing facility are:
1. to provide adequate parallel hardware for the courses where parallel processing is currently taught - most of these courses teach parallel processing as a topic rather than using it as a tool
2. to allow several additional courses to make use of parallel processing as a tool
3. to enhance the already strong AI program by providing parallel processing tools in the areas of image processing, speech and text processing and neural network use
4. to enable Taylor University CS graduates to have a large amount of actual, practical parallel processing experience
5. to provide adequate equipment to support the strong emphasis on undergraduate research already in place - many of the projects are in need of additional processing power and lend themselves well to a parallel approach
Appendix 2 lists details of the courses where parallel programming projects are currently assigned: Algorithm Design, Computer Organization and Language Structures. Each of these courses will be enhanced by the facilities provided by this project.
A separate section of Appendix 2 lists present course descriptions where parallel assignments will be given based on the availability of this lab. A short summary of these courses and the proposed modification is:
COS 250 Data Structures - parallel sort used with complexity discussion
COS 350 Computer Graphics - modify assignments, such as producing a ray traced image, to use a parallel ray tracer generating more realistic images
COS 351 Computer Vision - parallel image processing on SIMD and option of doing the course project on parallel hardware.
COS 380 Natural Language Processing - SIMD processing for recognition and MIMD processing to have independent agents doing phonological, syntactic and semantic analyses in parallel and communicating the results with each other via a blackboard
COS 423 Advanced Computer Graphics - required independent project on parallel hardware
SYS 411 Machine Learning - parallel processing for neural networks, genetic algorithms, case based learning and decision trees for rule induction - will use both SIMD and MIMD
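As an illustration of the kind of assignment the COS 250 entry above describes, a parallel sort can be sketched as follows. This is a minimal sketch in Python and not the actual course assignment: the function name `parallel_sort`, the `workers` parameter and the use of a thread pool are assumptions made for illustration. In CPython a thread pool will not give real speedup for this CPU-bound work; the sketch only shows the decomposition (sort chunks independently, then merge) that a complexity discussion would examine.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def parallel_sort(values, workers=4):
    """Sort `values` by sorting chunks concurrently and merging the results.

    Hypothetical helper for illustration; `workers` is an assumed parameter.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    values = list(values)
    if not values:
        return []
    # Split the input into at most `workers` contiguous chunks.
    chunk_size = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    # Each chunk is sorted independently: this is the parallel step.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # A sequential k-way merge combines the sorted chunks.
    return list(merge(*sorted_chunks))
```

With n items and p workers, each chunk sort costs roughly O((n/p) log(n/p)) and the final merge costs O(n log p), which is the trade-off such an assignment would ask students to analyze.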
All of the above courses are taught by Professor Toll or Dr. Diller. Professor Toll has experience in Graphics and in parallel graphics code while Dr. Diller has experience in AI.
Appendix 4 lists current research projects. Some of these projects will be greatly enhanced by the availability of parallel hardware. Most of the research projects in the department are directed by Prof. Toll, Dr. Diller or Dr. White. While Dr. White has no parallel processing experience, he does have experience in image processing research and in managing large corporate research projects.
In addition, Dr. Hank Voss, research professor in the Division of Natural Sciences, has much experience in directing research projects and in image processing and data visualization of satellite data. He is very interested in applying parallel processing techniques to the student projects he directs.
The modifications to the above courses and research projects cannot be done without the funding requested in this proposal. The present parallel environment is not adequate to support any expansion of parallel processing. Taylor has a unique and well-recognized AI program and a growing Graphics program. Taylor has a good history of undergraduate student research. Taylor has introduced parallel processing into the present curriculum. The availability of this lab will allow Taylor to be a real leader not only in the individual areas but in their integration and will enable Taylor to serve as a model of what a small liberal arts school can do with AI, Graphics and Parallel Processing.
The project calls for obtaining two pieces of parallel hardware along with necessary support hardware and software. Two units are requested because some of the proposed projects and assignments require the use of SIMD architecture while others require a MIMD architecture.
The timetable for the project includes obtaining the SIMD machine during the first year of the grant and the MIMD machine during the second year. This plan will require learning one new platform each year rather than two new platforms at the same time. A Sparc host will be acquired during the first year.
The goals of the proposed equipment acquisition are stated earlier. To meet these goals, two parallel architectures are needed. Several systems, all in the $50,000 to $65,000 range, were investigated. The only SIMD system seriously considered was the CNAPS system from Adaptive Solutions. This system is designed for exactly the types of AI applications needed at Taylor. No other SIMD systems were discovered during the search process that seemed appropriate enough to investigate carefully. Both shared memory and distributed memory MIMD systems were investigated.
The MIMD systems researched were:
1. PowerXPlorer distributed memory system from Parsytec. This system runs the Parix operating system and is based on 8 PowerPC chips with transputers for communication links. The system is quite flexible in configuration.
2. Gigacube shared memory system from Microway. This system is based on the i860 and consists of 20 nodes.
3. A multi-transputer system from Computer System Architects. This system has a great deal of flexibility in that parts of the network of transputers can be used by different users and the network topology can be configured dynamically rather than requiring re-wiring.
The PowerXplorer has the advantage that the individual processing nodes are very fast and the communication links are user programmable, which could lend itself to some data communications projects. The Gigacube has the advantages that it contains more nodes and is a shared memory system. The Gigacube is based on the i860, which is a good graphics processor. The CSA transputer system has the advantages that more processors are available, that the processors can be shared and reconfigured easily, and that the students and faculty already have transputer experience.
After analysis of the above advantages it was decided the best choice for a MIMD system is the PowerXplorer because of its raw processing power, its well developed and stable PARIX operating system, and the ability for different users to share the processing power during code testing.
Therefore the requested equipment is: (Details and price quotations for the selected systems are found in the budget section.)
1. CNAPS Server II with 512 DSP style SIMD processors. There is 16 Mb of data storage memory. A 68030 processor with 4 Mb of memory controls the interface between a Sparc host and the Server II. Software provided includes a C compiler, a set of libraries, the CodeNet assembly language development system, the BuildNet neural network development system, and a third-party image processing library LNK_ImageLib. The system connects to a controlling host through an ethernet connection. The total price of this system is $56,300.
2. Parsytec PowerXPlorer distributed memory MIMD system. Each of the 8 processor nodes consists of an 80 MHz MPC601 and a 30 MHz T805 Transputer for communication. A C compiler is included. The system runs the PARIX operating system that allows for several systems to be used such as PVM, P4 and Linda. In addition, the system supports multiuser partitioning allowing multiple students to do development work simultaneously. The system is controlled by a Sparc host connected through an S bus card and costs $58,718.
3. Tatung model 2050B Sparc 20 system with 64 Mb of memory and 1 Gb of disk. This system will be used as the system controller via ethernet for the CNAPS system and the controlling system via S bus interface for the Parsytec system. This system was chosen based on the attractive price relative to an equivalent Sun system and the positive experience the department has had with current Tatung systems. The model chosen was selected because it has sufficient processing power and memory to support large host data sets and multiple simultaneous compiles. The cost of this system is $10,160.
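As a quick check on the figures above, the three quoted prices can be summed. This is only a sketch: the dictionary keys are labels chosen here for illustration, and the total covers just the three prices quoted in this list. The budget section may contain other items that are not reflected.

```python
# Prices as quoted in the equipment list above (US dollars).
quoted_prices = {
    "CNAPS Server II": 56_300,
    "Parsytec PowerXPlorer": 58_718,
    "Tatung 2050B Sparc 20": 10_160,
}

# Sum of the three quoted prices only; not necessarily the full budget.
total = sum(quoted_prices.values())
print(f"Total of quoted equipment prices: ${total:,}")
```

The three quoted prices sum to $125,178.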
The CSS Department has good computing facilities to support its program. The hardware and software available to CSS students is listed in Appendix 1. The equipment which currently supports parallel processing and which will continue to be used includes:
Funding for these facilities has been provided by a combination of University special allocations and generous alumni support. The Sun workstations were funded by an NSF grant in 198X that provided facilities to begin the AI program. This previous grant was very important to the success of the AI program.
Research activities in the Science Division as a whole and the Computing and System Sciences Department have been greatly stimulated in the past 5 years by a joint effort between Eli Lilly, Inc. and Taylor University. Additional funds from other sources and a small endowment have allowed the Research Training Program (RTP) to fund several student projects. Including funds secured by the Research Professor during the current school year, the RTP has had available about $435,000, which has been used to fund 78 projects involving 90 students and 12 faculty. Forty-two papers have been presented as a result of the work. The CSS Department has been active in the RTP since its inception and has had 3 faculty direct 7 projects involving several students. In the future, significant funding will come from the grants and projects directed by the research professor as well as the RTP. The RTP funds have been restricted to student pay, small faculty stipends and supplies.
Curriculum vitae of the principal investigators are attached. A short summary of experience relevant to this project is included here:
Dr. Timothy Diller, Professor of Computing and System Sciences and Director of the AI Program: MA and PhD in Linguistics. He has industrial experience in speech and natural language processing, expert system design and project management.
Professor Bill Toll, Associate Professor of Computing and System Sciences: MS in Computer Science, nearing completion of PhD in Computer Science with research in Graphics. His research includes implementation of a parallel algorithm on transputer and Sequent shared memory parallel processors. He has attended an NSF workshop on parallel processing at Illinois State University and has returned to this workshop as an invited speaker. He will present a paper at the 1995 SIGCSE conference on decisions in the introduction of parallel processing into the CS curriculum. He is responsible for implementation of parallel processing in the CS curriculum.
Dr. Hank Voss, Research Professor: PhD in Electrical Engineering. He has considerable experience in space physics and image processing projects while working at Lockheed's Palo Alto Research Laboratory. He has been principal or co-investigator on numerous space projects.
Dr. Arthur White, Associate Professor of Computing and System Sciences: MA in Computer Science and EdD in Biology. He has experience in biologically-oriented image processing research projects and experience in managing and technical development of a large expert system for diagnosis of faults in medical analyzers. Dr. White will be involved with the student research projects.
Other faculty members have industrial or research experience in systems analysis and design, project management, development of large expert systems, floating point numbers and compilers, satellite data analysis and geographic information systems.
Based on the above qualifications, the faculty of the Computing and System Sciences Department have the necessary background, experience and interest to effectively use parallel processing hardware in the AI and Graphics curriculum and to direct student research projects. The experience of Dr. Voss will also be valuable in the types of projects envisioned for these facilities. Professor Toll will have the responsibility, along with students experienced in parallel processing, of helping the other faculty learn to use the parallel processing capability provided by this project.
F. Curricular Needs
The introduction of parallel processing into the CS curriculum has been quite successful. Parallel processing is currently taught in 3 courses and has been used as a tool in a few research projects. The one exception has been the AI curriculum. Although AI has many natural applications of parallel processing, the courses neither teach applications of parallel techniques nor require the use of parallel processing in assignments because of the lack of practical hardware.
The overall success of the parallel emphasis can perhaps best be measured by the fact that students are choosing to use parallel processing in course projects and independent research projects. The major current limitation to the use of parallel processing is the lack of true parallel hardware and software. The only parallel hardware available is the 6 transputer network and P4 running on the workstations. The transputers are accessible to only one user at a time and the workstations cannot be made exclusively available to network parallel processing. Real speedup cannot be measured in class assignments because of the use of simulators and the small number of transputers available. Even with these limitations, students have used the transputers for research projects.
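To make concrete what measuring "real speedup" would involve once suitable hardware is available, the standard definitions can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are chosen here for illustration, and the timings in the example are hypothetical, not measurements from any Taylor system.

```python
def speedup(sequential_seconds, parallel_seconds):
    """Speedup = sequential run time / parallel run time."""
    if sequential_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return sequential_seconds / parallel_seconds


def efficiency(sequential_seconds, parallel_seconds, processors):
    """Efficiency = speedup / number of processors (1.0 is ideal)."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(sequential_seconds, parallel_seconds) / processors


# Hypothetical timings for illustration only: 12 s sequentially,
# 3 s on a 6-processor network.
example_speedup = speedup(12.0, 3.0)
example_efficiency = efficiency(12.0, 3.0, 6)
print(f"speedup {example_speedup:.2f}, efficiency {example_efficiency:.2f}")
```

Both quantities depend on wall-clock timings from real parallel runs, which is why simulators and hosts shared with other users make them hard to obtain in class assignments.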
The most pressing need in the CS curriculum is for practical parallel hardware and software that will not only function well in courses where parallel processing is currently taught but will allow for the introduction of parallel processing into additional courses and will be adequate and appropriate for the research part of Taylor's curriculum.
G. DISSEMINATION AND EVALUATION
It is very important that results of this grant be disseminated as widely as possible and in as many ways as possible. The bibliography lists many references where parallel processing is demonstrated to be of great value in AI and graphics. In addition many references report on the use of parallel processing from a theoretical perspective. Others report on the integration of parallel processing into many areas of the curriculum. This project will demonstrate that parallel processing can be integrated throughout an undergraduate curriculum in such a way that it is studied both as a topic in itself and as an important tool to solve problems. Furthermore, that knowledge can be applied by the students through the use of parallel processing in various research projects. Such results have not been seen in the literature.
Since equipment such as that requested in this proposal is beginning to be available at a reasonable cost, this project will demonstrate the feasibility of adopting a comprehensive integration of parallel processing into the undergraduate CS curriculum. Further, it will show that such integration can be accomplished at a small liberal arts college. Therefore, the target audience for dissemination of the results of this project will be undergraduate CS educators and research project leaders. The following avenues of dissemination will be appropriate.
In order to measure the success of a project such as this one and to create credibility for some of the presentations and papers discussed above, an evaluation of the changes caused by this project must be done. One obvious measure of the success of the project is the success of the attempts to disseminate results as listed above. The number of submitted papers and proposed presentations based on research projects that are accepted will be one measure of the success of the use of parallel processing in such areas. The number of accepted papers and presentations based on the curricular issues involved will measure the interest by other universities and colleges in the project and its results. The funding and interest in the possible workshop for undergraduate faculty will also measure this level of interest.
The above evaluation criteria are of an external nature. To prepare some of the above material, internal evaluation must also be conducted. Such evaluation should be done on its own merit. Examples of such evaluations will include: