ECE 254B: Parallel Processing

      


Behrooz Parhami: 2007/06/19  ||  E-mail: parhami@ece.ucsb.edu  ||  Problems: webadmin@ece.ucsb.edu

Other contact info at: Bottom of this page  ||  Go up to: B. Parhami's course syllabi or his home page

      

On June 19, 2007, Professor Parhami's UCSB ECE website moved to a new location. For an up-to-date version of this page, visit it at the new address: http://www.ece.ucsb.edu/~parhami/ece_254b.htm

Background and history of ECE 254B

The graduate-level course ECE 254B was created by Dr. Parhami, shortly after he joined UCSB in 1988. It was first taught as ECE 594L (Special Topics in Computer Architecture: Parallel and Distributed Computations) in the spring quarter of 1989. A year later, it was converted to a regular graduate course (ECE 251). In 1991, Dr. Parhami led an effort to restructure and update UCSB's graduate course offerings in the area of computer architecture. The result was the creation of the three-course sequence ECE 254A/B/C to replace ECE 250 (Advanced Computer Architecture) and ECE 251. The three new courses were designed to cover high-performance uniprocessing, parallel computing, and distributed computer systems, respectively. In 1999, based on a decade of experience in teaching ECE 254B, Dr. Parhami published a textbook on parallel processing; see the course syllabus below.

Link to previous offerings of ECE 254B
ECE 254B: Spring Quarter 2006

This area reserved for important course announcements:  

2006/03/21: The textbook for this course is now available at the UCSB bookstore. Based on past experience, used copies of the required textbook can be purchased at significant discount (at times, 50% or more) from various on-line retailers, compared with its price at the campus bookstore.

Course:   

ECE 254B – Advanced Computer Architecture: Parallel Processing, University of California, Santa Barbara, Spring 2006, Enrollment Code 10835

Catalog entry:   

254B. Advanced Computer Architecture: Parallel Processing. (4) PARHAMI. Prerequisite: ECE 254A. Lecture, 4 hours. The nature of concurrent computations. Idealized models of parallel systems. Practical realization of concurrency. Interconnection networks. Building-block parallel algorithms. Algorithm design, optimality, and efficiency. Mapping and scheduling of computations. Example multiprocessors and multicomputers. (S)

Instructor:   

Behrooz Parhami, Room 5155 Engineering I, Phone  805-893-3211, parhami@ece.ucsb.edu

Meetings:   

MW 10:00-11:30, Phelps Hall, Room 1431 

Consultation:   

Open office hours, held in Room 5155 Engineering I – M 11:30-1:00, W 3:30-5:00

Motivation:   

The ultimate efficiency in parallel systems is to achieve a computation speedup factor of p with p processors. Although often this ideal cannot be achieved, some speedup is generally possible by using multiple processors in a concurrent (parallel or distributed) system. The actual speed gain depends on the system’s architecture and the algorithm run on it. This course focuses on the interplay of architectural and algorithmic speedup techniques. More specifically, the problem of algorithm design for “general-purpose” parallel systems and its “converse”, the incorporation of architectural features to help improve algorithm efficiency and, in the extreme, the design of algorithm-based special-purpose parallel architectures, are dealt with. The foregoing notions will be covered in sufficient detail to allow extensions and applications in a variety of contexts, from network processors, through desktop computers, game boxes, Web server farms, multiterabyte storage systems, and mainframes, to high-end supercomputers.

Prerequisite:   

Basic computer architecture at the level of ECE 154; an advanced architecture course, such as ECE 254A, would be helpful but can be waived.

References:   

Required Text–Parhami, B., Intro. Parallel Processing: Algorithms and Architectures, Plenum, 1999. Click on the title to view detailed information about the textbook, its errata, and downloadable PowerPoint presentations used in course lectures. 

Supplement 1, optional Dally, W.J. and B.P. Towles, Interconnection Networks: Principles and Practices, Morgan Kaufmann, 2004. Get more info via www.mkp.com (search for the title).

Supplement 2, optional Because the focus of this course is on architecture and its interplay  with algorithms, Sourcebook of Parallel Computing (ed. by J. Dongarra et al, Morgan Kaufmann, 2003), which deals primarily with software and application topics, constitutes helpful supplementary reading. Get more info via www.mkp.com (search for the title).

JournalsIEEE Trans. Computers, IEEE Trans. Parallel and Distributed Systems, J. Parallel & Distributed Computing, Parallel Computing, Parallel Processing Letters. Also, IEEE Computer and IEEE Concurrency (the latter ceased publication in late 2000) for broad introductory papers.

Conferences – Int’l Symp. Computer Architecture (ISCA, since 1973), Int’l Conf. Parallel Processing (ICPP, since 1972), Int’l Parallel & Distributed Processing Symp. (IPDPS, formed in 1998 by merging  IPPS/SPDP, which were held since 1987/1989), and ACM Symp. Parallel Algorithms and Architectures (SPAA, since 1988).

Electronic Resources at UCSB

http://www.library.ucsb.edu/eresources/databases/ (electronic journals, collections, etc.)
http://www.library.ucsb.edu/subjects/engineering/ece.html (research guide in ECE)

Evaluation:   

Students will be evaluated based on these three components with the given weights:

   

25% -- Homework: see the “deadlines” column in course calendar for schedule.

   

25% -- Closed-book midterm exam: see the course calendar for date and coverage.

   

50% -- Research project participation and report: see the description below.

Research: The class will cooperate in investigating a well-defined problem relating to interconnection networks for parallel processing. A brief period in some class sessions will be devoted to defining the research problem, dividing up the work, reviewing the progress, and setting research goals. Please follow the research paper guidelines to format your research report. Students who obtain results of publishable quality will receive an "A" for the course, regardless of homework and midterm exam grades.

Calendar:

Course lectures, homework assignments, and exams have been scheduled as follows. This schedule will be strictly observed. Textbook chapters are specified for lectures and exams. It is a good idea to look over the covered chapter(s) before each lecture if possible.  

Day & Date

Chap

Lecture Topic

Deadlines, Links, Notes

Mon. 4/3/06

1

Introduction to parallel processing

Homework requirements

Wed. 4/5/06

2

A taste of parallel algorithms 

HW#1 posted (chaps. 1-4)

Mon. 4/10/06

3-4

Complexity and models

 

Wed. 4/12/06

5

PRAM model and basic algorithms

Research project defined

Mon. 4/17/06

6

More shared-memory algorithms

HW #1 due, HW #2 posted (chaps. 5-8)

Wed. 4/19/06

7

Sorting and selection networks   
Mon. 4/24/06 8 Other circuit-level examples Research project background

Wed. 4/26/06

9

Sorting on a 2D mesh or torus

HW #2 due, HW #3 posted (chaps. 9-12)

Mon. 5/1/06

10

Routing on a 2D mesh or torus

 

Wed. 5/3/06

11

Other mesh/torus algorithms

Project assignments

Mon. 5/8/06

12

Mesh variations and extensions

HW #3 due

Wed. 5/10/06

1-12

Midterm exam: closed book (10:00-11:50 AM)

Note the extended time

Mon. 5/15/06

13 Hypercubes and their algorithms

Wed. 5/17/06

14 Sorting and routing on hypercubes

HW #4 posted (chapters 13-17)

Mon. 5/22/06

15

Other hypercubic architectures

 

Wed. 5/24/06

16

Other interconnection networks

Project plan and references due

Mon. 5/29/06

 

Holiday (Memorial day): No lecture

 

Wed. 5/31/06

17

Emulation and scheduling

HW #4 due

Mon. 6/5/06

18

Data storage, input, and output

 

Wed. 6/7/06

19-20 Reliability and system issues

 

Wed. 6/14/06

 

  Final project report due by 5:00 PM

Homework: General Requirements

Solutions should be brought to class on the due date and handed in before the lecture begins.

Late homework will not be accepted, so plan to start work on your assignments early.

Use a cover page that includes your name, course and assignment number for your solutions.

Staple the sheets and write your name on top of every sheet in case sheets are separated.

Although some cooperation is permitted, direct copying will have severe consequences.  

ECE 254B s2006, Homework #1: Fundamental concepts (chapters 1-4, due Mon. 4/17/06)

Do the following problems from the textbook: 1.2 [10 pts.], 1.12b [10 pts], 2.6 [10 pts.], 2.14 [20 pts], 3.1 [20 pts.], 3.4 [15 pts.], 4.5 [15 pts.]

ECE 254B s2006, Homework #2: Shared-memory and circuit models (chapters 5-8, due Wed. 4/26/05)

Do the following problems from the textbook: 5.6 [15 pts.], 5.9 [10 pts.], 6.2 [15 pts.], 6.13 [15 pts.], 7.6 [15 pts.], 7.8 [15 pts], 8.7 [15 pts]

ECE 254B s2006, Homework #3: Mesh and mesh-based architectures (chapters 9-12, due Mon. 5/8/05)

Do the following problems from the textbook: 9.7 [20 pts.], 9.9 [10 pts.], 10.9 [20 pts.], 11.1 [15 pts.], 12.6 [20 pts.], 12.10 [15 pts]

ECE 254B s2006, Homework #4: Interconnection networks (chapters 13-17, due Wed. 5/31/05)

Do the following problems from the textbook or defined below: 13.3 [15 pts.], 13.B [15 pts.], 14.8 [20 pts.], 14.C [15 pts.], 15.9 [15 pts.], 16.7 [20 pts]

13.B   Embedding multigrids and pyramids into hypercubes
a. Show that a 2D multigrid whose base is a 2q–1-node square mesh, with q odd and q
³ 5, and hence a pyramid of the same size, cannot be embedded in a q-cube with dilation 1.
b. Show that the 21-node 2D multigrid with a 4
´4 base can be embedded in a 5-cube with dilation 2 and congestion 1 but that the 21-node pyramid cannot.

14.C   Hypercube routing algorithm
Consider a q-cube in which each bidirectional link has been replaced by two unidirectional links in opposite directions. Dividing the unidirectional links into “upward” links (connecting x to y with x < y) and “downward” links (connecting x to y with x > y), we can view the hypercube as two unidirectional cubes formed by the upward and downward links. A proposed routing algorithm works as follows. The route taken by a message going from node u to node w passes through an intermediate node v such that the path from u to v in the upward cube and the path from v to w is in the downward cube (v can be the same as u or w).
a. Show that the proposed algorithm always chooses a shortest path from u to w.
b. Prove that the algorithm is free from deadlock if used for wormhole routing.
c. How can we benefit from this algorithm in devising a deadlock-free routing algorithm for a bidirectional hypercube?

ECE 254B s2006, Reading list for midterm exam

Remember that the midterm exam ends at 11:50 AM (not 11:30 as our normal classes). The following sections are excluded from the first 12 chapters of the textbook to be covered in the midterm exam: 2.6, 3.5, 4.5, 4.6, 6.5, 7.6, 8.5, 9.6, 11.4, 11.6, 12.6.

ECE 254B s2006, Reading list for final exam

Not applicable to this quarter.

ECE 254B s2006, Research Project Notes

Due to exchange of original ideas that are as yet unpublished, most research project notes will be handed out in hard-copy format rather than posted to this Web page.

Return to: Top of this page  ||  Go up to: B. Parhami's course syllabi or his home page      


Dr. Behrooz Parhami, Professor

                     Office phone: +1 805 893 3211
E-mail: parhami@ece.ucsb.edu                 Messages: +1 805 893 3716
Dept. Electrical & Computer Eng.                  Dept. fax: +1 805 893 3262
Univ. of California, Santa Barbara                Office: Room 5155 Eng. I
Santa Barbara, CA 93106-9560 USA                      Deliveries: Room 4155 Eng. I