|
ECE
254B: Information on Previous Offerings
Behrooz
Parhami: 2008/08/27
|| E-mail: parhami at ece.ucsb.edu || Other
contact info at: Bottom
of this page
Go up to: B.
Parhami's course syllabi or his
home page
Previous offerings of the course
Return to:
Top of this page
ECE 254B: Spring Quarter 2006
|
Course: |
ECE 254B – Advanced Computer Architecture: Parallel Processing, University
of California, Santa Barbara, Spring 2006, Enrollment Code 10835 |
|
Catalog
entry: |
254B.
Advanced Computer Architecture: Parallel Processing. (4)
PARHAMI. Prerequisite:
ECE 254A. Lecture, 4 hours. The nature of concurrent computations.
Idealized models of parallel systems. Practical realization of
concurrency. Interconnection networks. Building-block parallel algorithms.
Algorithm design, optimality, and efficiency. Mapping and scheduling of
computations. Example multiprocessors and multicomputers. (S)
|
|
Instructor: |
Behrooz
Parhami, Room 5155 Engineering I, Phone
805-893-3211, parhami at ece dot ucsb dot edu |
|
Meetings: |
MW 10:00-11:30, Phelps Hall, Room 1431 |
|
Consultation: |
Open office
hours, held in Room 5155 Engineering I – M
11:30-1:00, W 3:30-5:00 |
|
Motivation: |
The ultimate
efficiency in parallel systems is to achieve a computation speedup factor
of p with p processors. Although often this ideal cannot be achieved, some
speedup is generally possible by using multiple processors in a concurrent
(parallel or distributed) system. The actual speed gain depends on the
system’s architecture and the algorithm run on it. This course focuses
on the interplay of architectural and
algorithmic speedup techniques.
More specifically, the problem of algorithm design for
“general-purpose” parallel systems and its “converse”, the
incorporation of architectural features to help improve algorithm
efficiency and, in the extreme, the design of algorithm-based
special-purpose parallel architectures, are dealt with.
The foregoing notions will be covered in sufficient detail to allow
extensions and applications in a variety of contexts, from network
processors, through desktop computers, game boxes, Web server farms,
multiterabyte storage systems, and mainframes, to high-end supercomputers. |
|
Prerequisite: |
Basic computer
architecture at the level of ECE 154; an advanced architecture course,
such as ECE 254A, would be helpful but can be waived.
|
|
References: |
Required
Text–Parhami,
B., Intro. Parallel Processing:
Algorithms and Architectures, Plenum, 1999. Click on the title to view
detailed information about the textbook, its errata, and downloadable
PowerPoint presentations used in course lectures.
Supplement 1, optional
–
Dally, W.J. and B.P. Towles, Interconnection Networks: Principles and
Practices, Morgan Kaufmann, 2004.
Get more info via www.mkp.com (search for the
title).
Supplement 2, optional
–
Because
the focus of this course is on architecture and its interplay with
algorithms, Sourcebook of Parallel Computing (ed. by J. Dongarra et
al, Morgan Kaufmann, 2003), which deals primarily with software and
application topics, constitutes helpful supplementary reading. Get more
info via www.mkp.com (search for the
title).
Journals – IEEE Trans. Computers, IEEE
Trans. Parallel and Distributed Systems, J.
Parallel & Distributed Computing,
Parallel Computing, Parallel Processing Letters.
Also, IEEE Computer and IEEE Concurrency (the latter ceased publication in late 2000) for broad introductory papers.
Conferences – Int’l Symp. Computer
Architecture (ISCA, since 1973), Int’l Conf. Parallel Processing (ICPP,
since 1972), Int’l Parallel & Distributed Processing Symp. (IPDPS,
formed in 1998 by merging IPPS/SPDP, which were held since 1987/1989), and
ACM Symp. Parallel Algorithms and Architectures (SPAA, since 1988).
Electronic Resources at UCSB
http://www.library.ucsb.edu/eresources/databases/
(electronic journals, collections, etc.)
http://www.library.ucsb.edu/subjects/engineering/ece.html (research
guide in ECE) |
|
Evaluation: |
Students will be
evaluated based on these three components with the given weights:
|
|
|
25%
-- Homework: see the “deadlines” column in course calendar for
schedule. |
|
|
25%
-- Closed-book midterm exam: see the course calendar for date and
coverage. |
|
|
50%
--
Research project participation and report:
see the description below. |
| Research: |
The class will cooperate in
investigating a well-defined problem relating to interconnection networks
for parallel processing. A
brief period in some class sessions will be devoted to defining the research
problem, dividing up the work, reviewing the progress, and setting research
goals.
Please follow the research paper
guidelines to format your research report. Students who obtain results
of publishable quality will receive an "A" for the course, regardless of
homework and midterm exam grades. |
Calendar:
Course
lectures, homework assignments, and exams have been scheduled as follows. This
schedule will be strictly observed. Textbook chapters are specified for lectures
and exams. It is a good idea to look over the covered chapter(s) before
each lecture if possible.
|
Day
& Date |
Chap |
Lecture
Topic |
Deadlines,
Links, Notes |
|
Mon.
4/3/06 |
1 |
Introduction to parallel processing |
|
|
Wed. 4/5/06 |
2 |
A
taste of parallel algorithms |
HW#1
posted (chaps. 1-4) |
|
Mon. 4/10/06 |
3-4 |
Complexity
and models |
|
|
Wed. 4/12/06 |
5 |
PRAM
model and basic algorithms |
Research project
defined |
|
Mon. 4/17/06 |
6 |
More
shared-memory algorithms |
HW #1 due, HW
#2 posted (chaps. 5-8) |
|
Wed. 4/19/06 |
7 |
Sorting
and selection networks |
|
|
Mon. 4/24/06 |
8 |
Other
circuit-level examples |
Research project
background |
|
Wed. 4/26/06 |
9 |
Sorting
on a 2D mesh or torus |
HW #2 due, HW
#3 posted (chaps. 9-12) |
|
Mon. 5/1/06 |
10 |
Routing
on a 2D mesh or torus |
|
|
Wed. 5/3/06 |
11 |
Other mesh/torus algorithms |
Project
assignments |
|
Mon. 5/8/06 |
12 |
Mesh
variations and extensions |
HW #3 due |
|
Wed. 5/10/06 |
1-12 |
Midterm
exam: closed book (10:00-11:50 AM) |
Note
the extended time |
|
Mon. 5/15/06 |
13 |
Hypercubes
and their algorithms |
|
|
Wed. 5/17/06 |
14 |
Sorting
and routing on hypercubes |
HW
#4 posted (chapters 13-17) |
|
Mon. 5/22/06 |
15 |
Other
hypercubic architectures |
|
|
Wed. 5/24/06 |
16 |
Other interconnection networks |
Project plan and references due |
|
Mon.
5/29/06 |
|
Holiday
(Memorial day):
No lecture |
|
|
Wed. 5/31/06 |
17 |
Emulation
and scheduling |
HW #4 due |
|
Mon. 6/5/06 |
18 |
Data
storage, input, and output |
|
|
Wed. 6/7/06 |
19-20 |
Reliability and system issues |
|
|
Wed. 6/14/06 |
|
|
Final project report due by 5:00 PM |
Homework:
General Requirements
Solutions should be brought to class on the due date
and handed in before the lecture begins.
Late
homework will not be accepted, so plan to start work on your assignments early.
Use
a cover page that includes your name, course and assignment number for your
solutions.
Staple
the sheets and write your name on top of every sheet in case sheets are
separated.
Although
some cooperation is permitted, direct copying will have severe consequences.
ECE
254B s2006, Homework #1: Fundamental concepts (chapters 1-4, due Mon. 4/17/06)
Do the following problems from the textbook: 1.2 [10 pts.], 1.12b [10 pts], 2.6
[10 pts.], 2.14 [20 pts], 3.1 [20 pts.], 3.4 [15 pts.],
4.5 [15 pts.]
ECE
254B s2006, Homework #2: Shared-memory and circuit models (chapters 5-8, due Wed.
4/26/05)
Do the following problems from the textbook: 5.6 [15 pts.], 5.9 [10 pts.], 6.2
[15 pts.], 6.13 [15 pts.], 7.6 [15 pts.], 7.8 [15 pts], 8.7 [15 pts]
ECE
254B s2006, Homework #3:
Mesh and mesh-based architectures (chapters
9-12, due Mon. 5/8/05)
Do the following problems from the textbook: 9.7 [20 pts.], 9.9 [10 pts.], 10.9
[20 pts.], 11.1 [15 pts.], 12.6 [20 pts.], 12.10 [15 pts]
ECE
254B s2006, Homework #4: Interconnection networks (chapters 13-17, due Wed.
5/31/05)
Do the following problems from the textbook or defined below: 13.3 [15 pts.],
13.B [15 pts.], 14.8 [20 pts.], 14.C [15 pts.], 15.9 [15 pts.], 16.7 [20 pts]
13.B
Embedding multigrids and pyramids into hypercubes
a. Show that a 2D multigrid whose base is a 2q–1-node square
mesh, with q odd and q ³ 5, and hence a pyramid of the same size, cannot
be embedded in a q-cube with dilation 1.
b. Show that the 21-node 2D multigrid with a 4´4 base can be
embedded in a 5-cube with dilation 2 and congestion 1 but that the 21-node
pyramid cannot.
14.C
Hypercube routing algorithm
Consider a q-cube in which each bidirectional link has been replaced by
two unidirectional links in opposite directions. Dividing the unidirectional
links into “upward” links (connecting x to y with x < y)
and “downward” links (connecting x to y with x > y),
we can view the hypercube as two unidirectional cubes formed by the upward and
downward links. A proposed routing algorithm works as follows. The route taken
by a message going from node u to node w passes through an
intermediate node v such that the path from u to v in the
upward cube and the path from v to w is in the downward cube (v
can be the same as u or w).
a. Show that the proposed algorithm always chooses a shortest path from u
to w.
b. Prove that the algorithm is free from deadlock if used for wormhole routing.
c. How can we benefit from this algorithm in devising a deadlock-free routing
algorithm for a bidirectional hypercube?
ECE
254B s2006, Reading list for midterm exam
Remember
that the midterm exam ends at 11:50 AM (not 11:30 as our normal classes). The
following sections are excluded from the first 12 chapters of the textbook to be
covered in the midterm exam: 2.6, 3.5, 4.5, 4.6, 6.5, 7.6, 8.5, 9.6,
11.4, 11.6, 12.6.
ECE
254B s2006, Reading list for final exam
Not applicable to this quarter.
ECE
254B s2006, Research Project Notes
Due to exchange of original ideas that are as yet unpublished,
most research project notes will be handed out in hard-copy format rather than
posted to this Web page.
Return to:
Top of this page
CE 254B: Spring Quarter 2005
This area reserved for
important course announcements: Research paper and final
course grades have been assigned. An e-mail message, with comments on the
research paper, will be sent to each student no later than Friday, June 17,
2005. Have a nice summer!
|
Course: |
ECE 254B – Advanced Computer Architecture: Parallel Processing, University
of California, Santa Barbara, Spring 2005, Enrollment Code 48413 |
|
Catalog
entry: |
254B.
Advanced Computer Architecture: Parallel Processing. (4)
PARHAMI. Prerequisite:
ECE 254A. Lecture, 4 hours. The nature of concurrent computations.
Idealized models of parallel systems. Practical realization of
concurrency. Interconnection networks. Building-block parallel algorithms.
Algorithm design, optimality, and efficiency. Mapping and scheduling of
computations. Example multiprocessors and multicomputers. (S)
|
|
Instructor: |
Behrooz
Parhami, Room 5155 Engineering I, Phone
805-893-3211, parhami at ece dot ucsb dot edu |
|
Meetings: |
MW 12:00-1:30, Phelps Hall, Room 1431 |
|
Consultation: |
Open office
hours, held in Room 5155 Engineering I – M
2:00-3:30, W 3:30-5:00 |
|
Motivation: |
The ultimate
efficiency in parallel systems is to achieve a computation speedup factor
of p with p processors. Although often this ideal cannot be achieved, some
speedup is generally possible by using multiple processors in a concurrent
(parallel or distributed) system. The actual speed gain depends on the
system’s architecture and the algorithm run on it. This course focuses
on the interplay of architectural and
algorithmic speedup techniques.
More specifically, the problem of algorithm design for
“general-purpose” parallel systems and its “converse”, the
incorporation of architectural features to help improve algorithm
efficiency and, in the extreme, the design of algorithm-based
special-purpose parallel architectures, are dealt with.
|
|
Prerequisite: |
Basic computer
architecture at the level of ECE 154; an advanced architecture course,
such as ECE 254A, would be helpful but can be waived.
|
|
References: |
Required
Text–Parhami,
B., Intro. Parallel Processing:
Algorithms and Architectures, Plenum, 1999. Use the link to view
detailed information about the textbook and its errata.
I found out in early April, 2005, that the UCSB
bookstore is selling our textbook at a much higher price than what is
available from multiple on-line sources (they have priced it even higher
than the book's list price). The price quoted to me is almost double what
was posted on the empty shelves before the books arrived at the bookstore.
That posted price was the basis of my recommendation to you to wait,
rather than purchase the book on-line. This is quite surprising and goes
against the spirit of college bookstores that use their bulk purchasing
power to get the best prices for students. You may want to shop around for
a lower price. If this practice of getting the textbooks two weeks into
the quarter and pricing them ridiculously high continues, I will boycott
the bookstore and ask my students to buy all their books on-line.
Supplement 1, optional
–
Dally, W.J. and B.P. Towles, Interconnection Networks: Principles and
Practices, Morgan Kaufmann, 2004.
Get more info via www.mkp.com (search for the
title).
Supplement 2, optional
–
Because
the focus of this course is on architecture and its interplay with
algorithms, Sourcebook of Parallel Computing (ed. by J. Dongarra et
al, Morgan Kaufmann, 2003), which deals primarily with software and
application topics, constitutes helpful supplementary reading. Get more
info via www.mkp.com (search for the
title).
Journals – IEEE Trans. Computers, IEEE
Trans. Parallel and Distributed Systems, J.
Parallel & Distributed Computing,
Parallel Computing, Parallel Processing Letters.
Also, IEEE Computer and IEEE Concurrency (the latter ceased publication in late 2000) for broad introductory papers.
Conferences – Int’l Symp. Computer
Architecture (ISCA, since 1973), Int’l Conf. Parallel Processing (ICPP,
since 1972), Int’l Parallel & Distributed Processing Symp. (IPDPS,
formed in 1998 by merging IPPS/SPDP, which were held since 1987/1989), and
ACM Symp. Parallel Algorithms and Architectures (SPAA, since 1988).
Electronic Resources at UCSB
http://www.library.ucsb.edu/eresources/databases/
(electronic journals, collections, etc.)
http://www.library.ucsb.edu/subjects/engineering/ece.html (research
guide in ECE) |
|
Evaluation: |
Students will be
evaluated based on these three components with the given weights:
|
|
|
25%
-- Homework: see the “deadlines” column in course calendar for
schedule. |
|
|
25%
-- Closed-book midterm exam: see the course calendar for date and
coverage. |
|
|
50%
--
Research project participation and report:
see the description below. |
| Research: |
The class will cooperate in
investigating a well-defined problem relating to interconnection networks
for parallel processing. A
brief period in some class sessions will be devoted to defining the research
problem, dividing up the work, reviewing the progress, and setting research
goals.
Please follow the research paper
guidelines to format your research report. Students who obtain results
of publishable quality will receive an "A" for the course, regardless of
homework and midterm exam grades. |
Calendar:
Course
lectures, homework assignments, and exams have been scheduled as follows. This
schedule will be strictly observed. Textbook chapters are specified for lectures
and exams. It is a good idea to look over the covered chapter(s) before
each lecture if possible.
|
Day
& Date |
Chap |
Lecture
Topic |
Deadlines,
Links, Notes |
|
Mon.
3/28/05 |
1 |
Introduction
to parallelism |
|
|
Wed. 3/30/05 |
2 |
A
taste of parallel algorithms |
HW#1
posted (chaps. 1-4) |
|
Mon. 4/4/05 |
3-4 |
Complexity
and models |
|
|
Wed. 4/6/05 |
5 |
PRAM
model and basic algorithms |
Research project
defined |
|
Mon. 4/11/05 |
6 |
More
shared-memory algorithms |
HW #1 due, HW
#2 posted (chaps. 5-8) |
|
Wed. 4/13/05 |
7 |
Sorting
and selection networks |
Extended due
date for HW#1 |
|
Mon. 4/18/05 |
8 |
Other
circuit-level examples |
Research project
background |
|
Wed. 4/20/05 |
9 |
Sorting
on a 2D mesh or torus |
HW #2 due, HW
#3 posted (chaps. 9-12) |
|
Mon.
4/25/05 |
10 |
Routing
on a 2D mesh or torus |
|
|
Wed. 4/27/05 |
11 |
Other mesh/torus algorithms |
Project
assignments |
|
Mon. 5/2/05 |
12 |
Mesh
variations and extensions |
HW #3 due |
|
Wed. 5/4/05 |
1-12 |
Midterm
exam: closed book (12-2 PM) |
Note
the extended time |
|
Mon. 5/9/05 |
13 |
Hypercubes
and their algorithms |
|
|
Wed. 5/11/05 |
14 |
Sorting
and routing on hypercubes |
HW
#4 posted (chapters 13-17) |
|
Mon. 5/16/05 |
15 |
Other
hypercubic architectures |
|
|
Wed. 5/18/05 |
16 |
Other interconnection networks |
Project plan and references due |
|
Mon.
5/23/05 |
17 |
Emulation
and scheduling |
|
|
Wed. 5/25/05 |
18 |
Data
storage, input, and output |
HW #4 due |
|
Mon. 5/30/05 |
|
Holiday
(Memorial day):
No lecture |
|
|
Wed. 6/1/05 |
19-20 |
Reliability and system issues |
|
|
Wed. 6/13/05 |
|
(Extended from
the due date of 6/8/05) |
Final project report due by 12:00 noon |
Homework:
General Requirements
Solutions should be brought to class on the due date
and handed in before the lecture begins.
Late
homework will not be accepted, so plan to start work on your assignments early.
Use
a cover page that includes your name, course and assignment number for your
solutions.
Staple
the sheets and write your name on top of every sheet in case sheets are
separated.
Although
some cooperation is permitted, direct copying will have severe consequences.
Course
Grade Statistics
| Item |
Out of |
Max |
Mean |
Median |
Min |
SD |
| HW #1 |
100 |
66 |
46 |
44 |
35 |
n/a |
| HW #2 |
100 |
77 |
55 |
47 |
44 |
n/a |
| HW #3 |
100 |
71 |
58 |
66 |
28 |
n/a |
| HW #4 |
100 |
98 |
77 |
63 |
53 |
n/a |
| HW avg. |
100 |
78 |
61 |
62 |
42 |
n/a |
| Research |
100 |
85 |
75 |
78 |
60 |
n/a |
| Midterm |
100 |
88 |
53 |
49 |
16 |
n/a |
| Course* |
100 |
82 |
70 |
70 |
58 |
n/a |
| Course |
A (4.0) |
A |
3.5 |
3.5 |
B |
n/a |
*
Course = 25%(HW avg.) + 25%(Midterm) + 50%(Research)
ECE
254B s2005, Homework #1: Fundamental concepts (chapters 1-4, due Mon. 4/11/05)
Do the
following problems from the textbook or defined below: 1.6 [20 pts.], 2.1 [20
pts.], 2.J [15 points], 3.9 [10 pts.], 3.13 [15 pts.], 4.12 [20 pts.]
Problem
2.J An integer sequence S of length n is stored in the n
leaves of a complete binary tree. Develop an efficient parallel algorithm that
determines whether the sequence is sorted in nondescending order from left to
right. If the sequence is not nondescending, the algorithm should also flag each
out-of-order element (one that is less than the element to its left).
ECE
254B s2005, Homework #2: Shared-memory and circuit models (chapters 5-8, due Wed.
4/20/05)
Do the
following problems from the textbook or defined below: 5.1 [10 pts.], 5.8 [15
pts], 6.7 [20 pts.], 7.5 [20 pts.], 7.E [15 pts.], 8.4 [20 pts.]
Problem 7.E Selection networks – An (n,
k)-selector is an n-input, single-output network of comparators
that selects the kth smallest of the n inputs. For example, an (n,
1)-selector outputs the smallest element and an (n, n)-selector
yields the largest element. (a) Prove or disprove that the zero-one principle
is also valid for selection networks. (b) Design delay-minimal and cost-minimal
(5, 2)- and (5, 3)-selectors. (c) Prove or disprove that the delay of an (n,
k)-selector is strictly less than that of an n-sorter. (d) Prove
or disprove that the cost of an (n, k)-selector is strictly less
than that of an n-sorter.
ECE
254B s2005, Homework #3:
Mesh and mesh-based architectures (chapters
9-12, due Mon. 5/2/05)
Do the
following problems from the textbook or defined below: 9.4 [20 pts.], 9.8 [15
pts.], 10.11 [15 pts], 10.B [20 pts.], 11.E [15 pts.], 12.1 [15 pts.]
Clarification for Problem 12.1: In part b, the assumption is that in any given
step, all processors must communicate along the same dimension. This is
sometimes referred to as uniaxis communication.
Problem 10.B Interval routing in 2D mesh –
Consider an interval routing scheme, as defined in problem 10.14, with the
intervals attached to the four mesh links for node (i, j)
corresponding to the submeshes defined by bounds on row and column numbers for
the destination node (k, l) as follows: above, if k < i,
l ³ j; right, if k
³
i, l > j; below, if k > i, l
£
j; left, if k
£
i, l < j. (a) Draw a few sample paths with various source
destination pairs for the above mesh routing algorithm and state the general
properties of the paths selected. (b) Compare the algorithm to row-first routing
with respect to routing delay and buffer requirements in the worst case. (c)
Compared to row-first routing, is the traffic going through the links more
balanced or less balanced in this algorithm? (d) Is the algorithm deadlock-free
if used with wormhole routing? Why or why not? (e) Modify the above regions or
intervals for the case of a square 2D torus to achieve balance in the traffic
for the various links.
Problem 11.E FIR filter implemented as a linear
array – A finite impulse response (FIR) filter produces a sequence of outputs
yj in response to the sequence of inputs xi
such that yj =
å
akxj–k,
where the ak are the filter’s coefficients and the first p
– 1 outputs are ignored. Show how a linear array of p processors can
perform this computation, with the input xt accepted, and the
output yt produced, in cycle 2t.
ECE
254B s2005, Homework #4: Interconnection networks (chapters 13-17, due Wed.
5/25/05)
Do the
following problems from the textbook or defined below: 13.7 [20 pts.], 13.F [15
pts.], 14.10 [15 pts.], 15.13 [15 pts.], 16.5 [20 pts.], 17.13 [15 pts.]
Problem 13.F Embedding parameters – The three
examples in Fig. 13.2 suggest that dilation, congestion, and load factor are
orthogonal parameters in the sense that knowing two of them does not provide
much, if any, information about the third. Reinforce this conclusion by
constructing, when possible, additional examples for which dilation, congestion,
and load factor are (respectively): (a) 1, 1, 2; (b) 1, 2, 1; (c) 2, 1, 1;
(d) 2, 1, 2; (e) 2, 2, 2
ECE
254B s2005, Reading list for midterm exam
Remember
that the midterm exam ends at 2:00 PM (not 1:30 as our normal classes). The
following sections are excluded from the first 12 chapters of the textbook to be
covered in the midterm exam: 2.6, 3.5, 4.5, 4.6, 6.5, 7.6, 8.5, 8.6, 9.6,
11.4, 12.6 (pp. 250-253).
ECE
254B s2005, Reading list for final exam
Not applicable to this quarter.
ECE
254B s2005, Research Project Notes
Due to exchange of original ideas that are as yet unpublished,
most research project notes will be handed out in hard-copy format rather than
posted to this Web page.
Return to:
Top of this page
ECE 254B: Winter Quarter 2004
This area reserved for
important course announcements: Homework 5 (last homework) has been posted
below.
|
Course: |
ECE
254B – Advanced Computer Architecture: Parallel Processing, University
of California, Santa Barbara, Winter 2004, Enrollment Code 11882 |
|
Catalog
entry: |
254B.
Advanced Computer Architecture: Parallel Processing. (4)
PARHAMI. | |