Behrooz Parhami's website banner

Menu:

Behrooz Parhami's ECE 254B Course Page for Winter 2023

Parallel activities at a racing car's pit stop

Adv. Computer Architecture: Parallel Processing

Page last updated on 2023 March 20

Enrollment code: 13367
Prerequisite: ECE 254A (can be waived, but ECE 154B is required)
Class meetings: MW 10:00-11:00, Phelps 1431 (inverted classroom)
Instructor: Professor Behrooz Parhami
Open office hours: MW 11:00-11:45, Phelps 1431
Course announcements: Listed in reverse chronological order
Course calendar: Schedule of lectures, homework, exams, research
Homework assignments: Five assignments, worth a total of 40%
Exams: None for winter 2023
Research paper and poster: Worth 60%
Research paper guidlines: Brief guide to format and contents
Poster presentation tips: Brief guide to format and structure
Policy on academic integrity: Please read very carefully
Grade statistics: Range, mean, etc. for homework and exam grades
References: Textbook and other sources (Textbook's web page)
Lecture slides: Available on the textbook's web page
Miscellaneous information: Motivation, catalog entry, history

Course Announcements

Megaphone 2023/03/17: I have sent feedback to everyone who submitted the final list of references and provisional abstract for the research paper. I gave this task priority over grading HW5, so that you'll have more time to finish your research paper by the 3/22 deadline. HW5 grades will be sent to you by the end of Sunday 3/19.
2023/02/28: HW5, the last one for the course, has been posted to the homework area below. The next research milestone is submission of your final references and provisional abstract on W 3/15 (any time).
2023/02/16: Problem 9.19 of HW4, whose description was inadvertently left out, has been added in the homework area below.
2023/02/14: HW4 has been posted to the homework area below. Next week, Feb. 20-24, we won't have any classes, but your preliminary list of research references will be due by midnight on W 2/22.
2023/01/30: HW3 has been posted to the homework area below. One more research topic has been assigned. One student remains without a research topic due to not submitting a list of preferences.
2023/01/26: Research topics have been assigned and appear in the research area below. Given that a handful of topics were quite popular, the assignment was rather difficult and took several iterations to get it right. In the end, 3/4/5/3 students got the topic they ranked 1/2/3/4 in their preference lists. Two students still enrolled in the course have not yet submitted their research preference lists. For the benefit of these two students, only the unclaimed topics D2, D3, and N6 are available to them.
2023/01/21: HW2 has been posted to the homework area below. Your top-4 research topic preferences in priority order, 1-4, will be due by Wednesday 2023/01/25. I will assign topics to students on Thursday 01/26. Later submission of your choice of topics will risk missing out on getting your preferred topic due to prior assignment to someone else.
2023/01/10: HW1 has been posted to the homework area below. UCSB classes have been cancelled through Tuesday 1/10 but, as of now, Wednesday classes will be held as scheduled.
2022/12/08: Welcome to the ECE 254B web page for winter 2023! As of today, 15 students have signed up for the course. I will send out an introductory e-mail to all enrolled students later in December. However, my primary mode of communication about the course will be through this Web page. Make sure to consult this "Announcements" section regularly. The 16 course lectures have been recorded, with links provided in the "Course Calendar" section below. Please watch each lecture before the class session devoted to its discussion, in a flipped-classroom model. Please ignore any references to dates and deadlines in the recorded lectures. The dates given in the following Course Calendar are the ones that apply to winter 2023. Looking forward to meeting all of you!

Course Calendar

Calendar

Discussion sessions, homework assignments, and research milestones are scheduled as follows (lectures have been pre-recorded, so class sessions will be devoted to discussion and Q&A). This schedule will be strictly observed. In particular, no extension is possible for homework due dates; please start work on the assignments early. Each lecture covers topics in 1-2 chapters of the textbook. Chapter numbers are provided in parentheses, after day & date. PowerPoint and PDF files of the lecture slides can be found on the textbook's web page.

Day & Date (book chapters) Lecture topic [HW posted/due] {Research milestones}

M 01/09 (1) Lecture 1   Introduction to parallel processing
W 01/11 (2) Lecture 2   A taste of parallel algorithms [HW1 posted; chs. 1-4]

M 01/16 No lecture: Martin Luther King Holiday
W 01/18 (3-4) Lecture 3   Complexity and parallel computation models {Research topics announced}

M 01/23 (5) Lecture 4   PRAM shared-memory model & algorithms [HW1 due] [HW2 posted; chs. 5-6C]
W 01/25 (6A) Lecture 5   More shared-memory algorithms {Research topic preferences due by midnight}

M 01/30 (6B-6C) Lecture 6   Shared-memory implementations & abstractions {Research topics assigned}
W 02/01 (7) Lecture 7   Sorting and selection networks [HW2 due] [HW3 posted; chs. 7-8C]

M 02/06 (8A) Lecture 8   Search acceleration circuits
W 02/08 (8B-8C) Lecture 9   Other circuit-level examples

M 02/13 (9) Lecture 10   Sorting on a 2D mesh/torus architectures [HW3 due]
W 02/15 (10) Lecture 11   Routing on a 2D mesh/torus architectures [HW4 posted; chs. 9-12]

M 02/20 No lecture: President's Day Holiday
W 02/22 No lecture: Research focus week {Preliminary references due by midnight}

M 02/27 (11-12) Lecture 12   Other mesh/torus concepts
W 03/01 (13) Lecture 13   Hypercubes and their algorithms [HW4 due] [HW5 posted; chs. 13-16]

M 03/06 (14) Lecture 14   Sorting and routing on hypercubes
W 03/08 (15-16) Lecture 15   Other interconnection architectures

M 03/13 (17) Lecture 16   Network embedding and task scheduling [HW5 due]
W 03/15 No lecture, to allow a focus on research {Final references & provisional abstract due by midnight}

W 03/22 {Research paper PDF file due by midnight}

T 03/28 {Course grades due by midnight}

Homework Assignments

Homework image - Turn in solutions as a single PDF file attached to an e-mail sent by the due date/time.
- Because solutions will be handed out on the due date, no extension can be granted.
- Include your name, course name, and assignment number at the top of the first page.
- If homework is handwritten and scanned, make sure that the PDF is clean and legible.
- Although some cooperation is permitted, direct copying will have severe consequences.

Homework 1: Introduction, complexity, and models (Chs. 1-4; due M 2023/01/23, 10:00 AM)
Do the following problems from the textbook: 1.3, 1.10, 1.22 (defined below), 2.7, 3.6ab, 3.9, 4.7
1.22   Top supercomputers in the world   Read the paper [Stro15] and answer the following questions in less than one page of typeset text (single spacing okay, if needed).
a. What is the significance of the Top500 supercomputers list?
b. In Figure 1, why is the topmost curve wavy whereas the other two are relatively smooth?
c. How many processors are used in the current top supercomputer on the Top500 list?
d. What is the current performance level of the top supercomputer on the Top500 list?
e. What is the next milestone after exaflops-level performance and when will it be achieved?
[Stro15] Strohmaier, E., H. W. Meuer, J. Dongarra, and H. D. Simon, "The TOP500 List and Progress in High-Performance Computing," IEEE Computer, Vol. 48, No. 11, pp. 42-49, November 2015.

Homework 2: Shared-memory parallel processing (Chs. 5-6C; due W 2023/02/01, 10:00 AM)
Do the following problems from the textbook: 5.3, 5.9, 6.1, 6.24, 16.20, 16.27 (last three defined below)
6.24   Parallel searching in an unsorted list   The worst-case complexity of parallel searching in an unsorted list was determined to be n/p, yielding the perfect speed-up of p. Assuming that the probability of the item sought being in the list is q, perform an analysis of the average-case complexity and determine the associated speed-up.
16.20   Clos network   Consider a Clos network with rs inputs, rs outputs, and three columns (0-2) of switches. Columns 0 and 2 contain r switches, each of which is an s × s crossbar. Column 1 contains s switches that are r × r crossbars. Zero-origin top-to-bottom indexing is used to identify the switches in each column and their input/output lines. Switch terminals are identified by in(c, b, a) and out(c, b, a), with c being the column index, b the switch (block) index, and a the line index. Intercolumn connections are as follows:
for all x and y (0 ≤ xr – 1, 0 ≤ y ≤ s – 1),
out(0, x, y) is connected to in(1, y, x)
out(1, y, x) is connected to in(2, x, y)
a. Prove that the Clos network can realize any rs × rs permutation.
b. If the cost of an m × m crossbar switch is m^2 units, what are the optimal values of r and s for a given total number n = rs of inputs?
c. Try to prove a more general optimality result that holds for a class of cost functions rather than only for the one given in part b.
16.27   Permutation routing on multistage networks   Show that the following network built of 2 × 2 switches cannot possibly route all permutations.
A multistage network that isn't a permutation network

Homework 3: Circuit model of parallel processing (Chs. 7-8C; due M 2023/02/13, 10:00 AM)
Do the following problems from the textbook: 7.3, 7.5, 7.16ab, 8.7, 8.8, 8.20 (defined below)
8.20   Two-dimensional FFT   A (2^(2q))-point 2D FFT computation is defined as follows. The 2^(2q) data elements are viewed as a (2^q) × (2^q) array. A (2^q)-point FFT is performed on each row of the array separately. Then, a (2^q)-point FFT is performed on each column. Thus, two sets of results, corresponding to row and columns FFTs, are computed. Show how the required results can be obtained in one pass through a (2q)-dimensional butterfly with (2^(2q))(2q + 1) nodes.

Homework 4: Mesh/torus-connected parallel computers (Chs. 9-12; due W 2023/03/01, 10:00 AM)
Do the following problems from the textbook: 9.6, 9.19 (defined below), 10.7, 10.14abc, 11.4, 12.6ab
9.19   Analysis of shearsort   On a p-processor 2D mesh with one dimension equal to x (x < sqrt(p)), is shearsort faster when x is the number of rows or the number of columns? Fully justify your answer.

Homework 5: Hypercubic & other parallel computers (Chs. 13-16; due M 2023/03/13, 10:00 AM)
Do the following problems from the textbook: 13.1, 13.10, 14.8, 14.13, 15.13, 16.30 (defined below)
16.30   Petersen graph   Show that the 10 nodes of the Petersen graph can be labeled with 2-element subsets of {1, 2, 3, 4, 5}, such that two nodes are connected if their corresponding subsets have an empty intersection.

Sample Exams and Study Guides (not applicable to winter 2023)

Answer sheet

The following sample exams using problems from the textbook are meant to indicate the types and levels of problems, rather than the coverage (which is outlined in the course calendar). Students are responsible for all sections and topics, in the textbook, lecture slides, and class handouts, that are not explicitly excluded in the study guide that follows the sample exams, even if the material was not covered in class lectures.

Sample Midterm Exam (105 minutes) (Chapters 8A-8C do not apply to this year's midterm)
Textbook problems 2.3, 3.5, 5.5 (with i + s corrected to j + s), 7.6a, and 8.4ac; note that problem statements might change a bit for a closed-book exam.

Midterm Exam Study Guide
The following sections are excluded from Chapters 1-7 of the textbook to be covered in the midterm exam, including the three new chapters named 6A-C (expanding on Chpater 6):
3.5, 4.5, 4.6, 6A.6, 6B.3, 6B.5, 6C.3, 6C.4, 6C.5, 6C.6, 7.6

Sample Final Exam (150 minutes) (Chapters 1-7 do not apply to this year's final)
Textbook problems 1.10, 6.14, 9.5, 10.5, 13.5a, 14.10, 16.1; note that problem statements might change a bit for a closed-book exam.

Final Exam Study Guide
The following sections are excluded from Chapters 8A-19 of the textbook to be covered in the final exam:
8A.5, 8A.6, 8B.2, 8B.5, 8B.6, 9.6, 11.6, 12.5, 12.6, 13.5, 15.5, 16.5, 16.6, 17.1, 17.2, 17.6, 18.6, 19

Research Paper and Presentation

Colored marbles Our research focus this quarter will be on interconnection networks, which constitute important components for realizing efficient, large-scale parallel processors. This is a continuation of last year's focus area, with topics assigned to students last year marked as "Not available for winter 2023." You will not be required to present your work orally as a poster: Only a written report in PDF format is to be submitted by the specified deadline.

Our textbook covers some general ideas on interconnection networks in Chapter 16 and many example networks throughout the book. In addition, you will find the first two of the following references useful. The first one has good theoretical coverage, while the second one focuses on networks used in actual top-of-the-line supercomputers. The third reference is listed for its relative recency.
[Xu13] J. Xu, Topological Structure and Analysis of Interconnection Networks, Springer, 2013.
[Trob16] R. Trobec, R. Vasiljevic, M. Tomasevic, V. Milutinovic, R. Beivide, and M. Valero, "Interconnection Networks in Petascale Computer Systems: A Survey," ACM Computing Surveys, Vol. 49, No. 3, pp. 1-24, 2016.
[Moud21] M. Moudi and M. Othman, "A Survey on Emerging Issues in Interconnection Networks," Int'l J. Internet Technology and Secured Transactions, Vol. 11, No. 2, pp. 131-159, 2021.

Banner illustrating biswapped networks

Here is a list of research paper titles, presented in five sections. Sample references follow the titles to help define the scope and serve as starting points.

For topic selection, you should give me your first to fourth choices among the topics that follow by the "Research topic preferences due" deadline. I will then assign a topic to you within a few days, based on your preferences and those of your classmates. Each preference should be presented in the following format (shown for someone's third choice):

(3) D2. Fault-Diameter of Interconnection Networks

Please include both the topic's two-character code and its full title, as it is quite easy to commit a typo in the code and end up with a wrong topic.

Dependability Attributes

D1. Node- and Edge-Connectivity of Networks (Assigned to: Joshua Kim)
[Gutm06] I. Gutman and S. Zhang, "Graph Connectivity and Wiener Index," Bulletin (Academie Serbe des Sciences et des Arts. Classe des Sciences Mathematiques et Naturelles. Sciences mathematiques), pp. 1-5, 2006.
[Wigd92] A. Wigderson, "The Complexity of Graph Connectivity," Proc. Int'l Symp. Mathematical Foundations of Computer Science, Springer, pp. 112-132, 1992.

D2. Fault-Diameter of Interconnection Networks (Assigned to: TBD)
[Kris87] M. S. Krishnamoorthy and B. Krishnamurthy, "Fault Diameter of Interconnection Networks," Computers & Mathematics with Applications, Vol. 13, Nos. 5-6, pp. 577-582, 1987.
[Xu02] J.-M. Xu and X. Xie, "On Fault-Tolerant Diameter and Wide Diameter of Graphs," J. University of Science and Technology of China, Vol. 32, No. 2, pp. 135-139, 2002.

D3. Wide-Diameter of Interconnection Networks (Assigned to: TBD)
[Xu02] J.-M. Xu and X. Xie, "On Fault-Tolerant Diameter and Wide Diameter of Graphs," J. University of Science and Technology of China, Vol. 32, No. 2, pp. 135-139, 2002.
[Gao15] S. Gao and C. Zhu, "Wide Diameter for Two Families of Interconnection Networks," Wuhan University J. Natural Sciences, Vol. 20, No. 5, pp. 375-380, 2015.

D4. Parallel Disjoint Paths in Networks (Not available for winter 2023)
[Iqba15] F. Iqbal and F. A. Kuipers, "Disjoint Paths in Networks," Wiley Encyclopedia of Electrical and Electronics Engineering, 2015.
[Ling14] S. Ling and W. Chen, "Node-to-Set Disjoint Paths in Biswapped Networks," The Computer J., Vol. 57, No. 7. pp. 953-967, 2014.

D5. Modeling of Interconnection Network Reliability (Not available for winter 2023)
[Dash12] R. K. Dash, N. K. Barpanda, P. K. Tripathy, and C. R. Tripathy, "Network Reliability Optimization Problem of Interconnection Network Under Node-Edge Failure Model," Applied Soft Computing, Vol. 12, No. 8, pp. 2322-2328, 2012.
[MdYu16] N. A. Md Yunus, M. Othman, Z. Mohd Hanapi, and K. Y. Lun, K.Y., "Reliability Review of Interconnection Networks," IETE Technical Review, Vol. 33, No. 6, pp. 596-606, 2016.

D6. Dependability Assessment via Resistance Distance in Networks (Assigned to: Jonghyun Park)
[Klei93] D. J. Klein and M. Randic, "Resistance Distance," J. Mathematical Chemistry, Vol. 12, No. 1, pp. 81-95, December 1993.
[Tizg10] A. Tizghadam and A. Leon-Garcia, "Betweenness Centrality and Resistance Distance in Communication Networks," IEEE Network, Vol. 24, No. 6, pp. 10-16, November 2010.

Network Classes

N1. Swapped and Biswapped Networks (Assigned to: Eimon Erfanfar)
[Yeh96] C. H. Yeh and B. Parhami, "Swapped Networks: Unifying the Architectures and Algorithms of a Wide Class of Hierarchical Parallel Processors," Proc. Int'l Conf. Parallel and Distributed Systems, pp. 230-237, 1996.
[Xiao11] W. Xiao, B. Parhami, W. Chen, M. He, and W. Wei, "Biswapped Networks: A Family of Interconnection Architectures with Advantages over Swapped or OTIS Networks," Int'l J. Computer Mathematics, Vol. 88, No. 13, pp. 2669-2684, 2011.

N2. Constant-Diameter Networks (Not available for winter 2023)
[Parh05] B. Parhami and M. Rakov, "Perfect Difference Networks and Related Interconnection Structures for Parallel and Distributed Systems," IEEE trans. Parallel and Distributed Systems, Vol. 16, No. 8, pp. 714-724, 2005.
[Kita16] T. Kitasuka and M. Iida, "A Heuristic Method of Generating Diameter 3 Graphs for Order/Degree Problem," Proc. 10th IEEE/ACM Int'l Symp. Networks-on-Chip (NOCS), pp. 1-6, 2016.

N3. Chordal-Ring Networks (Not available for winter 2023)
[Bujn03] S. Bujnowski, B. Dubalski, and A. Zabludowski, "Analysis of Chordal Rings," Mathematical Techniques and Problems in Telecommunications, pp. 257-279, 2003.
[Parh99] B. Parhami and D.-M. Kwai, "Periodically Regular Chordal Rings," IEEE Trans. Parallel and Distributed Systems, Vol. 10, No. 6, pp. 658-672, 1999.

N4. Low- vs. High-Dimensional Mesh Networks (Not available for winter 2023)
[Dall90] W. J. Dally, "Performance Analysis of k-ary n-cube Interconnection Networks," IEEE Trans. Computers, Vol. 39, No. 6, pp. 775-785, 1990.
[Parh04] B. Parhami and D. M. Kwai, "Comparing Four Classes of Torus-Based Parallel Architectures: Network Parameters and Communication Performance," Mathematical and Computer Modelling, Vol. 40, Nos. 7-8, pp. 701-720, 2004.

N5. Variations on the Fat-Tree Network (Not available for winter 2023)
[Jain17] N. Jain, A. Bhatele, L. H. Howell, D. Bohme, I. Karlin, E. A. Leon, M. Mubarak, N. Wolfe, T. Gamblin, and M. L. Leininger, "Predicting the Performance Impact of Different Fat-Tree Configurations," Proc. Int'l Conf. High Performance Computing, Networking, Storage and Analysis, pp. 1-13, 2017.
[Wang12] Z. Wang, J. Xu, X. Wu, Y. Ye, W. Zhang, M. Nikdast, X. Wang, and Z. Wang, Z., "Floorplan Optimization of Fat-Tree-Based Networks-on-Chip for Chip Multiprocessors," IEEE Trans. Computers, Vol. 63, No. 6, pp. 1446-1459, 2012.

N6. Performance Implications of Network Pruning (Assigned to: Kevin Yuen)
[Kwai04] D. M. Kwai and B. Parhami, "Incomplete k-ary n-cube and its Derivatives," J. Parallel and Distributed Computing, Vol. 64, No. 2, pp. 183-190, February 2004.
[Stew13] I. A. Stewart, "Interconnection Networks of Degree Three Obtained by Pruning Two-Dimensional Tori," IEEE Trans. Computers, Vol. 63, No. 10, pp. 2473-2486, July 2013.

Packaging and Layout

P1. Layout of High-Dimensional Mesh Networks (Assigned to: Henry Chang)
[Weld09] A. Y. Weldezion, M. Grange, D. Pamunuwa, Z. Lu, A. Jantsch, R. Weerasekera, and H. Tenhunen, "Scalability of Network-on-Chip Communication Architecture for 3-D Meshes," Proc. 3rd ACM/IEEE Int'l Symp. Networks-on-Chip, pp. 114-123, 2009.
[Fors11] M. Forsell, V. Leppanen, and M. Penttonen, "Cost of Sparse Mesh Layouts Supporting Throughput Computing," Proc. 14th Euromicro Conf. Digital System Design, pp. 316-323, 2011.

P2. Layout of Multistage Interconnection Networks (Assigned to: Jooyoung Bao)
[Manu07] P. Manuel, K. Qureshi, A. William, and A. Muthumalai, "VLSI Layout of Benes Networks," J. Discrete Mathematical Sciences and Cryptography, Vol. 10, No. 4, pp. 461-472, 2007.
[Yeh00] C. H. Yeh, B. Parhami, E. A. Varvarigos, and H. Lee, "VLSI Layout and Packaging of Butterfly Networks," Proc. 12th Annual ACM Symp. Parallel Algorithms and Architectures, pp. 196-205, 2000.

P3. Design Considerations for Networks on Chip (Assigned to: Ryan Chau)
[Beni02] L. Benini and G. De Micheli, "Networks on Chip: A New Paradigm for Systems on Chip Design," Proc. Conf. Design, Automation and Test in Europe, pp. 418-419, 2002.
[Bolo04] E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny, "Cost Considerations in Network on Chip," Integration, Vol. 38, No. 1, pp. 19-42, 2004.

P4. Energy for On-Chip vs. Off-Chip Network Links (Assigned to: Chaeyun Shim)
[Yayl98] G. I. Yayla, P. J. Marchand, and S. C. Esener, "Speed and Energy Analysis of Digital Interconnections: Comparison of On-Chip, Off-Chip, and Free-Space Technologies," Applied Optics, Vol. 37, No. 2, pp. 205-227, 1998.
[Sikd16] M. A. I. Sikder, "Emerging Technologies in On-Chip and Off-Chip Interconnection Network," Doctoral dissertation, Ohio University, 2016.

P5. Packaging Advantages of Hierarchical Networks (Not available for winter 2023)
[Ragh95] M. T. Raghunath and A. Ranade, "Designing Interconnection Networks for Multi-Level Packaging," VLSI Design, Vol. 2, No. 4, pp. 375-388, 1995.
[Unde12] K. D. Underwood and E. Borch, "Exploiting Communication and Packaging Locality for Cost-Effective Large Scale Networks," Proc. 26th ACM Int'l Conf. Supercomputing, pp. 291-300, 2012.

P6. Impact of Pruning on Network Packaging (Assigned to: Hao Li)
[Kwai98] D. M. Kwai and B. Parhami, "Pruned Three-Dimensional Toroidal Networks," Information Processing Letters, Vol. 68, No. 4, pp. 179-183, 1998.
[Rahm09] M. H. H. Rahman, X. Jiang, M. S. A. Masud, and S. Horiguchi, "Network Performance of Pruned Hierarchical Torus Network," Proc. 6th IFIP Int'l Conf. Network and Parallel Computing, pp. 9-15, 2009.

Routing Algoirthms

R1. Oblivious Network Routing and Its Limitations (Not available for winter 2023)
[Iyen15] S. S. Iyengar and K. G. Boroojeni, Oblivious Network Routing: Algorithms and Applications, MIT Press, 2015.
[Towl02] B. Towles and W. J. Dally, "Worst-Case Traffic for Oblivious Routing Functions," Proc. 14th Annual ACM Symp. Parallel Algorithms and Architectures, pp. 1-8, 2002.

R2. Adaptive Routing in Networks (Not available for winter 2023)
[Duat95] J. Duato, "A Necessary and Sufficient Condition for Deadlock-Free Adaptive Routing in Wormhole Networks," IEEE Trans. Parallel and Distributed Systems, Vol. 6, No. 10, pp. 1055-1067, 1995.
[Pere17] D. Perepelkin and M. Ivanchikova, "Improved Adaptive Routing Algorithm in Distributed Data Centers," Advances in Electrical and Electronic Engineering, Vol. 15, No. 3, pp. 502-508, 2017.

R3. Table-Assisted Routing in Networks (Not available for winter 2023)
[Bose14] A. Bose, P. Ghosal, and S. P. Mohanty, "A Low Latency Scalable 3D NoC Using BFT Topology with Table Based Uniform Routing," Proc. IEEE Computer Society Annual Symp. VLSI, pp. 136-141, 2014.
[Wang09] L. Wang, H. Song, Y. Jiang, and L. Zhang, "A Routing-Table-Based Adaptive and Minimal Routing Scheme on Network-on-Chip Architectures," Computers & Electrical Engineering, Vol. 35, No. 6, pp. 846-855, 2009.

R4. Deadlock-Free Routing in Networks (Assigned to: Brian Li)
[Duat01] J. Duato and T. M. Pinkston, "A General Theory for Deadlock-Free Adaptive Routing Using a Mixed Set of Resources," IEEE Trans. Parallel and Distributed Systems, Vol. 12, No. 12, pp. 1219-1235, 2001.
[Duat95] J. Duato, "A Necessary and Sufficient Condition for Deadlock-Free Adaptive Routing in Wormhole Networks," IEEE Trans. Parallel and Distributed Systems, Vol. 6, No. 10, pp. 1055-1067, 1995.

R5. Broadcasting and Multicasting in Networks (Assigned to: Ethan Sifferman)
[Defa04] X. Defago, A. Schiper, and P. Urban, "Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey," ACM Computing Surveys, Vol. 36, No. 4, pp. 372-421, 2004.
[Haru06] H. A. Harutyunyan and B. Shao, "An Efficient Heuristic for Broadcasting in Networks," J. Parallel and Distributed Computing, Vol. 66, No. 1, pp. 68-76, 2006.

R6. Multi-Path Parallel Routing in Networks (Assigned to: Shu-Yu Li)
[Cido99] I. Cidon, R. Rom, and Y. Shavitt, "Analysis of Multi-Path Routing," IEEE/ACM Trans. Networking, Vol. 7, No. 6, pp. 885-896, 1999.
[Wang12] Y. Wang, W. S. Yang, and J. Wu, "Analysis of a Hypercube-Based Social Feature Multipath Routing in Delay Tolerant Networks," IEEE Trans. Parallel and Distributed Systems, Vol. 24, No. 9, pp. 1706-1716, 2012.

Theory and Foundations

T1. The Degree-Diameter Problem in Networks (Not available for winter 2023)
[Mill12] M. Miller and J. Siran, "Moore Graphs and Beyond: A Survey of the Degree/Diameter Problem," Electronic J. Combinatorics, 2012.
[Stel20] P. Steller, A Survey of the Degree/Diameter Problem for Undirected Graphs, MS thesis, University of Delaware, 2020.

T2. Cayley Networks: Properties and Applications (Not available for winter 2023)
[Gane17] A. Ganesan, "Cayley Graphs and Symmetric Interconnection Networks," Preprint arXiv:1703.08109, 2017.
[Dema13] M. C. Heydemann, "Cayley Graphs and Interconnection Networks," In Graph Symmetry: Algebraic Methods and Applications, p. 167, 2013.

T3. Cartesian-Product Networks (Assigned to: Ray Chang)
[Yous91] A. Youssef, "Cartesian Product Networks," Proc. Int'l Conf. Parallel Processing, pp. 684-685, 1991.
[Mora11] R. Moraveji, H. Sarbazi-Azad, and A. Y. Zomaya, "Performance Modeling of Cartesian Product Networks," J. Parallel and Distributed Computing, Vol. 71, No. 1, pp. 105-113, 2011.

T4. Connectivity of Networks (Assigned to: Sijia Liang)
[Li16] X. Li and Y. Mao, Generalized Connectivity of Graphs, Springer, 2016.
[Fran92] A. Frank, "Augmenting Graphs to Meet Edge-Connectivity Requirements," SIAM J. Discrete Mathematics, Vol. 5, No. 1, pp. 25-53, 1992.

T5. Hamiltonicity of Networks (Assigned to: Alex Lai)
[Kewe11] C. D. Z. Kewen, "A Survey of the Advances of Hamiltonicity on Simple Graphs," Mathematical Theory and Applications, 2011.
[Xu09] J. M. Xu and M. Ma, "Survey on Path and Cycle Embedding in Some Networks," Frontiers of Mathematics in China, Vol. 4, No. 2, pp. 217-252, 2009.

T6. Network Mappings and Embeddings (Assigned to: Shang-Hsun Yang)
[Fisc13] A. Fischer, J. F. Botero, M. T. Beck, H. De Meer, and X. Hesselbach," "Virtual Network Embedding: A Survey," IEEE Communications Surveys & Tutorials, Vol. 15, No. 4, pp. 1888-1906, 2013.
[Rahm10] M. R. Rahman, I. Aib, and R. Boutaba, "Survivable Virtual Network Embedding," Proc. Int'l Conf. Research in Networking, pp. 40-52, 2010.

Poster Preparation and Presentation Tips

Poster format Here are some guidelines for preparing your research poster. The idea of the poster is to present your research results and conclusions thus far, get oral feedback during the session from the instructor and your peers, and to provide the instructor with something to comment on before your final report is due (no presentation for winter 2021, so the instructor feedback will be written). Please send a PDF copy of the poster via e-mail by midnight on the poster due date.

Posters prepared for conferences must be colorful and eye-catching, as they are typically competing with dozens of other posters for the attendees' attention. Here are some tips and examples. Such posters are often mounted on a colored cardboard base, even if the pages themselves are standard PowerPoint slides. You can also prepare a "plain" poster (loose sheets, to be taped or pinned to a wall) that conveys your message in a simple and direct way. Eight to 10 pages, each resembling a PowerPoint slide, would be an appropriate goal. You can organize the pages into 2 x 4 (2 columns, 4 rows), 2 x 5, or 3 x 3 array on the wall. The top two of these might contain the project title, your name, course name and number, and a very short (50-word) abstract. The final two can perhaps contain your conclusions and directions for further work (including work that does not appear in the poster, but will be included in your research report). The rest will contain brief description of ideas, with emphasis on diagrams, graphs, tables, and the like, rather than text which is very difficult to absorb for a visitor in a short time span.

Grade Statistics

Chart All grades listed below are in [0.0, 4.3] (F to A+), unless otherwise noted
HW1 grades: Range = [B, A], Mean = 3.54, Median = A–
HW2 grades: Range = [B–, A+], Mean = 3.53, Median = A–
HW3 grades: Range = [C, A+], Mean = 3.32, Median = B+
HW4 grades: Range = [B, A], Mean = 3.41, Median = B+
HW5 grades: Range = [B–, A], Mean = 3.19, Median = B
Overall homework grades (percent): Range = [75, 102], Mean = 86, Median = 85
Research paper grades: Range = [L, H], Mean = 0.0, Median = M
Course letter grades: Range = [L, H], Mean = 0.0, Median = M

References

Image of a reference book

Required text: B. Parhami, Introduction to Parallel Processing: Algorithms and Architectures, Plenum Press, 1999. Make sure that you visit the textbook's web page which contains an errata. Lecture slides are also available there.
Optional recommended book: T. Rauber and G. Runger, Parallel Programming for Multicore and Cluster Systems, 2nd ed., Springer, 2013. Because ECE 254B's focus is on architecture and its interplay with algorithms, this book constitutes helpful supplementary reading.
On-line: N. Matloff, Programming Parallel Machines: GPU, Multicore, Clusters, & More. [Link]
Research resources:
The following journals contain a wealth of information on new developments in parallel processing: IEEE Trans. Parallel and Distributed Systems, IEEE Trans. Computers, J. Parallel & Distributed Computing, Parallel Computing, Parallel Processing Letters, ACM Trans. Parallel Computing. Also, see IEEE Computer and IEEE Concurrency (the latter ceased publication in late 2000) for broad introductory articles.
The following are the main conferences of the field: Int'l Symp. Computer Architecture (ISCA, since 1973), Int'l Conf. Parallel Processing (ICPP, since 1972), Int'l Parallel & Distributed Processing Symp. (IPDPS, formed in 1998 by merging IPPS/SPDP, which were held since 1987/1989), and ACM Symp. Parallel Algorithms and Architectures (SPAA, since 1988).
UCSB library's electronic journals, collections, and other resources

Miscellaneous Information

Motivation: The ultimate efficiency in parallel systems is to achieve a computation speedup factor of p with p processors. Although often this ideal cannot be achieved, some speedup is generally possible by using multiple processors in a concurrent (parallel or distributed) system. The actual speed gain depends on the system's architecture and the algorithm run on it. This course focuses on the interplay of architectural and algorithmic speedup techniques. More specifically, the problem of algorithm design for "general-purpose" parallel systems and its "converse," the incorporation of architectural features to help improve algorithm efficiency and, in the extreme, the design of algorithm-based special-purpose parallel architectures, are dealt with. The foregoing notions will be covered in sufficient detail to allow extensions and applications in a variety of contexts, from network processors, through desktop computers, game boxes, Web server farms, multiterabyte storage systems, and mainframes, to high-end supercomputers.

Catalog entry: 254B. Advanced Computer Architecture: Parallel Processing(4) PARHAMI. Prerequisites: ECE 254A. Lecture, 4 hours. The nature of concurrent computations. Idealized models of parallel systems. Practical realization of concurrency. Interconnection networks. Building-block parallel algorithms. Algorithm design, optimality, and efficiency. Mapping and scheduling of computations. Example multiprocessors and multicomputers.

History: The graduate course ECE 254B was created by Dr. Parhami, shortly after he joined UCSB in 1988. It was first taught in spring 1989 as ECE 594L, Special Topics in Computer Architecture: Parallel and Distributed Computations. A year later, it was converted to ECE 251, a regular graduate course. In 1991, Dr. Parhami led an effort to restructure and update UCSB's graduate course offerings in the area of computer architecture. The result was the creation of the three-course sequence ECE 254A/B/C to replace ECE 250 (Adv. Computer Architecture) and ECE 251. The three new courses were designed to cover high-performance uniprocessing, parallel computing, and distributed computer systems, respectively. In 1999, based on a decade of experience in teaching ECE 254B, Dr. Parhami published the textbook Introduction to Parallel Processing: Algorithms and Architectures (Website).
Offering of ECE 254B in winter 2023 (Link)
Offering of ECE 254B in winter 2022 (Link)
Offering of ECE 254B in winter 2021 (Link)
Offering of ECE 254B in winter 2020 (Link)
Offering of ECE 254B in winter 2019 (Link)
Offering of ECE 254B in winter 2018 (Link)
Offering of ECE 254B in winter 2017 (Link)
Offering of ECE 254B in winter 2016 (PDF file)
Offering of ECE 254B in winter 2014 (PDF file)
Offering of ECE 254B in winter 2013 (PDF file
Offering of ECE 254B in fall 2010 (PDF file)
Offering of ECE 254B in fall 2008 (PDF file)
Offerings of ECE 254B from 2000 to 2006 (PDF file)