ECE 257A: Fall Quarter 1998 offering


ECE 257A – Fault-Tolerant Computing, University of California, Santa Barbara, Fall 1998


Behrooz Parhami, Room 5155 Engineering I, Phone 805-893-3211


MW 4:00-5:30, Room 1431 Phelps Hall


Open office hours, held in Room 5155 Engineering I – M 3:00-3:50, W 1:00-1:50, R 9:00-9:50


Dependability concerns are integral parts of engineering design. Ideally, we would like our computer systems to be perfect, always yielding timely and correct results. However, just as bridges collapse and airplanes crash occasionally, so too computer hardware and software cannot be made totally immune to unpredictable behavior. Despite great strides in component reliability and programming methodology, the exponentially increasing complexity of integrated circuits and software products makes the design of perfect computer systems nearly impossible. In this course, we study the causes of computer system failures (impairments to dependability), techniques for ensuring correct and timely computations despite such impairments, and tools for evaluating the quality of proposed or implemented solutions.


Prerequisites: Basic digital system design at the level of ECE 152A/B and, preferably, ECE 154.


Reader – Portions of two out-of-print textbooks will be reproduced as the main reference for the course. We are now awaiting permission from the publishers.

Journals – IEEE Trans. Computers, IEEE Trans. Reliability, IEEE Trans. Software Engineering, ACM Trans. Computer Systems, and Information Processing Letters. Also, IEEE Computer, IEEE Micro, IEEE Design & Test of Computers, and ACM Computing Surveys are good sources for broad introductory papers.

Conferences – Int’l Symp. on Fault-Tolerant Computing (FTCS, annual, since 1971), Pacific Rim Int’l Symp. Fault-Tolerant Systems (PRFTS, odd years, since 1991), Conf. on Computer Assurance (COMPASS), IFIP Int’l Working Conf. Dependable Computing for Critical Applications (DCCA).


Students will be evaluated based on these four components with the given weights:


20% -- Homework assigned on Wed. 9/30, 10/14, 11/4, 11/18, each due in 12 days.


25% -- Closed-book midterm, Wed. 10/28, 4:00-6:00 (covers material up to errors).

10% -- Poster presentation of research project, Wed. 12/2, 4:00-6:00, in class.


45% -- Written research report, due by 4:00 pm on Wed. 12/9.

Research: A research paper or term project is required. Subject to be finalized by Wed. 10/21. Preliminary title, abstract, and reference list are due on Wed. 11/11. Final title and references are due on Wed. 11/25. Complete paper is due on Wed. 12/9 by 4:00 pm. All deadlines are firm.

Lecture plan:

Lectures have been scheduled as follows:

M 09/28   Introduction & motivation

W 09/30   Dependability measures

M 10/05   Combinational modeling

W 10/07   State-space modeling

M 10/12   Defect avoidance/circumvention

W 10/14   Fault testing

M 10/19   Fault masking

W 10/21   Error detection

M 10/26   Error correction

W 10/28   MIDTERM EXAM

M 11/02   Malfunction diagnosis

W 11/04   Malfunction tolerance

M 11/09   Degradation allowance/management

W 11/11   Failure confinement/recovery

M 11/16   Self-checking modules

W 11/18   Reconfiguration & voting

M 11/23   Algorithm design methods

W 11/25   Agreement & adjudication

M 11/30   Software redundancy

W 12/02   Research poster session


