Self-Study Topics on Dependable Computing, 1997
Dependability is the property of a system such that reliance can
justifiably be placed on the service it delivers. Dependable computing
covers a wide range of subjects. This year I am offering seven self-study
topics in two dependable computing subjects: Dependable Real-Time Systems
and Software Dependability.
General Reference on basic concepts and terminology:
J.-C. Laprie, Dependability: Basic Concepts and Terminology.
This publication can be found:
(1) as a book: J.-C. Laprie (Ed.), Dependability: Basic Concepts and
Terminology, Springer-Verlag, Vienna, 1992.
(2) in Proc. 15th International Symposium on Fault-Tolerant Computing
(FTCS-15), 1985.
(3) in the special edition of FTCS-25, Pasadena, 1995, highlights from 25 years.
Subject I
Dependable Real-Time Systems
In this subject, five self-study topics are offered. These topics are
related to a project supported by FRD.
Diagram 1 gives an example of a dependable system with five UNIX machines
where three of them form a reliable hard-core.
Diagram 2 gives more details of the hard-core.
Introduction
A real-time system is one whose specification, design and
implementation must satisfy both functional and timing constraints.
This implies that system correctness depends not only on the logical
correctness of its actions but also on their timeliness.
Although it is commonly believed that meeting the timing
requirements is a matter of increasing system speed/throughput,
research in real-time systems has discredited this notion. In fact,
the computational structures appropriate for systems requiring bounded
response time are fundamentally different from those requiring high
throughput. The progress in hardware technology in recent years has
made high-performance computing and communication feasible. However,
it has become clear that high-speed execution alone may not solve all
the problems that real-time computing needs to address.
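As a small illustration of the difference between high throughput and bounded response time, the Python sketch below compares two sets of response times. The measurements, the 10 ms deadline, and the function name are all invented for illustration and do not come from any real system.

```python
from statistics import mean


def meets_deadline(response_times_ms, deadline_ms):
    """A hard timing requirement holds only if every response is on time."""
    return max(response_times_ms) <= deadline_ms


# Hypothetical measurements (illustrative numbers only).
fast_on_average = [2, 2, 2, 2, 30]  # better mean, but one late response
predictable = [9, 9, 9, 9, 9]       # worse mean, but bounded response time

DEADLINE_MS = 10  # assumed deadline

for name, times in [("fast on average", fast_on_average),
                    ("predictable", predictable)]:
    print(name, "mean:", mean(times), "worst:", max(times),
          "meets deadline:", meets_deadline(times, DEADLINE_MS))
```

The first system is faster on average yet misses the deadline once; the second is slower on average but never late. Only the second satisfies the timing constraint.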
Real-time systems are widely applied in safety-critical areas,
such as nuclear power plants, aerospace systems, industrial automation,
telecommunication, banking, and traffic control systems, where the
consequence of a computer failure can be a significant economic loss or
even loss of human life. High dependability is therefore a fundamental
requirement of real-time system design.
Fault-tolerant computing, which ensures that a system functions
correctly even if a limited number of components are faulty, is a major
technique for increasing system dependability. The challenge is to meet
both the functional and the timing constraints of a system even when
some of its components are faulty.
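One well-known way to tolerate a limited number of faulty components is to run replicated components and vote on their outputs. The Python sketch below shows a simple majority voter. It is only an illustration under the assumption of three replicas producing comparable values; it is not necessarily the technique used in the project mentioned above, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of the replicas.

    Raises ValueError when no value has a strict majority, i.e. when
    more replicas disagree than the scheme can tolerate.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority: too many replicas disagree")
    return value


# Three replicas, one of which delivers a wrong result.
print(majority_vote([42, 42, 7]))
```

With three replicas the voter masks one faulty output. Note that the sketch says nothing about timing: in a real-time system the vote itself must also complete within a bounded time.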
Dependable real-time computing is a broad area of research.
In the self-study topics you are going to concentrate on the following
points (each point is considered a separate self-study topic):
1 Formal methods for specification and implementation of real-time systems
The problem in this area which you are expected to address is:
how to specify and ensure the timing of real-time systems. Programming
languages, compilers, and tools which support specification and
implementation of real-time systems shall be studied.
References
[1] J. Vytopil (ed), Formal Techniques in Real-Time and
Fault-Tolerant Systems, pp.
[2] B. Dasarathy, et al, Timing constraints of real-time systems. IEEE
Trans. Soft. Eng., SE-11, Jan, 1985, pp.80-86
[3] R. Gerber, et al, Compiler support for real-time programs, in
S.H. Son (ed) Advances in real-time systems, Prentice Hall, 1995.
[4] K.M. Kavi, et al., Specification and analysis of real-time systems
using CSP and Petri nets, in R. Mittal et al (ed), Fault-tolerant
system and software, Narosa, 1996, pp. 141 - 147.
[5] M. Heisel, et al., Formal Specification of Safety-Critical
Software with Z and CSP, in Proc. 15th International Conference
on Computer Safety, Reliability and Security, Vienna, Oct. 1996,
pp. 31 - 45.
2 Fault-tolerant system architecture with bounded fault-handling
time
The problem in this area which you are expected to address is:
to study fault-tolerance techniques which can be applied in real-time
systems, and to examine some existing fault-tolerant real-time systems.
References
[1] H. Kopetz, The time-triggered approach to real-time system design,
in B. Randell, J.-C. Laprie, et al (ed),
Predictably dependable computing systems, Springer, 1995.
[2] H.-P. Meske, et al., A processor Architecture Designed to
Facilitate the Safety Certification of Hard Real Time Systems,
in Proc. 15th International Conference on Computer Safety,
Reliability and Security, Vienna, Oct. 1996, pp. 31 - 45.
[3] Y. Chen, et al, Implementing Fault-Tolerance via Modular
Redundancy with Comparison, IEEE Trans. on Reliability, Vol.39,
No.2, June 1990, pp. 217-225.
3 Operating systems for predictable operations in a complex and
unpredictable environment with multiprocessors and
possible processor faults.
The problem in this area which you are expected to address is:
to examine some existing real-time operating systems, together with
the scheduling and resource management algorithms which ensure that
timing requirements are met, possibly also considering the case
of component faults.
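As one concrete example of a scheduling analysis, the Python sketch below applies the classical Liu and Layland utilization bound for rate-monotonic scheduling. It assumes independent periodic tasks on a single processor with deadlines equal to periods, which is much simpler than the multiprocessor and fault cases named above. The task parameters and function names are hypothetical.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound n * (2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


def passes_rm_test(tasks):
    """tasks: list of (worst_case_execution_time, period) pairs.

    Returns True when total utilization is within the bound. The test is
    sufficient but not necessary: a task set that fails it may still be
    schedulable and would need an exact analysis.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


light_load = [(1, 4), (1, 5), (2, 10)]  # utilization 0.65
heavy_load = [(2, 4), (2, 5), (2, 10)]  # utilization 1.10

print(passes_rm_test(light_load))
print(passes_rm_test(heavy_load))
```

The first task set passes the test, so all deadlines are guaranteed under the stated assumptions. The second has a utilization above 1 and cannot be scheduled on one processor by any algorithm.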
References
[1] J. Stankovic, et al, A reflective architecture for real-time
operating systems, in S.H. Son (ed) Advances in real-time
systems, Prentice Hall, 1995.
[2] K. Shin, A software overview of HARTS: A distributed real-time
system, in S.H. Son (ed) Advances in real-time
systems, Prentice Hall, 1995.
[3] N. Audsley, Real-time system scheduling,
in B. Randell, J.-C. Laprie, et al (ed),
Predictably dependable computing systems, Springer, 1995.
[4] J. Lehoczky, Scheduling periodic and Aperiodic tasks using slack
stealing algorithm, in
S.H. Son (ed) Advances in real-time systems, Prentice Hall, 1995.
4 Real-time communication
The problem in this area which you are expected to address is:
to study real-time communication mechanisms which support real-time
traffic in satisfying timing constraints of individual messages,
under the condition that some communication links may be broken.
References
[1] S. Rangarajan, A fault-tolerant protocol for location directory
maintenance in mobile networks, in Proc. 25th International
Symposium on Fault-Tolerant Computing, Pasadena, 1995, pp.164-173.
[2] D. Ferrari, A new admission control method for real-time
communication in an internet, in
S.H. Son (ed) Advances in real-time systems, Prentice
Hall, 1995.
[3] M. Hamdaoui, et al., Selection of Timed Token Protocol Parameters
to Guarantee Message Deadline, IEEE/ACM Trans. on Networking,
Vol.3, No.3, June 1995, pp. 340 - 351.
5 Industry applications
The problem in this area which you are expected to address is:
to review dependable real-time computer systems used in industry,
and to analyse the industry requirements and how these requirements
are met.
References
[1] R. Eriksen, et al., Reliability and vulnerability assessment as
decision support during purchase and design of complex, technical
systems, in Proc. 15th International Conference on
Computer Safety, Reliability and Security, Vienna, Oct. 1996,
pp. 207 - 218.
[2] H. Kantz, Ch. Koza, The Electra railway signalling system, in Proc.
25th International Symposium on Fault-Tolerant Computing,
Pasadena, 1995, pp.453 - 463.
[3] Safety Analysis and Evaluation of an Air Traffic Control
Computing System, in Proc. 15th International Conference
on Computer Safety, Reliability and Security, Vienna, Oct. 1996,
pp. 219 - 229.
Subject II
Software Dependability
With increasing software complexity in computer systems, software
dependability is becoming an increasing concern for the dependability
of the overall system. One way of increasing software dependability is
to test the software thoroughly (software testing) and to make sure that
its dependability achieves the required standard (dependability
evaluation). In this subject, two self-study topics are offered.
6 Software Dependability Model
The problem in this area which you are expected to address is:
to discuss the different software dependability models which are used
to estimate software dependability, and the relationships and
differences among these models.
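To give a feel for what such a model computes, the Python sketch below uses one very simple estimate. It assumes that test inputs are drawn independently from the operational profile and that all n tests pass. Under that assumption, the per-run failure probability theta is at most 1 - (1 - C)^(1/n) with confidence C. This is an illustrative baseline only, not one of the models in the references, and the function name is hypothetical.

```python
def failure_probability_upper_bound(n_tests, confidence):
    """Upper bound on the per-run failure probability theta.

    If theta exceeded this bound, the chance of observing n_tests
    consecutive failure-free runs, (1 - theta) ** n_tests, would be
    smaller than 1 - confidence.
    """
    return 1 - (1 - confidence) ** (1 / n_tests)


# Bound after increasing numbers of failure-free tests, at 99% confidence.
for n in (100, 1000, 10000):
    print(n, failure_probability_upper_bound(n, 0.99))
```

The bound shrinks only roughly in proportion to 1/n, which is one reason why very high dependability is hard to demonstrate by testing alone.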
References
[1] C. V. Ramamoorthy, et al, Software reliability - status and
perspectives, IEEE Trans. Soft. Eng., SE-8, No. 4, July 1982,
pp. 354 - 371.
[2] D. Hamlet, Connecting test coverage to software reliability,
in Proc. 5th International Symp. on Software Reliability Engineering,
Monterey, Nov. 1994, pp. 158 - 165.
[3] Y. Chen, et al, Modelling Software Dependability Growth under
Input Partition Testing, in Proc. 15th International Conference
on Computer Safety, Reliability and Security, Vienna, Oct. 1996,
pp. 183 - 192.
[4] W.J. Gutjahr, et al., Failure risk estimation via Markov Software
Usage Models, in Proc. 15th International Conference on
Computer Safety, Reliability and Security, Vienna, Oct. 1996,
pp. 136 - 145.
7 Comparing testing strategies
The problem in this area which you are expected to address is:
to study different software testing strategies, especially random
and partition testing, and to discuss the relationships and differences
among these strategies.
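A common way to compare the two strategies is by the probability of detecting at least one failure. The Python sketch below uses the standard formulas: 1 - (1 - theta)^n for random testing, and 1 - prod (1 - theta_i)^(n_i) for partition testing. It assumes two equally likely subdomains and invented failure rates; all names and numbers are illustrative.

```python
from math import prod


def p_random(theta, n):
    """Probability that n random tests reveal at least one failure."""
    return 1 - (1 - theta) ** n


def p_partition(thetas, ns):
    """Same probability when ns[i] tests are drawn from subdomain i,
    whose failure rate is thetas[i]."""
    return 1 - prod((1 - t) ** k for t, k in zip(thetas, ns))


# Assumed scenario: two equally likely subdomains, failures only in one.
subdomain_rates = [0.0, 0.02]
overall_rate = sum(subdomain_rates) / len(subdomain_rates)  # 0.01

print("random:   ", p_random(overall_rate, 100))
print("partition:", p_partition(subdomain_rates, [50, 50]))
```

In this scenario, with tests allocated in proportion to subdomain size, partition testing is slightly better than random testing. Other allocations or failure-rate patterns can reverse the comparison, which is the kind of question this topic asks you to study.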
References
[1] T. Y. Chen, et al, On the relationship between partition and random
testing, IEEE Trans. Soft. Eng., SE-20, No. 12, Dec. 1994,
pp. 977 - 980.
[2] J. W. Duran, et al, An evaluation of random testing, IEEE Trans.
Soft. Eng., SE-10, July 1984, pp. 438 - 444.
[3] Y. Chen, et al, Comparing Software Testing Strategies Using
Reliability Growth, in R. Mittal et al (ed), Fault-tolerant system
and software, Narosa, 1996.
[4] D. Hamlet, R. Taylor, Partition testing does not inspire confidence,
IEEE Trans. Soft. Eng., SE-16, No. 12, Dec. 1990, pp. 1402 - 1411.