Self-Study Topics on Dependable Computing, 1997

Dependability is the property of a system such that reliance can justifiably 
be placed on the service it delivers. Dependable computing covers a 
wide range of subjects. This year I am offering 7 self-study topics 
in two dependable computing subjects: Dependable Real-Time Systems 
and Software Dependability.

General Reference on basic concepts and terminology:

J.-C. Laprie, Dependability: Basic Concepts and Terminology. 
This publication can be found: 
(1) in book: J.-C. Laprie (Ed.), Dependability: Basic Concepts and 
    Terminology, Springer-Verlag, Vienna, 1992.
(2) in Proc. 15th International Symposium on Fault-Tolerant Computing, 
    1985, (FTCS-15)
(3) Special edition of FTCS-25, Pasadena, 1995, highlights from 25 years.


Subject I

Dependable Real-Time Systems

-- In this subject, 5 self-study topics are offered. These topics are 
related to a project supported by FRD.
Diagram 1 gives an example of a dependable system with five UNIX machines 
where three of them form a reliable hard-core. 
Diagram 2 gives more details of the hard-core.

Introduction
        A real-time system is one whose basic specification, design 
and implementation must meet the functionality and the timing 
constraints. This implies that system correctness depends not only on the 
logical correctness but also on the timeliness of its actions.
        Although it is commonly believed that meeting the timing 
requirements is a matter of increasing system speed/throughput, 
research in real-time systems has discredited this notion. In fact, 
the computational structures appropriate for systems requiring bounded 
response time are fundamentally different from those requiring high 
throughput. The progress in hardware technology in recent years has 
made high-performance computing and communication feasible. However, 
it became clear that high-speed execution alone may not solve all the 
problems that real-time computing needs to address.
        Real-time systems are widely applied in safety-critical areas, 
such as nuclear power plants, aerospace systems, industrial automation, 
telecommunication, banking, and traffic control systems, where the 
consequence of a computer failure may be a significant economic impact or 
even loss of human life. High dependability is therefore a fundamental 
requirement of real-time system design. 
        Fault-tolerant computing, a major technique for increasing system 
dependability, ensures that a system functions correctly even if a limited 
number of components are faulty. The challenge is to meet both the 
functionality and the timing constraints of a system even when some of 
its components are faulty.
        Dependable real-time computing is a broad area of research. 
In the self-study topics you are going to concentrate on the following 
points (each point is considered as a separate self-study topic):

1       Formal methods for specification and implementation of real-time systems
        The problem in this area which you are expected to address is:

        how to specify and ensure the timing of real-time systems. Programming
        languages, compilers, and tools which support specification and 
        implementation of real-time systems shall be studied.

        References
        [1] J. Vytopil (ed), Formal Techniques in Real-Time and 
            Fault-Tolerant Systems, pp.
        [2] B. Dasarathy, et al, Timing constraints of real-time systems. IEEE
            Trans. Soft. Eng., SE-11, Jan, 1985, pp.80-86
        [3] R. Gerber, et al, Compiler support for real-time programs, in 
            S.H. Son (ed) Advances in real-time systems, Prentice Hall, 1995.
        [4] K.M. Kavi, et al., Specification and analysis of real-time systems 
            using CSP and Petri nets, in R. Mittal et al (ed), Fault-tolerant 
            system and software, Narosa, 1996, pp. 141 - 147.
        [5] M. Heisel, et al., Formal Specification of Safety-Critical 
            Software with Z and CSP, in Proc. 15th International Conference 
            on Computer Safety, Reliability and Security, Vienna, Oct. 1996, 
            pp. 31 - 45.

2       Fault-tolerant system architecture with bounded fault-handling 
        time.
        The problem in this area which you are expected to address is:

        to study the fault-tolerant techniques which can be applied in 
        real-time systems and some existing fault-tolerant real-time systems.

        References
        [1] H. Kopetz, The time-triggered approach to real-time system design,
            in B. Randell, J.-C. Laprie, et al (ed), 
            Predictably dependable computing systems, Springer, 1995.
        [2] H.-P. Meske, et al., A Processor Architecture Designed to 
            Facilitate the Safety Certification of Hard Real Time Systems, 
            in Proc. 15th International Conference on Computer Safety, 
            Reliability and Security, Vienna, Oct. 1996, pp. 31 - 45.
        [3] Y. Chen, et al, Implementing Fault-Tolerance via Modular 
            Redundancy with Comparison, IEEE Trans. on Reliability, Vol.39, 
            No.2, June 1990, pp. 217-225.
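
        As a starting point for topic 2, the masking idea behind modular 
        redundancy can be sketched in a few lines of Python. This is a 
        minimal illustration only, not the method of any reference above; 
        the function names majority_vote and faulty_replicas are made up 
        for this handout, and replica outputs are assumed to be directly 
        comparable values.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of replicas.

    Returns None when no value has a strict majority, i.e. the fault
    cannot be masked and must be handled some other way.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def faulty_replicas(outputs: Sequence[Hashable], voted: Hashable) -> List[int]:
    """Return the indices of replicas whose output disagrees with the vote."""
    return [i for i, out in enumerate(outputs) if out != voted]


# Three replicas, one of which delivers a wrong result.
outputs = [42, 42, 7]
voted = majority_vote(outputs)             # 42: the single fault is masked
suspects = faulty_replicas(outputs, voted)  # [2]: replica 2 disagrees
```

        For a fixed number of replicas the voter does a fixed amount of 
        work, which is one reason masking redundancy is attractive when 
        the fault-handling time must be bounded.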

3       Operating systems for predictable operations in a complex and 
        unpredictable environment with multiprocessors and 
        possible processor faults. 
        The problem in this area which you are expected to address is:

        to examine some existing real-time operating systems, together 
        with scheduling and resource management algorithms which ensure 
        that timing requirements are met, possibly also considering the 
        case of component faults.

        References
        [1] J. Stankovic, et al, A reflective architecture for real-time 
            operating systems, in S.H. Son (ed) Advances in real-time 
            systems, Prentice Hall, 1995.
        [2] K. Shin, A software overview of HARTS: A distributed real-time 
            system, in S.H. Son (ed) Advances in real-time systems, 
            Prentice Hall, 1995.
        [3] N. Audsley, Real-time system scheduling,
            in B. Randell, J.-C. Laprie, et al (ed), 
            Predictably dependable computing systems, Springer, 1995.
        [4] J. Lehoczky, Scheduling periodic and aperiodic tasks using slack
            stealing algorithm, in
            S.H. Son (ed) Advances in real-time systems, Prentice Hall, 1995.
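
        To give a feel for the kind of schedulability argument topic 3 
        deals with, here is a small Python sketch of the classic 
        utilisation-bound test for rate-monotonic scheduling (Liu and 
        Layland). It is a widely taught result and is not taken from the 
        references listed above. The sketch assumes independent periodic 
        tasks on one processor, preemptive fixed priorities, and deadlines 
        equal to periods; the function names are made up for this handout. 
        The test is sufficient but not necessary: a task set that fails it 
        may still be schedulable.

```python
from typing import List, Tuple

# A task is (worst-case execution time, period), in the same time unit.
Task = Tuple[float, float]


def utilization(tasks: List[Task]) -> float:
    """Total processor utilisation U = sum of C_i / T_i."""
    return sum(c / t for c, t in tasks)


def rm_bound(n: int) -> float:
    """Liu-Layland bound n * (2^(1/n) - 1) for n tasks (n >= 1)."""
    return n * (2 ** (1 / n) - 1)


def passes_rm_test(tasks: List[Task]) -> bool:
    """True if the task set passes the sufficient utilisation test."""
    if not tasks:
        return True
    return utilization(tasks) <= rm_bound(len(tasks))


light = [(1, 4), (1, 5), (2, 10)]   # U = 0.65, below the bound for n = 3
heavy = [(2, 4), (2, 5), (2, 10)]   # U = 1.10, cannot pass any such test
light_ok = passes_rm_test(light)    # True
heavy_ok = passes_rm_test(heavy)    # False
```

        Note that the test says nothing about processor faults; extending 
        such guarantees to the faulty case is part of the problem posed 
        in this topic.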

4       Real-time communication
        The problem in this area which you are expected to address is:

        to study real-time communication mechanisms which support real-time 
        traffic in satisfying timing constraints of individual messages, 
        under the condition that some communication links may be broken.

        References
        [1] S. Rangarajan, A fault-tolerant protocol for location directory 
            maintenance in mobile networks, in Proc. 25th International 
            Symposium on Fault-Tolerant Computing, Pasadena, 1995, pp.164-173.
        [2] D. Ferrari, A new admission control method for real-time 
            communication in an internet, in
            S.H. Son (ed) Advances in real-time systems, Prentice 
            Hall, 1995.
        [3] M. Hamdaoui, et al., Selection of Timed Token Protocol Parameters 
            to Guarantee Message Deadline, IEEE/ACM Trans. on Networking, 
            Vol.3, No.3, June 1995, pp. 340 - 351.

5       Industry applications 
        The problem in this area which you are expected to address is:

        to review dependable real-time computer systems used in industry; 
        to analyse the industry requirements, and how these requirements are
        met.

        References
        [1] R. Eriksen, et al., Reliability and vulnerability assessment as
            decision support during purchase and design of complex, technical 
            systems, in Proc. 15th International Conference on 
            Computer Safety, Reliability and Security, Vienna, Oct. 1996, 
            pp. 207 - 218.
        [2] H. Kantz, Ch. Koza, The Electra railway signalling-system, in Proc. 
            25th International Symposium on Fault-Tolerant Computing, 
            Pasadena, 1995, pp.453 - 463.
        [3] Safety Analysis and Evaluation of an Air Traffic Control 
            Computing System, in Proc. 15th International Conference 
            on Computer Safety, Reliability and Security, Vienna, Oct. 1996, 
            pp. 219 - 229.


Subject II

Software Dependability
With growing software complexity in computer systems, software 
dependability is becoming an increasing concern for overall system 
dependability. The importance of achieving high dependability in software 
is obvious. One way of increasing software dependability is to test the 
software thoroughly (software testing) and to make sure that its 
dependability achieves the given standard (dependability evaluation). 
In this subject, 2 self-study topics are offered.

6      Software Dependability Model
       The problem in this area which you are expected to address is:

      to discuss different software dependability models which are used to 
      estimate software dependability. Discuss the relation and differences 
      of these models.

      References
      [1] C. V. Ramamoorthy, et al, Software reliability - status and 
          perspectives, IEEE Trans. Soft. Eng., SE-8, No. 4, July 1982, 
          pp. 354 - 371.
      [2] D. Hamlet, Connecting test coverage to software reliability, 
          in Proc. 5th International Symp. on Software Reliability Engineering, 
          Monterey, Nov. 1994, pp. 158 - 165.
      [3] Y. Chen, et al, Modelling Software Dependability Growth under 
          Input Partition Testing, in Proc. 15th International Conference 
          on Computer Safety, Reliability and Security, Vienna, Oct. 1996, 
          pp. 183 - 192.
      [4] W.J. Gutjahr, et al., Failure risk estimation via Markov Software 
          Usage Models, in Proc. 15th International Conference on 
          Computer Safety, Reliability and Security, Vienna, Oct. 1996, 
          pp. 136 - 145.

7      Comparing testing strategies
       The problem in this area which you are expected to address is:

      to study different software testing strategies, especially random 
      and partition testing. Discuss the relation and differences of these 
      strategies.

      References
      [1] T. Y. Chen, et al, On the relationship between partition and random 
          testing, IEEE Trans. Soft. Eng., SE-20, No. 12, Dec. 1994, 
          pp. 977 - 980.
      [2] J. W. Duran, et al, An evaluation of random testing, IEEE Trans. 
          Soft. Eng., SE-10, July 1984, pp. 438 - 444.
      [3] Y. Chen, et al, Comparing Software Testing Strategies Using 
          Reliability Growth, in R. Mittal et al (ed), Fault-tolerant system 
          and software, Narosa, 1996.
      [4] D. Hamlet, R. Taylor, Partition testing does not inspire confidence, 
          IEEE Trans. Soft. Eng., SE-16, No. 12, Dec. 1990, pp. 1402 - 1411.
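
      When comparing the two strategies in topic 7, a common yardstick is 
      the probability that a test run reveals at least one failure. The 
      Python sketch below computes it for both strategies under the usual 
      simplifying assumptions: test cases are drawn independently, random 
      testing follows the operational input distribution, and each 
      subdomain has a fixed failure rate. The function names and the 
      example numbers are made up for this handout and are not results 
      from the references.

```python
from math import prod
from typing import Sequence


def overall_failure_rate(probs: Sequence[float],
                         rates: Sequence[float]) -> float:
    """Failure rate over the whole input domain: sum of p_i * theta_i."""
    return sum(p * theta for p, theta in zip(probs, rates))


def p_random(failure_rate: float, n: int) -> float:
    """P(at least one failure) with n random tests: 1 - (1 - theta)^n."""
    return 1.0 - (1.0 - failure_rate) ** n


def p_partition(rates: Sequence[float], tests: Sequence[int]) -> float:
    """P(at least one failure) with n_i tests drawn from subdomain i:
    1 - product over i of (1 - theta_i)^n_i."""
    return 1.0 - prod((1.0 - theta) ** n for theta, n in zip(rates, tests))


# Hypothetical program: 90% of inputs fall in a fault-free subdomain,
# 10% fall in a subdomain where half of the inputs cause a failure.
probs = [0.9, 0.1]
rates = [0.0, 0.5]

theta = overall_failure_rate(probs, rates)   # 0.05
random_2 = p_random(theta, 2)                # two random tests
partition_2 = p_partition(rates, [1, 1])     # one test per subdomain
```

      In this example partition testing does much better (0.5 against 
      roughly 0.1), but only because the partition happens to isolate the 
      failure-prone inputs. If every subdomain has the same failure rate 
      the two formulas give the same value, which is a useful first case 
      to work through when reading [1] and [4].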