Fatal defects in the control system of the Ariane-5 rocket and the Mars Pathfinder have led to headlines in newspapers all over the world. To detect such flaws in an early stage of the design, the European Space Agency intends to develop techniques that assist design engineers in software tools that help the engineers to find bugs in a fully automated manner.
In this project a model-based approach to system-software co-engineering is taken, specifically tailored to critical on-board systems for the space domain. For this, the consortium develops an integrated toolset based on a newly developed modelling language.
A System-Level Integrated Modeling (SLIM) language for modeling and specifying hardware/software systems, covering the following aspects:
- (timed) hardware operation, specified on the level of processors, storages, buses, sensors, and actuators
- (timed) software operation, supporting concepts such as processes, threads, methods and/or procedures, and instructions
- hybridity, i.e., continuous, real-valued variables with (linear) time-dependent dynamics which are used to model the interaction between the physical and the hardware/software system
- faults, covering aspects such as
- hardware faults (with probabilistic failure rates) and software faults,
- error propagation, i.e., turning fault occurrences into failure events,
- intermittent failures and steady-state failures,
- fault treatment and failure recovery measures, and
- mapping failures from architectural level to service level, introducing degraded modes of operation.
Along with this language is the development of a unified basic formalism which provides the semantic models for SLIM specifications, along with a formal and semantically correct translation of the latter into the former. Techniques for the automated integration of different kinds of models, such as hardware, software, and fault models will be developed.
An example of a battery model in SLIM:
device type Battery features empty: out event port; voltage: out data port real initially 6.0; end Battery; device implementation Battery.Imp subcomponents energy: data continuous initially 100.0; modes charged: activation mode while energy'=-0.02 and energy>=20.0; depleted: mode while energy'=-0.03; transitions charged -[then voltage:=energy/50.0+4.0]-> charged; charged -[empty when energy<=20.0]-> depleted; depleted -[then voltage:=energy/50.0+4.0]-> depleted; end Battery.Imp;
Extensive experimental studies in the field of automated verification have considered more than 500 case studies and have established that only seven patterns (or categories) of requirements suffice to cover about 80% of all occurring requirements. In this project, we anticipate the use of these patterns to describe desired system-level aspects, while keeping into account the interplay between functional properties, performability and dependability aspects
Example of patterns:
- The probability that proposition Phi will eventually hold within timebound [Time1,Time2].
The probability that proposition battery is depleted will eventually hold within [0,10].
- The probability that proposition Phi1 precedes proposition Phi2 within timebound [Time1,Time2].
The probability that proposition first sensor glitch precedes proposition second sensor glitch within timebound [10,40].
Analyzing operational correctness is the first step to be performed during the system development lifecycle. It consists in verifying that the system will operate correctly with respect to a set of functional requirements, under the hypothesis of nominal conditions, that is, when software and hardware components are assumed to be fault-free. A preliminary but related step, consists in analyzing the set of provided functional requirements, to guarantee that they are consistent and that they adequately and fully describe the desired system behavior. Both issues will be addressed using state-of-the-art model checking techniques.
Key techniques: model checking, model simulation.
Safety and Dependability Analysis
Analyzing system safety is a fundamental step that is performed in parallel with system design and verification of functional correctness. The goal of safety analysis is to investigate the behavior of a system in degraded conditions (that is, when some parts of the system are not working properly, due to malfunctions) and ensure that the system meets the safety requirements that are required for its deployment and use.
Key techniques: (dynamic) fault tree generation, (dynamic) FMEA table generation, fault tolerance evaluation, criticality analysis.
To guarantee the required system performance, integrated hardware and software models will be evaluated. In line with the approach for the functional correctness, model checking techniques will be employed for assessing performance requirements.
Key techniques: performability evaluation, (dynamic) fault tree evaluation and verification.
Fault Detection, Identification and Recovery Effectiveness Analysis
The models will include a formal description of both the fault detection and identification sub-systems, and the recovery actions to be taken. Based on these models, we will provide tool facilities to analyze the operational effectiveness of the (probabilistic) FDIR measures, and to investigate the observational requirements that make the system diagnosable.
Key techniques: (probabilistic) fault detection, identification and recovery analysis, synthesis of minimum set of observables.
In order to ensure the quality of specifications, functional requirements will be preliminarily validated. Validation of requirements includes both property simulation, allowing the designer to explore the behaviors associated with the requirements, and property assurance, allowing the designer to explore the strictness and adequacy of the requirements. Expected benefits of this approach include traceability of the requirements and easier sharing between different actors involved in system design and safety assessment. Furthermore, high- quality requirements facilitate incremental system development and assessment, reuse and design change, and they can be useful for product certification.
Key techniques: property consistency check, property witness, property counterexample, property possibility check.
The activities described in the previous sections are supported by an integrated platform for system-software co-engineering, which will incorporate existing tools (or extensions thereof ) in a uniform environment. An overview of this toolset is shown below. The tool set takes as input a model written in the high-level SLIM language (integrated with fault and FDIR models, as required – the integration step is not shown for conciseness), and a set of requirements, written as specification patterns. Appropriate translators take care of translating the models into the low-level languages supported by the verification tools, and of translating the requirements in a suitable fragment of temporal logic. The above figure summarizes the different analyses provided by the tool set, together with the corresponding outputs and the tools responsible for performing them. The pair of dashed arrows indicates a flow of information between the tools used to perform safety analysis and the tools for performability analysis; this link is used, for instance, to facilitate probabilistic analysis of dynamic fault trees.
The toolset can be downloaded from Tools & Download.
A performance analysis of the toolset is described in CAV2010 Performance Analysis. Further information about the toolset can be found in our publications.
Industrial Case Studies
The need for case studies is mainly imposed by the evaluation and validation of the proposed approach and tool set. We have two case studies representing the development of different space systems. One is based on a TAS-F telecommunication platform and the other one on the inter-satellite link (ISL) in case of formation flying.