An ALARP Stop-Test Decision for the Worst Case Timing Characteristics of Safety-Critical Systems


Licentiate presentation

Start time:

2016-09-19 13:00

End time:

2016-09-19 00:00


Room Gamma (MDH)

Contact person:


Examining committee:
- Professor Karl Meinke, KTH (Faculty examiner/Opponent)
- Professor Björn Lisper, MDH
- Professor Thomas Nolte, MDH
- Associate Professor Cristina Seceleanu, MDH (Reserve)

Profs: Iain Bate, Hans Hansson & Sasikumar Punnekkat (main advisor, MDH)

Abstract:  Safety-critical systems are those in which failure can lead to loss of people’s lives, or catastrophic damage to the environment. Timeliness is an important requirement in safety-critical systems, which relates to the notion of response time, i.e., the time a system takes to respond to stimuli from the environment. If the response time exceeds a specified time interval, a catastrophe might occur.
Stringent timing requirements make testing a necessary and important process, in which not only the functional correctness of the system but also its timing behaviour must be verified. However, a key issue for testers is deciding when to stop testing: stopping too early may leave defects in the system, or even lead to a catastrophe if the undiscovered defects are of high severity, whereas stopping too late wastes time and resources. To date, researchers and practitioners have mainly focused on the design and application of diverse testing strategies, leaving the critical stop-test decision a largely open issue, especially with respect to timeliness.

In the first part of this thesis, we propose a novel approach to making a stop-test decision in the context of testing the worst-case timing characteristics of systems. More specifically, we propose a convergence algorithm that informs the tester whether further testing would reveal significant new insight into the timing behaviour of the system and, if not, suggests that testing be stopped. The convergence algorithm looks at the response times observed during testing and examines whether the Maximum Observed Response Time (MORT) has recently increased. When this is no longer the case, it investigates whether the distribution of response times has changed significantly. When no significant new information about the system is revealed during a given period of time, it is concluded, with some statistical confidence, that more testing of the same nature is not going to be useful. However, other testing techniques may still produce significant new findings.
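The two checks described above can be illustrated with a minimal Python sketch. It is not the algorithm from the thesis: the abstract does not name the statistical test, so a two-sample Kolmogorov-Smirnov statistic is used here as an assumption, and the names `should_stop`, `window` and `ks_threshold` and their default values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of samples a and b (assumed test, not from the thesis)."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def should_stop(response_times, window=100, ks_threshold=0.2):
    """Return True when testing appears to have converged.

    window and ks_threshold are hypothetical tuning parameters.
    """
    if len(response_times) < 2 * window:
        return False  # not enough observations to compare two windows
    earlier = response_times[:-window]
    recent = response_times[-window:]
    # Check 1: has the Maximum Observed Response Time (MORT) recently increased?
    if max(recent) > max(earlier):
        return False
    # Check 2: has the distribution of response times changed significantly?
    previous = response_times[-2 * window:-window]
    return ks_statistic(previous, recent) < ks_threshold
```

In this sketch a fixed threshold on the statistic stands in for a proper significance test; a real implementation would derive the threshold from the chosen confidence level.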
Furthermore, the convergence algorithm is evaluated based on the As Low As Reasonably Practicable (ALARP) principle, which is an underpinning concept in most safety standards. ALARP involves weighing benefit against the associated cost. The evaluation shows that the sacrifice, here testing time, would be grossly disproportionate compared to the benefit attained, which in this context is any further significant increase in the MORT after stopping the test.
Our algorithm includes a set of tunable parameters. The second part of this work aims to improve the algorithm's performance and scalability by (i) determining whether the parameters affect the algorithm, and (ii) identifying and tuning the most influential parameters. This process is based on the Design of Experiments (DoE) approach.
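As an illustration of how a DoE screening step can rank parameters by influence, the following Python sketch computes main effects in a two-level full factorial design. The abstract does not say which design was used, so the full factorial design is an assumption, and the function name `main_effects`, the factor names and the toy response are hypothetical.

```python
from itertools import product


def main_effects(factors, response):
    """Main effect of each factor in a two-level full factorial design.

    factors:  dict mapping a factor name to its (low, high) settings.
    response: callable taking a dict of settings and returning a number.
    """
    names = list(factors)
    runs = []
    # Run the experiment once for every combination of low/high levels.
    for levels in product((0, 1), repeat=len(names)):
        settings = {n: factors[n][lv] for n, lv in zip(names, levels)}
        runs.append((levels, response(settings)))
    effects = {}
    for i, name in enumerate(names):
        high = [y for lv, y in runs if lv[i] == 1]
        low = [y for lv, y in runs if lv[i] == 0]
        # Main effect: mean response at the high level minus at the low level.
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


# Toy response standing in for a measured performance metric (hypothetical).
effects = main_effects(
    {"window": (50, 100), "threshold": (0.1, 0.3)},
    lambda s: 0.5 * s["window"] + 10 * s["threshold"],
)
```

Factors with the largest absolute main effect would be the first candidates for tuning.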

Moreover, the algorithm is required to be robust, i.e., to provide valid stop-test decisions across a required range of task sets. For example, if the number of tasks in the system varies from 10 to 50 and the task periods change from the range [200 µs, 400 µs] to the range [200 µs, 1000 µs], the algorithm's performance should not be adversely affected. To achieve robustness, the task set parameters that influence the algorithm's performance the most are first identified using the Analysis of Variance (ANOVA) approach. Secondly, it is examined whether the algorithm is sound over the required ranges of those parameters and, if not, the situations in which the algorithm's performance degrades significantly are identified. In our future work we will use these situations to stress test the algorithm and to tune it so that it becomes robust across the required ranges.
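To make the ANOVA step concrete, the sketch below computes the one-way ANOVA F statistic for a single task set parameter. It only illustrates the general technique: the abstract does not state the ANOVA model used, and the function name `one_way_anova_f` and the sample values are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    groups: list of lists, one list of observations per level of the
    parameter under study (e.g. one list per number of tasks).
    Assumes at least two groups and non-zero within-group variance.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the parameter (between groups).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation (within groups).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical performance scores measured at three levels of one parameter.
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A larger F value indicates that the parameter explains more of the variation in performance relative to noise; comparing it against the F distribution gives the significance level.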

Our experimental evaluation has so far shown that the convergence algorithm provides good results for most of the considered scenarios. However, in some experiments the algorithm made a premature stop-test decision, i.e., it suggested stopping even though further testing was likely to reveal new interesting information. To improve the situation, we investigate whether the statistical test used in the algorithm has an adverse effect on it and whether another test could reduce the premature stop-test decisions. The evidence collected indicates that the number of premature stop-test decisions can be reduced (or even eliminated), at the cost of increased testing effort.

Mahnaz Malekzadeh