Architectural Aspects of Deriving Performance Guarantees

Welcome to the webpage of the ISCA tutorial on "Architectural Aspects of Deriving Performance Guarantees: Timing Analysis and Timing Predictability".

The tutorial will describe state-of-the-art methods for the derivation of performance guarantees. These methods incorporate models of the execution platform. The effort needed for the analyses and the precision obtained crucially depend on certain properties of the platform architecture. We subsume these under the label Predictability.

Dates

The tutorial will take place in the morning of June 20, 2010. The location will be announced.

Organization

For organizational issues, please refer to the ISCA web page.

Speakers

Prof. Reinhard Wilhelm (Saarland University, Germany)

Research interests: Timing Analysis for Real-Time Systems, Static Program Analysis based on 3-valued logic, vulgo Shape Analysis, Compiler Construction, Algorithm Explanation.

Positions and Functions: Chair for Programming Languages and Compiler Construction at Saarland University, Scientific Director of Schloss Dagstuhl, the Leibniz Center for Informatics, Site Coordinator Saarbruecken in the AVACS Project, Coordinator of the Predator Project, Member of the Strategic Management Board for the Artist2 and ArtistDesign Networks of Excellence, Associate of AbsInt Angewandte Informatik GmbH, Member of the ACM SIGBED Executive Committee, Member of the Steering Committee of the International Conference on Embedded Software EMSOFT, Member at Large of the Steering Committee of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems LCTES, Coorganizer of ARTIST workshop Reconciling Predictability with Performance (RePP), Member of the Scientific Advisory Board of CWI, Member of the Program Committees of SCOPES 2009, LCTES 2009, MEMOCODE 2009, RTSS 2008.

Course: Timing Analysis and Timing Predictability: extension to multi-processor systems

Abstract: Hard real-time systems are subject to stringent timing constraints which are dictated by the surrounding physical environment. A schedulability analysis has to be performed in order to guarantee that all timing constraints will be met ("timing validation"). Existing techniques for schedulability analysis require upper bounds for the execution times of all the system's tasks to be known. These upper bounds are commonly called worst-case execution times (WCETs). The WCET-determination problem has become non-trivial due to the advent of processor features such as caches, pipelines, and all kinds of speculation, which make the execution time of an individual instruction locally unpredictable. Such execution times may vary between a few cycles and several hundred cycles.

A combination of Abstract Interpretation (AI) with Integer Linear Programming (ILP) has been successfully used to determine precise upper bounds on the execution times of real-time programs. The task solved by abstract interpretation is to compute invariants about the processor's execution states at all program points. These invariants describe the contents of caches, of the pipeline, of prediction units etc. They allow to verify local safety properties, safety properties who correspond to the absence of "timing accidents". Timing accidents, e.g. cache misses, pipeline stalls are reasons for the increase of the execution time of an individual instruction in an execution state.

The technology and tools have been used in the certification of several time-critical subsystems of the Airbus A380. The AbsInt tool, aiT, is the only tool worldwide, validated for these avionics applications.

Prof. Lothar Thiele (ETH Zurich, Switzerland)

Research interests: Models, methods and software tools for the design of embedded systems, embedded software and bioinspired optimization techniques

Short CV: After completing his Habilitation thesis from the Institute of Network Theory and Circuit Design of the Technical University Munich, Lothar Thiele joined the Information Systems Laboratory at Stanford University in 1987. In 1988, he took up the chair of microelectronics at the Faculty of Engineering, University of Saarland, Saarbrucken, Germany. He joined ETH Zurich, Switzerland, as a Professor of Computer Engineering, in 1994.

In 1986 he received the "Dissertation Award" of the Technical University of Munich, in 1987, the "Outstanding Young Author Award" of the IEEE Circuits and Systems Society, in 1988, the Browder J. Thompson Memorial Award of the IEEE, and in 2000-2001, the "IBM Faculty Partnership Award". In 2004, he joined the German Academy of Natural Scientists Leopoldina. In 2005, he was the recipient of the Honorary Blaise Pascal Chair of University Leiden, The Netherlands.

Course: Timing analysis of parallel and distributed embedded systems

Abstract: Often, the constraints imposed by a particular application domain require a distributed implementation of embedded systems. In this case, a number of software or hardware components communicate via some interconnection network. We find this structure on various levels of granularity, starting from multiprocessor systems on a chip via distributed embedded control units (ECU) in automotive applications and ending at large-scale sensor networks.

For example, architectural concepts of heterogeneity, distributivity and parallelism can be observed on single hardware components themselves, as they are often implemented as so-called systems-on-chip (SoC) or multiprocessor-systems-on-a-chip (MPSoC). In these components, a collection of memories and heterogeneous computing resources are implemented on a single device, and communicate using networks-on-chip (NoC) that can be regarded as dedicated interconnection networks involving adapted protocols, bridges or gateways.

Embedded systems are typically reactive systems that are in continuous interaction with their physical environment to which they are connected through sensors an actuators. Examples are applications in multimedia processing, automatic control, automotive and avionics, and industrial automation. Therefore, many embedded systems must meet real-time constraints, i. e. they must react to stimuli within a time interval dictated by the environment. It becomes apparent that heterogeneous and distributed embedded real-time systems as described above are inherently difficult to design and to analyze because of the tight interaction between computation, communication and the available resources.

Part of this difficulty is caused by the fact that the functional and extra-functional behavior of the system is influenced by interferences on shared resources such as processors, memory or communication devices. Packet streams or tasks may prevent each other from using these resources, even if they belong to independent parts of the application. As a result, resource sharing strategies influence the system behavior to a large extend. In addition, the system environment which continuously interacts with the embedded system may vary and cause an additional degree of non-determinism.

During the system level design process of an embedded system, a designer is typically faced with questions such as whether the timing properties of a certain system design will meet the design requirements, what architectural element will act as a bottleneck, or what the on-chip memory requirements will be. Consequently it becomes one of the major challenges in the design process to analyze specific characteristics of a system design, such as end-to-end delays, buffer requirements, or throughput in an early design stage, to support making important design decisions before much time is invested in detailed implementations. This analysis is generally referred to as system level performance analysis. If the results of such an analysis is able to give guarantees on the overall system behavior, it can also be applied after the implementation phase in order to verify critical system properties.

In the presentation, we will cover the following aspects of system level performance analysis of distributed embedded systems:

  • Approaches to system-level performance analysis. Requirements in terms of accuracy, scalability, composability and modularity.
  • Modular Performance Analysis (MPA): basic principles, methods and tool support.
  • Examples that show the applicability: An environment to map applications onto multiprocessor platforms including specification, simulation, performance evaluation and mapping of distributed algorithms; analysis of memory access and I\O interaction on shared busses in multi-core systems.

Dr. Jan Reineke (UC Berkeley, USA)

Research interests: Timing Analysis and Timing Predictability, Static Analysis by Abstract Interpretation, in particular Cache Analysis and Shape Analysis.

Short CV: Dr. Reineke studied computer science in Oldenburg and Saarbrücken, receiving a Bachelor's degree from the University of Oldenburg in 2003 and a Masters from Saarland University in 2005. From 2005, he was a PhD student under the supervision of Reinhard Wilhelm. He defended his PhD thesis on "Caches in WCET Analysis" in late 2008. Since late 2009, he is a postdoctoral scholar at the University of California, Berkeley in the group of Edward A. Lee.

Course: Everything you (n)ever wanted to know about caches

Abstract: Embedded systems as they occur in application domains such as automotive, aeronautics, and industrial automation often have to satisfy hard real-time constraints. Safe and precise bounds on the worst-case execution time (WCET) of each task have to be derived.

In this talk I will discuss the influence of the cache architecture on the precision and soundness of WCET analyses, by

  • evaluating predictability metrics that capture the inherent uncertainty in any cache analysis.
  • introducing the notion of relative competitiveness, which allows to derive new cache analyses that are optimal w.r.t. the predictability metrics
  • investigating the soundness of measurement-based WCET analysis in the presence of caches.