Research and Integration Activities for the "Execution Platforms" cluster

System Modelling Infrastructure
JPIA-Platform

Abstract
Integrate ongoing research efforts on infrastructure modelling. This would replace prototyping hardware to reduce the cost and time required for designing embedded systems. This activity is strategic for providing one angle in tackling the growing complexity of embedded systems.


Baseline

A key research and research integration enabler is a scalable and realistic modelling platform which is abstract enough to provide complete system representations and some form of functional models even for billion-transistor future systems, while at the same time providing the needed flexibility for modelling a number of different embodiments (e.g. multi-processors, homogeneous and heterogeneous, reconfigurable, etc.).

The main objectives are:
- Integrate ongoing research efforts on infrastructure modelling
- Replace hardware prototyping
- Reduce the cost and time required for designing embedded systems
- Tackle the growing complexity of embedded systems


Previous Work

Simulation-based Modeling - MPARM (UoB)
months 1-6
UoB has devoted considerable effort to developing a simulation environment for multi-processor systems-on-chip, called MPARM, and to augmenting its modelling capabilities so that it can simulate real-life systems. The work initially addressed the integration of IP cores into the simulation environment, and therefore had to cope with issues such as IP core wrapping and standard interfacing, synchronization with the system simulation engine, development of non-functional models (e.g., power consumption) and integration of such models into the functional simulation. Moreover, inter-processor communication and synchronization mechanisms have been developed, modelled and explored via functional simulation. The work on IP core integration, communication and synchronization has led to a multi-processor system-on-chip simulation environment with unprecedented modelling capabilities, which can be effectively deployed for design space exploration at a high level of accuracy. In fact, the cycle-accuracy of all component models has been preserved throughout the development and integration process.

MPARM existed prior to ARTIST2, but it has been extended and refined during the first year of ARTIST2. The work contributed by ARTIST2 is the development of the entire traffic tracing and transaction extraction facilities, which have been instrumental in integrating the traffic generators (see next paragraph) with MPARM. On the modelling side, ARTIST2 has contributed the development of a few communication-intensive benchmarks for comparing different NoC technologies and strategies.

months 7-12
The system modelling effort in MPARM has not only been concerned with the hardware architecture, but also the software infrastructure. A middleware layer has been designed with the objective of abstracting software developers from low level implementation details such as memory map, address of memory mapped slave devices, management of synchronization and inter-processor communication mechanisms, shared memory allocation and de-allocation, etc. This work has been performed with the cooperation of associated member IMEC.
The work on the middleware layer performed in cooperation with IMEC is entirely contributed by ARTIST2.

Simulation-based Modeling - Traffic Generators (UoB, DTU)
months 1-6
Traditionally, synthetic traffic generators have been used to cope with realistic industrial development scenarios, where the parallel development of components may leave IP core models unavailable while the communication architecture is being tuned. However, target applications increasingly present non-trivial execution flows and synchronization patterns, especially in the presence of underlying operating systems and when exploiting interrupt facilities. This property makes it very difficult to generate realistic test traffic. For this reason, UoB and DTU have established a cooperation to realistically render SoC traffic patterns with interrupt awareness. The proposed methodology was extensively validated by showing cycle-accurate reproduction of previously traced application flows.

The entire work on developing, integrating and testing the traffic generators is contributed by ARTIST2. This has been done in close cooperation between UoB and DTU.

months 7-12
The cooperation between UoB and DTU to establish a reactive traffic generator model has been continued in this period. The generator model has been extended to deal with more complex and more realistic events, such as OS-driven interrupt handling mechanisms, and can therefore mimic non-trivial execution flows and synchronization patterns.

The work on extending the traffic generators to handle interrupts and synchronization is entirely contributed by ARTIST2.

Distributed embedded systems for automotive applications (Linköping University, DTU)
The work of Linköping University aims at implementing a simulator for distributed embedded systems for automotive applications. The starting point for this work is the multiprocessor simulation environment (ARTS) developed at DTU. The work is performed in cooperation between the DTU and Linköping groups. A student from DTU made a short visit to Linköping in June 2005.

The entire work on developing a simulator for distributed embedded systems for automotive applications based on ARTS is contributed by ARTIST2.

Simulation-based Modeling - ARTS (DTU)
months 1-6
ARTS is a system-level heterogeneous multiprocessor System-on-Chip (MPSoC) modeling framework which allows for designer-driven design-space exploration of heterogeneous MPSoC platform architectures and co-exploration of cross-layer dependencies. In particular, it allows designers to explore the consequences of different mappings of tasks to processors (software or hardware), the effects of different RTOS selections (scheduling, synchronization and resource allocation policies), and the effects of different network topologies and communication protocols. As both ARTS and MPARM are based on SystemC and interface components using OCP, initial attempts to interface the two models have been made.
ARTS existed prior to ARTIST2, but it has been extended and refined during the first year of ARTIST2. The work contributed by ARTIST2 is the development of an OCP-based interface between computation and communication components. This has required a major rewrite of the modelling core of ARTS as well as an extension of the basic ARTS model to include IO tasks, which model device drivers, and IO devices, which model the interface devices.

months 7-12
The work on the ARTS modeling framework has been extended to include more details of the platform. This has been done in order to target two different types of platforms: MPSoC particularly for multimedia applications, and wireless sensor networks. On the application side, DTU and Linköping University have started a cooperation aimed at extending ARTS to be able to simulate distributed embedded systems for automotive applications.

The cooperation between DTU and Linköping is done entirely within ARTIST2. The extensions related to multimedia applications, including the setup of and experimentation with a wireless multimedia terminal, are contributed by ARTIST2. The work on extending the ARTS modeling framework to support modelling of wireless sensor networks has been done within the Hogthrob project, which is one of the main sources of funding for the system modelling infrastructure activity.

Formal Modeling - Response Time Analysis (TU Braunschweig)
months 1-6
The aim has been to consider remote memory accesses in worst-case response times. In our extended task model, multiple remote transactions during the task’s execution are taken into account. This extension allows the straightforward modelling of many applications and increases the applicability of formal analysis methods to many real-world architectures containing shared memory.
The task model extension that allows the effect of remote memory accesses to be included in worst-case response time analysis is entirely contributed by ARTIST2.
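
To make the idea concrete, the following is a minimal sketch of classical fixed-priority response-time analysis in which each task's demand is inflated by its remote memory transactions. It is not the extended task model described above: it assumes, for illustration only, that the processor stalls during every remote access and that each access has the same fixed worst-case latency. The function name, the task tuple layout and the parameter mem_latency are hypothetical.

```python
import math

def response_time(tasks, i, mem_latency):
    """Worst-case response time of tasks[i], or None if it exceeds the period.

    tasks is sorted by priority, highest first. Each task is a tuple
    (wcet, period, n_remote) of integers, with the deadline equal to the period.
    Assumption (not from the report): every remote access stalls the
    processor for exactly mem_latency time units in the worst case.
    """
    def demand(task):
        wcet, _period, n_remote = task
        return wcet + n_remote * mem_latency

    own = demand(tasks[i])
    deadline = tasks[i][1]
    r = own
    while True:
        # Interference from all higher-priority tasks released within r.
        interference = sum(
            math.ceil(r / tasks[j][1]) * demand(tasks[j]) for j in range(i)
        )
        r_next = own + interference
        if r_next > deadline:
            return None          # deadline missed, stop iterating
        if r_next == r:
            return r             # fixed point reached
        r = r_next

tasks = [(1, 4, 0), (2, 6, 1), (3, 12, 0)]
for latency in (0, 1, 2):
    print(latency, [response_time(tasks, i, latency) for i in range(len(tasks))])
```

In this toy task set, a single remote access in the middle-priority task is enough to push the lowest-priority task from a response time of 10 to its deadline of 12, and then past it, as the access latency grows.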

Formal Modeling - Sensitivity Analysis (TU Braunschweig)
months 1-6
Sensitivity analysis and flexibility optimization are used to reduce the design risk of critical components and increase design robustness. Sensitivity analysis allows the system designer to evaluate the flexibility of a given system, and thus to quickly assess the system-level impact of changes in performance properties of individual hardware and software components. If, for example, the integration of a supplied IP component into the system in its original configuration results in a non-working system, the designer can easily determine whether the system can be reconfigured so that all system constraints are satisfied.
The ARTIST2 contribution to sensitivity analysis is the formal framework to define sensitivity and robustness measures which can then be used in design space exploration and robustness optimization.

Formal Modeling - Modelling Power Consumption with SymTA/S (TU Braunschweig)
months 7-12
Recently the SymTA/S tool was extended in cooperation with Prof. Sharon Hu, University of Notre Dame, USA, to model and analyse the power consumption of complex heterogeneous embedded systems. Power aspects represent, besides performance issues, a critical problem during implementation and integration of complex systems. Currently we are working on power optimization techniques (heuristic and stochastic) based on the developed power models.
SymTA/S existed prior to ARTIST2. The work on extending the basic model of SymTA/S to handle power consumption, done in cooperation between TU Braunschweig and the University of Notre Dame, is contributed by ARTIST2.

Formal Modeling - Integration (TU Braunschweig, ETHZ)
months 7-12
Initial attempts to integrate the event-stream formalism from TU Braunschweig and the real-time calculus from ETHZ have been made. The real-time calculus has been embedded into the SymTA/S front-end, allowing designers to express both models within the same environment.

The cooperation between TU Braunschweig and ETHZ on embedding the real-time calculus within the SymTA/S front-end is entirely contributed by ARTIST2.


Problem Tackled in Year 2

The method to integrate the various approaches provided by the international research community will be to develop and offer a system of formal models, describing their dependencies and offering associated tools which implement those models and associated methods (e.g. simulation, worst case analysis), if applicable.

The aim is to create a system-level model in which designers can model and analyse complex heterogeneous embedded systems, and in particular explore:
- Consequences of different mappings of tasks to processors – software or hardware
- Effects of different RTOS selections – scheduling, synchronization and resource allocation policies
- Effects of different Network topologies and communication protocols.
- Effects of different component selections – processor, memory and memory hierarchy, IO buffers, co-processors

As this is a very complex problem, we are addressing it through two different approaches:
- Simulation-based modeling, in which we want to integrate cycle-true models with transaction level and abstract models in order to be able to do cross-layer and –level modeling and analysis.
- Formal modeling, in which we want to integrate different formalisms in order to be able to perform worst-case and response time analysis as well as schedulability analysis.

In the long term we want to integrate the simulation-based and formal models into a system modeling infrastructure which allows for both views.
The major focus of Year2 has been on:
- Model integration both within and between the two different approaches; simulation-based and formal-based.
- The usage of the models, in particular for performance analysis and exploration. Much of the model usage is reported in detail in the other activity reports from the Execution Platforms cluster.


Current Results

Simulation platform for distributed embedded systems (University Linköping)
A simulation environment is designed and implemented for distributed real-time systems such as those used in automotive applications. The ARTS environment, developed at DTU and targeting System on chip applications, has been used as a starting point by the Linköping team.

In Year2 the following work has been done:
- The implementation of the environment has been finalized and new protocols, such as FlexRay, have been implemented;
- Theoretical investigation regarding anomalies and sensitivity in distributed real-time systems has been performed, with results that will help to improve the efficiency of the simulator in detecting close to worst case behavior. This is important when using the simulator for evaluation of the pessimism of certain schedulability analysis approaches. This work is done in interaction with the Braunschweig group;
- Real-life applications from our industrial partners at Volvo have been implemented.

A first publication is planned for next year.

Modeling and response-time/buffer analysis for NoC (University Linköping)
The Linköping group has developed a system model, based on which worst case response times and worst case buffer need for hard real-time applications implemented on NoCs can be calculated. On top of this analysis approach, an optimization tool for buffer space minimization has been implemented, for real-time NoC applications.

Modelling and formal timing analysis of shared memory accesses (TU Braunschweig)
We have continued to investigate design paradigms of MPSoC architectures. As opposed to distributed systems, a common feature here is the use of a shared memory that is accessed from each processor, introducing conflicts on the memory and interconnects. System designers often implement latency-hiding techniques to reduce the effect of waiting for data, by allowing frequent context switches to tasks that are ready.

We have systematically identified dependencies in such systems that have an influence on design properties such as end-to-end delays. Using this in [SIE06], we were able to show that the technique for latency hiding can bear unwanted results for critical worst-case response time scenarios.

We have further investigated formally the timing of multiple coinciding memory accesses. Previous approaches had to assume a worst-case timing for each individual memory access. Due to large timing variations, this leads to a large deviation between the analysis result and the actual behaviour. In [SISE06], we presented a new way to express and calculate the total latency of multiple events with much higher accuracy, leading to improved worst-case response time estimates.

Integration of formal SDF analysis techniques into the SymTA/S framework (TU Braunschweig)
Standard event models represent key integration aspects and hide complexity of local scheduling analysis algorithms. Thus, they are a suitable abstraction to integrate different models of computation into the SymTA/S framework. Recent work at IDA has produced a methodology to embed the analysis of SDF Graphs [Lee/Messerschmitt] into the SymTA/S framework (paper submitted for review at DATE07).
Integrating SDF models into the SymTA/S framework required corner-case evaluation of SDF graphs to construct event models describing their timing behaviour. Also, notions for path related metrics like latencies were defined and algorithms for computing their upper and lower bounds were proposed.
SDF Graphs are especially suited for describing data transforming applications like filters. Integrating their analysis into the SymTA/S framework significantly enlarges its application domain and improves the analysis results, in particular in the field of filter applications.
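
As background for readers unfamiliar with SDF graphs, the sketch below computes the repetition vector of a connected SDF graph from its balance equations: for every edge, the producer's firing count times its production rate must equal the consumer's firing count times its consumption rate. This is the standard consistency check for SDF and not the event-model construction used for the SymTA/S integration. The function name and the edge tuple layout (src, dst, produced, consumed) are assumptions made for this example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def repetition_vector(actors, edges):
    """Smallest positive integer firing counts solving the balance equations.

    edges is a list of (src, dst, produced, consumed) tuples. Raises
    ValueError if the graph is inconsistent or not connected.
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, produced, consumed in edges:
            if src == a:
                other, value = dst, rate[a] * produced / consumed
            elif dst == a:
                other, value = src, rate[a] * consumed / produced
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                raise ValueError("inconsistent SDF graph")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the rational solution to the smallest integer solution.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {a: c // common for a, c in counts.items()}

# A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
print(repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

One iteration of this example graph fires A three times, B twice and C once, after which every edge holds the same number of tokens as before.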

Multi-dimensional sensitivity analysis (TU Braunschweig)
The robustness of an architecture to changes is a major concern in embedded system design. Robustness is important in early design stages to identify whether and to what extent a system can accommodate later changes or updates, or whether it can be reused in a next-generation product. Robustness can be expressed as a "performance reserve", the slack in performance before a system fails to meet timing requirements. This is measured as design sensitivity.

Due to complex component interactions, resource sharing and functional dependencies, one-dimensional sensitivity analysis [RJE05] cannot cover all effects that modifications of one system property may have on system performance. One reason is that the variation of one property can also affect the values of other system properties requiring new approaches to keep track of simultaneous parameter changes.

Therefore, TU Braunschweig developed a heuristic and a stochastic approach for multi-dimensional sensitivity analysis [RHE06]. The heuristic approach is a divide-and-conquer-like algorithm, which uses parameter-specific heuristics to prune the search space. It is applicable to two-dimensional search spaces. The stochastic approach is based on evolutionary search spaces and uses tabu search to bound the region containing the sought-after sensitivity front separating working and non-working system configurations. It is applicable to search spaces of arbitrary dimension.
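
The notion of a sensitivity front can be illustrated with a small sketch. It is neither the heuristic nor the stochastic algorithm of [RHE06]: it simply sweeps one parameter and binary-searches the other, assuming that feasibility is monotonic in the second parameter. The feasibility predicate used in the example, an EDF utilisation test for two periodic tasks with deadlines equal to periods, is a stand-in chosen for brevity; all names are hypothetical.

```python
def max_feasible(is_feasible, lo, hi):
    """Largest integer v in [lo, hi] with is_feasible(v), or None.

    Assumes monotonicity: if v is infeasible, every larger value is too.
    """
    if not is_feasible(lo):
        return None
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if is_feasible(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

def sensitivity_front(is_feasible, x_values, y_lo, y_hi):
    """For each x, the largest y that still yields a working configuration."""
    front = []
    for x in x_values:
        y = max_feasible(lambda v: is_feasible(x, v), y_lo, y_hi)
        if y is not None:
            front.append((x, y))
    return front

def edf_feasible(c1, c2, t1=10, t2=20):
    # Stand-in predicate: utilisation c1/t1 + c2/t2 <= 1, in integer arithmetic.
    return c1 * t2 + c2 * t1 <= t1 * t2

# Front in the (WCET of task 1, WCET of task 2) plane.
print(sensitivity_front(edf_feasible, range(1, 11), 1, 20))
```

Points on or below the returned front are working configurations; the distance of the current design point from the front is one way to read off the "performance reserve" in each dimension.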

MPARM interface with Lisatek (University of Bologna)
New processor models have been included. The most important extension in this area is the integration with the SystemC models generated by the Lisatek suite developed by AACHEN. Any processor modeled in LISA can now be integrated as an add-on core in the MPARM platform. A standardized transaction-level interface has been defined for core embedding.

MPARM memory models (University of Bologna)
Models for external memory controllers (DRAM-DDRAM) have been developed. The main memory interface is often the true performance bottleneck for many MPSoC platforms. Therefore significant effort has been devoted to the development of an accurate DRAM controller module, capable of several advanced communication optimizations. The model has been integrated within the MPARM platform. Associate partner STMicroelectronics has provided the functional specification for the controller.

Traffic generator model (University of Bologna, Technical University of Denmark)
Applications running on MPSoC architectures increasingly present non-trivial execution flows and synchronization patterns, especially in presence of underlying operating systems and when exploiting interrupt facilities. These properties make it very difficult to generate realistic test traffic. Technical University of Denmark and University of Bologna have jointly developed a reactive traffic generator device capable of correctly replicating complex software behaviours in the MPSoC design phase. The approach has been validated by showing cycle-accurate reproduction of a previously traced application flow. The traffic models have been integrated in both the ARTS environment from Technical University of Denmark and the MPARM environment from University of Bologna.

ARTS modelling framework (Technical University of Denmark)
ARTS is a SystemC-based abstract system-level modelling and simulation framework, which allows MPSoC designers to model and analyze the different layers, i.e., application software, middleware and platform architecture, and their interaction prior to implementation. In particular, ARTS provides a simulation engine that captures cross-layer properties, such as the impact of OS scheduling policies on memory and communication performance, or of communication topology and protocol on deadline misses. The ARTS framework was demonstrated at the University Booth at the DATE07 conference in Munich. As a result, ARTS has been made publicly available. The distribution consists of the framework and a tutorial. The results of this work were published at MASCOTS05 [MSM05] and an article has been submitted to the journal Design Automation for Embedded Systems.
A web-link to the downloadable ARTS framework is http://www.imm.dtu.dk/arts.

Toolbox for Modular Performance Analysis method of ETHZ (ETH Zurich)
The analytic performance analysis model for distributed embedded systems and multiprocessing devices has been refined and discussed together with other partners. Major events have been the Distributed Embedded Systems workshop in Leiden and the Execution Platform Meeting in Bologna. As a result, we decided to implement the basic mathematical tools of Real-Time Calculus in the form of a Matlab toolbox. The aim is to foster even more integration in the future, as other groups will now be able to apply and incorporate analytic methods easily. The first version of the toolbox is available, including documentation and a tutorial.
It will be used to integrate SymTA/S and the modular performance analysis method in the next year of ARTIST2. A web-link to the toolbox is
http://www.mpa.ethz.ch/Rtctoolbox/O... .
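
To give a flavour of the kind of computation Real-Time Calculus supports, here is a minimal Python sketch. It does not use or reproduce the toolbox's API. It assumes a periodic-with-jitter event stream, whose upper arrival curve is ceil((delta + jitter) / period) for delta > 0, and a rate-latency lower service curve max(0, rate * (delta - latency)). The delay bound is the maximum horizontal distance between the demand curve and the service curve. All function and parameter names are hypothetical, and inputs are assumed to be integers.

```python
from fractions import Fraction
from math import ceil

def upper_arrival(delta, period, jitter):
    """Maximum number of events in any time window of length delta."""
    if delta <= 0:
        return 0
    return ceil(Fraction(delta + jitter, period))

def delay_bound(period, jitter, demand, rate, latency):
    """Upper bound on event delay for a rate-latency server.

    demand is the processing needed per event. The k-th event of a burst can
    arrive no earlier than max(0, (k - 1) * period - jitter) after the first,
    and the server has finished k events by latency + k * demand / rate.
    """
    if demand > rate * period:
        raise ValueError("long-run demand exceeds the service rate")
    worst = Fraction(0)
    # Beyond k = jitter // period + 2 the per-event delay can no longer grow.
    for k in range(1, jitter // period + 3):
        earliest_arrival = max(0, (k - 1) * period - jitter)
        finished = latency + Fraction(k * demand, rate)
        worst = max(worst, finished - earliest_arrival)
    return worst

print(upper_arrival(2, period=10, jitter=9))
print(delay_bound(period=10, jitter=9, demand=5, rate=1, latency=2))
```

With a jitter of 9 on a period of 10, two events can arrive almost back to back, so the second one waits behind the first and dominates the bound; with zero jitter the bound collapses to the latency plus one service time.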

Combining simulation and formal analysis for performance analysis (ETH Zurich)
Collaboration with Francesco Poletti and Luca Benini at University of Bologna.
In this activity, we developed a new, compositional performance evaluation method for embedded systems. The new method combines existing approaches for system-level performance analysis, namely MPA, a formal method, and MPSim, a simulation-based approach. To enable this combination, we defined the interfaces needed between the different performance evaluation methods. As the core of the approach, we propose a method to generate simulation stimuli from analytical models. In addition, we introduced a measure to assess the quality of a generated simulation trace with respect to its analytical description. In order to show the applicability of this new approach, we implemented an example system for such a combined performance evaluation, consisting of a multiprocessor system-on-a-chip. It is based on existing simulation and analytical models, extended by the interfaces needed for the combination, including an implementation of the simulation trace generation algorithm. This combined model was then used for a case study of an application running on a multiprocessor system.
To achieve the results described above, several physical and phone meetings were held to coordinate the joint effort and to discuss future directions of this activity. The two publications [KPBT06] and [KT06] describe the results of the joint activity.
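
The link between an analytical description and a simulation trace can be sketched as follows. This is not the trace generation algorithm or the quality measure of [KT06]; it is a simplified illustration that assumes a periodic-with-jitter stream, for which any n consecutive events must span at least max(0, (n - 1) * period - jitter) time units. The sketch generates the densest admissible trace and checks whether an arbitrary trace respects the bound. All names are hypothetical.

```python
def min_distance(n, period, jitter):
    """Minimum time span that can contain n consecutive events."""
    return max(0, (n - 1) * period - jitter)

def earliest_trace(count, period, jitter):
    """Densest admissible trace: every event arrives as early as allowed."""
    return [min_distance(k, period, jitter) for k in range(1, count + 1)]

def conforms(trace, period, jitter):
    """True if no group of consecutive events is denser than the model allows."""
    times = sorted(trace)
    for i in range(len(times)):
        for j in range(i + 1, len(times)):
            if times[j] - times[i] < min_distance(j - i + 1, period, jitter):
                return False
    return True

stimuli = earliest_trace(4, period=10, jitter=15)
print(stimuli, conforms(stimuli, period=10, jitter=15))
print(conforms([0, 1, 2], period=10, jitter=5))
```

A trace produced this way stresses the simulated system with the worst burst the analytical model admits, which is the kind of stimulus needed when simulation results are to be compared against formal bounds.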


Publications Resulting from these Achievements

- [MEP06] Sorin Manolache, Petru Eles, Zebo Peng, "Buffer Space Optimisation with Communication Synthesis and Traffic Shaping for NoCs," Design Automation and Test in Europe Conference (DATE 2006), Munich, Germany, March 6-10, 2006, pp. 718-723
- [SIE06] Simon Schliecker, Matthias Ivers, Rolf Ernst. Integrated Analysis of Communicating Tasks in MPSoCs. In Proc. 3rd International Conference on Hardware Software Codesign and System Synthesis, Seoul, Korea, October 2006.
- [SISE06] Simon Schliecker, Matthias Ivers, Jan Staschulat, Rolf Ernst. A Framework for the Busy Time Calculation of Multiple Correlated Events. In 6th Intl. Workshop on Worst-Case Execution Time (WCET) Analysis, Dresden, Germany, July 2006.
- [RHE06] Razvan Racu, Arne Hamann, Rolf Ernst, A Formal Approach to Multi-Dimensional Sensitivity Analysis of Embedded Real-Time Systems. In Proc. of the 18th Euromicro Conference on Real-Time Systems (ECRTS), Dresden, Germany, July 2006
- [RJE05] Razvan Racu, Marek Jersak, Rolf Ernst. Applying Sensitivity Analysis in Real-Time Distributed Systems. In 11th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), San Francisco, USA, March 2005.
- [KPBT06] Simon Künzli, Francesco Poletti, Luca Benini, and Lothar Thiele: Combining simulation and formal methods for system-level performance analysis. In Proceedings of Design, Automation and Test in Europe (DATE), March 2006.
- [KT06] Simon Künzli and Lothar Thiele: Generating event traces based on arrival curves. In Proceedings of 13th GI/ITG Conference on Measurement, Modeling, and Evaluation of Computer and Communication Systems (MMB), March 2006.
- [VM05] Virk, K., Madsen, J., System-level Modeling of Wireless Integrated Sensor Networks, in the proceedings of the International Symposium on System-on-Chip (SoC), 2005
- [MSM05] Mahadevan, S., Storgaard, M., Madsen, J., ARTS: A System-Level Framework for Modeling MPSoC Components and Analysis of their Causality, in the proceedings of the 13th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2005
- [SARBV05] Srinivasan, S.; Angiolini, F.; Ruggiero, M.; Benini, L.; Vijaykrishnan, N.; Simultaneous memory and bus partitioning for SoC architectures, Proceedings of the IEEE International SOC Conference, 2005. 25-28 Sept. 2005 Page(s):125 - 128


Keynotes, Workshops, Tutorials

Tutorial: Frameworks for System-Level Analysis of Real-Time Systems - Symta/S and MPA
RTAS 2006 Tutorial IEEE Real-Time and Embedded Technology and Applications Symposium:
Speaker: Lothar Thiele
System-level timing, performance, and power become increasingly intractable as the interactions between system parts introduce complex dynamic behaviour that cannot be fully overseen by anyone in a design team. The tutorial addressed recent research on composable and extensible analysis methods and tools.

Tutorial: Sensor Networks
DATE Conference, Munich, Germany, 6.3.2006
Speaker: Lothar Thiele

This tutorial reviewed basic concepts of wireless sensor networks, including: ad-hoc networking, programming models, power management, in-network processing, development environments and methodologies.

Tutorial: ARTIST2 Spring School in China on Models, Methods and Tools for Embedded Systems
Xi’an, China, April 3rd-15th, 2006
Speakers: Lothar Thiele and Peter Marwedel

The tutorial started with an introduction to embedded systems and resource aware generation of software and performance analysis. It also comprised modelling of real-time systems, validation and verification.

Workshop: Workshop on Distributed Embedded Systems
Leiden, Netherlands, November 2005
Collaboration with various research groups in the area of performance analysis of embedded systems. There exist almost no results on comparisons of system level performance estimation and analysis methods for embedded systems in literature. One of the main problems that prevent such results is that there exists no consensus on how these methods should be compared. To overcome this problem, we co-organized an international workshop on distributed embedded systems. The workshop aimed to achieve three goals: first to identify issues and trends of system level performance estimation and analysis of distributed embedded systems, secondly to learn about existing system level performance estimation and analysis methods for embedded systems, and thirdly to establish a set of benchmark applications that can serve as basis for future methods comparisons.

As the focus was on distributed embedded systems, the activity is mainly reported in relation to the Communication Centric Systems activity. However, as part of the workshop, the different system models, on which the various performance estimation methods were carried out, were presented and discussed.

Workshop: Design Issues in Distributed, Communication-Centric Systems
DATE Conference, Munich, Germany, 10.3.2006
Organisers: Bruno Bouyssounouse, Rolf Ernst, Lothar Thiele

The workshop presented relevant, innovative, and holistic topics in communication-centric systems, sensor networks, dynamic real-time architecture, distributed computing, minimal operating systems, and self-organisation.
In relation to the System Modelling Infrastructure, Jan Madsen gave a presentation on Modelling Embedded Network Systems: From MPSoC to Sensor Networks.

Invited presentation: Application Specific NoC Design
DATE Conference, Munich, Germany, 8.3.2006
Speaker: Luca Benini
Hot-Topic session on system level design of SoC. This invited presentation was part of the special day for 4G wireless.

Keynote: Execution Platforms
SNART Seminar, KTH, Stockholm, Sweden – August, 2006
Speaker: Jan Madsen

Presentation of the Execution Platforms cluster at the Scandinavian ARTIST2 Seminar on Embedded Systems Design, the SNART Seminar 2006. The theme of this seminar was a pro-active approach to strengthening Scandinavia’s position in Embedded Systems, with an emphasis on topics covered by the ARTIST2 EU/IST Network of Excellence on Embedded Systems Design.

Mini-Keynote: Evolving MPSoC Architectures
MpSoC Summer school, Estes Park, Colorado, August 2006
Speaker: Jan Madsen

The keynote presented an evolutionary approach to solve the problem of mapping a set of task graphs onto a heterogeneous multiprocessor platform. The objective is to meet all real-time deadlines subject to minimizing system cost and power consumption, while staying within bounds on local memory sizes and interface buffer sizes. The exploration is based on a system level model expressed in ARTS.


ARTIST2 Participants: Expertise and Roles

  • Prof. Petru Eles – ESLAB, Linköping University (Sweden)
    Areas of his team’s expertise: models for communication intensive distributed systems, power models, analytic performance estimation.
  • Prof. Dr. Rolf Ernst – IDA, TU Braunschweig (Germany)
    Areas of his team’s expertise: model infrastructure for performance analysis of heterogeneous systems.
  • Prof. Luca Benini – Micrel Lab, University of Bologna (Italy)
    Areas of his team’s expertise: providing models for estimation of non-functional properties.
  • Prof. Jan Madsen – IMM, Technical University of Denmark (Denmark)
    Areas of his team’s expertise: abstract RTOS and NoC models for multiprocessor system simulation.

Affiliated Participants: Expertise and Roles

  • Dr. Roberto Zafalon – STM (Italy)
    Areas of his team’s expertise: requirements and use of platform.
  • Salvatore Carta – University of Cagliari (Italy)
    Working on MPSoC Middleware (task migration support) using our infrastructure.
  • Dr. Magnus Hellring – Volvo (Sweden)
    Areas of his team’s expertise: requirements analysis.

(c) Artist Consortium, All Rights Reserved - 2006, 2007, 2008, 2009