ARTIST Summer School in Europe 2010 Autrans (near Grenoble), France September 5-10, 2010

Invasive Computing Basic Concepts & Foreseen Benefits

Prof. Dr. Jürgen Teich University of Erlangen-Nuremberg Germany

# Invasive Computing Transregional Collaborative Research Centre 89


SEVENTH FRAMEWORK




- What is Invasive Computing?
  - Ubiquitousness of parallel computers
  - Challenges in the year 2020
  - Vision and Potentials
- Scientific Work Program
  - Basics: Resource-Aware Programming, Algorithms, Complexity
  - Architectures: Reconfigurability and Decentralized Resource Management
  - Tools: Compiler, Simulation Support and Run-Time System
  - Applications: Real-Time, Fault-tolerance, Efficiency and Utilization
- Structure, Chances and Goals
  - Project structure
  - Funded Institutions and Researchers
  - Demonstrator Roadmap
  - Impact and Risks



# Ubiquitousness of parallel computers





Nvidia Fermi: 512 Cores





Sony Playstation 3, IBM Cell 9 Cores



# Ubiquitousness of parallel computers



Source: Hardware/Software Co-Design, Univ. of Erlangen-Nuremberg, Jan 2009. Programmable 5x5 core MPSoC for image filtering.

- Technology: CMOS 1.0 V, 9 metal layers, 90 nm standard cell design
- VLIW memory/PE: 16x128
- FUs/PE: 2xAdd, 2xMul, 1xShift, 1xDPU
- Registers/PE: 15
- Register file/PE: 11 read/12 write ports
- Configuration memory: 1024x32 = 4 KB
- Operating frequency: 200 MHz
- Peak performance: 24 GOPS
- Power consumption: 132.7 mW @ 200 MHz (hybrid clock gating)
- Power efficiency: 0.6 mW/MHz



## **Abstraction Levels for Parallel Computations**








### **Exercise 1**

• What kinds of parallel computing systems and computing devices (i.e., MPSoCs) are you aware of or do you already have experience with?

(3 minutes)

| Name                       | Target class of apps./supported parallelism levels | Individual<br>Experience (y/n) |
|----------------------------|----------------------------------------------------|--------------------------------|
| Nvidia Tesla<br>GPU        | Computer Graphics<br>Apps.<br>Data/Thread          | yes                            |
| Tilera Tile-GX<br>100 core | Signal Processing<br>Fine Grain Parallel           | no                             |
| Intel SCC                  |                                                    |                                |
|                            |                                                    |                                |




# Challenges in the year 2020

- Complexity
  - How to dynamically map applications onto 1000 or more processors while considering memory, communication and computing resource constraints?
- Adaptivity
  - How and to what degree shall algorithms and architectures be adaptable (HW/SW, bit/word/loop/thread/process-level)?
- Scalability
  - How to specify and/or generate programs that may run without (major) modifications on either 1, 2, 4, or N processors?
- Physical Constraints
  - Low power, performance exploitation, management overhead
- Reliability and Fault-Tolerance
  - Necessity for compensation of process variations as well as temporary and permanent defects





Definition [J. Teich. Invasive Algorithms and Architectures. Journal it – Information Technology, 50 (2008), pp. 300-310, 2008]

Invasive Programming denotes the capability of a program running on a parallel computer to request and temporarily claim processing, communication and memory resources in the neighborhood of its actual computing environment, to then execute in parallel using these claimed resources, and to be capable to subsequently release these resources again.






# Vision and Potentials

- Run-Time Scalability
  - Today's parallel programs are in general not able to adapt themselves to the current availability of resources.
  - Today's computer architectures do not support any application-controlled resource reservation.
- Dynamic Self-Optimization possible through invasion with respect to
  - Resource Utilization
  - Power Consumption (Temperature Management)
  - Performance
- Tolerance of Failures and Defects
  - Today's parallel programs would simply not run (correctly) any more!
- Robustness
  - Applications tolerate a variable availability of resources



# Potential: Resource Utilization up to 100%

# Potential: Power and Temperature Management

# Potential: Performance Gain/Tradeoff

# Potential: Robustness and Fault-Tolerance



### Exercise 2

- A) Explain the term, or rather the main idea, of "INVASIVE COMPUTING" in your own words
- B) List the major advantages and potentials of "INVASIVE COMPUTING"
- C) Imagine and name similarities and differences to recent research areas such as "GRID COMPUTING" and "CLOUD COMPUTING".

(5 minutes)



- What is Invasive Computing?
  - Ubiquitousness of parallel computers
  - Challenges in the year 2020
  - Vision and Potentials
- Scientific Work Program
  - Basics: Resource-Aware Programming, Algorithms, Complexity
  - Architectures: Reconfigurability and Decentralized Resource Management
  - Tools: Compiler, Simulation Support and Run-Time System
  - Applications: Real-Time, Fault-tolerance, Efficiency and Utilization
- Structure, Chances and Goals
  - Project structure
  - Funded Institutions and Researchers
  - Demonstrator Roadmap
  - Impact and Risks




# Basic Functionality of Invasive Programs

- Invade
  - Construct(s) for requesting and reserving resources (processors, memory, interconnect)
- Infect
  - Construct(s) for programming or configuring resources (processors, memory, interconnect) for special services
- Retreat
  - Construct(s) for releasing resources (processors, memory, interconnect)
- Concept: invade-let (i-let)
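The three constructs can be illustrated with a minimal C sketch. It is not the project's API: `pool_t`, `claim_t`, `ilet_fn`, `POOL_SIZE` and the signatures below are hypothetical names chosen for illustration, and the claimed processing elements are "infected" sequentially instead of in parallel.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8   /* hypothetical number of processing elements (PEs) */

/* Work unit executed on a claimed PE (illustrative stand-in for an i-let). */
typedef void (*ilet_fn)(int pe_id, void *arg);

typedef struct {
    int owner[POOL_SIZE];   /* 0 = free, otherwise id of the claiming program */
} pool_t;

typedef struct {
    int pe[POOL_SIZE];      /* ids of the claimed PEs */
    int count;              /* may be smaller than requested, even 0 */
    int owner;
} claim_t;

/* Invade: request up to `wanted` PEs and reserve whatever is free. */
claim_t invade(pool_t *pool, int owner, int wanted)
{
    claim_t claim = {0};
    assert(owner != 0);
    claim.owner = owner;
    for (int i = 0; i < POOL_SIZE && claim.count < wanted; i++) {
        if (pool->owner[i] == 0) {
            pool->owner[i] = owner;
            claim.pe[claim.count++] = i;
        }
    }
    return claim;
}

/* Infect: run the work unit on every claimed PE (sequentially in this sketch). */
void infect(const claim_t *claim, ilet_fn fn, void *arg)
{
    for (int i = 0; i < claim->count; i++)
        fn(claim->pe[i], arg);
}

/* Retreat: release all PEs of the claim. */
void retreat(pool_t *pool, claim_t *claim)
{
    for (int i = 0; i < claim->count; i++)
        pool->owner[claim->pe[i]] = 0;
    claim->count = 0;
}

/* Example work unit: counts how often it was executed. */
void count_ilet(int pe_id, void *arg)
{
    (void)pe_id;
    ++*(int *)arg;
}
```

A caller would invade, check `claim.count` (which may be less than requested), infect the claim with its work unit, and retreat afterwards so that other programs can claim the PEs.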








#### **Exercise 3**

- A) Why should an application be aware of the operating state of the underlying hardware resources? Give at least 3 reasons.
- B) Why can't or shouldn't resource-aware programming be provided just by the middleware/OS and thus be hidden from the application programmer? What are the advantages of considering resources at the algorithmic or application programmer's level?
- C) What advantages, disadvantages or risks may resource-aware programming have or introduce?

| Advantages          | Disadvantages      |
|---------------------|--------------------|
| Fault-Tolerance     | Overheads (HW, SW) |
| Resource Efficiency | Speed?             |
|                     |                    |

(5 minutes)




# Invasive Computing on MPSoCs

- Question: Is invasive computing already possible on MPSoC platforms available today?
- First case study: Cell B.E.
  - Library-based implementation of basic functionality
  - PPE: "Master of Invasion"
  - SPE: Sorting as slave
  - Algorithm: Invasive implementation of the well-known bitonic sorter to exploit the available SIMD architecture
  - Merging accomplished by PPE



# Cell B.E. Architecture

- 8 Synergistic Processing Elements (SPEs)
- 1 Power Processing Element (PPE)
- Element Interconnect Bus (EIB): ring bus $\rightarrow$ no direct neighbour



# Library-Based Approach

- Link your invasive program with the library:
  - `ppu-g++ -o ... -linvasic ...`
- Library `invasic` provides all basic commands:
  - SPE initialization: `init_spus(program)`
  - Invade command: `spe_struct_t *invade(uint32_t num_spe)`
  - Infect command: `void infect(spe_struct_t *spes, workload_struct_t *workloads, uint32_t signal)`
  - Retreat command: `void retreat(spe_struct_t *spe)`
  - SPE cleanup: `exit_spus()`
- Planned: Support of distinct programs in each SPE
  - E.g., using code overlay techniques
  - Program will be loaded "on demand"
  - Main program on PPE always runs (routine `main`)



# Example code: Invasive Sorting

```
n = ... /* determine number of SPEs needed for sorting */

/* invade */
sort_spes = invade(n);

if (sort_spes->num_spe == 0) {
    fprintf(stdout, "Couldn't invade any SPE, using serial version ...");
    sort_serial(...);
} else {
    /* create workloads for each SPE; command not provided by library;
       distributes the work equally between all invaded SPEs;
       sets ea_in, ea_out and size of data */
    workloads = create_workloads(sort_spes, in_data, out_data, size);

    /* infect */
    infect(sort_spes, workloads, SPU_SORT);

    /* wait until all SPUs have finished (retreat) */
    for (i=0; i<sort_spes->num_spe; i++) {
        sort_spes->spes[i]->wait_mbox();
    }
}
```

# Project Area A – Basics

- Programming and Language Issues:
  - Finding and classification of elementary (basic) constructs for invasive programs (the *invasive command space*) [A1]
  - Definition of an abstract kernel language (syntax, semantics, type system) [A1]
  - Embedding of command set into programming language(s) [A1]
- Mathematical Models for Efficiency and Utilization Analysis of invasive applications [A1]
- Algorithm Engineering:
  - Complexity and cost of invasive algorithms [A1]
  - Scheduling and Load Balancing [A3]





#### **Exercise 4**

• A) What kind of parallel programming languages, language extensions and/or programming environments such as APIs are you aware of?

| Name   | Language/<br>API         | Architecture/<br>Application<br>Target Class  | Advantages (+)/<br>Disadvantages (-)                                                                          |
|--------|--------------------------|-----------------------------------------------|---------------------------------------------------------------------------------------------------------------|
| OpenMP | API for Fortran<br>and C | Shared-Mem.<br>Architectures/<br>Thread-level | (+) Widespread<br>(+) Quasi-standard<br>(-) Shared-memory architectures only |
|        |                          |                                               |                                                                                                               |
|        |                          |                                               |                                                                                                               |

(3 minutes)



# Parallel Programming Survey

|           | Tiling: auto | Tiling: user | Memory: shared | Memory: distributed | Availability: commercial | Availability: OSS | Availability: consortium | Approach: language | Approach: library | Approach: compiler | Approach: compiler ext. | Load balancing: SW | Load balancing: HW | Arch.: x86 | Arch.: GPUs | Arch.: others |
|-----------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| OpenMP    | ✓   | ✓   | ✓   |     |     |     | ✓   |     | (✓) |     | ✓   | ✓   |     | ✓   |     | ✓   |
| MPI       |     | ✓   | (✓) | ✓   |     |     | ✓   |     | ✓   |     |     | ✓   |     | ✓   |     | ✓   |
| PGI       | ✓   | ✓   | ✓   | ✓   | ✓   |     |     |     |     | ✓   |     | ✓   | (✓) | ✓   | (✓) |     |
| TBB       | ✓   | ✓   | ✓   |     | ✓   | ✓   |     |     | ✓   |     |     | ✓   |     | ✓   |     | ✓   |
| Cilk++    | ✓   | ✓   | ✓   |     | ✓   | (✓) |     | ✓   |     | ✓   | (✓) | ✓   |     | ✓   |     |     |
| CUDA      |     | ✓   | ✓   | ✓   | ✓   |     |     | ✓   |     | ✓   |     |     | ✓   | (✓) | ✓   |     |
| Brook+    |     | ✓   | ✓   | ✓   | ✓   | (✓) |     | ✓   |     | ✓   |     |     | ✓   |     | ✓   |     |
| OpenCL    | ✓   | ✓   | ✓   | ✓   |     |     | ✓   | ✓   |     | ✓   |     | ✓   | ✓   | ✓   | ✓   | ✓   |
| RapidMind | ✓   | ✓   | ✓   | ✓   | ✓   |     |     |     | ✓   |     |     | ✓   | ✓   | ✓   | ✓   | ✓   |
| Ct        | ✓   | ✓   | ✓   | ✓   | ✓   |     |     |     | ✓   |     |     | ✓   | ✓   | ✓   | ✓   | ✓   |
| Axum      |     | ✓   |     | ✓   | ✓   |     |     | ✓   |     | ✓   |     | ✓   |     | ✓   |     |     |
| X10       | (✓) | ✓   | ✓   | ✓   |     | ✓   |     | ✓   |     | ✓   |     | ✓   |     | ✓   | (✓) | ✓   |

# The X10 Programming Language

- What is X10?
  - X10 is a new programming language being developed at IBM Research in collaboration with academic partners.
  - The development started in 2004 within the IBM PERCS project as part of the DARPA program on High Productivity Computing Systems (HPCS).
  - Designed for parallel programming using the partitioned global address space (PGAS) model.
- Features of X10
  - Is more productive than current models
  - Can support high levels of abstraction
  - Can exploit multiple levels of parallelism and non-uniform data access
  - Is suitable for multiple architectures and multiple workloads
  - $\Rightarrow$ Adaptability and resource-awareness are already included to some extent in X10

Note: Slides are partly based on material by Christoph von Praun, Vijay Saraswat, and Vivek Sarkar.



# Goals of X10

- Simple
  - Start with a well-accepted programming model, build on strong technical foundations, add few core constructs
- Safe
  - Eliminate the possibility of errors by design, and through static checking
- Powerful
  - Permit easy expression of high-level idioms
  - And permit expression of high-performance programs
- Scalable
  - Support high-end computing with millions of concurrent tasks
- Universal
  - Present one core programming model to abstract from the current plethora of architectures



# X10 Concepts



## X10 Constructs



[Figure: places, from Place 0 to Place (MaxPlaces-1)]

| Category                  | Constructs                             |
|---------------------------|----------------------------------------|
| Fine-grained concurrency  | `async S`                              |
| Atomicity                 | `atomic S`, `when (C) S`               |
| Global data structures    | points, regions, distributions, arrays |
| Place-shifting operations | `at (P) S`                             |
| Ordering                  | `finish S`, `clock`                    |

Two basic ideas: Places and Asynchrony




- Given: An array of 8 bit numbers
- Result: Number of observations in the array sorted into 256 bins
- Small illustrative example (array with 16 elements, 6 bins):



• Histogram:








# Histogram, X10 Code

Helper class for handling sub histograms

```
class subHist
{ // Generates 256 bins and initializes them with zero
  val sh: Rail[Int] = Rail.make[Int](256, (Int) => 0);
  def this() {}

  // Increments bin i by one
  def inc(i: Int)
  { sh(i)++;
  }

  // Adds value v to bin i
  def add(i: Int, v: Int)
  { sh(i) += v;
  }

  // Returns the value of bin i
  def get(i: Int): Int
  { return sh(i);
  }
  . . .
}
```

# Histogram, X10 Code

```
public class Histogram {
  public static def main(argv:Rail[String]!) {
    // Generate some workload (random numbers within the interval [0,255])
    val rnd = new Random();
    rnd.setSeed(42);
    val noOfElements: Int = 10000000;
    val R = 0..noOfElements-1; // Region
    val dataSet = new Array[Int](R, ((i):Point) => rnd.nextInt(256));

    // Invade: Are there some additional cores available?
    val numberOfCores: Int = Invade();

    // Create mapping from region R to set of cores
    val D = Dist.makeDistribution(R, numberOfCores);

    // Infect (Data): Create at each core a sub histogram
    val subHists: DistArray[subHist] =
      DistArray.make[subHist](Dist.makeDistribution(numberOfCores),
                              ((i):Point) => new subHist());

    // Distribute workload to available cores
    val distDataSet: DistArray[Int] =
      DistArray.make[Int](D, ((i):Point) => dataSet(i));
    . . .
```
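The sub-histogram idea itself (one private histogram per claimed core, merged at the end) can be sketched independently of X10 in plain C. This is an illustration under assumptions, not a translation of the slide code: `histogram`, `BINS` and `MAX_CORES` are hypothetical names, the data is split into contiguous chunks, and the chunks are processed one after another rather than on separate cores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BINS 256
#define MAX_CORES 64   /* assumed upper bound for this sketch */

/* Histogram of n 8-bit values, computed as `cores` sub-histograms over
 * contiguous chunks and merged afterwards. Each chunk stands for the share
 * of one claimed core; here the chunks are processed sequentially. */
void histogram(const uint8_t *data, size_t n, int cores, unsigned hist[BINS])
{
    static unsigned sub[MAX_CORES][BINS];

    assert(cores >= 1 && cores <= MAX_CORES);
    memset(sub, 0, (size_t)cores * sizeof sub[0]);

    for (int c = 0; c < cores; c++) {
        size_t begin = n * (size_t)c / (size_t)cores;
        size_t end   = n * (size_t)(c + 1) / (size_t)cores;
        for (size_t i = begin; i < end; i++)
            sub[c][data[i]]++;          /* private bin, no synchronization needed */
    }

    /* Merge step: add up all sub-histograms bin by bin. */
    memset(hist, 0, BINS * sizeof hist[0]);
    for (int c = 0; c < cores; c++)
        for (int b = 0; b < BINS; b++)
            hist[b] += sub[c][b];
}
```

Because every chunk writes only to its own sub-histogram, the result does not depend on how many cores the invade step returned; only the merge cost grows with the number of cores.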


- The Invade function might come in different flavors, e.g.,
  - Give me as many resources as possible
  - Give me a power of 2
  - Give me an even number
  - etc.
- The type of the returned resource may differ as well (x86 multi-core, graphics device, ...); CUDA devices are already supported to some extent in X10
- Depending on the resource type, the program might also select different parallelization strategies, e.g.,
  - Thread-level parallelism in case of a standard multi-core
  - Data-level parallelism in case of a graphics device
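A minimal C sketch of such invade flavors and of the strategy selection follows. All names (`claim_policy_t`, `cores_to_claim`, `resource_type_t`, `pick_strategy`) are hypothetical and only illustrate the idea; they are not part of X10 or of the project's libraries.

```c
#include <assert.h>

/* Hypothetical invade flavors: how many of the free cores to claim. */
typedef enum { CLAIM_MAX, CLAIM_POWER_OF_TWO, CLAIM_EVEN } claim_policy_t;

/* Number of cores to claim, given how many are currently free. */
int cores_to_claim(int free_cores, claim_policy_t policy)
{
    assert(free_cores >= 0);
    switch (policy) {
    case CLAIM_POWER_OF_TWO: {
        /* Largest power of two that does not exceed free_cores (0 if none). */
        int p = 0;
        if (free_cores >= 1) {
            p = 1;
            while (p <= free_cores / 2)
                p *= 2;
        }
        return p;
    }
    case CLAIM_EVEN:
        return free_cores - free_cores % 2;
    case CLAIM_MAX:
    default:
        return free_cores;
    }
}

/* Hypothetical resource types and the strategies named on the slide. */
typedef enum { RES_MULTICORE, RES_GPU } resource_type_t;
typedef enum { STRATEGY_THREAD_LEVEL, STRATEGY_DATA_LEVEL } strategy_t;

/* Select the parallelization strategy from the type of the claimed resource. */
strategy_t pick_strategy(resource_type_t type)
{
    return type == RES_GPU ? STRATEGY_DATA_LEVEL : STRATEGY_THREAD_LEVEL;
}
```

With 7 free cores, for example, the three flavors would claim 7, 4 and 6 cores respectively.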



### Project Area B – Architectures

Invasive Computer Architectures:

- Invasion Control Architectures for networks of ASIPs (iCore [B1]), RISC processors (CPU [B3]) and Tightly-Coupled Processor Arrays (TCPA [B2])
- Microarchitecture:
  - Segmentable and reconfigurable memory, processors, instruction sets and interconnect [B1, B2, B4]
  - "Instruction set" definition for basic functionality [B1, B2]
  - Hardware-supported invasion (Invasion Controller) [B2]
- Macroarchitecture:
  - Hardware-supported invasion (CIC [B3])
  - Invasive Communication Networks (iNoC [B5])
- Monitoring and Design Optimization [B4]





# Project Area C – Tools

- Run-Time System [C1]
  - Methods, principles and abstractions for extendable, (re-)configurable and adaptable OS structures for invasive computing systems
  - Agent technology for scalable resource management
  - Techniques for virtual power management
  - *i*RTSS: (decentralized) operating system services for invasive architectures
- Simulation and Compiler [C2, C3]
  - Simulation (Speed, HW/SW, Heterogeneity) [C2]
  - Compiler
    - Symbolic Parallelization: Loop Invader [C3]
    - Machine Markup Languages [C3]
    - Backend Design (X10 -> Sparc, X10 -> TCPA) [C3]
    - Invasification [C3]



#### **Demo Invasion Strategies**






#### **Exercise 5**

Given a mesh-connected 2D invasive TCPA with NxN processor elements (PEs). Let the invasion of one single PE take one single clock cycle (time for one PE issuing an invasion request).

- CASE A: Let each PE be able to invade only one nearest-neighbour PE after being invaded itself.
  - A) How many PEs may be invaded in n clock cycles?
  - B) Place an initial single PE seed program starting the invasion (Master of Invasion) such that a LINEAR INVASION (invasion in a single direction) yields the biggest CLAIM (number of invaded PEs).
  - C) How long does it take (minimally) to invade the whole NxN array assuming multi-dimensional invasion is allowed? Specify a single invasion strategy.
- CASE B: Let each PE be able to issue, once, a simultaneous invade request to ALL its nearest neighbours in one clock cycle. Answer A), B), C) likewise.

(6 minutes)
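To check answers for CASE B, a small C helper can count the claimed PEs. It rests on assumptions that the exercise does not state explicitly: "nearest neighbours" are taken to be the 4 mesh neighbours, the seed counts as claimed at cycle 0, and a PE at Manhattan distance d from the seed is therefore claimed in cycle d. The function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* PEs claimed after `cycles` clock cycles on an n x n mesh when every newly
 * invaded PE invades all of its 4 nearest neighbours in the next cycle.
 * Assumption: the seed PE at (sx, sy) counts as claimed at cycle 0. */
int claimed_after(int n, int sx, int sy, int cycles)
{
    int count = 0;
    assert(n > 0 && sx >= 0 && sx < n && sy >= 0 && sy < n);
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++)
            if (abs(x - sx) + abs(y - sy) <= cycles)
                count++;
    return count;
}

/* Cycles until the whole n x n mesh is claimed: the largest Manhattan
 * distance from the seed to any PE, which is always reached at a corner. */
int cycles_to_cover(int n, int sx, int sy)
{
    assert(n > 0 && sx >= 0 && sx < n && sy >= 0 && sy < n);
    int dx = sx > n - 1 - sx ? sx : n - 1 - sx;
    int dy = sy > n - 1 - sy ? sy : n - 1 - sy;
    return dx + dy;
}
```

Comparing `cycles_to_cover` for a corner seed and a centre seed shows how much the placement of the Master of Invasion matters under these assumptions.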

# Project Area D – Applications

- Application Areas:
  - Robotics [D1]
    - Real-Time
    - Fault-Tolerance
    - Performance
  - Scientific Computing [D3]
    - Invasive Computing on HPC systems
    - Resource utilization
    - Performance



# Invasive Computing Scenario in Robot Vision

| Subtask                                                      | Sample algorithm                                                        | HW resources                         |
|--------------------------------------------------------------|-------------------------------------------------------------------------|--------------------------------------|
| Build 3D model of environment, use cameras from other robots | Stereo vision, disparity map, block matching, combine multiple camera views | video input, network I/O, TCPA, RISC |
| (Listen for "help" shouts, analyze audio input)              |                                                                         |                                      |
| Find regions of interest (ROI): persons, explosives          | Color segmentation, region growing                                      | TCPA                                 |
| Check hypothesis for each ROI                                | Shape segmentation, shape matching                                      | Multiple RISCs (one per ROI)         |
| Motion planning                                              | Shortest path calculation                                               | 1 RISC                               |
| Walk into hazardous area                                     | Motion control                                                          | 1 RISC hard real-time                |
| Check for any interleaving objects                           | Optical flow                                                            | TCPA, multiple RISCs (e.g. 2)        |
| Update 3D model                                              | Disparity map update, combine multiple views                            | video input, network I/O, TCPA, RISC |



#### **Scenario Robot Vision**






# Scenario: Dynamically Adaptive Mesh Refinement & "Multigrid" Solver

- Structured mesh refinement based on tree structures
- Load distribution and load balancing using so-called "space-filling curves" (strong changes in the dynamic mesh refinement, depending on application)
- Challenges: "divide-and-conquer" as well as "multi-level" approaches
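How a space-filling curve turns load balancing into cutting a one-dimensional sequence can be sketched in C. The slide does not say which curve is used; the sketch below swaps in the Z-order (Morton) curve because it is the simplest to compute, and `morton2d` and `owner_of` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Z-order (Morton) index of grid cell (x, y): interleaves the bits of x and
 * y, with the bits of x in the even positions. Cells that are close on the
 * curve tend to be close in the grid as well. */
uint32_t morton2d(uint16_t x, uint16_t y)
{
    uint32_t key = 0;
    for (int bit = 0; bit < 16; bit++) {
        key |= (uint32_t)((x >> bit) & 1u) << (2 * bit);
        key |= (uint32_t)((y >> bit) & 1u) << (2 * bit + 1);
    }
    return key;
}

/* Worker that owns the cell at curve position `pos` when `cells` cells,
 * ordered along the curve, are cut into `workers` contiguous pieces of
 * nearly equal length. */
uint32_t owner_of(uint32_t pos, uint32_t cells, uint32_t workers)
{
    assert(workers > 0 && pos < cells);
    return (uint32_t)(((uint64_t)pos * workers) / cells);
}
```

When the mesh is refined or the number of claimed workers changes, only the cut points along the curve move, so rebalancing amounts to recomputing `owner_of` for the new cell and worker counts.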


#### **Exercise 6**

- A) Name applications and application areas where "INVASIVE COMPUTING" might beat conventional programming practice. In which aspects?
- B) For which applications/domains do you believe "INVASIVE COMPUTING" will be most beneficial? For which not? Explain.

| App domain                                                   | Why? |
|--------------------------------------------------------------|------|
| Apps. with dynamic/time-variant degree of parallelism (DOP)? |      |
| Apps. with static parallelism?                               |      |
| Apps. with high/low number of threads?                       |      |
| Real-time apps.?                                             |      |

(4 minutes)




- What is Invasive Computing?
  - Ubiquitousness of parallel computers
  - Challenges in the year 2020
  - Vision and Potentials
- Scientific Work Program
  - Basics: Resource-Aware Programming, Algorithms, Complexity
  - Architectures: Reconfigurability and Decentralized Resource Management
  - Tools: Compiler, Simulation Support and Run-Time System
  - Applications: Real-Time, Fault-tolerance, Efficiency and Utilization
- Structure, Chances and Goals
  - Project structure
  - Funded Institutions and Researchers
  - Demonstrator Roadmap
  - Impact and Risks






# TRR 89 – Projects & Researchers

| Project Area A: Fundamentals, Language and Algorithm Research | Project Area B: Architectural Research | Project Area C: Compiler, Simulation, and Run-Time Support | Project Area D: Applications |
|---|---|---|---|
| A1: Basics of Invasive Computing | B1: Adaptive Application-Specific Invasive Micro-Architectures | C1: Invasive Run-Time Support System (iRTSS) | D1: Invasive Software-Hardware Architectures for Robotics |
| Teich/Snelting | Henkel/Hübner/Bauer | Schröder-Preikschat/Lohmann/Henkel/Bauer | Dillmann/Asfour/Stechele |
| A2: Algorithms and Complexity of Invasion | B2: Invasive Tightly-Coupled Processor Arrays | C2: Simulation of Invasive Applications and Invasive Architectures | D2: Invasive Ray Tracing |
| Wanka | Teich | Hannig/Gerndt/Herkersdorf | Stamminger |
| A3: Scheduling and Load Balancing | B3: Invasive Loosely-Coupled MPSoC | C3: Compilation and Code Generation for Invasive Programs | D3: Multilevel Approaches and Adaptivity in Scientific Computing |
| Sanders | Herkersdorf/Henkel | Snelting/Teich | Bungartz/Gerndt |
| | B4: Hardware Monitoring System and Design Optimization for Invasive Architectures | | |
| | Schmitt-Landsiedel/Schlichtmann | | |
| | B5: Invasive NoCs | | |
| | Becker/Herkersdorf/Teich | | |




#### TRR 89 – Existing Cooperations

- Successful cooperations exist with well-known chip and system design houses, e.g.,
  - Infineon, Munich
  - Alcatel-Lucent, Nuremberg
  - IBM, Böblingen & Poughkeepsie
  - Cadence, Munich
  - Forte Design Systems, San Jose
  - Intel GmbH, Germany










# TRR 89 – International Initiatives & Contacts


## TRR 89 – Validation & Demonstrator Roadmap

- 2-level validation concept:
  - Phase I: Early Concept Validation Demonstrator (FPGA-based)
  - Phase II: InvasIC ASIC Demonstrator
- InvasIC Lab (TP Z2)
  - One lab per location
  - 1 technician per site
  - Established milestone roadmap




#### **Impact and Risks**

Introduction of a new paradigm of *resource-aware programming* as well as new architectural support by *reconfigurable* MPSoC architectures: **InvasICs**

Expected impact on:

- Future advanced processor development for MPSoCs
- Future programming environments for Many Core Systems
- Development of parallel algorithms

#### Potential Risks:

- Acceptance of resource-aware programming
- Cost of Invasion (Hardware/Software, Timing)

### **Questions?**


