Course Parallel and Distributed Embedded Systems

Summer Semester 2020: See replacement course "Verteile und Parallele Programmierung" VPP2k

Description

VAK 03-ME-712.06
Category: Lecture+Lesson, 4 SWS
Master Course
ECTS: 6, Summer Semester
University of Bremen
Lecturer: PD Dr. Stefan Bosse

This lecture illustrates the increasing importance of parallel data processing in computer science and hardware chip design. Parallel processing concepts have been well known in algorithms and software engineering for several decades: an algorithm is partitioned into subprocesses that run concurrently and in parallel on several processors. So far, however, parallel and distributed algorithms have mostly been applied to compute-intensive numerics. Microprocessor development tends toward ever more powerful devices, which increases complexity and, more importantly, electrical power consumption. It can be shown that decomposing a complex system into a system of cooperating, less complex systems with the same overall computing performance has significant advantages:

Lower complexity of the single system and regular structures simplify the hardware design;
The power consumption of the total system can be lower than that of a monolithic single-processor system;
Existing system designs can be expanded by regular organizational structures.
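
The power argument can be made concrete with the standard first-order model of CMOS dynamic power. The following is a sketch under assumptions that are not part of the course text: the workload parallelizes perfectly over N cores, each core switches the same capacitance C as the single processor (a pessimistic choice, since simpler cores usually switch less), the lower clock rate f/N permits a lower supply voltage V' < V, and leakage as well as communication overhead are ignored. The symbols \alpha (activity factor), C, V, V', f, and N are introduced here for illustration only.

```latex
\begin{align}
P_{\text{single}} &= \alpha\, C\, V^{2} f \\
P_{\text{multi}}  &= N \cdot \alpha\, C\, V'^{2}\, \frac{f}{N} = \alpha\, C\, V'^{2} f \\
\frac{P_{\text{multi}}}{P_{\text{single}}} &= \left(\frac{V'}{V}\right)^{2} < 1
\quad \text{for } V' < V
\end{align}
```

Under these assumptions both systems deliver the same throughput, and the saving comes entirely from the quadratic dependence on the supply voltage.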

The structures and algorithms known from classical multiprocessing can be transferred, under certain constraints, to the design of digital logic systems, so that hardware can be designed with software engineering models, e.g. multi-process models with inter-process communication primitives such as semaphores or queues. The trend in hardware design is therefore toward sea-of-processors concepts with up to 1000 (simple) processor cores on a single chip. In such designs, system partitioning and communication play a central role, and a combined hardware/software co-design is essential.
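
A multi-process model with queues as communication primitives might be sketched in software as follows. This is a minimal illustration in Python, not course material: the names producer, worker, and run_pipeline are hypothetical, threads stand in for processes, and the bounded queue.Queue plays the role of a blocking channel between them.

```python
import queue
import threading

def producer(out_q, items):
    """Send every item downstream, then a None sentinel to signal the end."""
    for item in items:
        out_q.put(item)   # blocks while the bounded queue is full
    out_q.put(None)

def worker(in_q, out_q):
    """Square each received value and forward it; pass the sentinel on and stop."""
    while True:
        item = in_q.get()  # blocks while the queue is empty
        if item is None:
            out_q.put(None)
            break
        out_q.put(item * item)

def run_pipeline(items, capacity=2):
    """Connect producer -> worker -> collector with two bounded queues."""
    q1 = queue.Queue(maxsize=capacity)
    q2 = queue.Queue(maxsize=capacity)
    threads = [
        threading.Thread(target=producer, args=(q1, items)),
        threading.Thread(target=worker, args=(q1, q2)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        value = q2.get()
        if value is None:
            break
        results.append(value)
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4]))
```

The processes share no variables and interact only through the queues, which is the property that makes such a model a candidate for mapping onto separate hardware blocks connected by FIFOs.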

However, the design of parallel systems involves pitfalls and difficulties, and there is no optimal generic computer architecture for parallel systems comparable to the one available for sequential data processing. Scaling, synchronization, deadlocks, and the handling of competition between processes in multi-process-based parallel and distributed data processing place high demands on the understanding of parallel systems and their design. This understanding is acquired theoretically, through examples, and practically in the lessons. The problems of parallel systems and their synchronization are introduced with state space diagrams and a simulator (CPV). Afterwards, the CSP-based parallel programming language OCCAM is used with a compiler and virtual machine (recently replaced by a CSP Lua implementation using luajit+ and lvm). The implementation of simple parallel data processing systems on an FPGA is demonstrated in practice with the ConPro high-level synthesis framework and the Xilinx ISE FPGA design suite.
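
How a state space exposes a deadlock can be illustrated with a small exhaustive search. The sketch below is a hypothetical Python example and is unrelated to the CPV simulator: the function find_deadlocks and the encoding of a process as a list of ("acquire" | "release", lock) actions are assumptions made for illustration. It enumerates all reachable interleavings and reports states in which no process can move although at least one has not finished.

```python
from collections import deque

def find_deadlocks(programs):
    """Return the program-counter tuples of all reachable deadlock states.

    programs: one action list per process; each action is
    ("acquire", lock_name) or ("release", lock_name).
    """
    # A state is (program counters, set of (lock, owner_pid) pairs).
    start = (tuple(0 for _ in programs), frozenset())
    seen = {start}
    frontier = deque([start])
    deadlocks = []
    while frontier:
        pcs, held = frontier.popleft()
        owners = dict(held)
        successors = []
        for pid, prog in enumerate(programs):
            if pcs[pid] == len(prog):
                continue  # this process has terminated
            op, lock = prog[pcs[pid]]
            if op == "acquire":
                if lock in owners:
                    continue  # lock is taken, process is blocked
                new_held = held | {(lock, pid)}
            else:
                new_held = held - {(lock, pid)}
            new_pcs = pcs[:pid] + (pcs[pid] + 1,) + pcs[pid + 1:]
            successors.append((new_pcs, new_held))
        finished = all(pc == len(p) for pc, p in zip(pcs, programs))
        if not successors and not finished:
            deadlocks.append(pcs)
        for state in successors:
            if state not in seen:
                seen.add(state)
                frontier.append(state)
    return deadlocks

if __name__ == "__main__":
    ab = [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")]
    ba = [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")]
    print(find_deadlocks([ab, ba]))  # opposite lock order
    print(find_deadlocks([ab, ab]))  # same lock order
```

With opposite acquisition orders the search finds the state in which each process holds one lock and waits for the other; with identical orders no such state is reachable.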

The course content is structured as follows: the classical multi-process model with communicating sequential processes (CSP), including process algebra; a discussion of synchronization; the extension of the classical CSP model with competition and global shared resources; the mapping of this extended CCSP model onto the register-transfer architecture level; and finally a discussion of the properties of parallel and distributed systems in general. The course concludes with an oral final examination.

Topics

Motivation and introduction
- Use and limits of single-processor systems
- Architecture of a single-processor system
- Programmed execution
Multi-process model
- With generic processors
- Scaling to application-specific digital logic systems
Multiprocessor architectures
- With generic processors
- Scaling to application-specific digital logic systems
Inter-process Communication {Mutex, Semaphore, Event, Queue, Barrier, Monitor}
- Software
- Hardware
Parallel algorithms
- Software
- Hardware
Parallel architectures
- System-on-chip architecture
- Using FPGAs & ASICs
Logic synthesis and high-level synthesis for behavioral modeling of the system
Pipelined architectures
- In functional systems
- In reactive systems
Petri Nets

Material

[Lecture Script]: PARSYS 2017, Revision 20.6.2017, PDF