The ReTiS Lab is active on many research topics related to several aspects of embedded and cyber-physical systems, support for time-critical applications, operating systems, and cloud computing.
RT Operating Systems & Virtualization
The operating system (OS), and its kernel in particular, is the core of a software system, acting as an interface between user applications and the hardware.
To properly support real-time applications, the OS must be able to serve them so that their temporal constraints are respected. The use of appropriate scheduling and synchronization algorithms is essential to ensure low and predictable latencies in the computational activities.
Research at ReTiS Lab has been traditionally focused on both these aspects, investigating and implementing novel algorithms for task scheduling and resource management, as well as reducing latencies and developing a proper timing analysis to provide strict bounds on the worst-case delays introduced by the OS kernel.
Following the recent evolution of software systems, this research line has also been extended to cover virtualized systems with real-time constraints, in the context of both embedded applications and cloud computing. As for the OS, the research focuses on scheduling and resource management algorithms and their analysis, as well as on reducing the latency introduced by the virtualization mechanisms provided by hypervisors, containers, and similar technologies.
More recently, research efforts have also been devoted to software and hardware mechanisms capable of providing strong isolation among execution domains in a virtualized environment, and to safety and security mechanisms.
Nowadays, particular attention is given to heterogeneous platforms that integrate asymmetric multiprocessors, field-programmable gate arrays (FPGAs), general-purpose graphical processing units (GP-GPUs), and other accelerators.
In summary, the ReTiS Lab focuses its research on:
- Scheduling and synchronization algorithms for OSes and virtualized systems (design, theoretical analysis, implementation, and performance evaluation);
- Timing analysis and reduction of the latency introduced by OS kernels and hypervisors;
- Development of new scheduling and resource management algorithms to reduce kernel and virtualization latencies;
- Development of new real-time kernels and hypervisors;
- Strong isolation mechanisms for hypervisors (cache partitioning, control of memory contention, etc.);
- Development, analysis, and implementation of real-time containerization mechanisms;
- Safety-related mechanisms for OSes and hypervisors (run-time platform testing, fault detection and recovery);
- Time-predictable virtualization of FPGAs;
- Support for trusted execution environments (TEEs) in virtualized real-time systems.
More details are available on this page.
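As a small illustration of the kind of timing analysis mentioned above, the following sketch applies the classical utilization-based test for Earliest Deadline First (EDF) scheduling on a single processor: a set of independent periodic tasks with deadlines equal to their periods is schedulable if and only if the total utilization does not exceed 1. The task names and parameters are hypothetical, and the sketch is not taken from the lab's tools.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Task:
    """Periodic task with implicit deadline (deadline == period)."""
    name: str
    wcet: int    # worst-case execution time, in microseconds
    period: int  # activation period, in microseconds


def total_utilization(tasks):
    """Sum of wcet/period over all tasks, computed with exact fractions."""
    return sum((Fraction(t.wcet, t.period) for t in tasks), Fraction(0))


def edf_schedulable(tasks):
    """Uniprocessor EDF with implicit deadlines: schedulable iff U <= 1."""
    return total_utilization(tasks) <= 1


# Hypothetical task set, for illustration only.
tasks = [
    Task("sensing", wcet=1_000, period=5_000),
    Task("control", wcet=2_000, period=10_000),
    Task("logging", wcet=10_000, period=20_000),
]

if __name__ == "__main__":
    print("U =", total_utilization(tasks))
    print("EDF schedulable:", edf_schedulable(tasks))
```

The test ignores kernel overheads, blocking on shared resources, and multiprocessor effects, which are the aspects a more detailed analysis would need to bound.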
RT Cloud Computing
Cloud computing infrastructures are being challenged by the increasing demand for evolved cloud services characterized by heterogeneous performance requirements, including real-time, data-intensive, and highly dynamic workloads. The classical way to deal with dynamicity is to scale computing and network resources horizontally. However, these techniques can be far more effective when coupled with mechanisms ensuring efficient and predictable execution of software components in distributed, shared, and multi-tenant infrastructures. Such mechanisms may span the multitude of layers or planes characterizing a cloud infrastructure:
- ensuring temporal isolation at the OS kernel and/or hypervisor level;
- providing intelligent, QoS-aware mechanisms for VM placement and migration;
- preventing unstable networking performance by avoiding crosstalk and resource saturation through appropriate QoS-aware management of the network and of the applications’ data flows;
- designing scalable data stores with guarantees on end-to-end performance and latency, to support soft real-time workloads in cloud infrastructures.
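To make the idea of QoS-aware VM placement more concrete, here is a minimal sketch that uses a first-fit decreasing heuristic with a per-host load cap, so that each host keeps some headroom for predictable execution. The `Host` class, the `place_vms` function, and the `max_load` parameter are all hypothetical names introduced for illustration; real placement policies typically account for many more dimensions (memory, network, migration cost).

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu_capacity: float                       # number of physical cores
    reserved: float = 0.0                     # cores already reserved by placed VMs
    vms: list = field(default_factory=list)   # names of the VMs placed here


def place_vms(vm_demands, hosts, max_load=0.75):
    """First-fit decreasing placement with a per-host utilization cap.

    vm_demands maps a VM name to the CPU share (in cores) it must be
    guaranteed. max_load is an assumed headroom factor: a host is never
    loaded beyond cpu_capacity * max_load.
    Returns the list of VMs that could not be placed.
    """
    rejected = []
    # Place the most demanding VMs first.
    by_demand = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in by_demand:
        for host in hosts:
            if host.reserved + demand <= host.cpu_capacity * max_load:
                host.reserved += demand
                host.vms.append(vm)
                break
        else:
            # No host can admit this VM without exceeding its cap.
            rejected.append(vm)
    return rejected


# Hypothetical scenario, for illustration only.
hosts = [Host("h1", cpu_capacity=4), Host("h2", cpu_capacity=4)]
demands = {"db": 2.0, "web": 1.5, "cache": 1.0, "batch": 2.5}
rejected = place_vms(demands, hosts)
```

Rejecting a VM rather than overcommitting a host is the design choice that matters here: admission control is what allows the lower layers to provide temporal isolation.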
In summary, the ReTiS Lab focuses its research on:
- Real-time cloud computing;
- Operating systems for massively parallel & distributed systems;
- Adaptive resource management and optimization;
- Open source real-time operating systems.
Cyber Security & Safety-Critical Software
Safety and security are key requirements for many cyber-physical systems. Indeed, there are many situations where a cyber-attack or a safety issue may have catastrophic consequences, including the loss of human lives. For example, imagine what could happen if a malicious user took control of an autonomous car, or if the brake control system failed due to a software misbehavior. To address these issues, the ReTiS Lab is carrying out research to improve the robustness of cyber-physical systems against cyber-attacks and software misbehavior.
For decades, the ReTiS Lab has mastered techniques to guarantee safety and security in safety-critical software. Research efforts include modeling and automatic code generation of safe and secure AUTOSAR components in automotive systems, enhancing security through hypervisor technology and multi-domain software architectures, exploiting hardware mechanisms to ensure security in COTS platforms (e.g., through TrustZone and Pointer Authentication Codes), security mechanisms for FPGA system-on-chips, security in cloud computing, design optimization of temporal and security requirements, and the design and development of techniques for protecting against and monitoring potential cyber-attacks at the hypervisor and OS level, together with the related recovery strategies.
The ReTiS Lab has also gained considerable experience in enhancing the safety and security of autonomous systems that leverage artificial intelligence and deep neural networks in perception and control tasks, by investigating attack and defense methods for adversarial examples, as well as architectural frameworks that increase the trustworthiness of deep neural networks and tolerate faults in AI components.
More details are available on this page.
In summary, the ReTiS Lab focuses its research on:
- Security and safety for hypervisor technology and multi-domain software architectures;
- Security mechanisms for FPGA systems-on-chips;
- Security and safety of AI algorithms;
- Security in Cloud Infrastructures for Real-Time and High-Performance Services;
- Security and safety in AUTOSAR automotive systems.
Predictable Heterogeneous Computing (FPGA / GPU)
Complex real-time cyber-physical systems are typically characterized by computational activities with different criticality and performance requirements. For instance, tasks related to audio/video processing, sensor fusion, and AI-based algorithms have a large computational load and need to be executed on a rich operating system (e.g., Linux) to exploit all the available drivers, libraries, and development frameworks. On the other hand, tasks closer to the physical part of the system, such as sensing, control, and actuation, are highly critical and must be managed by a real-time operating system to guarantee the required timing and safety levels.
In addition, to meet the real-time performance requirements, such systems need to exploit multi-core, GPU and FPGA acceleration, which allows reducing the end-to-end delays as well as energy consumption on the platform. To address such needs, the software architecture should allow the coexistence of powerful and high-level development tools, available in an OS like Linux, with more critical real-time components deployed in a hard real-time environment, such as the Erika OS. This can be accomplished through a real-time hypervisor that mediates the access to the various heterogeneous components.
Research topics in this area include:
- Tools for managing life-cycle complexity in the development of heterogeneous systems with multi-core, GPU, and FPGA acceleration;
- Real-time support for FPGA with partial dynamic reconfiguration;
- Predictable execution of computations involving the use of Deep Neural Networks (DNNs);
- Energy-efficient and energy-aware real-time scheduling techniques with the ability to adapt to the on-line conditions of the system.
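As a hedged illustration of energy-aware real-time scheduling, the sketch below selects the lowest processor frequency that keeps a periodic task set schedulable under uniprocessor EDF. It assumes that execution times scale inversely with frequency, which is a simplification (memory-bound code does not scale this way), and it uses hypothetical task parameters and frequency levels. The function name `lowest_feasible_frequency` is introduced here for illustration only.

```python
from fractions import Fraction


def lowest_feasible_frequency(tasks, frequencies):
    """Return the lowest frequency that keeps the task set EDF-schedulable.

    tasks: list of (wcet_at_max_frequency, period) pairs in the same time unit.
    frequencies: available frequency levels (e.g., in MHz).
    Assumption: running at frequency f inflates every WCET by f_max / f.
    Returns None if the task set is not schedulable even at f_max.
    """
    f_max = max(frequencies)
    u_at_f_max = sum(Fraction(c, t) for c, t in tasks)
    for f in sorted(frequencies):
        # Utilization after slowing down from f_max to f.
        scaled_utilization = u_at_f_max * Fraction(f_max, f)
        if scaled_utilization <= 1:
            return f
    return None


# Hypothetical task set (WCETs measured at the highest frequency).
tasks = [(1, 5), (2, 10), (1, 10)]        # total utilization 0.5 at f_max
frequencies = [600, 1000, 1400, 2000]     # assumed levels, in MHz

chosen = lowest_feasible_frequency(tasks, frequencies)
```

An on-line variant would recompute the utilization as tasks finish earlier than their worst case and lower the frequency further, which is where adaptation to run-time conditions comes in.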
Design Methodologies and Tools
Most current and future Cyber-Physical Systems (CPSs) exhibit complex features, including real-time video and sensor-data processing to realize advanced and autonomous control mechanisms, and are characterized by challenging non-functional requirements, including tight timing, reliability, security, and energy-efficiency requirements, among others. Therefore, these systems are deployed on hardware that is growing in complexity and that increasingly includes asymmetric multi-core processing, GPU acceleration, and FPGA off-loading capabilities, which are needed to support the most advanced high-performance embedded computing scenarios of today's CPSs. As a consequence, the complexity of the software engineering and development process for these systems has been growing in recent years, also in light of the ever-increasing number of functions and components deployed on a single Electronic Control Unit (ECU), and of the growing degree of connectivity among different ECUs, which increasingly realize distributed functional systems.
At the same time, the need for predictable, reliable, and secure execution has led to the adoption of model-driven engineering (MDE) approaches. These approaches are capable of capturing critical non-functional requirements from the early design stages of a system or component life-cycle, and of ensuring, also thanks to automatic model transformation and code-generation techniques, that these requirements are properly, formally, and systematically refined and realized in the final implementation, strictly adhering to the design-time specifications. The industry has been relying on MDE-based approaches for a long time in safety-critical application domains such as automotive, railway, and aerospace. However, the recent advancements in multi-core and parallel processing hardware, the relentless adoption of complex platforms in CPSs, the need to support complex features where the exploitation of hardware parallelism is key to success, and the increasing weight of energy efficiency in the overall picture are all factors that are heavily challenging the traditional MDE-based workflows and tools adopted in current industrial practice.
Today, we need more powerful modeling tools, more powerful and complete analysis techniques, more sophisticated optimization strategies, and better adaptation to the conditions that a CPS will face at run-time, including fault-management scenarios. These needs call for new generations of MDE-based modeling tools, like those being investigated in our research at the ReTiS Lab.
Specific research topics within this area include:
- Support for multi-core processing in AUTOSAR and AMALTHEA based designs;
- Accounting for energy-management capabilities of the hardware, like DVFS and big.LITTLE ARM architectures, within the design flow;
- Enrichment of MDE-based approaches with GPU and FPGA accelerated functional blocks;
- Automatic code generation for GPU and FPGA accelerated designs, and identification from the early design stages of possible critical factors that might impair correctness of a design.
Robots, self-driving cars, and automated industrial manufacturing are only a few examples of autonomous systems that are becoming widespread in our lives. These systems have unique requirements and give rise to challenging research problems for their development. The RETIS lab has worked with autonomous systems for a long time.
The main areas of expertise are localization algorithms, simulation environments and frameworks for testing, perception algorithms and architectures for autonomous driving, virtualized environments for AI-powered systems, and computer vision for embedded devices.
Middleware frameworks (e.g., ROS 2, CyberRT, DDS) have also gained considerable attention for the fast prototyping, development, and deployment of autonomous systems. To mention a relevant example, ROS 2 is used by tens of thousands of developers and researchers in both industry and academia, and it is also the mechanism used in the most popular autonomous driving framework, i.e., Autoware, to handle the large number of messages exchanged among the various components through publish/subscribe communication.
The ReTiS Lab has pioneering experience in studying the predictability of middleware frameworks such as ROS 2, in order to analyze the timing behavior of complex applications where the joint scheduling effects of the operating system and the middleware need to be considered.
Specific research topics in this area include:
- Virtualized environments for AI-powered autonomous systems;
- Predictability of middleware for autonomous systems;
- Unmanned Aerial Vehicles;
- Algorithms for autonomous driving;
- Computer vision for embedded devices.
Predictable and Trustworthy AI
The objective of this research is to increase the predictability and the trustworthiness of AI algorithms and deep networks to enable their use in safety-critical cyber-physical systems, such as autonomous vehicles, advanced robots, spacecraft, and medical systems. To be safely deployed, such systems must be certified and must react within given timing constraints imposed by the environment. Unfortunately, current deep learning frameworks are not designed to be used in safety-critical systems and cannot guarantee predictable response times. More details are available on this page.
To solve this problem, the following research is carried out at the ReTiS Lab:
- Safe and secure architectures for AI-powered cyber-physical systems;
- Defense perturbations to detect adversarial examples;
- Coverage analysis for increasing trustworthiness of deep neural networks;
- Predictable support for concurrent deep neural networks on GPU platforms;
- Predictable FPGA acceleration of deep neural networks;
- Lidar odometry and localization through deep learning;
- Explainability of deep neural networks;
- Verification of deep neural networks;
- Accident prediction in autonomous driving;
- Enhancing predictability in inference engines;
- AI for cloud computing and network function virtualization (NFV) infrastructures;
- Improving predictability, safety, and security in the Apollo autonomous driving framework.
The different skills acquired in the ReTiS Lab have been exploited in several application domains where robust real-time techniques and mechanisms can significantly improve performance.
For example, making wireless sensor networks robust with respect to timing constraints has allowed their application to critical scenarios.
The steady growth of low-cost wearable medical devices has reshaped several activities ranging from monitoring to telerehabilitation. Mechanisms for embedded real-time systems can be applied to design reliable and low-cost medical devices, as has been done in some real use cases.
The same view has been applied to bring time-predictable concurrency to the Arduino framework, simplifying its use for more complex applications and educational purposes.
- Arduino Real-Time Framework;
- Wireless Sensor Networks;
- Embedded systems for healthcare.