Predictable and Trustworthy AI

The objective of this research is to enable the use of AI algorithms and deep networks in safety-critical cyber-physical systems, such as autonomous vehicles, advanced robots, spacecraft, and medical systems. To be safely deployed, such systems must be certified and must react within the timing constraints imposed by the environment. Unfortunately, current deep learning frameworks are not designed for safety-critical systems and cannot guarantee predictable response times. To address this problem, the following research is carried out at the RETIS Lab:

Safe and secure architectures for AI-powered cyber-physical systems

This work leverages hypervisor technology to integrate multiple components with different criticality and safety requirements into a single computing platform. In this way, it is possible to execute a high-performance computing domain (hosting replicas of neural controllers) under the Linux operating system together with a safe, certifiable computing domain (hosting safety-critical components) under a real-time operating system. In such an architecture, the hypervisor ensures strong time and memory isolation among the different domains, guaranteeing security and real-time properties. To ensure safety, the critical domain must continuously monitor the machine learning modules to promptly detect possible unreliable outputs that could jeopardize the whole system, switching to a simpler backup controller able to bring the system to a safe state. Developing a safety monitor that evaluates the reliability of a deep neural network in real time is also a key research topic of the RETIS Lab.
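The sketch below illustrates the kind of simplex-style switching logic described above. It is only a minimal illustration: the `confidence()` metric, the `neural_controller`, `backup_controller`, and `plant` interfaces are hypothetical placeholders, not the actual RETIS implementation, and a real deployment would enforce periodicity with an RTOS rather than `time.sleep`.

```python
import time

CONFIDENCE_THRESHOLD = 0.8   # assumed acceptance threshold for the monitor
PERIOD_S = 0.01              # assumed 10 ms control period

def control_loop(plant, neural_controller, backup_controller, monitor):
    """Simplex-style loop: use the neural controller while the monitor
    trusts its output, otherwise fall back to the certified backup."""
    while True:
        start = time.monotonic()
        state = plant.read_sensors()

        command = neural_controller.compute(state)
        confidence = monitor.confidence(state, command)  # hypothetical metric

        if confidence < CONFIDENCE_THRESHOLD:
            # Output judged unreliable: switch to the simple, certifiable
            # controller that drives the system toward a safe state.
            command = backup_controller.compute(state)

        plant.apply(command)

        # Keep the loop periodic (a real system would rely on the RTOS for this).
        time.sleep(max(0.0, PERIOD_S - (time.monotonic() - start)))
```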

Defense methods against adversarial attacks

This research is aimed at developing methodologies for enhancing the security of machine learning algorithms, which have been shown to be quite sensitive to adversarial attacks, that is, malicious perturbations applied to inputs or to objects in the environment that can induce erroneous behaviors in neural networks.
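To make the notion of adversarial perturbation concrete, the following sketch shows a classic attack, the Fast Gradient Sign Method (FGSM), applied to a generic Keras classifier assumed to output logits. It is shown purely for illustration and is not one of the defenses developed at the lab.

```python
import tensorflow as tf

def fgsm_perturbation(model, image, label, eps):
    """Fast Gradient Sign Method: a small perturbation aligned with the sign
    of the loss gradient, often enough to flip the prediction of an
    undefended classifier."""
    image = tf.convert_to_tensor(image[None, ...])   # add batch dimension
    label = tf.convert_to_tensor([label])
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    with tf.GradientTape() as tape:
        tape.watch(image)
        loss = loss_fn(label, model(image))
    # Adversarial example: original image plus an eps-bounded signed gradient.
    return image + eps * tf.sign(tape.gradient(loss, image))
```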

Several effective methods have been developed at the RETIS Lab to detect adversarial attacks. One approach exploits the high sensitivity of adversarial inputs to specific transformations. Another method is based on a coverage analysis and consists of monitoring the neural activations of the internal layers of a neural network, comparing them with a reference behavior, a sort of “signature” acquired offline from a trusted dataset. This method makes it possible to compute a confidence value used to judge the trustworthiness of the network's prediction. Different coverage analysis methods have been evaluated and tested using multiple detection logics. A new method has also been developed for detecting and masking malicious perturbations applied in the physical world to fool neural models for image segmentation.
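As an illustration of the coverage-based idea, the sketch below records per-neuron activation ranges on a trusted dataset as an offline “signature” and then scores how many activations of a new input fall within those ranges. The function names are made up, and the actual coverage criteria and detection logics studied at RETIS are more sophisticated.

```python
import numpy as np

def build_signature(trusted_activations):
    """Offline phase: trusted_activations is an (N, D) array of hidden-layer
    activations collected on a trusted dataset. The 'signature' is the
    per-neuron [min, max] range observed there."""
    return trusted_activations.min(axis=0), trusted_activations.max(axis=0)

def confidence(signature, activations):
    """Online phase: fraction of neurons whose activation stays inside the
    reference range; a low value suggests an anomalous (possibly adversarial)
    input."""
    low, high = signature
    inside = (activations >= low) & (activations <= high)
    return inside.mean()

# Example with random data standing in for real activations.
rng = np.random.default_rng(0)
sig = build_signature(rng.normal(size=(1000, 256)))
print(confidence(sig, rng.normal(size=256)))           # roughly in-distribution
print(confidence(sig, rng.normal(loc=5.0, size=256)))  # clearly out of range
```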

Finally, more fundamental research is being conducted to design new types of neural networks for which a certifiable robustness can be provided against adversarial attacks.
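As a concrete example of what a robustness certificate can look like (not necessarily the approach pursued at RETIS), the sketch below propagates interval bounds through a tiny affine-ReLU-affine network: if the lower bound of the correct class exceeds the upper bounds of all other classes for every input within an ℓ∞ ball of radius ε, the prediction is provably unchanged by any perturbation in that ball.

```python
import numpy as np

def interval_affine(l, u, W, b):
    """Propagate an axis-aligned box [l, u] through x -> W x + b."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ l + W_neg @ u + b, W_pos @ u + W_neg @ l + b

def certified(x, eps, W1, b1, W2, b2, label):
    """True if no l_inf perturbation of radius eps can change the predicted
    class of the network W2 @ relu(W1 @ x + b1) + b2."""
    l, u = interval_affine(x - eps, x + eps, W1, b1)
    l, u = np.maximum(l, 0), np.maximum(u, 0)   # ReLU is monotone
    l, u = interval_affine(l, u, W2, b2)
    others = np.delete(u, label)
    return l[label] > others.max()
```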

Predictable acceleration of deep neural networks

Time-predictable acceleration of neural network inference is crucial for the development of safety-critical autonomous systems such as self-driving vehicles, robots, satellites, and space probes for planetary exploration. Unfortunately, the computing devices used today to accelerate neural computations cannot provide predictable timing behavior when executing multiple neural models. In fact, highly variable delays can be introduced during execution due to different types of interference among various micro-architecture components.

The RETIS Lab developed a set of efficient methodologies to bound such delays in different heterogeneous architectures that integrate general-purpose GPUs (GPGPUs), FPGAs, and multi-core processors of different types.

In particular, FPGAs can accelerate computations with a more predictable timing behavior and much lower energy consumption than GPGPUs. In addition, dynamic partial reconfiguration can be exploited to reprogram parts of the FPGA area while the other parts are running. Such a feature has been exploited to implement a virtual FPGA that can execute a larger number of hardware accelerators sharing the same fabric, thanks to a time-sharing mechanism similar to the one used to implement virtual memory. In this way, multiple neural networks can run concurrently on the same FPGA, greatly reducing response times with respect to a software implementation. Also, an automated framework allows optimized neural accelerators to be synthesized under given timing and resource constraints.
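The sketch below conveys the virtual-FPGA idea in simplified form: accelerators are “paged in” to a reconfigurable region only when they must run, in the spirit of virtual memory. The `fabric.reconfigure()` and `fabric.run()` calls are hypothetical placeholders for the partial-reconfiguration and accelerator-invocation primitives; the actual scheduling policy and framework are not shown here.

```python
from collections import deque

class VirtualFPGA:
    """Illustrative time-sharing of one reconfigurable region among several
    hardware accelerators."""

    def __init__(self, fabric, bitstreams):
        self.fabric = fabric            # hypothetical handle to the region
        self.bitstreams = bitstreams    # accelerator name -> partial bitstream
        self.loaded = None
        self.ready = deque()

    def submit(self, accel, job):
        self.ready.append((accel, job))

    def schedule(self):
        """Serve pending jobs in order; reconfiguration only happens when the
        next job needs an accelerator that is not currently loaded."""
        while self.ready:
            accel, job = self.ready.popleft()
            if self.loaded != accel:
                self.fabric.reconfigure(self.bitstreams[accel])  # hypothetical
                self.loaded = accel
            self.fabric.run(job)                                 # hypothetical
```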

In another work, the FPGA is used to run multiple instances of Xilinx's deep learning processing unit (DPU) to accelerate multiple neural networks for real-time object detection and tracking. This approach was effectively tested on a drone equipped with a camera and two LiDARs to perform real-time tracking of multiple persons at 30 fps, running the autopilot and a YOLOv8 network for object detection on a Xilinx ZCU102 board.
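A simplified sketch of how camera frames can be dispatched to multiple accelerator instances is shown below, so that several detections are in flight while the camera keeps producing frames at 30 fps. The `detect` callables stand in for invocations of the detection network on individual DPU instances and are hypothetical; the actual runtime used on the board is not shown.

```python
import queue
import threading

FRAME_QUEUE = queue.Queue(maxsize=8)   # frames waiting for an accelerator

def dpu_worker(detect, results):
    """Each worker owns one accelerator instance and pulls frames to process."""
    while True:
        frame_id, frame = FRAME_QUEUE.get()
        if frame is None:                         # shutdown marker
            break
        results.put((frame_id, detect(frame)))    # detect() is hypothetical

def start_pipeline(detectors, results):
    """Start one worker thread per available accelerator instance."""
    workers = [threading.Thread(target=dpu_worker, args=(d, results), daemon=True)
               for d in detectors]
    for w in workers:
        w.start()
    return workers
```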

Explainability of deep neural networks

The high performance of deep neural networks comes at a price: these systems are highly complex, and their outputs cannot easily be interpreted and, hence, trusted by humans. Such difficulty in providing a clear explanation of their behavior makes AI inapplicable in areas where explanations are necessary for legal, safety, or security reasons. This work investigates different methodologies for building a clear graphical explanation of the results generated by a deep neural network. This research also investigates how to exploit the generated explanations for automatically detecting possible biases present in the training set and possible unsafe inputs, such as adversarial examples or out-of-distribution samples.
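One of the simplest graphical explanations is a gradient-based saliency map, sketched below for a generic Keras image classifier. It is only one of many explanation techniques and is shown purely as an illustration, not as the method developed at the lab.

```python
import tensorflow as tf

def saliency_map(model, image, class_index):
    """Vanilla gradient saliency: how much each input pixel influences the
    score of the class of interest."""
    image = tf.convert_to_tensor(image[None, ...])   # add batch dimension
    with tf.GradientTape() as tape:
        tape.watch(image)
        score = model(image)[0, class_index]
    grads = tape.gradient(score, image)[0]
    return tf.reduce_max(tf.abs(grads), axis=-1)     # per-pixel importance
```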

Enhancing predictability in inference engines

The native scheduler used by popular inference engines (e.g., the one employed by TensorFlow) to run deep neural networks on multicore platforms does not take timing into account, since it is designed to optimize average-case rather than worst-case performance. Therefore, it can introduce long and unpredictable delays, making it unsuitable for safety-critical applications. This work aims at enhancing predictability by acting on the node scheduler to introduce mechanisms designed to handle neural-network-specific workloads.
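The sketch below gives a flavor of what a timing-aware node scheduler can look like: operators of the network graph are dispatched in topological order, always picking the ready node with the highest priority (e.g., derived from deadlines) instead of an average-case heuristic. The data structures are made up for illustration and do not reflect the actual modifications applied to any specific engine.

```python
import heapq

def schedule_nodes(nodes, deps, priority):
    """Dispatch an operator graph in priority order (lower value runs first).

    nodes:    iterable of node ids
    deps:     dict node -> set of predecessor nodes
    priority: dict node -> numeric priority (e.g., deadline-derived)
    Returns the resulting execution order."""
    remaining = {n: set(deps.get(n, ())) for n in nodes}
    ready = [(priority[n], n) for n, preds in remaining.items() if not preds]
    heapq.heapify(ready)
    order = []
    while ready:
        _, node = heapq.heappop(ready)
        order.append(node)
        # Release successors whose last unmet dependency was this node.
        for succ, preds in remaining.items():
            if node in preds:
                preds.discard(node)
                if not preds:
                    heapq.heappush(ready, (priority[succ], succ))
    return order
```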

AI for cloud computing and network function virtualization (NFV) infrastructures

Cyber-physical systems are becoming increasingly interconnected, and low-latency, high-reliability connectivity is a hot topic in networking, for example in 5G scenarios. In this context, adaptive AI-based techniques are becoming more and more important to support communications in distributed cyber-physical systems. This work investigates techniques based on artificial intelligence and machine learning to analyze the massive amount of data coming from the monitoring system of a cloud/NFV infrastructure for purposes related to supporting operations, performance troubleshooting, root-cause analysis, workload prediction, and capacity planning.
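As a toy example of workload prediction from monitoring data, the sketch below fits a simple least-squares autoregressive model to a monitored metric (e.g., the CPU load of a virtual network function) and predicts the next sample. The techniques actually investigated for these infrastructures are more advanced; this only illustrates the kind of analysis involved.

```python
import numpy as np

def fit_autoregressive(series, window):
    """Fit a least-squares autoregressive model: each sample is predicted
    from the previous `window` samples of the monitored metric."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_next(series, coeffs):
    """One-step-ahead workload prediction from the most recent samples."""
    window = len(coeffs)
    return float(np.dot(series[-window:], coeffs))
```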

Improving predictability, safety, and security in the Apollo autonomous driving framework

Modern frameworks for autonomous driving include several functionalities that need to run in a predictable, safe, and secure manner. The Apollo open-source framework for autonomous driving consists of multiple modules, each taking care of a specific task, e.g., control, planning, and perception. Since Apollo needs to interact with sensors and devices (such as GPUs) whose drivers and software stacks may not be available on a real-time operating system, it runs on Linux, a feature-rich operating system that, however, is exposed to safety hazards and cyber-attacks and is therefore not suitable for certifying the most safety-critical components, e.g., control and actuation. This work aims to improve Apollo's safety and security features by using a hypervisor to create two virtual machines that share the same physical platform: a Linux-based virtual machine (Linux-VM) and a virtual machine running a real-time operating system (RTOS-VM). In this way, the Linux-VM runs the perception-related components requiring a tight interaction with sensors and hardware accelerators, while the RTOS-VM is in charge of handling the most safety-critical activities. Furthermore, a more predictable acceleration of Apollo's deep neural networks is provided by using FPGAs instead of GPUs.
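The sketch below, written in Python only for readability, illustrates the kind of deadline watchdog the RTOS-VM could run on the commands produced by the Linux-VM: commands are accepted only if they arrive in time, otherwise a certified fallback takes over. The `channel`, `actuators`, and `backup_controller` objects are hypothetical placeholders; the real safety-critical code would be implemented on the RTOS.

```python
import time

DEADLINE_S = 0.05   # assumed end-to-end deadline for a fresh command

def watchdog(channel, actuators, backup_controller):
    """Accept commands from the Linux-VM only if they arrive in time;
    otherwise engage the backup controller hosted in the RTOS-VM."""
    last_update = time.monotonic()
    while True:
        command = channel.poll()            # hypothetical inter-VM channel
        now = time.monotonic()
        if command is not None:
            last_update = now
            actuators.apply(command)
        elif now - last_update > DEADLINE_S:
            # The perception/planning pipeline in the Linux-VM is late or
            # compromised: take over with the certified fallback.
            actuators.apply(backup_controller.safe_command())
        time.sleep(0.001)                   # illustrative polling period
```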