Google develops SiliFuzz project for mass detection of hidden CPU defects

By: Yuriy Stanislavskiy | 20.10.2021, 11:50

Google is working hard to proactively detect software defects in key open-source projects. It has now emerged that the company is also developing SiliFuzz, a system for detecting defects in processors.

What is it

The principle of SiliFuzz is to check that a processor works correctly by running pre-prepared test data collected through emulators. It is a kind of fuzzing: the processor is loaded with "random" calculations, and the result is checked at the output. If there is a discrepancy, the processor is considered faulty.
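The check described above can be illustrated with a minimal Python sketch. It is not SiliFuzz code: the `Snapshot` type, the 64-bit multiply used as the "calculation", and the simulated `faulty_unit` are all hypothetical stand-ins, chosen only to show the idea of comparing a unit's output against a result recorded in advance on a trusted reference.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MASK64 = (1 << 64) - 1  # keep results in 64-bit range, like a CPU register


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical test case: inputs plus the result recorded on a reference."""
    name: str
    operands: Tuple[int, int]
    expected: int


def reference_multiply(a: int, b: int) -> int:
    """Stand-in for the emulator that produces the expected results."""
    return (a * b) & MASK64


def make_corpus(pairs: List[Tuple[int, int]]) -> List[Snapshot]:
    """Pre-compute expected results once, before any hardware is tested."""
    return [
        Snapshot(f"mul_{i}", (a, b), reference_multiply(a, b))
        for i, (a, b) in enumerate(pairs)
    ]


def check_unit(execute: Callable[[int, int], int],
               corpus: List[Snapshot]) -> List[str]:
    """Return the names of snapshots whose actual result differs from expected."""
    return [s.name for s in corpus if execute(*s.operands) != s.expected]


def healthy_unit(a: int, b: int) -> int:
    """Simulated defect-free execution unit."""
    return (a * b) & MASK64


def faulty_unit(a: int, b: int) -> int:
    """Simulated defect (invented for illustration): the lowest bit of the
    result is flipped whenever the first operand is a multiple of 7."""
    result = (a * b) & MASK64
    return result ^ 1 if a % 7 == 0 else result


corpus = make_corpus([(3, 5), (14, 2), (2**40, 2**30), (21, 9)])
healthy_failures = check_unit(healthy_unit, corpus)
faulty_failures = check_unit(faulty_unit, corpus)
```

In this sketch the healthy unit passes every snapshot, while the simulated defect is caught only by the snapshots that happen to trigger it, which is why a large and varied corpus matters.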

Why it's needed

The system is primarily designed to detect electrical defects in microchips that can arise during manufacturing, assembly, operation, and so on. The focus is on these defects rather than on logic errors in the processor design itself. The tests also do not rely on any low-level debugging mechanisms, which allows them to be run on "live" systems.

In fact, the developers' task is to create a system capable of regularly testing every core of every Google server with minimal impact on its performance. In its current form, SiliFuzz selects a point in time when the load on a particular machine is not too heavy, and sequentially tests groups of four threads (2 cores with SMT) in no more than two minutes. At present, the developers are focusing on x86-64 processors, which are widely used by Google itself.
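The scheduling idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `load_threshold` value, and the reading of the two-minute limit as a per-group budget are not taken from SiliFuzz; only the group size of four threads and the two-minute figure come from the description above.

```python
from typing import List, Tuple


def thread_groups(thread_ids: List[int], group_size: int = 4) -> List[List[int]]:
    """Split hardware threads into consecutive groups (4 threads = 2 SMT cores)."""
    return [
        thread_ids[i:i + group_size]
        for i in range(0, len(thread_ids), group_size)
    ]


def plan_tests(thread_ids: List[int],
               machine_load: float,
               load_threshold: float = 0.3,   # hypothetical cut-off for "not too heavy"
               budget_seconds: int = 120      # assumed per-group limit of two minutes
               ) -> List[Tuple[List[int], int]]:
    """Return (group, time budget) pairs to test one after another,
    or an empty plan if the machine is currently too busy."""
    if machine_load > load_threshold:
        return []
    return [(group, budget_seconds) for group in thread_groups(thread_ids)]


idle_plan = plan_tests(list(range(8)), machine_load=0.1)
busy_plan = plan_tests(list(range(8)), machine_load=0.9)
```

With eight hardware threads and a lightly loaded machine, the plan contains two groups of four threads, each with a 120-second budget; on a busy machine the plan is empty and testing is postponed.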

The main aim of the project is to automate the detection of hidden defects that lead to miscalculations. These are much more dangerous than simple crashes and failures, since even small deviations in a chip's operation can lead to an accumulation of errors. In some cases the difference was less than 0.0000003%, but this could be enough to cause serious problems.

How effective it is

About 45% of the defects detected by SiliFuzz are not caught by other tools. In the future, the developers plan to expand SiliFuzz, increase its speed, and generally improve its quality.

Source: Phoronix, GitHub

Illustration: Laura Ockel on Unsplash