Vulnerability in a vast number of compilers lets attackers hide malicious code

By: Yuriy Stanislavskiy | 02.11.2021, 12:14

Researchers at the University of Cambridge have published details of a dangerous vulnerability (CVE-2021-42574) that affects almost all modern compilers. Their Trojan Source paper describes an insidious attack that lets attackers hide malicious code in the source code of various programs.

The attack relies on the way compilers handle the Unicode control characters that determine whether text is displayed left to right or right to left. The weakness stems from the Unicode Bidirectional (Bidi) Algorithm, which allows right-to-left and left-to-right text to be used together, so that Arabic and English words, for example, can appear in the same sentence. The algorithm decides the order in which such mixed text is shown on screen.

In some cases the Bidi algorithm alone cannot determine how mixed-direction text should be displayed, and special control characters are used to override it. If a single line combines text with different directions, these control characters can change the order in which the text is displayed without changing the order in which the compiler reads it, so that, for example, text that looks like part of a comment is actually treated as executable code.
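The control characters in question can be located mechanically. The following is a minimal sketch, not a tool from the researchers: it assumes the relevant set is the nine explicit Bidi formatting characters (U+202A to U+202E and U+2066 to U+2069), and the names `BIDI_CONTROLS` and `find_bidi_controls` are hypothetical, chosen for illustration.

```python
import unicodedata

# Assumption: the nine explicit Bidi formatting characters
# (embeddings, overrides, isolates and their terminators).
BIDI_CONTROLS = frozenset(
    "\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069"
)


def find_bidi_controls(source: str):
    """Return (line, column, code point, Unicode name) for each Bidi control."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append(
                    (line_no, col, f"U+{ord(ch):04X}", unicodedata.name(ch))
                )
    return hits


if __name__ == "__main__":
    # Hypothetical input: a comment that contains a right-to-left override.
    sample = "x = 1  # \u202e note\nprint(x)\n"
    for hit in find_bidi_controls(sample):
        print(hit)
```

A check like this could run before code review or in a commit hook. It only reports where the characters occur and does not decide whether a given use is legitimate, for example inside a string that really contains right-to-left text.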

Using this method, an attacker can add a malicious instruction to otherwise normal source code and hide it from anyone viewing the code, for example by making it appear to be part of a neighboring comment. The characters the compiler receives differ from the ones shown on screen and can amount to arbitrary code. The source code looks correct to the reader, but the compiled program behaves differently.

When reviewing such source code, a programmer sees code and comments that arouse no suspicion, but the compiler or interpreter processes the characters in their logical order, so what appears to be an innocent comment becomes additional code in the program. The issue is present in almost all compilers and interpreters, including those for C, C++ (gcc and clang), C#, JavaScript (Node.js), Java (OpenJDK 16), Rust, Go and Python. It also affects various popular code editors, including VS Code, Emacs and Atom, as well as the source code review interfaces of GitHub, GitLab, Bitbucket and all Atlassian products.
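One way a reviewer could see the same character order that the compiler sees is to replace each invisible control character with a visible label. The sketch below is an illustration under stated assumptions, not part of the published research: the `BIDI_LABELS` table and the `show_logical_order` helper are hypothetical names, and the sample line is a made-up C-style comment.

```python
# Assumption: conventional abbreviations for the nine explicit
# Bidi formatting characters.
BIDI_LABELS = {
    "\u202a": "LRE", "\u202b": "RLE", "\u202c": "PDF",
    "\u202d": "LRO", "\u202e": "RLO",
    "\u2066": "LRI", "\u2067": "RLI", "\u2068": "FSI", "\u2069": "PDI",
}


def show_logical_order(line: str) -> str:
    """Replace each Bidi control with a visible <LABEL>, keeping logical order."""
    return "".join(
        f"<{BIDI_LABELS[ch]}>" if ch in BIDI_LABELS else ch for ch in line
    )


if __name__ == "__main__":
    # Hypothetical line: on screen the override can reorder what follows it,
    # but the compiler still reads the characters in this stored order.
    suspicious = (
        "/*\u202e } \u2066if (is_admin)\u2069 \u2066 begin admin-only block */"
    )
    print(show_logical_order(suspicious))
```

Because the labels are plain ASCII, the rendered line no longer depends on the Bidi algorithm, so the text appears in the order in which it is stored and compiled.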

Source: trojansource, zdnet