The Hydras: Improving the C/C++ Development Experience via GCC Static Analysis Plugins

Taras Glek - taras@mozilla.com

https://blog.mozilla.com/tglek

Software Development Stone Age

Any C++ developers in the audience? C hackers? Currently we rely too much on heroic efforts and sheer brute force by developers. Throwing more people at code doesn't help. The general impression is that initially the developer is fully in control of a program, and as the program grows it develops a life of its own and the developer becomes more and more helpless. Code is always growing; our ability to understand the codebase is shrinking. There is no cure for this, but this talk will show how we can forestall the inevitable doom. Little ability to ensure APIs are used correctly. Hard to ensure optimizations are not broken.

Closer Cooperation is the Way Out

Open source tools are in a position to cross-pollinate. Yet in reality there is relatively little [vertical in relation to the diagram] cooperation spanning projects. I'm not sure how other open source projects work, but generally mature projects are treated as black boxes... eerily similar to non-open software. Mozilla's work is a step towards a future with more cooperation.

Static Analysis?

Static analysis tools that we may be familiar with: Coverity, and Sparse for the Linux kernel. A static analysis framework is to code sort of what the DOM is to webpages... Just imagine having to customize webpages by using regexps and string insertions here and there. So once you have a tree-like representation of the source, you can ... (stuff in UL tag)
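To make the DOM analogy concrete, here is a minimal sketch of a check written against a tree rather than against text. The Node type, the helper functions and the "division by literal zero" rule are all invented for illustration; they are not GCC's data structures or any Dehydra/Treehydra API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree: a stand-in for a compiler's AST (hypothetical, not GCC's).
struct Node {
    enum Kind { Number, Variable, Add, Divide };
    Kind kind = Number;
    int value = 0;              // meaningful when kind == Number
    std::string name;           // meaningful when kind == Variable
    std::unique_ptr<Node> lhs;  // operands of Add and Divide
    std::unique_ptr<Node> rhs;
};

std::unique_ptr<Node> number(int v) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Number;
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(const std::string& name) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Variable;
    n->name = name;
    return n;
}

std::unique_ptr<Node> binary(Node::Kind kind,
                             std::unique_ptr<Node> l,
                             std::unique_ptr<Node> r) {
    assert(l && r);  // binary nodes always carry both operands
    auto n = std::make_unique<Node>();
    n->kind = kind;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// The "analysis": walk the tree and count divisions whose divisor is the literal 0.
int countDivisionsByZero(const Node* n) {
    if (!n) return 0;
    int found = 0;
    if (n->kind == Node::Divide && n->rhs &&
        n->rhs->kind == Node::Number && n->rhs->value == 0) {
        found = 1;
    }
    return found + countDivisionsByZero(n->lhs.get())
                 + countDivisionsByZero(n->rhs.get());
}

// Builds the tree for the expression (x / 0) + (y / 2).
std::unique_ptr<Node> sampleTree() {
    return binary(Node::Add,
                  binary(Node::Divide, variable("x"), number(0)),
                  binary(Node::Divide, variable("y"), number(2)));
}
```

The check is a few lines of tree walking and cannot be fooled by whitespace, comments or macros the way a regexp over the source text could be.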

Why GCC?

GCC isn't really a choice. It's more a matter of why one would NOT use GCC for static analysis. It is both THE C++ compiler on open source platforms AND THE ONLY C++ compiler that works. When I started working on analyzing C++ there was a lot of folklore about how abysmal the GCC intermediate forms were. Clang has the potential to become a formidable GCC competitor, but at the moment its C++ frontend isn't complete, so it's not in the running. I started out with Elsa, a from-scratch C++ parser which is well suited for refactoring code, but not so well for analysis. After an initial failed attempt with Elsa, I moved on to GCC and never looked back. The other problem is that any non-GCC C++ frontend will end up in a C++ arms race with G++ as it introduces new features. Unfortunately, when I started, GCC did not support any way of being extended with third-party functionality.

GCC 4.5: Here Come the Plugins

GCC Features

GIMPLE is awesome because it basically allows one to treat C++ as C with a few extra features. It's a great simplified AST for static analysis. GCC attributes are fantastic. Messing with grammars to figure out an annotation scheme isn't trivial (as can be seen with C++0x). So far, GCC attributes have allowed us to annotate anything we want. The 4.5 release is due any day now. Currently Mozilla relies on 4.3 for production; we'll be moving to 4.5 ASAP. Other big 4.5 features: LTO.
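As a rough illustration of annotating code with attributes instead of inventing new grammar, the sketch below uses GCC's built-in warn_unused_result attribute. The MUST_CHECK macro and the parsePort and portOrDefault functions are hypothetical names made up for this example, not Mozilla code.

```cpp
#include <cassert>

// Hypothetical project macro: expands to a real GCC attribute where available.
#if defined(__GNUC__)
#  define MUST_CHECK __attribute__((warn_unused_result))
#else
#  define MUST_CHECK
#endif

// Parses a decimal TCP port. Returns 0 on success and -1 on failure.
// The annotation asks the compiler to warn when a caller ignores the result.
MUST_CHECK int parsePort(const char* text, int* out) {
    if (!text || !*text || !out) return -1;
    int value = 0;
    for (const char* p = text; *p; ++p) {
        if (*p < '0' || *p > '9') return -1;  // reject non-digits
        value = value * 10 + (*p - '0');
        if (value > 65535) return -1;         // out of port range
    }
    *out = value;
    return 0;
}

// Usage: the result is checked, so no warning is issued here.
int portOrDefault(const char* text, int fallback) {
    assert(fallback >= 0 && fallback <= 65535);
    int port = 0;
    if (parsePort(text, &port) != 0) return fallback;
    return port;
}
```

Writing a bare `parsePort("80", &port);` statement instead should make GCC emit an unused-result warning. The annotation rides along on an ordinary declaration, which is what makes attributes attractive as a marker for custom analyses as well.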

The Hydras

Dehydra

Treehydra

Why is a Browser Vendor Hacking Compilers?

Mozilla is Big and Fast Moving

Can't stop programming to do refactoring. The competitive landscape means we are always looking into any potential wins. Tried switching Mozilla to garbage collection, a brand new JS engine, etc. Optimizations are very risky in a mature codebase; safeguarding them with static analysis makes them feasible.

DXR Demo

Before I get into how Mozilla writes analyses, here is a pretty demo. Search for nsJARInputStream. Show clicking on parent and members, and how to jump to implementation search.

Mozilla Analyses

Shadow Variables Demo
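The demo itself is not captured in these notes. As a minimal sketch of the kind of bug a shadow-variable check is after, assume a local variable redeclared in an inner scope (the function names below are made up, and the analysis shown in the talk may target other cases, such as shadowed members):

```cpp
#include <cassert>
#include <vector>

// Buggy: the inner 'total' shadows the outer one, so the outer sum never changes.
int sumShadowed(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        int total = 0;  // shadows the outer 'total'
        total += v;     // updates only the inner copy
        (void)total;
    }
    return total;       // always 0
}

// Fixed: a single 'total' is updated on every iteration.
int sumFixed(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// Usage: same input, different answers.
void demoShadowing() {
    assert(sumShadowed({1, 2, 3}) == 0);
    assert(sumFixed({1, 2, 3}) == 6);
}
```

Both functions compile cleanly by default. GCC's -Wshadow option flags the inner declaration, and a plugin-based analysis can enforce the same kind of rule across a whole tree.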

Future

Thank you