Coverity Origins, Static Analysis & DHS
October 2009
David Maxwell, Coverity's Open Source Strategist
For the Stanford Open Source Group
Agenda
• The Origins of Coverity (Stanford!)
• Static Analysis
• The Open Source Hardening Project
• What about my project?
• Open Source Report 2009
• Architectural Analysis
• Summary
$60,000,000,000
$10,500
The Origins of Coverity
• Stanford Professor
  – Dawson Engler
  – Graduate Students
    • Ben Chelf (CTO)
    • Andy Chou (Chief Scientist)
    • Seth Hallem (CEO)
    • Dave Park
The Origins of Coverity
• Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code (2001)
• Programmers' beliefs about the program state affect the code they write
• B = *A;
  – Means the programmer believes A is a valid pointer
• if (A) { …
  – Means the programmer believes A is only sometimes valid at this point in the code

Contradictions
• If these lines occur together: B = *A; if (A) { …
• There's a contradiction, since A can't be both 'always valid' and 'sometimes invalid' at the same point in the code
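
The contradiction check can be sketched as a toy scanner over straight-line statements. This is only an illustration of the idea, not Coverity's implementation: the function name `find_contradictions` and the three regular expressions are hypothetical, and the sketch treats any `*name` as a dereference (so multiplication would confuse it) and ignores branching.

```python
import re

# Hypothetical patterns for the statement shapes shown on the slide.
DEREF = re.compile(r"\*\s*([A-Za-z_]\w*)")                        # e.g. "B = *A;"
NULL_CHECK = re.compile(r"\bif\s*\(\s*!?\s*([A-Za-z_]\w*)\s*\)")  # e.g. "if (A) {"
ASSIGN = re.compile(r"^\s*([A-Za-z_]\w*)\s*=[^=]")                # e.g. "A = next;"


def find_contradictions(lines):
    """Return (deref_line, check_line, pointer) for each check that follows a dereference."""
    believed_valid = {}  # pointer name -> line number where it was dereferenced
    findings = []
    for number, text in enumerate(lines, start=1):
        check = NULL_CHECK.search(text)
        if check:
            name = check.group(1)
            # A check implies "may be invalid"; an earlier dereference implied "valid".
            if name in believed_valid:
                findings.append((believed_valid[name], number, name))
            continue
        # A dereference records the belief that the pointer is valid.
        for name in DEREF.findall(text):
            believed_valid[name] = number
        # A new assignment resets whatever was believed about the target.
        assign = ASSIGN.match(text)
        if assign:
            believed_valid.pop(assign.group(1), None)
    return findings


print(find_contradictions(["B = *A;", "if (A) {"]))  # [(1, 2, 'A')]
```

Reassigning the pointer between the two statements (for example `A = next;`) clears the recorded belief, so no contradiction is reported in that case.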
Static Analysis
• Static Analysis includes:
  – Path Simulation
    • Code is not a single linear sequence of instructions
  – Data Flow Analysis
    • Values of one variable affect values of others
  – False Path Pruning
    • Some paths cannot occur at runtime. Reporting errors on those paths is a distraction
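
A minimal sketch of path simulation and false path pruning, under simplifying assumptions: the function is modeled as a list of branch conditions (strings), the helper names `enumerate_paths` and `is_feasible` are hypothetical, and the variables in a condition are assumed not to change between branches. Data flow analysis is not modeled here.

```python
from itertools import product


def enumerate_paths(branch_conditions):
    """Path simulation: each branch can go either way, so n branches give 2**n paths."""
    outcomes = product([True, False], repeat=len(branch_conditions))
    return [list(zip(branch_conditions, choice)) for choice in outcomes]


def is_feasible(path):
    """False path pruning: reject a path that needs one condition to be both true and false.

    Assumes the variables in a condition are not modified between the branches.
    """
    seen = {}
    for condition, outcome in path:
        if seen.setdefault(condition, outcome) != outcome:
            return False
    return True


branches = ["p == NULL", "p == NULL"]  # two ifs that test the same condition
all_paths = enumerate_paths(branches)
feasible = [path for path in all_paths if is_feasible(path)]
print(len(all_paths), "paths simulated,", len(feasible), "feasible")
# 4 paths simulated, 2 feasible
```

Any defect that only appears on one of the two pruned paths would never occur at runtime, which is why reporting it would be a distraction.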
Open Source Hardening Project
• DHS contract awarded to Stanford, Coverity, and Symantec
• 3 years, total of $300,000
• Research automated detection of software vulnerabilities
• Prove value of technique
• Harden Open Source
• Validate findings from a security-centric point of view
Coverity Scan Site
• Created by U.S. Department of Homeland Security
• Part of ‘Open Source Hardening Project’
• Coverity Prevent is the exclusive static analysis tool
• Now contains over 250 open source packages
20
> 11,200
Software Tools
• Version Control
• Bug Trackers
• Debuggers
What about my project?
• Eligibility guidelines are available on the Scan site
  – http://scan.coverity.com/devfaq.html
• Essentially, non-commercial open source is automatically eligible
Self-Builds
• Coverity's analysis requires that code be compiled
  – Coverity has been managing builds for all Open Source projects in the Scan
    • Changing version control systems
    • Changing library dependencies
    • Changing compiler dependencies
    • Changing environment dependencies
• Creates a bottleneck on Scan staff time
• Released to current Scan projects in Nov 2008
  – Projects can now do their own builds and submit them for analysis
Self-Builds
http://scan.coverity.com/self-build/
Report on Open Source Software 2009
Let's reconsider some common beliefs about good coding practices... by looking at a lot of code, and a lot of bugs
Original Research
• 60 million LOC
• 250 open source projects
• 26,181 analysis runs
• Over 11 billion LOC analyzed
Overall Project Progress
Frequency of Defects (2008)

Defect Type                               # of Defects   Percentage
NULL Pointer Dereference                         6,448       27.95%
Resource Leak                                    5,852       25.37%
Unintentional Ignored Expressions                2,252        9.76%
Use Before Test (NULL)                           1,867        8.09%
Buffer Overrun (statically allocated)            1,417        6.14%
Use After Free                                   1,491        6.46%
Unsafe use of Returned NULL                      1,349        5.85%
Uninitialized Values Read                        1,268        5.50%
Unsafe use of Returned Negative                    859        3.72%
Type and Allocation Size Mismatch                  144        0.62%
Buffer Overrun (dynamically allocated)              72        0.31%
Use Before Test (negative)                          49        0.21%
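
As a consistency check, the percentage column can be recomputed from the counts. The sketch below assumes each percentage is that defect type's share of the total across the twelve listed types; the counts are copied from the table, and the variable names are illustrative.

```python
# Defect counts copied from the "Frequency of Defects (2008)" table.
counts = {
    "NULL Pointer Dereference": 6448,
    "Resource Leak": 5852,
    "Unintentional Ignored Expressions": 2252,
    "Use Before Test (NULL)": 1867,
    "Buffer Overrun (statically allocated)": 1417,
    "Use After Free": 1491,
    "Unsafe use of Returned NULL": 1349,
    "Uninitialized Values Read": 1268,
    "Unsafe use of Returned Negative": 859,
    "Type and Allocation Size Mismatch": 144,
    "Buffer Overrun (dynamically allocated)": 72,
    "Use Before Test (negative)": 49,
}

# Assumption: percentages are shares of the total over the listed types only.
total = sum(counts.values())
shares = {name: 100 * count / total for name, count in counts.items()}

for name, share in shares.items():
    print(f"{name:<40}{counts[name]:>8,}{share:>9.2f}%")
print(f"{'Total':<40}{total:>8,}")
```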
Cyclomatic Complexity/Lines of Code
Architectural Analysis
• Data about high-level architecture of code, not low-level code defects
• Collected by the same analysis mechanisms
Q&A
• Questions?
• http://scan.coverity.com/
• http://scan.coverity.com/report/
• http://scan.coverity.com/arch/
David Maxwell
Open Source Strategist
[email protected]