Open Source Security Report

From:  "Martha de Monclin" <martha-AT-pageonepr.com>
To:  <Undisclosed-Recipient:;>
Subject:  Open Source Security Report
Date:  Tue, 20 May 2008 14:19:32 +0100
Message-ID:  <061701c8ba7c$f42055d0$4001a8c0@yourf14ac45099>

 
Open Source Software Continually Improving According to
Research from Coverity™ Joint Venture with
U.S. Department of Homeland Security

New Scan Report on Open Source Software 2008 Shows 16% Reduction in Static
Analysis Defect Density Across 250 Popular Open Source Projects Over 2 Year
Period

Researchers Uncover New Information Regarding Defect Density,
Code Base Size and Other Indices of Code Complexity

 

SAN FRANCISCO - May 20, 2008 - Coverity™, Inc., the leader in improving
software quality and security, today announced the availability of the Scan
Report on Open Source Software 2008. The Coverity Scan site was developed
with support from the U.S. Department of Homeland Security as part of the
federal government's 'Open Source Hardening Project.' The report is based on
2 years of analysis of more than 55 million lines of code on a recurring
basis from over 250 popular open source projects with Coverity Prevent™, the
industry-leading static source code analysis solution.

 

"The continued improvement of projects that already possess strong code
quality and security underscores the commitment of open source developers to
create software of the highest integrity," said David Maxwell, open source
strategist for Coverity. "Working with the open source community over the
past two years has been an exceptional opportunity for researchers at both
the Scan site and Coverity. Based on preliminary feedback from preview
readers, the report contains thought provoking information about defect
density and code complexity and provides a strong foundation for future
research on the nature of software."

 

Open source projects analyzed at the Scan site include some of the world's
most widely used applications, including the Apache web server and the Linux
operating system. Source code analysis from the Scan site is freely available
to qualified open source projects at: http://scan.coverity.com

 

"Close collaboration between Coverity and the FreeBSD Project over three
years has been both exciting and remarkably valuable," said Robert Watson,
FreeBSD Foundation president. "Coverity has had a positive impact on the
correctness of our source code and has helped improve our software
development methodology."

 

The breadth and volume of analysis data presented in the Scan Report on Open
Source Software 2008 are unlike any other collection of code analysis data in
existence, representing 14,238 individual project analysis runs for a total
of nearly 10 billion lines of code analyzed over 2 years.

 

The report also draws conclusions that may apply equally to open source and
commercial software regarding the relationship between variables such as code
base size, defect density, function length, Cyclomatic complexity and
Halstead effort. In summary, the Scan Report on Open Source Software 2008
contains the following findings:

  a. The quality and security of open source software is improving -
Researchers at the Scan site observed a 16% reduction in static analysis
defect density over the last 2 years, which reflects the elimination of more
than 8,500 individual defects
  b. Prevalence of specific defect types - The report shows a clear
distinction between the frequencies of defect types across the scan database.
'NULL pointer dereference' was the most common defect while 'Use before test
of negative values' was the least common defect
  c. Average project function length and static analysis defect density -
Data in the report contradicts conventional wisdom, indicating that projects
with large average function length are not prone to higher defect densities
  d. Cyclomatic complexity and Halstead effort - Research indicates these
two measures of code complexity are significantly correlated to code base
size
  e. False positive results - The average rate of false positives identified
by open source developers on the Scan site is below 14%
 

Detailed data and analysis of these and other findings are available in the
complete Scan Report on Open Source Software 2008, which is freely available
for download in the research library at www.coverity.com

 

"The use of open-source technologies to enhance and evolve commercial
products has become a common strategy. Vendors will continue to leverage this
movement by embedding open source into products, while end-user organizations
will use stable open-source projects as a competitive differentiator against
companies that refuse to acknowledge that open source is now
enterprise-ready. By 2012, 80% or more of all commercial software will
include elements of open-source technology," according to analyst Mark Driver
in his recent Gartner report 'Open Source in Vendor Business Strategies,
2008,' published March 31, 2008.

 

Results of the Scan Report on Open Source Software 2008 will also be
discussed during a complimentary webinar on Wednesday, May 21, 2008 by David
Maxwell, Coverity's open source strategist. Registration is available at:
http://w.on24.com/r.htm?e=107874&s=1&k=41E3686F9B...

 

About the Scan site
The Scan site was developed by Coverity with support from the U.S. Department
of Homeland Security as part of the federal government's 'Open Source Code
Hardening Project'. The site divides open source projects into rungs based on
the progress each project makes in resolving defects. Projects at higher
rungs receive access to additional analysis capabilities and configuration
options. Projects are promoted as they resolve the majority of defects
identified at their current rung.

 

About Coverity

Coverity (www.coverity.com), the leader in improving software quality and
security, is a privately held company headquartered in San Francisco.
Coverity's groundbreaking technology enables developers to control complexity
in the development process by automatically finding and helping to repair
critical software defects and security vulnerabilities throughout the
application lifecycle. More than 450 leading companies including ARM,
Philips, RIM, Rockwell-Collins, Samsung and UBS rely on Coverity to help
them ensure the delivery of superior software.

 

###

 

Coverity is a registered trademark, and Coverity Extend and Coverity Prevent
are trademarks of Coverity, Inc. All other company and product names are the
property of their respective owners.

 


Open Source Security Report

Posted May 20, 2008 21:47 UTC (Tue) by smoogen (subscriber, #97)

I wonder if they will/could develop a test for "not enough entropy" in security code? Mainly
to find out what other problems are lurking in packages.

Open Source Security Report

Posted May 20, 2008 21:56 UTC (Tue) by orospakr (guest, #40684)

I suspect that some kind of unit/regression testing in OpenSSL would have 
helped avoid the recent entropy snafu.

Testing

Posted May 21, 2008 4:16 UTC (Wed) by eru (subscriber, #2753)

It is pretty difficult to create an automatic regression test for a module which by its very specification must produce as random results as possible! I guess it could be done, but this probably involves running it millions of times and performing a statistical analysis of the results.

It's not help...

Posted May 21, 2008 7:29 UTC (Wed) by khim (guest, #9252)

The module was designed to be good PRNG too. So statistical analysis will fail. You need to restart it many times and then calculate correlation over million FIRST results. Doable, but SLOOOW - and can be circumvented if someone left time as seed of PRNG...

It's not help...

Posted May 21, 2008 8:27 UTC (Wed) by mv (subscriber, #17258)

> The module was designed to be good PRNG too. So statistical analysis will fail. You need to restart it many times and then calculate correlation over million FIRST results. Doable, but SLOOOW - and can be circumvented if someone left time as seed of PRNG...

The output is deterministic though, given the right input.

You could run an automated test with e.g. an LD_PRELOAD wrapper that provides replacements for getpid(), time(), gettimeofday() which return static values. Then feed the PRNG known data using RAND_seed/RAND_add and finally check the output.

That could have caught the missing use of the data provided to RAND_add in the Debian OpenSSL case.
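
A minimal sketch of the deterministic test described above, written in Python against a toy pool rather than OpenSSL's actual code. The class `ToyPool`, its `add` method (standing in for RAND_add) and the `mix_added_data` switch are hypothetical names introduced only for illustration. Passing `pid` and `now` explicitly plays the role of the LD_PRELOAD stubs for getpid() and time(). As a variation, instead of comparing against a stored expected output, the check asserts that two different seeds give different output.

```python
import hashlib


class ToyPool:
    """Toy stand-in for an entropy pool. This is NOT OpenSSL's algorithm."""

    def __init__(self, pid, now, mix_added_data=True):
        # pid and now are supplied by the caller, which models stubbing
        # getpid() and time() so that they return static values.
        self.state = hashlib.sha256(f"{pid}:{now}".encode()).digest()
        self.mix_added_data = mix_added_data

    def add(self, data):
        # Hypothetical analogue of RAND_add: stir caller-supplied bytes in.
        if self.mix_added_data:
            self.state = hashlib.sha256(self.state + data).digest()
        else:
            # Models the bug: the pool is stirred but the data is ignored.
            self.state = hashlib.sha256(self.state).digest()

    def output(self, n):
        # Expand the state into n output bytes with a counter.
        out = b""
        counter = 0
        while len(out) < n:
            block = self.state + counter.to_bytes(4, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:n]


def seed_affects_output(mix_added_data):
    """True if different seed data changes the output with pid/time fixed."""
    a = ToyPool(pid=1234, now=0, mix_added_data=mix_added_data)
    b = ToyPool(pid=1234, now=0, mix_added_data=mix_added_data)
    a.add(b"known seed A")
    b.add(b"known seed B")
    return a.output(32) != b.output(32)
```

With the environment held constant, `seed_affects_output(True)` holds for the working pool, while the variant that drops the added data returns the same bytes for both seeds and the check fails.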

Fighting the last war

Posted May 21, 2008 9:09 UTC (Wed) by tialaramex (subscriber, #21167)

But now you're very much getting into fighting the last war.

Writing a new unit test for each bug is the sort of thing enthusiastic new students suggest.
But unit tests must be useful enough to justify their continued maintenance. Most unit tests
that "check" a random number source e.g. by statistical analysis would meet that burden, but
wouldn't detect the Debian bug due to it retaining the strong PRNG.

Suppose code of the sort you're suggesting had existed. The Debian developers would have
commented out those lines, rebuilt, run the unit tests, and the entropy test fails.
"Unexpected PRNG output" it says. They find that it compares one known output with the output
from OpenSSL, they conclude that the "uninitialised data" was contaminating this output, and
they "improve" their patch to update the unit test. Now it passes unit tests, the little boy
is still shouting "Wolf" but no-one is coming to save him.

Fighting the last war

Posted May 21, 2008 16:42 UTC (Wed) by stephen_pollei (guest, #23348)

There is nothing that says you couldn't do both at the same time. Make the time etc.
static and test the statistical properties of the output. In fact if you did both you might
find marginal cases which a test of one without the other would have missed.

Not even fighting the last war

Posted May 22, 2008 11:39 UTC (Thu) by tialaramex (subscriber, #21167)

So your proposal is to create an unchanging environment in which OpenSSL can run, and then run
it several times, using statistical tests to ensure that the random output is statistically
independent between runs despite holding all of the environment (except /dev/random
presumably) constant. That sounds like quite a serious piece of work, how much development
time do you think it would take to build a robust and portable version of that test?

You can come up with all sorts of sufficiently arbitrary tests that would so happen to be
tripped by this error but they all incur a maintenance cost and don't seem to really justify
it with a rationale as to what proportion of real world bugs they'll catch other than this one
which we already fixed.

Running MD5 over the released OpenSSL source and having a unit test fail with "Stop messing
with things you don't understand" if you've changed it would also have been an effective way
to detect this bug, but I don't think we're really considering that.

Testing

Posted May 21, 2008 8:34 UTC (Wed) by bvdm (guest, #42755)

There are existing statistical test suites that check for randomness and this is a well
explored research area. Hardware (true) RNGs typically include such tests to guard
against physical failure. 

Given the near ubiquity of OpenSSL I consider it a shame that such tests are not included in
the OpenSSL test suite. The failure mode in the Debian case was so bad that I would venture
that even a simple statistical test should have picked it up. Cost due to speed of execution
of the test cannot compare to the cost of failure for something this important!

Anyone interested in the details should consult NIST Special Publication 800-22 "A Statistical
Test Suite for Random and Pseudorandom Number Generators".

Testing

Posted May 21, 2008 9:23 UTC (Wed) by tialaramex (subscriber, #21167)

The Debian output should (I haven't checked) pass this test, indeed OpenSSL may well include
such a statistical test and it wouldn't have changed a thing.

Unless NIST specifies a protocol whereby a large number of such RNGs are tested and the
results compared to ensure they are different (which can't be used in a unit test since only
one sample is available), it doesn't detect the Debian bug because that bug only removes the
entropy from a PRNG seed, and the PRNG output is still pseudo-random, thus passing the test.

Statistical tests don't, and indeed can't prove that something is random in the sense needed
for most cryptography. They can only prove that the data isn't correlated by the known types
of relation handled by the analysis. A program which always outputs...

0, 4, 8, 8, 1, 5, 0, 7

is exactly as "random" as the 10-sided die I just used to actually generate that sequence, so
far as a statistical analysis is concerned. Yet for cryptography rolling dice is acceptable
(if you do it right) while a program which always outputs that sequence is useless.
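
The point can be illustrated with a short sketch: a frequency (monobit) test, one of the simplest checks in the NIST SP 800-22 family, applied to a generator that is seeded with the same constant on every run. Python's `random.Random(0)` is used here purely as an assumed stand-in for "a program which always outputs the same sequence"; the function names are hypothetical.

```python
import math
import random


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum.
    s = sum(1 if b else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(2 * n))


def fixed_seed_bits(n):
    """Bits from a generator that is seeded identically on every call."""
    rng = random.Random(0)  # constant seed: zero entropy between runs
    return [rng.getrandbits(1) for _ in range(n)]


first_run = fixed_seed_bits(100_000)
second_run = fixed_seed_bits(100_000)

# The statistic only ever sees one sequence at a time.
p_value = monobit_p_value(first_run)

# Comparing runs is what exposes the problem, not the statistic.
runs_are_identical = first_run == second_run
```

The p-value is computed from a single sequence, so it cannot register that `first_run` and `second_run` are bit-for-bit identical. Only the cross-run comparison shows that.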

Testing

Posted May 21, 2008 12:56 UTC (Wed) by bvdm (guest, #42755)

Okay, I was not clear enough.

The function of a PRNG is of course to take a small number of bytes of high entropy and turn
them into a much longer stream of bytes that has much lower entropy per byte. This is
necessary because true high entropy is scarce. So a PRNG solves a "supply and demand" problem.
And of course Linux has both /dev/random and /dev/urandom which allows a choice of how much
entropy you are guaranteed to get.

What I meant was that OpenSSL should have tests where it statistically verifies the quality of
its *entropy input*, not the output.

This is what hardware RNG's do, they run statistical tests on their entropy input to verify
that physical failure of the high entropy phenomenon that is being measured (radioactivity
etc.) does not destroy the device's security claims. Hardware devices need to do this because
the physical entropy source is typically a single point of failure whereas software PRNG's
rely on multiple sources (keyboard strokes, network packet arrival times etc.)

The Debian OpenSSL bug went undetected because OpenSSL apparently has no test of the entropy
input similar to what hardware RNGs have. I am sure that the upstream team carefully verified
the mechanism, but given the subtlety with which the bug was introduced, it warrants extra
precautions for the future.
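
A sketch of what a very simple check on entropy input could look like, loosely in the spirit of the stuck-source checks that hardware RNGs run on their raw samples. The function names and the cutoff of 8 are assumptions chosen for illustration; nothing here is taken from OpenSSL or from any particular hardware device.

```python
def longest_run(samples):
    """Length of the longest run of identical consecutive samples."""
    longest = 0
    current = 0
    previous = object()  # sentinel that compares unequal to every sample
    for sample in samples:
        current = current + 1 if sample == previous else 1
        previous = sample
        longest = max(longest, current)
    return longest


def source_looks_stuck(samples, cutoff=8):
    """Flag an input source that repeats one value cutoff times in a row."""
    return longest_run(samples) >= cutoff
```

For example, `source_looks_stuck([7] * 10)` is true, while a source delivering nine distinct values in a row is not flagged. The check examines only the raw input samples, so it says nothing about what happens to them after they are handed to the pool.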


Input was perfect!

Posted May 21, 2008 13:42 UTC (Wed) by khim (guest, #9252)

There is nothing subtle there. OpenSSL used a very good source of high entropy: /dev/random. There was also a good PRNG to produce a lot of lower-quality entropy. The thing at fault was the tiny procedure responsible for transferring high entropy to the PRNG pool. In the end it just ignored the good source of entropy but shook the pool. So verification of input will be useless: input was not at fault. And verification of output will be hard (as discussed above).

Input was perfect!

Posted May 22, 2008 3:07 UTC (Thu) by bvdm (guest, #42755)

You are being disingenuous. Tests can be added at any or multiple levels. And the bug was
subtle: just reading the actual code (as in a previous LWN article) does not raise any
immediate suspicions.

Testing

Posted May 21, 2008 12:58 UTC (Wed) by bangert (subscriber, #28342)

writing a test that ensures a PRNG is indeed a PRNG is difficult.

the debian openssl bug, however, could have been found by a test that 
ensures that the PRNG generates more than 16 bits of random data. i suppose 
checking for 28 or 29 bits should still be pretty fast. 16 bits is abysmal 
and should be caught immediately.

also the slowness factor is really not a problem for a binary 
distribution. i wouldn't mind if openssl had an extended testsuite which 
ran for a couple of days...

Testing

Posted May 21, 2008 13:13 UTC (Wed) by bvdm (guest, #42755)

The NIST document describes the range of available statistical tests for PRNG's well enough,
but that's not what I am suggesting.

The OpenSSL bug was made possible because OpenSSL has its own layer of entropy processing on
top of sources such as /dev/random on Linux. This is because OpenSSL needs to support
platforms where /dev/random is not available. So even in the presence of the high-quality
entropy source /dev/random on Debian, a bug in OpenSSL negated that entropy. So I am arguing
that OpenSSL needs to have a test suite on top of their entropy stack to detect future bugs.

That's not what these tests do

Posted May 21, 2008 14:07 UTC (Wed) by tialaramex (subscriber, #21167)

Here are two data sets, one has 8-bits of true random data input, the other has 16-bits of
true random data input, but like the Debian OpenSSL implementation in each case the actual
data is PRNG output, just one has a smaller range of possible seed values. Simply reply
showing how you algorithmically determined which was which with the generous list of values I
provide...

(Or since that's practically impossible for the reasons I already explained, go away and learn
about PRNGs and seeding and make sure you don't touch any crypto software)

set 1: 9, 89, 64, 12, 49, 19, 72, 28, 34, 6, 9, 31, 65, 73, 17, 28

set 2: 9, 50, 30, 22, 36, 89, 29, 7, 82, 65, 77, 30, 42, 54, 83, 77

I am getting bored of having to roll dice to prove my point here. Please, further
contributions only from people who've both looked at the code in question and already know a
little about the subject.

The tests don't need to do that

Posted May 21, 2008 21:34 UTC (Wed) by man_ls (subscriber, #15091)

There is a possible mechanism to check not the input, but the output of ssh-keygen. The test could only generate a certain number of keys and check that none of them are repeated. With the Debian hole you would have to generate at most 2^16 keys to find a collision. Given that it took a few hours to generate all the compromised keys, we might expect that the whole process shouldn't last much more than that. Of course the generation might be optimized. In fact on my machine ssh-keygen for RSA with the default key length takes less than 2 seconds; so creating 65536 keys might be done in about 36 hours.

But the process might be much shorter. Given that a birthday attack is possible, only about the square root of the number of possible keys is needed. Even fewer if the keys are not evenly distributed, which was the case. So in this case less than 8 minutes of random generation of keys might have found the security hole, on my hardware. It should be trivial to check this approach in practice, shouldn't it?
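
The arithmetic behind these estimates can be reproduced with a few lines. This is only a back-of-the-envelope sketch: it takes the 2^16 key space and the upper bound of 2 seconds per key from the comment above, and it assumes keys are drawn uniformly at random, which is the assumption the square-root estimate depends on. The function names are hypothetical.

```python
import math

KEYSPACE = 2 ** 16        # possible keys, as stated in the comment
SECONDS_PER_KEY = 2.0     # upper bound on ssh-keygen time from the comment


def hours_to_exhaust(keyspace, seconds_per_key):
    """Time to generate one key for every possible value, in hours."""
    return keyspace * seconds_per_key / 3600


def minutes_to_birthday_bound(keyspace, seconds_per_key):
    """Time to generate about sqrt(keyspace) keys, in minutes."""
    return math.isqrt(keyspace) * seconds_per_key / 60


exhaust_hours = hours_to_exhaust(KEYSPACE, SECONDS_PER_KEY)
birthday_minutes = minutes_to_birthday_bound(KEYSPACE, SECONDS_PER_KEY)
```

At exactly 2 seconds per key this gives about 36.4 hours for all 65536 keys and about 8.5 minutes for the 256 keys at the square-root bound, in line with the figures quoted for a machine that takes slightly less than 2 seconds per key.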

A long test ahead

Posted May 21, 2008 23:58 UTC (Wed) by man_ls (subscriber, #15091)

Uh... I should have read to the end of the comments first, sorry for that. I see that below you mention the birthday attack, and then discard it because the only entropy comes from the process ID which will not repeat itself until it rolls around. So the PRNG is fed with sequential data, and is not random enough even to create a birthday collision.

The time to generate the compromised keys is just a couple of hours -- on a 31-processor cluster. It makes sense that on my lowly AMD64@2.0 GHz it takes about 36 hours. I just checked that generating a few thousand keys does not provide a collision. I will let it run for a while until PID wraps, and then check if there are indeed collisions. But you are right that a test which might take longer than a day is hardly a unit test.

A long test ahead

Posted May 22, 2008 6:08 UTC (Thu) by man_ls (subscriber, #15091)

After running overnight, my little script is already generating duplicate keys as expected. (PID wrapped around some hours ago.) This with libssl0.9.8 version 0.9.8g-1.

One wonders if such a verification would really be useful for a distro, or if it is indeed "fighting yesterday's battle".

Testing

Posted May 21, 2008 13:47 UTC (Wed) by ajb44 (guest, #12133)

You're talking about different things. A random number generator is required to generate a
random *sequence*. In this case, we only want to generate one number, and be sure that our
procedure is choosing it from a large set. 
This bug reduced the size of the set to 2^24, IIRC. Because of the birthday paradox we could
detect this in about 2^12 tests. This is not sufficient (because we want a much bigger set
than 2^24) but it's probably worth doing.
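
To put a number on the birthday estimate, the standard approximation for the chance of at least one collision among n uniform draws from N values is 1 - exp(-n(n-1)/(2N)). The sketch below evaluates it for the figures in this comment. It assumes draws are uniform and independent, and the function names are hypothetical.

```python
import math


def collision_probability(draws, space):
    """Approximate chance of at least one repeat among uniform draws."""
    return 1.0 - math.exp(-draws * (draws - 1) / (2.0 * space))


def draws_for_probability(space, target=0.5):
    """Approximate number of draws needed to reach the target probability."""
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - target)))


p_at_sqrt = collision_probability(2 ** 12, 2 ** 24)
n_for_half = draws_for_probability(2 ** 24)
```

Under these assumptions 2^12 draws from a set of 2^24 give a collision chance of roughly 0.39, and about 4,800 draws are needed for an even chance, so "about 2^12 tests" is the right order of magnitude rather than a guarantee.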

Not really a unit test

Posted May 21, 2008 14:18 UTC (Wed) by tialaramex (subscriber, #21167)

The information I've seen suggested there were about 2^31 possible initial values for the
entropy pool because it was seeded from PID which is usually in the range of a few hundred to
32767 or so on Linux systems.

In that case you'd have to run an OpenSSL key generator at least of the order of 100 000 times
to get duplicate keys and reject the null hypothesis that the generator produces a random key
each time.

You must restart OpenSSL each time (most likely by doing this work in a separate process) in
order to ensure that it isn't able to keep the entropy pool between keys, since if it does
that you will just be testing the PRNG which we already know is strong.

A unit test which produces 100 000 distinct keys using OpenSSL would be a real pain. On
systems with no dedicated entropy generating hardware it'd almost surely empty the entropy
pool, so this "Unit test" suddenly requires either dedicated hardware or a human to go shake
things up, otherwise it will hang for long periods waiting for more entropy.

This is sounding less and less like a unit test and more and more like an exercise which would
be useful in a formal review of the system's security, of the sort which OpenSSL but not
Debian has passed previously...

Not really a unit test

Posted May 21, 2008 14:36 UTC (Wed) by tialaramex (subscriber, #21167)

Huh, ignore the above particulars since (as any schoolboy should know), 32767 is a long way
short of 2^31

However unfortunately the birthday paradox actually doesn't help you in this case because the
"random" factor isn't - you will likely run your unit test on a single machine where PID (the
main source of "entropy" left after the disastrous Debian patch) is happily incrementing
slowly for each key generator process you run, so you won't get a collision until your PID
actually wraps.
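
A toy simulation of this effect, assuming the PID is the only input that varies between runs. The key derivation `toy_key` is hypothetical (a SHA-256 of the PID, nothing like real key generation), and the wrap-around is simplified: PIDs run from 1 to 32767 and restart at 1, with no other processes consuming PIDs in between.

```python
import hashlib

PID_MAX = 32768  # assumed pid limit; pids run from 1 to 32767


def toy_key(pid):
    """Hypothetical key derivation where the pid is the only varying input."""
    return hashlib.sha256(f"pid={pid}".encode()).hexdigest()


def runs_until_duplicate(start_pid):
    """Count key generations until a key repeats, with sequential pids."""
    seen = set()
    pid = start_pid
    runs = 0
    while True:
        key = toy_key(pid)
        runs += 1
        if key in seen:
            return runs
        seen.add(key)
        pid += 1
        if pid >= PID_MAX:
            pid = 1  # simplified wrap-around


first_duplicate_at = runs_until_duplicate(500)
```

In this model the first duplicate appears only on run 32768, once every PID has been used, rather than after the few hundred runs a birthday estimate would suggest for randomly chosen PIDs.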

Not really a unit test

Posted May 22, 2008 3:12 UTC (Thu) by bvdm (guest, #42755)

This bug literally had a global impact. The principle at stake is whether enough control
mechanisms exist at present to eliminate future problems. It is well worth considering. 

Individuals compiling OpenSSL and other FOSS cryptographic packages may choose not to run
tests that require days of processing, but distributions that make modifications are
obligated to do so in my opinion.

Testing needs instrumentation

Posted May 21, 2008 15:21 UTC (Wed) by dd9jn (subscriber, #4459)

Right, it can be done but the outcome of adding such a test may be worse than without a test.

All serious RNG implementations use some kind of hash function before returning the random
bytes, thus you can't run statistical tests on the output. To run tests you need to instrument
the code.  That very instrumentation is a source of bugs and thus it should be avoided. In
fact, OpenSSL was recently trapped by such a bug due to extra code added for the FIPS
validation of OpenSSL.

Open Source Security Report

Posted Jun 3, 2008 22:06 UTC (Tue) by lunz (guest, #43534)

You guys are aware that the openssl bug was introduced by someone trying to silence warnings
from a source code checker not unlike Coverity's, right?

Open Source Security Report

Posted Jun 3, 2008 22:34 UTC (Tue) by nix (subscriber, #2304)

Valgrind isn't a source code checker, and its mechanism of action (JITted 
dynamic binary instrumentation) is utterly different from Coverity's. Also 
they don't spot especially similar classes of problem.

Open Source Security Report

Posted May 21, 2008 13:44 UTC (Wed) by lmb (subscriber, #39048)

Coverity's source code scanner is a really cool tool. I wish there was an open source
alternative.

OSS alternative

Posted May 21, 2008 16:03 UTC (Wed) by dwheeler (guest, #1216)

Agreed. splint is probably the closest. My Flawfinder tool is another OSS tool for scanning code (though it's VERY naive); the flawfinder home page has links to LOTS of related work.

OSS alternative

Posted May 23, 2008 6:11 UTC (Fri) by PO8 (guest, #41661)

The Sparse folks are starting to make that codebase do some fairly general checks as well.
They have a long-term goal of doing some abstract interpretation based analyses.

Open Source Security Report

Posted May 23, 2008 14:26 UTC (Fri) by scripter (subscriber, #2654)

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds