
First Release of LibreSSL Portable Available

OpenBSD Journal is reporting that the first release of LibreSSL Portable is available for download from OpenBSD project servers. LibreSSL is the OpenSSL fork started in April by members of the OpenBSD development community after the "Heartbleed" vulnerability; the "Portable" version is designed to run on operating systems other than OpenBSD itself, including Linux. The announcement calls this release "an initial release to allow the community to start using and providing feedback;" it is tagged as version 2.0.0.



First Release of LibreSSL Portable Available

Posted Jul 11, 2014 21:12 UTC (Fri) by mb (subscriber, #50428) [Link] (5 responses)

What is the reason for splitting this into 'portable' and 'openbsd' variants?

First Release of LibreSSL Portable Available

Posted Jul 11, 2014 21:22 UTC (Fri) by amacater (subscriber, #790) [Link]

This is the normal way that the OpenBSD project packages OpenSSH and other software: a version to run under OpenBSD itself and a "portable" package designed as a basis for use on other operating systems. The OpenBSD version is supported natively; the other variant may be supported to a slightly lesser extent, on a best-endeavours basis, or simply by the porters.

First Release of LibreSSL Portable Available

Posted Jul 11, 2014 21:23 UTC (Fri) by proski (subscriber, #104) [Link] (1 responses)

I believe the reason is the same as for OpenSSH. Developers working on the code should be focused on security and correctness, not on portability. Porting is done separately.

I don't think it's a perfect recipe for other kinds of software, but it's working well for security-related code.

Same as SSH

Posted Jul 12, 2014 3:02 UTC (Sat) by david.a.wheeler (subscriber, #72896) [Link]

As noted above, it's the same way they handle SSH. I'm not a fan of this approach; I certainly wouldn't do it this way. But it seems to work for them.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 7:00 UTC (Sat) by troglobit (subscriber, #39178) [Link]

Because operating systems other than OpenBSD do not have all the APIs needed to support each project: LibreSSL, OpenSSH, OpenNTPd, etc. The OpenBSD approach is to use safer APIs like strlcpy() and friends that aren't supported by, e.g., glibc on Linux. The porting effort, simply put, usually consists of adding these APIs in a local compatibility library for the given project so that it compiles and runs.
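
For illustration, a compat shim for one such function might look like this (a sketch only; the actual libressl-portable compatibility code differs in detail, and HAVE_STRLCPY is a stand-in configure macro):

#include <stddef.h>
#include <string.h>

/*
 * Fallback strlcpy() for systems (such as glibc) that lack it: copy at
 * most dsize - 1 bytes, always NUL-terminate, and return the full
 * source length so callers can detect truncation.
 */
#ifndef HAVE_STRLCPY
size_t
strlcpy(char *dst, const char *src, size_t dsize)
{
    size_t slen = strlen(src);

    if (dsize != 0) {
        size_t n = slen < dsize - 1 ? slen : dsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return slen;
}
#endif

A caller then checks strlcpy(buf, s, sizeof(buf)) >= sizeof(buf) to detect truncation.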

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 23:06 UTC (Sat) by busterb (subscriber, #560) [Link]

They are different source trees.

portable is just the code needed to compile the portable libressl version. It is mostly shell scripts and automake stuff.

openbsd is a trimmed-down clone of the OpenBSD CVS tree, imported into git from CVS and mirrored on GitHub. This is provided as a convenience so that Linux and other OS users can check out the source as easily as possible. It is trimmed to include only the bits needed by LibreSSL, so the download is just a few megabytes, rather than 3+GB for the whole OpenBSD source. But it preserves all of the relevant history, using cvs2git.

When you run 'autogen.sh', the portable tree automatically downloads the openbsd tree and moves all the files into the right place for building the portable version. We are then able to automate the integration and release process somewhat, making it easier to provide rapid releases straight from the openbsd tree.

The first release tarball was generated largely automatically by running from scripts - it's pretty neat.

First Release of LibreSSL Portable Available

Posted Jul 11, 2014 21:21 UTC (Fri) by rillian (subscriber, #11344) [Link] (9 responses)

139ac81c9478accd38a9eb667623d75997a2197cec36f184cd8d23e98a7e475b libressl-2.0.0.tar.gz

...is the version I got from the server, in case anyone wants to crowdsource the missing release signature.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 1:09 UTC (Sat) by Siosm (subscriber, #86882) [Link] (6 responses)

No sha256sum, no PGP/GPG signed email. Are they serious? They need to work on their release process.

Note: For what it's worth, I got the same sha256sum.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 7:07 UTC (Sat) by keeperofdakeys (guest, #82635) [Link] (4 responses)

I'd guess that they only consider this a preview release, not a stable release that they want people to actually use.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 8:14 UTC (Sat) by Otus (subscriber, #67685) [Link] (1 responses)

Then why did they name it 2.0.0?

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 22:43 UTC (Sat) by busterb (subscriber, #560) [Link]

Gotta start somewhere.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 23:00 UTC (Sat) by busterb (subscriber, #560) [Link] (1 responses)

Yes - this release is intended for gathering initial feedback: reports of build breaks, telling us where we messed up, etc. The release process is still being revised, but we wanted some early input ahead of more polished future releases.

If you see LibreSSL 2.0.0 show up in your distro's stable package updates tomorrow, you should probably question the maintainers a little bit.

First Release of LibreSSL Portable Available

Posted Jul 14, 2014 7:52 UTC (Mon) by tedd (subscriber, #74183) [Link]

Aww, and I was just going to check the Arch repository.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 13:20 UTC (Sat) by tomgj (guest, #50537) [Link]

no PGP/GPG signed email

You may be interested to know that the protocol is called OpenPGP. The two applications you mention implement it, and other applications probably do too.

The following hopefully illustrates the issue by treating the term "email" in the same way:

No PGP/GPG signed Outlook/gmail
vs
No OpenPGP signed email

First Release of LibreSSL Portable Available

Posted Jul 14, 2014 22:55 UTC (Mon) by rillian (subscriber, #11344) [Link]

9596f6cb3e8bafe35d749dfbdb6c984f1bbd86233598eb5fdb4abf854a5792ba libressl-2.0.1.tar.gz

There's now a SHA256.sig which confirms this and the previous checksums. I verified the Ed25519 signature against the published key in the same directory with this port of OpenBSD's package signing tool.

untrusted comment: LibreSSL Portable public key
RWQg/nutTVqCUVUw8OhyHt9n51IC8mdQRd1b93dOyVrwtIXmMI+dtGFe

Continuing the record here until we get a better trust path established.

First Release of LibreSSL Portable Available

Posted Jul 18, 2014 22:19 UTC (Fri) by rillian (subscriber, #11344) [Link]

Published signature also passes for

4d16b6852cbd895ed55737819d2c042b37371f1d80fcba4fb24239eba2a5d72b libressl-2.0.2.tar.gz

No gpg signature on the ftp server

Posted Jul 11, 2014 23:07 UTC (Fri) by boklm (guest, #34568) [Link]

Where is the gpg signature?

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 14:37 UTC (Sat) by roblucid (guest, #48964) [Link] (19 responses)

They've already had feedback that, to act as a drop-in replacement for OpenSSL, they need to use the same version number they forked from. They ought to define something else to say it's LibreSSL, for applications that want to know, like Google's BoringSSL has.

Presumably their willingness to accept help - providing source packages in formats convenient for distros, and taking portability patches - would be a good test of the practicality of working with them.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 21:23 UTC (Sat) by moltonel (subscriber, #45207) [Link] (14 responses)

That sounds a bit pointless. They're already breaking potential program expectations by dropping some ciphers (and maybe making other changes too), so going all the way to version-number compatibility won't help: a program that is fragile enough to break on a version number will likely break on other changes too.

They do need to keep source compatibility, but I doubt that ABI compatibility is necessary.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 22:07 UTC (Sat) by moltonel (subscriber, #45207) [Link] (13 responses)

https://blog.hboeck.de/archives/851-LibreSSL-on-Gentoo.html lists a few problems you get when trying to use LibreSSL as a drop-in replacement. None were about version numbers.

The version number issue could get hairy for packagers who want to use a virtual package that depends on either LibreSSL or OpenSSL (or BoringSSL). But that sounds like a very common packaging problem that ought to have known workarounds.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 23:39 UTC (Sat) by jengelh (guest, #33263) [Link] (1 responses)

Well, libressl cannot be a drop-in replacement for all cases, since it does not provide all of the openssl symbols. When I tried rebuilding w3m, it became apparent that, for example, "RAND_egd" was missing.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 20:52 UTC (Sun) by moltonel (subscriber, #45207) [Link]

RAND_egd is the typical example of an insecure feature that should not be used, and was rightfully removed from LibreSSL. The fun thing is, w3m probably doesn't make use of the feature, it just links to it. So it's arguably a w3m bug (potentially using a feature that would reduce security) that gets exposed by trying to link against LibreSSL instead of OpenSSL.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 0:12 UTC (Sun) by gidoca (subscriber, #62438) [Link] (1 responses)

> https://blog.hboeck.de/archives/851-LibreSSL-on-Gentoo.html lists a few problems you get when trying to use LibreSSL as a drop-in replacement. None were about version numbers.

Those in Dovecot and Net-SSLeay were.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 20:52 UTC (Sun) by moltonel (subscriber, #45207) [Link]

> Those in Dovecot and Net-SSLeay were.

You're right, I must have had my biasing glasses on when I read the blog post.

That said, there are bigger gotchas than just the version string. It seems that you can't fix OpenSSL without breaking the ABI or removing some features. In that light, having a different version string is probably a good thing. It'll alert users that this is in fact *not* OpenSSL.

You could argue that LibreSSL is only a drop-in replacement if the linked program is properly written, and that failing to work with LibreSSL is a bug in the program, not the library. Yes, it's a bit twisted. But it'll improve your program's security.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 16:10 UTC (Sun) by roblucid (guest, #48964) [Link] (8 responses)

OpenBSD may have chosen not to consider ABI compatibility at all, but there would be a point to even incomplete ABI compatibility; and if NOT that, then break it totally and explicitly. The end goal is a more secure free SSL implementation.

The obvious benefit of minimising incompatibility between the forks is to allow independent review and encourage cooperating improvements, like the various kernel git trees, rather than spreading developer resources more thinly and fragmenting, as has (unfortunately) happened with the proliferation of package managers and desktops.

Respecting the ABI makes LibreSSL much easier to deploy as an experimental "security" replacement. They may not care, taking the "simple" narrow view of the OpenBSD world, but I think that's a shame and liable to lead to greater and unnecessary fragmentation. If they choose incompatibility, then #define all the functions/structs and make the break total, which re-enables a choice at runtime if the application developer chooses to support it. That would also help catch unwanted OpenSSL behaviour like redefining system routines, e.g. malloc(3), generating link-time conflicts.

A widely used "subset" ABI would help enable the OpenSSL developers to follow the lead, also deprecating features in a future "major" release, reducing the breadth of the ABI and combating unwise freeping creaturism.

Professional sysadmins frequently limit things for security, deliberately breaking developer-provided functionality: ulimit, SELinux, chroot(2), network packet filters and the like all operate to enforce a subset of the POSIX ABI on programs. So really it's NOT scary or that unusual to have incomplete functionality, so long as it's not silent, mysterious breakage.

Programs using deprecated ciphers or those Perl-EGD-dependent functions can be tested with a drop-in preload library and made to fail. Upstreams then have an incentive to drop those features and use the intersection of the original and the forked SSL.

The alternative is asking distro packagers to configure applications for both OpenSSL and LibreSSL, or to pick one for the distro, at configure time for a package. It's not like alternative logging daemons, which are isolated by using IPC.

In RPM you can use a virtual dependency, which would most naturally be OpenSSL-1.0.1 in this case, but LibreSSL is already claiming 2.0, which can only be confusing as OpenSSL moves to 1.0.2 or its own 2.0, which would again be incompatible with this LibreSSL 2.0 release.

However, if programs need to be modified, compiled, and linked treating LibreSSL as a new API, then that single virtual symbol becomes pointless; the alternatives will have to be pushed back a level, requiring double the number of packages, or more likely supporting ONLY one of the alternatives. Choosing at install/link time is static, not flexible; it removes choice from the end user and doesn't take advantage of the dynamic loading of shared libraries at runtime.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 20:58 UTC (Sun) by moltonel (subscriber, #45207) [Link] (5 responses)

> A widely used "subset" ABI would help enable the OpenSSL developers to follow the lead, also deprecating features in a future "major" release, reducing the breadth of the ABI and combating unwise freeping creaturism.

It seems like Google's BoringSSL would make a good foundation for that "least common denominator API". But I'm not holding my breath, as there doesn't yet seem to be much cooperation between the various forks.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 21:32 UTC (Sun) by roblucid (guest, #48964) [Link] (4 responses)

Exactly, that's what bothers me. Each is doing what suits them, with no big picture, and lots of downstream excuses to shrug shoulders and do nothing that risks a costly ongoing support commitment.

First Release of LibreSSL Portable Available

Posted Jul 15, 2014 10:17 UTC (Tue) by zenaan (guest, #3778) [Link] (3 responses)

>Each is doing what suits them, with no big picture, and lots of downstream excuses to shrug shoulders and do nothing that risks a costly ongoing support commitment.

Come on - patience! This is not Slashdot, this is LWN.net, and at least a cursory Wikipedia check ought to be done before fishing for "better support" with "negativity"/projection of assumptions, etc. Not cool. Not intelligent.

From the wikipedia page for BoringSSL, checked just now:
"In June 2014, Google announced its own fork of OpenSSL dubbed BoringSSL. Google plans to co-operate with OpenSSL and LibreSSL developers.[29]"

First Release of LibreSSL Portable Available

Posted Jul 15, 2014 15:36 UTC (Tue) by roblucid (guest, #48964) [Link] (2 responses)

BoringSSL is the fork mentioned earlier in the thread: it kept the version number and sets another symbol to indicate to applications that it's not OpenSSL.

We'll just see whether there are technical/personal disagreements and a long-term split like Emacs, or whether a consensus can be reached. My impression is that developers underestimate the usefulness of binary compatibility to clued-up sysadmins.

First Release of LibreSSL Portable Available

Posted Jul 15, 2014 16:41 UTC (Tue) by moltonel (subscriber, #45207) [Link] (1 responses)

> My impression is that developers underestimate the usefulness of binary compatibility to clued-up sysadmins.

My impression is that you overestimate the feasibility of fixing OpenSSL without breaking binary compatibility :p You can only do so much while keeping compatibility. Some bugs are exposed in the API itself; fixing them requires changing the API and breaking compatibility. Not fixing something because it'd break compatibility is a recipe for the next blockbuster security flaw.

As annoying as it is for sysadmins and downstream projects, they won't get better security without some porting efforts.

First Release of LibreSSL Portable Available

Posted Jul 16, 2014 9:55 UTC (Wed) by roblucid (guest, #48964) [Link]

And changing the API so things are no-ops looks like a recipe for unfortunate consequences, e.g. RAND_poll() returning 1.

There's a balance to be struck. I agree with what you are saying here and don't expect bug-for-bug binary compatibility, but I am sceptical about the useful results and effects of the approach taken by this fork.

First Release of LibreSSL Portable Available

Posted Jul 14, 2014 8:56 UTC (Mon) by jengelh (guest, #33263) [Link] (1 responses)

OpenBSD numbers its shared libraries, so there is some compatibility statement. Its lifetime may not appeal to you, but it is sound enough for the filesystem (and package managers like rpm) to enable multiple versions to coexist, keeping all programs runnable (in principle).

A virtual dependency for runtime in an RPM package does not buy you anything, because SONAMEs are what counts (not only to rpm, but also ld.so), and they are different even within openssl.

>Choosing at install/link time, is static, not flexible and removes choice from end-user

It was never an end-user choice anyway. The appearance of LibreSSL added another choice for developers (next to openssl and mozilla-nss), and added a hacking opportunity for distro packagers (substituting libressl for openssl with more or less patching of the source code).

First Release of LibreSSL Portable Available

Posted Jul 14, 2014 9:41 UTC (Mon) by moltonel (subscriber, #45207) [Link]

For source-based distributions like Gentoo it's a different story: it *is* (ideally) the end user's choice between one implementation and the other. Program packages depend on a virtual library package, which depends on either library package, chosen by USE flag. To choose the implementation for each program, you put the USE flag on its package, and either slot the virtual package or choose the implementation in the program package.

First Release of LibreSSL Portable Available

Posted Jul 12, 2014 22:56 UTC (Sat) by busterb (subscriber, #560) [Link] (3 responses)

Hi, I'm working on the team doing portability work.

Before this, I worked for a while on an unofficial port, and the OpenBSD team as a whole was (and still is) more than willing to accept reasonable patches that fix compatibility without introducing #ifdef mazes and the like. Of course, sometimes you need to be well prepared to defend your position, but the team is very good.

The current mantra for libressl is: if it is broken downstream (on a portable target), get it fixed upstream (in the openbsd tree) and don't hack a workaround. There are also active efforts to get some of the additional security functions (see crypto/compat) integrated into other OSes and C libraries directly, so that the portability bits eventually whittle away.

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 9:59 UTC (Sun) by richmoore (guest, #53133) [Link] (2 responses)

Where would you like bugs reported? For example building Qt against libressl fails with lots of errors like:

/usr/local/libressl/include/openssl/sha.h:101:46: error: ‘__bounded__’ attribute directive ignored [-Werror=attributes]

First Release of LibreSSL Portable Available

Posted Jul 13, 2014 10:59 UTC (Sun) by busterb (subscriber, #560) [Link] (1 responses)

The github project is good for reporting bugs.

We actually just fixed that issue I believe, thank you for the report!

First Release of LibreSSL Portable Available

Posted Jul 14, 2014 20:42 UTC (Mon) by richmoore (guest, #53133) [Link]

Indeed, Qt builds fine with 2.0.1 though I've not yet tested if it actually works.

Voodoo coding

Posted Jul 13, 2014 11:44 UTC (Sun) by cesarb (subscriber, #6266) [Link] (34 responses)

It's great that they are cleaning up a lot of OpenSSL's voodoo coding, but it's sad that they are adding some voodoo coding of their own. In particular, getentropy_linux.c.

They are very careful when reading /dev/urandom, and they correctly point out that it has two issues: the most important being that it can fail when the file descriptor table has been exhausted, and a less important one being that it can fail in a chroot without a properly configured /dev (I'd argue that such chroots are broken). They correctly argue that there should be a system call to get some entropy without a chance of failure, and they try to use a deprecated system call to get it in that case. Even though they are doing a direct syscall(SYS__sysctl, ...) for that, I can agree that it's a sane thing to do.

But if both methods (/dev/urandom and sysctl) fail, they try the old "throw everything at the pool and hope it sticks" method of gathering entropy, instead of just aborting the process. It wouldn't surprise me if one day some crazy researcher shows a way to predict most of those bits. And that's where some compilation problems appear: they take the address of main(), which is in the executable, but since it's not supposed to be used from libraries, it might not be exported (and it might not even exist, since it's not the true entry point of the executable; a non-C language can use a different function). I hope this does not lead us to a Debian-openssl-bug situation again.

The sane way to proceed would be to add a new system call (and a corresponding glibc stub) to get entropy from the /dev/urandom pool (and perhaps other pools like /dev/random), with a guarantee of not failing due to fd-table exhaustion. And then change getentropy_linux.c to first try the new system call, then /dev/urandom, then the sysctl trick, and then kill the process.

And they could also use a more robust way of killing the process. They use raise(SIGKILL), but that can fail in the presence of a misdesigned seccomp filter. IMHO, it would be best to first try raise(SIGKILL), and if it somehow returns then go into an infinite loop calling a blocking select(). (I'd go beyond that and use a variant of OPTIMIZER_HIDE_VAR to prevent the compiler from knowing I'm calling raise() with SIGKILL, since a "sufficiently smart compiler" could infer a non-returning function from that argument).
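
For illustration, the cascade might look something like this (a sketch with hypothetical names; only the /dev/urandom path is filled in, and the other two stubs stand in for the real getentropy_linux.c code):

#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

static int
entropy_newsyscall(void *buf, size_t len)
{
    (void)buf; (void)len;
    return -1;              /* stub: no such syscall exists yet */
}

static int
entropy_urandom(void *buf, size_t len)
{
    int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd == -1)
        return -1;          /* e.g. fd table exhausted, or a bare chroot */
    ssize_t r = read(fd, buf, len);
    close(fd);
    return r == (ssize_t)len ? 0 : -1;
}

static int
entropy_sysctl(void *buf, size_t len)
{
    (void)buf; (void)len;
    return -1;              /* stub: the deprecated sysctl(2) trick goes here */
}

static void
fatal_no_entropy(void)
{
    raise(SIGKILL);         /* can fail under a misdesigned seccomp filter... */
    for (;;)
        select(0, NULL, NULL, NULL, NULL);  /* ...so block forever */
}

int
my_getentropy(void *buf, size_t len)
{
    if (entropy_newsyscall(buf, len) == 0)
        return 0;
    if (entropy_urandom(buf, len) == 0)
        return 0;
    if (entropy_sysctl(buf, len) == 0)
        return 0;
    fatal_no_entropy();     /* never returns */
    return -1;
}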

Voodoo coding

Posted Jul 13, 2014 12:45 UTC (Sun) by busterb (subscriber, #560) [Link] (8 responses)

Adding the new system call is a terrific idea, and definitely preferred.

Here is the reference manual page documenting the getentropy(2) API:

http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/ma...

We look forward to adding support for it to LibreSSL on Linux systems.
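
Usage is minimal; a sketch per the man page (on OpenBSD, getentropy() is declared in <unistd.h>, fills a buffer of at most 256 bytes, and returns 0 or -1):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
    unsigned char seed[32];     /* e.g. to seed a userland CSPRNG */

    if (getentropy(seed, sizeof(seed)) == -1) {
        perror("getentropy");
        exit(1);
    }
    printf("got %zu bytes of entropy\n", sizeof(seed));
    return 0;
}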

Voodoo coding

Posted Jul 13, 2014 13:06 UTC (Sun) by busterb (subscriber, #560) [Link] (6 responses)

As an addendum: as long as the deprecated sysctl remains in the kernel (or at least until after the new syscall is available), the voodoo should never be used, given your suggested order of entropy method calls.

The fresh 2.0.1 release also removes the main pointer as a source for the fallback entropy method.

Voodoo coding

Posted Jul 13, 2014 13:13 UTC (Sun) by busterb (subscriber, #560) [Link]

As a convenience, the portable builds are already designed to use getentropy(2) if it exists on a system, preferring it over the fallback mechanisms.

https://github.com/libressl-portable/portable/blob/master...

Voodoo coding

Posted Jul 13, 2014 17:35 UTC (Sun) by cesarb (subscriber, #6266) [Link] (3 responses)

> As an addendum: as long as the deprecated sysctl remains in the kernel (or at least until after the new syscall is available), the voodoo should never be used, given your suggested order of entropy method calls.

AFAIK, with seccomp you can deny any syscall, making it return any errno you want, so an errant seccomp filter can in theory make the deprecated syscall fail with ENOSYS, even if it still exists in the kernel. So there exists at least one path which leads to the voodoo code: someone trying to sandbox code by making unexpected syscalls return the "inoffensive" -ENOSYS value, and not special-casing the sysctl syscall used by libressl, and something (perhaps the same seccomp filter) making the read of /dev/urandom fail.

I don't know how relevant that is, and I hope we never have to find out.

Voodoo coding

Posted Jul 13, 2014 17:43 UTC (Sun) by busterb (subscriber, #560) [Link] (2 responses)

If a system implements the getentropy() syscall, and it fails, the process is currently terminated in LibreSSL.

https://github.com/libressl-portable/openbsd/blob/master/...

Voodoo coding

Posted Jul 14, 2014 9:17 UTC (Mon) by makomk (guest, #51493) [Link] (1 responses)

If I'm not entirely mistaken, that means that any version of LibreSSL compiled on a system with getentropy() will crash on older systems that don't support it. Bit of a disincentive to implementing it.

Voodoo coding

Posted Jul 14, 2014 15:26 UTC (Mon) by apoelstra (subscriber, #75205) [Link]

> If I'm not entirely mistaken, that means that any version of LibreSSL compiled on a system with getentropy() will crash on older systems that don't support it. Bit of a disincentive to implementing it.

s/dis//

As has been argued in other posts, if there is no secure source of entropy, silent failure is deadly. I can't think of a cryptographic application which needs secure entropy but is somehow able to hobble along without it — peacefully failing just provides another opportunity for a bug to translate into hard-to-detect but seriously compromised software.

Programs which crash in bad places because they assumed they had access to cryptographic tools are buggy. If they need cryptography, they should be sanity-checking their environment on startup anyway.

sysctl(2) is a security vulnerability.

Posted Jul 14, 2014 5:57 UTC (Mon) by ebiederm (subscriber, #35028) [Link]

sysctl(2) in the kernel has not received any maintenance for years and when the bit-rot is sufficient the code will be torn out.

Seriously. No one cares. No one has ever cared. No one is going to care.

Voodoo coding

Posted Jul 17, 2014 12:23 UTC (Thu) by cladisch (✭ supporter ✭, #50193) [Link]

And it's being added: [PATCH, RFC] random: introduce getrandom(2) system call:
The getrandom(2) system call was requested by the LibreSSL Portable developers. It is analogous to the getentropy(2) system call in OpenBSD.

The rationale of this system call is to provide resilience against file descriptor exhaustion attacks, where the attacker consumes all available file descriptors, forcing the use of the fallback code where /dev/[u]random is not available. Since the fallback code is often not well tested, it is better to eliminate this potential failure mode entirely.

The other feature provided by this new system call is the ability to request randomness from the /dev/urandom entropy pool, but to block until at least 128 bits of entropy has been accumulated in the /dev/urandom entropy pool.
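
Until a libc wrapper appears, a program could invoke the proposed syscall directly, along these lines (a sketch; the #ifdef guard reflects that current kernel headers don't define the syscall number yet, and try_getrandom is a hypothetical name):

#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

static int
try_getrandom(void *buf, size_t len)
{
#ifdef SYS_getrandom
    long r = syscall(SYS_getrandom, buf, len, 0);  /* flags = 0: urandom pool */
    return r == (long)len ? 0 : -1;
#else
    (void)buf; (void)len;
    errno = ENOSYS;         /* headers predate the syscall */
    return -1;
#endif
}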

Voodoo coding

Posted Jul 13, 2014 20:46 UTC (Sun) by mezcalero (subscriber, #45103) [Link] (24 responses)

This is so confused. The only correct thing to do if things fail is, well, to let them fail. Return an error; there's really no shame in that. People should be prepared for things to fail.

And yuck. A new syscall? Of course this can fail too. For example, very likely on all kernels that don't have it yet, i.e. all of today's...

This sounds like an awful lot of noise around something that isn't really a real problem anyway, since that chroot example is awfully contrived...

If these are the problems the libressl folks think are important, then libressl is not going to look any cleaner in a few years than openssl does now...

Voodoo coding

Posted Jul 14, 2014 3:42 UTC (Mon) by wahern (subscriber, #37304) [Link] (13 responses)

CSPRNG calls are so deeply embedded within program logic that allowing the calls to fail in the normal course of execution (i.e. on hitting a descriptor limit) is as sane as allowing unsigned addition to throw an error. It's simply not a sane interface. (Of course, if the CSPRNG cannot be seeded it should abort the program.)

Also, chroot jails _should_ be within volumes mounted nodev. Nor should /proc be visible. The point of a chroot jail is to minimize kernel attack surface.

This is probably why OpenBSD just added a getentropy syscall to replace their sysctl interface: to allow simple, wholesale disabling of the sysctl interface using systrace.

Researchers have settled on the sane behavior of a CSPRNG syscall: block until initial seeding, then never block again. And DJB has argued that once seeded the kernel CSPRNG should never be seeded again, as it would be superfluous and might provide more opportunity for malicious hardware to exfiltrate bits undetectably.

The issue is simply no longer debatable. The proper API is precisely something like getentropy.

Voodoo coding

Posted Jul 14, 2014 4:39 UTC (Mon) by andresfreund (subscriber, #69562) [Link] (3 responses)

> CSPRNG calls are so deeply embedded within program logic that allowing the calls to fail in the normal course of execution (i.e. on hitting a descriptor limit) is as sane as allowing unsigned addition to throw an error. It's simply not a sane interface. (Of course, if the CSPRNG cannot be seeded it should abort the program.)

What? You compare a single-instruction issue with a call that does a long series of complex mathematical computations? With kernel interaction, entropy estimation, et al.? Really?
If you write safety-critical code in a way that makes it impossible or infeasible to check for errors when using a CSPRNG: please stay away from anything I might possibly use.

I don't have particularly strong feelings for/against getentropy() but this argument isn't doing it any favors.

> Also, chroot jails _should_ be within volumes mounted nodev. Nor should /proc be visible. The point of a chroot jail is to minimize kernel attack surface.

There's some value in that argument, but I think in reality the likelihood of opening new holes in software because /dev/null, /dev/urandom, /proc/self et al. aren't available is much higher than the security benefit.

Voodoo coding

Posted Jul 14, 2014 19:10 UTC (Mon) by wahern (subscriber, #37304) [Link] (2 responses)

Anyone who manages to turn an algorithm requiring O(1) space into O(N) space, especially an algorithm existing in a definite and fixed problem space that does not nor will ever benefit from any type of abstraction (in the manner of file objects), probably shouldn't be writing software, period. (Granted, we got to where we are by contingent history, so I don't blame the people who came up with /dev/urandom, only the people who defend it despite overwhelming experience and reason.)

At the end of the day it's a QoI issue.

If I have a non-blocking server and an already established socket to a browser and want to establish a secure channel with perfect forward secrecy, and I try to generate some random numbers, but the operation of simply generating a random number could fail, do you have any idea how f'ing ugly it is to insert a _timer_ and a loop trying to acquire that resource? Of course it's possible. But it's infinitely nastier than dealing with other kinds of failures, and completely unnecessary. (And compound all of this by trying to do this in a library, lest you simply argue that one should open /dev/urandom and leave it open, which is sensible but still problematic.)

But thanks for the ad hominem. Even though I check every malloc call, handle multiplicative overflow when I can't prove it's safe, and try to regularly test these failure paths (which is the most difficult of all); and despite the fact that you'd probably have to stop using Apple products, Google products, and several other services and products if you wanted to avoid using my software directly or indirectly; and notwithstanding the fact that my /dev/urandom wrappers have been used in all manner of software, including derivatives in some extremely popular open source software; I guess I never thought about how easy it is to overcome the design problems with /dev/urandom.

Voodoo coding

Posted Jul 14, 2014 19:37 UTC (Mon) by andresfreund (subscriber, #69562) [Link] (1 responses)

> If I have a non-blocking server and an already established socket to a browser and want to establish a secure channel with perfect forward secrecy, and I try to generate some random numbers, but the operation of simply generating a random number could fail, do you have any idea how f'ing ugly it is to insert a _timer_ and a loop trying to acquire that resource?

Why do you need a timer? Why is this different from any of the other dozen or two things you need to do to establish an encrypted connection to another host?
If error handling in any of these parts - many of which are quite likely to fail (DNS, connection establishment, public/private key crypto, session key negotiation, renegotiation) - is a fundamental structural problem, something went seriously wrong.

> Of course it's possible. But it's infinitely nastier than dealing with other kinds of failures, and completely unnecessary. (And compound all of this by trying to do this in a library, lest you simply argue that one should open /dev/urandom and leave it open, which is sensible but still problematic.)

You argued that it's required to do this without /dev/urandom because it is *impossible* to do error handling there. Which has zap to do with being asynchronous, btw.
Note that /dev/urandom - if it actually would block significantly for the amounts of data we're talking about here - would allow for *more* of an async API than a dedicated getentropy() call. The latter basically has zero chance of ever getting that. You're making arguments up.

Voodoo coding

Posted Jul 14, 2014 20:12 UTC (Mon) by wahern (subscriber, #37304) [Link]

I never said it was impossible. I said it wasn't a sane interface.

And I stand by that claim. Why make something which could fail when you don't have to and it's trivial not to?

I always try to write my server programs in a manner which can handle request failures without interrupting service to existing connections. There are various patterns to make this more convenient and less error prone, but one of the most effective is RAII (although I don't use C++), where you acquire all the necessary resources as early as possible, channeling your failure paths into as few areas as possible. I also use specialized list, tree, and hash routines which I can guarantee will allow me to complete a long set of changes to complex data structures free of OOM concerns. One must rigorously minimize the areas that could encounter failure conditions so as to ensure as few bugs as possible in the few areas that are contingent on the success or failure of logical operations.

But how many applications do you know of which bother trying to ensure entropy is available at the very beginning of process startup or request servicing? How would you even do this in a generic fashion? Is it really sane to open a descriptor for every request, or to cache a separate descriptor inside every component or library that might need randomness? If you seed another generator, how do you handle forking? getpid? pthread_atfork? There's a reason most PRNGs (CSPRNGs included) support automatic seeding; not just for convenience, but for sane behavior in the common case.

Hacks and tweaks to the kernel implementation of /dev/urandom to ensure entropy is ready as soon as possible is a perennial bicker-fest, and yet those can't compare to the contortions applications would need to go through just to maintain a descriptor. And they'd all be doing it differently! That's not a recipe for a secure application ecosystem. And getting people to use third-party libraries (like Nick Matthewson's excellent libottery) would be like herding cats and adds an unnecessary dependency. It harks back to the bad old days of EGD, before even /dev/urandom was available.

Of course it's possible. Lots of things are possible, but not all things are practical given limited human and machine resources, and even less unequivocally contribute to a safer software ecosystem free of hidden traps.

When I talk about CSPRNGs being deeply embedded within other algorithms, imagine things like a random sort, or a random UUID generator. These are almost always implemented through a single routine and normally would never need to communicate a failure because _logically_ they should never fail. And yet they could fail, even with valid input, if you rely on /dev/urandom without taking other extraordinary measures completely unrelated to the core algorithm.

Computational complexity attacks, side-channel attacks, etc, have made use of CSPRNGs useful and in many cases mandatory within many different kinds of algorithms which once upon a time could never fail.

Voodoo coding

Posted Jul 14, 2014 7:28 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

Except that chroot is NOT a way to minimize the attack surface. The docs say so. And the root user has tons of ways to escape a chroot on Linux.

Voodoo coding

Posted Jul 14, 2014 14:18 UTC (Mon) by rsidd (subscriber, #2582) [Link]

The OP said "chroot jail", not "chroot" -- presumably meaning something like the FreeBSD version.

Voodoo coding

Posted Jul 14, 2014 18:53 UTC (Mon) by wahern (subscriber, #37304) [Link] (6 responses)

A chroot jail implies dropping privileges. It's not much of a jail if you can walk out.

Voodoo coding

Posted Jul 14, 2014 18:54 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

Linux doesn't have chroot jails.

Voodoo coding

Posted Jul 14, 2014 19:16 UTC (Mon) by wahern (subscriber, #37304) [Link] (4 responses)

chdir, chroot, setgid, setuid, etc.

Linux absolutely does support chroot jails. And plenty of software does this, and it's 100% portable to almost all POSIX-compliant or POSIX-aspiring systems. (Notwithstanding the fact that chroot was removed from POSIX.)

Actually, Linux supports chroot jails more than most, as PaX has patches which can prevent even root from breaking out using the normal methods, and there are patches floating around which allow you to keep descriptors to directories outside the chroot jail open by preventing use of fchdir or openat which would allow you to break out.
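
In outline, the classic sequence looks like this (a sketch; enter_jail is a hypothetical name, "/var/empty" and the IDs are placeholders, and real code would report errors rather than abort):

#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static void
enter_jail(const char *dir, uid_t uid, gid_t gid)
{
    if (chdir(dir) == -1 || chroot(dir) == -1)
        abort();
    if (setgroups(0, NULL) == -1)               /* drop supplementary groups */
        abort();
    if (setgid(gid) == -1 || setuid(uid) == -1) /* gid first, then uid */
        abort();
    /* From here on there is no root left to walk out with. */
}

Called as, say, enter_jail("/var/empty", 65534, 65534) while still root, before servicing any untrusted input.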

Voodoo coding

Posted Jul 14, 2014 20:21 UTC (Mon) by PaXTeam (guest, #24616) [Link]

PaX itself doesn't have the hardened chroot feature, grsecurity does.

Voodoo coding

Posted Jul 15, 2014 18:28 UTC (Tue) by drag (guest, #31333) [Link] (2 responses)

root in a chroot still has root privileges. Unless you are extremely careful, breaking out of a chroot 'jail' is _VERY_ easy.

If chroot made sense from a security perspective we wouldn't have any need for things like 'LXC containers'.

Voodoo coding

Posted Jul 15, 2014 20:34 UTC (Tue) by wahern (subscriber, #37304) [Link]

If you setgid and setuid to a non-privileged user and don't have any open directory descriptors, how easy is it to get out?

There are issues with signal and ptrace, but those are easily fixed by using a specialized UID and GID per service.

Arguing that root can break out of a chroot jail is a strawman. Nobody runs as root inside a chroot jail.

And if you're really paranoid, neither LXC nor even full-blown virtualization is sufficient, because the Linux kernel (like all software) is riddled with bugs, and last time I checked sophisticated hackers didn't find themselves defeated by the presence of VMWare or KVM.

Voodoo coding

Posted Jul 15, 2014 23:39 UTC (Tue) by dlang (guest, #313) [Link]

Well, you would drop root as quickly as you can after establishing the chroot, and if you properly minimize the things accessible inside the chroot, you make it harder to find a local exploit to get back to root.

Voodoo coding

Posted Jul 14, 2014 11:20 UTC (Mon) by cesarb (subscriber, #6266) [Link] (9 responses)

> This is so confused. The only correct thing to do if things fail is, well, to let them fail. Return an error; there's really no shame in that. People should be prepared for things to fail.

The problem here is that C does not have exceptions.

It does no good to return an error code if everyone ignores it. It's especially bad in crypto: failure to seed the RNG results in something which _looks_ like a valid key/iv/nonce, works like a valid key/iv/nonce, but completely breaks the underlying mathematical assumptions the crypto algorithms depend on, by being easily guessable and/or not unique. Two years later, someone finally notices, and the whole Internet has to generate new keys (this has happened before).

With exceptions, ignoring the error kills the program. Without exceptions, the only sane way out is to pretend it was an uncaught exception and kill the program.

@busterb, if you are reading this, I can see where mezcalero is coming from: he's a systemd developer, and it's really bad if the init process is killed (though not nearly as bad as a crypto key compromise), so init system developers tend to develop allergies to libraries which kill their own process.

How about this suggestion: only the initial seed (and the first reseed after a fork) should kill the process on a failure return from getentropy(). If it fails on other reseeds, accept the failure (generating an extra few bytes with the RNG itself and using them as the new seed) and keep going. This way, a developer using libressl would only have to force a reseed (by trying to get a random number) at the start of the program (if you can't open a fd at that point, you have bigger problems and it's best to just dump core) and after a fork, and the developer would know the library won't randomly (heh) kill the program after that point.

> And yuck. A new syscall? Of course this can fail too. For example, very likely on all kernels that don't have it yet, i.e. all of today's...

Today's kernels shouldn't fail, because they have the sysctl syscall (the idea is to try the new syscall first, then fall back to /dev/urandom, then to sysctl). The idea is to get the getentropy() syscall (or an equivalent; I'd propose a syscall with an extra flags parameter) into the kernel before sysctl is gone for good, so there won't be kernel versions where it all fails.

----

As an aside: it probably can't be done because of API compatibility concerns, but the way I'd do it if it was possible and didn't cause any new problems would be to open the fd to /dev/urandom early _and keep it open_ (let the kernel close it on exit or exec). If reading from an open /dev/urandom fd fails, you probably have bigger problems.

Voodoo coding

Posted Jul 14, 2014 19:15 UTC (Mon) by ledow (guest, #11753) [Link] (1 responses)

I'd be much more wary of a supposedly secure program not checking the return code of a function vital to its operation than of an OS that deliberately and carefully returns that code in the first place.

Most programs in the world do not care about the randomness of an RNG. Only one type really does - those that handle public key encryption. If such a program fails to check THE most important part of its initialisation and doesn't at least throw out a warning string on stderr, then there's a bigger problem than how we signal that kind of error to it.

And, personally, I'd much prefer a warning of the "deprecation" kind in my logs from init if something goes wrong with that function, than any application crashing because it can't handle a particular syscall. If people are running secure systems and ignore printk messages that tell them the program used a function that it shouldn't, then they get what they deserve.

The "exceptions in C" thing is really just another dig at the language of choice in all these matters. There are plenty of ways for a C program to signal there was a problem - for instance failing any further calls until it has been properly initialised, setting a particular flag, returning a code to callers, etc. If people still AREN'T BOTHERING to check - whatever that method is - that's pretty much the death-knell to any kind of supposedly "secure" program, to my eyes.

Voodoo coding

Posted Jul 14, 2014 19:57 UTC (Mon) by alonz (subscriber, #815) [Link]

Just to correct one misconception—public key encryption is not the only case where randomness is mandatory. Quite a few other crypto primitives/schemes will fail subtly when used with bad randomness. A nice overview can be found here.

Deterministic public-key encryption is an active research area; for many uses (including common cases, such as key exchange) it actually is feasible.

Voodoo coding

Posted Jul 14, 2014 19:46 UTC (Mon) by wahern (subscriber, #37304) [Link] (6 responses)

The problem with sysctl is that Red Hat has removed the sysctl syscall by default. sysctl(2) will _always_ fail on modern stock Red Hat systems. It also fails on all the Gentoo systems I've tried, but I'm not sure if that's the default or a deliberate decision by our sysadmins. I only realized this recently, as I use Debian and Debian derivatives, and despite knowing about the kernel option I never fathomed that large vendors (especially ones which make claims to stable ABIs and APIs) would knowingly disable the sysctl syscall, considering all the software (like Tor) which depended on it at the time.

So sysctl({CTL_KERN, KERN_RANDOM, RANDOM_UUID}) is no longer a viable alternative. The only way to directly access kernel randomness is through an open reference to /dev/urandom or /proc/sys/kernel/random/uuid (the /proc sysctl interface).

That's the crux of the issue. If sysctl was still available then all would be well, other than some bickering over a sysctl versus a dedicated syscall interface.

In short: sysctl(2) is dead for all practical purposes on Linux. Now Linux behaves pretty much like Solaris, which never had sysctl (a later BSD extension). A lack of sysctl is one of the most annoying things about Solaris (although that's a long list).

OS X's arc4random also relies on /dev/urandom, since it copied an early FreeBSD implementation from before FreeBSD added sysctl({CTL_KERN, KERN_ARND}). And it will silently fail if /dev/urandom isn't visible when it initially seeds! And although I've long tried to support systems like OS X, Solaris, and FreeBSD<10.0 which lacked a kernel entropy syscall, I've always considered them second-class citizens in this regard, and been willing to live with a disclaimer about possible issues. But now that Linux is second-class in this regard, it's a much more intolerable situation.

Voodoo coding

Posted Jul 14, 2014 20:05 UTC (Mon) by alonz (subscriber, #815) [Link] (4 responses)

By the way—another underutilized source of entropy in Linux programs is the vector returned by getauxval(AT_RANDOM). Sure, it is intended for use by libc (e.g. to produce stack canaries), but when nothing else is available, it can be very valuable.
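
For example (a sketch; needs glibc 2.16 or later, and AT_RANDOM points at 16 bytes the kernel supplied at process startup):

#include <stdio.h>
#include <sys/auxv.h>

int
main(void)
{
    const unsigned char *r = (const unsigned char *)getauxval(AT_RANDOM);

    if (r == NULL)
        return 1;           /* not supplied on this system */
    for (int i = 0; i < 16; i++)
        printf("%02x", r[i]);
    printf("\n");
    return 0;
}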

Voodoo coding

Posted Jul 14, 2014 20:42 UTC (Mon) by wahern (subscriber, #37304) [Link] (3 responses)

Nice. I was unaware of that interface, although it doesn't help with forking, etc.

But it looks like Linux finally supports a fork-safe issetugid implementation. Linux was one of the last systems which didn't provide issetugid or a similar interface for detecting whether the current process or (crucially) an ancestor was setuid or setgid. glibc had a hack in its loader for supporting secure_getenv and similar behavior, but it wasn't guaranteed to work in children because it depended on the real and effective IDs being different, which wouldn't be the case if you effectively dropped privileges.

Voodoo coding

Posted Jul 14, 2014 21:20 UTC (Mon) by wahern (subscriber, #37304) [Link] (2 responses)

Caveat emptor: on OS X, issetugid is another broken stub (like pselect) which doesn't actually implement the correct behavior, but was apparently thrown in so software can compile while remaining silently, delightfully bug-ridden. Although at least the pselect man page documents the broken behavior.

The BSDs and Solaris implement the correct behavior, as does Linux's new getauxval(AT_SECURE). That is, the status is inherited across fork but not exec.

Voodoo coding

Posted Jul 15, 2014 16:41 UTC (Tue) by busterb (subscriber, #560) [Link] (1 responses)

Hmm, that is interesting, I'll check it out.

Solaris 10 and 11.0 also apparently have issues with issetugid, though it kind-of works (they apparently didn't patch it for 10 because not enough software used it yet?)

http://mcarpenter.org/blog/2013/01/15/solaris-issetugid(2)-bug

Though there are more issues building on Solaris 10 so far, so we haven't crossed that bridge yet.

Voodoo coding

Posted Jul 15, 2014 16:55 UTC (Tue) by busterb (subscriber, #560) [Link]

Huh, I ran the same test as above (for Solaris) on OS X 10.9.4; it would appear to have the same issue at first glance:

test: main: issetugid: 1
test: parent: issetugid: 1
test: parent: uid: 1000
test: parent: euid: 0
test: child: issetugid: 0
test: child: uid: 1000
test: child: euid: 0
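
(For clarity, a hypothetical reconstruction of the kind of test this is, not the actual program: a setuid-root binary run from a normal account, which forks and reports issetugid() and the IDs in both processes. On a correct implementation the child should still report 1.)

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void
report(const char *who)
{
    printf("test: %s: issetugid: %d\n", who, issetugid());
    printf("test: %s: uid: %d\n", who, (int)getuid());
    printf("test: %s: euid: %d\n", who, (int)geteuid());
}

int
main(void)
{
    printf("test: main: issetugid: %d\n", issetugid());

    pid_t pid = fork();
    if (pid == 0) {
        report("child");    /* buggy systems report 0 here */
    } else {
        report("parent");
        wait(NULL);
    }
    return 0;
}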

Voodoo coding

Posted Jul 14, 2014 20:23 UTC (Mon) by wahern (subscriber, #37304) [Link]

For the record (lest somebody try to use me as a strawman), all my code always checked the return value of sysctl and fell back on /dev/urandom. If that failed, my apps then went through the typical horrible hacks of manually collecting entropy, although I realize now that was a poor engineering decision--obscuring the Red Hat kernel changes for far too long--and am changing all such code to bail by default.

Fork detection

Posted Jul 15, 2014 1:46 UTC (Tue) by cesarb (subscriber, #6266) [Link] (1 responses)

I just saw a discussion at https://news.ycombinator.com/item?id=8033779 about a potential problem with clone(): if a program forks twice (with the original process exiting before the second fork), it can end up with the same PID, breaking the "fork detection" of the PRNG (which tries to detect forks by seeing if the PID has changed).

I thought of a way around that problem, with the kernel's help: a per-process generation counter. Make it large enough (like 64 bits) and increment it with each fork. If the generation counter has changed, it's a fork even if the PID ended up the same.

The discussion on the original post has a simpler (though less reliable) workaround: comparing the process creation time.

(Perhaps the audit subsystem has some non-wrapping id which could be useful here?)

Fork detection

Posted Jul 16, 2014 0:35 UTC (Wed) by wahern (subscriber, #37304) [Link]

One solution is using pthread_atfork and, instead of resetting the state in the child, resetting the state in the parent. That seems to me like a better idea than getpid(), because you can't always trust the child to do the right thing after returning from fork. Maybe it was compromised, and now it's leaking your entire PRNG state.
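
A sketch of that approach (rng_state and the handler names are stand-ins for a real PRNG's state handling; production code would use explicit_bzero or similar so the wipe isn't optimized away):

#include <pthread.h>
#include <string.h>

static unsigned char rng_state[64];     /* stand-in for the PRNG state */

static void
atfork_parent(void)
{
    /* Wipe the parent's state after fork: even if a compromised child
     * leaks the old state, the parent reseeds lazily on next use. */
    memset(rng_state, 0, sizeof(rng_state));
}

void
rng_install_fork_handler(void)
{
    pthread_atfork(NULL, atfork_parent, NULL);
}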

OpenBSD just added a new flag to mmap which requests the kernel to zero mapped pages in the child process after a fork. This solves all the issues. The PRNG state is gone in the child and the child will know to reseed.

