> Of course, all Debian packages (100%) do have the debian/copyright file with hopefully complete and accurate information about the package's copyright owner(s) and license(s). The new machine-readable format is, however, already applied to 44% of the packages.
Oh, right, thanks for spotting it --- I didn't notice this during my first read of the article.
debian/copyright (in whatever format) is indeed a prerequisite for all Debian packages; I mentioned this in my talk, and you can see it in the slide about package qualification included in the article. In that respect, the Debian archive already offers a huge corpus of reviewed copyright statements. Of those statements, "only" ~44% are both reviewed *and* machine-readable according to the new debian/copyright format.
@mkerrisk: fancy adding a word or two to clarify this?