September 28, 2005
By Pamela Jones, Editor of Groklaw
Lawyers, like the rest of us, are reacting with great interest and some
passion to the Author's Guild's copyright infringement lawsuit against
Google over its new Google Print Library Project, by which Google plans to
scan books from the libraries of Harvard, Stanford, Oxford, the University
of Michigan, and the New York Public Library and make them searchable by
keyword. Google describes the project's goals like this:
The Library Project's aim is simple: make it easier to find relevant
books. We hope to guide users to books, specifically books they might
not be able to find any other way, all while carefully respecting
authors' and publishers' copyrights. Our ultimate goal is to work with
publishers and libraries to create a comprehensive, searchable,
next-generation card catalog of all books in all languages that helps users
discover new books and publishers find new readers.
The Author's Guild describes it differently. To them, it's massive copyright
infringement, pure and simple. The lawyers are trying to figure out who is
right and which side is more likely to prevail, to the extent anyone can
predict a fair use case, but there are bigger issues raised by this
litigation. Here's the complaint [PDF]
and Google's public
statement in response. If you'd like to follow the lawyers'
discussions, here are some places where you can do so: Susan
Crawford's blog, William Patry's The
Patry Copyright Blog, and Eric Goldman's Technology
and Marketing Law Blog, and here's Andrew Raff's excellent collection
of attorney reactions on IPTAblog. You might enjoy reading Tim O'Reilly's thoughtful
take on the lawsuit, looking at it from a publisher's point of view.
How Google Print Library Works
What exactly is Google doing with Google Print?
First, what *isn't* it doing? It isn't making copyrighted books available
cover to cover against anyone's will. There are three parts to Google
Print. One, Google makes books available in their entirety only when the
books are in the public domain, as Project Gutenberg has done for years.
Two, when publishers or authors agree, it makes sections available (the
page the keyword appears on and a few pages on either side), but that is a
separate facet of the project, the Google Print Publisher Program. The one
the Author's Guild is fighting over is the third part, Google's Print
Library Program, and for that Google will show only a few sentences on both
sides of the keyword searched for, and not necessarily complete sentences.
You never see a full page, let alone an entire book. You will also find
bibliographic information and where you can find related information on the
web. In all cases, you will also be directed to nearby libraries and
bookstores where the book is available for purchase or loan, including
second-hand bookstores for out-of-print books.
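For the technically curious, here is a rough Python sketch of the idea behind that third part: the full text has to be on hand to build a keyword index, yet a search hands back only a short excerpt around each hit. It is an illustration only, not Google's actual code; the function names `build_index` and `snippets` and the `radius` parameter are hypothetical.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each lowercased word to the character offsets where it occurs.

    Building the index requires reading the whole text, which is the
    point made by those who say the full scan is necessary.
    """
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z']+", text):
        index[match.group().lower()].append(match.start())
    return dict(index)


def snippets(text, index, keyword, radius=40):
    """Return a short excerpt around each occurrence of keyword.

    Only `radius` characters on either side of the hit are returned,
    so the excerpt may start or end in the middle of a sentence.
    """
    excerpts = []
    for pos in index.get(keyword.lower(), []):
        start = max(0, pos - radius)
        end = min(len(text), pos + len(keyword) + radius)
        excerpts.append(text[start:end])
    return excerpts


if __name__ == "__main__":
    book = "It was the best of times, it was the worst of times."
    idx = build_index(book)
    for excerpt in snippets(book, idx, "times", radius=5):
        print(repr(excerpt))
```

The whole text goes into `build_index`, but nothing longer than the keyword plus `2 * radius` characters ever comes back out of `snippets`.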
Screenshots of the
three different offerings can be viewed here. And
Google's Common
Questions about the Google Print Library Project says that Google Print
is "designed to help you discover books, not read them from start to
finish. It's like going to a bookstore and browsing, only with a Google
twist."
Google's Side
On the Google side, the
clearest arguments are presented by EFF's Jason
Schultz, who explains the four fair use tests; Jonathan Band's paper,
"The Google
Print Library Project: A Copyright Analysis" [PDF]; and Susan
Crawford on her blog, all of whom essentially say that copying entire
books in order to make a digital keyword-based catalog is transformative
and is fair use. Google isn't copying more than is necessary, they argue,
because you can't search for keywords unless you have the whole book
available. And anyway, where's the harm to the market? They cite the Kelly v. Arriba Soft case [PDF], in which the defendant made
thumbnails of other people's photos available online in response to search
requests, with links to the original works, if anyone wanted to purchase
them. Arriba's use was ruled fair use, despite the fact that not only was
an entire copy of the original made, but a smaller version of it, in its
entirety, was made available to the public. Google is only showing a
sentence or two, not the entire book, for works where the author hasn't
given approval to show more. If Arriba is fair use, why isn't
Google Print's Library Project also?
If you wrote an article for a magazine and quoted a sentence or two,
likely no one would complain, because it's so obviously fair use, so why is
it a problem for Google to do the same thing with books? And what is the
difference between Google collecting the world's content made available on
the Internet so as to make it searchable and collecting keywords from the
world's books? Copyright holders can opt out. If Google Print violates
copyright law, why doesn't Google, period?
A common theme on both sides of the argument goes like this: Google has
had a fantastic idea, one that can benefit the human race, and almost
everyone hopes there is a way for them to do this. It's just a question of
how to do it right. Google is shouldering the expense and effort of making
a library card catalogue, so to speak, of the world's knowledge and
offering it free to the world. Can anyone *not* want that to happen?
Authors should want to be included so they can be found. The world does
its research now predominantly online, and authors, particularly authors
whose works aren't selling like hot cakes, have everything to gain from
being included in Google Print.
Author's Guild's Side
On the Author's Guild side is the argument that authors have the right to
decide when others may or may not copy their works. This case differs from
Google indexing the web's content, because a license can be
inferred when someone puts content on the web and doesn't take steps to
ban Google and other search engines with a robots.txt file. There is no
equivalent implied permission from the authors of these books.
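To make the robots.txt point concrete, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the example.com URL and the `may_crawl` helper are made up for illustration; the sketch only shows how a site owner's opt-out is expressed and checked, and says nothing about how any particular search engine implements it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site: it bars Google's crawler
# from every page and leaves all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def may_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/essay.html"
    print(may_crawl(ROBOTS_TXT, "Googlebot", page))     # False
    print(may_crawl(ROBOTS_TXT, "SomeOtherBot", page))  # True
```

A site that publishes no such file has said nothing either way, and that silence is what the implied-license argument for web indexing rests on. A printed book on a library shelf has no comparable switch.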
Copyright law gives copyright holders the right to make copies, period, and
no one else can do so without permission. Libraries don't own the
copyrights to these works, so they can't give permission, it is
argued. Google will violate copyright law, no matter how little it shows
the world, because it will make copies and store them on its servers. The
onus is on Google to contact all the authors and publishers and get
permissions, one by one, they say. If that is so onerous and costly that
Google Print Library can't happen, so be it. The law is the law. This
side cites the MP3 decision
[PDF].
We might wish it could happen, some on that side say, but copyright law
is what it is, so it can't. Some even predict that this litigation will
shut down search engines like Google's. A few hope that happens. Some of
the complaints about Google Print seem more emotional than based on fact.
One comment
on Boing Boing by a publisher is particularly interesting:
Google Print for Libraries has two pretty major flaws. One
being giving a digital copy of all of our works to the participating
libraries where they will then most likely be used in e-course reserves
without any compensation to either author or publisher. University
Libraries have an awful track record at compensating for e-course reserves
and post our content frequently without any restrictions or security.
The second being Google will be profiting (through GoogleAds) on this
content again without compensating the authors or publishers. Fair use
should exclude commercial use. Even Creative Commons licenses (which I
grant to my flikr account) gives you that option.
If we expect the production of good scholarship to be a viable, it has to
be paid for somehow.
A little more accurate information may help calm these fears. First, fair
use doesn't exclude commercial use. I can write a parody, for example, of
your book, even if you don't want me to, and I can sell my parody. Second,
take a look at the terms of the Google-University of Michigan agreement
[PDF], which is available on the university's web site, and you will see that Google
has bound the University, and any of its partners, to limitations on access
and use. Further, should there ever be a dispute between an author and
Google about including a work, the work can be removed by Google, and the
University must then follow suit. Authors can always opt out.
What about the allegation that Google will make money from this project
from ads? Google says there won't be any ads on the books scanned from a
library. This is important, because the Complaint specifically alleges
that Google will be profiting by ads: "4. Google has announced plans to
reproduce the Works for use on its website in order to attract visitors to
its web site and generate advertising revenue thereby." As for the links
to bookstores, Google says that the links they will provide will not be
"paid for by those sites, nor does Google or any library benefit if you buy
something from one of these retailers." Clause 4.3 of the agreement says
that the service will be provided "at no direct cost to end users".
While the Author's Guild makes much of Google allegedly profiting off of
its members' work, a strong argument can be made that it's the other way around,
since Google is providing a new way for readers to discover their members'
books, even those on the deep, deep backlist, as you can see in this example.
Are There Problems with the Complaint?
Then there are some attorneys already pointing
out flaws, procedural defects they believe they see in the Author's
Guild complaint. It is supposedly a class action, but some see a problem
with class certification. The complaint defines the class as all persons
or entities that hold the copyright to a literary work that is contained in
the library of the University of Michigan.
A class action is supposed to serve the interests of the whole group that
the few named plaintiffs claim to represent, but Lawrence Solum, who is an
author and a member of the plaintiff class in the sense that he has several
works in the University of Michigan's library, opposes
the lawsuit and says he will be harmed if the Author's Guild prevails:
I have a very strong objective interest in Google Print succeeding --
because as a scholar, I benefit from the dissemination of my works and
because reaching agreement with Google will be costly to me and Google,
essentially killing the project. A substantial intraclass conflict of
interest destroys "adequacy of representation," making class certification
inappropriate, both under the federal rules of civil procedure and under
the due process clause of the fifth amendment of the
U.S. Constitution. . . . Pro-bono representation for intervenors opposing
certification, anyone?
Is it Copying That Causes Harm, or Distribution?
Think about brick and mortar libraries. Suppose I were a librarian. I
want to catalogue every book in my library and do it by keyword, so readers
can come to the library and look up information by keywords on index cards
that I laboriously file alphabetically in file cabinets. Each keyword
will show you where in that library you can find a book that uses that
keyword, with the page given, and additionally tells you where, in nearby
bookstores, you can buy the book.
Would my painstaking work be a copyright offense? It's laughable to even
think of it.
Now, suppose I take all my index cards, and I laboriously hand type them
into a computer. I have a computer database now, listing every
keyword. Now have I violated copyright? Again, it doesn't pass the laugh
test, does it?
But what if I realize that instead of the hand method, all I have to do is
scan in the whole book and then pick out keywords by algorithm. Now am I a
copyright infringer? If so, why? On the technicality that I had to scan
in the whole book, thus making a copy, in order to break it down into
keywords for my card catalogue of my library's contents? Purists for the
law will say "Yes. You are an infringer," because you made a copy.
And they are right. You did. But exactly who is harmed by this scenario?
The end result is exactly the same, whether I do the work by hand or by
computer, except that Google deliberately limits how much I can see,
whereas in the library, the keyword would lead me to the entire book, which
presumably I could borrow, take home and scan or Xerox myself, if I don't
care about copyright.
If the copy merely stays on Google's servers, used only for making a
digital card catalogue, in what way is the author or the publisher harmed?
Have they lost any sales?
Google isn't displaying the works in their entirety on its website, as
the Author's Guild seems to imagine. It isn't selling the books or
offering them for download. It is offering a tool to search books. Where is
the harm to the market? Libraries have special rights under Copyright
Law. Why shouldn't this project?
The Big Picture Questions
For those of us who are not lawyers, our dominant reaction to this
lawsuit is probably that if Google Print Library violates copyright law,
somebody needs to change the law.
This litigation raises some important questions: What is a library in the
digital age? What is a book? Is Google Print going to do away with books
as containers of knowledge, replaced by searchable databases? What about
this litigation's effect on copyright law in the US? Is it possible, as
one comment on the Conglomerate blog suggests,
that if it wins, "Google may be planting the seeds of the destruction of
copyright as we know it"?
Computers are, under current law, the ultimate infringers, in the sense
that you can't read anything on a computer without making a copy in RAM.
There is, in short, no way to avoid making a copy if you access a work at
all. It's the gotcha of copyright law in the digital age, and at some
point, some say, we need to think about that issue and decide what to do
about it. If you want the hairs on your head to stand straight up, read the
MAI Systems Corp. v. Peak Computer, Inc., 991 F.2d 511 (9th Cir. 1993)
decision and note how little it grasps of what using a computer involves:
"After reviewing the record, we find no specific facts
. . . which indicate that the copy created in the RAM is not fixed."
Susan Crawford explains:
All computers do is copy. Copyright
law has this idea of strict liability -- no matter what your intent is, if
you make a copy without authorization, you're an infringer. So computers
are natural-born automatic infringers. Copyright law and computers are
always running into conflict -- we really need to rewrite copyright
law.
Ernest Miller and Joan Feigenbaum, in their very interesting paper "Taking the Copy out of
Copyright" [PDF], suggest that we drop the copy from copyright law and
focus on distribution instead. After all, it's distribution that harms
authors and publishers, not copies on a Google server no one can see or
access but Google.
We watched Napster get hogtied, killed, cremated and scattered to the
winds, and most of us were sad that the law was trying to snuff out a
great new idea because the courts seemed not to grasp the tech and the real
potential for businesses founded on this new technology.
But the world's books? Should the law block a new way to research and find
books on any topic any human has ever written about, broken down and
searchable by keyword, a way to find specific books by keyword in the
finest libraries in the world, without having to travel there physically?
Larry Lessig puts it like
this:
Google Print could be the most important contribution to
the spread of knowledge since Jefferson dreamed of national libraries. It
is an astonishing opportunity to revive our cultural past, and make it
accessible. . . . Google wants to do nothing more to 20,000,000 books than
it does to the Internet: it wants to index them, and it offers anyone in
the index the right to opt out. If it is illegal to do that with 20,000,000
books, then why is it legal to do it with the Internet? The "authors'"
claims, if true, mean Google itself is illegal. Common sense, or better,
commons sense, revolts at the idea. And so too should you.
The Author's Guild has only 8,000 members. I say "only" because Groklaw has
more members than that. The value to the public of Google's Print Library
collection so far outweighs the value of one book to one author, or even
8,000 books to 8,000 authors, that it is hard to comprehend how any law
could be allowed to produce such a result as shutting down Google on the
demand of those 8,000 authors.
Copyright law is designed to protect authors, yes, but it is supposed to do
so in a balance with the public good. Copyright law's purpose is to
further the public good by promoting more works of authorship, so as to
make knowledge available. When did that part of the law's purpose get
forgotten? Protecting authors' rights is a means to the end of making
knowledge more freely available, which is exactly what Google is trying to
do. If the Author's Guild succeeds in blocking this project, it will have
managed to turn copyright into a means for restricting the spread of ideas
and reducing the public good.