About those orphans

In the days since they filed a wide-ranging lawsuit against research libraries over their collaboration with Google and the nascent Orphan Works Project, the Authors Guild has dropped several clever blog posts walking through the orphan candidates list to see if they can find rights holders. While this kind of collaboration is exactly what HathiTrust and its partners had in mind when they posted the list, the Michigan Library has announced they are suspending the project so they can improve the search process in light of the information the Guild is finding. As Michigan says in its statement, this iteration is a salutary part of the process and does nothing to discredit the underlying goal of identifying and surfacing neglected works.

It is important to remember a few things as the project re-tools.

This kind of input from rightsholders is welcome and helpful

It looks like the Guild’s blog is surfacing some of the kinds of information the published orphan candidates list is supposed to surface: not only the identity of one particular rights holder, but also ways to improve the search process so that it is likely to yield better information on other works. As commenter @mcburton points out, groups like the Authors Guild could play a more constructive role in this work by joining in the search process. That is why the list is published. Collaboration would surely be cheaper than the legal fees associated with an epic lawsuit, would generate less dissent from authors who support libraries, and would provide a better service to members who themselves benefit from increased access to knowledge held in research libraries. And it would not dismantle an invaluable preservation resource of 10 million works from library collections.

Fair use and Section 108 apply in many situations even if you find a rightsholder

Finding an author, even finding a rights holding author (not all authors hold the necessary rights, after all), does not defeat the legal arguments in favor of Hathi’s uses. (See Jonathan Band’s brief for details on the lawfulness of the project.) The key fact is really that these works are out of print and, hence, off the market. Of course, this does mean that for the couple of works that have been found to be in print, the legal case for their use is much, much weaker.

Hathi and its partners propose very modest, non-commercial uses that are traditional library uses and are beneficial for authors as well as users

Notwithstanding the Guild’s suggestion that the author they located had been rescued from a fate worse than literary death, the libraries’ use of these works is extremely modest and is more likely to resuscitate a work on its death bed than to harm it. As Ed Van Gemert, deputy director of libraries at the University of Wisconsin-Madison, has pointed out, finding authors is a great way to open up content even more, as authors (especially authors of out-of-print works) typically want their works visible, rather than buried in a dark archive.

Works designated as “orphans” will only be available for study by authenticated faculty and students at institutions who hold the physical book in their collections. The number of simultaneous users will be limited to the number of copies held in the library collection; if the library holds one copy, only one user (not “at least 250,000,” as the Guild implies) will be able to access the work digitally at a given time. Users can view the works in a browser, or download one page of the book at a time as PDFs; this mode of access replicates the limited access that users have to library books in the stacks.
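The access rule described above can be sketched in a few lines. This is only an illustration of the stated policy (authenticated users only, and no more simultaneous readers than print copies held); the class name, method names, and user identifiers are hypothetical and do not describe HathiTrust's actual system.

```python
from dataclasses import dataclass, field


@dataclass
class OrphanCandidate:
    """Hypothetical model of one digitized work and the print copies a library holds."""
    title: str
    copies_held: int
    active_readers: set = field(default_factory=set)

    def open_for(self, user: str, authenticated: bool) -> bool:
        """Grant access only to authenticated users while a held copy is free."""
        if not authenticated or self.copies_held == 0:
            return False
        if user in self.active_readers:
            return True  # already reading; no additional copy is consumed
        if len(self.active_readers) >= self.copies_held:
            return False  # every held copy is already in use
        self.active_readers.add(user)
        return True

    def close_for(self, user: str) -> None:
        """Release the user's slot so another reader can open the work."""
        self.active_readers.discard(user)


# A library holding one print copy admits one digital reader at a time.
book = OrphanCandidate(title="Example Title", copies_held=1)
first = book.open_for("student_a", authenticated=True)
second = book.open_for("student_b", authenticated=True)
book.close_for("student_a")
third = book.open_for("student_b", authenticated=True)
print(first, second, third)
```

With one copy held, the second reader is turned away until the first is finished, which is the digital analogue of a book being off the shelf.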

At the same time, there is extraordinary new value in preservation, search, and access for the print-disabled. The likely impact for works like The Lonely Country, which is out of print, seems to have no braille or audiobook edition, and, in at least one major research library, has not been checked out since at least 1993 (and perhaps longer; records from earlier years were not readily accessible), is that more library users will find it when they are doing relevant research, more library users will be able to read it, and users far into the future will still have access to it even when the physical copies have long ago deteriorated. Finding your audience, reaching the print-disabled, and surviving throughout the ages is hardly a nightmare scenario for a writer.

Finding an author on Wikipedia ≠ finding a work’s rights holder

Indeed, finding an author in any case is not the same as finding the person who holds rights to a particular work. Authors sign away their rights, contracts get lost, publishers go out of business. It is simply not the case that a work is no longer an “orphan” once you can find that its author is alive, or that she left heirs, much less that her papers reside somewhere or that she has endowed a department in her name. Indeed, the slim pickings that the Guild has turned up over the last week - two works in print and one author with a literary agent, plus a lot of “strong leads” - show how difficult this work really is. Two or three confirmed false positives out of 170 or so candidates is an error rate of roughly 1 to 2 percent.

The careful use of a question mark in their “Two more?” blog post acknowledges that what the Guild is finding is not the kind of information that would allow a good-faith user to ask permission for uses of these works. Anyone who knows anything about academic publishing knows that finding an author’s papers or even contacting an author and talking to them about their rights will rarely, if ever, settle the question of who holds the copyright. Even libraries who are themselves the custodians of an author’s papers can have significant difficulty determining the copyright status of these collections. Often the best you can do is a kind of quitclaim deed that says, “I don’t know if I have rights in this thing, but if I do, I won’t sue you.”

So, while the commenters on the Authors Guild blog have turned up what they consider to be “strong leads” on rights holders for books on the candidate list, we should be clear about the limits of these discoveries.

The search should be reasonable given the circumstances, and it will never be perfect

These facts about the difficulty of locating bona fide rights holders should help folks see that perfection is not an option in this context. There’s no question that, given infinite time and resources, there would be no ‘orphan works’ problem. A hundred thousand genealogists together with as many lawyers could surely find every rights holder there is to be found. The whole idea of “orphan works” assumes that at some point, a reasonable search will still come up empty-handed. A recent study from the UK suggests that, given a more normal staffing situation (i.e., ordinary librarians), searching for orphans manually on an item-by-item basis is inherently unworkable for large-scale projects, estimating that this process would take 1,000 years to process 500,000 works. That’s 20,000 years to process the 10 million volume HathiTrust collection. Crowd-sourcing and other innovations can cut down on that time, but clearly a perfect search is impractical and unnecessary given the modest, non-commercial nature of the use involved here.
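The scaling arithmetic above can be checked with a short sketch. The figures (1,000 years per 500,000 works, and a collection of 10 million volumes) come from the text; the function name and the assumption that search effort grows linearly with the number of works are illustrative only.

```python
def years_to_clear(works: int,
                   baseline_works: int = 500_000,
                   baseline_years: float = 1_000.0) -> float:
    """Scale the study's estimate linearly to a collection of a different size."""
    return works / baseline_works * baseline_years


# 10 million volumes is 20 times the study's 500,000-work baseline.
hathitrust_years = years_to_clear(10_000_000)
print(f"{hathitrust_years:,.0f} years")
```

Under that linear assumption the 10-million-volume collection works out to 20 times the baseline, or 20,000 years of item-by-item searching.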

Recent proposals for orphan works legislation reflect this reality. The Shawn Bentley Orphan Works Act of 2008, for example, would have required a search that is “reasonable under the circumstances,” a qualified phrase that leaves room to factor in the nature and value of the use as well as the cost of the search. Adapting a novel for a commercial motion picture justifies a more diligent search than scanning photographs in a historical archive for preservation. Library searches should be thorough, but they should be proportional to library uses.

Threatening to unplug the HathiTrust is extremely shortsighted and unhelpful

The extreme nature of the remedy the Guild seeks shows how little the officers at the Guild value the preservation and stewardship of millions of volumes in library collections. Redundant digital storage ensures that works on crumbling acidic paper, works held in only a few institutions, works long out-of-print and forgotten by their authors and publishers, will remain part of our cultural heritage for generations to come. It makes those works discoverable to a new generation of scholars through the development of fair uses such as search and access to orphan works. Any author whose work was helped or inspired by research at a library, as Mr. Turow has proclaimed his own work was, should applaud this increased access and preservation. And yet, the Guild thinks it is better to pull the plug on this resource than to incur the inevitable risk of a few false positives in the search process.

So, while it is at first glance discouraging that the identification process seems to have misfired so soon, we should keep these mistakes in perspective. They do not change the legal case, and they will only make the process stronger and more useful in the long run.

Library Associations File Brief in Defense of Fair Use

Last Friday, the Association of Research Libraries, the American Library Association, and the Association of College and Research Libraries filed a friend of the court brief to defend the fair use rights of libraries. The brief responds to the Authors Guild’s extraordinary arguments in a lawsuit against the HathiTrust and several member libraries. The brief demonstrates that if the Authors Guild were to win the day, libraries would be severely curtailed in their ordinary activities, including lending books and providing Internet access to the public.

The Authors Guild has brought a suit against the HathiTrust and several of its member institutions claiming that these groups violate copyright by accepting, archiving, and making accessible thousands of digitized volumes created by Google in connection with the Google Books project. The libraries have responded that the project is protected by both the fair use doctrine and parts of the specific exception for libraries in Section 108 of the Copyright Act.


Voila: the archived video from Wednesday’s ARL webcast about the HathiTrust decision!

Finally, An Easy-ish Question

Copyright law can seem so confusing, with simple questions like “when will the song ‘Happy Birthday’ enter the public domain?” prompting elaborate research projects that still do not settle things once and for all. It’s so nice, then, when you find clear and unambiguous statutory language that is right on point and provides a simple answer to your question.

Today is our lucky day, because we are setting out to answer one of those rarest questions: one with a clear and unambiguous answer in the Copyright Act. Namely:

Does anything in Section 108 of the Copyright Act in any way affect the right of fair use as provided by Section 107?

It’s an important question, because libraries have certain specific rights under Section 108 that might be seen as preempting any more general rights they have under Section 107. If libraries want to use copyrighted materials in ways that are not covered by Section 108, is fair use available as a rationale to justify those uses? To put a finer point on it: Do libraries have fair use rights? Hmm.

Let’s do a quick search of Title 17…

AH-HA! The statute gives us a clear and unambiguous answer at Section 108(f)(4):

“Nothing in this Section in any way affects the right of fair use as provided by section 107.”

So libraries (and anyone else) have the same fair use rights under Section 107 as they would have if Section 108 did not exist. The scope of those rights is still up for debate (though I can recommend some resources that will help sort things out), but we know one thing for sure: fair use is completely separate from Section 108. Settled beyond rational dispute, right? You’d think so.

In what seems like a Swift-ian satire masquerading as a legal brief, the Authors Guild argues that, in fact, despite 108(f)(4), everything in Section 108 comprehensively affects the right of fair use as provided by Section 107, at least for libraries. As they put it,

“Congress included these rules [in Section 108] to carefully delineate the boundaries of fair use in the context of library copying.”

Wait, what?

The three pages or so that the Guild dedicates to making this Orwellian move do very little to overcome an informed reader’s incredulity. Indeed, Google Books litigation guru James Grimmelmann calls it a clever litigation trick but suggests even the Guild doesn’t really believe what it’s selling.

Canons of statutory interpretation

The first few arguments the Guild makes are based on general canons of statutory interpretation, like “the specific governs the general.” As any lawyer will tell you, canons of interpretation are what you turn to when there’s some ambiguity in the statutory text itself. If there were an open question about how Congress meant 108 and 107 to be interpreted, we might look to these general rules as a way to break the ambiguity. But there is no such ambiguity. Congress has said explicitly how courts are to use 108 as they determine a library’s (or anyone else’s) rights under Section 107: not at all. Maybe in a world without 108(f)(4) you could say that 108 is “specific” and 107 is “general,” and so 108 ought to trump. But we are not in that world, and have not been there for many decades. In the real world, Congress has determined that the specific provisions in 108 do not trump the general right described by Section 107, just as specific federal laws do not trump the general First Amendment right of free expression. 108(f)(4) settles this question, making recourse to general rules of thumb a subversion of clear Congressional intent. Strike 1 for the Guild.

Copyright Office Reports

Next, the Guild relies on a report written by the Copyright Office in 1983. In that report, the Office opines on the proper scope of library fair use rights in light of Section 108. Leaving aside the fact that to opine in this way is already in tension with the clear statutory language, the Guild’s invocation of this report is not persuasive for several reasons:

  1. There is no ambiguity in the statute. Again, expert opinion, even expert agency opinion, is only useful if you have a statute that is unclear. That is not the case here. The statute could not be clearer. There is no reason to consult any experts, not even the Copyright Office, and certainly not the Copyright Office circa 1983.

  2. Fair use law is made by the courts, not the Copyright Office, and it has evolved considerably since 1983. It is clear beyond dispute that when Congress ‘codified’ fair use in 1976, it was endorsing the continuing evolution of the doctrine in courts. The statutory factors are open-ended and flexible, and the courts have not been timid in deploying the doctrine in new and exciting ways over the last three decades. The Copyright Office in 1983 could not in its wildest dreams have anticipated the kinds of practices that have since been blessed as fair use, or the challenges that 21st Century libraries would face and the tools they would devise to meet those challenges.

  3. Courts are not bound by Copyright Office reports. That is not the rule for every agency. Some federal agencies are entitled to what is called “Chevron deference,” which means they are empowered to interpret the law in their area of specialty, and courts have to follow the interpretations those agencies make even if the judges disagree (unless the agency has completely gone off the deep end). The Copyright Office is not one of these specially-empowered agencies. Instead, courts must show it only “Skidmore deference,” i.e., courts will defer to it to the extent that its arguments are persuasive. In other words, courts show such agencies no deference at all; they make their arguments just like everyone else, and the court can take or leave them. That doesn’t stop the Guild from trying to say that the Office gets “deference,” shoving Skidmore into a footnote, perhaps hoping that the Judge will not read this brief carefully.

  4. It’s clear even from the excerpts in the Guild’s own brief that the Copyright Office in 1983 did not say that Section 108 is an exhaustive delineation of library fair use rights. Rather, the Office suggests that some stakeholders (though not the libraries) believed in 1976 that Section 108 allowed practices in excess of fair use, which does not necessarily entail that it comprehends all that fair use would allow. Indeed, as Professor Grimmelmann points out, the 1983 Report says fair use will “often” be unavailable in cases beyond the limits of 108; it does not say that fair use will “always” be unavailable.

To the extent that the Report seems to take a dim view of relying on fair use for large, systematic library projects, it is worth emphasizing the Report predates the advent of the Internet, of digital search algorithms, of cloud computing, and of a host of new practices that involve systematic large scale copying that courts have blessed as fair use. Courts are the ones empowered by the Copyright Act to determine the evolving bounds of fair use, and they have found copying that is very similar to that at issue here to be fair. They have done so repeatedly, and with growing certainty, and they have endorsed these practices when engaged in by commercial, for-profit actors. The Guild would have us believe that libraries whose mission is to promote teaching, learning, and scholarship have fewer fair use rights than billion dollar companies whose mission is to maximize advertising revenue. On the contrary, libraries should follow the lead of these companies in areas like web archiving.

The DMCA Savings Clause

Here I will just defer to Professor Grimmelmann, who untangles this knotty issue very effectively in his own blog post on the brief:

And then there is the brief’s discussion of another Copyright Act fair use saving clause, in Section 1201 of the DMCA:

Nothing in this section shall affect rights, remedies, limitations, or defenses to copyright infringement, including fair use, under this title.

In the famous DeCSS case Universal City Studios v. Corley the Second Circuit held that fair use was no defense to DMCA anti-circumvention liability. But—as the Second Circuit explained but the Authors Guild doesn’t—that was because the DMCA creates an independent form of circumvention liability that is different from infringement liability:

In the first place, the Appellants do not claim to be making fair use of any copyrighted materials, and nothing in the injunction prohibits them from making such fair use. They are barred from trafficking in a decryption code that enables unauthorized access to copyrighted materials.

That is, fair use as a defense to copyright infringement remains completely intact under the DMCA. Unlike the DMCA, however, Section 108 does not create new forms of liability, so that “violation” of it is not some new exotic action to which fair use does not apply. Failure to qualify for Section 108, per the text of the savings clause, simply kicks one back into the usual fair use balancing test.

So, there you have it. The statute is clear, the policy question is easy, and the Guild is making a series of Hail Mary arguments to try to avoid a long and (hopefully) fruitful inquiry into what fair use really means for libraries. Jonathan Band has already explored this substantive area, and his analysis is quite compelling.

What has developed in the content industries is a sense that copyright exists to support their businesses, so any new way they find to extract a little extra money from the rights they hold should be endorsed and protected by the courts. If you start from that premise, it makes sense to sue libraries for providing digital copies to blind people and professors for giving students access to short excerpts from a scholarly book because you believe you are acting from within the core purpose of copyright. But the premise is wrong.
—  Kevin Smith, in today’s LJ article Why Are Some Publishers So Wrong About Fair Use?
Judge Baer channels #librarianscode

In what can only be described as a total victory for libraries, Judge Harold Baer of the Southern District of New York held in an opinion published yesterday that the HathiTrust’s mass digitization project is protected fair use.

This isn’t news to academic and research librarians, who spoke loud and clear in the Code of Best Practices in Fair Use for Academic and Research Libraries. Judge Baer’s opinion should sound delightfully familiar to anyone who’s read Principles 3, 5, and 7 of the Code, which describe the consensus of academic and research librarians around preservation, accessibility, and non-consumptive uses (like search and text mining). Like the librarians, Judge Baer recognizes that these activities are “transformative,” especially the search and accessibility aspects. (For preservation he refers to the Sony decision, suggesting that non-commercial copying of this kind should be favored under the first factor.) He also recognizes that fair use generally favors these library uses as hugely valuable to the public, and particularly to “progress in science and the useful arts.” In short, Judge Baer fundamentally gets the bottom-line assumption that also underlies the Code - that fair use can and does provide space for bold action by libraries in service of their public service mission, because that mission is itself in service of the same goals as copyright.

There’s a lot more to say and to celebrate in this opinion, and I’ll certainly be writing more here and elsewhere, but for now I just wanted to point out this wonderful affirmation of the logic of the Code.

The goal of the project is to allow students, faculty and staff at the university to view books that the university owns, and to download one page at a time, similar to taking the book off the shelf and making a copy at a copy machine.
—  An MLibrary press release describing in the most detail so far the level of access that students and faculty would be afforded to digitized orphan works. Hardly a mass distribution or republication.
I cannot imagine a definition of fair use that would not encompass the transformative uses made by Defendants’ MDP and would require that I terminate this invaluable contribution to the progress of science and cultivation of the arts that at the same time effectuates the ideals espoused by the ADA.
—  Judge Harold Baer, in his wonderful opinion in the HathiTrust case.
Thoughts on the Copyright Office's Priorities for 2011-2013

The Copyright Office (CO) announced its priorities for the next two years yesterday, including several items of interest to research libraries. This blog post will walk through some of the highlights; the full document is here.

Report on Mass Digitization Coming (Very) Soon

Of all the goals outlined in the CO’s report, the one with the shortest time horizon is a preliminary analysis of the issues surrounding large-scale book digitization. The CO indicates that its analysis will be posted sometime in October 2011 (i.e., in the next few days).

As the CO’s mass digitization site’s current contents show, this work is an outgrowth of the Google Books litigation, in which the CO was a highly visible participant. Then-Register Marybeth Peters may have coined the most oft-repeated phrase in the oral arguments when she described the proposed “opt-out” settlement as “turning copyright on its head.” Peters has continued her work in support of “opt-in” solutions in her retirement, taking a position on the Board of Directors of the Copyright Clearance Center.

Mass digitization presents a host of unique problems that have not been addressed in previous efforts to sort out smaller-scale uses of library materials, especially orphan works. The one-at-a-time diligence that past orphan proposals have envisioned simply does not scale to the thousand- or million-volume level.

The CO says its analysis will include an evaluation of various solutions based in collective licensing (voluntary, collective, extended, and statutory). Recent conflicts in Canada, a close look at Norway’s regime, proposals in Europe, and a look at our own statutory licensing regimes for satellite TV all suggest that these types of solution can have significant disadvantages for libraries. It will be interesting to see what the CO makes of these issues.

Section 108, again.

In 2008, a study group composed of representatives from the rights holder communities as well as libraries, archives, museums and other user groups issued a Report on the many shortcomings of the current specific exception for libraries and archives. While the Report expressed a consensus that Section 108 had not kept pace with the changing needs of beneficiary institutions (e.g., it does not deal adequately with needs associated with ‘born-digital’ works), the consensus did not reach many specific recommendations for changing the statute. Parties simply could not agree. The CO suggests that the Google Books litigation was also a factor.

The CO says it will “formulate a discussion document and preliminary recommendations” on the issues raised by the 108 Report. Given the failure of the stakeholders to come to consensus, we should watch closely to see how the CO resolves the tensions surrounding this important issue.

Orphans, still.

Another issue raised by the Google litigation, and by the new lawsuit against HathiTrust and its library partners, is the fate of ‘orphan works.’ ARL has worked with other stakeholders, including the CO, to find an acceptable legislative solution to this issue, but those negotiations left off in Congress at the very limit of what would be feasible for libraries. It is not clear that revisiting this issue in the legislative arena will give libraries a solution that is preferable to the strong fair use arguments already available to support library projects. Indeed, members of the Legal Issues workstream of the Digital Public Library of America reported at last week’s plenary that even its ambitious plans don’t include pushing for legislative change, as it makes more sense to work with what we have than to gamble that Congress will improve things.

The CO has already issued a comprehensive report on this issue, and legislative language already exists, so the CO is wise to refrain from announcing any specific work product on this question. Instead, they will “continue to provide analysis and support to Congress.”

Other issues

  • The CO is in the midst of its triennial DMCA rulemaking, in which it considers classes of works that should be exempt from the digital locks provisions of the Digital Millennium Copyright Act. In the past, these exceptions have focused on uses in academic settings, and ARL will continue to work to support useful rules in this area.

  • The CO will be issuing its report on Pre-1972 Sound Recordings in December. In our comments on this issue, ARL has asked the CO to highlight fair use.

  • The CO mentions that it has weighed in on the issue of “Rogue Websites,” without specifically endorsing the approaches that have been taken by the bills introduced on the issue. There are significant free speech concerns associated with those bills.

  • The CO highlights its work to digitize and make accessible its records of copyright registrations. This is an important corollary to the orphan works and mass digitization problems, as it would make it much easier for libraries to determine whether and when copyright terms might have expired.

It is worth noting that the Authors Guild complaint propagates a common but incorrect assumption that all US works published between 1923 and 1963 are in copyright. Our Copyright Review Management System has reviewed nearly 200,000 of these works, and found more than 50% of them to be in the public domain. The same will be true of many works published outside of the United States. How many among the 7 million volumes that they wish to sequester might also be in fact works that no one, including the plaintiffs, has the right to restrict from the public?
These books would have long since been remaindered and pulped, if libraries like the ones you sued had not graciously given them the precious shelf space to endure through the years past their popularity.
—  Another great point about HathiTrust, from New Jersey librarian Tom Bruno. And it’s not just about “orphans.” Even works whose authors can be found can (and do) go out of print and cease to exist outside the walls of libraries. THE JERSEY EXILE: Say it ain’t so, Superfudge!
Under Michigan’s protocols, unlimited e-book downloads of Mr. Salamanca’s book were scheduled to be made available to an estimated 250,000 students and faculty members on November 8th.
—  A blatant falsehood from the latest Authors Guild press release. The facts about the scope of the orphan works project have been public for weeks. No one gets to download e-books of designated orphans as a result of this project. Why can’t they get their facts straight on this basic issue?