Earlier this year, 20 million pages of the U.S. Federal Court’s PACER database were downloaded, audited for privacy violations, and submitted as evidence to the Judicial Conference, the policy-making body of the courts. That incident led to a Senate investigation, a clean-up by 30 district courts, and a new PACER requirement that each lawyer confirm at every login that they understand their privacy requirements. (Scribd, PDF) When public data is locked up behind a cash register, nobody has an incentive to fix privacy problems. Only when the public got access to the data did privacy problems begin to be fixed. When public data becomes public, we also start to see real innovation.
A great example is today’s release by Princeton’s Center for Information Technology Policy of RECAP, a Firefox plugin. RECAP is a public domain proxy that allows professional PACER users–lawyers, journalists, and law students–to save money on access charges and at the same time create a public domain archive. RECAP lets lawyers do well by doing good.
Here’s how it works. The 20 million pages harvested earlier this year have been unfolded into the Internet Archive by the Princeton team in a format that includes extras like metadata and SHA1 hashes. When you use the RECAP plugin to access uscourts.gov, if somebody has already grabbed the doc you want, you get it for free. If not, you pay $0.08/page, but the doc gets recycled so the next user gets it free.
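To make that flow concrete, here is a minimal sketch of the check-the-archive-first logic in Python. It is not RECAP's actual code. The `SharedArchive` class, the `fetch_document` function, and the document ID format are all hypothetical names invented for illustration, and the archive is just an in-memory dictionary standing in for the Internet Archive. The only details taken from the description above are the $0.08/page fee and the use of SHA1 hashes.

```python
import hashlib

# Per-page fee from the post, kept in cents to avoid float rounding.
PRICE_PER_PAGE_CENTS = 8


class SharedArchive:
    """Hypothetical in-memory stand-in for the public archive."""

    def __init__(self):
        self._docs = {}  # doc_id -> (content bytes, SHA1 hex digest)

    def get(self, doc_id):
        """Return (content, sha1_hex) if the doc is archived, else None."""
        return self._docs.get(doc_id)

    def put(self, doc_id, content):
        """Store a doc alongside its SHA1 hash and return the hash."""
        digest = hashlib.sha1(content).hexdigest()
        self._docs[doc_id] = (content, digest)
        return digest


def fetch_document(doc_id, pages, archive, download_from_pacer):
    """Return (content, cost_in_cents) for one document.

    download_from_pacer is a caller-supplied function (an assumption of
    this sketch) that performs the paid download and returns bytes.
    """
    cached = archive.get(doc_id)
    if cached is not None:
        content, digest = cached
        # Serve the free copy only if it still matches its recorded hash.
        if hashlib.sha1(content).hexdigest() == digest:
            return content, 0

    # Not archived yet: pay once, then recycle the doc for the next user.
    content = download_from_pacer(doc_id)
    archive.put(doc_id, content)
    return content, pages * PRICE_PER_PAGE_CENTS


if __name__ == "__main__":
    archive = SharedArchive()

    def fake_pacer(doc_id):
        # Placeholder for the real paid download.
        return b"example docket text"

    for attempt in (1, 2):
        _, cost = fetch_document("example-case/doc-1", 10, archive, fake_pacer)
        print(f"attempt {attempt}: {cost} cents")
```

In this sketch the first user of a ten-page document pays and every later user gets the archived copy for free, which is the whole economic point of the plugin.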