Many of you may already know Distributed Proofreaders (DP), which I just came across on the O'Reilly Radar blog. DP is a network of volunteers who perform the proofreading equivalent of the double-keying method:
During proofreading, volunteers are presented with a scanned page image and the corresponding OCR text on a single web page. This allows the text to be easily compared to the image, proofread, and sent back to the site. A second volunteer is then presented with the first volunteer's work and the same page image, verifies and corrects the work as necessary, and submits it back to the site. The book then similarly progresses through two formatting rounds using the same web interface.
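To make the workflow in that description concrete, here is a rough sketch of how a single page might move through the four rounds. This is not DP's actual code: the `Page` class, the round names, and the rule that every pass comes from a different volunteer are my own assumptions for illustration.

```python
from dataclasses import dataclass, field

# Round names are illustrative; the quoted description only says there are
# two proofreading rounds followed by two formatting rounds.
ROUNDS = ["proof-1", "proof-2", "format-1", "format-2"]


@dataclass
class Page:
    image: str   # path or URL of the scanned page image
    text: str    # starts out as the raw OCR text
    completed: list = field(default_factory=list)  # (round, volunteer) pairs

    @property
    def next_round(self):
        """Name of the round this page is waiting for, or None if done."""
        done = len(self.completed)
        return ROUNDS[done] if done < len(ROUNDS) else None

    def submit(self, volunteer, revised_text):
        """Record one volunteer's pass and advance the page to the next round."""
        if self.next_round is None:
            raise ValueError("page has already been through every round")
        if any(v == volunteer for _, v in self.completed):
            # Assumption: each pass should come from a different person.
            raise ValueError(f"{volunteer} has already worked on this page")
        self.completed.append((self.next_round, volunteer))
        self.text = revised_text


# Hypothetical page and volunteers, just to show the flow.
page = Page(image="page_017.png", text="Tbe quick hrown fox")
page.submit("alice", "The quick hrown fox")   # first proofreading pass
page.submit("bob", "The quick brown fox")     # second volunteer verifies and corrects
print(page.next_round)  # format-1
```

The point of the sketch is that the bookkeeping is small: a page only needs its image, its current text, and a record of who has handled it so far.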
Titles proofread by DP are submitted to Project Gutenberg, but would this idea work for smaller digitization projects? How would a library go about getting users to proofread dirty OCR?