What do you do if you're the National Library of Finland and have millions of pages of newspapers, magazines and books that you want to digitise, but that are set in a difficult typeface that OCR software struggles with?
Do you a) employ someone for the next few thousand years to complete the job, or b) come up with a novel and fun solution that involves thousands of citizens and schoolchildren taking bite-size pieces out of the problem?
Yes, you've guessed it - it's b.
The project, Digitalkoot (Digital Volunteers), blends microtasks, crowdsourcing, and video games to break up and distribute some of the dull, repetitive work of verifying digitised records.
"We have millions and millions of pages of historically and culturally valuable magazines, newspapers and journals online. The challenge is that the optical character recognition often contains errors and omissions, which hamper for example searches," says Kai Ekholm, Director of the National Library of Finland. "Manual correction is needed to weed out these mistakes so that the texts become machine readable, enabling scholars and archivists to search the material for the information they need."
Microtask has designed two games that will make this work entertaining.
In 'Mole Hunt', the player is shown two words and must decide as quickly as possible whether they are the same. This uncovers erroneous words in archived material. In 'Mole Bridge', players have to correctly type the words appearing on the screen. Correct answers help moles build a bridge across a river. Again, the game helps verify the OCR and make sure that digitised materials are accurate and searchable.
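To make the idea concrete, here is a minimal Python sketch of how answers from games like these could be combined into OCR corrections. It is an illustration only: the source does not describe how Digitalkoot actually aggregates answers, so the majority-vote approach, the function names `flag_ocr_errors` and `consensus_spelling`, and the thresholds `min_votes` and `min_agreement` are all assumptions.

```python
from collections import Counter


def flag_ocr_errors(votes, min_votes=3):
    """Return the ids of words that most players judged to be mismatched.

    `votes` is an iterable of (word_id, is_same) pairs, one per answer in a
    Mole Hunt style "same or different?" task. Hypothetical rule: a word is
    flagged once it has at least `min_votes` answers and more "different"
    answers than "same" answers.
    """
    tallies = {}
    for word_id, is_same in votes:
        tallies.setdefault(word_id, Counter())[is_same] += 1

    flagged = set()
    for word_id, counts in tallies.items():
        total = counts[True] + counts[False]
        if total >= min_votes and counts[False] > counts[True]:
            flagged.add(word_id)
    return flagged


def consensus_spelling(answers, min_agreement=2):
    """Return the most common typed answer, or None without enough agreement.

    `answers` is a list of strings typed by different players for the same
    word image in a Mole Bridge style task. `min_agreement` is an assumed
    threshold, not a documented one.
    """
    if not answers:
        return None
    word, count = Counter(a.strip() for a in answers).most_common(1)[0]
    return word if count >= min_agreement else None


votes = [
    ("w1", True), ("w1", True), ("w1", False),   # mostly "same": keep OCR text
    ("w2", False), ("w2", False), ("w2", True),  # mostly "different": flag it
    ("w3", False),                               # too few answers to decide
]
flagged = flag_ocr_errors(votes)
corrected = consensus_spelling(["sanomalehti", "sanomalehti", "sanomalehtl"])
```

Requiring several independent answers per word means a single careless or mischievous player cannot change the archive alone, which is one plausible reason to spread each word across many players.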
It's a brilliant and innovative idea that spreads the work and gives the public a sense of ownership of the project and, by extension, the original material.