As phase 2 of the British Library’s newspaper digitisation programme prepares to conclude (1m more pages are due to be added in the coming months), some interesting reports about the digitisation process are becoming available.
The project’s final report looks at issues such as the capture of metadata, the standards used, and the complex workflow developed. It also gives detailed information on which newspapers have been scanned and are being added to the collection.
Readers may also be interested in Simon Tanner’s article in D-Lib Magazine, which goes into greater depth on how to measure the success of OCR technologies, and the methodologies required for such work.
It is also worth comparing the digitisation of the British newspapers with other sites around the world:
USA’s National Digital Newspaper Program
Australian Newspapers Digitisation Program
A fuller list of international digitisation projects for newspapers is available at the International Coalition on Newspapers site.