Introduction to Text Mining on Digital Content

The JISC digitisation team is currently planning an international grant competition to look at exploiting text mining methodologies for digitised content.

Provisionally called the Million Books Challenge, the competition will see how analysis of large corpora of texts, images or other digital material can open up new avenues for research.

Alastair Dunning gave an introduction to text mining and outlined plans for the competition at the JISC Collections AGM (20th November 2008).


Thanks to Ian Gregory and Andrew Hardie (University of Lancaster) for providing the case study.

Learning on Screen 2009 - Call for Papers

The Learning on Screen Conference 2009 will be held at The Wellcome Collection (183 Euston Road, London NW1 2BE) on 7th and 8th April 2009.

Learning on Screen

This annual conference was established by the Society for Screen-Based Learning and focuses on the delivery of learning and research with moving image and sound - be it broadcasting, web delivery or cinema.

Two key themes of the conference will be:
Disability and Access to Moving Image and Sound
Online Moving Image and Sound Services for Learning

The programme will be arranged in 30-minute sessions. The organisers are therefore seeking proposals with a speaker presentation time of 20 minutes each.

Proposals should be submitted (with a title and 200-word summary, along with your name, current position and short biography) to pa@bufvc.ac.uk on or before 15th January 2009.

For more information about the conference please visit the BUFVC web site.

Portal for European heritage available online (sort of)

The European Union’s Europeana portal project was launched yesterday, offering user access to a wealth of cultural heritage content, harvested from the continent’s museums, archives and libraries.


There has been some scepticism about the long-term success of the project, especially with regard to its sustainability model and its ability to draw users away from Google - a problem for any budding portal.

However, two recent events might change the sceptics’ views.

Firstly, there is the overwhelming popularity of the site on its first day - 10m hits, according to the website, with the unpleasant side effect that the site will be down until mid-December.

Secondly, recent comments from Google suggest they might be interested in working with Europeana, a partnership that would definitely add to the portal’s impact. It will be interesting to see how this develops.

Of war, cartoons, eggs, and more…

Cartoons are a very effective medium, not only for commenting on the social, political and historical events of our times but also for recalling a particular event, thanks to their power to stay in people’s hearts forever.

One of the many contributors to the Great War Archive, part of the First World War Poetry Digital Archive, which launches today, submitted a Punch cartoon to the site with the comment:

My interest in the First World War originated from chatting to my two grandfathers […] However, my early reading about the war centred around old copies of Punch magazine, especially the cartoons. This is my favourite.

Cartoon from Punch “How to order eggs in France”

The same anxiety about eggs, diet and culinary habits during the First World War is also immortalised by Carl Giles in a cartoon published in the Sunday Express on Armistice Day, 11 November 1984, and now part of the recently launched British Cartoon Archive.

Cartoon by Carl Giles

The caption reads: “We didn’t have all this Cordon Bleu when I was your batman in the last lot - I used to boil your eggs in our ‘ot tea.”

Both the First World War Poetry Digital Archive and the British Cartoon Archive are projects funded under the JISC Digitisation programme. They make thousands of digital images, texts, and audio-visual materials freely accessible for use in teaching, learning, and research.

The First World War Poetry Digital Archive focuses on resources on the major poets of the period and also includes the Great War Archive, a collection of digital objects relating to the First World War submitted by the general public.

The British Cartoon Archive web site is dedicated to the history of British cartooning over the last two hundred years. It holds more than 130,000 original editorial, socio-political, and pocket cartoons, supported by large collections of comic strips, newspaper cuttings, books and magazines.

European Conference on OCR and Mass Digitisation

From the IMPACT project, a European Union project that aims to create a centre of excellence for the digitisation of textual cultural heritage.

Introduction

On 6 and 7 April 2009 the IMPACT project will organise a conference on OCR in mass digitisation projects. This conference will focus on exchanging views with other researchers and suppliers in the OCR field, as well as presenting some preliminary results from the first year of the IMPACT project.

Tentative programme

Monday 6 April 2009: New advances in OCR technology, such as collaborative correction and adaptive OCR techniques, a possible way forward for future large-scale digitisation programmes.

Tuesday 7 April 2009: Current and future challenges facing OCR technology, such as image enhancement and linguistic issues that come up when digitising historical text material.

Both days will feature key speakers from outside of the project, in addition to experts from the IMPACT consortium (to be announced in the near future).

Each day’s programme will run from 10.00 to 18.00, with a conference dinner on the first day.

Practical information

The venue will be the Koninklijke Bibliotheek (KB – National Library of the Netherlands) in The Hague. There is a maximum of 150 participants. Registration is now possible at an early bird fee of € 95. After 1 January 2009, the regular fee will be € 110. This fee includes coffee breaks, lunches and a conference dinner on Monday 6 April.