Changing digital workflows with IIIF and semantic tools (report)

Here is a recording of a recent webinar (Zoom meeting) about the report and our next steps.

Pamphlets have long been a popular way of getting people involved in campaigns. We have been working with teams at the London School of Economics (LSE) and at UCL to digitise some pamphlets from their collections as part of a project called Social Movements of the 20th Century (SM20C). The pamphlets were published between the First and Second World Wars and originate from women’s organisations working for a more equal and just society.

The problem is that the digitisation of so-called ‘special collections’ is often hampered by a lack of usable metadata (e.g. title, author, date of publication) or even of any detailed catalogue records from which metadata might be extracted. As a result, material may have to go through a laborious cataloguing process before digitisation can begin.

The LSE collection is well catalogued and is drawn from the Women’s Library, an important national resource. The selected material includes pamphlets from the National Union of Societies for Equal Citizenship, the Women’s International League for Peace and Freedom, the London and National Society for Women’s Service, the National Council of Women, Women’s Co-operative Guild, the Association of Women Teachers, the National Federation of Women’s Institutes, the Association of Moral and Social Hygiene, the British Federation of University Women, the Federation of Women Civil Servants, Open Door International, the Society for the Ministry of Women and the International Council of Women.

The UCL material is all drawn from the archive of the National Union of Women Teachers (NUWT), which has been partially catalogued. The archive relates to the journal of the NUWT, which UCL has already digitised and made available online. The selected pamphlets reside in archive boxes together with related correspondence and background literature. The team at UCL has therefore had to extract the pamphlets from the boxes in readiness for digitisation, which means they will need to describe each pamphlet individually.

The two sets of material overlap and have plenty of potential connections. The question we have been grappling with is how to make these materials discoverable on the web. One strategy will be to put them in multiple places. Another might be to enrich the metadata descriptions associated with the digital publications.

We asked Digirati to explore the workflow economies that could be made by using images enabled for the International Image Interoperability Framework (IIIF). One thing we worked out early on was that we wanted to turn the ‘catalogue first’ paradigm on its head by exploring the possibility of taking a ‘digitise first’ approach.

The resulting report suggests that we should make the 75,000 or so page images IIIF compliant. We could then make some interventions using IIIF-based semantic tools in order to facilitate the sorting and enrichment of the images (or the publications). This report explains how Digirati worked with the two teams to explore their current digitisation workflow. It goes on to suggest interventions which might help to optimise such a workflow in the future, as part of a ‘digitise first’ paradigm.
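To give a rough idea of what ‘IIIF compliant’ means in practice: once a page image sits behind an IIIF image server, any tool can request the whole image, a thumbnail or the image's technical description using a predictable URL pattern defined by the IIIF Image API. The short Python sketch below builds such URLs. The server address and the page identifier are made up for illustration and do not describe the project's actual setup.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier must be URL-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


def iiif_info_url(base_url, identifier):
    """Build the URL of the image's info.json (its technical description)."""
    return f"{base_url.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and page identifier, for illustration only.
BASE = "https://example.org/iiif"
PAGE = "sm20c/pamphlet-001/page-0001"

full_page = iiif_image_url(BASE, PAGE)               # whole page, largest size
thumbnail = iiif_image_url(BASE, PAGE, size="200,")  # scaled to 200 pixels wide
info = iiif_info_url(BASE, PAGE)

print(full_page)
print(thumbnail)
print(info)
```

Because every page image answers to the same pattern, sorting and enrichment tools can work with the images directly, before any catalogue record exists.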

As we needed to get on with digitisation, we decided not to test out these recommendations on the current set of material. Instead, we are going to explore the idea of digitising a sample of other completely uncatalogued, but related, content from each institution to test out the assertions made in the report.

If you would like more information about the project please get in touch with peter.findlay@jisc.ac.uk
