Archive for Searching

The Long Tail of Usability

The Stormont Papers resource makes available the debates of the Parliament of Northern Ireland (Stormont) from its creation in 1921 until the end of Home Rule in 1972.

It’s been online since 2006, and some statistics from the website are available. Of most interest is a graph showing the spread of search terms entered by users.

[Graph from the Stormont Papers website]

Two interesting points emerge from this:

1) The bulk of your users may not be looking for the things you expect them to be looking for.

2) Pre-arranged hyperlinks on your home page can provide a user-friendly way of letting users get to know a resource’s contents.
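To make the first point concrete, here is a minimal sketch of how you might measure the "long tail" in a search log. It assumes you have the raw search terms as a list of strings; the function name `tail_share` and the sample log are made up for illustration and are not taken from the Stormont Papers statistics.

```python
from collections import Counter


def tail_share(queries, head_size):
    """Fraction of all searches that fall outside the `head_size` most popular terms."""
    # Normalise lightly so "Housing" and "housing " count as the same term.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    head = sum(n for _, n in counts.most_common(head_size))
    return (total - head) / total


# Hypothetical search log, invented for this example.
sample_log = [
    "unemployment", "Unemployment",
    "housing", "housing", "housing",
    "shipyard", "linen", "tram fares", "drainage", "herring",
]

print(tail_share(sample_log, head_size=2))
```

In this invented log the two most popular terms account for only half of the searches; the rest are one-off queries nobody would have thought to put on a home page.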


Creating Keywords Automatically

There are an awful lot of interesting ideas to unpack in the Nineteenth-Century Serials Edition (ncse) resource mentioned in a previous posting.

For a start, there is a novel addition to the way results are shown: each search result displays the image reproduction of the page as well as the OCR’d transcription.

ncse1.jpg

There’s the whole range of partners involved in such a website, indicative of who needs to be involved to run an ambitious digitisation project.

ncse2.jpg

And the related conference brought up a whole host of intellectual questions related to integrating the work into scholarly research.

But what is most interesting is the project’s attempt to automatically assign subject keywords to articles within the resource by using natural language processing.

ncse3.jpg

Each article on the website has been processed in two ways: firstly, to extract persons, places and institutions from the complete data; and secondly, to create subject terms (e.g. Arts & Crafts or Emotional Actions, States & Processes) which relate to each of the digitised articles in the collection.
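As a rough illustration of what those two passes involve, here is a toy sketch. It is not the ncse project’s method, which the post does not describe in detail; it swaps in the simplest possible technique, a lookup against hand-made word lists. The gazetteer, the vocabulary attached to each subject term and the sample article are all hypothetical.

```python
import re

# Hypothetical gazetteer: known names mapped to the kind of entity they are.
GAZETTEER = {
    "London": "place",
    "Manchester": "place",
    "Charles Dickens": "person",
    "House of Commons": "institution",
}

# Hypothetical vocabulary for two of the subject terms quoted in the post.
SUBJECT_LEXICON = {
    "Arts & Crafts": {"painting", "engraving", "pottery"},
    "Emotional Actions, States & Processes": {"grief", "joy", "anger"},
}


def extract_entities(text):
    """Pass 1: return (name, kind) for every gazetteer entry found in the text."""
    found = []
    for name, kind in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, kind))
    return found


def subject_terms(text, min_hits=1):
    """Pass 2: return subject terms whose vocabulary overlaps the article's words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [
        subject
        for subject, vocab in SUBJECT_LEXICON.items()
        if len(words & vocab) >= min_hits
    ]


# Invented sample text standing in for an OCR'd article.
article = (
    "Charles Dickens spoke in London of the grief that followed; "
    "an engraving of the scene was sold."
)

print(extract_entities(article))
print(subject_terms(article))
```

A real system would need far more than word lists (OCR errors alone would defeat exact matching), but the shape is the same: one pass pulls out named things, the other assigns each article to broader subjects.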

This is handy for users because it bypasses the tyranny of having to use precise search terms to discover particular articles; and it’s useful for the digitiser because they do not have to go through each article individually and make manual decisions about its subject.

I’m not sure it completely works as yet (there are some faults in the results and the interface is not intuitive), but this is a brave and valuable step in trying to really exploit the richness of digitised resources, a richness we have not really tapped into yet.
