This post forms part of our signposting series for collections as data, AI and computational approaches to Arts and Humanities for libraries and archives.
Nice clear AI information
We were pleased to see that the Russell Group of universities in the UK has recently issued a clear policy statement of principles to guide its members in the appropriate use of generative AI. It states, “Staff should be equipped to support students to use generative AI tools effectively and appropriately in their learning experience.”
Where does the data come from?
The Russell Group principle covering plagiarism is interesting in that it suggests plagiarism is a significant risk associated with generative AI because, with current systems, one cannot tell what sources the information came from. The big search providers are trying to develop traceable systems, but at the moment the origin of information is often opaque, to say the least. This is a tricky area for libraries and archives, as they are fully committed to upholding copyright. Upholding it requires traceability: where did the text, photo or 3D drawing come from? This is probably one reason why there is so much hesitancy about AI in libraries and archives.
It is notable that the UK government is currently reviewing its stance on copyright in relation to digital technologies. If libraries want to help address this issue, it would be good to release some of their own digitised collections (or even catalogue data about them) with accompanying datasheets or Datacards which provide extensive dataset descriptions.
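As a rough illustration of what such a description might look like in machine-readable form, here is a minimal Python sketch. The section names follow the headings of the Datasheets for Datasets template; the collection title, the answers and the helper functions `build_datasheet` and `missing_sections` are hypothetical, invented for this example, and are not part of any existing tool.

```python
import json

# Section headings follow the "Datasheets for Datasets" template.
# Everything else in this sketch is a hypothetical placeholder.
DATASHEET_SECTIONS = [
    "motivation",
    "composition",
    "collection_process",
    "preprocessing",
    "uses",
    "distribution",
    "maintenance",
]

PLACEHOLDER = "NOT YET DOCUMENTED"


def build_datasheet(title, answers):
    """Return a datasheet dict, marking unanswered sections explicitly."""
    sheet = {"title": title}
    for section in DATASHEET_SECTIONS:
        sheet[section] = answers.get(section, PLACEHOLDER)
    return sheet


def missing_sections(sheet):
    """List the sections that still carry the placeholder text."""
    return [s for s in DATASHEET_SECTIONS if sheet[s] == PLACEHOLDER]


# Hypothetical collection and answers, for illustration only.
example = build_datasheet(
    "Example digitised newspaper collection (hypothetical)",
    {
        "motivation": "Released to support computational research.",
        "composition": "Page images with OCR text and catalogue records.",
        "distribution": "Download from the institution's repository.",
    },
)

print(json.dumps(example, indent=2))
print("Still to document:", missing_sections(example))
```

Recording the gaps explicitly, rather than leaving them out, is a deliberate choice here: it tells a reuser which questions about provenance and rights have not yet been answered.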
Getting your heritage collections out there
In the collections space, I have already written in this previous blog post about the Collections as Data initiative as a way for libraries to release well-formed data collections. We eagerly await an updated Santa Barbara Statement on Collections as Data (recently reviewed), and we then look forward to running three webinars, starting in November, to explore both the theory of how to provide effective access to heritage collections in the form of data and, in particular, the realities of implementation.
Case studies request
As part of this work, we would like to develop some Jisc member stories to explore the opportunities and barriers to making collections available as data, and we will also seek to cover Datasheets for Datasets and Datacards (also described in that earlier post linked above). If your library or archive is interested in taking part in a case study, or even just discussing it, please do get in touch.
In the meantime, you can listen to part three of our Jisc podcast miniseries, exploring artificial intelligence in the context of the humanities. My colleague Paola Marchionni is joined by Professor Jane Winters to discuss the often complex and messy data that historians increasingly deal with when working with digital collections. We’ll take a series break in the summer, but look out for the next episode in September, when I will be talking with Professor Leif Isaksen about research practices.