Some members of the HE content team have been discussing issues about the collaborative digitisation of collections. We know, from conversations with our Digital Archival Collections advisory group, that there is a thirst for institutions to work together to make their collections available. We were wondering about the barriers to making this happen.
Institutions are obviously in competition for students and funding. At the same time, we see rival institutions and academic groups working together on a variety of large and small projects where there is a common interest. Even so, there have been few collaborative digitisation projects. Some were previously funded by Jisc, and others have built a long-term development cycle based on a strong originating project.
A prime example of the latter is the Proceedings of the Old Bailey (Old Bailey Online). This resource started as a high-quality digitisation project led by academics from various institutions. It produced excellent metadata, and over time other projects have gained funding to enhance the content, making it eminently searchable and discoverable. Enabling new projects that build on the original material was perhaps not the original strategy, but it has happened because a community developed around the content. High-quality outputs were the starting point. Because there was a good original plan in place to produce these outputs, subsequent builders were able to add to the original data.
Commercial publishers are unlikely to have created such a resource, as so often the restrictive delivery of the collection does not allow for open sharing. Publishers can also skew the collection focus towards identifiable markets. They focus naturally where they can most likely maximise their revenues. There is nothing wrong with making a product to sell but, if the outputs are focused to appeal to a market, selection decisions are then made based on that market’s drivers. These might not be the ones that engender academic use of the content in the UK.
Old Bailey Online exposes all its data to the web: there is no hidden metadata, and no paywall once you discover what you would like to use. It’s an open collection, much like the freely available journals which are being produced by the Open Access movement. Why is it then that UK institutions so often seek to monetise their collections by working with a commercial publisher? The main driver is obviously a lack of investment from institutional coffers, so getting a commercial contract in place to digitise material makes a lot of sense. The institution then benefits from some royalties, gains kudos from involvement with the publisher, and hopefully receives a perpetual copy of the digitised content. The aim is to get the collection digitised, even if the resulting material is locked behind a password. It all makes economic sense for the institution.
We have recently developed a new model for institutions to work with global publisher Wiley to make the British Association for the Advancement of Science (BAAS) collection available. The model we have built ultimately guarantees the collection will be openly available on the web; something that has often not happened with material that has been digitised but is in fact in the public domain. Participating institutions also have a say in what gets digitised and UK institutions have perpetual free access. This benefit is crucial if we are to skew the selection process towards participating universities.
With some notable exceptions, universities are less willing to invest institutional spend in digitisation of their collections. Academics often want their institution’s collections to be available and are willing to accept the password restrictions applied by publishers to gain access. The fact that a commercial publisher is providing it also gives the collection some prestige. Publishers, after all, do invest a lot to make the collection digital and often provide added-value features on their platforms.
We have recently been working with UCL and the LSE to make material available as part of digitising Social Movements of the 20th Century (SM20C). This is a crowdfunded model, and control of what is digitised rests entirely with the institutions involved. During this small-scale project, we have tried to find ways of bringing the collections held by each institution together as a whole. That has proven quite challenging because of differences in how the collections are catalogued, so we have been looking at technologies such as IIIF to help us make metadata across the collections more homogeneous. Again, such experimentation would probably not be conducted in a more commercial setting, but that is what we are about: finding innovative ways of working with collections, and leading publishers to be more responsive to the needs of those who need content for teaching, learning and research.
We will continue to explore the issues of collaborative digitisation on this blog and may also run some online events soon.
In the meantime, you can also look at this THE article (subscription access) about a variety of our collaborative content initiatives.