Digitisation conference 2007

Conference 2007: Workshop: Mass digitisation

Mass digitisation: best practice and lessons learnt
David Dawson (moderator)
Senior Policy Adviser (Digital Futures), Museums, Libraries and Archives Council
Stuart Dempster
Director, Strategic Content Alliance, JISC
Joyce Ray
Associate Deputy Director for Library Services, Institute of Museum and Library Services (IMLS), Washington DC
Ricky Erway
Programme Officer, OCLC Programs and Research

RE: comments based on review of publicly available agreements:

Exclusivity – appearance v reality

  • They are non-exclusive deals
  • the private partner bears all the digital costs
  • they are only limited term deals
  • institutions are free to serve the content to users

Deliverables – what do you get back?

  • master images
  • access images
  • metadata describing associated files and their sequence
  • coordinates to map OCR text to images
  • records with links
  • technical metadata
  • records of rejects for later attention

Functionality:

  • access rights: display; index; allow downloading of individual copies; add third party service enhancements; combine in whole or in part; distribute in whole or in part
  • rights related to preservation: influence technical specifications; check for quality control; insist on published, open standards

Disclosure – how much is too much?

  • OK to protect proprietary information
  • Should not restrict community discussions

Financial models:

  • worst case: they digitise and then license it back to us
  • consider making use free and charging for a free use licence
  • if partner repurposes in ways not foreseen, contributor should get a proportion of the revenue
  • ask yourself: how much more would it cost to make it freely available?

Negotiating tips

  • inform your counsel of desired outcome
  • let counsel determine warranties and indemnifications
  • know your bottom line
  • start with best agreement to date
  • negotiate on behalf of the broader community

Result of inaction

  • content accessible via patchwork of environments under dramatically different terms
  • shared interest recognised only after it is too late
  • public domain materials locked up
  • library assets marginalised by commercial entities

Richard Boulderstone: British Library has a project with Microsoft – you make it sound like we shouldn’t do these deals – we spent nine months negotiating with Microsoft – ultimately getting the money and doing the digitisation is a good thing – allowing too many restrictions is a problem and we have to contribute to the effort to make it happen – but the material is there and could always be digitised again – so, yes, we have to negotiate long and hard, but walking away is a big problem as well, as content is virtually inaccessible today, so anything that makes it more accessible is a good thing

RE: if you fight for just one thing, then it’s survivability after the contract – could accept restrictions for the term of the contract as long as the content then becomes completely free – Richard, we should be building on your effort to take it to the next stage

SP: it is about sharing information so we can make informed decisions

JR: are you prohibited from disclosing your agreement?

RB: with Microsoft we own the content and license it to Microsoft

Michael from Oxford: work with Google – agreement isn’t in the public domain, but it was presented to us that we were getting benefits, and as it rolled out, even with their deep pockets, they would not be able to give anyone else the same deal. I think we got a pretty good deal – it’s out there now and being used in interesting ways

Stuart Dempster

When I came into post the programme had about 6 sides of paper as documentation, so the first few months were hairy, but what we’ve learnt has fed into current thinking and planning and manifested as a second programme of work, and we should not underestimate the appetite of users and providers of e-content. We had £3.8m and requested proposals, got 38 bids totalling £34m, and as a result managed to increase the funding to £12.5m and fund 16 projects.

Have tried to cascade the learning down through project partners. Points for consideration include:

  • consensus – is it compatible with your organisation’s mission – do you have support of your management
  • content – what makes the collection unique and valuable, do you have accurate information?
  • copyright – what content do you own and/or require licences for? What licence will you use and why?
  • catalogue – what descriptive, technical and administrative metadata exists and to what standard?
  • capacity – what level of infrastructure does your organisation have to capture, convert and deliver

Lessons learnt from JISC

  • research – there are significant sources of advice and guidance
  • pilot – where possible always test assumptions and methodology
  • procurement – ensure adequate time and resources
  • quality assurance – ensure sufficient time
  • sample material from suppliers and check against agreed standards
  • ensure QA put into workflow from start
  • metadata – resource intensive and should not be underestimated
  • metadata schemas are plentiful – check that fit for purpose
  • user consultation – involvement through advisory groups
  • users are also most effective marketing tool
  • always conduct full user needs analysis and ensure accessibility
  • build enough resilience to meet anticipated demand for content
  • IPR – complex process of clearing rights
  • not all content has equal IPR issues
  • ensure sufficient liability and indemnity protection
  • ensure IPR rights are expressed in metadata
  • consider alternatives should IPR not be achievable
  • project management – do not underestimate the worth of good project management
  • use standard project management techniques
  • keep detailed records
  • marketing and communication – consider trade marking your ‘brand’
  • always have FAQs and exemplars
  • build relationships with external bodies
  • ensure you maintain a web log for traffic analysis
  • consider the use of social networking software

Anne Marie Millner: methodology for assessing success of the projects?
SD: University of Kent monitors things and that’s under review, and we are keen to ensure that the digitisation projects are included in that. There are different methodologies for capturing that info. JISC Services in negotiation at the moment

DD: European Union also looking into this

Richard: How much work has been done to link the content together?

SD: separate strands of work – there’s digital only, licensed content etc – some work underway to pull those things together – putting registries in place, bringing things together

Michael Popham: hard to use users to validate content for selection – example of 19th century novels with the Google project: when we took the books down from the shelves the pages had not been cut – differing views about what to do – librarians keen to leave them as is, Google wanted to digitise – BL may have had well-used copies – but we didn’t have the resources to tackle that. Also, ephemera – as soon as you use that word half the population doesn’t have a clue what you’re talking about. Case for taking the plunge and just going ahead with it – people often don’t know what they want until they’ve seen it

SP: risk missing informal content if we always rely on the same people to find out what should be digitised

Peter Findlay: took approach of marketing to librarians and then to try to get to lecturers and then to students (leaflets, postcards) – different approaches to different audiences – need to have a vision of who you are trying to communicate with – we focused on the academic community so we could deliver content in a structured way to a specific audience – otherwise risk watering down the content

Joyce Ray

We haven’t funded the same kind of mass digitisation projects – mostly funded special collections material but also some collaborative projects (large scale rather than mass). The Colorado digitisation project, first funded in 1999, set up five regional scanning centres and digitised materials brought in by all kinds of institutions; each institution was then expected to preserve the images, but Colorado had a centralised metadata repository. It has expanded to other states and they are now looking at what services they can provide to be self-sustaining.

Discussion

Richard Boulderstone: people are most interested in the last few years’ content – not interested in 19th century books. So worried that the funding for pure digitisation will run out. Latest EU calls say that they will fund anything except digitisation. Dutch and Belgians are doing some stuff but in the UK funding sources are JISC and then maybe the Googles and Microsofts and that’s it.

JR: anything that’s not online will be lost to history and I think that is a public good issue

Michael Popham: question of mass – a million of something might not be big enough for Google but it’s too big for us to tackle

SP: it all relates to the failure of advocacy – there is money there but we’re not coordinated enough as national bodies to get at it

DD: how does Google know who to talk to apart from picking off organisations who are well known – we could put together aggregated projects/collections that they might be interested in
