Competition 2013 - Expression of Interest

Idea 1: Tangut Manuscript Character Recognition | Idea 2: British National Bibliography for AuthorClaim | Idea 3: Circaa | Idea 4: Entry from Weiwei Liang - Title TBC | Idea 5: Mixing the Library. Information Interaction and the DJ | Idea 6: SketchBookBritannia: "Favoriting" preferences in Grand Tour Era Topographics @ BL | Idea 7: Mapping the 19th century BL digital collections | Idea 8: From digital music collections to digital music research | Idea 9: Advanced Sampling Techniques and the Digital | Idea 10: Exploring metadata | Idea 11: The influence which English music publishers had on the consumption of sheet music in the latter half of the nineteenth century | Idea 12: Music De-coder | Idea 13: Infant Mortality -- how the past can inform the present and shape the future | Idea 14: Visualising the Spoken Heritage of Britain | Idea 15: Provide a full-text search research support tool | Idea 16: Visualisation of materials available | Idea 17: Make copyright-free images available as stock images on popular art community gathering sites | Idea 18: Generate audio files from sheet music | Idea 19: Connect metadata (or full text data) to Europeana objects | Idea 20: Visualising Sound Collections | Idea 21: Generating stories from archives using agent-based simulation | Idea 22: Creation of an API to retrieve digitised manuscript information for integration into 3rd party systems | Idea 23: (Document Interlinking) | Idea 0: (working title)
To express your interest, please add your details to this page (you will need to create an account for this wiki).
Please feel free to create a separate page for your idea on this wiki to encourage discussion and collaboration. You may also want to ask a question on the Labs mailing list, or you may prefer to start a discussion using other technology such as Twitter, a blog or a website (please use the #BL_Labs hashtag so we can follow the debate).

Remember to:
  1. Read the competition details, terms and conditions, FAQs, judging / assessment criteria, example ideas and resources pages to help you formulate an idea before you submit for the competition.
  2. Start working on a draft submission using a text version of the final submission form (it will be easier to submit the final version by copying text from this version on to the live form).
  3. Contact us to discuss your idea before you submit!
  4. You can choose to discuss tools, datasets and digital resources in an open way using this wiki (get an account), our mailing list, or other tools such as Twitter, blogs or websites (remember to use the #BL_Labs hashtag) before you finally submit your entry for the competition.
  5. Attend our virtual event on Friday 17 May 2013, our hack event in London on Tuesday 28 and Wednesday 29 May 2013, or a roadshow event in the UK.
  6. Remember you will need to submit your final entry using the entry form by 26 June 2013 (midnight).

Idea 1: Tangut Manuscript Character Recognition

Name of Person(s): Andrew West
Date: 4 May 2013
Affiliation: None
Description (optional): Develop an open source software tool to assist in the identification and processing of Tangut characters in Tangut manuscripts or printed texts from Kharakhoto held by the British Library and other institutions that have been digitised as part of the International Dunhuang Project. This tool would be similar in appearance to the Chopper application, allowing the user to select individual Tangut characters in a manuscript image and apply annotations to the selected characters. However, whereas Chopper only allows users to manually chop and annotate characters, the proposed tool would feature a custom OCR algorithm that attempts to identify selected Tangut characters. The tool would present the user with a list of candidate characters, and once the user confirms the correct candidate character it would automatically fill in details of the selected character (e.g. Li Fanwen dictionary index, reconstructed reading, meanings, etc.). If a selected Tangut character cannot be identified by the software (due to manuscript condition or cursive calligraphy) the tool would still offer functions to assist the user to manually identify the character (e.g. by component selection).
Research method(s): Write application in C++; develop a Tangut OCR algorithm.
Tools to be used :
British Library digital collection(s) being used: International Dunhuang Project
Other data to be used: Tangut character data extracted from modern Tangut dictionaries (copyright permitting)
Other notes / help needed: Need to select appropriate manuscripts or printed texts with clear and neatly written Tangut characters for testing the tool.
More information (optional) [link to separate page on wiki]
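
The candidate-list step described in Idea 1 can be illustrated with a minimal sketch. This is not the proposed OCR algorithm (the entry does not specify one, and names C++ as the implementation language); it only shows the general shape of ranking candidates for a selected character by similarity to reference templates. The tiny bitmaps, the labels `glyph_A` to `glyph_C`, and the functions `pixel_distance` and `rank_candidates` are all hypothetical placeholders.

```python
from typing import Dict, List, Tuple

# A bitmap is a tuple of rows; '#' marks ink and '.' marks blank paper.
Bitmap = Tuple[str, ...]

# Hypothetical reference templates. A real tool would load templates derived
# from Tangut dictionary data; these 3x3 shapes are placeholders only.
TEMPLATES: Dict[str, Bitmap] = {
    "glyph_A": ("###", "#.#", "###"),
    "glyph_B": ("#.#", ".#.", "#.#"),
    "glyph_C": ("###", ".#.", ".#."),
}


def pixel_distance(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def rank_candidates(selection: Bitmap, top_n: int = 3) -> List[Tuple[str, int]]:
    """Return up to top_n (label, distance) pairs, closest template first."""
    scored = [
        (label, pixel_distance(selection, template))
        for label, template in TEMPLATES.items()
    ]
    return sorted(scored, key=lambda item: item[1])[:top_n]


# The user has selected a slightly damaged character in the manuscript image.
selected: Bitmap = ("###", "#.#", "##.")
candidates = rank_candidates(selected)
print(candidates)
```

The user would then confirm one entry from `candidates`, at which point the tool could fill in the dictionary details for that character.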

Idea 2: British National Bibliography for AuthorClaim

Name of Person(s): Thomas Krichel
Date: 2013-05-03
Affiliation: Open Library Society
Description (optional): make British National Bibliography available in AuthorClaim.
Research method(s):
Tools to be used : AuthorClaim, XML processing
British Library digital collection(s) being used: British National Bibliography
Other data to be used: 3lib data
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 3: Circaa

Name of Person(s): Sylvan Golden
Date:13 May 2005
Affiliation:
Description (optional): A social project to develop narratives from metadata
Research method(s):
Tools to be used :Circaa Platform
British Library digital collection(s) being used:Metadata
Other data to be used:
Other notes / help needed:Assistance needed to guide our development to be compliant with BL standards
More information (optional) [link to separate page on wiki]

Idea 4: Entry from Weiwei Liang - Title TBC

Name of Person(s): Weiwei Liang
Date: 2013 May 14
Affiliation: Royal College of Art
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 5: Mixing the Library. Information Interaction and the DJ

Name of Person(s): Dan Norton
Date: 15 May
Affiliation: University of Dundee
Description (optional): To explore implementation of a Disc Jockey's model of information interaction as a system for working in large digital collections of all kinds: Using DJing as a read/write platform for accessing and developing data to facilitate learning, scholarship, and presentation/publication. Develop a template for an interactive platform for stimulating creative use of large collections.
Research method(s): practice-led/interface development
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed: Understanding formats and systems of access to collections. Discussion regarding platforms for annotating resources.
More information (optional) [link to separate page on wiki]

Idea 6: SketchBookBritannia: "Favoriting" preferences in Grand Tour Era Topographics @ BL

Name of Person(s): Bettina Cousineau
Date: 15 May 2013
Affiliation:
Description (optional): determine and map period "favoriting" preferences across topographic collections
Research method(s): geotagging, data visualization
Tools to be used : MetaModel, Scribe and Timeline JS, others TBD
British Library digital collection(s) being used: topographic (Image) collections
Other data to be used:
Other notes / help needed: BL web team to create micro site to publish results (during residency period only)
More information (optional) [link to separate page on wiki]

Idea 7: Mapping the 19th century BL digital collections

Name of Person(s): Patricia Murrieta-Flores
Date:24/May/2013
Affiliation:Lancaster University
Description (optional): Integration of Corpus Linguistics techniques and Geographic Information Systems methods to analyse texts with a geographical nature.
Research method(s):
Tools to be used : GIS, CQPWeb, Edinburgh geoparser
British Library digital collection(s) being used: 19th century UK Periodicals: Series 2, 19th century BL Newspapers and 19th century books.
Other data to be used:
Other notes / help needed:Discussion on the formats available and access to the datasets.
More information (optional) [link to separate page on wiki]

Idea 8: From digital music collections to digital music research

Name of Person(s): Polina Proutskova
Date: 10 June 2013
Affiliation: Goldsmiths, University of London

Description (optional):
Assessing potential mutual benefits for British Library and Music Information Research (MIR)

The British Library possesses a unique collection of music recordings from around the world, one of the world's largest. While a huge effort goes into digitizing, annotating and maintaining recordings for future generations, these holdings remain underused and under-explored. Meanwhile, the booming field of Music Information Research (MIR) has recently shown a particular interest in non-Western data, both to test existing approaches and to develop novel ones for extracting information from large music corpora and automating aspects of their analysis. Researchers are desperately looking for annotated datasets to experiment with.

My project aims at bridging the gap between the librarians' and the researchers' worlds to facilitate MIR tools development specifically targeted to the needs of British Library users and archivists. Digital music collections with a potential for computational processing will be identified and practical modalities of content and metadata re-use will be investigated. On the one hand, user needs will be assessed for chosen collections based on their current and potential use; on the other hand, the collections' content and annotations will be explored in terms of their suitability for MIR applications.

Research method(s):
-Qualitative approach for user needs assessment: interviews with archivists, data engineers, the digital team and those who know the users' needs at the library
-Content and metadata assessment: manual processing to assess ground truth preparation for chosen collections, to enable statistical classification and analysis, machine learning.

Tools to be used :
British Library digital collection(s) being used: former National Sound Archive, World and Traditional Music section

Other data to be used:

Other notes / help needed:
-Legal guidance on which collections and which annotations can be used for which purposes.
-Some assistance by archivists required
-Interviews with members of various teams to assess user needs

More information (optional) [link to separate page on wiki]

Idea 9: Advanced Sampling Techniques and the Digital

Name of Person(s): Pieter Francois
Date: 15 June 2013
Affiliation: University of Oxford
Description (optional):
Research method(s): statistics
Tools to be used : R or Matlab
British Library digital collection(s) being used: 19th century books
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 10: Exploring metadata

Name of Person(s): Sara Wingate Gray & Kate Lomax: ARTEFACTO
Date: 16 June
Affiliation:
Description (optional):
Research method(s):
Tools to be used : OSS tools
British Library digital collection(s) being used: Evanion; Kensington Turnpike; Victorian Popular Music; Grimms Northumberland Sketchbooks; others tbc.
Other data to be used:
Other notes / help needed: Continue discussions we've already had with BL Labs team on geodata types & original geo-sources in different metadata sets; meet with curators of different collections to ascertain metadata details for each collection (need more specifics on granularity & types available)
More information (optional) [link to separate page on wiki]

Idea 11: The influence which English music publishers had on the consumption of sheet music in the latter half of the nineteenth century

Name of Person(s): Wendy Stafford
Date:17/06/2013
Affiliation: University of Southampton
Description (optional):My area of interest is in the influence which English music publishers had on the consumption of sheet music in the latter half of the nineteenth century. I would need help with identifying which tools to use. The methodology of this case study could be useful to others in terms of an example of accessing and using BL digital resources.
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 12: Music De-coder

Name of Person(s): Julia Craig-McFeely
Date:3 June
Affiliation:Oxford University
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 13: Infant Mortality -- how the past can inform the present and shape the future

Name of Person(s): Tony Gordon
Date:9 June
Affiliation:Independent Scholar
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 14: Visualising the Spoken Heritage of Britain

Name of Person(s): Georgina Brown
Date: 14 June
Affiliation: University of York
Description (optional):

By using a new approach to accent recognition by machine, this project proposes to create a versatile software interface to compare accents as they change through time. The development of an accent space will provide a visual platform to investigate the similarity between pronunciation systems independently of geographical space. There has been extensive work on producing accent maps where phonetic observations have been logged and located in specific geographical locations, such as the BBC Voices project, but this project aims to shed light on more abstract relationships between accents along continuous spectra. Using the Sounds Familiar digital collection, this project can offer insights into various historical movements of people, and into the convergence and divergence of dynamic pronunciation systems and their relation to one another.

Rather than assigning an accent label to a speech sample based on the presence or absence of certain linguistic features, an accent space can be developed to represent the 'acoustic distance' between various accents. This project will model an accent space, which meaningfully generates distances between accents.

Users of the interface will be able to select individual speakers, accents, or even individual sounds and be able to visualise and compare their similarity relationships through time.


Research method(s):
Tools to be used : Python, Gephi or other visualisation tools
British Library digital collection(s) being used: Sounds Familiar
Other data to be used:
Other notes / help needed: Assistance with user interfaces
More information (optional) [link to separate page on wiki]
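
The notion of an accent space with an "acoustic distance" between accents, as described in Idea 14, can be sketched as follows. This is only an illustration under strong assumptions: the entry does not say how accents are represented, so each accent is assumed here to be summarised as a short numeric feature vector, and plain Euclidean distance stands in for whatever measure the project would use. The accent names and numbers are invented for illustration.

```python
import math
from itertools import combinations

# Made-up feature vectors for illustration only. In practice each vector
# might summarise acoustic measurements taken from many speech samples.
accents = {
    "accent_A": (1.0, 2.0),
    "accent_B": (4.0, 6.0),
    "accent_C": (1.0, 3.0),
}


def acoustic_distance(u, v):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return math.dist(u, v)


# Pairwise distances between all accents, keyed by sorted name pairs.
distances = {
    (a, b): acoustic_distance(accents[a], accents[b])
    for a, b in combinations(sorted(accents), 2)
}

# The closest pair is the most similar pair of accents in this toy space.
closest_pair = min(distances, key=distances.get)

for pair, value in sorted(distances.items()):
    print(pair, round(value, 3))
print("closest:", closest_pair)
```

A table of pairwise distances like `distances` could be exported as weighted edges for a graph visualisation tool such as Gephi, which the entry lists as a possible tool.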


Idea 15: Provide a full-text search research support tool

Name of Person(s): Emanuil Tolev
Date: 17 June 2013
Affiliation: Cottage Labs LLP ( http://www.cottagelabs.com )
Description (optional): Working on several ideas. Just listing them here, more details will be provided with the official submission. This particular idea has many implications, such as word analysis (books turned "inside out" with all the words separated and categorised by book), proper noun analysis (names of places and characters) and more in-depth topical analysis (researching the evolution of whole philosophical or historical concepts).
Research method(s): desk-based research, interviews with digital humanities scholars for the purpose of requirements gathering and evaluation
Tools to be used : elasticsearch indexing engine, bespoke software developed for the project, software development
British Library digital collection(s) being used: 19th Century Books, potentially other textual collections if this one works out (the software will be generic enough)
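
The "inside out" word analysis mentioned in Idea 15 corresponds to an inverted index, the structure that full-text search engines build internally. The sketch below uses only the Python standard library rather than elasticsearch, so it shows the principle and not the proposed implementation. The two sample texts and the names `tokenize`, `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

# Two invented sample texts standing in for digitised books.
books = {
    "book_1": "The river runs to the sea.",
    "book_2": "A ship sails the sea.",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(collection):
    """Map each word to a per-book count of its occurrences."""
    index = defaultdict(lambda: defaultdict(int))
    for book_id, text in collection.items():
        for word in tokenize(text):
            index[word][book_id] += 1
    return index


index = build_index(books)


def search(word):
    """Return {book_id: count} for a word, or an empty dict if unseen."""
    return dict(index.get(word.lower(), {}))


print(search("sea"))
print(search("river"))
```

Proper noun analysis and topical analysis would need more than word counts, for example named-entity recognition, which is outside this sketch.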

Idea 16: Visualisation of materials available


Name of Person(s): Emanuil Tolev
Date: 17 June 2013
Affiliation: Cottage Labs LLP ( http://www.cottagelabs.com )
Description (optional): Working on several ideas. Just listing them here, more details will be provided with the official submission. The basic idea here is that there are a lot of interesting materials at the BL that more people could be acquainted with. Thus, make a visualisation visually similar to http://wheredoesmymoneygo.org/bubbletree-map.html that exposes these materials, and customise it to suit this application (e.g. allow drill-downs by type of material and tags).
Research method(s): desk-based research, interviews with digital humanities scholars for the purpose of evaluation (continuous evaluation in an agile manner), software development
Tools to be used: elasticsearch indexing engine, bespoke software developed for the project
British Library digital collection(s) being used: Metadata from all collections that the BL wants to expose will be used, starting with 19th Century Books and Victorian Pop Music to give the visualisation some variety.

Idea 17: Make copyright-free images available as stock images on popular art community gathering sites


Name of Person(s): Emanuil Tolev
Date: 17 June 2013
Affiliation: Cottage Labs LLP ( http://www.cottagelabs.com )
Description (optional): Working on several ideas. Just listing them here, more details will be provided with the official submission. This idea is again related to exposure of materials. All copyright-free materials like the images of the Victorian pop music collection can, in theory, be used and remixed in current artworks, i.e. used as "stock" images by contemporary artists. However, in order for that to occur, these artists need to know about the BL's rich digital collections. DeviantArt and other artist communities bring *thousands* of current artists from all over the world together to share and improve in their respective arts, and have categories for stock images. Thus, the BL's collections could be uploaded to DeviantArt and similar sites and linked back to the BL in the images' descriptions automatically, as well as tagged as "stock" automatically. Of course, a marketing effort (reaching out to DeviantArt staff) can also be made to greatly increase the popularity of the "newly" available images.
Research method(s): desk-based research, software development
British Library digital collection(s) being used: Data from all collections that the BL wants to expose will be used, starting with Victorian Pop Music's images.

Idea 18: Generate audio files from sheet music


Name of Person(s): Emanuil Tolev
Date: 17 June 2013
Affiliation: Cottage Labs LLP ( http://www.cottagelabs.com )
Description (optional): Working on several ideas. Just listing them here, more details will be provided with the official submission. Sheet music available in the collections does not necessarily have corresponding audio files which could be readily used in contemporary art without having to manually convert the scores into audio interpretations on an instrument such as a digital piano. It would be interesting to investigate whether it's possible to generate piano (and other instrument) interpretations automatically. These would never be the same as a human performance of course, but if they are of sufficient quality it would make the music itself much more widely known and available. Yamaha Corporation's Vocaloid technology, as used for Hatsune Miku, has certainly managed it with the human voice!
Research method(s): desk-based research, software development
Tools to be used: open source software which does sheet interpretation + bespoke enhancements or small bits of software developed for the project
British Library digital collection(s) being used: All collections which contain sheet music, e.g. Victorian Popular Music and Early Music Online.
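
The final step of Idea 18, turning already-transcribed notes into audio, can be sketched with the Python standard library alone. This assumes the hard part (reading notes from scanned sheet music) has been done elsewhere and that the result is a list of MIDI note numbers; the entry does not specify any of this. The functions `midi_to_hz`, `render` and `to_wav_bytes`, the sample rate, and the plain sine-wave tone are all illustrative choices, nowhere near a piano-quality interpretation.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second; low but enough for a demonstration


def midi_to_hz(note: int) -> float:
    """Equal-temperament frequency of a MIDI note (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render(notes, seconds_per_note: float = 0.25) -> bytes:
    """Render each note as a sine tone of equal length, as 16-bit mono PCM."""
    frames = bytearray()
    samples_per_note = int(SAMPLE_RATE * seconds_per_note)
    for note in notes:
        freq = midi_to_hz(note)
        for i in range(samples_per_note):
            value = math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            frames += struct.pack("<h", int(0.5 * 32767 * value))
    return bytes(frames)


def to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM data in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# C4, E4, G4: the notes of a C major chord played one after another.
pcm = render([60, 64, 67])
wav_bytes = to_wav_bytes(pcm)
print(len(wav_bytes), "bytes of WAV data")
```

Writing `wav_bytes` to a file with a `.wav` extension would give a playable result.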

Idea 19: Connect metadata (or full text data) to Europeana objects


Name of Person(s): Emanuil Tolev
Date: 17 June 2013
Affiliation: Cottage Labs LLP ( http://www.cottagelabs.com )
Description (optional): Working on several ideas. Just listing them here, more details will be provided with the official submission. The idea here is very simple: when viewing an object on the Europeana culture portal, enable it to say "this object you're viewing is referred to in these 12 books: X, Y, Z, etc." with hyperlinks to the BL digital collection record of each book.
Research method(s): desk-based research, software development
Tools to be used: elasticsearch if doing the full text version of this idea, otherwise just minor software development to connect the BL metadata with the Europeana metadata and allow Europeana to display the information properly
British Library digital collection(s) being used: All BL collections (if using only the metadata), for full text search all BL text collections

Idea 20: Visualising Sound Collections

Name of Person(s): Kim Foale
Date:
Affiliation: University of Salford
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 21: Generating stories from archives using agent-based simulation

Name of Person(s): Kevin Walker, Kate McLean, Sam McElhinney
Date:17 Jun 2013
Affiliation: Royal College of Art, Canterbury Christ Church University, MUD Architecture
Description (optional):
Research method(s): Agent-based simulations
Tools to be used : Bespoke software + BL collections
British Library digital collection(s) being used: IDP, IAMS, EAP
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 22: Creation of an API to retrieve digitised manuscript information for integration into 3rd party systems

Name of Person(s): Sarah de Haas
Date: 17 June 2013
Affiliation: None
Description (optional): It would be interesting to try to promote inter-institutional collaboration by allowing 3rd party systems - for example, other libraries or research institutions - to include material from digitised collections of the British Library alongside their own digital collections. Access could be improved by offering metadata accessible via an application interface.
Research method(s):
Tools to be used : Dependent on the current database systems in place at the BL
British Library digital collection(s) being used: The BL Digitised Manuscripts collection
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]
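
As a rough illustration of what Idea 22 describes, the sketch below shows a lookup function that returns manuscript metadata as JSON, the kind of response a 3rd party system could consume. It is not based on any existing BL system: the identifier `EXAMPLE-MS-1`, the record fields, the status values and the function `get_manuscript` are all hypothetical, and a real API would sit behind a web server and query the BL's own databases.

```python
import json

# Placeholder records; the identifier and fields are invented for illustration.
_RECORDS = {
    "EXAMPLE-MS-1": {
        "title": "Placeholder title",
        "language": "Latin",
        "images": 12,
    },
}


def get_manuscript(identifier: str) -> str:
    """Return a JSON document describing one manuscript, or an error."""
    record = _RECORDS.get(identifier)
    if record is None:
        return json.dumps({"status": 404, "error": "not found"})
    return json.dumps({"status": 200, "id": identifier, **record})


# A 3rd party system would parse the JSON and merge it with its own records.
response = json.loads(get_manuscript("EXAMPLE-MS-1"))
print(response["title"], response["images"])
```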

Idea 23: (Document Interlinking)

Name of Person(s): Bernardo Pereira Nunes / Terhi Nurmikko
Date:
Affiliation: Leibniz University of Hannover & PUC-Rio / University of Southampton
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]

Idea 0: (working title)

Name of Person(s):
Date:
Affiliation:
Description (optional):
Research method(s):
Tools to be used :
British Library digital collection(s) being used:
Other data to be used:
Other notes / help needed:
More information (optional) [link to separate page on wiki]