Bernardo Pereira Nunes, Terhi Nurmikko-Fuller


Submitted Entry for 2014 Competition


Abstract

Galleries, libraries, archives and museums play a central role in the collection, preservation, storage, indexing and dissemination of the vast body of human creativity and knowledge. Yet, even when data is centrally stored and openly accessible, information can remain confined in the infamous “silo”, unconnected to relevant external knowledge which could enrich it. Adherence to the Linked Data method, which involves the publication of data online in non-proprietary and machine-readable formats, is one solution that has recently gained momentum in the Digital Humanities.

We propose a playful approach for unlocking the potential of the knowledge within the collections of the British Library (henceforth “the Library”). For this Labs competition, we have designed and tentatively produced a serious game, or a Game with a Purpose (GWP). This platform provides an informal educational environment, where participants engage in a quest for knowledge by solving riddles and helping others to do so. Each participant also has the opportunity to create their own riddles using resources available in the Library and on the Web, and can create tracks with clues that both directly and indirectly provide connections between information resources.

Our proposal builds on the success of the “Digital Explorers” submission for 2013, for which we received a Special Mention. Since then, our project and ideas have had time to mature and develop. Applying lessons learnt from other crowdsourcing projects such as Duolingo.com and the ever-growing Zooniverse.org, we have combined the elements of reward schemes and social interaction which encourage initial take-up and help increase participant retention. Our proposed project is an interdisciplinary endeavour, combining elements from successful crowdsourcing and citizen science projects to effectively fuel a Linked Data system.

The GWP operates on two levels simultaneously. Participants encounter and experience the interactive and engaging user interface (which we plan to improve in coming months), which encourages the creation and solving of riddles as a way of navigating through the Library’s data. Behind the scenes, the riddles are used to create machine-readable connections between resources. The validation of each created path is based on the ‘wisdom of the crowd’: as soon as a number of users solve the riddle, the semantic links between the resources are automatically created. The GWP has the potential to generate links between various resources and even result in the discovery of underlying links between them.
http://research.ccead.puc-rio.br/treasureExplorers/

Assessment Criteria

The research question / problem you are trying to answer

Please focus on the clarity and quality of the research question / problem posed:

The problem space which inspired this research project incorporates a number of seemingly disparate topics. We were first interested in the potential of semantic web technologies and Linked Data to bring to light connections within the Library’s dataset, and then to extend those connections to the rest of the Web. This would not only allow data to be enriched by external sources but would also help increase the visibility of the collections and may bring new audiences to the Library. The sheer scale of the collections and the inherent richness of the data mean that such a task is not feasible for an individual (or a small number of individuals) to complete, and this, in turn, inspired the research question:

How successful would an informal learning environment be in encouraging members of the public to engage with a game with a purpose, to unearth implicit knowledge and hidden facts within the collections of the British Library?

Please explain the ways your idea will showcase British Library digital collections

Please ensure you include details of British Library digital collections you are showcasing (you may use several collections if you wish), a sample can be found at http://labs.bl.uk/Digital+Collections

Our proposed idea (the GWP “Treasure Explorers”) is designed to serve several purposes: user entertainment, the acquisition and dissemination of new knowledge, and the improvement or completion of tasks beyond the limits of automated systems. The project is not limited to any particular area of expertise or interest, nor does it require any existing skill-set, and is thus in a position to showcase the collections to new and diverse audiences. The novel approach and the game layout may attract participants from those demographics that have traditionally had low levels of engagement with cultural and heritage institutions. Furthermore, the linking of data and resources will help increase traffic to the British Library’s site, and open up the collections in new directions. As signing in to the game is currently done through Facebook, players are able to advertise their involvement with the game, potentially harnessing some of the power of social media to disseminate information about the game and the Library.

Please detail the approach(es) / method(s) you are going to use to implement your idea, detailing clearly the research methods / techniques / processes involved

Indicate and describe any research methods / processes / techniques and approaches you are going to use, e.g. text mining, visualisations, statistical analysis etc.

Much of the initial game design has already been completed, including the initial set-up of the development environment and the modelling of the game structure (in PHP). The design of the final game interface and the fine-tuning of the rules are tasks that we believe would greatly benefit from close cooperation with the staff at the Library, helping us to adhere to branding and to ensure that the data collected reflects the research interests of the Library staff involved. Similarly, although it requires no specialist software, the game requires the creation of initial riddles. For this too, we would welcome the expert knowledge of the Library’s staff to help highlight little-known but useful connections between separate collections. This stage of the process would also benefit from qualitative feedback from the Library’s staff.

The inherent flexibility of Digital Explorers as a platform allows us to manage the challenges we anticipate from working with such a large and heterogeneous dataset as the Library’s collection. Because it crowdsources the annotations that generate Linked Data, our project lies in an interdisciplinary space which combines both technical and social considerations. Thus, the success of the project depends not only on the generation of linked datasets, but also on the initial attraction of participants to engage with Digital Explorers and the retention of these participants. Beta-testing and qualitative analysis of the feedback from volunteers will help us develop the game structure and the user interface, and assess the need for additional elements such as a clearly defined reward scheme, error monitoring and reduction, a medium for interaction between participants, and anything else which the evaluation process might bring to light.

Please provide evidence of how you / your team have the skills, knowledge and expertise to successfully carry out the project by working with the Labs team

E.g. work you may have done, publications, a list with dates and links (if you have them)

Our interdisciplinary team has extensive experience and a record of publications in online engagement via social media and online projects, including a long-standing, up-to-date and well-maintained presence across several different social networking sites. We have combined expertise in the fields of Web Science, Semantic Web, Information Retrieval, Technology Enhanced Learning, Museum Studies and Digital Humanities.

A list of recent publications (since 2012):
Dietze, S., Sanchez-Alonso, S., Ebner, H., Yu, H., Giordano, D., Marenzi, I. and Nunes, B. P. (2013) Interlinking educational Resources and the Web of Data - a Survey of Challenges and Approaches. Emerald Program: electronic Library and Information Systems, (47)1, 2013.

Fetahu, B., Nunes, B. P. and Dietze, S. (2013) Towards Focused Knowledge Extraction: Query-based extraction of structured Summaries. In Proceedings of the 22nd World Wide Web Conference, ACM, 2013.

Fetahu, B., Nunes, B. P. and Dietze, S. (2013) Summaries on the fly: Query-based extraction of Structured Knowledge from Web Documents. In Proceedings of the 13th International Conference on Web Engineering, 2013.

Kawase, R., Fisichella, M., Niemann, K., Pitsilis, V., Vidalis, A., Holtkamp, P. and Nunes, B. P. (2013) OpenScout: harvesting business and management learning objects from the web of data. In Leslie Carr, Alberto H. F. Laender, Bernadette Farias Lóscio, Irwin King, Marcus Fontoura, Denny Vrandecic, Lora Aroyo, José Palazzo M. de Oliveira, Fernanda Lima, and Erik Wilde (Eds.), WWW Companion Volume, 445-450, International World Wide Web Conferences Steering Committee / ACM, 2013.

Kawase, R., Fisichella, M., Nunes, B. P., Ha, K.-H. and Bick, M. (2013) Automatic Classification of Documents in Cold-start Scenarios. WIMS 2013: International Conference on Web Intelligence, Mining and Semantics, June 2013.

Kawase, R., Siehndel, P., Nunes, B. P. and Fisichella, M. (2013) Automatic Competence Leveling of Learning objects. ICALT 2013: 13th IEEE International Conference on Advanced Learning Technologies ICALT, Beijing, China, July 2013.

Kawase, R., Siehndel, P., Nunes, B. P., Fisichella, M. and Nejdl, W. (2012) Towards Automatic Competence Assignment of Learning Objects. In Andrew Ravenscroft, Stefanie N. Lindstaedt, Carlos Delgado Kloos, and Davinia Hernández Leo (Eds.), EC-TEL, (7563):401-406, Springer, 2012.

Kawase, R., Nunes, B. P. and Siehndel, P. (2013) Content-based Movie Recommendation within Learning Contexts. ICALT 2013: 13th IEEE International Conference on Advanced Learning Technologies ICALT, July 2013.

Kawase, R., Nunes, B. P., Herder, E., Nejdl, W. and Casanova, M. A. (2013) Who Wants To Get Fired. Proc. Web Science 2013, ACM, May 2013. (Worldwide Press Coverage - CNN, FOX, Le Monde, Bild, etc. - google: fireme.me)

Nunes, B. P., Kawase, R., Dietze, S., de Campos, G. H. B. and Nejdl, W. (2012) Annotation Tool for Enhancing E-Learning Courses. In Elvira Popescu, Qing Li, Ralf Klamma, Howard Leung, and Marcus Specht (Eds.), ICWL, (7558):51-60, Springer, 2012. (Best paper nominee)

Nunes, B. P., Kawase, R., Dietze, S., Taibi, D., Casanova, M. A. and Nejdl, W. (2012) Can Entities be Friends? In Giuseppe Rizzo, Pablo Mendes, Eric Charton, Sebastian Hellmann, and Aditya Kalyanpur (Eds.), Proceedings of the Web of Linked Entities Workshop in conjunction with the 11th International Semantic Web Conference, (906):45-57, November 2012.

Nunes, B. P., Caraballo, A. A. M., Casanova, M. A. and Kawase, R. (2012) Automatically generating multilingual, semantically enhanced, descriptions of digital audio and video objects on the Web. In Manuel Graña, Carlos Toro, Jorge Posada, Robert J. Howlett, and Lakhmi C. Jain (Eds.), KES, (243):575-584, IOS Press, 2012.

Nunes, B. P., Pedrosa, S., Kawase, R., Alrifai, M., Marenzi, I., Dietze, S. and Casanova, M. A. (2013) Answering Confucius: the reason why we complicate. 8th European Conference of Technology Enhanced Learning, EC-TEL 2013, September 2013.

Nunes, B. P., Caraballo, A. A. M., Casanova, M. A. and Kawase, R. (2012) Boosting Retrieval of Digital Spoken Content. In Manuel Graña, Carlos Toro, Robert J. Howlett, and Lakhmi C. Jain (Eds.), KES Selected Papers, (7828):153-162, Springer, 2012.

Nunes, B. P., Caraballo, A. A. M., Casanova, M. A., Fetahu, B., Leme, L. A. P. P. and Dietze, S. (2013) Complex Matching of RDF Datatype Properties. In Proceedings of 24th International Conference on Database and Expert Systems Applications, 2013.

Nunes, B. P., Dietze, S., Casanova, M.A., Kawase, R., Fetahu, B. and Nejdl, W. (2013) Combining a co-occurrence-based and a semantic measure for entity linking. ESWC 2013 - 10th Extended Semantic Web Conference, May 2013.

Nunes, B. P., Fetahu, B. and Casanova, M. A. (2013) Cite4Me: Semantic Retrieval and Analysis of Scientific Publications. Proceedings of the LAK Data Challenge, held at LAK 2013, the Third Conference on Learning Analytics and Knowledge, 2013. (Cite4Me was awarded with the second prize in the LAK Data Challenge)

Nunes, B. P., Kawase, R., Fetahu, B., Dietze, S., Casanova, M.A. and Maynard, D. (2013) Interlinking documents based on semantic graphs. In Proceedings of the 17th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems KES2013, September 2013.

Nunes, B. P., Kawase, R., Siehndel, P., Casanova, M.A. and Dietze, S. (2013) As Simple As It Gets – A sentence simplifier for different learning levels and contexts. ICALT 2013: 13th IEEE International Conference on Advanced Learning Technologies ICALT, Beijing, China, July 2013.

Nurmikko, T. (2013), “To survey the land, he left his city” and other proverbs: Mapping ancient Mesopotamia from cuneiform inscriptions. In the proceedings of HESTIA2, Southampton, 18 July 2013.

Nurmikko, T. (forthcoming) Assessing the suitability of existing OWL ontologies for the representation of narrative structures in Sumerian literature. In Elliott, T., Heath, S. and Muccigrosso, J. (eds). “Current Practice in Linked Open Data for the Ancient World”. ISAW Papers, 7. Institute for the Study of the Ancient World, New York University.

Nurmikko, T. (forthcoming) Computed Tomography for Cuneiform Tablets: Seeing Inside Sealed Envelopes. In proceedings of the Current Research in Cuneiform Paleography workshop at the 60th Rencontre Assyriologique Internationale, University of Warsaw, 21-25 July 2014.

Nurmikko, T., (2012), Linking the Data of Ancient Sumer. In the proceedings of the Computer Applications and Quantitative Methods in Archaeology, Southampton, 26 – 30 March, 2012.

Nurmikko, T., (2014) Ontological Representation of Sumerian Literary Narratives. In the proceedings of British Association of Near Eastern Archaeology, University of Reading, 9-11 January 2014.

Nurmikko, T., Earl, G. and Gibbins, N. (2013), Ontological Representation of Mesopotamian Literary Compositions. In the proceedings of 2013 American Schools of Oriental Research Annual Meeting, Baltimore, Maryland, 20 – 23 November, 2013.

Nurmikko, T., Earl, G., Dahl, J., and Gibbins, N., (2012) Citizen Science for Cuneiform Studies. In the proceedings of the ACM Web Science Conference, Chicago, 22 - 24 June, 2012.

Nurmikko, T., Earl, G., Martinez, K. and Dahl, J., (2013), Web Science for Ancient History: Deciphering proto-Elamite Online. In the proceedings of the ACM Web Science Conference, Paris, 2 – 4 May, 2013. (Awarded Best Poster Presentation of a Full Paper)

Paes Leme, L. A. P., Lopes, G. R., Nunes, B. P., Casanova, M.A. and Dietze, S. (2013) Identifying candidate datasets for data interlinking. Proceedings of the 13th International Conference on Web Engineering, 2013.

Please provide evidence of how you think your idea is achievable on a technical, curatorial and legal basis

Indicate the technical, curatorial and legal aspects of the idea (you may want to check with Labs team before submitting your idea first).

Technical
The technical requirements for the implementation of the proposed game are modest: a Linux or Windows-based Web server with PHP (version 5.2 or later) and MySQL (version 5.0 or later) support. On the client side, players only require a device with a Web browser that supports JavaScript.

The game is Web-based and will be developed using Web languages/scripts such as PHP, HTML, and JavaScript/jQuery, for the most common Web browsers (Firefox, Internet Explorer, Chrome and Safari). A domain name is required in order to make the game public (this could perhaps be a subdomain under the British Library Labs domain: xyz.labs.bl.uk, for example). PUC-Rio has a server which can be used for the project.
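
As an illustration of the crowd validation described in the abstract, the following is a minimal sketch, written in Python rather than the game's PHP. It makes several assumptions that the proposal does not fix: a hypothetical threshold of three distinct solvers (the text only says “a number of users”), rdfs:seeAlso as the linking property, and N-Triples as the output format. The names Riddle, record_solution and to_ntriple, and the example.org addresses, are invented for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the proposal only says "a number of users".
REQUIRED_SOLVERS = 3
# Assumed linking property; the proposal does not name a vocabulary.
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
# Characters that may not appear unescaped inside an N-Triples IRI.
FORBIDDEN = set('<>"{}|^`\\')


@dataclass
class Riddle:
    """A riddle whose clue leads from one resource to another."""
    source_uri: str
    target_uri: str
    solvers: set = field(default_factory=set)
    validated: bool = False


def record_solution(riddle, user_id, threshold=REQUIRED_SOLVERS):
    """Register a correct solution; return True on the solve that validates."""
    riddle.solvers.add(user_id)  # a set ignores repeat solves by one user
    if not riddle.validated and len(riddle.solvers) >= threshold:
        riddle.validated = True
        return True
    return False


def to_ntriple(riddle, predicate=SEE_ALSO):
    """Serialise a validated riddle as one N-Triples statement."""
    if not riddle.validated:
        raise ValueError("link has not been validated by the crowd yet")
    for uri in (riddle.source_uri, predicate, riddle.target_uri):
        # Basic check only: reject empty IRIs, whitespace and control
        # characters, and the characters N-Triples forbids inside <...>.
        if not uri or any(ch in FORBIDDEN or ord(ch) <= 0x20 for ch in uri):
            raise ValueError(f"not usable as an N-Triples IRI: {uri!r}")
    return f"<{riddle.source_uri}> <{predicate}> <{riddle.target_uri}> ."


if __name__ == "__main__":
    riddle = Riddle("http://example.org/item/a", "http://example.org/item/b")
    for user in ["ana", "ben", "ana", "cy"]:
        if record_solution(riddle, user):
            print(to_ntriple(riddle))
```

In this sketch the link is emitted only when the third distinct player solves the riddle; a repeated solve by the same player does not count. Counting distinct users rather than solves is one possible design choice, so that a single player cannot validate a link alone.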

Curatorial
Providing materials and knowledge for the formation of the original connections between resources will be an opportunity to showcase types of curatorial knowledge which might not frequently be exposed to the public. Connections drawn by the participants may help bring to light implicit connections that have yet to be made, simply due to restrictions on resources such as curatorial time. The project will enable curatorial staff to engage with the public via media that are not excessively time-consuming, and help foster relationships between the institution and the public. Moreover, since it is the public, and not the curators, who compose the riddles, and the results are validated by an automated system, the project places little demand on a curator’s time.

Legal
We are adhering to the copyright restrictions of the collection (http://www.bl.uk/copyright). As for external sources, we will only link to resources published openly under relevant Creative Commons licences, attributing rights as required.

Please provide a brief plan of how you will implement your project idea by working with the Labs team

You will be given the opportunity to work on your winning project idea between May 26th - Oct 31st, 2014.

May 26 2014 - Onwards
Establish a number of riddles to populate the game and give participants content to trial, before they produce their own riddles.

June 2014
Tentative deadline for the game, ready for beta-testing and participant evaluation.

July 2014
User testing - assessing uptake and participation, first look at audience demographics.

August 2014 - October 2014
Data analysis - examining who plays, what connections are made, identification of possible emerging patterns.