Abstract


Despite the persistent concern in the cultural heritage sector over the past two decades to break down so-called digital "silos", in one respect visual art collections often remain resolutely "siloed": in the move to digital, most institutions have maintained the same curatorial and organisational principles found in their physical gallery space. Typically, images are catalogued by artist, provenance, date, and perhaps genre. As a result, the online experience of an institution's collection ends up resembling a traversal of the institution itself: by default, like items are grouped with like, and the user explores an information architecture that loosely mirrors the institution's physical layout.
There are two orders of explanation for this siloing - the first disciplinary, the second technical. Organisation by artist, period, provenance, and style facilitates the perception of commonalities, training the eye in the perception of detail and lending coherence and an evolutionary narrative to collections. But identifying commonalities is of course only one end of the cognitive spectrum involved in understanding the visual arts. Exploring difference allows stylistic choices to emerge with clarity; it permits deeper examination of the co-implication of various factors (medium, genre, period) in the creation of artworks; and ultimately it discloses the vivid and contrasting 'ways of seeing' that artists and traditions bring to bear upon their subjects.
On a technical level, siloing by similarity arises both from the design of the standard search interface and from limitations in our metadata. By their very nature, search interfaces are designed to group like works together: items appear in a result list by virtue of the fact that they share terms in common. This grouping operation is furthermore performed over metadata fields largely informed by traditional, similarity-focused curatorial practices. In particular, the subject field - the metadata information that most readily reveals visual and stylistic contrast across cultures, epochs, and areas - is usually absent entirely. If provided at all, it is typically a freetext field of highly variable quality. Desirable as it might be to compare common themes and artistic subjects cross-culturally, existing cataloguing and interface design make this difficult or impossible to achieve.
The purpose of the Chiaroscuro Plugin Suite is to provide a set of three simple plugins allowing librarians and curators to facilitate not just comparison, but contrast across their collections: a CV Editor, for the creation of basic SKOS-based multilingual controlled vocabularies for subject headings; an Enricher, which extracts and normalises these terms and their synonyms from existing metadata fields, to populate a Subject field; and a Viewer, a standard faceted interface extended to facilitate contrast exploration with a simple 'Split' button inspired by that of KDE's Dolphin file manager. The development path for the suite will also involve the creation of a small (fewer than 2,500 items) demonstration collection of animal images drawn from different periods (modern, medieval) and cultures (European, Islamic, Asian) that can stand as a web application in its own right.
Relevant URL (if available):

Research Question / Problem

In a sense, the purpose of the Chiaroscuro Plugin Suite is less to pose a particular research question than to provide a framework enabling curators, art historians, anthropologists, and others to formulate their own and showcase their results. It is almost impossible, using existing digital tools, for researchers to ask questions about the visual arts cross-culturally and receive back meaningful answers - or indeed any at all. Chiaroscuro aims to remedy this.
The inspiration for Chiaroscuro originally grew out of my interest in zoomorphic art: that is to say, art depicting animals in the process of transmogrifying into humans or into other animals. Zoomorphic decoration and motifs are found across a wide range of premodern cultures, appearing to stand at the beginning of, for example, the native North American, northern European, and Iron-Age Chinese artistic traditions and persisting for centuries thereafter; a comparative study would have the potential to illuminate several related aspects of art and culture simultaneously.
The topic, however, proved impossible to research online for several reasons.
First, subject cataloguing (in the sense of 'subject depicted') is missing from almost all online imagebanks. Dates, provenance, and creator name are regularly supplied; but subject information normally has to be gleaned from the title field - or, if present, from a freetext general description field.
Second, where subject cataloguing *is* given, it is very often provided in an uncontrolled and/or inconsistent manner, both internally within a single institution and between institutions.
Third, such controlled vocabularies as do exist for 'subject depicted' in artworks - ICONCLASS and the evolving Getty Iconography Authority/CONA standard - are restricted in scope, and in addition focus within and along cultural lines rather than across them. In other words, they cannot readily be used for cross-cultural comparison.
Fourth, multilinguality is a challenge. Cross-cultural comparison is often also cross-linguistic; relevant works may be held by institutions anywhere in the world; and the reliance on freetext fields such as *title* and *description* demands that researchers have a relatively high degree of linguistic competence not only to formulate queries, but to understand the results.
Finally, display is a problem. The metadata difficulties described above make search a hunt-and-peck operation. But in almost all existing search interfaces, every hunt-and-peck move wipes out the result list from the previous operation. While it is sometimes possible to compensate for this in various ways - opening new tabs, using more than one screen - these workarounds, even when available, are unwieldy to manage and do not scale well. Relatedly, search interfaces are almost always designed to facilitate a continuous narrowing of search focus, rather than comparison between distinct sets of items. UX and UI changes are needed to allow fluid collection exploration and meaningful comparison of results.
As the above summary indicates, the considerations at the heart of Chiaroscuro are much broader and more widely applicable than research into zoomorphic art. Comparison and contrast are basic, arguably fundamental, modes of understanding the visual arts; but both operations - and in particular *contrast* - are poorly served in existing online collections. Were Chiaroscuro to become a reality, it would become possible to conceive of a whole range of research areas previously closed to digital scholarship. Examples might include: comparison of heavenly 'messenger' beings (angels, malakim, and devas) in Christian, Islamic, and Hindu art; images of rulership in various epochs of the Byzantine empire and its dependencies; or representations of northern 'barbarians' in Chinese and Greco-Roman art. But of course the point of Chiaroscuro is to empower users to define their own research areas. In the course of development, we will be creating an example application showcasing animal imagery taken from several different cultures - in part because of my abiding interest in zoomorphism, in part because this forms a usable test domain, in part because of the popularity of animals as subjects more generally. But this collection is intended primarily as a template and testbed - a starting point for other users, rather than an end in itself.

Showcasing BL Digital Collections

As noted above, the course of Chiaroscuro's development will see the creation of a web application - tentatively titled 'Menagerie' - dedicated to showcasing the richness and diversity of its stored records depicting animals. In the first instance, these records will be drawn from the British Library's Illuminated Manuscripts collection (http://gallery.bl.uk/viewall/default.aspx?e=Illuminated%20Manuscripts), rich in bestiaries and animal marginalia, and its Flickr account (https://www.flickr.com/photos/britishlibrary). These will be complemented by records drawn from Europeana, the British Museum - and, schedule permitting, from the Louvre and the New York Public Library.
Methods

The technical architecture of the Chiaroscuro plugins is intended to leverage advanced existing technologies, while remaining in itself quite simple.
The CV Editor: This will be a lightweight editor supporting users in the creation of simple subject hierarchies with controlled terms. The underlying data structure will consist of a small subset of SKOS constructs. Multilingual synonym suggestion will be provided using the BabelNet API (http://babelnet.org/). While BabelNet is backed by extremely sophisticated Natural Language Processing, the Editor itself will act merely as a client of this service.
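As a minimal sketch of the kind of data structure the CV Editor might manage, the following models three SKOS properties (prefLabel, altLabel, broader) as plain Python objects. The class name, the `ex:` URIs, and the sample labels are all illustrative assumptions, and the synonyms are entered by hand here rather than suggested by BabelNet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One controlled term, modelled on a small subset of SKOS (hypothetical class)."""
    uri: str
    pref_label: dict[str, str]  # skos:prefLabel, one per language code
    alt_labels: dict[str, list[str]] = field(default_factory=dict)  # skos:altLabel
    broader: Optional[str] = None  # skos:broader, given as the parent's URI


def ancestors(concept_uri: str, vocab: dict[str, Concept]) -> list[str]:
    """Follow skos:broader links from a concept up to the top of the hierarchy."""
    chain = []
    current = vocab[concept_uri].broader
    while current is not None:
        chain.append(current)
        current = vocab[current].broader
    return chain


# Illustrative three-level hierarchy; URIs and labels are made up for the example.
vocab = {
    c.uri: c
    for c in [
        Concept("ex:animal", {"en": "animal", "fr": "animal"}),
        Concept("ex:bird", {"en": "bird", "fr": "oiseau"}, broader="ex:animal"),
        Concept(
            "ex:eagle",
            {"en": "eagle", "fr": "aigle"},
            alt_labels={"en": ["erne"], "la": ["aquila"]},
            broader="ex:bird",
        ),
    ]
}

eagle_ancestors = ancestors("ex:eagle", vocab)
```

Keeping one preferred label per language and any number of alternative labels is what would later let a single canonical term stand in for its synonyms across languages.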
The Enricher: It is assumed that the backend data-store being used is Solr; all that the Enricher does, then, is search for terms of interest in Solr documents, map these to a canonical controlled-vocabulary term, and then add this term as a subject to the relevant records. While the need for dynamic updating and multilingual support introduces some complexity here, correct configuration of Solr's out-of-the-box functionality will give us everything required.
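The mapping step described above can be sketched roughly as follows. This is an illustration only: plain dictionaries stand in for Solr documents, the synonym table and field names are hypothetical, and matching is limited to single-word terms, whereas the real Enricher would rely on Solr's own analysis and update facilities.

```python
import re

# Hypothetical synonym table: surface form (casefolded) -> canonical term.
SYNONYMS = {
    "eagle": "eagle",
    "aigle": "eagle",
    "adler": "eagle",
    "lion": "lion",
    "löwe": "lion",
}

WORD = re.compile(r"\w+")


def enrich(record: dict, source_fields=("title", "description")) -> dict:
    """Return a copy of the record with a sorted, de-duplicated 'subject' list."""
    subjects = set(record.get("subject", []))
    for name in source_fields:
        # Tokenise each freetext field and look every token up in the table.
        for token in WORD.findall(record.get(name, "")):
            term = SYNONYMS.get(token.casefold())
            if term:
                subjects.add(term)
    return {**record, "subject": sorted(subjects)}


enriched = enrich({"id": "demo-1", "title": "Der Löwe und der Adler"})
```

Here a German-language title yields the canonical subjects `eagle` and `lion`, which is the behaviour that would make records findable regardless of the language of their freetext fields.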
The Viewer: The Viewer will in most respects be a standard faceted interface, of the kind that Solr makes simple to implement on a data level. However, the UI will be provided with a 'Split' button (akin to that found in KDE's Dolphin file manager), allowing the user both to split the result display into discrete sections, and to restrict search and browse operations to any one of these sections. In addition, JavaScript will be used to enable dragging and reordering functionality. HTML5, jQuery, and CSS will be sufficient to achieve this.
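One way to picture the 'Split' behaviour is as a list of panes, each carrying its own facet selections, so that refining one pane leaves the others untouched. The sketch below models only that state, not the browser UI; the class names and the `subject` and `culture` field names are assumptions, while `q` and `fq` are Solr's standard query and filter-query parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Pane:
    """One section of the split display, with its own query and facet filters."""
    query: str = "*:*"
    filters: dict[str, str] = field(default_factory=dict)

    def solr_params(self) -> dict:
        # Each selected facet value becomes one Solr filter query (fq).
        return {
            "q": self.query,
            "fq": [f'{name}:"{value}"' for name, value in self.filters.items()],
        }


@dataclass
class SplitView:
    panes: list[Pane] = field(default_factory=lambda: [Pane()])

    def split(self, index: int) -> int:
        """Duplicate a pane next to itself and return the new pane's index."""
        source = self.panes[index]
        self.panes.insert(index + 1, Pane(source.query, dict(source.filters)))
        return index + 1

    def refine(self, index: int, facet: str, value: str) -> None:
        """Apply a facet selection to a single pane only."""
        self.panes[index].filters[facet] = value


view = SplitView()
view.refine(0, "subject", "eagle")       # shared starting point
right = view.split(0)                     # the 'Split' button
view.refine(0, "culture", "European")    # left pane only
view.refine(right, "culture", "Islamic")  # right pane only
```

After these steps both panes still share the `subject` filter but differ on `culture`, which is the side-by-side contrast the Split button is meant to support.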
These three isolated pieces of functionality will be developed and integrated into the Blacklight framework, meaning that most of the basic 'plumbing' and infrastructure work will already have been completed.

Evidence that Entrant(s) can successfully complete the project

As can be seen from my LinkedIn profile (https://www.linkedin.com/in/tim-hill-7994a44), I have extensive experience working with metadata, cultural artifacts, and the intersection between the two. My earliest experience in this area was in helping to create the British Printed Images to 1700 browser for King's College London and the British Museum (http://bpi1700.org.uk/jsp/) in 2009. Since then I have worked on search interfaces for papyri (http://papyri.info), as ETL and metadata specialist for a commercial humanities publisher (http://alexanderstreet.com/), and as a Search Engineer for Europeana (http://europeana.eu/portal/). In all of these roles I have worked extensively with existing metadata standards, with Solr, with object-oriented programming frameworks - and with the often-messy, sometimes-insane, always-fascinating data the cultural-heritage sector throws up. The idea for Chiaroscuro is the fruit of long experience working in this area - and of the realisation that the solutions for some of its most frequently encountered frustrations might be relatively simple.
In terms of producing *code*, then, I feel I am fairly self-sufficient here. However, creating a working application will involve considerable amounts of data analysis, data-munging, and mapping. For instance, although the BL's Illuminated Manuscript collection is all implicitly medieval European in provenance, this is not stated as such in the metadata itself, so some work needs to be done encoding this information - and hopefully more fine-grained distinctions - explicitly; there will doubtless be many edge-cases (mythical animals; figures composed of more than one animal) that will have to be dealt with; and while the CV Editor provides editing functionality, somebody will actually have to perform this editing and CV creation. It is in this sort of work that I hope to be supported by the BL team - with many eyes, all data problems are obvious. In addition, development is intended to be iterative, and it is hoped that the BL can play an active role in using, testing, and providing feedback on the software during the development process - and, of course, ideally coding.

How idea is achievable on a Technical, Curatorial and Legal basis

Technical
As described above, the various Chiaroscuro plugins are all technically straightforward: while leveraging advanced NLP services, the plugins themselves are relatively simple.
Of course, 'simple' does not mean 'unproblematic'. While standard existing object-oriented application frameworks are sufficient to develop Chiaroscuro, unforeseen difficulties and unanticipated questions are bound to arise. For this reason, two iterations are scheduled for each plugin (see the plan of work, below).
Curatorial
As described above, much of the point of the Chiaroscuro plugins is to structure and enrich metadata records that have received only minimal curation, so extensive curatorial involvement is not envisaged here. That said, it is hoped that British Library curators and librarians will participate in Chiaroscuro iterations as stakeholders.
Legal
All software and collections required are open-source and free for non-commercial re-use.

Plan

June

  • Data acquisition, exploratory analysis, and metadata mapping (XSLT, scripts)
  • Creation of basic SKOS taxonomy for animal application
  • Determination of Solr schema + population of Solr
July

  • First iteration of CV Editor
  • First iteration of Enricher
August

  • First iteration of Viewer
  • Application of Enricher to dataset
  • Second iteration of Enricher
September

  • Second iteration of Viewer
  • Second iteration of CV Editor
October

  • Documentation
  • Import of additional datasets into the animal application (time permitting)
  • Appification of the animal dataset (time permitting)