The Standardization Survival Kit: presentation

To support the digital evolution of Social Sciences and Humanities research, it is necessary to stabilize knowledge of standards and good research practices. The goal of the Standardization Survival Kit (SSK), developed within the PARTHENOS project, is to accompany researchers along this route by giving them access to standards and best practices in a meaningful way, through research scenarios. A research scenario is a (digital) workflow practiced by researchers that can be applied repeatedly to a task in order to gain material or insights relevant to a research question. These scenarios are at the core of the SSK, as they embed resources with contextual information and relevant examples on standardized processes and methods in a research context. The SSK is an open tool where users are able to publish new scenarios or adapt existing ones. These scenarios can be seen as a living memory of what should be the best research practices in a given community, made accessible and reusable for other researchers.

Why standards after all?

The main issue in defining a policy about standards is to understand what they actually are. In the context of research, standards usually take the form of documents describing practices, protocols, artefact characteristics or data formats that two parties working in the same field of activity can use as a reference in order to produce comparable (or interoperable) results. Using standards will also foster innovative, cross-disciplinary research paths, and eventually contribute to bridging the gap between the different cultures that are represented in the wide landscape of the Arts, Humanities and Cultural Heritage studies.

Standards are usually published by standardization organisations (such as ISO, W3C or the TEI Consortium) which ensure that the following three requirements for standards are actually fulfilled:

  • Expression of a consensus: the standard should reflect the expertise of a wide (possibly international) group of experts in the field
  • Publication: the standard should be accessible to anyone who wants to know its content
  • Maintenance: the standard is updated, replaced or deprecated depending on the evolution of the corresponding technical field.

Standards are not regulations. There is no obligation to follow them for research except when one actually wants to produce results that can be compared with those of a wider community. This is why a standardization policy for an infrastructure in the Arts and Humanities should include recommendations as to what attitude the scholarly communities could or should adopt with regard to specific standards.

The characteristics of standards outlined above put a strong emphasis on the role of communities of practice and the corresponding bodies that represent them. Ideally, a good standard reflects the work of the relevant community and is maintained by the appropriate body. This is exactly the case of the Text Encoding Initiative with respect to text representation standards and, to a lesser extent, of EAD (Encoded Archival Description), whose maintenance is taken up by the Library of Congress with the support of the Society of American Archivists.

Because there is no obligation to use a given standard, it is essential to provide potential users with:

  1. awareness about the appropriate standards and the interest to adopt them,
  2. the cognitive tools to help them identify the optimal use of standards through selection and possibly customization of a reference portfolio.

The experience gained within the various communities and infrastructures represented in PARTHENOS that have needed to adopt existing standards is that there is always an initial phase during which scholars should be made aware of some core standards that are systematically related to the definition of interoperable digital objects. This is basically what has led us to identify the notion of a Standardization Survival Kit (SSK). The table below, for instance, lists a first group of standards without which it is more or less impossible to deal with digital content properly.

Standard                          Scope
ISO 639 series                    Codes for the representation of languages and language families
ISO 15924                         Codes for the representation of scripts
ISO 3166                          Codes for the representation of country names
IETF BCP 47                       Standard for encoding linguistic content, combining ISO 639, ISO 15924 and ISO 3166
ISO 10646                         Universal encoding of characters (Unicode)
ISO 8601                          Representation of dates and times
Extensible Markup Language (XML)  Provides the basic technical concept related to XML documents
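
To make the role of these baseline standards more concrete, here is a minimal Python sketch, using only the standard library, of how several of them surface in everyday data handling. The helper name language_tag is hypothetical, the sample values are purely illustrative, and the function only joins and normalizes subtags: it does not check them against the official ISO or IANA registries.

```python
from datetime import date, datetime, timezone


def language_tag(language, script=None, region=None):
    """Join language, script and region subtags into a BCP 47-style tag.

    Assumption: the caller passes valid ISO 639, ISO 15924 and ISO 3166
    codes; nothing is validated against the registries here.
    """
    parts = [language.lower()]          # ISO 639 language code, lower case
    if script:
        parts.append(script.title())    # ISO 15924 script code, title case
    if region:
        parts.append(region.upper())    # ISO 3166 country code, upper case
    return "-".join(parts)


# BCP 47: one tag combining language, script and country codes.
print(language_tag("sr", "Latn", "RS"))   # sr-Latn-RS
print(language_tag("fr", region="ca"))    # fr-CA

# ISO 8601: unambiguous representation of dates and times.
print(date(2018, 4, 30).isoformat())      # 2018-04-30
print(datetime(2018, 4, 30, 9, 15, tzinfo=timezone.utc).isoformat())

# ISO 10646 / Unicode: every character has a code point,
# which an encoding such as UTF-8 turns into bytes.
e_acute = "\u00e9"
print(f"U+{ord(e_acute):04X}", e_acute.encode("utf-8"))
```

Recording language, date and character information in these forms from the start is what lets two projects exchange or compare their data without first agreeing on ad hoc conventions.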

The concept of SSK goes far beyond these baseline examples and aims at covering reference digital scenarios in the Arts and Humanities: a role of the SSK is to help communities participate in standardization activities where they exist, or at least document and spread the best practices as de facto standardized guidelines. Such a strategy will also contribute to the actual stabilization of existing conceptual and technical knowledge within ongoing projects, and provide a channel for the wider dissemination of the corresponding results.

The SSK: a toolkit for Humanities scholars

The work carried out by the SSK covers four types of activities related to the deployment and use of standards in the Humanities and Cultural Heritage fields:

  • Documenting existing standards to provide a reference for scholars who want to find out more about their role and content. This relates to the specific provision of bibliographic sources, available documentation, specific targeted introductions, as well as providing prototypical examples which can serve as models for similar work, possibly made available through focused Virtual Research Environments within the PARTHENOS infrastructure;
  • Supporting the actual adoption of standards by identifying how they relate to research scenarios and gathering the essential materials for controlling their deployment (e.g. schemas);
  • Communicating with research communities so that they are made aware of the need to apply standards in their digital scholarly practices and are informed of the essential standards for their own fields;
  • Training for researchers, by giving them access to complete frameworks so that they may acquire knowledge and know-how on standardized methodologies.

In order to apply these four principles, the SSK focuses on giving researchers access to standards in a meaningful way. That is why it is built around research scenarios.

These scenarios are the core of the SSK because they aim at providing contextual information and relevant examples on how standards can be applied in a given research project. They potentially cover all the domains of the Humanities, from Literature to Heritage science, including History, Social sciences, Linguistics, etc.

They are created and added to the SSK by domain experts on the basis of real-life, researcher-oriented use cases, and are divided into steps, each involving specific tasks.

These scenarios can be seen as a living memory of what should be the best research practices in a given community, made accessible and reusable for other researchers wishing to carry out a similar project but unfamiliar with the recommended tools, formats, methods to use, etc. For that reason, the SSK can be considered as a complete framework showing concrete use of standards, rather than simply a catalogue of resources.

Design principles & features of the SSK

From the very start of the project, the aim has been to build an easy-to-use online and collaborative platform with a user-friendly design. The idea of having general, end-to-end scenarios to help researchers carry out their project by following best practices and clear methods in their area of expertise is the most important design principle for the SSK.

The second principle is to add context: a “story-telling” approach to the use of standards in the Humanities and Social Sciences. The goal is to avoid providing yet another catalogue of resources, and to offer instead contextual, activity-based information, so that researchers who are unfamiliar with standards can see, by following a scenario, how they are used and what workflow they help achieve.

With these principles in mind, the SSK should enable the user to perform two main actions:

  1. Consult and follow the guidelines expressed in the scenarios you are interested in for your project. Finding the most relevant ones should be easy since the navigation relies on strong taxonomies covering the different aspects of research: the disciplines, the type of techniques and objects involved, the concrete activities carried out, the standards needed.
  2. Propose new scenarios of your own by following a predefined model, with the possibility of both adding new content (steps as well as resources) and reusing existing content (to avoid duplication if a general step is already available in another scenario).

The first feature is fully operational. It was tested for the first time in April 2018, and iterations with test users have helped improve the readability and attractiveness of the information, in particular for exploring and searching scenarios.

The work on the second feature, allowing the user to contribute, is still ongoing. It is already possible to create research scenarios directly in the SSK's underlying data model, the Text Encoding Initiative, or TEI (see the dedicated section for more information). However, as we are aware that this solution requires some technical knowledge, a user-friendly interface is currently under development and should be released during the first quarter of 2019.

Anyone who has registered and agreed to the clearly stated “Terms of use” will be able to contribute. This option was chosen because of the difficulty of setting up an editorial board in charge of reviewing every submitted scenario. Quality control of the contributions should instead come from the very strict model that has to be followed in the scenario creation process. In addition, by contributing to the SSK, users agree to be visible and citable as authors and to be responsible for the work that they decide to share with others.

Why would you, as a researcher, want to contribute to the SSK? There are three main reasons:

  • to make your research project align with the best practices in your community
  • to get peer review and visibility
  • to share a project in another form than the usual blog / article (a new way to disseminate your work).

SSK components: Scenarios < Steps < Resources

The SSK is a web platform built on three main layers nested within each other in a specific order: research scenarios, steps and resources.

Each scenario within the SSK works like a high-level research guide for scholars. It is made up of successive steps or tasks, and can be followed as a complete process to solve a given problem with the most standardized means. For each step, the appropriate resources to perform the given task are proposed, divided into two categories: the “general resources” that include the primary documentation and tools; and the “project-specific resources” that point to concrete use cases in which a similar task was accomplished. The material contained in these sections is of various kinds:

  • The most important is the state-of-the-art bibliography, which includes all the documentation needed to carry out a given task. The bibliographical references are up-to-date and gathered within a Zotero library, which was specially created for this project. This choice was made to ease the resource selection process and to allow for a collaborative watch and curation of relevant information. When the resource is available online, a direct link is provided; otherwise, the user is given all the necessary metadata.
  • The SSK also offers the possibility to point to more technical resources, such as stylesheets, code samples, software or services.
  • Training materials such as tutorials.
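
The nesting of scenarios, steps and resources described in this section can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the SSK itself stores scenarios in TEI, and the class names, field names and sample titles below are hypothetical and not taken from the SSK data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    title: str
    kind: str                  # e.g. "bibliography", "code sample", "tutorial"
    category: str              # "general" or "project-specific"
    url: Optional[str] = None  # None when only metadata is available


@dataclass
class Step:
    title: str
    resources: List[Resource] = field(default_factory=list)

    def resources_in(self, category: str) -> List[Resource]:
        """Return the resources of this step that belong to one category."""
        return [r for r in self.resources if r.category == category]


@dataclass
class Scenario:
    title: str
    steps: List[Step] = field(default_factory=list)


# Hypothetical content, for illustration only.
encode = Step(
    title="Encode the transcription",
    resources=[
        Resource("Encoding guidelines", "bibliography", "general"),
        Resource("Sample stylesheet", "code sample", "project-specific"),
    ],
)
edition = Scenario("Create a digital edition", steps=[encode])
corpus = Scenario("Build a text corpus", steps=[encode])  # same step reused

for scenario in (edition, corpus):
    for number, step in enumerate(scenario.steps, start=1):
        general = [r.title for r in step.resources_in("general")]
        print(f"{scenario.title} / step {number}: {step.title} {general}")
```

Because both scenarios reference the same Step object, a general step is described once and reused, which mirrors the stated goal of avoiding duplication when a step already exists in another scenario.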