
OPEN – Contribution to the AGATE Workshop January 16th, 2017


This is the text of my contribution to the Workshop “Open – Connective – Sustainable”

Ladies and Gentlemen, Dear President Hatt, Dear Academy Representatives & Guests,

I am honored and happy to have this opportunity to respond to Ulrike Wuttke and, by doing so, to continue the opening session of this workshop. There is nothing more fitting to open such a day than to start thinking about the very concept of openness. In my response, I will first take a brief look back at the history of the institution hosting today’s event. Then I will take a more systematic look at Ulrike Wuttke’s presentation.

Let us turn back to the year 1813. At that point, a young professor of philology initiated what would become the first long-term project in the social sciences and humanities (SSH) ever undertaken at a German Academy. How did this young professor justify his iconoclastic idea? He wrote the following in his application: “Der Hauptzweck einer Akademie der Wissenschaften muss dieser sein, Unternehmungen zu machen und Arbeiten zu liefern, welche kein Einzelner leisten kann” (Engl.: “The main goal of an Academy of Sciences must be to initiate endeavors and deliver works which a single person alone cannot possibly achieve.”) What does this mean? That an Academy of Sciences, in this view, is the best place to conduct large research projects because many scholars are able to contribute, because an Academy can actually host the research results and, implicitly, because communication between those involved is ensured by the institution’s structure. Large research projects in the Humanities at Academies like this very first German one were established in the spirit of openness, with access, cooperation and long-term archiving as their core values.

An interesting issue raised by this 1813 application, and one that concerns large research projects and their scope, is the question of their endpoint: the project applied for in 1813 is, in fact, not finished to this day. August Boeckh was starting back then what would become the Corpus Inscriptionum Graecarum. His idea was to gather a corpus of all Greek inscriptions – something Boeckh imagined would take a few years, by sending a couple of colleagues out to carry out field studies and by contacting other Academies that were likely either to have material they could send in copy or to give information as to where to look for additional material. Even if the general idea was valid and pretty much the epitome of an Academy project, the concrete implementation turned out to be more complex than it seemed at first sight. What does this tell us? That building a large corpus of data relevant to research requires solid workflows and data management plans to be fruitful.

It is interesting to see from this example that the roots of SSH projects at German Academies point to questions that are still essential for us today in the context of AGATE. How can a project relate to other scholarly endeavors? How can access be given to a corpus before it is completely “finished”? How can as many scholars as possible be involved in order to gain manpower, without losing backbone and momentum? In a word: how can openness be made a virtue rather than an empty shell?

This is the question I would now like to address by focusing on the Open Science dimension of the AGATE project presented by Ulrike Wuttke.

Open Science is generally considered to have several components: Open Educational Resources, Open Access, Open Peer Review, Open Methodology, Open Source, Open Data. Ulrike Wuttke concentrated in her presentation mainly on Open Access. While Open Educational Resources do not belong to the core missions of Academies, I would like to draw a line back to the other dimensions, which Ulrike mentioned only briefly and which are in my eyes as important as being guided by the ideal of Open Access.

Academy projects in SSH tend to be black boxes. The only output made accessible is the final, published version of the research data, that is, the publications, the secondary data. It is equally important to make accessible the primary data (raw material) as well as documentation on the workflow and on the methodology. This is covered by three of the categories mentioned before in the definition: Open Source (for software and other digital tools), Open Data (for raw data) and Open Methodology (making one’s methodology transparent and hence the results verifiable and reproducible).

Now, truth be told, these are domains in which SSH scholars are inexperienced. They are clueless about how to conceive access to information that until now had always been confined to hidden backyards and only needed to be understandable to a restricted group of scholars, namely those who carried out the project. We definitely need to put some specific effort into this. The real question here is not only to find an adequate format to make such data, software or methods accessible per se, but also to make them accessible in such a way that they are useful and reusable. For research data, this means that openness must be accompanied by standards, by publication and replication methods, by pedagogy on these topics, and by evaluation criteria that allow the data and methods to be integrated not only into the research data ecosystem, but also into the academic reputation lifecycle. This is where we need the last pillar of the definition of Open Science I mentioned before, namely Open Peer Review. I am not completely sure that Open Peer Review is the magical answer here. But we certainly need a form of leverage that allows academic recognition of all the information circulating in the context of Open Science, which, as I just explained, goes way beyond scientific publications, which are secondary data. Some of this leverage can be gained from binding policies like Horizon 2020 (H2020), which requires EU-funded research data to be made open. But this top-down approach does not necessarily convince the scholars implementing such policies that openness makes sense for their research.

Ulrike Wuttke mentioned the FAIR principles, and I would like to go one step further from there in terms of how to implement these principles, by pleading for the systematic use of established standards and for openness to new publication formats. This is not something that has to be carried out by Academies alone, but a direction in which a great part of the scholarly community is going and to which the Academies, headed by umbrella organisations such as the Akademienunion and ALLEA, could make a decisive contribution. I speak here in my function as a Managing Editor of the Journal of the Text Encoding Initiative. In our upcoming call for papers, we have introduced data papers. These will be peer-reviewed like other papers, and with the data attached to them made available in a stable environment, we hope to see a significant rise in interoperable, reusable TEI data archived for the long term. Working out peer-review criteria is confronting us with the exact same questions that are relevant here. How can we encourage openness? How can we measure interoperability? How can we take workflows into consideration? This discussion has to be conducted openly by the SSH community, and it has to make it possible to improve the forms of academic evaluation and recognition that scholars can expect. The Academies can provide valuable leverage in this evolution by adhering to all the dimensions of Open Science that I have mentioned, especially in the context of AGATE, by encouraging their scholars to get involved in a renewal of the evaluation process emerging from these new formats, and by contributing to these formats. I can say from experience that this is particularly true for the TEI community, to which scholars from the house that is welcoming us today are making a valuable – and highly valued – contribution.

If the European Academies were to realize the ecosystem I have sketched so far, they would have reached the epitome of Open Science, defined as “the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods” (definition by Nancy Pontika). And they would have contributed to integrating it into the reputation system inherent to the practice of science, which we know is a world with economic values of its own.

But what does “reuse, redistribution and reproduction” mean exactly in this definition? It would be too narrow to say that favoring CC licenses is the beginning and the end of a data reuse policy. Openness does not amount to CC licenses. In order to foster openness, as this definition suggests, reuse conditions must allow all forms of scholarly reuse by all interested scholars at large – this includes text and data mining, data extraction for Wikipedia and other uses that do not belong today to the key criteria for impact, but might well do so in a few years. Open Science also means opening up to forms of Citizen Science we might not yet be aware of. Reuse, in that sense, is closely connected to redistribution and reproduction as two core challenges for AGATE. Beyond abiding by principles, formats and licenses, AGATE will have to define, in cooperation with its European Academy partners, a clear policy on distribution, citation, versioning and long-term availability as essential conditions of openness for all the types of data and methods mentioned before. All of these elements are vital to the realization of openness as it is intended in AGATE. In the context of the Cultural Heritage Data Reuse Charter that DARIAH is currently working on, different models for such dissemination settings are being brought together in order to make a new space available that allows reuse requirements to be made transparent, scholarly communication to be made fruitful and, in the end, hopefully, research on Cultural Heritage Data to be better in quality and quantity. Including metadata from AGATE in the Charter environment would be a major asset. I hope to be able to discuss this with you at more length during the infrastructure Café this afternoon.

But for now, I would like to leave you with the Rule Number One of Open Data: “Love Your Data, and Help Others Love It, Too.” With that in mind, nothing can go wrong.

