NEWSLETTER

The FP7 CLIMSAVE project ("Climate Change Integrated Assessment Methodology for Cross-Sectoral Adaptation and Vulnerability in Europe") finished at the end of 2013. The project developed the CLIMSAVE Integrated Assessment Platform, a unique, user-friendly, web-based tool that enables stakeholders to interactively explore the complex multi-sectoral issues surrounding impacts, adaptation and vulnerability to climate and socio-economic change within the agriculture, forest, biodiversity, coast, water and urban sectors. Two versions of the tool have been developed: one for Europe and one for Scotland.

Two summary reports have been produced highlighting the policy-relevant final results of the project for the European and Scottish case studies. The summary reports can be accessed from:
 

Today, 16 December 2013, the European Commission announced the launch of a new Pilot on Open Research Data in Horizon 2020, to ensure that valuable information produced by researchers in many EU-funded projects will be shared freely. Researchers in projects participating in the pilot are asked to make the underlying data needed to validate the results presented in scientific publications, and other scientific information, available for use by other researchers, innovative industries and citizens. This will lead to better and more efficient science and improved transparency for citizens and society. It will also contribute to economic growth through open innovation. For 2014-2015, topic areas participating in the Open Research Data Pilot will receive funding of around €3 billion.

The Commission recognises that research data is as important as publications. It therefore announced in 2012 that it would experiment with open access to research data (see IP/12/790). The Pilot on Open Research Data in Horizon 2020 does for scientific information what the Open Data Strategy does for public sector information: it aims to improve and maximise access to and re-use of research data generated by projects for the benefit of society and the economy.

The Pilot involves key areas of Horizon 2020:

  • Future and Emerging Technologies

  • Research infrastructures – part e-Infrastructures

  • Leadership in enabling and industrial technologies – Information and Communication Technologies

  • Societal Challenge: Secure, Clean and Efficient Energy – part Smart cities and communities

  • Societal Challenge: Climate Action, Environment, Resource Efficiency and Raw materials – with the exception of topics in the area of raw materials

  • Societal Challenge: Europe in a changing world – inclusive, innovative and reflective Societies

  • Science with and for Society

Neelie Kroes, Vice-President of the European Commission for the Digital Agenda, said: "We know that sharing and re-using research data holds huge potential for science, society and the economy. This Pilot is an opportunity to see how different disciplines share data in practice and to understand remaining obstacles."

Commissioner Máire Geoghegan-Quinn said: "This pilot is part of our commitment to openness in Horizon 2020. I look forward to seeing the first results, which will be used to help set the course for the future."

Projects may opt out of the pilot to allow for the protection of intellectual property or personal data; in view of security concerns; or should the main objective of their research be compromised by making data openly accessible.

The Pilot will give the Commission a better understanding of what supporting infrastructure is needed and of the impact of limiting factors such as security, privacy and data protection, as well as other reasons for projects opting out of sharing. It will also contribute insights into how best to create incentives for researchers to manage and share their research data.

The Pilot will be monitored throughout Horizon 2020 with a view to developing future Commission policy and EU research funding programmes.


Ecological modellers require reliable sources of data for their analyses. Often, these sources are databases, checklists and specimen labels. Yet another rich source is the corpus of biological literature. It is estimated that there are well over 100 million pages of scientific publications, and the volume grows every year. Publishing in advanced XML-based journals, such as ZooKeys, PhytoKeys or the Biodiversity Data Journal, is recommended for new data, but what is the solution for legacy texts?

The EU FP7 Coordination and Support Action pro-iBiosphere has been piloting the mark-up and extraction of biological information from literature, an approach pioneered by Plazi (Agosti & Egloff, 2009). The project was launched to investigate ways to increase the accessibility of biodiversity data, improve the efficiency of its curation and increase the user base of biodiversity data consumers and applications. It addresses the technical and semantic interoperability between the different forms in which data are published, and analyses the sustainability issues related to the maintenance and curation of biodiversity data and derived information and knowledge. It also encourages the biodiversity community to publish biodiversity data in a way that satisfies the technical requirements for an envisioned Open Biodiversity Knowledge Management System.

To reach these objectives, three pilots for data mark-up and one for interoperability are being conducted (for detailed information on the pilots please see here). The mark-up pilots are evaluating the accessibility of data within literature for a wide range of organisms and data types, and ways to facilitate the extraction of biological information from literature, including observations, traits, nomenclature, habitat information and interactions between organisms. For example, one pilot is looking at biogeographic data using the species Chenopodium vulvaria as a subject. In another, trait data are being extracted from literature on tropical mistletoes, while yet others are extracting data from papers on spiders, ants, centipedes, mosses and fungi.

To extract these data one can use either born-digital texts or scanned texts converted through text capture. These texts are then progressively marked up into XML documents, with tags defining the meaning of the text they enclose. The degree of mark-up granularity and the choice of textual elements to be marked up depend on the type of data to be extracted and its granularity in the text. In taxonomically based literature, text is usually divided into individual "treatments", one for each species. Fortunately, most paragraph elements of these texts follow standard formats: for example, separate blocks of text contain the physical description of the organism, details of the distribution and habitat information, often separated by sub-headings.
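As a rough illustration of what a marked-up treatment makes possible, the sketch below parses a small XML fragment and pulls out its labelled blocks. The tag and attribute names (`treatment`, `taxonName`, `section`, `type`) and the placeholder texts are invented for this example; they are not the schema used by Plazi or the pilots.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up of a single treatment. Tag names and the placeholder
# texts are assumptions for illustration, not a real schema or real data.
TREATMENT_XML = """
<treatment>
  <taxonName>Chenopodium vulvaria</taxonName>
  <section type="description">Placeholder description text.</section>
  <section type="distribution">Placeholder distribution text.</section>
  <section type="habitat">Placeholder habitat text.</section>
</treatment>
"""

def extract_sections(xml_text):
    """Return (taxon name, {section type: section text}) for one treatment."""
    root = ET.fromstring(xml_text.strip())
    name = root.findtext("taxonName")
    sections = {
        section.get("type"): (section.text or "").strip()
        for section in root.iter("section")
    }
    return name, sections

name, sections = extract_sections(TREATMENT_XML)
print(name)
print(sorted(sections))
```

Because each block carries a tag that states what it is, a consumer can ask for only the habitat or only the distribution text without parsing free prose.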

The pro-iBiosphere pilots have used several methods for mark-up, but the main tool has been the GoldenGATE Editor, which combines manual and automated methods to identify key text elements. For example, an algorithm identifies Latin names and then an interface guides the user through the verification of the algorithm’s results. Once marked up, the XML document can be uploaded to the Plazi document repository. Plazi is a not-for-profit organization devoted to promoting open access to taxonomic literature. You are free to use the data contained in Plazi’s repository and, if you want, you can refine the mark-up for your own purposes.
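The combination of automated detection and manual verification can be sketched in a few lines. This is not the editor's actual algorithm: it is a deliberately naive regular expression for candidate Latin binomials, followed by a filter that stands in for the human reviewer. The function names, the sample sentence and the `confirmed` set are all assumptions made for the example.

```python
import re

# Naive pattern: a capitalised word followed by a lower-case word of three or
# more letters. It also matches ordinary sentence openings, which is exactly
# why a verification step is needed afterwards.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")

def candidate_names(text):
    """Return candidate binomials in order of first appearance, without duplicates."""
    seen = []
    for match in BINOMIAL.finditer(text):
        name = match.group(0)
        if name not in seen:
            seen.append(name)
    return seen

def verify(candidates, accept):
    """Keep only the candidates for which the reviewer callback returns True."""
    return [name for name in candidates if accept(name)]

# Invented sample sentence; the 'confirmed' set stands in for a human reviewer.
text = "Specimens of Chenopodium vulvaria were examined. The plants were small."
confirmed = {"Chenopodium vulvaria"}

candidates = candidate_names(text)
verified = verify(candidates, lambda name: name in confirmed)
print(candidates)  # ['Chenopodium vulvaria', 'The plants']
print(verified)    # ['Chenopodium vulvaria']
```

The false positive "The plants" shows why the automated pass only proposes names and a person makes the final call.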

Extracting data from the legacy literature can be expensive. Modern XML-based publications have the additional advantages of linkage via DOIs and immediate dissemination to harvesters such as EOL or GBIF. Yet digitisation and mark-up have the potential to reanimate the data in our publications, making them almost as useful as modern linked publications.

Task 3.4 of EU-BON, led by Plazi (Donat Agosti), is to develop tools to prepare, extract and mine published biodiversity literature. For this task Plazi is looking for rich sources of data from the biodiversity literature, particularly where those data can be applied within other EU-BON tasks. For further information please contact Plazi.

Agosti, D., & Egloff, W. (2009). Taxonomic information exchange and copyright: the Plazi approach. BMC Research Notes, 2(1), 53. doi:10.1186/1756-0500-2-53

Quentin Groom (National Botanic Garden, Belgium) & Donat Agosti (Plazi)



This project has received funding from the European Union’s Seventh Programme for research, technological development and demonstration under grant agreement No 308454.