Publications presenting results of the optical character recognition (OCR) processing carried out in the Herbadrop project on more than 5 million herbarium images have been accepted at two major events:
- At the Digital Infrastructure for Research Summit, sponsored by EOSC (Brussels, 30 Nov. to 1 Dec. 2017), and
- At the IEEE International Conference on Big Data (Boston, 11-14 Dec. 2017), Workshop on Computational Archival Science (CAS)
At the Digital Infrastructure for Research Summit, the Herbadrop project will be one of three use cases, alongside ENES for climatology and the EOSC pilot DPHEP for high energy physics, that contribute innovative solutions for building scientific knowledge through large research infrastructures (EUDAT, EGI, GEANT, PRACE or OpenAire).
A session entitled “New perspectives for knowledge-building through massively distributed scientific corpora” is dedicated to presenting concrete use cases and should help stimulate debate on possible directions for improving services on future research infrastructures in Europe.
The article accepted at the IEEE International Conference on Big Data presents a statistical exploration of various contents obtained from the OCR analyses, in particular the distribution of the herbarium images by date of collection and a qualitative analysis of specific terms. The purpose of this publication is to illustrate the kinds of data mining that the botanical community will be able to perform afterwards.