dc.contributor.author: Pérez Pérez, Martín
dc.contributor.author: Pérez Rodríguez, Gael
dc.contributor.author: Blanco Míguez, Aitor
dc.contributor.author: Fernández Riverola, Florentino
dc.contributor.author: Valencia, Alfonso
dc.contributor.author: Krallinger, Martin
dc.contributor.author: GARCIA LOURENÇO, Analia Maria
dc.date.accessioned: 2022-11-17T09:40:44Z
dc.date.available: 2022-11-17T09:40:44Z
dc.date.issued: 2019-06-24
dc.identifier.citation: Journal of Cheminformatics, 11(1): 42 (2019) [spa]
dc.identifier.issn: 17582946
dc.identifier.uri: http://hdl.handle.net/11093/4072
dc.description.abstract: Background: Shared tasks and community challenges represent key instruments to promote research, collaboration and determine the state of the art of biomedical and chemical text mining technologies. Traditionally, such tasks relied on the comparison of automatically generated results against a so-called Gold Standard dataset of manually labelled textual data, regardless of efficiency and robustness of the underlying implementations. Due to the rapid growth of unstructured data collections, including patent databases and particularly the scientific literature, there is a pressing need to generate, assess and expose robust big data text mining solutions to semantically enrich documents in real time. To address this pressing need, a novel track called “Technical interoperability and performance of annotation servers” was launched under the umbrella of the BioCreative text mining evaluation effort. The aim of this track was to enable the continuous assessment of technical aspects of text annotation web servers, specifically of online biomedical named entity recognition systems of interest for medicinal chemistry applications. Results: A total of 15 out of 26 registered teams successfully implemented online annotation servers. They returned predictions during a two-month period in predefined formats and were evaluated through the BeCalm evaluation platform, specifically developed for this track. The track encompassed three levels of evaluation, i.e. data format considerations, technical metrics and functional specifications. Participating annotation servers were implemented in seven different programming languages and covered 12 general entity types. The continuous evaluation of server responses accounted for testing periods of low activity and moderate to high activity, encompassing overall 4,092,502 requests from three different document provider settings. The median response time was below 3.74 s, with a median of 10 annotations/document. Most of the servers showed great reliability and stability, being able to process over 100,000 requests in a 5-day period. Conclusions: The presented track was a novel experimental task that systematically evaluated the technical performance aspects of online entity recognition systems. It raised the interest of a significant number of participants. Future editions of the competition will address the ability to process documents in bulk as well as to annotate full-text documents. [en]
dc.description.sponsorship: Portuguese Foundation for Science and Technology | Ref. UID/BIO/04469/2013 [spa]
dc.description.sponsorship: Portuguese Foundation for Science and Technology | Ref. COMPETE 2020 (POCI-01-0145-FEDER-006684) [spa]
dc.description.sponsorship: Xunta de Galicia | Ref. ED431C2018/55-GRC [spa]
dc.description.sponsorship: European Commission | Ref. H2020, n. 654021 [spa]
dc.language.iso: eng [spa]
dc.publisher: Journal of Cheminformatics [spa]
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Next generation community assessment of biomedical entity recognition web servers: metrics, performance, interoperability aspects of BeCalm [en]
dc.type: article [spa]
dc.rights.accessRights: openAccess [spa]
dc.relation.projectID: info:eu-repo/grantAgreement/EU/H2020/654021 [spa]
dc.identifier.doi: 10.1186/s13321-019-0363-6
dc.identifier.editor: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-019-0363-6 [spa]
dc.publisher.departamento: Informática [spa]
dc.publisher.grupoinvestigacion: Sistemas Informáticos de Nova Xeración [spa]
dc.subject.unesco: 1203.12 Bancos de Datos [spa]
dc.subject.unesco: 1203.17 Informática [spa]
dc.subject.unesco: 2499 Otras Especialidades Biológicas [spa]
dc.date.updated: 2022-11-17T09:36:52Z
dc.computerCitation: pub_title=Journal of Cheminformatics|volume=11|journal_number=1|start_pag=42|end_pag= [spa]


    Except where otherwise noted, this item's license is described as Attribution 4.0 International