2013-01-16T04:39:21-08:00
Resource:GREC Corpus
GREC
The annotations within the abstracts of the GREC corpus are the result of work carried out at the National Centre for Text Mining (NaCTeM), School of Computer Science, University of Manchester, UK. The annotations are copyrighted and licensed by NaCTeM under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. For copyright of the abstracts, refer to PubMed.
uncurated
http://www.nactem.ac.uk/GREC/
The GREC corpus is a semantically annotated corpus of 240 MEDLINE abstracts (167 on the subject of E. coli species and 73 on the subject of the Human species). It is intended for training information extraction (IE) systems and/or resources that extract events from biomedical literature.
The corpus has been manually annotated by biologists with events relating to gene regulation. Each event is centered on either a verb (e.g. transcribe) or a nominalized verb (e.g. transcription). Annotation consists of identifying, as exhaustively as possible, the structurally related arguments of the verb or nominalized verb within the same sentence. Each event argument is then assigned the following information:
* A semantic role from a fixed set of 13 roles which are tailored to the biomedical domain.
* A biomedical concept type (where appropriate).
As a simple example, consider the following sentence:
The narL gene product activates the nitrate reductase operon
The sentence contains a single event, centered on the verb activates, with 2 arguments, i.e.:
* The narL gene product
* the nitrate reductase operon
The argument The narL gene product is assigned the semantic role AGENT and the biomedical concept type Protein, whilst the argument the nitrate reductase operon is assigned the semantic role THEME and the biomedical concept type Operon.
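The example event above can be sketched as a small data structure. This is only an illustration of the annotation scheme as described here, not the corpus's own schema: the class names `Argument` and `Event` and the helper `by_role` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Argument:
    """One event argument (hypothetical structure, not the GREC schema)."""
    text: str                      # argument span as it appears in the sentence
    role: str                      # semantic role, e.g. AGENT or THEME
    concept: Optional[str] = None  # concept type, assigned only where appropriate


@dataclass
class Event:
    """An event centered on a verb or nominalized verb."""
    trigger: str
    arguments: List[Argument] = field(default_factory=list)

    def by_role(self, role: str) -> List[Argument]:
        # Return every argument that carries the given semantic role.
        return [a for a in self.arguments if a.role == role]


sentence = "The narL gene product activates the nitrate reductase operon"

event = Event(
    trigger="activates",
    arguments=[
        Argument("The narL gene product", "AGENT", "Protein"),
        Argument("the nitrate reductase operon", "THEME", "Operon"),
    ],
)

agent = event.by_role("AGENT")[0]
```

Keeping `concept` optional mirrors the statement that a concept type is assigned only where appropriate.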
Other types of argument include:
* LOCATION, e.g. In Escherichia Coli, glnAP2 may be activated by NifA
* MANNER, e.g. cpxA gene increases the levels of csgA transcription by dephosphorylation of CpxR
* CONDITION, e.g. Strains carrying a mutation in the crp structural gene fail to repress ODC and ADC activities in response to increased cAMP
The corpus is available for download in 2 formats:
* A standoff format, based on the BioNLP'09 Shared Task format
* An XML format, based on the GENIA event annotation format
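As a rough sketch of what reading a BioNLP'09-style standoff file can look like, the parser below handles tab-separated text-bound lines (`T…`) and event lines (`E…`). The sample lines, the event type label `Activation`, the role spellings `Agent` and `Theme`, and the function name `parse_standoff` are assumptions made for illustration. The actual GREC files may differ in detail, so check them against the corpus documentation before relying on this.

```python
SENTENCE = "The narL gene product activates the nitrate reductase operon"

# Hypothetical standoff lines in the general BioNLP'09 Shared Task style:
#   T<id> TAB <type> <start> <end> TAB <covered text>
#   E<id> TAB <event type>:<trigger id> <role>:<argument id> ...
SAMPLE = [
    "T1\tProtein 0 21\tThe narL gene product",
    "T2\tOperon 32 60\tthe nitrate reductase operon",
    "T3\tActivation 22 31\tactivates",
    "E1\tActivation:T3 Agent:T1 Theme:T2",
]


def parse_standoff(lines):
    """Parse T and E lines into two dictionaries keyed by annotation id."""
    entities, events = {}, {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        ident, body = line.split("\t", 1)
        if ident.startswith("T"):
            # Text-bound annotation: type, character offsets, covered text.
            span, text = body.split("\t", 1)
            etype, start, end = span.split()
            entities[ident] = (etype, int(start), int(end), text)
        elif ident.startswith("E"):
            # Event: the first pair names the trigger, the rest are arguments.
            parts = body.split()
            etype, trigger = parts[0].split(":")
            args = [tuple(p.split(":")) for p in parts[1:]]
            events[ident] = (etype, trigger, args)
    return entities, events


entities, events = parse_standoff(SAMPLE)
```

Because standoff annotations refer to character offsets rather than inline markup, a useful sanity check is that slicing the source text with each pair of offsets reproduces the covered text stored on the line.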
nif-0000-06688
Resource:GREC Corpus
2012-09-10T00:00:00
19852798
Resource
Gene Event Regulation Corpus
Resource
Synonym
JISC
Supporting Agency
SuperCategory
Escherichia coli
Human
Species
Resource:FORCE11
Resource:MEDLINE
Beyond the pdf
RelatedTo
PublicationLink
PMID
ModifiedDate
Label
Computational Linguistics
Gene
Semantic search
Biomedical concept type
Semantic role
Information extraction system
Text mining
Information extraction
Annotation
Keywords
Resource:National Centre for Text Mining
Is part of
Id
Training set
Data or information resource
Has role
File:GREC Corpus.PNG
ExampleImage
Definition
DefiningCitation
CurationStatus
Availability
Abbrev