Active research in biology has produced a large
number of biological publications.
Knowledge discovery in biology has thus become an interesting
task, one that can be accomplished
by recognizing terms in text in order to extract useful
information such as interaction relationships.
We propose the Automatic Biological Term Annotation
(ABTA) system, which uses classification methods to
annotate terms in text. We present a novel method
for expressing lexical features in pattern notation.
Prefix and suffix characters are used instead of
lists of potential terms or external resources. We
demonstrate that part-of-speech tag information is
the most effective attribute. Classification
exemplars are created from text using a word n-gram
model. We show how our system's
performance improves depending on the feature attributes
we define. Biological concept markers are also
assigned to each located term to indicate its meaning.
Our results are comparable to the performance of
other existing systems, while our system retains
simplicity and generalizability.
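
As a rough illustration of the feature ideas summarized above, the following
minimal sketch builds lexical patterns, prefix and suffix characters, and
word n-gram exemplars from a tokenized sentence. It is not the ABTA
implementation: the character-class symbols ("A", "a", "9"), the affix
length of 3, and the function names are all assumptions made for this
example, and part-of-speech tags are left out because they require an
external tagger.

```python
from typing import Dict, List


def lexical_pattern(token: str) -> str:
    """Map each character to a class symbol.

    Assumed notation (not taken from the paper): uppercase -> "A",
    lowercase -> "a", digit -> "9"; any other character is kept as-is.
    """
    out = []
    for ch in token:
        if ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def token_features(token: str, affix_len: int = 3) -> Dict[str, str]:
    """Collect the pattern plus prefix and suffix characters of one token."""
    return {
        "pattern": lexical_pattern(token),
        "prefix": token[:affix_len],
        "suffix": token[-affix_len:],
    }


def ngram_exemplars(tokens: List[str], n: int = 2) -> List[List[Dict[str, str]]]:
    """Slide a window of n words over the tokens; one exemplar per window."""
    return [
        [token_features(t) for t in tokens[i:i + n]]
        for i in range(len(tokens) - n + 1)
    ]


if __name__ == "__main__":
    sentence = ["IL-2", "activates", "NF-kappaB"]
    for exemplar in ngram_exemplars(sentence, n=2):
        print(exemplar)
```

Each exemplar produced this way could be passed to any standard classifier;
which classifier and which label set to use are left open here.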