Computers in Biology and Medicine 37 (2007) 1511 – 1521
A knowledge based method for the medical question answering problem
Rafael M. Terol∗, Patricio Martínez-Barco, Manuel Palomar
Department of Software and Computing Systems, The University of Alicante, San Vicente del Raspeig Road, Alicante, Spain
Received 16 March 2006; received in revised form 22 January 2007; accepted 24 January 2007
Abstract
In this paper, a restricted domain question answering (QA) system is described. The design architecture of this QA system and the features that allow the adaptation of the QA system to the medical domain are also presented. The advantages of this QA system include the simple process of defining the question taxonomy answered by the system as well as the possibility of locally or remotely managed document collections. The main computing methods of the QA system are based on the application of natural language processing (NLP) techniques to infer the logic forms and on the treatment of the logic forms. The knowledge of the system is acquired through the use of two different resources: Unified Medical Language System (UMLS) to handle the medical terminology and WordNet to manage the open-domain terminology.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Bioinformatics; Biomedical; Text mining; Medicine; Medinformatics; Question answering framework; Medical question taxonomies
1. Introduction
Open-domain textual question answering (QA), as defined by the TREC competitions,1 is the task of extracting the right answer from text snippets identified in large collections of documents where the answer to a natural language question lies.
Open-domain textual QA systems are defined as tools capable of extracting concrete answers to very precise information needs from document collections. For instance, in open domains, a system can respond to society questions such as where was Marilyn Monroe born? or what is the name of Elizabeth Taylor’s fourth husband?; geography questions such as where is Halifax located?; and so on. Examples of these kinds of QA systems in open domains can be found in authors such as Moldovan, Sasaki, Vicedo, Zukerman and so on. These types of QA systems process document collections locally, discarding access to internet information sources.
In restricted domains, frequently asked question (FAQ) systems are often used to obtain common answers to a restricted set of questions that users need. These FAQ systems handle a database where the list of questions and their related answers are stored. Thus, the FAQ system allows users to choose one of the possible questions that the system is able to answer by searching the database for the answers related to that question. Natural language questions are not considered by these FAQ systems and, as the number of questions treated by the system grows, the user has to compare his or her question against a large number of answered questions. For these reasons, these FAQ systems are being replaced by QA systems over restricted domains. Nowadays, textual QA is also applied in restricted domains such as clinical, tourism, medical and so on. These systems are described in the next background section.
According to official results of the QA track at the last TREC conference, QA systems in open domains reach between 30% and 40% precision. In a restricted domain such as the medical domain, it is necessary to improve this score considerably because of the critical information handled in these medical areas, where erroneous information can pose serious risks to people’s health (no answer is better than an incorrect answer).
This is the reason why our research effort is directed towards the textual QA on medical domain retrieving the information
∗ Corresponding author. Tel.: +34 965903772; fax: +34 965909326.
E-mail address: (R.M. Terol).
1 The Text REtrieval Conference (TREC) is a series of workshops organized by the National Institute of Standards and Technology (NIST), designed to advance the background in information retrieval (IR) and QA.
2 This evaluation measure gives the accuracy of the QA system.
0010-4825/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.compbiomed.2007.01.013
R.M. Terol et al. / Computers in Biology and Medicine 37 (2007) 1511 – 1521
from internet websites. There is a lot of medical information available on the internet, the largest network in the world. This fact increases the importance of evaluating the quality of information on medical websites because anyone can create a website and can put any medical information on this website. This medical information would not be accurate [...]
In this paper, a QA system is presented. This QA system is capable of working over any restricted domain. The adaptation of the system to the medical domain (medical QA system) is also exhibited. The medical QA system is able to answer medical questions according to a generic question taxonomy. In the following sections, the main features of the QA system are described, focusing in detail on the question analysis performance. Section 2 introduces the state of the art of QA systems. In Section 3, we show the motivation for working in QA over the medical domain. Section 4 details the modular architecture of the restricted domain QA system and its adaptation to the medical domain. In Section 5, we describe the evaluation task and show the results obtained by our medical QA system. Section 6 discusses the contribution of our research work. The last section [...]
2. Background
QA performance requires complex natural language processing (NLP) techniques. The core of our QA system is the text processing by way of logic forms. In the following sections, this complex NLP technique is defined. A logic form is a way of representing natural language sentences. Other authors employ logic forms in their QA systems. Concretely, Moldovan developed an open domain QA system, and Mollá designed an open domain QA system capable of answering natural language questions in the frame of the commands of the UNIX operating system. In Moldovan’s QA system, the identification of the predicates is based on the format of Logic Form Transformation of eXtended WordNet while Mollá identifies the predicates using a more complex terminology based on logic treatment. In order to focus their QA systems on open domains, Moldovan and Mollá employ complex inference rules in the logic forms treatment performance.
Moreover, the use of these open-domain textual QA systems in restricted domains such as the medical domain does not produce good results because these systems use generic NLP resources such as WordNet, which is not specialized in medical terminology. When QA systems are directed to restricted domains, it is necessary to acquire rich knowledge resources of the domain that allow the system to understand the meaning of the treated information in the user’s question and documents.
Chung et al. presented a practical QA system in the meteorology domain that extracts information about the weather every hour from the website of the Korea Meteorological Administration. This information is structured and locally stored in a database management system (DBMS). The knowledge is obtained by consulting a domain-dependent ontology for recognizing weather events, and the domain independent ontology for place names. Rinaldi et al. show the adaptation to the genomics domain of an existing QA system. The knowledge was extracted from several resources such as Unified Medical Language System (UMLS), SWISS-PROT, OMIM, GeneOntology, GenBank and LocusLink. As an adaptation of the ExtrAns system to the new genomics domain, this system uses the minimal logical forms to perform the semantic representation of documents and questions. Niu and Hirst’s previous work showed that current technologies for factoid QA in open domains were not adequate for clinical questions, whose answers must often be obtained by synthesizing relevant context. To adapt to this new characteristic of QA in the medical domain, they exploited the relations between the semantic [...]
As shown in the present section, different ways of processing logic forms are applied in the open-domain QA performance. Also, open-domain QA systems can be adapted to restricted domains. In the following sections, our QA system based on the processing of logic forms is presented. The features that allow the portability of the QA system to a new domain (the medical domain) are also presented. These portability features imply that our QA system runs as a medical QA system.
3. Motivation
There exist several agents that can interact in the clinical domains such as doctors, patients, laboratories and so on. All of them need quick and easy ways to access electronic information. Access to the latest medical information helps doctors to select better diagnoses, helps patients to know about their conditions, and allows the most effective treatment to be established. These facts produce a lot of information, and different types of information, between these agents that must be electronically processed. For example, people want to find competent medical answers to medical questions: when they have some unknown symptoms and want to know what they could be related to, or when they want to know another medical opinion about the best way to treat their disease, or when they can ask experienced doctors any medical questions related to any unknown symptoms or their state. All these features lead to the conclusion that the number and the type of medical questions that a medical QA system can respond to is very large.
These reasons motivated us to adapt the QA system to the medical domain. This medical QA system is capable of answering medical questions according to a medical question taxonomy. This question taxonomy is based on the study developed by Ely et al., whose main objective is to develop a taxonomy of doctors’ questions about patient care that could be used to help answer such questions. In this study, the participants were 103 Iowa family doctors and 49 Oregon primary care doctors. The authors concluded that clinical questions in primary care can be categorized into a limited number of generic types. A
3 WordNet is a large lexical database of the English language.
4 A factoid question is a fact-based, short answer question such as When
moderate degree of interrater reliability was achieved with the taxonomy developed in this study. The taxonomy may enhance the understanding of doctors’ information needs and improve the ability to meet those needs. According to this question taxonomy, the 10 most frequent questions formulated by doctors are ranked in the following enumeration:
(1) What is the drug of choice for condition x?
(2) What is the cause of symptom x?
(3) What test is indicated in situation x?
(4) What is the dose of drug x?
(5) How should I treat condition x (not limited to drug treatment)?
(6) How should I manage condition x (not specifying diagnostic or therapeutic)?
(7) What is the cause of physical finding x?
(8) What is the cause of test finding x?
(9) Can drug x cause (adverse) finding y?
(10) Could this patient have condition x?
Thus, our medical QA system must be able to answer natural language questions according to this set of 10 generic medical questions, discarding other questions (medical and from other domains). The fact that our QA system is only able to answer questions in this question taxonomy produces on the one hand a lower recall but on the other hand a higher precision, with the aim that our system will be very useful in the medical domain according to this question taxonomy.
This adapted domain QA system (in this case, medical domain) uses complex NLP techniques such as logic forms treatment. The main differences between the logic forms of our QA system and those of Moldovan and Mollá are based on the method of derivation of the logic forms, the method of identifying the predicates in the logic forms and the complexity of the inference rules in the treatment of the logic forms. On the one hand, the QA systems of Moldovan and Mollá derive the logic forms through the syntactic analysis of the sentence while, on the other hand, our QA system derives the logic form through the dependency relationships between the words. As Courtin and Genthial said, the processing based on syntactic analysis allows adding some semantic information to words. In open domains, this method of derivation of the logic forms improves the knowledge of the system. On the other hand, in restricted domains where there exist other knowledge resources, the derivation of the logic form through the dependency relationships between the words is more concise. Also, in our QA system as in Moldovan’s QA system, the identification of the predicates is based on the format of Logic Form Transformation of eXtended WordNet. In order to focus our QA system on restricted domains, in the logic forms treatment task, our inference rules are deeper than the inference rules applied by Moldovan [...]
The next section details the modular architecture of our QA system capable of answering the questions formulated according to a question taxonomy. Concretely, we show the adaptation to the specific medical domain taxonomy, implemented by means of the medical QA system.
4. QA system architecture
The main components (modules) of our QA system are:
(1) Question analysis.
(2) Document retrieval.
(3) Relevant passages selection.
(4) Answer extraction.
These components are related to each other and process the textual information available on different levels until the QA [...]
The natural language questions formulated to the system are processed initially by the question analysis component. This process is very important since the quantity and quality of the information extracted in this analysis will condition the performance of the remaining components and, therefore, the final result.
A part of the information obtained from this question analysis process is used by the document retrieval module to perform a first selection of documents from websites. In a restricted domain the document collections are frequently updated, and this leads to high maintenance costs for locally stored document collections. This is the main reason why this task is performed remotely using the Google search service. The obtained result is a very reduced subset of the documentary collection.
Subsequently, the relevant passages selection module performs a more detailed analysis of the relevant documents subset with the objective of detecting those reduced text fragments that are likely to contain the answer sought.
Finally, the answer extraction module processes the small set of text fragments obtained from the previous process with the purpose of locating and extracting the answer sought.
Fig. 1 graphically shows the execution sequence of these processes and the relationships between the modules.
Fig. 1. Medical QA system modular architecture.
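The execution sequence of the four modules can be illustrated with a minimal sketch. It is not the authors' implementation (which was developed on the Java platform): the function bodies are placeholder heuristics, and the `QuestionInfo` fields, the keyword filter and the `search` callable that stands in for the remote search service are all assumptions made for illustration. Only the order of the stages and the choice of returning no answer rather than a wrong one come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QuestionInfo:
    """Result of question analysis (fields are illustrative assumptions)."""
    text: str
    keywords: List[str]


def question_analysis(question: str) -> QuestionInfo:
    # Placeholder analysis: keep the words longer than three characters.
    words = [w.strip("?.,").lower() for w in question.split()]
    return QuestionInfo(question, [w for w in words if len(w) > 3])


def document_retrieval(info: QuestionInfo,
                       search: Callable[[List[str]], List[str]]) -> List[str]:
    # `search` is a hypothetical stand-in for the remote search service.
    return search(info.keywords)


def relevant_passages_selection(info: QuestionInfo,
                                documents: List[str]) -> List[str]:
    # Keep only the sentences that mention at least one keyword.
    passages = []
    for document in documents:
        for sentence in document.split(". "):
            if any(k in sentence.lower() for k in info.keywords):
                passages.append(sentence.strip(". "))
    return passages


def answer_extraction(info: QuestionInfo,
                      passages: List[str]) -> Optional[str]:
    # No candidate passage means no answer, rather than a wrong one.
    if not passages:
        return None
    return max(passages,
               key=lambda p: sum(k in p.lower() for k in info.keywords))


def answer(question: str,
           search: Callable[[List[str]], List[str]]) -> Optional[str]:
    """Run the four modules in the order described in the text."""
    info = question_analysis(question)
    documents = document_retrieval(info, search)
    passages = relevant_passages_selection(info, documents)
    return answer_extraction(info, passages)
```

Each stage only consumes the output of the previous ones, so any module can be replaced (for example, a local document collection instead of a remote one) without touching the others.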
The computational cost of this complex process depends primarily on two main factors: the speed of the internet connection in the tasks of document retrieval and named entities recognition, and the logic form derivation task. The time costs derived from the speed of the internet connection would be lower if the document collection and the knowledge resources (presented in the following subsections) were locally stored, because our system is also able to work locally with these resources. We prefer to work remotely with these resources because they are frequently updated (new drugs, new releases of knowledge resources, and so on). Moreover, with the aim of running this medical QA system on the most common operating systems, the Java™ platform has been used in the development phase. Persistent information is stored in the file system of the operating system. Thus, the dependencies between the DBMSs and the operating systems are avoided. Considering these development features of accessing the resources via internet, the average time needed to answer a question using the QA system is around 8 s.
Section 4.1 presents how the QA system performs a preprocessing of the sentences (questions and possible answers). Section 4.2 shows the portability features that allow the QA system to run as a medical QA system. Then, the rest of the sections (from 4.3 to 4.6) describe the main components of the medical QA system.
4.1. Preprocessing of the sentences
This preprocessing of the sentences allows the main modules to infer logic forms of sentences and obtain similarity relationships between verbs in the WordNet lexical database.
4.1.1. Inferring logic forms of sentences
Our medical QA system makes use of the logic forms of the sentences with the aim of simplifying the sentence treatment process. The logic form of a sentence is derived through applying NLP rules to the dependency relationships of the words in the sentence.
4.1.1.1. Getting dependency relationships. The first step necessary to infer the logic form of a sentence is to obtain the dependency relationships between the words of the sentence. The NLP resource used to obtain the dependency relationships between the words of the sentence is MINIPAR, a broad-coverage parser for English.
According to the definition proposed by Lin, a dependency relationship is an asymmetric binary relationship between a word called head (or governor, parent), and another word called modifier (or dependent, daughter). Normally the dependency relationships constitute a tree that links all the words in the sentence. This dependency tree has different levels of words because a word in the sentence may have different modifiers, but each word may modify at most one word. The root of the dependency tree does not modify any word. It is also called the head of the sentence.
For example, Fig. 2 shows the dependency tree of the sentence “Patient assistance programs help millions get the medications that they need”. The lexical category of each word is shown inside the brackets behind the word. These lexical categories can be noun (N), verb (V), adjective (A), and so on. Each one of the arrows labels the dependency relationship between the modifier and the head. These dependency relationships can be s (subject), mod (modifier), obj (object), and so on. In this example, the verb “to help” is the head of the sentence (the root of the dependency tree).
Fig. 2. Dependency tree of the sentence.
4.1.1.2. Logic form derivation. Once the dependency relationships have been acquired, the next step to automatically infer the logic form of the sentence is the analysis of these dependency relationships between the words of the sentence. The logic form derivation is a compositional process that starts in the leaves of the dependency tree, continues through the ramifications of the dependency tree and ends in the root of the dependency tree. Thus, the logic form is inferred on the one hand by the application of simple NLP rules to the leaves of the dependency tree and, on the other hand, by the application of complex NLP rules to all the pairs (modifier, head) in the dependency tree. This distinction between simple and complex NLP rules arises because in the leaves of the dependency tree there does not exist any dependency relationship in which the word is the head, while in the ramifications and in the root of the dependency tree such dependency relationships do exist.
To design the simple NLP rules only the lexical category of the word has been contemplated, while in the design of the complex NLP rules the lexical category of the head, the lexical category of the modifier, the type of dependency relationship and the relative position of the modifier (before the head or after the head) have been considered. Table 1 shows some simple NLP rules and Table 2 describes some complex NLP rules. In these tables, the Leaf column expresses the lexical category of the leaf, the
LCH column describes the lexical category of the head in the dependency relationship, the LCM column expresses the lexical category of the modifier in the dependency relationship, the DR column shows the type of dependency relationship, the MP column expresses the relative position of the modifier with respect to the head, and the LF column shows the inferred logic form.
The assignment of predicates and arguments to the lemma of the words is based on the codification applied by Logic Form Transformation of eXtended WordNet, a lexical resource based on logic forms. This codification depends on the part-of-speech of the word:
• Noun: An x-type argument is assigned to the predicate of this word. This argument uniquely identifies this predicate in the logic form. For instance, the noun “house” could be codified by the predicate “house:NN(x1)”.
• Verb: An e-type and two x-type arguments are assigned to the predicate of this word. The first one uniquely identifies this predicate (the action of the verb) in the logic form and the other ones denote the subject and the object of the word. If the verb is intransitive then the object argument must be dummy. As an example, the verb “take” could be codified by the predicate “take:VB(e1, x1, x2)”.
• Adjective: An x-type argument is assigned to the predicate of this word. This argument uniquely identifies this predicate (the property denoted by the adjective) in the logic form. For instance, the adjective “young” could be codified by the predicate “young:JJ(x1)”. When the adjective modifies a noun (there exists a dependency relationship from the adjective to the noun) then both predicates in the logic form are instantiated by the same x-type argument. For instance, “young man” could be codified as “young:JJ(x1) man:NN(x1)”.
• Adverb: An e-type argument is assigned to the predicate of this word. This argument uniquely identifies this predicate (the action expressed by the adverb) in the logic form. As an example, the adverb “clearly” could be codified by the predicate “clearly:RB(e1)”.
• Preposition: A combination of x-type and e-type arguments can be assigned as the two arguments of this predicate, which only links the dependency relationship between two other predicates. For instance, the expression “south of America” could be codified as “south:NN(x1) of:IN(x1, x2) America:NN(x2)” while the expression “go to the airport” could be codified as “go:VB(e1, x1, x2) [...]”.
We summarize this complex process of inferring the logic form of a sentence through the following example on the sentence “The aspirin is effective”. The first step is to find the dependency relationships between the words in the sentence. Fig. 3 shows the dependency tree. The second step consists of applying the simple NLP rules to the leaves of this dependency tree and obtaining the predicates of the logic form derived in these leaves (see Table 3). The next step is based on applying the complex NLP rules to the ramifications and the root of the dependency tree deriving the logic form (see Table 4).
Once all these rules have been applied to the dependency tree of the sentence “The aspirin is effective”, the logic form is inferred as “aspirin:NN(x2) be:VB(e1, x2, x3) Atributo:IN(e1, x1) effective:JJ(x1)”. Note that the verb “to be” is intransitive. As a result, on the one hand the argument of its predicate that represents the object (x3) is dummy and, on the other hand, the predicate “Atributo” links the dependency relationship between the verb and the adjective.
Fig. 3. Dependency tree of the sentence.
Table 1. Subset of simple NLP rules applied to the leaves in the dependency tree
Table 2. Subset of complex NLP rules applied to dependency relationships
modifier LF + lemma of head:JJ(modifier x var)
modifier LF + lemma of head:VB(new e var, modifier x var, new x var)
head LF + Atributo:IN(head e var, modifier x var) + modifier LF
Table 3. Simple NLP rules applied to the leaves in the dependency tree
Table 4. Complex NLP rules applied to dependency relationships
subj  Before  aspirin:NN(x2) be:VB(e1, x2, x3)
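The worked example on “The aspirin is effective” can be reproduced with a short sketch of the compositional derivation. This is an illustration under stated assumptions rather than the rule set used by the system: the `Node` and `Deriver` names are hypothetical, the dependency labels `det` and `pred` and the treatment of the determiner are assumptions, and only three complex rules are encoded. The verb rule and the “Atributo” rule follow the logic form templates listed in Table 2.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Node:
    lemma: str
    category: str             # "N", "V", "A" or "Det"
    lf: str = ""              # logic form derived so far for this subtree
    x: Optional[str] = None   # x-type variable attached to the word
    e: Optional[str] = None   # e-type variable (verbs only)


class Deriver:
    """Applies simple rules to leaves and complex rules to (modifier, head) pairs."""

    def __init__(self) -> None:
        self._x = count(1)
        self._e = count(1)

    def _new_x(self) -> str:
        return f"x{next(self._x)}"

    def _new_e(self) -> str:
        return f"e{next(self._e)}"

    def leaf(self, node: Node) -> Node:
        # Simple rule: only the lexical category of the leaf is inspected.
        if node.category == "A":
            node.x = self._new_x()
            node.lf = f"{node.lemma}:JJ({node.x})"
        elif node.category == "N":
            node.x = self._new_x()
            node.lf = f"{node.lemma}:NN({node.x})"
        # Assumption: a determiner leaf contributes no predicate.
        return node

    def pair(self, modifier: Node, head: Node, relation: str, position: str) -> Node:
        # Complex rule: head category, modifier category, relation and position.
        key = (head.category, modifier.category, relation, position)
        if key == ("N", "Det", "det", "before"):
            head.x = self._new_x()
            head.lf = f"{head.lemma}:NN({head.x})"
        elif key == ("V", "N", "subj", "before"):
            # modifier LF + lemma of head:VB(new e var, modifier x var, new x var)
            head.e, head.x = self._new_e(), modifier.x
            dummy_object = self._new_x()
            head.lf = (f"{modifier.lf} "
                       f"{head.lemma}:VB({head.e}, {modifier.x}, {dummy_object})")
        elif key == ("V", "A", "pred", "after"):
            # head LF + Atributo:IN(head e var, modifier x var) + modifier LF
            head.lf = f"{head.lf} Atributo:IN({head.e}, {modifier.x}) {modifier.lf}"
        else:
            raise ValueError(f"no rule for {key}")
        return head


deriver = Deriver()
the = deriver.leaf(Node("the", "Det"))
effective = deriver.leaf(Node("effective", "A"))                   # gets x1
aspirin = deriver.pair(the, Node("aspirin", "N"), "det", "before")  # gets x2
be = deriver.pair(aspirin, Node("be", "V"), "subj", "before")       # gets e1, x3
be = deriver.pair(effective, be, "pred", "after")
print(be.lf)
# aspirin:NN(x2) be:VB(e1, x2, x3) Atributo:IN(e1, x1) effective:JJ(x1)
```

The leaves are processed first and the root last, so the variable numbering matches the logic form given in the text: the adjective leaf receives x1, the noun x2, and the verb introduces e1 together with the dummy object x3.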
Our NLP technique used to infer the logic form is different from other techniques that accomplish the same goal, such as Moldovan’s, which takes as input the parse-tree of a sentence, or Mollá’s, which introduces the flat form as an intermediate step between the sentence and the logic form.
This generic NLP resource based on inferring the logic forms of the sentences is used by our medical QA system in the performance of question analysis (deriving the logic forms of the questions) and answer extraction (deriving the logic forms of the sentences that would contain the answer).
4.1.2. Similarity relationships between verbs
In spite of the fact that UMLS is a rich resource in medical expressions, it does not contain much information related to verbs because verbs are generally not domain dependent. For this reason our system uses WordNet to extract the similarity relationships of one verb to another. WordNet is a database of word meanings and lexical relationships that records the semantic relations between the synonym sets, also called synsets. A synset can be defined as a group of synonym words. These synsets are related to each other according to different relations: synonymy, hyponymy, hyperonymy, coordinate terms, holonymy, meronymy, antonymy, and so on.
4.2. Portability of the system to the medical domain
To adapt the QA system to the medical domain it is necessary to obtain medical knowledge by way of medical named entities recognition, and to develop the patterns associated to each one of the treated generic medical questions.
4.2.1. Medical named entities recognition
Our medical QA system needs to recognize the medical entities in the sentences, focusing on the processing in the different phases of the QA process. The medical named entities recognition performance is developed by using the UMLS, a resource of the language of biomedicine and health. A great number of concepts, relationships and definitions contained in UMLS have been derived from the Medical Subject Headings (MeSH) vocabulary. Concretely, our system uses the UMLS knowledge source called Metathesaurus to accomplish this goal. The UMLS Metathesaurus contains information about biomedical and health related concepts (meanings), facilitating mapping free-text entries to biomedical and health terminologies. The UMLS Metathesaurus is organized by concept. These concepts have assigned, at least, one semantic type (category). Table 5 shows an example of the mapping of free-text entries to biomedical and health terminologies by way of concepts and semantic types in the UMLS Metathesaurus. On the one hand, the CUI column uniquely identifies the concept while the CN column shows the name of the concept, and on the other hand, the TUI column uniquely identifies the semantic type while the STY column describes the name of the semantic type associated to the concept. Thus, our medical named entities recognition module is dictionary based. This module retrieves from the UMLS Metathesaurus all the information relative to the concept and the semantic types of the free-text received as argument. The retrieval of this information from the UMLS Metathesaurus is performed by consuming the UMLS Metathesaurus webservice through Simple Object Access Protocol (SOAP), an XML-based messaging protocol. The processing of this retrieved information [...]
CUI       CN            TUI   STY
C0020538  Hypertension  T047  Disease or syndrome
Even though our QA system is able to work locally with the UMLS Metathesaurus, this feature is currently not used because this resource is frequently updated with new releases. The fact that the execution time decreases by a few seconds when working locally with this resource would entail the following disadvantages: to detect when a new release has been published, to download this new release, to replace the previous installation with the new release, and to make possible changes in the software that interacts with the new release.
MeSH is a huge controlled vocabulary created by the United States National Library for the purpose of indexing journal articles and books in
This off-line task consists of the definition of the patterns that identify each generic question. These patterns are composed of a combination of types of medical entities and verbs. These patterns can be generated according to two different methods: the first one consists of the easy process of definition of patterns by an advanced user of the system, and the second one consists of the automatic generation of the patterns through the processing of questions according to the question taxonomy. We are going to describe these two different ways of generating patterns:
Manual pattern generation: The manual definition of these patterns is presented in Fig. 4. The advanced user of the system has to identify the types of medical entities and verbs that must match in the generic question. The automatic expansion of these verbs according to their similarity relationships with other ones in the WordNet lexical database is also performed. The following step consists of setting the medical entities lower threshold (MELT) and the medical entities upper threshold (MEUT) of each pattern. On the one hand, MELT can be defined as the minimum number of medical entities that must match between
the pattern and the question formulated by the user and, on the other hand, MEUT can be defined as the maximum number of medical entities that can match between the pattern and the question formulated by the user. Finally, the last step consists of the manual setting of the possible expected answer types.
Supervised automatic pattern generation: The automatic generation of these patterns by the system is performed through the processing of questions matched to the question taxonomy as shown in Fig. 5. Thus, the first step consists of the derivation of the logic form associated to each question. The next step is the medical named entities recognition in the logic form of those predicates whose type is noun (NN) or complex nominal (NNC) including their possible adjective modifiers (JJ). The third step is the recognition of the main verb in the logic form and the automatic expansion of this main verb in the logic form through the similarity relations with other verbs in the WordNet lexical database. The next step consists of the automatic setting of the MELT, whose score is set to the number of medical entities in the logic form minus one, and the automatic setting of the MEUT, whose score is set to the number of medical entities in the logic form. Finally, the last step consists of the manual setting of the possible expected answer types. This task is supervised by an advanced user of the system that can modify the results obtained by the system [...]
The question analysis performance consists of classifying and analyzing the natural language questions that users can ask. This computational process is based on two different tasks:
• Question classification: assigning one of the generic patterns to each one of the questions that the user asks our system.
• Question analysis: performing a complex process on the question according to the matched pattern and its respective matched generic question.
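The pattern thresholds and the question classification task can be sketched as follows. The sketch assumes that the main verb and the medical entity types have already been extracted from the logic form of the question. The threshold rule (MELT is the number of medical entities minus one, MEUT is the number of medical entities) and the selection of the pattern with the lowest difference between the entities matching measure (EMM) and MELT follow the paper. The `Pattern` structure, the check that EMM lies between MELT and MEUT, the tie-breaking order, the verb lists and the entity type labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Pattern:
    generic_question: str
    verbs: Set[str]            # main verb plus its expansion (hypothetical lists)
    entity_types: List[str]    # types of medical entities expected in the question
    melt: int                  # medical entities lower threshold
    meut: int                  # medical entities upper threshold


def automatic_thresholds(n_entities: int) -> Tuple[int, int]:
    """MELT = n - 1 and MEUT = n, as in the supervised automatic generation."""
    return n_entities - 1, n_entities


def emm(pattern: Pattern, question_entity_types: List[str]) -> int:
    """Entities matching measure: entities shared by question and pattern."""
    remaining = list(pattern.entity_types)
    matched = 0
    for entity_type in question_entity_types:
        if entity_type in remaining:
            remaining.remove(entity_type)
            matched += 1
    return matched


def classify(main_verb: str, question_entity_types: List[str],
             patterns: List[Pattern]) -> Optional[Pattern]:
    """Return the matching pattern with the lowest EMM - MELT, or None."""
    candidates = []
    for pattern in patterns:
        if main_verb not in pattern.verbs:
            continue
        score = emm(pattern, question_entity_types)
        # Assumption: EMM must lie between the two thresholds of the pattern.
        if pattern.melt <= score <= pattern.meut:
            candidates.append((score - pattern.melt, pattern))
    if not candidates:
        return None  # the question is outside the taxonomy
    return min(candidates, key=lambda c: c[0])[1]


patterns = [
    Pattern("How should I treat condition x?", {"treat", "manage"},
            ["Disease or Syndrome"], *automatic_thresholds(1)),
    Pattern("Can drug x cause (adverse) finding y?", {"cause", "induce"},
            ["Pharmacologic Substance", "Finding"], *automatic_thresholds(2)),
]
match = classify("cause", ["Pharmacologic Substance", "Finding"], patterns)
```

A question whose main verb appears in no pattern, or whose matching entities fall outside the thresholds, is left unclassified, which is in line with the preference for giving no answer over giving an incorrect one.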
This question classification task starts after the user enters the question into the system. In this implementation of the QA system adapted to the medical domain, 10 classes of user questions are managed according to the 10 generic questions treated by the system. Then, this task has to decide if the user question belongs to one class (matches with one of the generic questions) or not. To accomplish this goal, this task focuses on the treatment of question forms derived from the user questions according to the steps shown in Fig. 6. Thus, the first step consists of inferring the logic form of the question entered to the system. The second
Fig. 4. Manual pattern generation task.
Fig. 5. Supervised automatic pattern generation task.
Fig. 6. Question classification task.
step is the extraction of the main verb in the logic form. The next step is the recognition of the medical entities of those predicates whose type is noun (NN) or complex nominal (NNC) including their possible adjective modifiers (JJ). The fourth step is the analysis of the question form setting the medical entities score in question (MESQ). MESQ can be defined as the number of medical entities in the logic form of the question. The next step consists of finding those patterns of questions of which the list of verbs contains the main verb of the logic form [...] the entities matching measure (EMM) which is defined as the number of medical entities that match between the question and the pattern. Finally, the last step is the selection of the pattern whose difference between EMM and MELT is the lowest one.

Once the user question is matched to a generic question pattern from one of the 10 generic questions treated by the system, this question analysis task firstly captures the semantics of the user question. As mentioned before, WordNet and UMLS Metathesaurus are used in this performance. The following step consists of the recognition of the expected answer type. These medical answer types can be diseases, symptoms, dose of drugs, and so on, according to the possible answers to the 10 generic questions treated by the system. After that, the keywords are identified. These question keywords are directly recognized by applying a set of heuristics to the predicates and the relationships between predicates in the logic form. As question keywords, our QA system identifies complex nominals and nouns recognized as medical expressions (using medical named entities recognition) including their possible adjective modifiers, the rest of the complex nominals and nouns including their possible adjective modifiers, and the main verb in the logic form. For instance, in the part of the logic form “. . . high:JJ(x3) blood:NN(x1) NNC(x3, x1, x2) pressure:NN(x2) . . .”, the predicate x3 is recognized as a Disease or Syndrome and then “high blood pressure” is treated as a keyword. These question keywords can be expanded by applying a set of heuristics. For example, medical expressions can be expanded using similarity relations given by UMLS Metathesaurus. Thus, according to UMLS Metathesaurus, “high blood pressure” can be [...]

This set of question keywords is sorted by priority, so if too many keywords are extracted from the question, only a maximum number of keywords are searched in the information [...]

Even though the document retrieval module can retrieve locally stored documents, its remote facility retrieves the relevant documents from medical websites using the Google search service. These medical websites can be sorted from the previously defined medical website classification. This medical website classification is performed before the real-time execution of the Google search engine and consists of defining the different medical website classes where our system can retrieve the medical documents. Once these medical website classes have been defined, an additional task that consists of relating the generic questions and these medical website classes can be defined but it is not necessary. Note that a medical website class can be related to more than one generic question, and a generic question can be associated to more than one medical website class. Thus, this association relates each one of the generic questions and the medical websites that can answer them.

Then, this document retrieval engine can start retrieving those relevant documents from medical websites whether or not there exists the association between the searched generic question [...]

4.4.1. Document retrieval by way of medical websites classes

When the treated generic question has been related to at least one medical websites class then the Google search engine retrieves the relevant documents according to the question keywords [...]

4.4.2. Document retrieval by way of MFC algorithm

When the treated generic question has not been related to any medical website class then we apply our most frequent classes (MFC) algorithm. This algorithm calculates the most frequent medical website classes that rightly answer the treated generic question in the latest searches. Thus, the Google search engine retrieves the relevant documents according to the question keywords in the medical websites that belong to these most frequent medical website classes. The update of the MFC for the treated generic question is produced using an adaptation of the LRU algorithm for database disk buffering. This task consists of updating the MFC for the treated question with the actual medical website classes where the right answer can be found.

Once the medical documents are retrieved, this relevant passage selection process consists of extracting the sentences in these medical documents that could answer the user question. These sentences are extracted by applying a technique based on comparing the question keywords in the documents: those sentences that contain at least one question keyword are extracted from the document and are evaluated by the next answer extraction module that decides if the sentence rightly answers the [...]

This module extracts the answer by analyzing the sentences extracted by the previous relevant passage selection module. This process is performed by applying the following steps to each one of the retrieved sentences: the first one consists of inferring the logic form of the sentence and identifying the main verb in this logic form; the following step is to verify if this main verb belongs to the set of verbs that can answer the generic question; the third step is the recognition of the medical
entities in the logic form; the next step consists of checking whether the medical entities searched as the answer are found in the logic form; and finally, the last step is the analysis of the predicates that relate the candidate answer, the main verb and the rest of the medical entities in the logic form (answer form). This process produces an answer ranking. In a valid answer, the verb can uniquely relate two medical entities considering this feature as a direct link. Also, IN-type predicates can take part in the relation between the two medical entities considering this feature as a connect link. Our system scores these two links differently: 1 for the direct link, and 0.8 for the connect link. To rank the answer, our system applies the link measure defined as [...]

Fig. 7. Question classification task.
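
The link scoring just described can be illustrated with a small Python sketch. It assumes that an answer form is given as a list of (word, type) predicates and, as a simplification, treats every VB predicate as a direct link and every IN predicate as a connect link between the two medical entities. The function name link_scores and this list representation are hypothetical. The sketch only assigns the per-link scores (1 and 0.8) and does not attempt to reproduce the link measure formula itself.

```python
DIRECT_LINK_SCORE = 1.0   # verb relating the two medical entities
CONNECT_LINK_SCORE = 0.8  # IN-type predicate taking part in the relation


def link_scores(answer_form):
    """Return (word, score) for each link predicate in an answer form.

    answer_form is a list of (word, predicate_type) pairs; this flat
    representation is an assumption of the sketch.
    """
    scores = []
    for word, predicate_type in answer_form:
        if predicate_type == "VB":
            scores.append((word, DIRECT_LINK_SCORE))
        elif predicate_type == "IN":
            scores.append((word, CONNECT_LINK_SCORE))
        # NN, NNC and JJ predicates are entities or modifiers, not links.
    return scores


# Answer forms for "Cozaar treats hypertension" and
# "Hyzaar is indicated in the management of hypertension".
first_form = [
    ("Pharmacologic_Substance", "NN"),
    ("treat", "VB"),
    ("Disease_or_Syndrome", "NN"),
]
second_form = [
    ("Pharmacologic_Substance", "NN"),
    ("indicate", "VB"),
    ("in", "IN"),
    ("management", "NN"),
    ("of", "IN"),
    ("Disease_or_Syndrome", "NN"),
]

first_links = link_scores(first_form)
second_links = link_scores(second_form)
```

Under these assumptions the first answer form yields a single direct link and the second yields one direct link followed by two connect links, which is the input the link measure would then combine into a ranking score.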
For example, if a user asks the system the question “Which drugs are associated with the high blood pressure problem?”, this question is classified according to the first generic question “What is the drug of choice for condition x?”. Continuing with the processing, the answer extraction module receives as input the following sentences: “Cozaar treats hypertension” and “Hyzaar is indicated in the management of hypertension”. The logic form associated to the first sentence is “cozaar:NN(x1) treat:VB(e1, x1, x2) hypertension:NN(x2)” while the logic form associated to the second sentence is defined as “hyzaar:NN(x1) indicate:VB(e1, x1, x4) in:IN(e1, x3) management:NN(x3) of:IN(x3, x2) hypertension:NN(x2)”. The answer form associated to the first logic form is instantiated as “Pharmacologic_Substance:NN(x1) treat:VB(e1, x1, x2) Disease_or_Syndrome:NN(x2)”. Only a direct link (the treat verb) relates both medical entities (Pharmacologic_Substance and Disease_or_Syndrome). In this case LM = 1. The answer form associated to the second logic form is instantiated as “Pharmacologic_Substance:NN(x1) indicate:VB(e1, x1, x4) in:IN(e1, x3) management:NN(x3) of:IN(x3, x2) Disease_or_Syndrome:NN(x2)”. A direct link (the indicate verb) and two connect links (in and of) relate both medical entities (Pharmacologic_Substance and Disease_or_Syndrome). In this case LM = 0.8. Then, the answer ranking according to the LM scores would be: Cozaar and Hyzaar. These two answers would be the results returned by the system. LM ranks the answers according to the length of the paths between the treated medical entities. Thus, short paths would be in header [...]

5. Results

The evaluation of the medical QA system is based on the question analysis module, the core of the system, because the good performance of its question classification task (rightly classifying the formulated question into one of the generic questions) ultimately increases the precision of the system. Although open-domain QA systems can be evaluated according to the TREC and CLEF evaluation tracks, no such evaluation tracks exist for QA systems directed to a restricted domain. This is the main reason why the evaluation of the question classification task is based on the evaluation presented by Chung et al. in their previous research work. Thus, a pilot evaluation task applied to the evaluation of the question classification performance has been developed involving a group of people that did not work on the design and development phases of the QA system. These people received several instructions about the manual construction of these types of questions to manually create 50 questions according to the 10 generic questions answered by the system (GQ1: five questions that are matched to the first generic question; . . .; GQ10: five questions that are adjusted to the tenth generic question). The OQ question set, composed of 200 questions of the last QA English Track at the CLEF 2005 conference, is also included to evaluate the robustness of the question classification task in a noisy environment.

6 Similar to TREC, CLEF is another system evaluation campaign where QA systems can be tested, tuned and evaluated.

Fig. 7 shows how the question classifier task is able to classify each one of the given questions in one of the following classes:
• GE: This class of questions includes each one of the 10 generic questions. Thus, GE1 corresponds with the generic question “What is the drug of choice for condition x?”, GE2 is matched with the generic question “What is the cause of symptom x?”, . . . , and GE10 is arranged with the generic question “Could this patient have condition x?”.
• OE: The rest of the questions from other domains.

Then the evaluation task consists of checking if each one of the 250 evaluation questions (GQ1, . . . , GQ10 and OQ) has been correctly classified in the appropriate class of questions (GE1, . . . , GE10 or OE). As an evaluation measure, we apply the precision measure (P) defined as

P = (# correctly classified questions) / (# classified questions).

In order to show the results obtained in this question classification task, the detailed evaluation table shows the obtained results in the evaluation of each subset of evaluation questions while the summarized evaluation table presents these summarized results according to the generic set of evaluation questions. The class column expresses the class of questions that we are evaluating. The related class column shows
the correct related class associated to each classified class. The questions column presents the number of classified questions. The number of five questions per class and 200 noisy questions has been empirically established in the pilot evaluation task but this fact does not mean that the classifier is only able to classify this number of questions. The classifier, like the rest of the components of the QA system, does not depend on this number of questions to perform its functions. So, the QA system is able to sequentially manage an unlimited number of questions. The correct column indicates the number of questions that have been correctly classified according to the related class. The precision column shows the precision of this classification task that agrees with the presented evaluation measure.

7 Five questions per class according to the question taxonomy.

According to the overall row of the evaluation results, the precision score of the question classifier task is 94.4%. This good score will positively condition the right performance of the following parts of this QA process in the medical domain.

Detailed evaluation of the question classification task
Summarized evaluation of the question classification task

6. Discussion

It is well known that there exist a lot of information needs related to the different medical areas and specialities. Most of the on-line information in the health and medical areas is unknown to people outside of these areas including health care professionals. These information needs can be solved by applying the medical QA system capable of answering medical questions by retrieving the information from medical websites discarding any other wrong medical information that anybody can put on different websites. According to the proposed architecture the medical QA system can be easily transformed to a client–server application on the web accessed through a web-browser. Thus, the use of the medical QA system would [...]

The main novelty of the medical QA system is that the information can be retrieved from internet websites in comparison to most QA systems (in open and restricted domains) that only retrieve locally stored information in a known host. In spite of the medical question taxonomy presented in this article, the extension to other medical questions can be easily performed. Due to the efficient resources and techniques used by the medical QA system, the average temporal costs are around 8 s [...]

Also, with the aim to improve the temporal costs in answering the medical questions, each treated medical question can be searched in the medical websites considered by the system administrator. If this feature is not considered then the system automatically applies an adaptation to our task of the LRU algorithm used by the operating systems and the DBMSs in the memory management performance. This algorithm considers the medical websites where the system retrieved the documents that rightly responded to this class of question in previous executions of the system, and orders them according to the number of right responses retrieved in each medical website.

The software engineering rules that treat the module coupling and module cohesion properties in an object-oriented context have been applied in the design of the medical QA system architecture. For this reason the medical QA system can also be easily extended to other domains. This fact only implies the adaptation to the new domain of the system’s submodule that performs the entities recognition task, and the indication of which websites, dependent on the new domain, contain the information from which the answers can be extracted.

7. Summary

QA is applied to medical disciplines in modern QA over restricted domains. It allows users to efficiently obtain a list of answers to medical questions. The medical QA system presented in the present article is able to answer these questions according to a medical question taxonomy. Thus, the medical QA system offers tools to automatically define the functional patterns of a new medical question from a set of questions matched to this new medical question. Once these functional patterns have been automatically created, the new medical question can be answered by the medical QA system, in conjunction with the rest of these generic medical questions. Also, the medical websites where the system can find the right answers to the new question can be given easily as an input of the system. This guide to medical websites will improve the temporal costs of the system in answering this class of medical questions. The core of the medical QA system is the logic form treatment. This complex process is produced by applying advanced NLP techniques. The logic form of a sentence is derived through applying NLP rules to the dependency relationship of the words in the sentence. The NLP resource used to obtain these dependency relationships is MINIPAR, a broad coverage parser. Other NLP resources are used in this complex process: on the one hand the WordNet lexical database is used to extract
the similarity relationships between the verbs and, on the other hand, the UMLS Metathesaurus is used to recognize the medical named entities in the text. In spite of the fact that this QA system has been adapted to the medical domain, it also can be adapted to other restricted domains.

Acknowledgments

This work has been partially funded by the Spanish Government under project CICyT number TIN2006-15265-C06-0 and PROFIT number PI051438, by the European Union under project number FP6-IST-2005-33860 and by the Valencia Government under project number GV06/028.

References

[1] D. Moldovan, C. Clark, S. Harabagiu, S. Maiorano, COGEX: a logic prover for question answering, in: Proceedings of HLT-NAACL 2003, Human Language Technology Conference, Edmonton, Canada, 2003, [...]
[2] Y. Sasaki, Question answering as question-biased term extraction: a new approach toward multilingual QA, in: Proceedings of 43rd Annual Meeting of the Association for Computational Linguistics, Michigan, [...]
[3] J.L. Vicedo, M. Saiz, R. Izquierdo, F. Llopis, Does English help question answering in Spanish, in: Proceedings of the Fifth Workshop of the Cross-Language Evaluation Forum, CLEF 2004, Bath, UK, September 2004.
[4] I. Zukerman, B. Raskutti, Lexical query paraphrasing for document retrieval, in: H.-H. Chen, C.-Y. Lin (Eds.), Proceedings of the 19th International Conference on Computational Linguistics, COLING 2002, [...]
[5] D. Demner-Fushman, J. Lin, Knowledge extraction for clinical question answering: preliminary results, in: Proceedings of the AAAI-05 Workshop on Question Answering in Restricted Domains, Pittsburgh, [...]
[6] F. Benamara, Cooperative question answering in restricted domains: the WEBCOOP experiment, in: ACL 2004 Workshop on Question Answering in Restricted Domains, Barcelona, Spain, July 2004.
[7] Y. Niu, G. Hirst, Analysis of semantic classes in medical text for question answering, in: Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, Workshop on Question Answering in Restricted Domains, Barcelona, Spain, July 2004.
[8] D. Mollá, R. Schwitter, M. Hess, R. Fournier, ExtrAns, an answer extraction system, TAL Special Issue on Information Retrieval Oriented Natural Language Processing, 2002, pp. 495–522.
[9] [...] morphologically and semantically enhanced resource, in: Proceedings of ACL-SIGLEX99: Standardizing Lexical Resources, Maryland, June 1999, pp. 1–8.
[10] G.A. Miller, WordNet: an on-line lexical database, Int. J. Lexicography [...]
[11] H. Chung, Y.-I. Song, K.-S. Han, D.-S. Yoon, J.-Y. Lee, H.-C. Rim, S.-H. Kim, A practical QA system in restricted domains, in: Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, Workshop on Question Answering in Restricted Domains, Barcelona, [...]
[12] F. Rinaldi, J. Dowdall, G. Schneider, A. Persidis, Answering questions in the genomics domain, in: Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, Workshop on Question Answering in Restricted Domains, Barcelona, Spain, July 2004.
[13] D.A.B. Lindberg, B.L. Humphreys, A.T. McCray, The Unified Medical Language System, in: Methods of Information in Medicine, vol. 32(4), [...]
[14] Y. Niu, G. Hirst, G. McArthur, P. Rodriguez-Gianolli, Answering clinical questions with role identification, in: Proceedings of 41st Annual Meeting of the Association for Computational Linguistics, Workshop on Natural Language Processing in Biomedicine, Sapporo, Japan, July 2003.
[15] J.W. Ely, J.A. Osheroff, P.N. Gorman, M.H. Ebell, M.L. Chambliss, E.A. Pifer, P.Z. Stavri, A taxonomy of generic clinical questions: classification study, Brit. Med. J. 321 (2000) 429–432.
[16] J. Courtin, D. Genthial, Parsing with dependency relations and robust parsing, in: Proceedings of COLING-ACL ’98 Workshop on Processing of Dependency-based Grammars, Montreal, August 1998, pp. 88–94.
[17] D. Lin, Dependency-based evaluation of minipar, in: Workshop on the Evaluation of Parsing Systems, Granada, Spain, 1998.
[18] D. Moldovan, V. Rus, Logic form transformation of WordNet and its applicability to question-answering, in: Proceedings of 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, [...]
[19] B.L. Humphreys, D.A.B. Lindberg, The UMLS project: making the conceptual connection between users and the information they need, Bull. Med. Libr. Assoc. 81 (1993) 170–177.
[20] E.J. O’Neil, P.E. O’Neil, G. Weikum, The LRU-K page replacement algorithm for database disk buffering, in: ACM SIGMOD Record, vol. [...]

Rafael M. Terol (1979) graduated from the University of Alicante, Spain with a Bachelor of Engineering Degree in Information technology. With a university rank for his undergraduate degree, he joined the University of Alicante in the year 2002 for his Master of Computer Sciences degree in Natural Language Processing Systems under the supervision of Dr. Patricio Martinez-Barco and Dr. Manuel Palomar. Working in the area of natural language processing, Rafael performed his research work at GPLSI-UA Spain. He is currently working in the area of question answering (QA) in restricted domains. His research interests include medical QA, textual entailment and information retrieval.

Patricio Martinez-Barco (1968) Ph.D. in Computer Science by the University of Alicante (2001). Master in Computer Science by the University of Alicante (1994). He has been working since 1995 in the Department of Software and Computing Science (GPLSI division) at this University as a lecturer. His research interests are focused on Computational Linguistics and Natural Language Processing. His last projects are related to temporal expression resolution, syntactic-semantic patterns and logical forms applied to Information Extraction, Information Retrieval and Question Answering. He was General Chair of the ESTAL’04 (Alicante) and SEPLN’04 (Barcelona) conferences, as well as Local Chair of the SLPLT’01 (Jaén) workshop. He has edited several books, and contributed with more than 40 papers to several journals [...]

Manuel Palomar (1964) is the pro-vice-chancellor for research at the University of Alicante, and he is the head of the Natural Language Processing Group (GPLSI) of Language and Information Systems Department at the University of Alicante, Spain. Palomar received a Ph.D. in computer science from the Technical University of Valencia, Spain. His research interests include information extraction, question answering, linguistic resources and in general research on Human Language Technologies.