|Title: ||Using document clustering and language modelling in mediated information retrieval.|
|Authors: ||Muresan, Gheorghe|
|Supervisors: ||Harper, David J.|
|Issue Date: ||Jan-2002|
|Publisher: ||Robert Gordon University|
|Citation: ||MURESAN, G. and HARPER, D. J., 2001. Document clustering and language models for system-mediated information access. In: Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries (ECDL'01), Darmstadt, Germany, September 2001, pp. 438-449|
MURESAN, G., HARPER, D. J. and GOKER, A., 2001. ClusterBook: a tool for system-mediated access via clustered collections. In: ECDL'01. Darmstadt, Germany (Demo/poster)
MURESAN, G., HARPER, D. J., GOKER, A. and LOWIT, P., 2000. Cluster-Book: a tool for dual information access. In: Proceedings of SIGIR'00, Athens, July 2000, p. 391 (Demo)
MURESAN, G., HARPER, D. J. and MECHKOUR, M., 1999. WebCluster, a tool for mediated information access. In: Proceedings of SIGIR'99, Berkeley, August 1999, p. 337 (Demo)
HARPER, D. J., MECHKOUR, M. and MURESAN, G., 1999. Document clustering for mediated information access. In: Proceedings of the 21st BCS-ISRG Annual Colloquium on IR Research, Glasgow, April 1999.
MECHKOUR, M., HARPER, D. J. and MURESAN, G., 1998. The WebCluster project: using document clustering for mediating access to the World Wide Web. In: Proceedings of SIGIR'98, Melbourne, Australia, August 1998 (Poster)
|Abstract: ||Our work addresses a well documented problem: users are frequently unable to articulate
a query that clearly and comprehensively expresses their information need. This can
be attributed to the information need being too ambiguous and not clearly defined in the
user's mind, to a lack of knowledge of the domain of interest on the part of the user, to a
lack of understanding of a retrieval system's conceptual model, or to an inability to use a
certain query syntax.
This thesis proposes a software tool that emulates the human search mediator. It helps
a user explore a domain of interest, learn its structure, terminology and key concepts, and
clarify and refine an information need. It can also help a user generate high-quality queries
for searching the World Wide Web or other such large and heterogeneous document collections.
Our work was inspired by library studies which have highlighted the role of the librarian
in helping the user explore her information need, define the problem to be solved,
articulate a formulation of the information need and adapt it for the retrieval system at
hand in order to get information.
Our approach, mediated access through a clustered collection, is based on an
information access environment in which the user can explore a relatively small, well-structured,
pre-clustered document collection covering a particular subject domain, in order
to understand the concepts encompassed and to clarify and refine her information need.
At the same time, the user can ostensively indicate clusters and documents of interest so
that the system builds a model of the user's topic of interest. Based on this model, the
system assists and guides the user's exploration, or generates 'mediated queries' that can
be used to search other collections.
We present the design and evaluation of WebCluster, a system that reifies the concept
of mediated retrieval. Additionally, a variety of mediation experiments are presented,
which provide guidelines as to which mediation strategies are more appropriate for different
types of tasks.
A set of experiments is presented that evaluate document clustering's capacity to
group together topical documents and support mediation. In this context we propose and
experimentally test a new formulation for the cluster hypothesis.
We also look at the ability of language models to convey content, to represent topics
and to highlight specific concepts in a given context. They are also successfully applied
to generate flexible, task-dependent cluster representatives for supporting exploration
through browsing and searching, respectively.
Our experimental results show that mediation has the potential to significantly improve
user queries and, consequently, retrieval effectiveness.|
|Appears in Collections:||Theses (Computing)|