Please use this identifier to cite or link to this item: http://hdl.handle.net/10059/623

Files in This Item:

File: Muresan PhD.pdf
Size: 19.35 MB
Format: Adobe PDF
Title: Using document clustering and language modelling in mediated information retrieval.
Authors: Muresan, Gheorghe
Supervisors: Harper, David J.
Issue Date: Jan-2002
Publisher: Robert Gordon University
Citation: MURESAN, G. and HARPER, D. J., 2001. Document clustering and language models for system-mediated information access. In: Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries (ECDL'01), Darmstadt, Germany, September 2001, pp. 438-449
MURESAN, G., HARPER, D. J. and GOKER, A., 2001. ClusterBook: a tool for system-mediated access via clustered collections. In: ECDL'01. Darmstadt, Germany (Demo/poster)
MURESAN, G., HARPER, D. J., GOKER, A. and LOWIT, P., 2000. Cluster-Book: a tool for dual information access. In: Proceedings of SIGIR'00, Athens, July 2000, p. 391 (Demo)
MURESAN, G., HARPER, D. J. and MECHKOUR, M., 1999. WebCluster a tool for mediated information access. In: Proceedings of SIGIR'99, Berkeley, August 1999, p. 337 (Demo)
HARPER, D. J., MECHKOUR, M. and MURESAN, G., 1999. Document clustering for mediated information access. In: Proceedings of the 21st BCS-ISRG Annual Colloquium on IR Research, Glasgow, April 1999.
MECHKOUR, M., HARPER, D. J. and MURESAN, G., 1998. The WebCluster project: using document clustering for mediating access to the World Wide Web. In: Proceedings of SIGIR'98, Melbourne, Australia, August 1998 (Poster)
Abstract: Our work addresses a well-documented problem: users are frequently unable to articulate a query that clearly and comprehensively expresses their information need. This can be attributed to the information need being too ambiguous and not clearly defined in the user's mind, to a lack of knowledge of the domain of interest on the part of the user, to a lack of understanding of a retrieval system's conceptual model, or to an inability to use a certain query syntax. This thesis proposes a software tool that emulates the human search mediator. It helps a user explore a domain of interest, learn its structure, terminology and key concepts, and clarify and refine an information need. It can also help a user generate high-quality queries for searching the World Wide Web or other large and heterogeneous document collections. Our work was inspired by library studies which have highlighted the role of the librarian in helping the user explore her information need, define the problem to be solved, articulate a formulation of the information need and adapt it for the retrieval system at hand in order to get information. Our approach, mediated access through a clustered collection, is based on an information access environment in which the user can explore a relatively small, well-structured, pre-clustered document collection covering a particular subject domain, in order to understand the concepts encompassed and to clarify and refine her information need. At the same time, the user can ostensively indicate clusters and documents of interest so that the system builds a model of the user's topic of interest. Based on this model, the system assists and guides the user's exploration, or generates 'mediated queries' that can be used to search other collections. We present the design and evaluation of WebCluster, a system that reifies the concept of mediated retrieval.
Additionally, a variety of mediation experiments are presented, which provide guidelines as to which mediation strategies are more appropriate for different types of tasks. A set of experiments is presented that evaluates document clustering's capacity to group together topical documents and support mediation. In this context we propose and experimentally test a new formulation for the cluster hypothesis. We also look at the ability of language models to convey content, to represent topics and to highlight specific concepts in a given context. They are also successfully applied to generate flexible, task-dependent cluster representatives for supporting exploration through browsing and searching, respectively. Our experimental results show that mediation has the potential to significantly improve user queries and, consequently, retrieval effectiveness.
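
Illustrative sketch (not taken from the thesis): the abstract describes building a model of the user's topic from documents marked as interesting and deriving 'mediated queries' from it. One simple way this could be done, assuming unigram language models, whitespace tokenisation, additive smoothing, and term ranking by contribution to the Kullback-Leibler divergence between the topic model and the collection model, is shown below. The function names `unigram_counts` and `mediated_query` and the `smoothing` parameter are hypothetical and do not come from WebCluster.

```python
import math
from collections import Counter


def unigram_counts(docs):
    """Count lower-cased, whitespace-separated terms over a list of texts."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def mediated_query(selected_docs, collection_docs, k=5, smoothing=0.5):
    """Return up to k terms that best distinguish the selected documents.

    Assumption: each term is scored by its contribution to
    KL(topic || collection), i.e. p_topic * log(p_topic / p_collection).
    The collection model uses additive smoothing so that every term
    has a non-zero probability.
    """
    topic = unigram_counts(selected_docs)
    background = unigram_counts(collection_docs)
    vocab_size = len(set(topic) | set(background))
    topic_total = sum(topic.values())
    bg_total = sum(background.values())

    scores = {}
    for term, count in topic.items():
        p_topic = count / topic_total
        p_bg = (background[term] + smoothing) / (bg_total + smoothing * vocab_size)
        scores[term] = p_topic * math.log(p_topic / p_bg)

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Under these assumptions, terms that are frequent across the whole collection (such as "the") receive low or negative scores, while terms concentrated in the selected documents rise to the top and could be submitted as a query to another collection.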
Appears in Collections: Theses (Computing)

All items in OpenAIR are protected by copyright, with all rights reserved.