
Files in This Item:

File: Song CIKM2006.pdf
Size: 260.98 kB
Format: Adobe PDF
Title: Concept-based document readability in domain specific information retrieval.
Authors: Yan, Xin
Song, Dawei
Li, Xue
Keywords: Document ranking
Document readability
Document scope and cohesion
Readability formula
Issue Date: 2006
Publisher: ACM
Citation: YAN, X., SONG, D. and LI, X., 2006. Concept-based document readability in domain specific information retrieval. In: P. YU, V. TSOTRAS, E. FOX and B. LIU, eds. Proceedings of the 15th ACM International Conference on Information and Knowledge Management. 6-11 November 2006. Arlington, VA, USA. pp. 540-549.
Abstract: Domain-specific information retrieval is increasingly in demand. Not only domain experts but also average non-expert users are interested in searching for domain-specific (e.g., medical and health) information in online resources. However, a typical problem for average users is that the search results are always a mixture of documents with different levels of readability. Non-expert users may want to see documents with higher readability at the top of the list. Consequently, the search results need to be re-ranked in descending order of readability. It is often not practical for domain experts to manually label the readability of documents in large databases, so computational models of readability need to be investigated. However, traditional readability formulas are designed for general-purpose text and are insufficient to deal with technical materials in domain-specific information retrieval. More advanced algorithms, such as the textual coherence model, are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to the textual genre of a document, our model also takes into account domain-specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document's readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that our proposed readability formulas lead to remarkable improvements in terms of correlation with users' readability ratings over four traditional readability measures.
ISBN: 9781595934338
Appears in Collections:Conference publications (Computing)

All items in OpenAIR are protected by copyright, with all rights reserved.

