Please use this identifier to cite or link to this item: http://etd.cput.ac.za/handle/20.500.11838/3300
Title: Hybridised indexing for research based information retrieval
Authors: Fitzgerald, Kyle Andrew 
Keywords: Hybrid token index;Information retrieval -- Research;Information storage and retrieval systems
Issue Date: 2019
Publisher: Cape Peninsula University of Technology
Abstract: Information retrieval systems struggle when the vocabulary of a query does not match the vocabulary of candidate source documents. As a result, these systems may retrieve some documents that are non-relevant and miss some that are relevant. This lengthens research by forcing additional perusal of unsatisfactory results and additional searches using alternative vocabularies, which makes information retrieval systems less effective than they could be and inhibits productive research. The aim of this research was to design, build, and rigorously pilot-test a hybrid indexing method that maintains phrase-term word ordinality and word proximity, and to compare the effectiveness of this method with the traditional inverted indexing method. The objectives were to prove statistically that the hybrid indexing method: i) increases the effectiveness of retrieving only those documents that are judged relevant by the user; ii) reduces the incorrect identification of user-judged relevant documents, thus reducing the number of documents for the user to peruse; and iii) increases the rejection quality of user non-relevant documents, thus providing confidence to the user in the judgement of the information retrieval system. A final objective was to determine whether this hybrid indexing method solves the problem of mismatching vocabulary between a query and a document, and satisfies the information needs of the user by retrieving only those documents from the collection relevant to the user. Note that the results of the statistical analysis are not the contribution to knowledge; the statistics are used only to show that the hybrid indexing method worked. The indexing method itself is the contribution to the body of knowledge. The strategy was based on design science research, comprising both an exploratory and an explanatory study.
Quantitative data were collected from the results of processing search queries through two information retrieval systems (one using the hybrid indexing method and the other the inverted indexing method) and from the results of a questionnaire completed by five participants during an experiment. The quantitative data were converted to binary form and tested statistically using the mean values for precision, recall, and specificity, and the Kappa coefficient. The hybrid indexing method was shown, with statistical significance, to increase system effectiveness and specificity. Based on the results, the vocabulary mismatch problem between a query and a document was solved, but the information needs of the user were not satisfied.
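The measures named in the abstract can be illustrated with a minimal sketch. It assumes that the binary data are one system's retrieve/reject decisions and one user's relevant/non-relevant judgements for the same documents, and that the Kappa coefficient is Cohen's kappa between the two. The abstract does not state either point, so the function names and sample data below are hypothetical and are not taken from the thesis.

```python
from typing import Sequence, Tuple


def confusion_counts(retrieved: Sequence[int], relevant: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) from two equal-length sequences of 0/1 values."""
    if len(retrieved) != len(relevant):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(retrieved, relevant))
    tp = sum(1 for r, j in pairs if r == 1 and j == 1)  # retrieved and relevant
    fp = sum(1 for r, j in pairs if r == 1 and j == 0)  # retrieved, not relevant
    fn = sum(1 for r, j in pairs if r == 0 and j == 1)  # missed relevant document
    tn = sum(1 for r, j in pairs if r == 0 and j == 0)  # correctly rejected
    return tp, fp, fn, tn


def precision(tp: int, fp: int) -> float:
    """Share of retrieved documents that the user judged relevant."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of relevant documents that the system retrieved."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Share of non-relevant documents that the system rejected."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Agreement between system and user, corrected for chance (assumed: Cohen's kappa)."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("no documents to compare")
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical data for eight documents (1 = retrieved / relevant, 0 = not).
system_retrieved = [1, 1, 1, 0, 0, 0, 0, 1]
user_relevant = [1, 1, 0, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(system_retrieved, user_relevant)
print(precision(tp, fp), recall(tp, fn), specificity(tn, fp), cohen_kappa(tp, fp, fn, tn))
```

In a comparison of two systems, each measure would be computed per query and per system and then averaged, which is one reading of the mean values the abstract refers to.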
Description: Thesis (Doctor of Information and Communication Technology: Information Technology)--Cape Peninsula University of Technology, 2019
URI: http://etd.cput.ac.za/handle/20.500.11838/3300
Appears in Collections: Information Technology - Doctoral Degree

Files in This Item:
File                                    Description        Size      Format
Fitzgerald_Kyle_205118801_Vol_1.pdf     Main Thesis File   4.53 MB   Adobe PDF
Fitzgerald_Kyle_205118801_Vol._2.pdf    Appendices File    5.94 MB   Adobe PDF

Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.