Keyword-Based Text Document Retrieval System

Edwin Mathew *, L. Karthikeyan **, B. Muthu Senthil ***
*-*** Department of Computer Science and Engineering, SRM Valliammai Engineering College, Kattankulathur, Tamil Nadu, India.
Periodicity: September - November 2020
DOI: https://doi.org/10.26634/jit.9.4.18244

Abstract

One of the most fundamental problems in data mining is finding a specific piece of information in a data repository. The repository can take any form: a collection of HTML documents, a set of text documents, a list of social media profiles, and so on. The task essentially comes down to identifying the subset of documents that match a set of keywords. Keyword filtering is useful for narrowing down the candidates and eliminating irrelevant documents, but when the user is still presented with a large subset, there must also be a reliable means of ranking those documents. There should, moreover, be a comprehensive and efficient system that puts together all the pieces behind the ranking process.
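
The filter-then-rank idea described above can be illustrated with a minimal Python sketch. The abstract does not name a scoring method, so the smoothed TF-IDF weighting used here is an assumption chosen for illustration, and the function names (tokenize, retrieve) and the sample corpus are hypothetical rather than taken from the article.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(documents, query):
    """Keep documents containing at least one query keyword, then rank them.

    The score is a smoothed TF-IDF sum over the matched keywords. This
    weighting is an illustrative assumption, not the article's method.
    """
    keywords = set(tokenize(query))
    term_counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    total = len(documents)

    # Document frequency: number of documents that contain each keyword.
    doc_freq = {k: sum(1 for c in term_counts.values() if k in c) for k in keywords}

    scores = {}
    for name, counts in term_counts.items():
        matched = [k for k in keywords if k in counts]
        if not matched:
            continue  # filtering step: drop documents with no keyword hit
        length = sum(counts.values())
        scores[name] = sum(
            (counts[k] / length) * (math.log((1 + total) / (1 + doc_freq[k])) + 1)
            for k in matched
        )

    # Ranking step: highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-document repository.
corpus = {
    "a.txt": "keyword search over text documents",
    "b.txt": "ranking documents by keyword relevance and keyword frequency",
    "c.txt": "social media profiles",
}
ranked = retrieve(corpus, "keyword ranking")
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this sketch, c.txt is eliminated by the filter because it contains neither keyword, and b.txt ranks above a.txt because it matches both keywords. A production system would likely replace the in-memory dictionaries with an inverted index, but that design choice is outside what the abstract specifies.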

Keywords

Information Retrieval, Query-Based Filtering.

How to Cite this Article?

Mathew, E., Karthikeyan, L., and Senthil, B. M. (2020). Keyword-Based Text Document Retrieval System. i-manager's Journal on Information Technology, 9(4), 1-8. https://doi.org/10.26634/jit.9.4.18244
