Design and Development of Feature Based Similarity Measure Crawling Algorithm: An Approach to Text Mining

Prashant Dahiwale*, Sanjay Mate**, M.M. Raghuwanshi***
* Lecturer, Department of Computer Engineering, Government Polytechnic Daman, UT of Daman and Diu, India.
** Lecturer, Department of Information Technology, Government Polytechnic Daman, UT of Daman and Diu, India.
*** Professor, Department of Computer Technology, Yashwantrao Chauhan College of Engineering, Nagpur, Maharashtra, India.
Periodicity: January - March 2018
DOI: https://doi.org/10.26634/jse.12.3.14554

Abstract

The speed at which the World Wide Web (WWW) has grown from a small number of web pages into an
enormous centre of web information steadily increases the difficulty of web crawling for a search engine. A search
engine handles queries from all parts of the world, and how well it satisfies them depends only on the knowledge
that it collects by means of crawling. One of society's most common habits is sharing information, and this is done by
publishing structured, semi-structured, and unstructured resources on the web (Nandy, Sarkar, and Das, 2012).
This social practice leads to an exponential expansion of web resources, and hence it has become necessary to crawl
continuously in order to keep web knowledge up to date and to track changes in existing sources. This paper proposes
a feature-based crawling algorithm for lightweight and efficient crawling. A scaling technique is used to evaluate the
performance of the proposed method against a standard crawler. A marked speed improvement is observed after scaling,
and the extraction of relevant web resources at this speed is examined.

Keywords

Feature Vector, Similarity Measure, Equivalence Measure, Term Frequency, Data Mining, Information Extraction, Focused Crawler, Crawler Analysis.
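
The keywords name term frequency, feature vectors, and a similarity measure, but this page does not describe how the paper combines them. As a minimal sketch only, assuming a plain term-frequency feature vector and cosine similarity (a common choice, not confirmed to be the authors' measure), a focused crawler could score a fetched page against a topic description as follows. The function names, the tokenizer, and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def term_frequency(text):
    """Build a term-frequency feature vector (term -> relative frequency)."""
    # Assumed tokenizer: lowercase runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {term: count / total for term, count in counts.items()}


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_relevant(page_text, topic_text, threshold=0.3):
    """Decide whether a page is similar enough to the topic to keep crawling.

    The threshold is an illustrative value, not one taken from the paper.
    """
    score = cosine_similarity(term_frequency(page_text), term_frequency(topic_text))
    return score >= threshold
```

In a crawler loop, a check such as `is_relevant(page_text, topic_text)` would decide whether the links on a fetched page are added to the frontier; pages below the threshold are skipped, which keeps the crawl focused.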

How to Cite this Article?

Dahiwale, P., Mate, S., and Raghuwanshi, M. M. (2018). Design and Development of Feature Based Similarity Measure Crawling Algorithm: An Approach to Text Mining. i-manager's Journal on Software Engineering, 12(3), 1-7. https://doi.org/10.26634/jse.12.3.14554
