Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from homogeneous corpora (e.g., extracting the location and time of courses from a set of announcements). IE has customarily relied on extensive human involvement in the form of hand-crafted extraction rules and hand-tagged training examples. Furthermore, the user is required to explicitly pre-specify every relation of interest. Information extraction is closely associated with artificial intelligence, and with the development of methods and algorithms for language analysis and their computer implementation. Moving to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-label new training examples. This manual effort scales linearly with the number of target relations. This paper presents a DOM-based extraction strategy for web page data that improves search efficiency by preserving the topic information and filtering out the noise information that users are not interested in. Experiments were carried out on several data sets. The experimental results show that the proposed semantic clustering technique extracts information from the web more effectively than existing techniques.
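
To make the idea of DOM-based noise filtering concrete, the following is a minimal sketch, not the method proposed in this paper. It assumes that noise can be approximated by a fixed set of HTML elements (the hypothetical `NOISE_TAGS` list) and keeps only the text found outside them. It uses plain tag-based filtering with Python's standard `html.parser` module and does not perform semantic clustering; the names `TopicTextExtractor` and `extract_topic_text` are illustrative only.

```python
from html.parser import HTMLParser

# Hypothetical set of elements treated as noise; this list is an
# assumption made for illustration and is not taken from the paper.
NOISE_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TopicTextExtractor(HTMLParser):
    """Collects text that lies outside every noise element."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # greater than 0 while inside a noise subtree
        self.chunks = []      # text fragments kept as topic information

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep non-empty text only when no noise element is open.
        if text and self.noise_depth == 0:
            self.chunks.append(text)


def extract_topic_text(html):
    """Return the text fragments of `html` that are not inside noise elements."""
    parser = TopicTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.chunks


if __name__ == "__main__":
    page = (
        "<html><body>"
        "<nav><a href='/'>Home</a></nav>"
        "<div><h1>Course schedule</h1><p>Room 101, 10:00</p></div>"
        "<script>var x = 1;</script>"
        "<footer>Contact us</footer>"
        "</body></html>"
    )
    for fragment in extract_topic_text(page):
        print(fragment)
```

In this sketch the navigation links, the script body, and the footer are discarded, while the heading and paragraph inside the `div` are kept. A real system would likely need a richer notion of noise than a fixed tag list, which is where clustering of DOM nodes could come in.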