Violating the official rules of HTML produces flaws or errors in a web page. Furthermore, when a Word document is converted into a web page, the generated code contains unnecessary HTML tags as well as proprietary tags. Such undesirable, redundant, or irrelevant tags are considered noise. These noisy elements clutter the web page and make its contents difficult to read. Noise adversely affects web data mining, and eliminating it reduces storage and indexing requirements. Noise removal also helps to improve the performance of web page clustering, classification, content mining, and summarization. In the proposed work, web page noise was identified using four popular web browsers (Google Chrome, Internet Explorer 7, Mozilla Firefox, and Opera) and three web authoring tools (MS Word, Dreamweaver 8, and Microsoft Expression Web 4). Once identified, the noise was classified into categories based on the source of the Word document. The experiment was conducted by running 40 web pages on the four browsers, and the results show that web page noise depends to a large extent on the source
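
As an illustration of what proprietary noise in Word-exported HTML can look like, the following is a minimal sketch in Python using only the standard library. It is not the identification procedure used in this work. The class name `WordNoiseCounter`, the function `count_word_noise`, and the four noise labels are hypothetical, and the heuristics (namespaced tags such as `o:p`, `mso-` style properties, `Mso*` class names, and conditional comments) are assumptions about typical Word output.

```python
from collections import Counter
from html.parser import HTMLParser


class WordNoiseCounter(HTMLParser):
    """Counts a few markers typical of Word-exported HTML (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Namespaced tags such as <o:p> or <w:WordDocument> are not standard HTML.
        if ":" in tag:
            self.counts["namespaced_tag"] += 1
        for name, value in attrs:
            value = value or ""
            # Word writes proprietary "mso-" CSS properties into style attributes.
            if name == "style" and "mso-" in value.lower():
                self.counts["mso_style"] += 1
            # Word also emits class names such as MsoNormal.
            elif name == "class" and value.lower().startswith("mso"):
                self.counts["mso_class"] += 1

    def handle_comment(self, data):
        # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
        if data.lstrip().lower().startswith("[if"):
            self.counts["conditional_comment"] += 1


def count_word_noise(html):
    """Return a dict mapping each (hypothetical) noise label to its count."""
    parser = WordNoiseCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


if __name__ == "__main__":
    sample = (
        "<p class=MsoNormal><span style='mso-bidi-font-size:10.0pt'>Hello"
        "<o:p></o:p></span></p>"
        "<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->"
    )
    print(count_word_noise(sample))
```

Markup inside a conditional comment is treated as comment text by the parser, so tags nested there are counted once as a conditional comment rather than individually. A fuller treatment would need per-tool rules for each authoring tool.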