Mining data records in Web pages
In KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining (2003), pp. 601-606.
Abstract
A large amount of information on the Web is contained in regularly structured objects, which we call data records. Such data records are important because they often present the essential information of their host pages, e.g., lists of products or services. Mining these data records makes it possible to extract information from them and provide value-added services. Existing automatic techniques are not satisfactory because of their poor accuracy. In this paper, we propose a more effective technique to perform the task. The technique is based on two observations about data records on the Web and a string matching algorithm. The proposed technique is able to mine both contiguous and non-contiguous data records. Our experimental results show that the proposed technique outperforms existing techniques substantially.
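
The abstract mentions a string matching algorithm but does not say which one or how it is applied. As a rough illustration only, the sketch below assumes that each candidate region of a page is represented as a sequence of HTML tag names and that similarity is measured by normalized Levenshtein (edit) distance. The function names, the tag-sequence representation, and the 0.3 threshold are all hypothetical and are not taken from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (e.g. lists of tag names)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer sequence length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def similar_adjacent_pairs(regions, threshold=0.3):
    """Indices i where regions[i] and regions[i + 1] look alike.

    The threshold is an assumed value chosen for illustration.
    """
    return [i for i in range(len(regions) - 1)
            if normalized_distance(regions[i], regions[i + 1]) <= threshold]


# Hypothetical tag sequences for three adjacent page regions.
regions = [
    ["tr", "td", "img", "td", "a"],
    ["tr", "td", "img", "td", "a", "b"],
    ["div", "p"],
]
matches = similar_adjacent_pairs(regions)
```

Under these assumptions, adjacent regions whose tag sequences differ by only a small fraction of their length would be treated as candidate data records of the same kind, while structurally different regions would not.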