Page-level template detection via isotonic smoothing. In WWW '07: Proceedings of the 16th international conference on World Wide Web (2007), pp. 61-70.
Notes for this article
Simple approach: hash fragment, check frequencies.
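A minimal sketch of the simple approach, under assumptions not in the notes: each page is already split into text fragments, SHA-1 is the hash, and a fragment counts as template if it appears on at least a given fraction of pages. The names `fragment_hash`, `frequent_fragment_hashes`, and `min_fraction` are hypothetical.

```python
import hashlib
from collections import Counter


def fragment_hash(text):
    """Hash a fragment after trimming whitespace (assumed normalization)."""
    return hashlib.sha1(text.strip().encode("utf-8")).hexdigest()


def frequent_fragment_hashes(pages, min_fraction=0.5):
    """Return hashes of fragments seen on at least min_fraction of the pages.

    pages: list of pages, each page a list of text fragments.
    """
    counts = Counter()
    for fragments in pages:
        # Count each distinct fragment at most once per page.
        counts.update({fragment_hash(f) for f in fragments})
    threshold = min_fraction * len(pages)
    return {h for h, c in counts.items() if c >= threshold}


pages = [
    ["Home | About", "Article one"],
    ["Home | About", "Article two"],
    ["Home | About", "Article three"],
]
template_hashes = frequent_fragment_hashes(pages)
```

Here the navigation fragment is flagged because it recurs on every page, while each article body appears only once.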
Approach here: site-level. No training data required. The function that is learned goes from DOM nodes to {template, non-template}. Features: placement, bgcolor, aspect ratio, link density, avg. sentence size, etc. Smoothing: all the children of a template are templates.
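The smoothing constraint in the note (children of a template are templates) can be illustrated with a simple top-down pass over a tree. This is only a sketch of the constraint on hard labels, not the paper's isotonic smoothing procedure; the `Node` class and `propagate_template` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    is_template: bool  # initial per-node label, e.g. from a classifier
    children: List["Node"] = field(default_factory=list)


def propagate_template(node, inherited=False):
    """Mark every descendant of a template node as template."""
    node.is_template = node.is_template or inherited
    for child in node.children:
        propagate_template(child, node.is_template)
    return node


link = Node("a", False)
nav = Node("div#nav", True, [link])
para = Node("p", False)
content = Node("div#content", False, [para])
root = propagate_template(Node("body", False, [nav, content]))
```

After the pass, the link inside the navigation block is labeled template because its parent is, while the content paragraph keeps its non-template label.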