CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Tags

Diverse dimension decomposition for itemset spaces

by: Mikalai Tsytsarau, Francesco Bonchi, Aristides Gionis, Themis Palpanas
Knowledge and Information Systems (27 June 2012), pp. 1-27, doi:10.1007/s10115-012-0518-5  Key: citeulike:10843300

Formatted Citation


Show HTML

Likes (beta)

This copy of the article hasn't been liked by anyone yet.

View FullText article


Abstract

We introduce the problem of diverse dimension decomposition in transactional databases, where a dimension is a set of mutually exclusive itemsets. The problem we consider requires to find a decomposition of the itemset space into dimensions, which are orthogonal to each other and which provide high coverage of the input database. The mining framework we propose can be interpreted as a dimensionality-reducing transformation from the space of all items to the space of orthogonal dimensions. Relying on information-theoretic concepts, we formulate the diverse dimension decomposition problem with a single objective function that simultaneously captures constraints on coverage, exclusivity, and orthogonality. We show that our problem is NP -hard, and we propose a greedy algorithm exploiting the well-known FP-tree data structure. Our algorithm is equipped with strategies for pruning the search space deriving directly from the objective function. We also prove a property that allows assessing the level of informativeness for newly added dimensions, thus allowing to define criteria for terminating the decomposition. We demonstrate the effectiveness of our solution by experimental evaluation on synthetic datasets with known dimension and three real-world datasets, flickr , del.icio.us and dblp . The problem we study is largely motivated by applications in the domain of collaborative tagging; however, the mining task we introduce in this paper is useful in other application domains as well.


RafG's tags for this article

Citations (CiTO)

No CiTO relationships defined

X There are no reviews yet

X Find related articles with these CiteULike tags

X Posting History


X Export records

Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.