Fully Distributed EM for Very Large Datasets (2008)
Abstract
In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of the parameters in a single node for the M-step can be impractical. We present a framework that fully distributes the entire EM procedure. Each node interacts only with parameters relevant to its data, sending messages to other nodes along a junction-tree topology. We demonstrate improvements over a MapReduce topology, on two tasks: word alignment and topic modeling.
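
The first claim of the abstract, that E-step computations distribute easily, can be illustrated with a minimal sketch. It assumes IBM Model 1 word alignment as the model, and every function name in it (e_step, merge, m_step, init_uniform, em) is hypothetical. It is not the paper's method: the counts are merged and normalized in one place, which is the centralized M-step the abstract describes as impractical at scale, and no junction-tree message passing is implemented. The sketch only shows that each shard touches just the parameters for word pairs occurring in its own data.

```python
from collections import defaultdict


def init_uniform(shards):
    # One parameter t[(e, f)] per co-occurring word pair; uniform start.
    pairs = {(e, f)
             for shard in shards
             for src, tgt in shard
             for e in src
             for f in tgt}
    return {pair: 1.0 for pair in pairs}


def e_step(shard, t):
    # Expected alignment counts for one shard of (source, target) sentence
    # pairs. Only parameters for word pairs present in this shard are read.
    counts = defaultdict(float)
    for src, tgt in shard:
        for f in tgt:
            z = sum(t[(e, f)] for e in src)
            for e in src:
                counts[(e, f)] += t[(e, f)] / z
    return counts


def merge(count_dicts):
    # Sum the sparse expected counts produced by all shards.
    total = defaultdict(float)
    for counts in count_dicts:
        for key, value in counts.items():
            total[key] += value
    return total


def m_step(counts):
    # Renormalize so that the values t[(e, f)] sum to 1 over f for each e.
    totals = defaultdict(float)
    for (e, _f), value in counts.items():
        totals[e] += value
    return {(e, f): value / totals[e] for (e, f), value in counts.items()}


def em(shards, iterations=5):
    # Shards are processed sequentially here; because sentence pairs are
    # independent given t, each e_step call could run on a separate node.
    t = init_uniform(shards)
    for _ in range(iterations):
        t = m_step(merge(e_step(shard, t) for shard in shards))
    return t
```

Calling em on a list of shards, where each shard is a list of (source_words, target_words) pairs, returns a dictionary of translation parameters. Splitting the same data into more or fewer shards should give the same parameters up to floating-point rounding, since the expected counts are simply summed.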