
Fully Distributed EM for Very Large Datasets

by: Jason Wolfe, Aria Haghighi, Dan Klein
(2008)



Abstract

In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of the parameters in a single node for the M-step can be impractical. We present a framework that fully distributes the entire EM procedure. Each node interacts only with parameters relevant to its data, sending messages to other nodes along a junction-tree topology. We demonstrate improvements over a MapReduce topology, on two tasks: word alignment and topic modeling.
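
As a rough illustration of the MapReduce-style baseline the abstract compares against (not the paper's junction-tree framework, which is not reproduced here), the sketch below runs EM for IBM Model 1 word alignment over sharded data. Each shard computes expected counts independently in the E-step and touches only the parameters for word pairs that occur in its own sentences. A central reduce and M-step then combine them, which is the single-node step the abstract describes as a bottleneck. The choice of IBM Model 1, the function names, and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def e_step(shard, t):
    """Expected (f, e) alignment counts for one shard under IBM Model 1.

    Only parameters for word pairs co-occurring in this shard are read.
    """
    counts = defaultdict(float)
    for f_sent, e_sent in shard:
        for f in f_sent:
            z = sum(t[(f, e)] for e in e_sent)  # normalizer over candidate sources
            for e in e_sent:
                counts[(f, e)] += t[(f, e)] / z
    return counts


def reduce_counts(partials):
    """Sum the sparse expected counts produced by all shards."""
    total = defaultdict(float)
    for part in partials:
        for key, c in part.items():
            total[key] += c
    return total


def m_step(counts):
    """Re-estimate t(f | e) by normalizing expected counts per source word e."""
    norm = defaultdict(float)
    for (f, e), c in counts.items():
        norm[e] += c
    return {(f, e): c / norm[e] for (f, e), c in counts.items()}


def init_uniform(shards):
    """Uniform t(f | e) over all co-occurring word pairs."""
    pairs = {
        (f, e)
        for shard in shards
        for f_sent, e_sent in shard
        for f in f_sent
        for e in e_sent
    }
    f_vocab = {f for f, _ in pairs}
    return {pair: 1.0 / len(f_vocab) for pair in pairs}


def run_em(shards, iterations=10):
    t = init_uniform(shards)
    for _ in range(iterations):
        # In a real deployment each e_step call would run on its own node.
        partials = [e_step(shard, t) for shard in shards]
        t = m_step(reduce_counts(partials))
    return t


# Hypothetical toy corpus split across two shards (illustrative data only).
toy_shards = [
    [(["das", "haus"], ["the", "house"]), (["das", "buch"], ["the", "book"])],
    [(["ein", "buch"], ["a", "book"])],
]
t = run_em(toy_shards)
```

Because the expected counts are sums over data items, splitting the corpus into shards gives the same estimates as processing it on one node, up to floating-point rounding. In this sketch the reduce and M-step still gather every parameter in one place, which is the limitation the abstract sets out to remove.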

