<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xmlns="http://purl.org/rss/1.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
   xmlns:dcterms="http://purl.org/dc/terms/"

>
<channel rdf:about="http://www.citeulike.org/about">
<pubDate>Thu, 21 Aug 2008 06:53:09 BST</pubDate>


	<title>CiteULike: parmentierf's statistics</title>
	<description>CiteULike: parmentierf's statistics</description>


	<link>http://www.citeulike.org/user/parmentierf/tag/statistics</link>
	<dc:publisher>CiteULike.org</dc:publisher>
	<dc:language>en-gb</dc:language>
	<dc:rights>Copyright &#169; 2004-2008 citeulike.org</dc:rights>
	<items>
    <rdf:Seq>
        <rdf:li rdf:resource="http://www.citeulike.org/user/parmentierf/article/838590"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/parmentierf/article/312476"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/parmentierf/article/4487"/>

    </rdf:Seq>
	</items>
</channel>


<item rdf:about="http://www.citeulike.org/user/parmentierf/article/838590">
    <title>Exploring Large Document Collections using Statistical Topic Models</title>
    <link>http://www.citeulike.org/user/parmentierf/article/838590</link>
    <description>&lt;i&gt;(2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;We will demonstrate the topic model, a recent unsupervised learning technique that uses a statistical model to discover topics in a large collection of text documents. The first demonstration illustrates how the topic model automatically learns about the spectrum of research conducted by faculty members at UC Irvine and UC San Diego, how to topically characterize each researcher’s interests, and how to find researchers with similar interests – all in a completely unsupervised fashion. The second demonstration illustrates how medical researchers may use topic modeling to find new connections between genes and brain regions based on a large collection of articles on schizophrenia.</description>
    <dc:title>Exploring Large Document Collections using Statistical Topic Models</dc:title>

    <dc:creator>David Newman</dc:creator>
    <dc:creator>Arthur Asuncion</dc:creator>
    <dc:creator>Chaitanya Chemudugunta</dc:creator>
    <dc:creator>Vasanth Kumar</dc:creator>
    <dc:creator>Padhraic Smyth</dc:creator>
    <dc:creator>Mark Steyvers</dc:creator>
    <dc:source>(2006)</dc:source>
    <dc:date>2006-09-11T07:03:28-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:category>automatic-learning</prism:category>
    <prism:category>data-processing</prism:category>
    <prism:category>statistics</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/parmentierf/article/312476">
    <title>Unsupervised learning of natural languages.</title>
    <link>http://www.citeulike.org/user/parmentierf/article/312476</link>
    <description>&lt;i&gt;Proc Natl Acad Sci U S A, Vol. 102, No. 33. (16 August 2005), pp. 11629-11634.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.</description>
    <dc:title>Unsupervised learning of natural languages.</dc:title>

    <dc:creator>Z Solan</dc:creator>
    <dc:creator>D Horn</dc:creator>
    <dc:creator>E Ruppin</dc:creator>
    <dc:creator>S Edelman</dc:creator>
    <dc:identifier>doi:10.1073/pnas.0409746102</dc:identifier>
    <dc:source>Proc Natl Acad Sci U S A, Vol. 102, No. 33. (16 August 2005), pp. 11629-11634.</dc:source>
    <dc:date>2005-09-07T07:16:40-00:00</dc:date>
    <prism:publicationYear>2005</prism:publicationYear>
    <prism:publicationName>Proc Natl Acad Sci U S A</prism:publicationName>
    <prism:issn>0027-8424</prism:issn>
    <prism:volume>102</prism:volume>
    <prism:number>33</prism:number>
    <prism:startingPage>11629</prism:startingPage>
    <prism:endingPage>11634</prism:endingPage>
    <prism:category>automatic-learning</prism:category>
    <prism:category>chatterbots</prism:category>
    <prism:category>statistics</prism:category>
    <prism:category>unsupervised</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/parmentierf/article/4487">
    <title>Automatic Meaning Discovery Using Google</title>
    <link>http://www.citeulike.org/user/parmentierf/article/4487</link>
    <description>&lt;i&gt;(21 December 2004)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;We propose a new method to extract semantic knowledge from the world-wide-web for both supervised and unsupervised learning using the Google search engine in an unconventional manner. The approach is novel in its unrestricted problem domain, simplicity of implementation, and manifestly ontological underpinnings. We give evidence of elementary learning of the semantics of concepts, in contrast to most prior approaches. The method works as follows: The world-wide-web is the largest database on earth, and it induces a probability mass function, the Google distribution, via page counts for combinations of search queries. This distribution allows us to tap the latent semantic knowledge on the web. Shannon's coding theorem is used to establish a code-length associated with each search query. Viewing this mapping as a data compressor, we connect to earlier work on Normalized Compression Distance. We give applications in (i) unsupervised hierarchical clustering, demonstrating the ability to distinguish between colors and numbers, and to distinguish between 17th century Dutch painters; (ii) supervised concept-learning by example, using Support Vector Machines, demonstrating the ability to understand electrical terms, religious terms, emergency incidents, and by conducting a massive experiment in understanding WordNet categories; and (iii) matching of meaning, in an example of automatic English-Spanish translation.</description>
    <dc:title>Automatic Meaning Discovery Using Google</dc:title>

    <dc:creator>Rudi Cilibrasi</dc:creator>
    <dc:creator>Paul Vitanyi</dc:creator>
    <dc:source>(21 December 2004)</dc:source>
    <dc:date>2004-12-22T12:39:20-00:00</dc:date>
    <prism:publicationYear>2004</prism:publicationYear>
    <prism:category>ai</prism:category>
    <prism:category>automatic-learning</prism:category>
    <prism:category>metrics</prism:category>
    <prism:category>statistics</prism:category>
</item>



</rdf:RDF>

