| |
In Wikimedia Summer of Research (2011)
posted to open_access wrn2011 wrn201107
by WRN
on 2012-03-16 22:29:54
|
| |
In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (June 2011), pp. 97-102
posted to contribtb open_access wrn2011 wrn201107
by WRN
on 2012-03-16 22:29:54
Abstract
We present an open-source toolkit which allows (i) to reconstruct past states of Wikipedia, and (ii) to efficiently access the edit history of Wikipedia articles. Reconstructing past states of Wikipedia is a prerequisite for reproducing previous experimental work based on Wikipedia. Beyond that, the edit history of Wikipedia articles has been shown to be a valuable knowledge source for NLP, but access is severely impeded by the lack of efficient tools for managing the huge amount of provided data. By using ...
|
| |
In Proceedings of the 2011 ACM Web Science Conference - WebSci '11 (2011)
|
| |
In Proceedings of the Workshop on Language in Social Media (LSM 2011) (June 2011), pp. 48-57
posted to contribdt open_access wrn2011 wrn201107
by WRN
on 2012-03-16 22:29:54
Abstract
We present the AAWD corpus, a collection of 365 discussions drawn from Wikipedia talk pages and annotated with labels capturing two kinds of social acts: alignment moves and authority claims. We describe these social acts and our annotation process, and analyze the resulting data set for interactions between participant status and social acts and between the social acts themselves. ...
|
| |
In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM '11) (2011), pp. 177-184
posted to open_access wrn2011 wrn201107
by WRN
on 2012-03-16 22:29:54
Abstract
Talk pages play a fundamental role in Wikipedia as the place for discussion and communication. In this work we use the comments on these pages to extract and study three networks, corresponding to different kinds of interactions. We find evidence of a specific assortativity profile which differentiates article discussions from personal conversations. An analysis of the tree structure of the article talk pages allows to capture patterns of interaction, and reveals structural differences among the discussions about articles from different semantic ...
|
| |
Abstract
We present a new, efficient method for automatically detecting severe conflicts ('edit wars') in Wikipedia and evaluate this method on six different language WPs. We discuss how the number of edits, reverts, the length of discussions, the burstiness of edits and reverts deviate in such pages from those following the general workflow, and argue that earlier work has significantly over-estimated the contentiousness of the Wikipedia editing process. ...
|
| |
posted to open_access wrn2011 wrn201108
by WRN
on 2012-03-16 22:29:54
Abstract
We explore statistical properties of links within Wikipedia. We demonstrate that a simple algorithm can predict many of the links that would normally be added to a new article, without considering the topic of the article itself. We then explore a variant of topic-oriented PageRank, which can effectively identify topical links within existing articles, when compared with manual judgments of their topical relevance. Based on these results, we suggest that linkages within Wikipedia arise from a combination of structural requirements and ...
|
| |
posted to open_access wrn2011 wrn201108
by WRN
on 2012-03-16 22:29:54
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038573
Abstract
User generated content (UGC) constitutes a significant fraction of the Web. However, some wiki-based sites, such as Wikipedia, are so popular that they have become a favorite target of spammers and other vandals. In such popular sites, human vigilance is not enough to combat vandalism, and tools that detect possible vandalism and poor-quality contributions become a necessity. The application of machine learning techniques holds promise for developing efficient online algorithms for better tools to assist users in vandalism detection. We describe ...
|
| |
Abstract
Wikipedia provides an interesting amount of text for more than a hundred languages. This also includes languages where no reference corpora or other linguistic resources are easily available. We have extracted background language models built from the content of Wikipedia in various languages. The models generated from Simple and English Wikipedia are compared to language models derived from other established corpora. The differences between the models in regard to term coverage, term distribution and correlation are described and discussed. We provide access ...
|
| |
posted to open_access wrn2011 wrn201108
by WRN
on 2012-03-16 22:29:54
|
| |
In WikiSym 2011: Proceedings of the 7th International Symposium on Wikis (2011)
|
| |
Journal of the American Medical Informatics Association: JAMIA, Vol. 16, No. 4. (2009), pp. 471-479, doi:10.1197/jamia.M3059
Abstract
OBJECTIVE To determine the significance of the English Wikipedia as a source of online health information. DESIGN The authors measured Wikipedia's ranking on general Internet search engines by entering keywords from MedlinePlus, NHS Direct Online, and the National Organization of Rare Diseases as queries into search engine optimization software. We assessed whether article quality influenced this ranking. The authors tested whether traffic to Wikipedia coincided with epidemiological trends and news of emerging health concerns, and how it compares to MedlinePlus. MEASUREMENTS ...
|
| |
In Proceedings of the 13th International Conference of the International Society for Scientometrics & Informetrics (2011), pp. 794-800
posted to open_access wrn2011 wrn201108
by WRN
on 2012-03-16 22:29:53
|
| |
Abstract
Collaborative environments, such as Wikipedia, often have low barriers-to-entry in order to encourage participation. This accessibility is frequently abused (e.g., vandalism and spam). However, certain inappropriate behaviors are more threatening than others. In this work, we study contributions which are not simply "undone" but deleted from revision histories and public view. Such treatment is generally reserved for edits which: (1) present a legal liability to the host (e.g., copyright issues, defamation), or (2) present privacy threats to individuals (i.e., contact information). ...
|
| |
(2011)
posted to open_access wrn-201109 wrn2011
by WRN
on 2012-03-16 22:29:53
|
| |
In Program (2011)
posted to open_access wrn-201109 wrn2011
by WRN
on 2012-03-16 22:29:53
Abstract
Data provenance refers to the lineage or pedigree of data, including information such as its origin and key events that affect it over the course of its lifecycle. In recent years, provenance has become increasingly important as more and more people are using data that they themselves did not generate. Tracking data provenance helps ensure that data provided by many different providers and sources can be trusted and used appropriately. Data provenance also has several other critical uses, including data quality ...
|
| |
Kansas Journal of Medicine, Vol. 4, No. 3. (August 2011)
|
| |
Journal of Communication, Vol. 5 (2011), pp. 1138-1158
|
| |
In Proceedings of the 5th International Workshop on New Challenges in Distributed Information Filtering and Retrieval (DART 2011) (September 2011)
Abstract
This paper presents an empirical study about the temporal patterns characterizing the requests submitted by users to Wikipedia. The study is based on the analysis of the log lines registered by the Wikimedia Foundation Squid servers after having sent the appropriate content in response to users' requests. The analysis has been conducted regarding the ten most visited editions of Wikipedia and has involved more than 14,000 million log lines corresponding to the traffic of the entire year 2009. The conducted methodology ...
|
| |
Abstract
Wikipedia (WP) as a collaborative, dynamical system of humans is an appropriate subject of social studies. Each single action of the members of this society, i.e. editors, is well recorded and accessible. Using the cumulative data of 34 Wikipedias in different languages, we try to characterize and find the universalities and differences in temporal activity patterns of editors. Based on this data, we estimate the geographical distribution of editors for each WP in the globe. Furthermore we also clarify the differences ...
|
| |
In AAAI International Conference on Weblogs and Social Media (ICWSM '10) (2010)
|
| |
In Decade in Internet Time symposium (2011)
Abstract
In this paper we propose a theoretical framework to understand the evolving governance of internet-mediated social production. Specifically, we focus on the emergence of a collective capability that integrates knowledge relevant to large-scale production and coordination. Focusing on one of the most popular websites and reference tools, Wikipedia, we undertake an exploratory theoretical analysis to clarify the structure and mechanisms driving the endogenous change of a massive social production system. We argue that the standard transactions costs approach underpinning many extant ...
|
| |
(2011)
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
|
| |
In PAN 2011 (2011)
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
|
| |
arXiv (October 2011)
Abstract
This paper was originally designed as a literature review for a doctoral dissertation focusing on Wikipedia. This exposition gives the structure of Wikipedia and the latest trends in Wikipedia research. ...
|
| |
In Workshop on Visual Interfaces to the Social and Semantic Web (VISSW2011) (2011)
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
Abstract
Wikipedia is emerging as the dominant global knowledge repository. Recently, large numbers of Wikipedia users have collaborated to produce more structured information in the online encyclopedia. For example, the information found in tables, categories and infoboxes. Infoboxes contain key-value pairs, manually appended to articles based on the unstructured text therein. The wiki contains some structured information which can be crawled by DBpedia 2, which attempts to organize wiki data into a database of subject-predicate-object triples. By leveraging this data we ...
|
| |
In CIKM '11 (2011)
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
|
| |
(2011)
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
Abstract
In this thesis, I examine the Articles for Deletion (AfD) system in Wikipedia, a large-scale collaborative editing project. Articles in Wikipedia can be nominated for deletion by registered users, who are expected to cite criteria for deletion from the Wikipedia deletion policy. For example, an article can be nominated for deletion if there are any copyright violations, vandalism, advertising or other spam without relevant content. Articles whose subject matter does not meet the notability criteria ...
|
| |
Journal of Documentation (2008)
|
| |
Journal of the American Society for Information Science and Technology, Vol. 62, No. 1. (January 2011), pp. 117-132, doi:10.1002/asi.21423
posted to open_access wrn2011 wrn201110
by WRN
on 2012-03-16 22:29:50
Abstract
This paper aims to review the fiercely discussed question of whether the ranking of Wikipedia articles in search engines is justified by the quality of the articles. After an overview of current research on information quality in Wikipedia, a summary of the extended discussion on the quality of encyclopedic entries in general is given. On this basis, a heuristic method for evaluating Wikipedia entries is developed and applied to Wikipedia articles that scored highly in a search engine retrieval effectiveness test ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038595
Abstract
Researchers have used Wikipedia data to identify a wide range of antecedents to success in collective production. But we have not yet inquired whether collective production creates those public goods which bring most value-add from a social perspective. In this poster I explore two key circumstances in which collective production can fail to respond to social need: when goods fail to attain high quality despite (1) high demand or (2) explicit designation by producers as highly important. In the context of ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038593
Abstract
We present results on a study of two levels of Wikipedia's article deletion process: speedy deletions (or CSDs) and articles for deletions (or AfDs). Our findings indicate that the deletion process is heavily frequented by a relatively small number of longstanding users. In analyzing the rationales given for such deletions, it is apparent that the vast majority of such deleted articles are not spam, vandalism, or 'patent nonsense,' but rather articles which could be considered encyclopedic, but do not fit the ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038597
Abstract
WikiProject Countering Systemic Bias consists of a small group of English-language Wikipedia editors attempting to counterbalance Western-leaning content on the site. A population survey of members of this WikiProject is currently underway and will be followed by online interviews with select editors. This poster will present preliminary findings from the survey and interviews in order to understand how this group perceives bias on Wikipedia and how they work together to fight it. ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038598
Abstract
This poster will present preliminary results of a study that considers the efforts of WikiProject Countering Systemic Bias, a collective of editors dedicated to combating bias on the English-language Wikipedia. Through a content analysis comparing the project to a sample from the general population, the scope of this group's labor is gauged and discussed. ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038586
Abstract
The continuous success of Wikipedia depends upon its capability to recruit and engage new editors, especially those with new knowledge and perspectives. Yet Wikipedia over the years has become a complicated bureaucracy that may be difficult for newcomers to navigate. Mentoring is a practice that has been widely used in offline organizations to help new members adjust to their roles. In this paper, we draw insights from the offline mentoring literature to analyze mentoring practices in Wikipedia and how they influence ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038613
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038610
Abstract
This panel seeks to begin a discussion of how we can meaningfully compare and contrast the diverse instances of open collaboration and peer production employed on the Internet today. Current research on the topic has tended to be too platform- (e.g. Wikipedia) or domain- (e.g. open source) specific. The panelists will be tasked with addressing this problem by bringing their own expertise and research projects to bear on the issue. Ultimately, the panel will seek to lay the foundations ...
|
| |
In Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11 (October 2011), doi:10.1145/2038558.2038612
Abstract
This workshop has three key goals. First, we will examine existing and proposed systems for collecting and analyzing the research literature about wikis. Second, we will discuss the challenges in building such a system and will engage participants to design a sustainable collaborative system to achieve this goal. Finally, we will provide a forum to build upon ongoing wiki community discussions about problems and opportunities in finding and sharing the wiki research literature. ...
|
| |
In MetaWiki (2011)
posted to open_access wrn2011 wrn201111
by WRN
on 2012-03-16 22:29:50
|