| |
In Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on (2008), pp. 646-655.
Abstract
The rapid growth of Web communities has motivated many solutions for building community data portals. These solutions follow roughly two approaches. The first approach (e.g., Libra, Citeseer, Cimple) employs semi-automatic methods to extract and integrate data from a multitude of data sources. The second approach (e.g., Wikipedia, Intellipedia) deploys an initial portal in wiki format, then invites community members to revise and add material. In this paper we consider combining the above two approaches to building community portals. The new hybrid ...
|
| |
(10 May 2009)
Abstract
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and ...
|
| |
(21 Dec 2005)
Abstract
This paper presents a novel analysis and visualization of English Wikipedia data. Our specific interest is the analysis of basic statistics, the identification of the semantic structure and age of the categories in this free online encyclopedia, and the content coverage of its highly productive authors. The paper starts with an introduction of Wikipedia and a review of related work. We then introduce a suite of measures and approaches to analyze and map the semantic structure of Wikipedia. The results show that co-occurrences of categories within individual articles ...
|
| |
SIGIR Forum, Vol. 40, No. 1. (June 2006), pp. 64-69.
|
| |
In Biannual Conference of the Society for Computational Linguistics and Language Technology (2007)
Abstract
We analyze Wikipedia as a lexical semantic resource and compare it with conventional resources, such as dictionaries, thesauri, semantic wordnets, etc. Different parts of Wikipedia reflect different aspects of these resources. We show that Wikipedia contains a vast amount of knowledge about, e.g., named entities, domain specific terms, and rare word senses. If Wikipedia is to be used as a lexical semantic resource in large-scale NLP tasks, efficient programmatic access to the knowledge therein is required. We review existing access mechanisms ...
|
| |
In The Social Semantic Web 2007 - Proceedings of the 1st Conference on Social Semantic Web
Abstract
Enhancing Wikipedia by means of semantic representations seems to be a promising issue. From a formal or technical point of view there are no major obstacles in the way. Nevertheless, a close look at Wikipedia, its structure and contents reveals that some questions have to be answered in advance. This paper will deal with these questions and present some first results based on empirical findings. ...
|
| |
Proceedings of the 20th International Joint Conference on Artificial Intelligence (2007), pp. 6-12.
|
| |
Natural Language Processing and Information Systems: 10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005, Alicante, Spain, June 15-17, 2005: Proceedings (2005)
|
| |
Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (2006), pp. 192-199.
|
| |
First Workshop on Semantic Wikis
|
| |
Abstract
Wikipedia is the world's largest collaboratively edited source of encyclopaedic knowledge. But in spite of its utility, its contents are barely machine-interpretable. Structural knowledge, e.g., about how concepts are interrelated, can neither be formally stated nor automatically processed. Also the wealth of numerical data is only available as plain text and thus cannot be processed by its actual meaning. ...
|
| |
Advances in Web Intelligence (2005), pp. 380-386.
Abstract
We describe an approach taken for automatically associating entries from an on-line encyclopedia with concepts in an ontology or a lexical semantic network. It has been tested with the Simple English Wikipedia and WordNet, although it can be used with other resources. The accuracy in disambiguating the sense of the encyclopedia entries reaches 91.11% (83.89% for polysemous words). It will be applied to enriching ontologies with encyclopedic knowledge. ...
|
| |
(2005)
Abstract
Wikipedia is the biggest collaboratively created source of encyclopaedic knowledge. Growing beyond the borders of any traditional encyclopaedia, it is facing new problems of knowledge management: The current excessive usage of article lists and categories witnesses the fact that 19th century content organization technologies like inter-article references and indices are no longer sufficient for today's needs. ...
|
| |
Natural Language Processing and Information Systems (2007), pp. 48-60.
Abstract
Compared with plain-text resources, the ones in “semi-semantic” web sites, such as Wikipedia, contain high-level semantic information which will benefit various automatic annotation tasks on themselves. In this paper, we propose a “collaborative annotating” approach to automatically recommend categories for a Wikipedia article by reusing category annotations from its most similar articles and ranking these annotations by their confidence. In this approach, four typical semantic features in Wikipedia, namely incoming link, outgoing link, section heading and template item, are investigated and ...
|
| |
In CIKM '07: Proceedings of the sixteenth ACM Conference on Information and Knowledge Management (2007), pp. 41-50.
Abstract
Berners-Lee's compelling vision of a Semantic Web is hindered by a chicken-and-egg problem, which can be best solved by a bootstrapping method - creating enough structured data to motivate the development of applications. This paper argues that autonomously "Semantifying Wikipedia" is the best way to solve the problem. We choose Wikipedia as an initial data source, because it is comprehensive, not too large, high-quality, and contains enough manually-derived structure to bootstrap an autonomous, self-supervised process. We identify several types of structures ...
|
| |
In WWW '06: Proceedings of the 15th international conference on World Wide Web (2006), pp. 585-594.
Abstract
Wikipedia is the world's largest collaboratively edited source of encyclopaedic knowledge. But in spite of its utility, its contents are barely machine-interpretable. Structural knowledge, e.g., about how concepts are interrelated, can neither be formally stated nor automatically processed. Also the wealth of numerical data is only available as plain text and thus cannot be processed by its actual meaning. We provide an extension to be integrated in Wikipedia, that allows the typing of links between articles and the specification of typed ...
|
| |
Computer Journal, Vol. 48, No. 1., 126.
|
| |
(17 August 1999)
Abstract
Sowa integrates logic, philosophy, linguistics, and computer science into this study of knowledge and its various models and implementations. His definitive new book shows how techniques of artificial intelligence, database design, and object-oriented programming help make knowledge explicit in a form that computer systems can use. ...
|
| |
(01 March 2005)
Abstract
As the World Wide Web continues to expand, it becomes increasingly difficult for users to obtain information efficiently. Because most search engines read format languages such as HTML or SGML, search results reflect formatting tags more than actual page content, which is expressed in natural language. Spinning the Semantic Web describes an exciting new type of hierarchy and standardization that will replace the current "web of links" with a "web of meaning." Using a flexible set of languages and tools, the ...
|
| |
Electronic Government, an International Journal, Vol. 3, No. 1. (2006), pp. 36-55.
Abstract
E-government webs are among the largest webs in existence, based on the size, number of users and number of information providers. Thus, creating a Semantic Web infrastructure to meaningfully organise e-government webs is highly desirable. At the same time, the complexity of the existing e-government implementations also challenges the feasibility of Semantic Web creation. We therefore propose the design of a two-layer semantic Wiki web, which consists of a content Wiki, largely identical to the traditional web and a semantic layer, ...
|
| |
Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 3, No. 2-3. (October 2005), pp. 211-223.
Abstract
We present the Flink system for the extraction, aggregation and visualization of online social networks. Flink employs semantic technology for reasoning with personal information extracted from a number of electronic information sources including web pages, emails, publication archives and FOAF profiles. The acquired knowledge is used for the purposes of social network analysis and for generating a web-based presentation of the community. We demonstrate our novel method to social science based on electronic data using the example of the Semantic Web ...
|
| |
In Proceedings of the I-KNOW 2005. 5th International Conference on Knowledge Management (2005)
|
| |
In Poster at the International Semantic Web Conference ISWC 2004 (2004)
|
| |
The Learning Organization: An International Journal, Vol. 12, No. 5. (May 2005), pp. 402-410.
|
| |
IBM developerworks (18 October 2005)
Abstract
Ontologies form the backbone of a whole new way to understand online data. [This article explores] the basics of Semantic Web technologies as Naveen Balani shows you how organizations can leverage ontology-based development. The Semantic Web can aid effective knowledge management and cost-effective product life cycle automation for faster development and integration processes. ...
|
| |
(06 April 2005)
Abstract
The long-term goal of the SESAME network of excellence is to overcome the fragmentation of the European research landscape in the area of Mathematical Knowledge Management. We intend to develop, implement, and provide semantic-based and context-aware techniques for acquiring, organizing, processing, sharing and using knowledge in Sciences, Technology, Engineering and Mathematical disciplines (STEM) to support research, education, and technology application. The SESAME network hopes to achieve this goal by deepening the understanding of semantic methods for knowledge representation and management currently under development for the field ...
|
| |
11th Asia-Pacific Software Engineering Conference (APSEC'04) (30 November 2004), pp. 384-391.
Abstract
The Semantic Web envisioned by Tim Berners-Lee garnered much excitement from the research community due to its huge potential in enabling machine-readable web contents. Studies have shown that using semantic web technologies, such as the Resource Description Framework (RDF), not only effectively adds meaning to content but also increases the relevancy of search results. This paper investigates the efficient use of the Semantic Web initiatives for Corporate Portal: the focus being the simplification of the content annotation process. It proposes ...
|
| |
(05 December 2003)
Abstract
This book constitutes the refereed proceedings of the Second International Semantic Web Conference, ISWC 2003, held at Sanibel Island, Florida, USA in October 2003. The 58 revised full papers presented were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on foundations; ontological reasoning; semantic Web services; security, trust, and privacy; agents and the semantic Web; information retrieval; multimedia; tools and methodologies; applications; and industrial perspectives. ...
|
| |
Lecture Notes in Computer Science, Vol. 2569 (January 2002), pp. 189-200.
|
| |
(19 April 2005)
Abstract
The revised papers presented in this book are drawn from two meetings devoted to the Semantic Web and the legal domain: The International Workshop on Legal Ontologies and Web-Based Legal Information Management held in Edinburgh, UK in June 2003, and the International Seminar on Law and the Semantic Web, held in Barcelona, Spain in November 2003. This book presents 15 thoroughly refereed revised papers on topics relevant for law and the Semantic Web. The book is structured into three parts. Part ...
|
| |
Abstract
This book constitutes the refereed proceedings of the 4th International Semantic Web Conference, ISWC 2005, held in Galway, Ireland, in November 2005. The 54 revised full academic papers and 17 revised industrial papers presented together with abstracts of 3 invited talks were carefully reviewed and selected from a total of 217 submitted papers to the academic track and 30 to the industrial track. The research papers address all current issues in the field of the semantic Web, ranging from theoretical aspects ...
|
| |
(23 November 2004)
Abstract
This book constitutes the refereed proceedings of the Third International Semantic Web Conference, ISWC 2004, held in Hiroshima, Japan in November 2004. The 55 revised full papers presented together with abstracts of 2 invited talks were carefully reviewed and selected from a total of 227 submitted papers. The papers are organized in topical sections on data semantics, p2p systems, semantic Web mining, tools and methodologies for Web agents, user interfaces and visualization, large scale knowledge management, semantic Web services, inference, searching ...
|
| |
Abstract
Just like the industrial society of the last century depended on natural resources, today’s society depends on information and its exchange. Semantic Web technologies address the problem of information complexity by providing advanced support for representing and processing distributed information, while peer-to-peer technologies address issues of system complexity by allowing flexible and decentralized information storage and processing. Systems that are based on Semantic Web and peer-to-peer technologies promise to combine the advantages of the two mechanisms. A peer-to-peer style architecture ...
|
| |
(16 July 2004)
Abstract
The large-scale and almost ubiquitous availability of information has become as much of a curse as it is a blessing. The more information is available, the harder it is to locate any particular piece of it. And even when it has been successfully found, it is even harder still to usefully combine it with other information we may already possess. This problem occurs at many different levels, ranging from the overcrowded disks of our own PCs to the mass of unstructured ...
|
| |
In Fourth International Semantic Web Conference (ISWC 2005)
Abstract
We extend the traditional bipartite model of ontologies with the social dimension, leading to a tripartite model of actors, concepts and instances. We demonstrate the application of this representation by showing how community-based semantics emerges from this model through a process of graph transformation. We illustrate ontology emergence by two case studies, an analysis of a large scale folksonomy system and a novel method for the extraction of community-based ontologies from Web pages. ...
|
| |
AMIA Annu Symp Proc (November 2003), pp. 351-355.
Abstract
The Unified Medical Language System is an extensive source of biomedical knowledge developed and maintained by the US National Library of Medicine (NLM) and is currently being used in a wide variety of biomedical applications. The Semantic Network, a component of the UMLS, is a structured description of core biomedical knowledge consisting of well defined semantic types and relationships between them. We investigate the expressiveness of DAML+OIL, a markup language proposed for ontologies on the Semantic Web, for representing the knowledge ...
|
| |
Comput. Networks, Vol. 33, No. 1-6. (2000), pp. 473-491.
|
| |
Intelligent Systems, IEEE [see also IEEE Expert], Vol. 17, No. 1. (2002), pp. 78-86.
Abstract
The article discusses ways to let semantics emerge from simple observations from the bottom-up, rather than imposing concepts on the observations top-down, to provide precise query, retrieval, communication or translation for a wide variety of applications. The following areas are examined: image retrieval and databases; media information spaces including the Semantic Web and MPEG frameworks; language games for emergent semantics; and emergent semantics for ontologies ...
|
| |
Intelligent Systems, IEEE [see also IEEE Intelligent Systems and Their Applications], Vol. 16, No. 2. (2001), pp. 72-79.
Abstract
The Semantic Web relies heavily on formal ontologies to structure data for comprehensive and transportable machine understanding. Thus, the proliferation of ontologies factors largely in the Semantic Web's success. The authors present an ontology learning framework that extends typical ontology engineering environments by using semiautomatic ontology construction tools. The framework encompasses ontology import, extraction, pruning, refinement and evaluation. ...
|
| |
(2002)
Abstract
In a peer-to-peer (P2P) system, nodes typically connect to a small set of random nodes (their neighbors), and queries are propagated along these connections. Such query flooding tends to be very expensive. We propose that node connections be influenced by content, so that for example, nodes having many "Jazz" files will connect to other similar nodes. Thus, semantically related nodes form a Semantic Overlay Network (SON). Queries are routed to the appropriate SONs, increasing the chances that ...
|
| |
(2002)
Abstract
The Resource Description Framework (RDF) is designed to support agent communication on the Web, but it is also suitable as a framework for modeling and storing personal information. Haystack is a personalized information repository that employs RDF in this manner. This flexible semistructured data model is appealing for several reasons. First, RDF supports ontologies created by the user and tailored to the user’s needs. At the same time, system ontologies can be specified and evolved to support a variety of high-level functionalities such as flexible organization schemes, semantic querying, and ...
|
| |
ACM (2004)
Abstract
Much past HCI research has examined the usability concerns of information management software for specific domains such as object-oriented software design, e-mail, and the Web. We believe that many of the results uncovered by these studies are applicable across multiple domains but that more broadly-scoped experiments require a system that can integrate multiple data sources. Haystack is a general-purpose information management environment designed to attack this very problem. Haystack’s user interface, which incorporates capabilities from previous research such as context-specific visualization paradigms and attribute-based categorization, is built upon a highly expressive semistructured data ...
|
| |
Commun. ACM, Vol. 47, No. 12. (December 2004), pp. 47-52.
|