| |
posted to all data-reuse data-sharing ethics
by hpiwowar
on 2008-11-28 23:16:45
|
| |
Abstract
The information seeking behavior of astronomers, chemists, mathematicians, and physicists at the University of Oklahoma was assessed using an electronically distributed questionnaire. All of the scientists surveyed relied greatly on the journal literature to support their research and creative activities. The mathematicians surveyed indicated an additional reliance on monographs, preprints, and attendance at conferences and personal communication to support their research activities. Similarly, all scientists responding scanned the latest issues of journals to keep abreast of current developments in their fields, ...
|
| |
PS: Political Science and Politics, Vol. 28, No. 3. (1995), pp. 444-452, doi:10.2307/420301
|
| |
Abstract
I show herein how to write a publishable paper by beginning with the replication of a published article. This strategy seems to work well for class projects in producing papers that ultimately get published, helping to professionalize students into the discipline, and teaching the scientific norms of the free exchange of academic information. I begin by briefly revisiting the prominent debate on replication our discipline had a decade ago and some of the progress made in data sharing since. My deepest appreciation ...
|
| |
In eScience All Hands Meeting (2004)
|
| |
In Designing for Usability in e-Science (2006)
by Simon J. Coles, Jeremy G. Frey, Michael B. Hursthouse, Andrew J. Milsted, Leslie A. Carr, Christopher J. Gutteridge, Liz Lyon, Rachel Heery, Monica Duke, Traugott Koch, Michael Day
posted to all case-study data-reuse
by hpiwowar
on 2008-10-25 14:35:29
|
| |
posted to all data-reuse data-sharing
by hpiwowar
on 2008-10-25 14:29:00
Abstract
This paper describes an ongoing collaborative effort across digital library and scientific communities in the UK to improve access to research data. A prototype demonstrator service supporting the discovery and retrieval of detailed results of crystallography experiments has been deployed within an Open Archives digital library service model. Early challenges include the understanding of requirements in this specialized area of chemistry and reaching consensus on the design of a metadata model and schema. Future plans encompass the exploration of commonality and ...
|
| |
Abstract
Contact: jdwrenatgmail.com ...
|
| |
Journal of Science Communication
Abstract
From the life sciences to the physical sciences, chemistry to archaeology, the last 25 years have brought an unprecedented shift in the way research happens day to day. The traditional cycles of research, beginning with a study of the relevant journal articles and books, moving into experimental design and data gathering, hypothesis formulation and testing, and finally republishing new knowledge into the scholarly canon, remain in place. But the quantity of information now available at each step of the cycle has exploded, and the average scientist ...
|
| |
International Journal of Digital Curation, Vol. 3, No. 1. (2008)
Abstract
An article considering the changes afoot in the world of Science and how the exponentially increasing amounts of recorded data are affecting the way in which scientists now work, for example with data mining. Changes in the way that resources become obsolete are also discussed and how more value must be placed on the work of professionals in digital curation. ...
|
| |
Journal of applied crystallography, Vol. 34, No. 3. (2001), pp. 375-380
Abstract
Citation analysis has been widely used to quantify the influence of research articles on the development of science. This paper reports a citation analysis of ten highly cited papers associated with the Cambridge Crystallographic Data Centre (CCDC), covering the variation of citation with time, the journals in which citations occur, and the types of organization and the geographic regions that use the Cambridge Structural Database. The ten most highly cited papers, comprising four database descriptions (CSD), two geometrical tabulations (TAB) and ...
|
| |
Abstract
An important set of challenges for eScience initiatives and digital libraries concerns the need to provide scientists with the ability to access data from multiple sources. This paper argues that an analysis of scientists' reuse of data prior to the advent of eScience can illuminate the requirements and design of digital libraries and cyberinfrastructure. As part of a larger study on data sharing and reuse, I investigated the processes by which ecologists locate data that were initially collected by others. Ecological ...
|
| |
Abstract
Modern science is increasingly collaborative, as signaled by rising numbers of coauthored papers, papers with international coauthors, and multi-investigator grants. Historically, scientific collaborations were carried out by scientists in the same physical location--the Manhattan Project of the 1940s, for example, involved thousands of scientists gathered on a remote plateau in Los Alamos, New Mexico. Today, information and communication technologies allow cooperation among scientists from far-flung institutions and different disciplines. Scientific Collaboration on the Internet provides both broad and in-depth views of ...
|
| |
Abstract
This article analyzes the experiences of ecologists who used data they did not collect themselves. Specifically, the author examines the processes by which ecologists understand and assess the quality of the data they reuse, and investigates the role that standard methods of data collection play in these processes. Standardization is one means by which scientific knowledge is transported from local to public spheres. While standards can be helpful, the results show that knowledge of the local context is critical to ecologists' ...
|
| |
Neuroinformatics, Vol. 5, No. 3. (2007), pp. 154-160
Abstract
The computer-assisted three-dimensional reconstruction of neuronal morphology is becoming an increasingly popular technique to quantify the arborization patterns of dendrites and axons. The resulting digital files are suitable for comprehensive morphometric analyses as well as for building anatomically realistic compartmental models of membrane biophysics and neuronal electrophysiology. The digital tracings acquired in a lab for a specific purpose can be often re-used by a different research group to address a completely unrelated scientific question, if the original investigators are willing to ...
|
| |
Abstract
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information archives and freely distributes high-throughput molecular abundance data, predominantly gene expression data generated by DNA microarray technology. The database has a flexible design that can handle diverse styles of both unprocessed and processed data in a Minimum Information About a Microarray Experiment-supportive infrastructure that promotes fully annotated submissions. GEO currently stores about a billion individual gene expression measurements, derived from over 100 organisms, submitted by over 1500 laboratories, ...
|
| |
Methods in molecular biology (Clifton, N.J.) In Gene Mapping, Discovery, and Expression, Vol. 338 (1 April 2006), pp. 175-190, doi:10.1385/1-59745-097-9:175
Abstract
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) has emerged as the leading fully public repository for gene expression data. This chapter describes how to use Web-based interfaces, applications, and graphics to effectively explore, visualize, and interpret the hundreds of microarray studies and millions of gene expression patterns stored in GEO. Data can be examined from both experiment-centric and gene-centric perspectives using user-friendly tools that do not require specialized expertise in microarray analysis or time-consuming download ...
|
| |
Abstract
When the human genome project was conceived, its leaders wanted all researchers to have equal access to the data and associated research tools. Their vision of equal access provides an unprecedented teaching opportunity. Teachers and students have free access to the same databases that researchers are using. Furthermore, the recent movement to deliver scientific publications freely has presented a second source of current information for teaching. I have developed a genomics course that incorporates many of the public-domain databases, research tools, ...
|
| |
Abstract
The Minimum Information About a Microarray Experiment (MIAME) guidelines are a data content document developed by the Microarray Gene Expression Data (MGED) Society that outlines the information that should be provided when describing a microarray experiment [1]. Many journals and funding agencies have adopted the guidelines, with the aim of facilitating access to the elements of a study that would enable independent evaluation of results. ...
|
| |
Pacific Symposium on Biocomputing (2008), pp. 580-591
Abstract
Our limited ability to perform large-scale translational discovery and analysis of disease characterizations from public genomic data repositories remains a major bottleneck in efforts to translate genomics experiments to medicine. Through comprehensive, integrative genomic analysis of all available human disease characterizations we gain crucial insight into the molecular phenomena underlying pathogenesis as well as intra- and inter-disease differentiation. Such knowledge is crucial in the development of improved clinical diagnostics and the identification of molecular targets for novel therapeutics. In this study ...
|
| |
Science (New York, N.Y.), Vol. 298, No. 5593. (18 October 2002)
by C. A. Ball, G. Sherlock, H. Parkinson, P. Rocca-Sera, C. Brooksbank, H. C. Causton, D. Cavalieri, T. Gaasterland, P. Hingamp, F. Holstege, M. Ringwald, P. Spellman, C. J. Stoeckert, J. E. Stewart, R. Taylor, A. Brazma, J. Quackenbush
|
| |
Blog post on March 24, 2008: http://researchremix.wordpress.com/2008/03/24/envisioning-a-biomedical-data-reuse-registry/
|
| |
(15 May 2000)
Abstract
Unlike most other similar books, this systematic, carefully reasoned, non-technical analysis of the nature and significance of scientific knowledge opens the way to reconciliation in the 'science wars'. By describing how academic scientists actually undertake research and communicate their findings, it shows that the philosophy, psychology and sociology of science are inextricably entwined, and that 'realism' and 'relativism' are just two sides of the same coin. The writing is well-informed, down-to-earth and lucid. ...
|
| |
Abstract
References
1. Hutchon DJR. Infopoints: Publishing raw data and real time statistical analysis on e-journals. BMJ 2001;322:529–530. (3 March.)
2. Eysenbach G. The impact of preprint servers and electronic publishing on biomedical research. Curr Opin Immunol 2000;12:499–503.
3. Eysenbach G. Welcome to the Journal of Medical Internet Research. J Med Internet Res 1999;1:e5. Online available at (accessed 2 March 2001).
4. Russ AP, Aparicio SA, Carlton MB. Open-source work even more vital to genome project than to software. Nature 2000;404:809.
5. Anonymous. Debates over credit for ...
|
| |
by T. R. Cech, S. R. Eddy, D. Eisenberg, K. Hersey, S. H. Holtzman, G. H. Poste, N. V. Raikhel, R. H. Scheller, D. B. Singer, M. C. Waltham
|
| |
Journal of the American Medical Informatics Association : JAMIA, Vol. 14, No. 4. (1 July 2007), pp. 478-488, doi:10.1197/jamia.m2114
Abstract
Themes identified in this study suggest that at least some common data management needs will best be served by improving access to basic level tools such that researchers can solve their own problems. Additionally, institutions and informaticians should focus on three components: 1) facilitate and encourage the use of modern data exchange models and standards, enabling researchers to leverage a common layer of interoperability and analysis; 2) improve the ability of researchers to maintain provenance of data and models as they ...
|
| |
Education + Training (2001), pp. 206-214
Abstract
Knowledge is a social construct and cannot be managed as physical assets. The distinction between data, information and knowledge is made. The transformation of raw data and information into useful knowledge requires a sense of trust and reciprocity on the part of people. Knowledge flows involve the translation of tacit knowledge into explicit knowledge in a process of codification. Knowledge produced by individuals reaches its full potential to create economic value when it becomes embedded in organisational routines. It is important ...
|
| |
Abstract
While scientific research and the methodologies involved have gone through substantial technological evolution the technology involved in the publication of the results of these endeavors has remained relatively stagnant. Publication is largely done in the same manner today as it was fifty years ago. Many journals have adopted electronic formats, however, their orientation and style is little different from a printed document. The documents tend to be static and take little advantage of computational resources that might be available. Recent work, ...
|
| |
Neuroinformatics, Vol. 5, No. 3. (2007), pp. 161-175
Abstract
As public availability of gene expression profiling data increases, it is natural to ask how these data can be used by neuroscientists. Here we review the public availability of high-throughput expression data in neuroscience and how it has been reused, and tools that have been developed to facilitate reuse. There is increasing interest in making expression data reuse a routine part of the neuroscience tool-kit, but there are a number of challenges. Data must become more readily available in public databases; ...
|
| |
Abstract
Recent advances in high-throughput genomic technologies are showing concrete results in the form of an increasing number of genome-wide association studies and in the publication of comprehensive individual genome-phenome data sets. As a consequence of this flood of information the established concepts of research ethics are stretched to their limits, and issues of privacy, confidentiality and consent for research are being re-examined. Here, we show the feasibility of the co-development of scientific innovation and ethics, using the open-consent framework that was ...
|
| |
by K. Henrick, Z. Feng, W. F. Bluhm, D. Dimitropoulos, J. F. Doreleijers, S. Dutta, J. L. Flippen-Anderson, J. Ionides, C. Kamada, E. Krissinel, C. L. Lawson, J. L. Markley, H. Nakamura, R. Newman, Y. Shimizu, J. Swaminathan, S. Velankar, J. Ory, E. L. Ulrich, W. Vranken, J. Westbrook, R. Yamashita, H. Yang, J. Young, M. Yousufuddin, H. M. Berman
Abstract
The Worldwide Protein Data Bank (wwPDB; wwpdb.org) is the international collaboration that manages the deposition, processing and distribution of the PDB archive. The online PDB archive at ftp://ftp.wwpdb.org is the repository for the coordinates and related information for more than 47 000 structures, including proteins, nucleic acids and large macromolecular complexes that have been determined using X-ray crystallography, NMR and electron microscopy techniques. The members of the wwPDB – RCSB PDB (USA), MSD-EBI (Europe), PDBj (Japan) and BMRB (USA) – have remediated this archive to ...
|
| |
Nucleic Acids Res, Vol. 35, No. Database issue. (January 2007)
Abstract
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces ...
|
| |
Abstract
Five-hundred twenty-seven full bibliographic records containing URLs were downloaded from SCISEARCH as part of an exploration of the extent of Web publication of electronic research-related information (E-RRI) in the sciences and classified as to resource type, subject area, and degree of intellectual property protection. Four hundred eighty-five records represented nonduplicate descriptions of data compilations (194), software (153), Websites (73), electronic documents (49), and digitized images (17). The greatest concentration of E-RRI was found in molecular biology (QP=123), general natural history and ...
|
| |
Journal of the American Medical Informatics Association, Vol. 14, No. 2. (1 March 2007), pp. 212-220, doi:10.1197/jamia.M2191
Abstract
Objective: To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines. Design: We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day. Measurements: We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, MeSH categories, used semantic measurements to group queries into sessions, and ...
|
| |
(8 Aug 2007)
Abstract
The large-scale analysis of scholarly artifact usage is constrained primarily by current practices in usage data archiving, privacy issues concerned with the dissemination of usage data, and the lack of a practical ontology for modeling the usage domain. As a remedy to the third constraint, this article presents a scholarly ontology that was engineered to represent those classes for which large-scale bibliographic and usage data exists, supports usage research, and whose instantiation is scalable to the order of 50 million articles ...
|
| |
(26 Oct 2006)
Abstract
There exist ample demonstrations that indicators of scholarly impact analogous to the citation-based ISI Impact Factor can be derived from usage data. However, contrary to the ISI IF which is based on citation data generated by the global community of scholarly authors, so far usage can only be practically recorded at a local level leading to community-specific assessments of scholarly impact that are difficult to generalize to the global scholarly community. We define a journal Usage Impact Factor which mimics the ...
|
| |
In JCDL '06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries (2006), pp. 298-307, doi:10.1145/1141753.1141821
Abstract
Although recording of usage data is common in scholarly information services, its exploitation for the creation of value-added services remains limited due to concerns regarding, among others, user privacy, data validity, and the lack of accepted standards for the representation, sharing and aggregation of usage data. This paper presents a technical, standards-based architecture for sharing usage information, which we have designed and implemented. In this architecture, OpenURL-compliant linking servers aggregate usage information of a specific user community as it navigates the ...
|