Deja vu: A Study of Duplicate Citations in Medline
by: Mounir Errami, Justin M Hicks, Wayne Fisher, David Trusty, Jonathan D Wren, Tara C Long, Harold R Garner
Bioinformatics (1 December 2007), btm574.
Abstract
Motivation: Duplicate publication impacts the quality of the scientific corpus, has been difficult to detect, and studies thus far have been limited in scope and size. Using text similarity searches, we were able to identify signatures of duplicate citations among a body of abstracts.
Results: A sample of 62,213 Medline citations was examined and a database of manually verified duplicate citations was created to study author publication behavior. We found that 0.04% of the citations with no shared authors were highly similar and are thus potential cases of plagiarism. A further 1.35% of citations with shared authors were sufficiently similar to be considered duplicates. Extrapolating, this would correspond to 3,500 and 117,500 duplicate citations in total, respectively.
Availability: eTBLAST, an automated citation matching tool, and Deja vu, the duplicate citation database, are freely available at http://invention.swmed.edu/ and http://spore.swmed.edu/dejavu.
Contact: Harold.Garner@utsouthwestern.edu
DOI: 10.1093/bioinformatics/btm574
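The extrapolation in the Results can be checked with a little arithmetic. The abstract does not state the size of the corpus the sample rates were scaled to, so the sketch below only back-computes the size implied by the reported figures. The function name `implied_corpus_size` is hypothetical, and the percentages are the rounded values from the abstract, so the result is approximate.

```python
# Rates reported in the abstract, as fractions of the sampled citations.
RATE_NO_SHARED_AUTHORS = 0.0004  # 0.04%, potential plagiarism
RATE_SHARED_AUTHORS = 0.0135     # 1.35%, duplicates with shared authors

# Extrapolated totals reported in the abstract.
TOTAL_NO_SHARED_AUTHORS = 3_500
TOTAL_SHARED_AUTHORS = 117_500


def implied_corpus_size(extrapolated_total: float, rate: float) -> float:
    """Corpus size N such that rate * N equals the extrapolated total.

    This is an assumption about how the extrapolation was done
    (simple proportional scaling); the abstract does not spell it out.
    """
    return extrapolated_total / rate


if __name__ == "__main__":
    n_plagiarism = implied_corpus_size(TOTAL_NO_SHARED_AUTHORS, RATE_NO_SHARED_AUTHORS)
    n_duplicates = implied_corpus_size(TOTAL_SHARED_AUTHORS, RATE_SHARED_AUTHORS)
    print(f"Implied corpus size from the 0.04% figure: {n_plagiarism:,.0f}")
    print(f"Implied corpus size from the 1.35% figure: {n_duplicates:,.0f}")
```

Both figures point to a corpus of roughly 8.7 million citations, so the two extrapolated totals are mutually consistent under proportional scaling.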