We developed a novel strategy for obtaining detailed investigator information by automatically parsing the affiliation strings of PubMed records. We illustrated the results using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1\% of the affiliation strings in a random sample of PubMed records and from 97.0\% of HuGE records, with accuracies of 94.0\% and 91.0\%, respectively. Institution information was parsed from 91.3\% of the general PubMed records (accuracy 86.8\%) and from 94.2\% of HuGE PubMed records (accuracy 87.0\%).