Expertise identification using email communications
A common method for finding information in an organization is to use social networks---ask people, following referrals until someone with the right information is found. Another way is to automatically mine documents to determine who knows what. Email documents seem particularly well suited to this task of "expertise location", as people routinely communicate what they know. Moreover, because people explicitly direct email to one another, social networks are likely to be contained in the patterns of communication. Can these patterns be used to discover experts on particular topics? Is this approach better than mining message content alone? To find answers to these questions, two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns. An evaluation was done using email and explicit expertise ratings from two different organizations. The rankings given by each algorithm were compared to the explicit rankings with the precision and recall measures commonly used in information retrieval, as well as the d' measure commonly used in signal-detection theory. Results show that the graph-based algorithm performs better than the content-based algorithm at identifying experts in both cases, demonstrating that the graph-based algorithm effectively extracts more information than is found in content alone.