Disambiguating an author
One of the main challenges Socrates handles is “author disambiguation”. The same author often writes multiple articles, each under a different variation of the author name (e.g. Robert C Palmer, RC Palmer, R Palmer), and Socrates needs to recognize that these variations all refer to the same person. Conversely, several distinct authors may share the same name, sometimes even including the middle name.
Socrates uses a sophisticated algorithm to cluster the articles by their authors, coping with the above name challenges. It takes into account affiliations, co-authors, textual similarity and more.
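To make the idea concrete, here is a minimal sketch of evidence-based clustering. It is not Socrates's actual algorithm, which is not documented here. The `Article` record, the `name_key` reduction, the scoring weights and the threshold are all hypothetical and chosen purely for illustration. It shows how name variants can be treated as compatible while affiliations and shared co-authors decide whether two articles are grouped together.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record; field names are illustrative, not Socrates's schema."""
    title: str
    author_name: str                 # the name variant as printed on the article
    affiliation: str = ""
    coauthors: frozenset = frozenset()


def name_key(name):
    """Reduce a name to (surname, first initial): 'RC Palmer' -> ('palmer', 'r')."""
    parts = name.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()


def same_author_score(a, b):
    """Toy evidence score (assumed weights): 0 if names are incompatible."""
    if name_key(a.author_name) != name_key(b.author_name):
        return 0
    score = 1                                     # compatible name variant
    if a.affiliation and a.affiliation == b.affiliation:
        score += 1                                # same affiliation
    score += len(a.coauthors & b.coauthors)       # one point per shared co-author
    return score


def cluster(articles, threshold=2):
    """Greedy clustering: join the first group containing a strong enough match."""
    clusters = []
    for art in articles:
        for group in clusters:
            if any(same_author_score(art, other) >= threshold for other in group):
                group.append(art)
                break
        else:
            clusters.append([art])                # no match: start a new author group
    return clusters


articles = [
    Article("A", "Robert C Palmer", "MIT", frozenset({"J Smith"})),
    Article("B", "RC Palmer", "MIT", frozenset({"L Chen"})),
    Article("C", "R Palmer", "", frozenset({"J Smith"})),
    Article("D", "R Palmer", "Oxford", frozenset({"K Jones"})),
]
clusters = cluster(articles)
print([[a.title for a in group] for group in clusters])
```

In this toy data, articles A, B and C end up in one group because they share an affiliation or a co-author, while D has a compatible name but no supporting evidence and is kept separate. A name match alone is deliberately not enough to merge articles, which reflects the case of different authors sharing a name.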
However, the automatic disambiguation process is not 100% accurate. Sometimes the information available for an article is simply insufficient to identify its authors conclusively.
If you need higher accuracy, Socrates provides tools for completing the disambiguation manually.
See the following sections for details: