Large-scale co-authorship networks open a window onto the complex patterns of collaborative behavior and provide a rich arena for scholarly investigation. The idea is not new (Newman 2004), but a realistic and informative model of collaboration behavior requires collecting and combining at least five types of data: a) Information about individuals and their publication behavior, such as gender, affiliation, geographical location, seniority, expertise, topics, publication dates, productivity, collaborators, citations, funding, and patents. b) Measures relating pairs of individuals, such as mentorship (senior vs. junior), topical similarity, social network proximity, geographical proximity, and expertise complementarity. c) Measures that go beyond pairs to describe the roles of individuals and structures in the social network, such as hubs, bridges, isolates, large communities, smaller cliques, central cores, periphery, and hierarchy. d) Multiple time scales: collaboration behavior has changed collectively over the years, and it usually changes over the course of an individual's career. e) Interactive factors that capture context- or discipline-dependent patterns: individuals may tend to collaborate preferentially with others in their own field, but this tendency may not hold for, say, statisticians.
Accurate and efficient collection of these data can be supported by Author-ity (Torvik et al. 2005, Torvik & Smalheiser 2009), an author-centered database of disambiguated names. Besides improving our understanding of collaborative behavior, a multidimensional model can also inform the development of recommender tools that could, for example, help scientists find potential collaborators or co-panelists.
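To make the pairwise measures in (b) and the structural roles in (c) concrete, the following minimal sketch derives a few of them from a toy set of publication records. The records, author names, and helper functions (`build_network`, `jaccard`, `common_coauthors`) are hypothetical and chosen for illustration only; they do not reflect the Author-ity schema or any method described here. Topical similarity is approximated by Jaccard overlap of topic terms, and social network proximity by the number of shared co-authors, both of which are simplifying assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (paper id, author list, topic terms). Not real data.
PAPERS = [
    ("p1", ["ana", "bo", "cy"], {"genomics", "statistics"}),
    ("p2", ["ana", "bo"], {"genomics"}),
    ("p3", ["cy", "di"], {"statistics", "networks"}),
    ("p4", ["eve"], {"networks"}),
]


def build_network(papers):
    """Return (co-author adjacency with joint-paper counts, topics per author)."""
    neighbors = defaultdict(lambda: defaultdict(int))
    topics = defaultdict(set)
    for _paper_id, authors, terms in papers:
        for author in authors:
            neighbors[author]  # touch the key so solo authors appear as isolates
            topics[author] |= terms
        for a, b in combinations(sorted(set(authors)), 2):
            neighbors[a][b] += 1
            neighbors[b][a] += 1
    return neighbors, topics


def jaccard(x, y):
    """Topical similarity (assumed here): overlap of two authors' topic sets."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def common_coauthors(neighbors, a, b):
    """Social network proximity (assumed here): number of shared co-authors."""
    return len(set(neighbors[a]) & set(neighbors[b]))


neighbors, topics = build_network(PAPERS)

# Structural roles from (c): degree, isolates, and the highest-degree hub.
degree = {author: len(adj) for author, adj in neighbors.items()}
isolates = sorted(a for a, d in degree.items() if d == 0)
hub = max(degree, key=degree.get)

print("degree:", degree)
print("isolates:", isolates, "hub:", hub)
print("topical similarity ana-di:", round(jaccard(topics["ana"], topics["di"]), 3))
print("shared co-authors ana-di:", common_coauthors(neighbors, "ana", "di"))
```

In this toy network, `ana` and `di` have never published together but share one co-author and some topical overlap, which is the kind of signal a collaborator recommender might combine. A realistic model would also need the temporal and discipline-dependent factors from (d) and (e), which this sketch omits.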