The papers, presented at the 10th Mining Software Repositories Working Conference in San Francisco, describe two novel data sets collected, curated and freely donated to the empirical software engineering research community.
The Apache Software Foundation is one of the largest organizations producing open source software in the world. How the ASF works is of great interest to scholars in empirical software engineering.
As described in her papers, "Project Roles in the Apache Software Foundation: A Dataset" [1] and "Apache-Affiliated Twitter Screen Names: A Dataset" [2], Associate Professor Megan Squire wrote software to collect and curate two novel data sets, which were then donated to the research community as part of the FLOSSmole data commons.
These papers were presented May 19 at the 10th Mining Software Repositories Conference, a gathering of scholars who analyze large collections of software artifacts for empirical data about how the software is made.
In the Project Roles paper, Squire describes the process for determining the roles (leader, developer, committer, contributor, etc.) of more than 5,000 software developers within the ASF itself and more than 200 of its affiliate projects.
In the Twitter Names paper, Squire describes how she wrote software to automatically collect the Twitter screen names of Apache-affiliated developers. This data is useful for any researcher who wishes to study how Twitter is being used to create software.
[1] http://flosshub.org/content/project-roles-apache-software-foundation-dataset
[2] http://flosshub.org/content/apache-affiliated-twitter-screen-names-dataset