Megan Squire, an associate professor in the Department of Computing Sciences, has been awarded a research grant for computational resources provided by Amazon for 2013-15.
This competitive award program offers academic researchers no-cost allocations of high-end computing resources.
Squire’s allocation is for computing time on Amazon Elastic MapReduce (EMR), a cloud-based web service that enables researchers to process vast amounts of data using a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
Other universities awarded allocations under this program include the University of Oxford (UK) and the University of California, Berkeley. This grant allocates an EMR instance to Squire and a group of researchers who collaborate on a project investigating how free and open source software is made.
Squire will use Amazon EMR to store and analyze all the emails from the public developer mailing lists for a group of open source projects. Storing and analyzing emails, and reproducing published analyses that rely on them (especially lexical analysis of email content), has previously been extremely difficult with off-the-shelf relational database management systems because of size and processing constraints. Using EMR, Squire will collect, curate, and store the Apache email corpus as historical snapshots, and will assist other researchers who wish to analyze this data or replicate the studies of others.
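To give a sense of what lexical analysis of mailing-list email can look like in the map/reduce style that Hadoop uses, here is a minimal Python sketch. It is an illustration only, not the project's actual code: the function names, the tokenizing rule, and the two sample messages are all hypothetical, and it runs on a single machine rather than on EMR.

```python
import re
from collections import Counter
from email import message_from_string

# Assumption: a "word" is a run of lowercase letters or apostrophes.
TOKEN = re.compile(r"[a-z']+")

def map_email(raw):
    """Map step: yield (token, 1) for each word in one message body."""
    msg = message_from_string(raw)
    body = msg.get_payload()
    if not isinstance(body, str):
        return  # multipart messages are skipped in this sketch
    for token in TOKEN.findall(body.lower()):
        yield token, 1

def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each token."""
    counts = Counter()
    for token, n in pairs:
        counts[token] += n
    return counts

def count_tokens(messages):
    """Run the map step over every message, then reduce the results."""
    pairs = (pair for raw in messages for pair in map_email(raw))
    return reduce_counts(pairs)

# Hypothetical sample messages, invented for illustration.
SAMPLE = [
    "From: a@example.org\nSubject: patch\n\nPlease review the patch.",
    "From: b@example.org\nSubject: Re: patch\n\nThe patch looks fine.",
]

if __name__ == "__main__":
    for token, n in count_tokens(SAMPLE).most_common(3):
        print(token, n)
```

On a real cluster the map and reduce steps would run in parallel across many machines, which is what makes a corpus of this size tractable where a single relational database struggles.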
Undergraduate students in Squire’s Data Mining and Analysis class (ISC 320) will assist in this research by generating reports and writing programs to process data stored in the EMR system.