Measuring Coordination Costs of Managing External Dependencies in Open Source Software Ecosystems
Extracted a large dataset of issues from several GitHub projects and built statistical models (a zero-inflated negative binomial regression and a linear mixed-effects model) to quantify coordination costs and evaluate the factors associated with them. (Paper under review.)
A Machine Learning Approach to Automatically Label Issues on GitHub
The goal of this project is to predict the labels assigned to issues on GitHub. Both text and social features were used in a stacked classifier built in Weka, whose source code I also modified. [Report]
Data Mining for Social Good - A footprint of your online social activism.
Many of us have participated in social causes online, often by sharing Facebook posts or tweets. However, there is no easy way for someone to show the causes she has contributed to in the past. This project aims to build a web application where users can maintain a profile of their online social activism (often called slacktivism). Theories from social psychology have shown that people perform desirable actions if they know those actions have had a positive impact in the past. Therefore, an important outcome of our application would be to increase users' motivation to participate in online social activism. The application was developed using the Meteor framework and written entirely in JavaScript and HTML / CSS. This was a team project; my role covered conceptualizing the idea, automatically mining and displaying relevant tweets, and implementing authentication with an OAuth-based Facebook login. [Source] [Demo]
Networks of Collaborations Among Government Organizations on GitHub
Recently, GitHub has emerged as a useful platform for collaborating on government projects, both in building software and in creating policies. Since GitHub also provides a rich set of social media features, these interactions create a complex network that offers insight into the way users collaborate. Our primary analysis, informed by theory, tests whether members within an organization have more ties with one another. We then build on this finding to better understand various aspects of collaboration, such as reciprocity and homophily. [Report]
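For illustration only, here is a minimal sketch of how reciprocity and the share of within-organization ties could be computed on a directed collaboration network. The function names, the edge-set representation, and the toy data are hypothetical and are not taken from the project's code.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source user, target user)


def reciprocity(edges: Set[Edge]) -> float:
    """Fraction of directed ties that are returned (both u->v and v->u exist)."""
    ties = {(u, v) for u, v in edges if u != v}  # ignore self-loops
    if not ties:
        return 0.0
    mutual = sum(1 for u, v in ties if (v, u) in ties)
    return mutual / len(ties)


def within_org_share(edges: Set[Edge], org: Dict[str, str]) -> float:
    """Fraction of directed ties whose endpoints share an organization."""
    ties = {(u, v) for u, v in edges if u != v}
    if not ties:
        return 0.0
    same = sum(1 for u, v in ties if org[u] == org[v])
    return same / len(ties)


# Hypothetical toy network: four users in two organizations.
edges = {("ann", "bob"), ("bob", "ann"), ("ann", "cy"), ("dee", "cy")}
org = {"ann": "org_a", "bob": "org_a", "cy": "org_b", "dee": "org_b"}

print(reciprocity(edges))
print(within_org_share(edges, org))
```

A within-organization share well above what a random rewiring of the same network produces would be one simple signal of organizational homophily; a proper test would compare against such a baseline.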
An Agent Based Model of Edit Wars in Wikipedia
Edit wars are conflicts among Wikipedia editors in which editors repeatedly overwrite each other’s content. The goal of this project is to create an agent-based model of edit wars in order to study the influence of various factors involved in consensus formation. We model the behavior of agents using theories of group stability and reinforcement learning. [Paper]
Effects of concurrent modifications on quality of articles in Wikipedia
The proximity of edits in time and space (e.g., same paragraph, same line) can be measured to understand the level of concurrent modification in a Wikipedia article. The effects of such concurrency measures on a set of randomly chosen 'good' and 'featured' class articles were studied. [Report]
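As a rough sketch of one possible concurrency measure (an assumption for illustration, not necessarily the measure used in the report), the snippet below counts pairs of edits by different editors that touch the same paragraph within a fixed time window. The `Edit` record, the `concurrent_pairs` function, and the one-hour default window are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass(frozen=True)
class Edit:
    editor: str
    timestamp: float  # seconds since some reference time
    paragraph: int    # index of the paragraph the edit touched


def concurrent_pairs(edits: List[Edit], window: float = 3600.0) -> int:
    """Count pairs of edits by different editors that hit the same
    paragraph no more than `window` seconds apart."""
    return sum(
        1
        for a, b in combinations(edits, 2)
        if a.editor != b.editor
        and a.paragraph == b.paragraph
        and abs(a.timestamp - b.timestamp) <= window
    )


# Hypothetical revision history for one article.
edits = [
    Edit("ann", 0, 1),
    Edit("bob", 600, 1),
    Edit("cy", 900, 2),
    Edit("ann", 7200, 1),
]

print(concurrent_pairs(edits))
print(concurrent_pairs(edits, window=7000))
```

Narrowing the spatial unit (line instead of paragraph) or the time window gives a stricter notion of concurrency; normalizing the count by the number of edits would make articles of different sizes comparable.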
Click here for a list of my older projects (prior to 2013).