The development of Open Source Software (OSS) depends fundamentally on the participation and commitment of volunteer developers to make progress on any given task. Several works have presented strategies to increase the on-boarding and engagement of new contributors, but little is known about how these diverse groups of developers self-organise to work together. This is the focus of our recent paper Online division of labour: emergent structures in Open Source Software (open access!). A short summary follows.
To understand how open source contributors end up self-organising, one must consider that, on the one hand, platforms like GitHub provide a virtually unlimited development framework: any number of actors can potentially join to contribute in a decentralised, distributed, remote, and asynchronous manner. On the other hand, it seems reasonable that some sort of hierarchy and division of labour must be in place to respect human biological and cognitive limits, and also to achieve some level of efficiency. These latter features (hierarchy and division of labour) should translate into detectable structural arrangements when projects are represented as developer-file bipartite networks, in which a developer is linked to every file they have edited. Keeping both classes of nodes, developers and files, retains valuable information (as opposed to collapsing it onto a unipartite network) and, above all, recognises both classes as co-evolutionary units that place mutual constraints on each other.
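As a rough illustration of this representation, the sketch below builds a developer-file matrix from a list of (developer, file) pairs and then collapses it onto a developer-only network. The toy data, the names `edits`, `build_bipartite` and `project_on_developers`, and the choice of a plain 0/1 matrix are all assumptions made for the example, not the data pipeline used in the paper.

```python
# Hypothetical toy data: (developer, file) pairs, e.g. as extracted from commit logs.
edits = [
    ("alice", "core.py"), ("alice", "utils.py"), ("alice", "docs.md"),
    ("bob", "core.py"), ("bob", "utils.py"),
    ("carol", "core.py"),
]


def build_bipartite(pairs):
    """Return (developers, files, matrix); matrix[i][j] is 1 if developer i edited file j."""
    developers = sorted({dev for dev, _ in pairs})
    files = sorted({path for _, path in pairs})
    dev_index = {dev: i for i, dev in enumerate(developers)}
    file_index = {path: j for j, path in enumerate(files)}
    matrix = [[0] * len(files) for _ in developers]
    for dev, path in pairs:
        matrix[dev_index[dev]][file_index[path]] = 1
    return developers, files, matrix


def project_on_developers(matrix):
    """Unipartite projection: entry (i, j) counts the files shared by developers i and j."""
    n = len(matrix)
    return [
        [
            sum(a * b for a, b in zip(matrix[i], matrix[j])) if i != j else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


developers, files, matrix = build_bipartite(edits)
projection = project_on_developers(matrix)

for dev, row in zip(developers, matrix):
    print(dev, row)
print(projection)
```

The projection only records how many files two developers share, not which ones, which is the kind of information the bipartite view preserves.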
Thus, in this paper we analyse a set of popular open-source projects from GitHub, focusing on three key properties: nestedness, modularity and in-block nestedness. The first one, nestedness, is a suitable measure to quantify and visualise how a low truck factor (only a few developers whose departure would put the project at risk) and the coexistence of core and drive-by developers translate into a project’s network structure. As for modularity, it provides a natural way to check whether OSS projects split into identifiable compartments, suggesting specialisation, and whether such compartments are subject to size limitations, in line with the biological and cognitive limits mentioned above. Finally, since modularity and nestedness are, to some extent, incompatible in the same network, in-block nestedness (or the lack of it) can help to determine how projects resolve the tension between the emergence of nested (hierarchy, asymmetry) and modular (specialisation, division of labour, bounds to social connections) patterns.
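To make the tension between nestedness and modularity concrete, here is a minimal sketch that scores a 0/1 developer-file matrix with a NODF-style overlap metric, rescaled to the range 0 to 1. NODF is a widely used nestedness measure that is swapped in here for illustration; it is not necessarily the measure used in the paper, and the function names and toy matrices are assumptions.

```python
from itertools import combinations


def paired_overlap(vectors):
    """Sum, over all pairs with different degrees, of shared links / smaller degree."""
    total = 0.0
    for a, b in combinations(vectors, 2):
        degree_a, degree_b = sum(a), sum(b)
        # NODF gives no credit to pairs with equal degree (or to empty vectors).
        if degree_a == degree_b or min(degree_a, degree_b) == 0:
            continue
        shared = sum(x * y for x, y in zip(a, b))
        total += shared / min(degree_a, degree_b)
    return total


def nodf(matrix):
    """NODF-style nestedness in [0, 1] for a binary matrix given as a list of rows."""
    rows = [list(row) for row in matrix]
    cols = [list(col) for col in zip(*matrix)]
    n_pairs = len(rows) * (len(rows) - 1) / 2 + len(cols) * (len(cols) - 1) / 2
    if n_pairs == 0:
        return 0.0
    return (paired_overlap(rows) + paired_overlap(cols)) / n_pairs


# Perfectly nested toy project: each developer edits a subset of the previous one's files.
nested = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]

# Purely modular toy project: two disjoint blocks of developers and files.
modular = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

print(nodf(nested))   # 1.0
print(nodf(modular))  # 0.0
```

In this toy setting the fully nested matrix reaches the maximum score while the block-diagonal one scores zero, which is the incompatibility that in-block nestedness is meant to reconcile by looking for nested structure inside each block.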
These analyses show that projects do indeed evolve into internally organised blocks. Furthermore, the distribution of sizes of such blocks is bounded, connecting our results to the celebrated Dunbar number in both offline and online environments.
At the mesoscale, we observe that projects tend to form blocks, a fact that can be related to the need of contributors to distribute coding efforts, allowing a project to develop steadily and in a balanced way. Those blocks or subgroups have a relatively stable size no matter how large a project is.
Previous research reported that OSS projects are largely heterogeneous, in the sense that developers self-organise into hierarchical structures. This conclusion is reinforced here, as we find evidence of nested arrangements in OSS bipartite networks. Thus, the presence of workload compartmentalisation is compatible with the emergence of hierarchies, with generalists and specialists throughout a project. Paradoxically, a more evolved and structured architecture does not imply better overall performance here: the nested arrangement inside blocks can hamper a project’s progress, since the occasional and least committed contributors (those acting upon a small part of the code) tend to edit precisely the most generalist files, neglecting the least developed ones, a fact that has also been observed using very different methodologies.
By being more aware of the internal self-organisation of their projects, owners and administrators may design strategies to optimise the collaborative efforts of the limited number (and availability) of project contributors. For instance, they can work to steer the project’s actual block decomposition towards a pre-defined software architectural pattern, or ensure that, despite the nested organisation within blocks, all files in a block receive some minimal attention. More research on how to derive effective project management and leadership strategies from the current division of labour in a project is clearly needed and would be impactful.