Online toxicity is ubiquitous across the internet and its negative impact on the people and that online communities that it effects has been well documented. However, toxicity manifests differently on various platforms and toxicity in open source communities, while frequently discussed, is not well understood.
We take a first stride at understanding the characteristics of open source toxicity to better inform future work on designing effective intervention and detection methods. To this end, we curate a sample of 100 toxic GitHub issue discussions combining multiple search and sampling strategies. We then qualitatively analyze the sample to gain an understanding of the characteristics of open-source toxicity.
We find that the pervasive forms of toxicity in open source differ from those observed on other platforms like Reddit or Wikipedia. In our sample, some of the most prevalent forms of toxicity are entitled, demanding, and arrogant comments from project users as well as insults arising from technical disagreements. In addition, not all toxicity was written by people external to the projects; project members were also common authors of toxicity.
We also discuss the implications of our findings. Among others we hope that our findings will be useful for future detection work.
Paper / Slides / Presentation Video / Infographic
Courtney Miller, Sophie Cohen, Daniel Klug, Bogdan Vasilescu, and Christian Kästner. "“Did You Miss My Comment or What?” Understanding Toxicity in Open Source Discussions." Proceedings of the 44th International Conference on Software Engineering. 2022. DOI: https://doi.org/10.1145/3510003.3510111