The Qualitas Corpus is a curated collection of software systems intended for empirical studies of code artifacts. Its primary goal is to provide a resource that supports reproducible studies of software. The project started in the 2000s, when Java was the dominant language, with a rating of over 25% on the TIOBE index. In this talk, we reflect on the history of the project and present ongoing work on counting objects, measuring the recall of static analysis, dependency version analysis, feature adoption, and corpus visualisation.

Craig Anslow is a Lecturer (Assistant Professor) in Software Engineering within the School of Engineering and Computer Science at Victoria University of Wellington, New Zealand. Craig also teaches a course on the MSc in Software Engineering Programme within the Department of Computer Science at the University of Oxford, United Kingdom.

