Qualitas Corpus

The Qualitas Corpus is a curated collection of software systems intended to be used for empirical studies of code artefacts. The primary goal is to provide a resource that supports reproducible studies of software. The current release of the Corpus contains open-source Java software systems, often in multiple versions.

The current release is version 20120401. It contains 111 systems and 661 versions in total; 14 of the systems have 10 or more versions. There are two main distributions: the "r" (recent) release, containing the most recent version we have of every system (111 systems), and the "e" (evolution) release, containing all versions of the 14 systems with 10 or more versions, a total of 486 versions. Other distributions are also available.

In publications that use the corpus, please cite the APSEC paper and always identify the release used.

News

1 April 2012
A new distribution (20120401) has been released. See the history of the corpus for what has changed.
3 December 2010
A paper describing the design and development of the corpus was presented at APSEC 2010. See "Citing the corpus" for details.

Index

Overview
Catalogue summary
Acquiring the corpus
Installing the corpus
Distribution structure
Structure of the content
Defining systems
Metadata about the contents
Criteria for inclusion
Development status and plans
History of the corpus
Conventions used
Citing the corpus
Publications based on the corpus
Software
FAQ
Glossary

Management

The Qualitas Corpus is currently being maintained by: See also the history of the Corpus.

Contact Us

See Ewan Tempero's Home Page.

Updated: 24-Jan-2013, Managed by Ewan Tempero