Skip to content

quality-attributes/datasets

Repository files navigation

Databases

Official data sources for the Quality Attributes project to train, test and validate if Non-Functional Requirements related to Quality Attributes can be found on GitHub Issues reports.

Training Set

On December 14th, 2019 the site http://ctp.di.fct.unl.pt/RE2017/pages/submission/data_papers/ was visited to get the PROMISE dataset, included as part of a data challenge.

Sayyad Shirabad, J. and Menzies, T.J. (2005) The PROMISE Repository of Software Engineering Databases. School of Information Technology and Engineering, University of Ottawa, Canada. Available: http://promise.site.uottawa.ca/SERepository

The non-functional requirements' labels in this dataset (involving 15 different projects) are distributed as follows:

Class Quantity Percentage
Funcional (F) 255 40.80%
Availability (A) 21 3.36%
Fault Tolerance (FT) 10 1.60%
Legal (L) 13 2.08%
Look & Feel (LF) 38 6.08%
Maintainability (MN) 17 2.72%
Operational (O) 62 9.92%
Performance (PE) 54 8.64%
Portability (PO) 1 0.16%
Scalability (SC) 21 3.36%
Security (SE) 66 10.56%
Usability (US) 67 10.72%
Total 625 100%

For the purposes of this study, only a subset of this dataset was considered, as part of the quality attributes categories and due to imbalanced classes:

Class Quantity Percentage
Availability (A) 21 8.20%
Fault Tolerance (FT) 10 3.91%
Maintainability (MN) 17 6.64%
Performance (PE) 54 21.09%
Scalability (SC) 21 8.21%
Security (SE) 66 25.78%
Usability (US) 67 26.17%
Total 256 100%

Test Set

Based upon the book:

Miller, Roxanne E., 2009, The Quest for Software Requirements, MavenMark Books, Milwaukee, WI

40 different non-functional requirements associated to quality attributes where collected. From the following categories (matching the ones included in the training).

  • Access Security
  • Availability
  • Usability
  • Maintainability
  • Scalability

Validation Set

According to the State of the Octoverse in 2019, the most contributed open source project at GitHub were as follows:

Place Repository Contributors
01 microsoft/vscode 19.1k
02 MicrosoftDocs/azure-docs 14k
03 flutter/flutter 13k
04 firstcontributions/first-contributions 11.6k
05 tensorflow/tensorflow 9.9k
06 facebook/react-native 9.1k
07 kubernetes/kubernetes 6.9k
08 DefinitelyTyped/DefinitelyTyped 6.9k
09 ansible/ansible 6.8k
10 home-assistant/home-assistant 6.3k

The repositories selected describe different software systems, excluding documentations and projects with the same scope (i.e. flutter and react-native. Data collected using quality-attributes/issue-collector for the following repositories:

  1. microsoft/vscode
  2. flutter/flutter
  3. tensorflow/tensorflow
  4. kubernetes/kubernetes
  5. ansible/ansible

Note: Only the latest 100 issues (as of 02/20/2020) for each repository were collected, due to GitHub's API v4 limitations

About

Official data sources for the Quality Attributes project

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •