Requirement or Not, That is the Question: A Case from the Railway Industry

Publication Type:

Conference/Workshop Paper

Venue:

Requirements Engineering: Foundation for Software Quality

Publisher:

Springer Nature Switzerland

DOI:

10.1007/978-3-031-29786-1_8


Abstract

[Context and Motivation] Requirements in tender documents are often mixed with other supporting information. Identifying requirements in large tender documents could aid the bidding process and help estimate the risk associated with the project. [Question/problem] Manual identification of requirements in large documents is a resource-intensive activity that is prone to human error and limits scalability. This study compares various state-of-the-art approaches for requirements identification in an industrial context. For generalizability, we also present an evaluation on a real-world public dataset. [Principal ideas/results] We formulate the requirement identification problem as a binary text classification problem. Various state-of-the-art classifiers based on traditional machine learning, deep learning, and few-shot learning are evaluated for requirements identification based on accuracy, precision, recall, and F1 score. Results from the evaluation show that the transformer-based BERT classifier performs best, with average F1 scores of 0.82 and 0.87 on the industrial and public datasets, respectively. Our results also confirm that few-shot classifiers can achieve comparable results, with an average F1 score of 0.76, using significantly fewer samples, i.e., only 20% of the data. [Contribution] There is little empirical evidence on the use of large language models and few-shot classifiers for requirements identification. This paper fills this gap by presenting an industrial empirical evaluation of the state-of-the-art approaches for requirements identification in large tender documents. We also provide a running tool and a replication package for further experimentation to support future research in this area.
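
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' tool or replication package; the function name evaluate_binary, the BinaryMetrics container, and the label convention (1 = requirement, 0 = not a requirement) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Score binary predictions (assumed convention: 1 = requirement, 0 = not)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    gold = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluate_binary(gold, predicted))
```

The zero-division guards return 0.0 as a convention when a metric is undefined; whether the paper's evaluation handles that case the same way is not stated in the abstract.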

Bibtex

@inproceedings{Bashir6625,
author = {Sarmad Bashir and Muhammad Abbas and Mehrdad Saadatmand and Eduard Paul Enoiu and Markus Bohlin and Pernilla Lindberg},
title = {Requirement or Not, That is the Question: A Case from the Railway Industry},
isbn = {978-3-031-29786-1},
editor = {Ferrari, Alessio and Penzenstadler, Birgit},
pages = {105--121},
month = {April},
year = {2023},
booktitle = {Requirements Engineering: Foundation for Software Quality},
publisher = {Springer Nature Switzerland},
url = {http://www.es.mdu.se/publications/6625-}
}