Materials for Multi-label Text Classification for Public Procurement in Spanish


Materials used for a SEPLN paper submission (to appear). The paper proposes a classifier for Common Procurement Vocabulary (CPV) codes that takes the textual description of a contracting process as input and assigns codes from the 45 top-level CPV categories. The classifier works for Spanish and improves on the state of the art by 10% in F1-score.
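As an illustration of the label space only, the sketch below reduces full CPV codes to their top-level category and builds a multi-hot target vector, which is the usual shape of a multi-label target. It assumes the standard CPV format of eight digits plus an optional check digit (for example `45000000-7`), where the first two digits identify the top-level category. The function names and the short `divisions` list are hypothetical and are not taken from the paper or its repository.

```python
from typing import Iterable, List


def top_level_division(cpv_code: str) -> str:
    """Return the two-digit top-level category of a CPV code.

    Accepts codes with or without the trailing check digit,
    e.g. "45000000-7" or "45000000".
    """
    digits = cpv_code.strip().split("-")[0]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"not a valid CPV code: {cpv_code!r}")
    return digits[:2]


def multi_hot(cpv_codes: Iterable[str], divisions: List[str]) -> List[int]:
    """Encode the top-level categories of a contract as a 0/1 vector.

    `divisions` fixes the order of the output positions; in the full
    setting it would hold all 45 top-level categories.
    """
    present = {top_level_division(code) for code in cpv_codes}
    return [1 if division in present else 0 for division in divisions]


if __name__ == "__main__":
    # Hypothetical three-category label space, for brevity.
    divisions = ["03", "45", "72"]
    print(multi_hot(["45000000-7", "72212000-4"], divisions))
```

A contract tagged with several CPV codes in the same top-level category yields a single 1 in that position, so the vector records which categories apply rather than how many codes were assigned.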

To access the contents of this research object as a JSON-LD RO-Crate, run the following curl command:

  curl -sH "accept:application/ld+json" -L
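The same content negotiation can be sketched in Python with the standard library. The address of the research object is not given above, so `ro_url` is a placeholder that the reader must supply. The sketch also assumes that the response follows the usual RO-Crate layout with a top-level `@graph` array. The function names are illustrative only.

```python
import json
import urllib.request
from typing import List, Tuple


def build_request(ro_url: str) -> urllib.request.Request:
    """Build a GET request that asks for the JSON-LD representation."""
    return urllib.request.Request(ro_url, headers={"Accept": "application/ld+json"})


def fetch_ro_crate(ro_url: str) -> dict:
    """Download and parse the RO-Crate metadata.

    urllib follows HTTP redirects by default, similar to curl -L.
    """
    with urllib.request.urlopen(build_request(ro_url)) as response:
        return json.loads(response.read().decode("utf-8"))


def list_entities(crate: dict) -> List[Tuple[object, object]]:
    """Return (@id, @type) pairs for every entity in the crate's @graph.

    Assumes the flattened JSON-LD layout used by RO-Crate; returns an
    empty list if no @graph key is present.
    """
    return [(entity.get("@id"), entity.get("@type")) for entity in crate.get("@graph", [])]
```

Calling `list_entities(fetch_ro_crate(ro_url))` with the actual address of the research object would list the datasets, software and authors described in the crate.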


  • CPV Classifier training and test datasets

    • Description: Datasets used to test different approaches to CPV classification.
    • License: 'info:eu-repo/semantics/openAccess'


The pointers for the main software used can be found below:

Our main code repository can be seen below:

In this work we compare different approaches to the classification of Common Procurement Vocabulary (CPV) codes, using data extracted from the Spanish Treasury.


About the authors

María Navas-Loro


Postdoctoral researcher

Universidad Politécnica de Madrid Facultad de Informática

Daniel Garijo


Distinguished Researcher

Universidad Politécnica de Madrid

Oscar Corcho


Full professor

Universidad Politécnica de Madrid