Description
In this study, several transformer-based classifiers were trained on a new dataset of sustainability reports to automatically identify contributions to the Sustainable Development Goals (SDGs). As labels, the study used the SDG icons that companies themselves included in their reports. It also examined how far a greater variety of reports improves the generalizability of the models.
Results
The Longformer, the best-performing model, achieved an F0.5 score of 0.65 when assigning the 17 possible SDGs. The F0.5 score weights precision more heavily than recall. Precision was further improved through threshold optimization and targeted regularization. Notably, SDG contributions could be identified reliably from a corporate perspective even without manual ground-truth annotations. The company-assigned SDG labels proved to be a practical starting point for detecting sustainability-related content.
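To make the metric and the idea of threshold optimization concrete, the following is a minimal sketch in plain Python. It is not the procedure used in the paper: the function names, the candidate thresholds, and the choice to tune one label at a time are assumptions made for illustration only. The F-beta score combines precision P and recall R as (1 + beta^2) * P * R / (beta^2 * P + R), so beta = 0.5 rewards precision more than recall.

```python
from typing import Sequence, Tuple


def fbeta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_recall(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall for one label when predicting score >= threshold."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    candidates: Sequence[float],
    beta: float = 0.5,
) -> Tuple[float, float]:
    """Return the candidate threshold with the highest F-beta, and that score.

    Hypothetical helper: a simple grid search over candidate thresholds,
    meant to be run on held-out validation data for a single label.
    """
    best_t, best_f = candidates[0], -1.0
    for t in candidates:
        p, r = precision_recall(y_true, scores, t)
        f = fbeta(p, r, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```

In a multi-label setting with 17 SDGs, one option would be to call `best_threshold` once per SDG on validation scores, so that each label gets its own decision threshold. Whether the study tuned thresholds per label or globally is not stated here.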
Background
The publication reports on experience gained in the “ateSDG” project.
Read the paper