
5.2 Implications for Stakeholders
Our results offer a clear upside for stakeholders.
With a machine learning algorithm that can
provide an enterprise risk score on ESG factors ef-
ficiently, investors can now conduct a holistic risk
analysis of companies that do not already have ESG
risk scores/analysis. The vast majority of companies,
especially those with a market cap under $500 million,
have no published ESG risk score; our system could fill
this gap by providing an analysis within seconds. This
form of analysis is especially useful for private equity
firms, which tend to acquire companies that are not
publicly traded, much less assigned an ESG risk score.
Because ESG initiatives, which feed into ESG risk scores,
correlate with market risk and returns as described by
Zhang, being able to analyze ESG risk efficiently will
help investors differentiate sound companies from
unsound companies.
5.3 Data Limitations
Our research focused exclusively on S&P 500 companies,
as opposed to Teoh, who focused on major technological
stocks and the NASDAQ index, and Zhang, who focused
exclusively on Chinese companies. This choice presented
both strengths and constraints. While this focus allowed
us to work with a consistent and relatively homogeneous
dataset, it also limited the generalizability of our
findings. Expanding our data beyond the S&P 500 could
have introduced
a more diverse range of ESG practices and report-
ing standards, reflecting a broader spectrum of cor-
porate behaviors and policies. This expansion could
have provided a richer and more nuanced understand-
ing of ESG risk assessments, enabling our models to
capture a wider array of ESG factors and their im-
pact. Additionally, including smaller or international
firms, which often have different ESG reporting stan-
dards and challenges, might have revealed additional
insights into the variability and complexity of ESG
practices globally. Next, our specific data source,
company reports, may have yielded poor, monotonous
data, as companies tend to provide standard responses
on certain issues, making it difficult for our models
to differentiate companies experiencing high risk from
those experiencing low risk. Finally, ChatGPT’s
responses tend to follow a specific format that may
have introduced unintended patterns into our dataset
that the models tried to recognize. This may be another
reason why our linear and logistic regression models
performed better than our BERT or LSTM models: simpler
regression models are more adept at capturing these
consistent, systematic patterns in data, while more
complex models like BERT might overfit to the nuances
in language, missing out on these broader, more uniform
trends.
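One way to check the concern about monotonous report text raised above is to
measure how much excerpts from different companies overlap. The following is a
minimal sketch, not the procedure used in this study: it assumes the texts are
available as plain strings, uses Jaccard similarity over word sets as a simple
stand-in for any similarity measure, and the function names and example
excerpts are hypothetical.

```python
from itertools import combinations
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_similarity(texts: list[str]) -> float:
    """Average Jaccard similarity over all pairs of texts."""
    token_sets = [tokens(t) for t in texts]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        raise ValueError("need at least two texts")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical excerpts, invented purely for illustration.
boilerplate = [
    "We are committed to reducing our environmental impact.",
    "We are committed to reducing our carbon impact.",
]
varied = [
    "We are committed to reducing our environmental impact.",
    "A regulator fined the company for wastewater violations in 2022.",
]

print(mean_pairwise_similarity(boilerplate))
print(mean_pairwise_similarity(varied))
```

A high average similarity across companies with very different risk scores
would suggest that the text carries little signal for the models to separate
them.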
5.4 Enhancing the Process
To enhance the effectiveness of our process, several
strategies could be considered. Firstly, we could have
focused solely on annual reports or sustainability
reports, rather than drawing on a diverse range of
publicly released company reports, to streamline the
data retrieval process. Secondly, exploring alternative data sources,
like news articles, social media, or consumer reviews,
could provide additional context and depth to the ESG
assessments. Furthermore, continuously updating the
dataset with the latest reports and data would ensure
that the models stay relevant and accurate over time.
Another aspect to consider is improving data prepro-
cessing techniques, such as more advanced natural
language processing methods, to better capture the
nuances and subtleties in the textual data. Additionally,
as independent researchers, we faced financial constraints
that inhibited our data retrieval process and our
machine-learning capabilities; in particular, upgrading
the LLM we used would require more financial flexibility.
Lastly, our process could incorporate interdisciplinary
approaches, such as integrating insights from behavioral
economics to understand the impact of corporate governance
on ESG performance. This
draws inspiration from D’Amato et al.’s exploration
of balance sheet items and their correlation with ESG
scores, suggesting a nuanced approach to feature se-
lection in our model. Further, Krappel et al.’s work
on the temporal dynamics of company fundamentals
in reflecting ESG ratings underlines the importance of
including longitudinal data analysis in our methodol-
ogy. This could ensure our model adapts to tempo-
ral changes in ESG criteria, much like the dynamic
models suggested by T.-T. et al. and Chowdhury
et al., who assessed year-on-year changes in ESG
risk scores and their correlation with financial perfor-
mance using various machine learning models.
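To make the idea of year-on-year changes in ESG risk scores concrete, the
sketch below derives such changes from a per-company score history. It is an
illustration under assumptions rather than the method of the cited works: the
function name, the dictionary layout, and the example scores are all
hypothetical.

```python
def year_on_year_changes(scores_by_year: dict[int, float]) -> dict[int, float]:
    """Change in score between each year and the previous available year.

    Keys of the result are the later year of each pair. If a year is
    missing from the input, the change spans the gap between the two
    nearest available years.
    """
    years = sorted(scores_by_year)
    return {
        later: scores_by_year[later] - scores_by_year[earlier]
        for earlier, later in zip(years, years[1:])
    }


# Hypothetical ESG risk scores for a single company.
history = {2020: 30.0, 2021: 27.5, 2022: 29.0}
changes = year_on_year_changes(history)
print(changes)
```

Changes computed this way could serve as additional longitudinal features
alongside the static score, if such historical data were available.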
5.5 Subjectivity in ESG Risk Scores
A limitation in our study, and indeed in the field of
ESG risk assessment in general, is the intrinsic sub-
jectivity of ESG risk scores. ESG scoring is an ex-
tensive process, often involving qualitative judgments
and varying interpretations of what constitutes good
environmental, social, and governance practices. This
subjectivity can lead to inconsistencies and variability
in ESG risk scores, even among similar companies. It
FEMIB 2024 - 6th International Conference on Finance, Economics, Management and IT Business