
that users have different perceptions of whether this mitigating action works as expected, or of the dynamic system applied by this specific platform service, as we found in (Jorge Mejia, 2020) and (Pandey and Caliskan, 2021).
Figure 8: Ridesharing discrimination detection discussion.
Another interesting point, which also corroborates our analysis with studies found in the literature, concerns the provision of passenger information, such as origin and destination addresses, to the driver only after the ride is accepted, as in (Miroslav Tushev and Mahmoud, 2021), (Yanbo Ge, 2018), (Brown, 2019), and (Abramova, 2020). However, on this point we found divergent opinions from the platform's users. Passenger information is made available only after acceptance of the ride as a way to mitigate discrimination; nevertheless, it is possible to identify that cancellations still occur after this information is made available. When the ride is not canceled, passengers report that the service provided is impacted, causing embarrassment, discomfort, and insecurity to passengers who are dropped off away from the location requested in the application.
In addition, two indicators call for a more in-depth analysis, as it was not possible to determine whether they reflect direct or statistical discrimination by class or ethnicity. The largest of them, with 20.8 percent of complaints, was related to charging and was associated with passenger complaints about drivers who canceled the ride or refused to apply the discount selected by the passenger when requesting the ride. The other, covering the 1.3 percent of complaints categorized as Red Line, that is, rides whose destination address is located in communities or their surroundings, likewise did not allow us to determine direct or statistical discrimination by class or ethnicity. This indicator may be more closely associated with public safety issues, but it can also conceal discriminatory behavior.
Furthermore, we observed that the NB model outperformed the SVM until we adjusted the class weights to address the imbalanced-class problem. Additionally, due to the size of the datasets, we were unable to use the SVM to identify outliers in our data or to obtain better results with either model.
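To make the class-weight adjustment concrete, the sketch below contrasts a Naive Bayes baseline with a linear SVM whose classes are reweighted to compensate for the imbalance. It is a minimal sketch, not the authors' exact pipeline: the TF-IDF features, the toy complaint texts, and the label set are assumptions introduced only to make the example runnable.

```python
# Minimal sketch, assuming TF-IDF features over complaint texts; the toy
# data below is hypothetical, used only to make the example runnable.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

texts = [
    "driver canceled after seeing the destination",
    "driver refused to apply my discount",
    "driver made a racist remark",
    "ride was fine, no issues",
    "driver dropped me far from the requested address",
    "driver canceled the ride with no reason",
    "offensive comment about my appearance",
    "polite driver, smooth trip",
]
labels = ["discrimination", "charging", "discrimination", "none",
          "discrimination", "charging", "discrimination", "none"]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42)

vec = TfidfVectorizer()
Xtr = vec.fit_transform(X_train)
Xte = vec.transform(X_test)

# Naive Bayes baseline (MultinomialNB has no class_weight parameter).
nb = MultinomialNB().fit(Xtr, y_train)

# class_weight="balanced" reweights each class inversely to its frequency,
# which is the adjustment that addressed the imbalanced-class problem.
svm = LinearSVC(class_weight="balanced").fit(Xtr, y_train)

for name, model in (("NB", nb), ("SVM", svm)):
    print(name)
    print(classification_report(y_test, model.predict(Xte), zero_division=0))
```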
6 CONCLUSIONS
In this study, we showed that the problem addressed by this research is a topic of great relevance to society, and that there are opportunities to address it within information systems by promoting mechanisms that reduce discrimination of any type, whether based on race, gender, sexual orientation, religion, or political association, as a way to eradicate this behavior that is harmful to society. Our study, combined with an exploratory analysis of the state of the art in the literature, set out to answer the following questions:
• RQ1. Is there evidence of digital discrimination in the ridesharing application used in Rio de Janeiro city? Based on our analysis, we can conclude that yes, there is evidence of digital discrimination in the city's ridesharing services.
• RQ2. Is it possible to identify the factors that lead to discrimination? Yes, it was possible to identify factors associated with prejudice, in particular towards women. From the comments, we could see that the majority of drivers are men: we found only 4 comments referring to a woman driver, representing 0.63 percent of the total, of which 50 percent were positive.
• RQ3. What are the key concepts regarding Digital Discrimination detection in a ridesharing service? These concepts were identified in our analysis of the domain, for which we proposed an ontology.
• RQ4. Could Machine Learning techniques accurately identify discrimination and its main variables that can be used in actions to mitigate this behavior? Yes, it is possible to use ML models to accurately identify discrimination in ridesharing services. We observed that, given the size of our dataset, even a small adjustment that reduced the number of categories used for classification (sketched after this list) already improved the results of both models. If the size of the dataset can be increased, we can expect these results to improve even further. Moreover, with a dataset large enough to apply an unsupervised learning model, it would be possible to compare supervised and unsupervised results, in addition to analyzing the identified patterns and
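As a complement to RQ4, the sketch below illustrates the kind of category consolidation mentioned above; the mapping itself is hypothetical, not the paper's actual label set, and only the pattern is the point.

```python
# Hypothetical mapping from fine-grained complaint categories to the
# coarser classes used for classification.
COARSE = {
    "racial": "discrimination",
    "gender": "discrimination",
    "sexual_orientation": "discrimination",
    "charging": "service",
    "red_line": "service",
}

def consolidate(labels):
    """Collapse fine-grained categories, keeping unmapped ones as 'other'."""
    return [COARSE.get(label, "other") for label in labels]

print(consolidate(["racial", "charging", "unmapped"]))
# -> ['discrimination', 'service', 'other']
```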