
Table 3: Total Variation Distance for different optimizers and loss functions of φ.

Optimizer   Loss    Learning Rate α   Avg. Total Variation Distance
ADAM        W       0.0001            0.37111863 ± 0.03492265
ADAM        logW    0.0001            0.36836797 ± 0.043502506
ADAM        KLDiv   0.01              0.2161428 ± 7.386059e-06
SGD         W       0.0001            0.39597395 ± 0.049845863
SGD         logW    0.0001            0.37211606 ± 0.04976438
SGD         KLDiv   0.001             0.212498 ± 0.0048501617
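For reference, the total variation distance between two discrete distributions p and q is TV(p, q) = ½ Σ_x |p(x) − q(x)|. A minimal sketch of this computation follows; the example arrays are illustrative only and are not values from our experiments.

import numpy as np

def total_variation_distance(p, q):
    # TV(p, q) = 0.5 * sum_x |p(x) - q(x)| for discrete distributions.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# Illustrative example (values not taken from Table 3).
p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.35, 0.25])
print(total_variation_distance(p, q))  # 0.1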
ter accuracy than the other proposals (Xu et al., 2017; Chavira and Darwiche, 2008). We also illustrated the use of SOFs for finite-variable first-order logic formulas, showing promising results. Finally, we conducted KD experiments for two scenarios: when the SOF built from the set of satisfying assignments was used as a regularizer, and when it served as the main loss function. These experiments are promising: the distilled students achieve accuracy similar to that of their teacher models while requiring only half the number of parameters.
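A minimal PyTorch-style sketch of the two KD scenarios, assuming a hypothetical callable sof_loss that evaluates the semantic objective on the student's output, and an illustrative weight lambda_sof; neither the name nor the value is taken from our implementation.

import torch
import torch.nn.functional as F

def kd_step(student, batch, targets, sof_loss, optimizer,
            sof_as_regularizer=True, lambda_sof=0.1):
    # One training step for the two scenarios described above.
    # sof_loss and lambda_sof are hypothetical placeholders.
    optimizer.zero_grad()
    logits = student(batch)
    if sof_as_regularizer:
        # Scenario 1: SOF added to the usual task loss as a regularizer.
        loss = F.cross_entropy(logits, targets) + lambda_sof * sof_loss(logits)
    else:
        # Scenario 2: SOF used as the main loss function.
        loss = sof_loss(logits)
    loss.backward()
    optimizer.step()
    return loss.item()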
ACKNOWLEDGEMENTS
Vaishak Belle was supported by a Royal Society Research Fellowship. Miguel Angel Mendez Lucero was supported by CONACYT Mexico.
REFERENCES
Amari, S. and Nagaoka, H. (2000). Methods of Information
Geometry. Translations of mathematical monographs.
American Mathematical Society.
Amari, S.-i. (2005). Information geometry and its applications. Journal of Mathematical Psychology, 49:101–102.
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., and Mané, D. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.
Barrett, C., Sebastiani, R., Seshia, S. A., Tinelli, C., Biere, A., Heule, M., van Maaren, H., and Walsh, T. (2009). Handbook of satisfiability. Satisfiability modulo theories, 185:825–885.
Bell, J. and Slomson, A. (2006). Models and Ultraproducts:
An Introduction. Dover Books on Mathematics Series.
Dover Publications.
Belle, V. (2020). Symbolic logic meets machine learning: A
brief survey in infinite domains. In Davis, J. and Tabia,
K., editors, Scalable Uncertainty Management - 14th
International Conference, SUM 2020, Bozen-Bolzano,
Italy, September 23-25, 2020, Proceedings, volume
12322 of Lecture Notes in Computer Science, pages
3–16. Springer.
Belle, V., Passerini, A., and Van den Broeck, G. (2015). Probabilistic inference in hybrid domains by weighted model integration. In Proceedings of 24th International Joint Conference on Artificial Intelligence (IJCAI), volume 2015, pages 2770–2776.
Belousov, B. (2017). Geodesic distance between probability distributions is not the KL divergence. 11 July, 2017.
Chavira, M. and Darwiche, A. (2008). On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7):772–799.
Cover, T. M. and Thomas, J. A. (2006). Elements of Information Theory, 2nd Edition (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience.
Csiszár, I. and Shields, P. (2004). Information theory and statistics: A tutorial. Foundations and Trends® in Communications and Information Theory, 1(4):417–528.
Darwiche, A. and Marquis, P. (2002). A knowledge compilation map. Journal of Artificial Intelligence Research, 17:229–264.
Enderton, H. (2001). A Mathematical Introduction to Logic.
Elsevier Science.
Feldstein, J., Jurčius, M., and Tsamoura, E. (2023). Parallel neurosymbolic integration with Concordia. arXiv preprint arXiv:2306.00480.
Fischer, M., Balunovic, M., Drachsler-Cohen, D., Gehr, T., Zhang, C., and Vechev, M. T. (2019). DL2: Training and querying neural networks with logic. In International Conference on Machine Learning.
Gallot, S., Hulin, D., and Lafontaine, J. (2004). Riemannian
Geometry. Universitext. Springer Berlin Heidelberg.
Giunchiglia, E., Stoian, M. C., and Lukasiewicz, T. (2022).
Deep learning with logical constraints. In Raedt, L. D.,
editor, Proceedings of the Thirty-First International
Joint Conference on Artificial Intelligence, IJCAI 2022,
Vienna, Austria, 23-29 July 2022, pages 5478–5485.
ijcai.org.
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org.
Gou, J., Yu, B., Maybank, S. J., and Tao, D. (2020). Knowledge distillation: A survey. CoRR, abs/2006.05525.
Gunning, D. (2017). Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), nd Web, 2.
Hoernle, N., Karampatsis, R., Belle, V., and Gal, K. (2022). Multiplexnet: Towards fully satisfied logical constraints in neural networks. In Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence,