
volumes of data. Communications of the ACM,
39(11):27–34.
Felsenstein, J. (1985). Confidence limits on phyloge-
nies: An approach using the bootstrap. Evolution,
39(4):783–791.
Field, A. (2024). Discovering statistics using IBM SPSS
statistics. Sage publications limited.
Fisher, R. A. (1935). The Design of Experiments. Oliver
and Boyd, Edinburgh.
Garg, A. and Rajendran, R. (2024). The impact of struc-
tured prompt-driven generative ai on learning data
analysis in engineering students. In CSEDU (2), pages
270–277.
Girden, E. R. (1992). ANOVA: Repeated Measures, vol-
ume 84 of Quantitative Applications in the Social Sci-
ences. SAGE Publications, Newbury Park, CA.
Huelsenbeck, J. P. and Ronquist, F. (2001). Mrbayes:
Bayesian inference of phylogenetic trees. Bioinfor-
matics, 17(8):754–755.
Jain, A. K. (2010). Data clustering: 50 years beyond k-
means. Pattern recognition letters, 31(8):651–666.
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov,
M., Ronneberger, O., Tunyasuvunakool, K., Bates,
R., Žídek, A., Potapenko, A., Bridgland, A., Meyer,
C., Kohl, S. A., Ballard, A. J., Cowie, A., Romera-
Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T.,
Petersen, S., Reiman, D., Clancy, E., Zielinski, M.,
Steinegger, M., Pacholska, M., Li, T. H., Degrave, R.
J. L., Bickerton, C. M., Meyer, W. J., Velankar, A. A.,
and Hassabis, D. (2021). Highly accurate protein
structure prediction with alphafold. Nature, 596:583–
589.
Koren, S., Walenz, B. P., Berlin, K., Miller, J. R., Bergman,
N. H., and Phillippy, A. M. (2017). Canu: scalable and accurate
long-read assembly via adaptive k-mer weighting and
repeat separation. Genome Research, 27(5):722–736.
Li, C. and Wong, W. H. (2008). Model-based analysis of
oligonucleotide arrays: Expression index computation
and outlier detection. Proceedings of the National
Academy of Sciences, 98(1):31–36.
Liu, F. T., Ting, K. M., and Zhou, Z.-H. (2008). Isolation
forest. In Proceedings of the 2008 Eighth IEEE Inter-
national Conference on Data Mining, pages 413–422.
IEEE.
Macosko, E. Z., Basu, A., Satija, R., Nemesh, J., Shekhar,
K., Goldman, M., Tirosh, I., Bialas, A. R., Kamitaki,
N., Martersteck, E. M., Trombetta, J. J., Weitz, D. A.,
and Regev, A. (2015). Highly parallel genome-wide
expression profiling of individual cells using nanoliter
droplets. Cell, 161(5):1202–1214.
Mann, H. B. and Whitney, D. R. (1947). On a test of
whether one of two random variables is stochastically
larger than the other. The Annals of Mathematical
Statistics, 18(1):50–60.
Martisius, N. L., McPherron, S. P., Schulz-Kornas, E.,
Soressi, M., and Steele, T. E. (2020). A method for the
taphonomic assessment of bone tools using 3d surface
texture analysis of bone microtopography. Archaeo-
logical and Anthropological Sciences, 12:1–16.
McInnes, L., Healy, J., Saul, N., and Großberger, L. (2018).
Umap: Uniform manifold approximation and projec-
tion. Journal of Open Source Software, 3(29):861.
Pearson, K. (1901). On lines and planes of closest fit to
systems of points in space. The London, Edinburgh,
and Dublin Philosophical Magazine and Journal of
Science, 2(11):559–572.
Pevzner, P. A., Tang, H., and Waterman, M. S. (2001).
An eulerian path approach to dna fragment assem-
bly. Proceedings of the National Academy of Sciences,
98(17):9748–9753.
Quaranta, L., Azevedo, K., Calefato, F., and Kalinowski,
M. (2024). A multivocal literature review on the ben-
efits and limitations of industry-leading automl tools.
Information and Software Technology, page 107608.
Rousseeuw, P. J. and Driessen, K. V. (1999). A fast algo-
rithm for the minimum covariance determinant esti-
mator. Technometrics, 41(3):212–223.
Saitou, N. and Nei, M. (1987). The neighbor-joining
method: A new method for reconstructing phylo-
genetic trees. Molecular Biology and Evolution,
4(4):406–425.
Schubert, E. and Zimek, A. (2019). Elki: A large open-
source library for data analysis - elki release 0.7.5
“heidelberg”. arXiv preprint arXiv:1902.03616.
Schwede, T., Kopp, J., Guex, N., and Peitsch, M. C.
(2003). Swiss-model: An automated protein
homology-modeling server. Nucleic Acids Research,
31(13):3381–3385.
Shapiro, S. S. and Wilk, M. B. (1965). An analysis
of variance test for normality (complete samples).
Biometrika, 52(3-4):591–611.
Sim, K., Gopalkrishnan, V., Zimek, A., and Cong, G.
(2013). A survey on enhanced subspace clustering.
Data mining and knowledge discovery, 26:332–397.
Smith, T. F. and Waterman, M. S. (1981). Identification of
common molecular subsequences. Journal of Molec-
ular Biology, 147(1):195–197.
van der Maaten, L. and Hinton, G. (2008). Visualizing data
using t-sne. Journal of Machine Learning Research,
9:2579–2605.
Wilcoxon, F. (1945). Individual comparisons by ranking
methods. Biometrics Bulletin, 1(6):80–83.
Winkler, D. E., Kubo, T., Kubo, M. O., Kaiser, T. M.,
and Tütken, T. (2022). First application of dental
microwear texture analysis to infer theropod feeding
ecology. Palaeontology, 65(6):e12632.
Xanthopoulos, I., Tsamardinos, I., Christophides, V., Si-
mon, E., and Salinger, A. (2020). Putting the human
back in the automl loop. In EDBT/ICDT Workshops.
Position Paper: Computer Supported Education vs. Education Supported Computing - On the Problem of Informed Decision Making of
Appropriate Data Analytics Method