covery of a class of compact extremely star-forming
galaxies. MNRAS, 399(3):1191–1205.
Cover, T. M. and Thomas, J. A. (2012). Elements of infor-
mation theory. John Wiley & Sons.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977).
Maximum likelihood from incomplete data via the EM
algorithm. Journal of the Royal Statistical Society:
Series B (Methodological), 39(1):1–22.
Eisenstein, D. J., Weinberg, D. H., Agol, E., Aihara, H.,
Allende Prieto, C., Anderson, S. F., Arns, J. A.,
Aubourg, É., Bailey, S., Balbinot, E., et al. (2011).
SDSS-III: Massive Spectroscopic Surveys of the Dis-
tant Universe, the Milky Way, and Extra-Solar Plane-
tary Systems. AJ, 142:72.
Goto, T. (2007). A catalogue of local E+A (post-starburst)
galaxies selected from the Sloan Digital Sky Survey
Data Release 5. MNRAS, 381:187–193.
Graur, O. and Maoz, D. (2013). Discovery of 90 Type Ia
supernovae among 700 000 Sloan spectra: the Type Ia
supernova rate versus galaxy mass and star formation
rate at redshift 0.1. MNRAS, 430(3):1746–1763.
Hall, P. B., Brandt, W. N., Petitjean, P., Pâris, I., Filiz Ak,
N., Shen, Y., Gibson, R. R., Aubourg, É., Anderson,
S. F., Schneider, D. P., Bizyaev, D., Brinkmann, J.,
Malanushenko, E., Malanushenko, V., Myers, A. D.,
Oravetz, D. J., Ross, N. P., Shelden, A., Simmons,
A. E., Streblyanska, A., Weaver, B. A., and York,
D. G. (2013). Broad absorption line quasars with
redshifted troughs: high-velocity infall or rotationally
dominated outflows? MNRAS, 434:222–256.
Kohonen, T. (1982). Self-organized formation of topolog-
ically correct feature maps. Biological Cybernetics,
43(1):59–69.
Levi, M., Bebek, C., Beers, T., Blum, R., Cahn, R., Eisen-
stein, D., Flaugher, B., Honscheid, K., Kron, R., La-
hav, O., McDonald, P., Roe, N., Schlegel, D., and rep-
resenting the DESI collaboration (2013). The DESI
Experiment, a whitepaper for Snowmass 2013. ArXiv
e-prints.
Lintott, C. J., Schawinski, K., Slosar, A., Land, K., Bam-
ford, S., Thomas, D., Raddick, M. J., Nichol, R. C.,
Szalay, A., Andreescu, D., Murray, P., and Vanden-
berg, J. (2008). Galaxy Zoo: morphologies derived
from visual inspection of galaxies from the Sloan Dig-
ital Sky Survey. MNRAS, 389:1179–1189.
Liu, F. T., Ting, K. M., and Zhou, Z.-H. (2008). Isolation
forest. In Proceedings of the 2008 Eighth IEEE In-
ternational Conference on Data Mining, ICDM ’08,
pages 413–422, Washington, DC, USA. IEEE Com-
puter Society.
McInnes, L., Healy, J., Saul, N., and Grossberger, L. (2018).
UMAP: Uniform manifold approximation and projection.
The Journal of Open Source Software, 3(29):861.
Meusinger, H., Schalldach, P., Scholz, R.-D., in der Au, A.,
Newholm, M., de Hoon, A., and Kaminsky, B. (2012).
Unusual quasars from the Sloan Digital Sky Survey
selected by means of Kohonen self-organising maps.
A&A, 541:A77.
Nun, I., Pichara, K., Protopapas, P., and Kim, D.-W. (2014).
Supervised detection of anomalous light curves in
massive astronomical catalogs. The Astrophysical
Journal, 793(1):23.
Nun, I., Protopapas, P., Sim, B., and Chen, W. (2016). En-
semble learning method for outlier detection and its
application to astronomical light curves. The Astro-
nomical Journal, 152(3):71.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer,
P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,
A., Cournapeau, D., Brucher, M., Perrot, M., and
Duchesnay, E. (2011). Scikit-learn: Machine learning
in Python. Journal of Machine Learning Research,
12:2825–2830.
Perronnin, F. and Dance, C. (2007). Fisher kernels on visual
vocabularies for image categorization. In 2007 IEEE
conference on computer vision and pattern recogni-
tion, pages 1–8. IEEE.
Perronnin, F., Sánchez, J., and Mensink, T. (2010).
Improving the Fisher kernel for large-scale image
classification. In European conference on computer vision,
pages 143–156. Springer.
Pimentel, M. A., Clifton, D. A., Clifton, L., and Tarassenko,
L. (2014). A review of novelty detection. Signal
Processing, 99(Supplement C):215–249.
Protopapas, P., Giammarco, J. M., Faccioli, L., Struble,
M. F., Dave, R., and Alcock, C. (2006). Finding
outlier light curves in catalogues of periodic variable
stars. MNRAS, 369:677–696.
Reis, I., Poznanski, D., Baron, D., Zasowski, G., and Sha-
haf, S. (2018). Detecting outliers and learning com-
plex structures with large spectroscopic surveys - a
case study with APOGEE stars. Monthly Notices of the
Royal Astronomical Society, page sty348.
Richards, J. W., Starr, D. L., Miller, A. A., Bloom, J. S.,
Butler, N. R., Brink, H., and Crellin-Quick, A. (2012).
Construction of a Calibrated Probabilistic Classifica-
tion Catalog: Application to 50k Variable Sources in
the All-Sky Automated Survey. ApJS, 203:32.
Shi, T. and Horvath, S. (2006). Unsupervised learning with
random forest predictors. Journal of Computational
and Graphical Statistics, 15(1):118–138.
Vedaldi, A. and Fulkerson, B. (2008). VLFeat: An open and
portable library of computer vision algorithms.
http://www.vlfeat.org/.
APPENDIX
SDSS Galaxy Anomalies
This section contains examples of the anomalies detected
in the SDSS galaxy dataset by Isolation Forest,
Unsupervised Random Forest, and PCA reconstruction.
Like our Fisher Vector based method, these three methods
were able to detect diverse types of true anomalies.
Examples of anomalies detected by Isolation Forest are
shown in Table 3,
Detect the Unexpected: Novelty Detection in Large Astrophysical Surveys using Fisher Vectors