effective than recognising structures from within the
spectrogram of a complex audio mixture.
Further work will aim to enable the separation of notes
in octave relation and to improve the separation of
low-pitched notes. Different approaches to automating
the clustering of note events into sources will also be
explored, as a way to deliver a fully automated source
separation system that could be compared with other
unsupervised algorithms based on machine learning.
ACKNOWLEDGEMENTS
The authors would like to thank the University of
Costa Rica and the Costa Rican Ministry of Science,
Technology and Telecommunications for their support
in funding this research.
REFERENCES
Bryan, N. J. and Mysore, G. J. (2013). Interactive refine-
ment of supervised and semi-supervised sound source
separation estimates. In Proceedings of the 38th IEEE
International Conference on Acoustics, Speech and
Signal Processing, pages 883–887.
Cano, E., Fitzgerald, D., and Brandenburg, K. (2016). Eval-
uation of quality of sound source separation algo-
rithms: human perception vs quantitative metrics. In
Proceedings of the 24th IEEE European Signal Pro-
cessing Conference, number 1, pages 1758–1762.
Chandna, P., Miron, M., Janer, J., and Gómez, E. (2017).
Monoaural audio source separation using deep convo-
lutional neural networks. In Proceedings of the 13th
International Conference on Latent Variable Analysis
and Signal Separation, pages 258–266.
Duan, Z., Pardo, B., and Zhang, C. (2010). Multiple fun-
damental frequency estimation by modeling spectral
peaks and non-peak regions. IEEE Transactions on
Audio, Speech and Language Processing, 18(8):2121–
2133.
Every, M. R. and Szymanski, J. E. (2006). Separation of
synchronous pitched notes by spectral filtering of har-
monics. IEEE Transactions on Audio, Speech and
Language Processing, 14(5):1845–1856.
Févotte, C., Gribonval, R., and Vincent, E. (2005). BSS
EVAL toolbox user guide. Technical Report 1706,
Institut de Recherche en Informatique et Systèmes
Aléatoires.
Grais, E. M., Roma, G., Simpson, A., and Plumbley, M. D.
(2017). Two-stage single-channel audio source sepa-
ration using deep neural networks. IEEE/ACM Trans-
actions on Audio, Speech and Language Processing,
25(9):1469–1479.
Jang, G. J., Lee, T. W., and Oh, Y. H. (2003). Single-channel
signal separation using time-domain basis functions.
IEEE Signal Processing Letters, 10(6):168–171.
Li, Y., Woodruff, J., and Wang, D. (2009). Monaural mu-
sical sound separation based on pitch and common
amplitude modulation. IEEE Transactions on Audio,
Speech and Language Processing, 17(7):1361–1371.
Parsons, T. W. (1976). Separation of speech from in-
terfering speech by means of harmonic selection.
The Journal of the Acoustical Society of America,
60(4):911–918.
Ponce de León Vázquez, J. and Beltrán Blázquez, J. R.
(2012). Blind separation of overlapping partials in
harmonic musical notes using amplitude and phase re-
construction. EURASIP Journal on Advances in Sig-
nal Processing, (223):1–16.
Rafii, Z., Liutkus, A., Stöter, F.-R., Mimilakis, S. I., Fitzgerald,
D., and Pardo, B. (2018). An overview of lead
and accompaniment separation in music. IEEE/ACM
Transactions on Audio, Speech and Language Pro-
cessing, 26(8):1307–1335.
Taghia, J. and Doostari, M. A. (2009). Subband-based
single-channel source separation of instantaneous au-
dio mixtures. World Applied Sciences Journal,
6(6):784–792.
Vincent, E., Gribonval, R., and Févotte, C. (2006). Performance
measurement in blind audio source separation.
IEEE Transactions on Audio, Speech and Language
Processing, 14(4):1462–1469.
Vincent, E., Gribonval, R., and Plumbley, M. D. (2007). Or-
acle estimators for the benchmarking of source separa-
tion algorithms. Signal Processing, 87(8):1933–1950.
Vincent, E. and Plumbley, M. D. (2007). BSS ORACLE
toolbox version 2.1 user guide. Technical report.
Zivanovic, M. (2015). Harmonic bandwidth companding
for separation of overlapping harmonics in pitched
signals. IEEE/ACM Transactions on Audio, Speech
and Language Processing, 23(5):898–908.
Semi-supervised Audio Source Separation based on the Iterative Estimation and Extraction of Note Events