Publications
Selected key publications from our lab can be found below. For more publications, see the personal pages of our members.
2024
- Y. Li, R. Yuan, G. Zhang, Y. Ma, X. Chen, H. Yin, C. Xiao, C. Lin, A. Ragni, E. Benetos, N. Gyenge, R. Dannenberg, R. Liu, W. Chen, G. Xia, Y. Shi, W. Huang, Z. Wang, Y. Guo, J. Fu, “MERT: acoustic music understanding model with large-scale self-supervised training”, in 12th International Conference on Learning Representations (ICLR), 2024.
- D. Edwards, S. Dixon, E. Benetos, A. Maezawa, Y. Kusaka, “A data-driven analysis of robust automatic piano transcription”, IEEE Signal Processing Letters, vol. 31, pp. 681-685, 2024. postprint
- S. Singh, C. J. Steinmetz, E. Benetos, H. Phan, and D. Stowell, “ATGNN: audio tagging graph neural network”, IEEE Signal Processing Letters, vol. 31, pp. 825-829, 2024. postprint
- Y. Li, W. Cao, W. Xie, J. Li, and E. Benetos, “Few-shot class-incremental audio classification using dynamically expanded classifier with self-attention modified prototypes”, IEEE Transactions on Multimedia, vol. 26, pp. 1346-1360, 2024.
2023
- L. Wang, M. Clayton, and A. Rossberg, “Drone audition for bioacoustic monitoring,” Methods in Ecology and Evolution, vol. 14, no. 12, pp. 3068-3082, Dec. 2023.
- D. Edwards, S. Dixon, and E. Benetos, “PiJAMA: Piano Jazz with Automatic MIDI Annotations”, Transactions of the International Society for Music Information Retrieval, vol. 6, no. 1, pp. 89-102, 2023.
- R. Yuan, Y. Ma, Y. Li, G. Zhang, X. Chen, H. Yin, L. Zhuo, Y. Liu, J. Huang, Z. Tian, B. Deng, N. Wang, C. Lin, E. Benetos, A. Ragni, N. Gyenge, R. Dannenberg, W. Chen, G. Xia, W. Xue, S. Liu, S. Wang, R. Liu, Y. Guo, J. Fu, “MARBLE: Music Audio Representation Benchmark for Universal Evaluation”, 37th Conf. Neural Information Processing Systems (NeurIPS), Dec. 2023.
- S. Sarkar, L. Thorpe, E. Benetos, M. Sandler, “Leveraging synthetic data for improving chamber ensemble separation”, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct. 2023. Best student paper award.
- A. Alex, L. Wang, P. Gastaldo, and A. Cavallaro, “Data augmentation for speech separation,” Speech Communication, vol. 152, no. 102948, pp. 1-16, July 2023.
- M. Clayton, L. Wang, A. McPherson, and A. Cavallaro, “An embedded multichannel sound acquisition system for drone audition,” IEEE Sensors Journal, vol. 23, no. 12, pp. 13377-13386, Jun. 2023.
- D. Mukhutdinov, A. Alex, A. Cavallaro and L. Wang, “Deep learning models for single-channel speech enhancement on drones,” IEEE Access, vol. 11, pp. 22993-23007, Mar. 2023.
2022
- H. Phan, A. Mertins, M. Baumert. Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods. IEEE Transactions on Biomedical Engineering, vol. 69, no. 12, pp. 3612-3622, 2022
- O. Y. Chén, H. Phan, H. Cao, T. Qian, M. De Vos. Probing Potential Priming: Defining, Quantifying, and Testing the Causal Priming Effect Using the Potential Outcomes Framework. Frontiers in Psychology, vol. 13, Article ID 724498, 2022
- H. Phan, K. B. Mikkelsen, O. Y. Chén, P. Koch, A. Mertins, M. De Vos. SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification. IEEE Transactions on Biomedical Engineering, vol. 69, no. 8, pp. 2456-2467, 2022
- H. Phan, O. Y. Chén, Minh C. Tran, P. Koch, A. Mertins, and M. De Vos. XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 9, pp. 5903-5915, 2022
- E. R. M. Heremans, H. Phan, A. H. Ansari, P. Borzée, B. Buyse, D. Testelmans, M. De Vos. Feature matching as improved transfer learning technique for wearable EEG. Biomedical Signal Processing and Control, vol. 78, article ID 104009, 2022
- E. R. M. Heremans, H. Phan, P. Borzée, B. Buyse, D. Testelmans, M. De Vos. From unsupervised to semi-supervised adversarial domain adaptation in EEG-based sleep staging. Journal of Neural Engineering, vol. 19, article ID 036044, 2022
- K. B. Mikkelsen, H. Phan, M. L. Rank, M. C. Hemmsen, M. de Vos, P. Kidmose. Sleep monitoring using ear-centered setups: Investigating the influence from electrode configurations. IEEE Transactions on Biomedical Engineering (TBME), vol. 69, no. 5, pp. 1564-1572, 2022
- P. Autthasan, R. Chaisaen, T. Sudhawiyangkul, P. Rangpong, S. Kiatthaveephong, N. Dilokthanakul, G. Bhakdisongkhram, H. Phan, C. Guan, and T. Wilaiprasitporn. MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification. IEEE Transactions on Biomedical Engineering, vol. 69, no. 6, pp. 2105-2118, 2022
- A. Ragano, E. Benetos, and A. Hines, “Automatic quality assessment of digitized and restored sound archives”, Journal of the Audio Engineering Society, vol. 70, no. 4, pp. 252-270, April 2022.
- C. Wang, E. Benetos, V. Lostanlen, and E. Chew, “Adaptive scattering transforms for playing technique recognition”, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 30, pp. 1407-1421, March 2022. postprint
- E. Benetos, A. Ragano, D. Sgroi, and A. Tuckwell, “Measuring National Mood with Music: Using Machine Learning to Construct a Measure of National Valence from Audio Data”, Behavior Research Methods, Feb. 2022.
- A. Terenzi, N. Ortolani, I. Nolasco, E. Benetos, and S. Cecchi. Comparison of Feature Extraction Methods for Sound-based Classification of Honey Bee Activity. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 112-122, 2022. postprint
2021
- A. Ragano, E. Benetos, and A. Hines. More for less: Non-intrusive speech quality assessment with limited annotations. In: 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), pp. 103-108, 2021.
- Y. Zhao, C. Wang, G. Fazekas, E. Benetos, and M. Sandler. Violinist identification based on vibrato features. In: 2021 29th European Signal Processing Conference (EUSIPCO), pp. 381-385, 2021.
- K. O’Hanlon, E. Benetos, and S. Dixon. Detecting Cover Songs with Pitch Class Key-Invariant Networks. In: Proc 2021 IEEE 31st Intl Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, 2021.
- L. Liu, V. Morfi, and E. Benetos. Joint Multi-Pitch Detection and Score Transcription for Polyphonic Piano Music. In: Proc 2021 IEEE Intl Conf on Acoustics, Speech and Signal Processing (ICASSP), pp. 281-285, 2021.
- S. Singh, H. L. Bear, and E. Benetos. Prototypical Networks for Domain Adaptation in Acoustic Scene Classification. In: Proc 2021 IEEE Intl Conf on Acoustics, Speech and Signal Processing (ICASSP), pp. 346-350, 2021.
- H. Phan, H. L. Nguyen, O. Y. Chén, P. Koch, N. Q. K. Duong, I. McLoughlin, A. Mertins. Self-Attention Generative Adversarial Network for Speech Enhancement. In: Proc 2021 IEEE Intl Conf on Acoustics, Speech and Signal Processing (ICASSP), pp. 7103-7107, 2021.
- T. N. T. Nguyen, N. K. Nguyen, H. Phan, L. Pham, K. Ooi, D. L. Jones, W.-S. Gan. A General Network Architecture for Sound Event Localization and Detection Using Transfer Learning and Recurrent Neural Network. In: Proc 2021 IEEE Intl Conf on Acoustics, Speech and Signal Processing (ICASSP), pp. 935-939, 2021.
- H. Phan, H. L. Nguyen, O. Y. Chén, L. Pham, P. Koch, I. McLoughlin, A. Mertins. Multi-view Audio and Music Classification. In: Proc 2021 IEEE Intl Conf on Acoustics, Speech and Signal Processing (ICASSP), pp. 611-615, 2021.
- L. Pham, H. Phan, A. Schindler, R. King, A. Mertins, and I. McLoughlin. Inception-Based Network and Multi-Spectrogram Ensemble Applied To Predict Respiratory Anomalies and Lung Diseases. In: Proc 43rd Annual Intl Conf of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 253-256, 2021.
- C. Lordelo, E. Benetos, S. Dixon, S. Ahlbäck, and P. Ohlsson. Adversarial Unsupervised Domain Adaptation for Harmonic-Percussive Source Separation. IEEE Signal Processing Letters (SPL), vol. 28, pp. 81-85, 2021. postprint
- H. Phan, O. Y. Chén, Minh C. Tran, P. Koch, A. Mertins, and M. De Vos. XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. (in press)
- L. Pham, H. Phan, R. Palaniappan, A. Mertins, I. McLoughlin. CNN-MoE based framework for classification of respiratory anomalies and lung disease detection. IEEE Journal of Biomedical and Health Informatics (JBHI), 2021. (in press)
- L. Pham, H. Phan, T. Nguyen, R. Palaniappan, A. Mertins, I. McLoughlin. Robust Acoustic Scene Classification using a Multi-Spectrogram Encoder-Decoder Framework. Digital Signal Processing, vol. 110, Article ID 102943, 2021.
2020
- A. Ycart, L. Liu, E. Benetos, and M. T. Pearce. Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription. Transactions of the International Society for Music Information Retrieval, vol. 3, no. 1, pp. 68-81, June 2020.
- A. Ycart and E. Benetos. Learning and evaluation methodologies for polyphonic music sequence prediction with LSTMs. IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 28, pp. 1328-1341, Apr. 2020. postprint
- H. Phan, I. V. McLoughlin, L. Pham, O. Y. Chén, P. Koch, M. De Vos, and A. Mertins. Improving GANs for Speech Enhancement. IEEE Signal Processing Letters, vol. 27, pp. 1700-1704, 2020.
- I. McLoughlin, Z. Xie, Y. Song, H. Phan, and P. Ramaswamy. Time-frequency feature fusion for noise-robust audio event classification. Circuits, Systems, and Signal Processing (CSSP), vol. 39, no. 3, pp. 1672–1687, 2020.
- O. Y. Chén, F. Lipsmeier, H. Phan, J. Prince, K. I. Taylor, C. Gossens, M. Lindemann, and M. De Vos. Building a Machine-learning Framework to Remotely Assess Parkinson’s Disease Using Smartphones. IEEE Transactions on Biomedical Engineering (TBME), vol. 67, no. 12, pp. 3491-3500, 2020.
- M. A. Martínez Ramírez, E. Benetos, and J. D. Reiss, “Deep learning for black-box modeling of audio effects”, Applied Sciences, vol. 10, no. 2, Jan. 2020.
2019
- E. Benetos, S. Dixon, Z. Duan, and S. Ewert. Automatic Music Transcription: An Overview. IEEE Signal Processing Magazine, vol. 36, no. 1, pp. 20-30, Jan. 2019. postprint
- H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging. IEEE Transactions on Neural Systems and Rehabilitation Engineering (TNSRE), vol. 27, no. 3, pp. 400-410, 2019
- H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification. IEEE Transactions on Biomedical Engineering (TBME), vol. 66, no. 5, pp. 1285-1296, 2019.
- Q. Zhou, Z. Feng, and E. Benetos, “Adaptive noise reduction for sound event detection using subband-weighted NMF”, Sensors, vol. 19, no. 14, July 2019.
- E. Covas and E. Benetos, “Optimal Neural Network Feature Selection for Spatial-Temporal Forecasting”, Chaos, vol. 29, no. 6, June 2019. postprint
2018
- E. Benetos, D. Stowell, and M. D. Plumbley. Approaches to complex sound scene analysis. In Computational Analysis of Sound Scenes and Events, T. Virtanen, M. D. Plumbley, and D. P. W. Ellis (eds.), Springer, 2018.
- A. Mesaros, T. Heittola, E. Benetos, P. Foster, M. Lagrange, T. Virtanen, and M. D. Plumbley, “Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 2, pp. 379-393, Feb. 2018.
- M. Panteli, E. Benetos, and S. Dixon, “A review of manual and computational approaches for the study of world music corpora”, Journal of New Music Research, vol. 47, no. 2, pp. 176-189, March 2018.
- H. Ali, S. N. Tran, E. Benetos, and A. S. d’Avila Garcez, “Speaker recognition with hybrid features from a deep belief network”, Neural Computing and Applications, vol. 29, no. 6, pp. 13-19, March 2018.
- J. J. Valero-Mas, E. Benetos, and J. M. Iñesta, “A supervised classification approach for note tracking in polyphonic piano transcription”, Journal of New Music Research, vol. 47, no. 3, pp. 249-263, June 2018.
2017
- M. Panteli, E. Benetos, and S. Dixon, “A computational study on outliers in world music”, PLoS ONE, vol. 12, no. 12, article no. e0189399, Dec. 2017.
- D. Stowell. Computational Bioacoustic Scene Analysis. In Computational Analysis of Sound Scenes and Events, T. Virtanen, M. D. Plumbley, and D. P. W. Ellis (eds.), Springer, Oct. 2017.
- H. Pamula et al. Adaptation of deep learning methods to nocturnal bird audio monitoring. In: LXIV Open Seminar on Acoustics (OSA) 2017, Piekary Śląskie, Poland, 2017.
- D. Stowell, E. Benetos, and L. F. Gill. On-bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(6):1193-1206, 2017. Postprint
- E. Benetos, G. Lafay, M. Lagrange and M. D. Plumbley. Polyphonic Sound Event Tracking using Linear Dynamical Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(6):1266-1277, 2017. Postprint
- I. McLoughlin, H. Zhang, Z. Xie, Y. Song, W. Xiao, and H. Phan. Continuous Robust Sound Event Classification Using Time-Frequency Features and Deep Learning. PLoS ONE, vol. 12, no. 9, Article ID e0182309, 2017.
- H. Phan, L. Hertel, M. Maass, P. Koch, R. Mazur, and A. Mertins. Improved Audio Scene Classification based on Label-Tree Embeddings and Convolutional Neural Networks. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 25, no. 6, pp. 1278-1290, 2017.
- S. Abdallah, E. Benetos, N. Gold, S. Hargreaves, T. Weyde, and D. Wolff. The Digital Music Lab: A Big Data Infrastructure for Digital Musicology. ACM Journal on Computing and Cultural Heritage, vol. 10, no. 1, pp. 2:1-2:21, April 2017. Postprint
- A. McLeod, R. Schramm, M. Steedman, and E. Benetos, “Automatic Transcription of Polyphonic Vocal Music”, Applied Sciences, vol. 7, no. 12, article no. 1285, Dec. 2017.
2016
- D. Stowell, L. F. Gill, and D. Clayton. Detailed temporal structure of communication networks in groups of songbirds. Journal of the Royal Society Interface, 13(119), 2016.
- S. Sigtia, E. Benetos, S. Dixon. An End-to-End Neural Network for Polyphonic Piano Music Transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing 24(5): 927-939, 2016. DOI: 10.1109/TASLP.2016.2533858
- H. Phan, L. Hertel, M. Maass, R. Mazur, and A. Mertins. Learning Representations for Nonspeech Audio Events through Their Similarities to Speech Patterns. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 24, no. 4, pp. 807-822, 2016.
- G. Lafay, M. Lagrange, M. Rossignol, E. Benetos, and A. Roebel. A morphological model for simulating acoustic scenes and its application to sound event detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 10, pp. 1854-1864, Oct. 2016. Postprint
2015
- D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange and M. D. Plumbley, Detection and Classification of Audio Scenes and Events. IEEE Transactions on Multimedia, vol. 17, no. 10, pp. 1733-1746, 2015.
- C. Kereliuk, B.L. Sturm, J. Larsen, Deep Learning and Music Adversaries. IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2059-2071, 2015.
- S. Ewert, M.D. Plumbley, M. Sandler, A dynamic programming variant of non-negative matrix deconvolution for the transcription of struck string instruments. In: Proc Int Conf Acoustics, Speech and Signal Processing (ICASSP), 2015.
- E. Benetos, A. Holzapfel. Automatic transcription of Turkish microtonal music. Journal of the Acoustical Society of America, vol. 138, no. 4, pp. 2118-2130, 2015. Postprint
- H. Phan, Marco Maaß, R. Mazur, and A. Mertins. Random Regression Forests for Acoustic Event Detection and Classification. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 23(1), pp. 20-31, 2015