20 October 2022
C-DPCL: an adaptive conformer-based source separation method on deep clustering
Shunxian Gu
Proceedings Volume 12451, 5th International Conference on Computer Information Science and Application Technology (CISAT 2022); 1245155 (2022) https://doi.org/10.1117/12.2656634
Event: 5th International Conference on Computer Information Science and Application Technology (CISAT 2022), 2022, Chongqing, China
Abstract
The cocktail-party problem is the task of separating each source from overlapped speech. Transformer-based source separation models have achieved impressive performance in single-channel scenarios. This paper proposes applying Conformer models to deep clustering approaches in both clean and noisy environments. First, we replace the Bi-LSTM-based embedding network of the original deep clustering work with a Conformer structure. Second, canopy K-means is applied as the postprocessing module so that the model can handle cases in which the number of speakers is unknown. In addition to tuning the model structure, we introduce a denoising front-end that enables the model to handle noisy mixture signals. Experimental results on WSJ0-2mix show that our model achieves state-of-the-art performance in clean source separation compared with previous works. For noisy mixtures, experiments on LibriMix demonstrate the effectiveness of our ensemble model, with an average 6% increase in SDRi compared with previous SOTA models.
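The abstract does not spell out how the canopy K-means postprocessing is implemented, so the following is only a minimal sketch of the general idea under stated assumptions: a canopy pass first estimates how many clusters (speakers) are present, and its centers then seed an ordinary K-means. The function names `canopy_centers` and `kmeans`, the thresholds `t1` and `t2`, and the synthetic 2-D points standing in for time-frequency embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np

def canopy_centers(points, t1, t2):
    """Generic canopy pass: returns one center per canopy.

    t1 is the loose threshold (canopy membership), t2 the tight threshold
    (points this close to a seed can no longer seed a new canopy).
    """
    if not (t1 >= t2 >= 0):
        raise ValueError("expected t1 >= t2 >= 0")
    remaining = np.arange(len(points))
    centers = []
    while remaining.size > 0:
        seed = points[remaining[0]]
        dists = np.linalg.norm(points[remaining] - seed, axis=1)
        # Canopy center: mean of all remaining points within the loose threshold.
        centers.append(points[remaining[dists <= t1]].mean(axis=0))
        # Drop points within the tight threshold (always includes the seed).
        remaining = remaining[dists > t2]
    return np.stack(centers)

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(points, init_centers, n_iter=20):
    """Plain Lloyd iterations started from the given centers."""
    centers = init_centers.astype(float).copy()
    for _ in range(n_iter):
        labels = assign(points, centers)
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    return assign(points, centers), centers

# Synthetic stand-in for embeddings: three tight, well-separated groups.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
points = np.concatenate(
    [c + 0.1 * rng.standard_normal((50, 2)) for c in true_centers]
)

init = canopy_centers(points, t1=3.0, t2=1.5)  # thresholds chosen for this toy data
est_k = len(init)                              # estimated number of sources
labels, centers = kmeans(points, init)
print("estimated number of clusters:", est_k)
```

In this sketch the number of clusters is never passed in; it falls out of the canopy pass, which is what makes the combination attractive when the speaker count is unknown. The thresholds would need to be tuned to the geometry of the actual embedding space.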
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shunxian Gu "C-DPCL: an adaptive conformer-based source separation method on deep clustering", Proc. SPIE 12451, 5th International Conference on Computer Information Science and Application Technology (CISAT 2022), 1245155 (20 October 2022); https://doi.org/10.1117/12.2656634
KEYWORDS
Data modeling
Neural networks
Binary data
Interference (communication)
Computer programming
Denoising
Time-frequency analysis