Acoustic signal enhancement using autoregressive PixelCNN architecture

Pixel CNN for sound denoising

Authors

  • Shibani Kar, Sambalpur University Institute of Information Technology

DOI:

https://doi.org/10.62110/sciencein.jist.2024.v12.770

Keywords:

Pixel CNN, deep generative model, auto regression, non-stationary noises, speech de-noising

Abstract

Acoustic signals such as speech and sound are easily degraded by interference present in our surroundings. The present work explores the use of the Pixel CNN architecture for removing non-stationary noise from speech signals. Noise in speech signals degrades the performance of applications that use speech as a communication medium, such as automatic speech recognition systems, hearing aids, and mobile phones. Pixel CNN is a deep generative network architecture implemented as an autoregressive model. The “NOIZEUS” dataset provides the noise-mixed and clean speech samples. The architecture learns features from the input speech using the spectrogram representation of the speech signal. To demonstrate the efficiency of the method, the performance of the Pixel CNN architecture is compared with a number of baseline methods. The metrics used for comparison are “PESQ” and “STOI”.
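
The abstract does not give the network details, so the following is only a minimal sketch of the two ideas it names: a spectrogram representation of speech and the autoregressive (causal) masking that characterizes Pixel CNN layers. The function names, the frame length of 256, the hop of 128, the Hann window, and the raster ordering over the time-frequency grid are all illustrative assumptions, not the author's configuration.

```python
import numpy as np


def magnitude_spectrogram(signal: np.ndarray, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Short-time Fourier magnitude of a 1-D signal.

    frame_len and hop are assumed values, not taken from the paper.
    Returns an array of shape (frequency bins, frames).
    """
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def causal_mask(kernel_size: int, mask_type: str = "A") -> np.ndarray:
    """Binary mask for a Pixel CNN style masked convolution kernel.

    Positions above the centre row, and left of the centre in the same row,
    are kept, so each output depends only on previously generated positions.
    Type "A" also hides the centre position; type "B" keeps it.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    mask = np.zeros((kernel_size, kernel_size), dtype=np.float32)
    centre = kernel_size // 2
    mask[:centre, :] = 1.0       # every row above the centre row
    mask[centre, :centre] = 1.0  # same row, strictly left of the centre
    if mask_type == "B":
        mask[centre, centre] = 1.0
    return mask


# Hypothetical input: one second of synthetic noise at an assumed 8 kHz rate.
rng = np.random.default_rng(0)
noisy = rng.standard_normal(8000)
spec = magnitude_spectrogram(noisy)
mask_a = causal_mask(3, "A")
mask_b = causal_mask(3, "B")
```

In a masked convolution, the kernel weights are multiplied elementwise by such a mask before the convolution is applied; a type "A" mask is conventionally used in the first layer and type "B" in later layers.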

URN:NBN:sciencein.jist.2024.v12.770

Author Biography

  • Shibani Kar, Sambalpur University Institute of Information Technology

    Department of Electronics and Communication Engineering

Published

2023-12-11

Issue

Section

Engineering

How to Cite

Acoustic signal enhancement using autoregressive PixelCNN architecture. (2023). Journal of Integrated Science and Technology, 12(3), 770. https://doi.org/10.62110/sciencein.jist.2024.v12.770
