Speech signal enhancement by information combining

Heese, Florian; Vary, Peter; Martin, Rainer
doi:1437-6768
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@PHDTHESIS{Heese:674894,
      author       = {Heese, Florian},
      othercontributors = {Vary, Peter and Martin, Rainer},
      title        = {{S}peech signal enhancement by information combining; 1.
                      {A}uflage},
      volume       = {44},
      school       = {RWTH Aachen University},
      type         = {Dissertation},
      address      = {Aachen},
      publisher    = {Wissenschaftsverlag Mainz},
      reportid     = {RWTH-2016-09678},
      isbn         = {978-3-95886-125-1},
      series       = {Aachener Beiträge zu digitalen Nachrichtensystemen},
      pages        = {1 online resource (x, 194 pages) : illustrations,
                      diagrams},
      year         = {2016},
      note         = {Also published on the publication server of RWTH
                      Aachen University; dissertation, RWTH Aachen University,
                      2016},
      abstract     = {Mobile phones as well as tablets are omnipresent and belong
                      to everyday life. Today audiovisual communication takes
                      place at different locations and in a large variety of
                      acoustic environments. As a consequence, the intelligibility
                      as well as the quality of speech may be significantly
                      degraded by ambient background noise. In order to improve
                      speech intelligibility and to ensure convenient
                      communication with high audio quality, speech enhancement
                      techniques are required. In this thesis all critical
                      components contributing to the enhancement of the up-link
                      signal are addressed: • signal capturing at the acoustic
                      front-end with a new near-field beamformer, • a new
                      codebook-based speech and noise estimation procedure
                      generating and exploiting reliability information, and •
                      actual noise reduction exploiting spectral dependencies of
                      human speech. For the acoustic front-end of the digital
                      processing chain, a novel concept for the filter optimization
                      of a near-field beamformer is introduced. The optimization
                      scheme makes it possible to closely approximate a predefined
                      reception characteristic, which can be freely chosen
                      according to the application. The output of the beamformer
                      provides a pre-enhanced signal with improved SNR for
                      subsequent single-microphone speech enhancement.
                      Single-microphone noise reduction usually relies on
                      statistical properties of speech and noise. In general, the
                      noise is assumed to be stationary or only slightly
                      time-varying, an assumption that is often not fulfilled in
                      practice. Due to imprecise noise estimation,
                      single-microphone systems are prone to unpleasant artifacts
                      known as musical tones.
                      In this context, different Information Combining methods
                      that merge various estimates are presented. They
                      specifically address the problem of non-stationary noise
                      signals and lead to a significantly improved estimation
                      accuracy. On
                      the one hand, the proposed Information Combining is used
                      with respect to spectral dependencies of human speech. On
                      the other hand, it merges the best of several speech and
                      noise estimates depending on their reliability. The
                      necessary estimates are provided by a new statistical noise
                      estimator as well as a codebook-driven speech and noise
                      estimation algorithm. The achieved estimation quality opens
                      up the possibility to close the gap between the conflicting
                      goals of high noise attenuation, low speech distortion, and
                      the prevention of undesired musical tone artifacts. Finally,
                      the practical aspects of the proposed enhancement systems
                      are considered and discussed with two implemented real-time
                      demonstrators.},
      cin          = {613310},
      ddc          = {621.3},
      cid          = {$I:(DE-82)613310_20140620$},
      typ          = {PUB:(DE-HGF)11 / PUB:(DE-HGF)3},
      urn          = {urn:nbn:de:hbz:82-rwth-2016-096782},
      url          = {https://publications.rwth-aachen.de/record/674894},
}