
Conference item

The conversation: deep audio-visual speech enhancement

Abstract:

Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker's voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and un...
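The abstract describes enhancement as predicting both the magnitude and the phase of the target signal from the mixed audio. As a minimal illustrative sketch only (not the paper's network; `enhance`, `mag_mask`, and `phase_residual` are hypothetical names), the final reconstruction step can be thought of as applying a predicted magnitude mask and a phase correction to the mixture's short-time Fourier transform:

```python
import numpy as np

def enhance(mix_stft, mag_mask, phase_residual):
    """Reconstruct a target-speaker STFT from a mixture STFT.

    mix_stft       : complex array (freq x time), STFT of the mixed audio
    mag_mask       : real array, predicted multiplicative magnitude mask
    phase_residual : real array (radians), predicted correction to the
                     mixture phase

    Returns the complex STFT of the enhanced signal, which would then be
    inverted (e.g. with an inverse STFT) to obtain the waveform.
    """
    magnitude = np.abs(mix_stft) * mag_mask          # masked magnitude
    phase = np.angle(mix_stft) + phase_residual      # corrected phase
    return magnitude * np.exp(1j * phase)            # recombine
```

In practice the mask and phase residual would be produced by a network conditioned on the lip-region video; the sketch only shows how magnitude and phase predictions combine into an enhanced spectrogram.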

Publication status:
Published
Peer review status:
Reviewed (other)

Publisher copy:
10.21437/Interspeech.2018-1400

Authors


Institution:
University of Oxford
Division:
Mathematical, Physical and Life Sciences
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
Mathematical, Physical and Life Sciences
Department:
Engineering Science
Oxford college:
Brasenose College
Role:
Author
Publisher:
International Speech Communication Association
Host title:
Interspeech Proceedings
Journal:
Interspeech
Volume:
2018
Pages:
3244-3248
Publication date:
2018-09-02
Acceptance date:
2018-06-03
Event location:
Hyderabad, India
DOI:
10.21437/Interspeech.2018-1400
ISSN:
1990-9772
Pubs id:
pubs:859243
UUID:
uuid:d04cc64a-7ae9-4a24-9804-f5b78235d543
Local pid:
pubs:859243
Source identifiers:
859243
Deposit date:
2018-06-25
