Conference item
The conversation: deep audio-visual speech enhancement
- Abstract:
-
Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker's voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and unseen during training, and for unconstrained environments.
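As context for the approach described in the abstract, the following is a minimal illustrative sketch (not the authors' implementation) of enhancement by predicting the magnitude and phase of the target signal in the short-time Fourier transform (STFT) domain; `predict_magnitude_and_phase` is a hypothetical stand-in for an audio-visual network conditioned on lip-region video features.

```python
# Minimal sketch: speech enhancement by predicting magnitude and phase of the
# target signal in the STFT domain. The predict_magnitude_and_phase function is
# a hypothetical placeholder for an audio-visual network conditioned on lip
# features; here it simply passes the mixture through unchanged.
import numpy as np


def predict_magnitude_and_phase(mix_mag, mix_phase, lip_features):
    """Hypothetical model output: identity mapping used only for illustration."""
    return mix_mag, mix_phase


def enhance(mixture_stft, lip_features):
    # Split the complex mixture spectrogram into magnitude and phase.
    mix_mag = np.abs(mixture_stft)
    mix_phase = np.angle(mixture_stft)

    # A network would predict the target speaker's magnitude and phase from the
    # mixture spectrogram and the synchronised lip-region features.
    tgt_mag, tgt_phase = predict_magnitude_and_phase(mix_mag, mix_phase, lip_features)

    # Recombine into a complex spectrogram; an inverse STFT would then yield
    # the enhanced time-domain waveform.
    return tgt_mag * np.exp(1j * tgt_phase)


# Toy usage: a random complex array standing in for the STFT of a noisy mixture.
mixture = np.random.randn(257, 100) + 1j * np.random.randn(257, 100)
enhanced = enhance(mixture, lip_features=None)
print(enhanced.shape)  # (257, 100)
```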
- Publication status:
- Published
- Peer review status:
- Reviewed (other)
Access Document
- Files:
- Version of record (PDF, 434.9 KB)
- Publisher copy:
- 10.21437/Interspeech.2018-1400
Authors
- Afouras, T.; Chung, J. S.; Zisserman, A.
Funding
Bibliographic Details
- Publisher:
- International Speech Communication Association
- Host title:
- Interspeech Proceedings
- Journal:
- Interspeech
- Volume:
- 2018
- Pages:
- 3244-3248
- Publication date:
- 2018-09-02
- Acceptance date:
- 2018-06-03
- Event location:
- Hyderabad, India
- DOI:
- 10.21437/Interspeech.2018-1400
- ISSN:
- 1990-9772
Item Description
- Pubs id:
- pubs:859243
- UUID:
- uuid:d04cc64a-7ae9-4a24-9804-f5b78235d543
- Local pid:
- pubs:859243
- Source identifiers:
- 859243
- Deposit date:
- 2018-06-25
Terms of use
- Copyright holder:
- International Speech Communication Association
- Copyright date:
- 2018
- Notes:
- This is a conference paper presented at Interspeech, 02-06 September 2018, Hyderabad, India. The final version is available online from ISCA at: https://doi.org/10.21437/Interspeech.2018-1400