Conference item
Slow-fast auditory streams for audio recognition
- Abstract:
- We propose a two-stream convolutional network for audio recognition that operates on time-frequency spectrogram inputs. Following similar success in visual recognition, we learn Slow-Fast auditory streams with separable convolutions and multi-level lateral connections. The Slow pathway has high channel capacity, while the Fast pathway operates at a fine-grained temporal resolution. We showcase the importance of our two-stream proposal on two diverse datasets, VGG-Sound and EPIC-KITCHENS-100, and achieve state-of-the-art results on both.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- (Accepted manuscript, 2.1MB)
- Publisher copy:
- 10.1109/ICASSP39728.2021.9413376
Bibliographic Details
- Publisher:
- IEEE
- Host title:
- ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Pages:
- 855-859
- Publication date:
- 2021-05-13
- Acceptance date:
- 2021-01-29
- Event title:
- 2021 IEEE International Conference on Acoustics, Speech and Signal Processing
- Event location:
- Toronto, Ontario, Canada
- Event website:
- https://www.2021.ieeeicassp.org/
- Event start date:
- 2021-06-06
- Event end date:
- 2021-06-11
- DOI:
- 10.1109/ICASSP39728.2021.9413376
- EISSN:
- 2379-190X
- ISSN:
- 1520-6149
- EISBN:
- 978-1-7281-7605-5
- ISBN:
- 978-1-7281-7606-2
Item Description
- Language:
- English
- Pubs id:
- 1212419
- Local pid:
- pubs:1212419
- Deposit date:
- 2022-01-19
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2021
- Rights statement:
- © 2021 IEEE
- Notes:
- This is the accepted manuscript version of the paper. The final version is available online from IEEE at https://doi.org/10.1109/ICASSP39728.2021.9413376