Learning deep models from synthetic data for extracting dolphin whistle contours
Abstract
We present a learning-based method for extracting whistles of toothed whales (Odontoceti) in hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and decomposes each spectrogram into a set of time-frequency patches. A deep neural network learns archetypical patterns (e.g., crossings, frequency-modulated sweeps) from the spectrogram patches and predicts time-frequency peaks that are associated with whistles. We also developed a comprehensive method to synthesize training samples from background environments and to train the network with minimal human annotation effort. We applied the proposed learn-from-synthesis method to a subset of the public Detection, Classification, Localization, and Density Estimation (DCLDE) 2011 workshop data to extract whistle confidence maps, which we then processed with an existing contour extractor to produce whistle annotations. On common dolphin (Delphinus spp.) and bottlenose dolphin (Tursiops truncatus) whistles, the F1-score of our best synthesis method exceeded that of our baseline whistle extraction algorithm by 0.158 (~25% improvement).
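The patch-based spectrogram representation described above can be illustrated with a minimal sketch. The code below is not the authors' implementation: the window length, hop size, patch dimensions, and function names are all hypothetical choices made for illustration, and a synthetic frequency-modulated upsweep stands in for a real dolphin whistle.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform (Hann window)."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # Rows = frequency bins, columns = time frames.
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T

def extract_patches(spec, patch_h=64, patch_w=64, stride=32):
    """Slide a window over the spectrogram to cut out time-frequency patches."""
    patches = []
    for f in range(0, spec.shape[0] - patch_h + 1, stride):
        for t in range(0, spec.shape[1] - patch_w + 1, stride):
            patches.append(spec[f:f + patch_h, t:t + patch_w])
    return np.stack(patches)

# Synthetic stand-in for a whistle: a 1 s frequency-modulated upsweep.
fs = 48_000
t = np.arange(0, 1.0, 1 / fs)
sweep = np.sin(2 * np.pi * (5_000 * t + 4_000 * t ** 2))

spec = spectrogram(sweep)
patches = extract_patches(spec)
print(spec.shape, patches.shape)
```

In the paper's pipeline, patches like these would be fed to a deep network that scores each time-frequency bin for whistle presence, producing the confidence maps that the contour extractor then traces.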
Citation
Li, P, Liu, X, Palmer, K, Fleishman, E, Gillespie, DM, Nosal, E-M, Shiu, Y, Klinck, H, Cholewiak, D, Helble, T & Roch, M 2020, 'Learning deep models from synthetic data for extracting dolphin whistle contours', in 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings, 9206992, Proceedings of the International Joint Conference on Neural Networks, IEEE Computer Society, IEEE World Congress on Computational Intelligence (IEEE WCCI) - 2020 International Joint Conference on Neural Networks (IJCNN 2020), Glasgow, United Kingdom, 19/07/20. https://doi.org/10.1109/IJCNN48605.2020.9206992
Publication
2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
Type
Conference item
Rights
Copyright © 2020 IEEE. This work has been made available online in accordance with publisher policies or with permission. Permission for further reuse of this content should be sought from the publisher or the rights holder. This is the author-created accepted manuscript following peer review and may differ slightly from the final published version. The final published version of this work is available at https://ieeexplore.ieee.org
Collections
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.