Item metadata

dc.contributor.author: Wang, Yuheng
dc.contributor.author: Ye, Juan
dc.contributor.author: Borchers, David L.
dc.date.accessioned: 2022-05-09T12:30:12Z
dc.date.available: 2022-05-09T12:30:12Z
dc.date.issued: 2022-07-01
dc.identifier: 279183034
dc.identifier: abdef773-387e-45c5-943e-0a4646dd841a
dc.identifier: 000791029000001
dc.identifier: 85129426944
dc.identifier.citation: Wang, Y, Ye, J & Borchers, D L 2022, 'Automated call detection for acoustic surveys with structured calls of varying length', Methods in Ecology and Evolution, vol. 13, no. 7, pp. 1552-1567. https://doi.org/10.1111/2041-210X.13873 [en]
dc.identifier.issn: 2041-210X
dc.identifier.other: RIS: urn:35AB0257E2D3DD8B32F2A1C087DC7460
dc.identifier.other: ORCID: /0000-0002-2838-6836/work/112711243
dc.identifier.other: ORCID: /0000-0002-3944-0754/work/112711658
dc.identifier.uri: https://hdl.handle.net/10023/25320
dc.description: Funding: Y.W. is partly funded by the China Scholarship Council (CSC) for Ph.D. study at the University of St Andrews, UK. [en]
dc.description.abstract: 1. When recorders are used to survey acoustically conspicuous species, identifying calls of the target species in recordings is essential for estimating density and abundance. We investigate how well deep neural networks identify vocalisations consisting of phrases of varying lengths, each containing a variable number of syllables. We use recordings of Hainan gibbon (Nomascus hainanus) vocalisations to develop and test the methods. 2. We propose two methods for exploiting the two-level structure of such data. The first combines convolutional neural network (CNN) models with a hidden Markov model (HMM) and the second uses a convolutional recurrent neural network (CRNN). Both models learn acoustic features of syllables via a CNN and temporal correlations of syllables into phrases via either an HMM or a recurrent network. We compare their performance to that of the commonly used CNNs LeNet and VGGNet and of a support vector machine (SVM). We also propose a dynamic programming method to evaluate how well phrases are predicted. This is useful for evaluating performance when vocalisations are labelled by phrases, not syllables. 3. Our methods perform substantially better than the commonly used methods when applied to the gibbon acoustic recordings. The CRNN has an F-score of 90% on phrase prediction, which is 18% higher than the best of the SVM, LeNet and VGGNet methods. HMM post-processing raised the F-score of these last three methods to as much as 87%. The number of phrases is overestimated by the CNNs and the SVM, leading to error rates between 49% and 54%. With an HMM, these error rates can be reduced to as low as 0.4%. Similarly, the error rate of the CRNN's prediction is no more than 0.5%. 4. CRNNs are better at identifying phrases of varying lengths composed of a varying number of syllables than simpler CNN or SVM models. We find a CRNN model to be best at this task, with a CNN combined with an HMM performing almost as well. We recommend that these kinds of models are used for species whose vocalisations are structured into phrases of varying lengths.
dc.format.extent: 16
dc.format.extent: 6141261
dc.language.iso: eng
dc.relation.ispartof: Methods in Ecology and Evolution [en]
dc.subject: Acoustic survey [en]
dc.subject: Automated call detection [en]
dc.subject: Convolutional recurrent neural network [en]
dc.subject: Gibbon calls [en]
dc.subject: Hidden Markov model [en]
dc.subject: Machine learning [en]
dc.subject: QA Mathematics [en]
dc.subject: QH301 Biology [en]
dc.subject: DAS [en]
dc.subject: MCC [en]
dc.subject.lcc: QA [en]
dc.subject.lcc: QH301 [en]
dc.title: Automated call detection for acoustic surveys with structured calls of varying length [en]
dc.type: Journal article [en]
dc.contributor.institution: University of St Andrews. Statistics [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.contributor.institution: University of St Andrews. School of Mathematics and Statistics [en]
dc.contributor.institution: University of St Andrews. Scottish Oceans Institute [en]
dc.contributor.institution: University of St Andrews. Centre for Research into Ecological & Environmental Modelling [en]
dc.contributor.institution: University of St Andrews. Marine Alliance for Science & Technology Scotland [en]
dc.identifier.doi: https://doi.org/10.1111/2041-210X.13873
dc.description.status: Peer reviewed [en]

