Statistics Theses

https://hdl.handle.net/10023/101
Tue, 23 Apr 2024 15:59:36 GMT 2024-04-23T15:59:36Z

Expanding the use of spatial models in statistical ecology https://hdl.handle.net/10023/29306
This thesis is focused on expanding the use of spatial modelling approaches for applications in ecology. Spatial ecology seeks to understand the processes that give rise to spatial patterns in ecological data. In addition to developing a purely scientific understanding, insights into these processes are essential for the effective monitoring and conservation management of ecological systems. However, for many ecological problems, the detectability of animals is imperfect, requiring the use of complex observation models that can account for this. In this thesis we focus on two such models: distance sampling and spatial capture-recapture (SCR). For both these models we incorporate spatially structured random effects to provide a non-parametric method for describing spatial variation in species’ abundance, and to address the problem of spatial auto-correlation. These complex models require the use of computationally efficient random effect structures and inference methods. In particular, we use a sparse stochastic partial differential equation (SPDE) approach as well as low-rank penalised smoothing splines. We also draw links between these two approaches in order to illuminate the technically challenging results underpinning the SPDE approach. For inference in distance sampling models, we use a novel approach to achieve a one-stage model fit based on iterated model fitting using approximate Bayesian methods. For inference in SCR models, we use Laplace approximate maximum likelihood methods. We present models that have the necessary complexity to jointly model complex ecological and observation processes, and we provide efficient methods to fit these models in practice.
We conclude by discussing related avenues for future research that are motivated by applied problems in the field of spatial ecology.
Tue, 14 Jun 2022 00:00:00 GMT https://hdl.handle.net/10023/29306 2022-06-14T00:00:00Z Seaton, Andrew Ernest

Counting processes for spatial capture-recapture https://hdl.handle.net/10023/28389
Estimates of animal density, the number of individuals per unit area, are critically important for understanding ecological processes affecting wildlife management. Increasingly, modern technologies such as camera traps and acoustic recording units are being used to monitor wildlife populations. These are relatively inexpensive and can reliably record animal detections over a long period of time. When animals have a chance to be recorded on multiple detectors, spatial capture-recapture (SCR) can be used to estimate animal density. However, these models generally require that animal identity is known, which is often not the case. For camera trap studies, the animals may not have unique pelage to distinguish them reliably, or image quality may be poor. In an acoustic recording, there may not be any individually identifying features in the individual’s call to distinguish it from others of the same species. The motivating problem of this thesis is how to deal with uncertain animal identity in SCR. We review current methods in SCR for both known and unknown identity, ways to make Bayesian inference for SCR models using Nimble in R, and how SCR can be written as a Dirichlet process mixture model when identities are unknown. This leads us to reformulate the conventional SCR model as a marked Poisson process, such that the counting process for detections through time no longer depends on identity, but the observed mark distributions do. When identity is latent, the observed marks are distributed over a mixture of N latent animal characteristics (e.g. activity centre, sex), where N is the number of animals at risk of detection.
This becomes a generalization of the unmarked SCR model of Chandler and Royle (2013) and allows us to easily add further observed covariates to help estimate animal identity. We show through simulation how well the method works and apply it to a camera trap survey of fisher (Pekania pennanti) and an acoustic survey of the Cape Peninsula moss frog (Arthroleptella lightfooti), each with different types of information used to inform identities. A fundamental assumption of SCR is that detections of an individual occur independently. This implies that a detection at one detector has no impact on the probability that an animal is seen at any other detector immediately afterwards. Assumptions of independence allow us to model SCR detections as a spatiotemporal point process, often a Poisson process. As a result, the time of detection becomes uninformative about animal identity. We offer a solution to this by thinking about SCR in terms of a new detection function that depends on a realistic animal movement model. To do this, we relax the independence assumption by building a spatiotemporal dependent point process with a new detection function that depends on where and when the animal was last observed. As a result, we can explicitly model the existing correlation in the detections as well as provide a new method for spatiotemporal clustering of latent identity SCR problems.
Tue, 28 Nov 2023 00:00:00 GMT https://hdl.handle.net/10023/28389 2023-11-28T00:00:00Z van Dam-Bates, Paul

Combining spatially adaptive statistical modelling methods and computer vision approaches for the automatic detection of animals from high resolution images https://hdl.handle.net/10023/27871
This study aimed to automate the detection of animals in aerial images and to improve detection by combining computer vision techniques with statistical modelling of the surveyed area. Knowing the number of animals in an area is important for wildlife management, and existing methods require either trained observers in larger planes or photography from smaller aircraft followed by manual counting. Automating detection would allow large areas to be surveyed more frequently with lower human input. This thesis shows that animals can be automatically detected from aerial images using the YOLO object detection network. Studies ruled out classical computer vision techniques due to excess false positives. The YOLO method detects 61% of the animals compared to 79% detection by humans; however, it also produces 11.6 false positives per image (FPPI). Modelling the distribution of multiple species required a multinomial model.
CReSS-based GAMs were extended to the multinomial case, and simulation studies were carried out to compare CReSS to other multinomial approaches, showing CReSS was preferred for high noise, low sample sizes or animal densities close to exclusion zones. Confidence intervals from the statistical model were concatenated with the YOLO model. This reduced the FPPI from 11.6 to 6.6, showing that combining prior knowledge from a statistical model improves performance of animal detection. Manual checking time per image was reduced by 97%, from 5 minutes to 11 seconds. Using the automated detections to guide manual checks revealed additional animals, increasing the recall to 0.81, greater than the recall of 0.79 estimated for human performance. The methods described have reduced the estimated manual checking time for the 40,000 photographs covering the 7,500 km² survey area in Namibia from 9 months to 3 weeks, meaning this method could be used frequently to give timely and reliable results.
Tue, 29 Nov 2022 00:00:00 GMT https://hdl.handle.net/10023/27871 2022-11-29T00:00:00Z Fell, Christina Mary

Title redacted https://hdl.handle.net/10023/27008
Abstract redacted
Tue, 28 Nov 2023 00:00:00 GMT https://hdl.handle.net/10023/27008 2023-11-28T00:00:00Z Franchini, Filippo Luciano Charly

Advances on priors for the Dirichlet process mixture model with Gaussian kernels https://hdl.handle.net/10023/26299
Abstract redacted
Tue, 29 Nov 2022 00:00:00 GMT https://hdl.handle.net/10023/26299 2022-11-29T00:00:00Z Jing, Wei

Improved methods for estimating spatial and temporal trends from point transect survey data https://hdl.handle.net/10023/23506
This thesis is about methods for improving estimates of abundance and trends from distance sampling surveys. My particular focus is on point transect surveys of endemic Hawaiian songbirds.
When critical assumptions are met, design-based distance sampling provides unbiased abundance estimates; however, for rare endangered Hawaiian forest birds, the estimates can have high variance, hindering their use in assessing conservation efforts. One approach to improve precision is to use spatial models instead of design-based methods. I fitted density surface models (DSMs), accounting for spatial and temporal correlation, using a two-stage approach that separated modelling of detection probability from modelling spatio-temporal patterns in density using generalized additive models (GAMs). Precision was improved and maps depicted spatio-temporal patterns in densities. I compared the model that I fitted for a single year to two alternative approaches: a spatial point-process model based on a log-Gaussian Cox process with a Matérn covariance (LGCP) and a soap-film smoother. The GAM-based DSMs and LGCP approaches produced better precision than the design-based method but varied in how they captured patterns in the data. I also implemented a GAM that used a smoother which took into account the study area boundary (a soap-film smoother) and found this produced better extrapolations into parts of the study area not surveyed. Another approach, which adds biological realism to the modelling of population change over time, is to link design-based abundance estimates to an underlying population dynamics model, using a state-space modelling framework. This constrains population changes to be biologically realistic, as I demonstrate with a set of models that make different assumptions about the demographic parameters driving population changes. Overall, I demonstrate that spatial, spatio-temporal and population dynamics modelling procedures reduced the variance in density estimates in single- and multi-year abundance data compared to design-based methods, thus better informing management and conservation decisions.
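As a hedged illustration of the design-based point transect estimator that the abstract above treats as its baseline, the sketch below assumes a half-normal detection function g(r) = exp(-r^2 / (2 sigma^2)) and a truncation distance w. The half-normal form, the function names and the parameter names are assumptions chosen for illustration; they are not taken from the thesis, and sigma is treated as known rather than estimated.

```python
import math


def half_normal_detection(r, sigma):
    """Probability of detecting an animal at radial distance r from the point."""
    return math.exp(-r * r / (2.0 * sigma * sigma))


def effective_detection_area(sigma, w):
    """Effective area per point: the integral of 2*pi*r*g(r) for r from 0 to w.

    For the half-normal g this has the closed form
    2*pi*sigma^2 * (1 - exp(-w^2 / (2*sigma^2))); expm1 keeps it accurate
    when w is small relative to sigma.
    """
    return 2.0 * math.pi * sigma ** 2 * -math.expm1(-w * w / (2.0 * sigma * sigma))


def point_transect_density(n_detections, n_points, sigma, w):
    """Density estimate: detections divided by the total effective area surveyed."""
    return n_detections / (n_points * effective_detection_area(sigma, w))
```

For example, `point_transect_density(120, 40, sigma=30.0, w=100.0)` gives animals per squared unit of whatever distance unit `sigma` and `w` share. In practice sigma would itself be estimated from the observed distances, and that uncertainty contributes to the variance the abstract describes.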
Tue, 29 Jun 2021 00:00:00 GMT https://hdl.handle.net/10023/23506 2021-06-29T00:00:00Z Camp, Richard Joseph

Movement ecology and conservation : the case of African vultures https://hdl.handle.net/10023/20210
The movements of critically endangered vultures, equipped with satellite-based tracking devices in Namibia, were inspected using Generalized Additive Models. Models incorporated spatially adaptive (1D and 2D) smooths via the Spatially Adaptive Local Smoothing Algorithm (SALSA) and Complex REgion Spatial Smoother (CReSS) method. The correlated nature of geo-location data was addressed via robust standard errors. The results of this thorough and integrative study of movement ecology have an unprecedented level of detail, far exceeding what is available in the literature. Namely, vultures were seen throughout Namibia and its five neighbouring countries, with three individuals visiting locations farther than 1,000 km from where they were initially seen. Large variability was found both within and between birds. Differences were perceived in four daily movement properties, even though temporal differences were only captured for daily distance travelled (monthly) and daily maximum displacement (seasonally). There was noticeable variation in the size of the areas each bird used from month to month, often showing very little spatial overlap. Home ranges varied greatly; one bird expanded its monthly home range to as much as nineteen times its smallest size. In contrast, core areas sometimes remained constant. Home ranges were three to five times larger than the respective core areas, clearly indicating a non-uniform use of the environment. The extensive study area (2.3 million sq. km) was characterised using habitat features, climate conditions and indices of human presence.
Vegetation index, minimum distance to river and minimum distance to road were consistently important in explaining the probability of bird presence. Nonetheless, each vulture used its environment in its own way. These novel findings support trans-frontier conservation measures, provide crucial support for revising the geographic extent of existing conservation actions, and form the basis for predicting the risk of exposure of vultures to lethal threats or for assessing changes under climate change scenarios.
Tue, 28 Jul 2020 00:00:00 GMT https://hdl.handle.net/10023/20210 2020-07-28T00:00:00Z Estevinho Santos Faustino, Cláudia

Estimating abundance of African great apes https://hdl.handle.net/10023/18859
All species and subspecies of African great apes are listed by the International Union for the Conservation of Nature as endangered or critically endangered, and populations continue to decline. As human populations and industry expand into great ape habitat, efficient, reliable estimators of great ape abundance are needed to inform conservation status and land-use planning, to assess adverse and beneficial effects of human activities, and to help funding agencies and donors make informed and efficient contributions. Fortunately, technological advances have improved our ability to sample great apes remotely, and new statistical methods for estimating abundance are constantly in development. Following a brief general introduction, this thesis reviews established and emerging approaches to estimating great ape abundance, then describes new methods for estimating animal density from photographic data by distance sampling with camera traps, and for selecting among models of the distance sampling detection function when distance data are overdispersed.
Subsequent chapters quantify the effect of violating the assumption of demographic closure when estimating abundance using spatially explicit capture–recapture models for closed populations, and describe the design and implementation of a camera trapping survey of chimpanzees at the landscape scale in Kibale National Park, Uganda. The new methods developed have generated considerable interest, and allow abundances of multiple species, including great apes, to be estimated from data collected during a single photographic survey. Spatially explicit capture–recapture analyses of photographic data from small study areas yielded accurate and precise estimates of chimpanzee abundance, and this combination of methods could be used to enumerate great apes over large areas and in dense forests more reliably and efficiently than previously possible.
Tue, 03 Dec 2019 00:00:00 GMT https://hdl.handle.net/10023/18859 2019-12-03T00:00:00Z Howe, Eric J.

Methods in spatially explicit capture-recapture https://hdl.handle.net/10023/18233
Capture-recapture (CR) methods are a ubiquitous means of estimating animal abundance from wildlife surveys. They rely on the detection and subsequent redetection of individuals over a number of sampling occasions. It is usually necessary for individuals to be recognised upon redetection. Spatially explicit capture-recapture (SECR) methods generalise those of CR by accounting for the locations at which each detection occurs. This allows spatial heterogeneity in detection probabilities to be accounted for: individuals with home-range centres near the detector array are more likely to be detected. They also permit estimation of animal density in addition to abundance.
One particular advantage of SECR methods is that they can be used when individuals are detected via the cues they produce; examples include birdsongs detected by microphones and whale surfacings detected by human observers. In such situations each cue may be detected by multiple detectors at different fixed locations. Redetections are then spatial (rather than temporal) in nature, and density can be estimated from a single survey occasion. Existing methods, however, cannot generally be appropriately applied to the resulting cue-detection data without making assumptions that rarely hold. Additionally, they usually estimate cue density rather than animal density, which does not usually have the same biological importance. This thesis extends SECR methodology primarily for the appropriate estimation of animal density from cue-based SECR surveys. These extensions include (i) incorporation of auxiliary survey data into SECR estimators, (ii) appropriate point and variance estimators of animal density for a range of scenarios, and (iii) methods to account for both heterogeneity in detectability and cues that are directional in nature. Moreover, a general class of methods is presented for the estimation of demographic parameters from wildlife surveys in which individuals cannot be recognised. These can variously be applied to CR and, potentially, SECR.
Fri, 24 Jun 2016 00:00:00 GMT https://hdl.handle.net/10023/18233 2016-06-24T00:00:00Z Stevenson, Ben C.

The statistical development of integrated multi-state stopover models https://hdl.handle.net/10023/18206
This thesis focusses on the analysis of ecological capture-recapture data and the estimation of population parameters of interest.
Many of the common models applied to such data, for example the Cormack-Jolly-Seber model, condition on the first capture of an individual or on the number of individuals encountered. A consequence of this conditioning is that it is not possible to estimate the total abundance directly. Stopover models remove the conditioning on first capture and instead explicitly model the arrival of individuals into the population. This permits the estimation of abundance through the likelihood along with other parameters such as capture and retention probabilities. We develop an integrated stopover model capable of analysing multiple years of data within a single likelihood and allowing parameters to be shared across years. We consider special cases of this model, writing the likelihood using sufficient statistics as well as utilising the hidden Markov model framework to allow for efficient evaluation of the likelihood. We further extend this model to an integrated multistate-stopover model which incorporates any available discrete state information. The new stopover models are applied to real ecological data sets. A cohort-dependent single-year stopover model is applied to data on grey seals, Halichoerus grypus, where the cohorts are determined by birth year. The integrated stopover model and integrated multi-state stopover model are used to analyse a data set on great crested newts, Triturus cristatus. A subset of this data is used to explore closed population models that permit capture probabilities to depend on discrete state information. The final section of this thesis considers a capture-recapture-recovery data set relating to Soay sheep, a breed of domestic sheep Ovis aries. These data contain individual time-varying continuous covariates and raise the issue of dealing with missing data. 
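As an illustration of how a hidden Markov model formulation lets a stopover likelihood be evaluated efficiently, the following is a minimal sketch and not the model developed in the thesis. It assumes a single year, three hidden states (not yet arrived, present, departed), and constant retention and capture probabilities; the function name `stopover_history_prob` and the parameter names `beta`, `phi` and `p` are hypothetical choices made for this example.

```python
from itertools import product


def stopover_history_prob(history, beta, phi, p):
    """Probability of one capture history (tuple of 0/1) via the forward algorithm.

    Hidden states: 0 = not yet arrived, 1 = present, 2 = departed.
    beta[t] = probability of arriving at occasion t (assumed to sum to 1),
    phi = retention probability, p = capture probability (both assumed constant).
    """
    def emit(x):
        # Only animals that are present can be captured.
        return [1.0 - x, p if x else 1.0 - p, 1.0 - x]

    # Forward vector at the first occasion.
    alpha = [1.0 - beta[0], beta[0], 0.0]
    alpha = [a * e for a, e in zip(alpha, emit(history[0]))]
    arrived = beta[0]
    for t in range(1, len(history)):
        # Conditional probability of arriving now, given no arrival so far.
        remaining = 1.0 - arrived
        enter = beta[t] / remaining if remaining > 1e-12 else 0.0
        alpha = [
            alpha[0] * (1.0 - enter),
            alpha[0] * enter + alpha[1] * phi,
            alpha[1] * (1.0 - phi) + alpha[2],
        ]
        alpha = [a * e for a, e in zip(alpha, emit(history[t]))]
        arrived += beta[t]
    return sum(alpha)


beta, phi, p = [0.5, 0.3, 0.2], 0.8, 0.4
prob_001 = stopover_history_prob((0, 0, 1), beta, phi, p)
# Probabilities over all possible histories, including the never-captured one.
total = sum(stopover_history_prob(h, beta, phi, p) for h in product([0, 1], repeat=3))
prob_never = stopover_history_prob((0, 0, 0), beta, phi, p)
```

Because arrival is modelled explicitly, the all-zero history has a well-defined probability (`prob_never` here), which is what allows abundance to enter the likelihood rather than being conditioned away.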
Fri, 24 Jun 2016 00:00:00 GMT https://hdl.handle.net/10023/18206 2016-06-24T00:00:00Z Worthington, Hannah
Randomness as a computational strategy : on matrix and tensor decompositions https://hdl.handle.net/10023/16693
Matrix and tensor decompositions are fundamental tools for finding structure and processing data. In particular, the efficient computation of low-rank matrix approximations is a ubiquitous problem in the area of machine learning and elsewhere. However, massive data arrays pose a computational challenge for these techniques, placing significant constraints on both memory and processing power. Recently, the fascinating and powerful concept of randomness has been introduced as a strategy to ease the computational load of deterministic matrix and data algorithms. The basic idea of these algorithms is to employ a degree of randomness as part of the logic in order to derive from a high-dimensional input matrix a smaller matrix, which captures the essential information of the original data matrix. The smaller matrix is then used to efficiently compute a near-optimal low-rank approximation. Randomized algorithms have been shown to be robust, highly reliable, and computationally efficient, yet simple to implement. In particular, the development of the randomized singular value decomposition can be seen as a milestone in the era of ‘big data’. Building on the great success of this probabilistic strategy to compute low-rank matrix decompositions, this thesis introduces a set of new randomized algorithms. Specifically, we present a randomized algorithm to compute the dynamic mode decomposition, which is a modern dimension reduction technique designed to extract dynamic information from dynamical systems. Then, we advocate the randomized dynamic mode decomposition for background modeling of surveillance video feeds.
Further, we show that randomized algorithms are embarrassingly parallel by design and that graphics processing units (GPUs) can be utilized to substantially accelerate the computations. Finally, the concept of randomized algorithms is generalized for tensors in order to compute the canonical CANDECOMP/PARAFAC (CP) decomposition.
Mon, 20 Nov 2017 00:00:00 GMT https://hdl.handle.net/10023/16693 2017-11-20T00:00:00Z Erichson, N. Benjamin
Incorporating animal movement with distance sampling and spatial capture-recapture https://hdl.handle.net/10023/16467
Distance sampling and spatial capture-recapture are statistical methods to estimate the number of animals in a wild population based on encounters between these animals and scientific detectors. Both methods estimate the probability an animal is detected during a survey, but do not explicitly model animal movement. The primary challenge is that animal movement in these surveys is unobserved; one must average over all possible paths each animal could have travelled during the survey. In this thesis, a general statistical model, with distance sampling and spatial capture-recapture as special cases, is presented that explicitly incorporates animal movement. An efficient algorithm to integrate over all possible movement paths, based on quadrature and hidden Markov modelling, is given to overcome the computational obstacles. For distance sampling, simulation studies and case studies show that incorporating animal movement can reduce the bias in estimated abundance found in conventional models and expand application of distance sampling to surveys that violate the assumption of no animal movement. For spatial capture-recapture, continuous-time encounter records are used to make detailed inference on where animals spend their time during the survey.
In surveys conducted on discrete occasions, maximum likelihood models that allow for mobile activity centres are presented to account for transience, dispersal, and heterogeneous space use. These methods provide an alternative when animal movement causes bias in standard methods and the opportunity to gain richer inference on how animals move, where they spend their time, and how they interact.
Thu, 06 Dec 2018 00:00:00 GMT https://hdl.handle.net/10023/16467 2018-12-06T00:00:00Z Glennie, Richard
Gaussian Markov random fields and structural additive regression : applications in freshwater fisheries management https://hdl.handle.net/10023/15909
In this thesis structural additive regression (STAR) models are constructed for two applications in freshwater fisheries management: (1) large-scale modelling of fish abundance using electrofishing removal data and (2) assessing the effect of tree felling on stream temperature. Both approaches take advantage of the central role Gaussian Markov random fields (GMRFs) play in the construction of structured additive regression components. The R package mgcv can fit, in principle, any STAR model. In practice, however, several extensions are required to allow a non-specialised user to access this functionality, and a large part of this thesis is the development of software to allow a general user ready access to a wide range of GMRF models within the familiar mgcv framework. All models are fitted making use of this extension where possible (and practical). The thesis is divided into three main chapters. Chapter 2 serves to provide background and insight into penalised regression and STAR models and the role that GMRFs play in smoothing. Also presented are the extensions required to fit GMRF models in mgcv. Chapter 3 presents a two-stage model for fish density using electrofishing removal data. The first stage of this model estimates fish capture probability and is not a STAR model, but can utilise aspects of GMRFs through low rank approximations; software to make this available is developed and presented. The second stage is a Poisson STAR model and can therefore be fitted in the extended mgcv framework. Finally, Chapter 4 presents a model for the impact of a clear felling event on stream temperature.
This model utilises cyclic smoothers applied to the functional principal components of daily temperature curves. This allows for a detailed assessment of the effects of felling on stream temperature that is not possible when modelling daily summaries alone.
Fri, 23 Jun 2017 00:00:00 GMT https://hdl.handle.net/10023/15909 2017-06-23T00:00:00Z Millar, Colin Pearson
Incorporating animal movement into circular plot and point transect surveys of wildlife abundance https://hdl.handle.net/10023/15612
Estimating wildlife abundance is fundamental for its effective management and conservation. A range of methods exists: total counts, plot sampling, distance sampling and capture-recapture based approaches. These methods rest on assumptions whose failure can lead to substantial bias. Current research in the field is focused not on establishing new methods but on extending existing methods to deal with violations of their assumptions. This thesis focuses on incorporating animal movement into circular plot sampling (CPS) and point transect sampling (PTS), where a key assumption is that animals do not move while within detection range, i.e., the survey is a snapshot in time. While targeting this goal, we found some unexpected bias in PTS when animals were still and model selection was used to choose among different candidate models for the detection function (the model describing how detectability changes with observer-animal distance). Using a simulation study, we found that, although PTS estimators are asymptotically unbiased, for the recommended sample sizes the bias depended on the form of the true detection function. We then extended the simulation study to include animal movement, and found this led to further bias in CPS and PTS. We present novel methods that incorporate animal movement with constant speed into estimates of abundance. First, in CPS, we present an analytic expression to correct for the bias given linear movement.
When movement is defined by a diffusion process, a simulation-based approach, modelling the probability of animal presence in the circular plot, results in less than 3% bias in the abundance estimates. For PTS we introduce an estimator composed of two linked submodels: the movement (animals moving linearly) and the detection model. The performance of the proposed method is assessed via simulation. Despite being biased, the new estimator yields improved results compared to ignoring animal movement using conventional PTS.
Mon, 01 Jan 2018 00:00:00 GMT https://hdl.handle.net/10023/15612 2018-01-01T00:00:00Z Prieto González, Rocío
A continuous-time formulation for spatial capture-recapture models https://hdl.handle.net/10023/15596
Spatial capture-recapture (SCR) models are relatively new but have become the standard approach used to estimate animal density from capture-recapture data. It has in the past been impractical to obtain sufficient data for analysis on species that are very difficult to capture, such as elusive carnivores that occur at low density and range very widely. Advances in technology have led to alternative ways to virtually “capture” individuals without having to physically hold them. Some examples of these new non-invasive sampling methods include scat or hair collection for genetic analysis, acoustic detection and camera trapping. In traditional capture-recapture (CR) and SCR studies, populations are sampled at discrete points in time, leading to clear and well-defined occasions, whereas the new detector types mentioned above sample populations continuously in time. Researchers with data collected continuously currently need to define an appropriate occasion and aggregate their data accordingly, thereby imposing an artificial construct on their data for analytical convenience.
This research develops a continuous-time (CT) framework for SCR models by treating detections as a temporal non-homogeneous Poisson process (NHPP) and replacing the usual SCR detection function with a continuous detection hazard function. The general CT likelihood is first developed for data from passive (also called “proximity”) detectors like camera traps that do not physically hold individuals. The likelihood is then modified to produce a likelihood for single-catch traps (traps that are taken out of action by capturing an animal) that has proven difficult to develop with a discrete-occasion approach. The lack of a suitable single-catch trap likelihood has led to researchers using a discrete-time (DT) multi-catch trap estimator to analyse single-catch trap data. Previous work has found the DT multi-catch estimator to be robust despite the fact that it is known to be based on the wrong model for single-catch traps (it assumes that the traps continue operating after catching an individual). Simulation studies in this work confirm that the multi-catch estimator is robust for estimating density when density is constant or does not vary much in space. However, there are scenarios with non-constant density surfaces when the multi-catch estimator is not able to correctly identify regions of high density. Furthermore, the multi-catch estimator is known to be negatively biased for the intercept parameter of SCR detection functions and there may be interest in the detection function in its own right. On the other hand, the CT single-catch estimator is unbiased or nearly so for all parameters of interest, including those in the detection function and those in the model for density. When one assumes that the detection hazard is constant through time there is no impact of ignoring capture times and using only the detection frequencies. This is of course a special case and in reality detection hazards will tend to vary in time.
However, when one assumes that the effects of time and distance in the time-varying hazard are independent, then similarly there is no information in the capture times about density and detection function parameters. The work here uses a detection hazard that assumes independence between time and distance. Different forms for the detection hazard are explored, with the most flexible choice being that of a cyclic regression spline. Extensive simulation studies suggest, as expected, that a DT proximity estimator is unbiased for the estimation of density even when the detection hazard varies through time. However, there are indirect benefits of incorporating capture times because doing so will lead to a better fitting detection component of the model, and this can prevent unexplained variation being erroneously attributed to the wrong covariate. The analysis of two real datasets supports this assertion because the models with the best fitting detection hazard have different effects to the other models. In addition, modelling the detection process in continuous time leads to a more parsimonious approach compared to using DT models when the detection hazard varies in time. The underlying process is occurring in continuous time and so using CT models allows inferences to be drawn about the underlying process; for example, the time-varying detection hazard can be viewed as a proxy for animal activity. The CT formulation is able to model the underlying detection hazard accurately and provides a formal modelling framework to explore different hypotheses about activity patterns. There is scope to integrate the CT models developed here with models for space usage and landscape connectivity to explore these processes on a finer temporal scale. SCR models are experiencing a rapid growth in both application and method development.
The data generating process occurs in CT and hence a CT modelling approach is a natural fit and opens up several opportunities that are not possible with a DT formulation. The work here makes a contribution by developing and exploring the utility of such a CT SCR formulation.
Sun, 01 Jan 2017 00:00:00 GMT https://hdl.handle.net/10023/15596 2017-01-01T00:00:00Z Distiller, Greg
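The statement in the preceding abstract that capture times carry no extra information when the detection hazard is constant can be checked with a small numerical sketch. The code below is illustrative only and is not taken from the thesis: it uses the standard log-likelihood of a Poisson process observed on [0, T] for a single animal-detector pair, and the function names, the hazard value 0.3 and the survey length 10 are assumptions made for this example.

```python
import math


def nhpp_loglik(times, hazard, cumulative_hazard, T):
    """Log-likelihood of detection times on [0, T] under a Poisson process.

    hazard(t) is the detection hazard at time t and cumulative_hazard(T)
    is its integral from 0 to T.
    """
    return sum(math.log(hazard(t)) for t in times) - cumulative_hazard(T)


def constant_hazard_loglik(times, lam, T):
    # With a constant hazard lam the integral over [0, T] is lam * T.
    return nhpp_loglik(times, lambda t: lam, lambda t: lam * t, T)


# Two records with the same number of detections but very different timings.
early = [0.5, 1.0, 1.5]
late = [7.0, 8.5, 9.9]
ll_early = constant_hazard_loglik(early, 0.3, 10.0)
ll_late = constant_hazard_loglik(late, 0.3, 10.0)
# Closed form for a constant hazard: n * log(lam) - lam * T.
ll_counts_only = len(early) * math.log(0.3) - 0.3 * 10.0
```

Both records give the same log-likelihood, which depends on the data only through the number of detections. The sketch covers only the constant-hazard case; the separable time-and-distance case described in the abstract is not reproduced here.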
Modelling the spatial dynamics of non-state terrorism : world study, 2002-2013 https://hdl.handle.net/10023/12067
To this day, terrorism perpetrated by non-state actors persists as a worldwide threat, as exemplified by the recent lethal attacks in Paris, London, Brussels, and the ongoing massacres perpetrated by the Islamic State in Iraq, Syria and neighbouring countries. In response, states deploy various counterterrorism policies, the costs of which could be reduced through more efficient preventive measures. The literature has not applied statistical models able to account for complex spatio-temporal dependencies, despite their potential for explaining and preventing non-state terrorism at the sub-national level. In an effort to address this shortcoming, this thesis employs Bayesian hierarchical models, where the spatial random field is represented by a stochastic partial differential equation. The results show that lethal terrorist attacks perpetrated by non-state actors tend to be concentrated in areas located within failed states from which they may diffuse locally, towards neighbouring areas. At the sub-national level, the propensity of attacks to be lethal and the frequency of lethal attacks appear to be driven by antagonistic mechanisms. Attacks are more likely to be lethal far away from large cities, at higher altitudes, in less economically developed areas, and in locations with higher ethnic diversity. In contrast, the frequency of lethal attacks tends to be higher in more economically developed areas, close to large cities, and within democratic countries.
Thu, 07 Dec 2017 00:00:00 GMT https://hdl.handle.net/10023/12067 2017-12-07T00:00:00Z Python, André
Modelling complex dependencies inherent in spatial and spatio-temporal point pattern data https://hdl.handle.net/10023/12009
Point processes are mechanisms that beget point patterns. Realisations of point processes are observed in many contexts, for example, locations of stars in the sky, or locations of trees in a forest. Inferring the mechanisms that drive point processes relies on the development of models that appropriately account for the dependencies inherent in the data. Fitting models that adequately capture the complex dependency structures in either space, time, or both is often problematic. This is commonly, though not exclusively, due to the intractability of the likelihood function or the computational burden of the required numerical operations.
This thesis primarily focuses on developing point process models with some hierarchical structure, and specifically where this is a latent structure that may be considered as one of the following: (i) some unobserved construct assumed to be generating the observed structure, or (ii) some stochastic process describing the structure of the point pattern. Model fitting procedures utilised in this thesis include either (i) approximate-likelihood techniques to circumvent intractable likelihoods, (ii) stochastic partial differential equations to model continuous spatial latent structures, or (iii) improving computational speed in numerical approximations by exploiting automatic differentiation. Moreover, this thesis extends classic point process models by considering multivariate dependencies. This is achieved through considering a general class of joint point process models, which utilise shared stochastic structures. These structures account for the dependencies inherent in multivariate point process data. These models are applied to data originating from various scientific fields; in particular, applications are considered in ecology, medicine, and geology. In addition, point process models that account for the second order behaviour of these assumed stochastic structures are also considered. Fri, 23 Jun 2017 00:00:00 GMT https://hdl.handle.net/10023/12009 2017-06-23T00:00:00Z Jones-Todd, Charlotte M.
Parameter redundancy in log-linear models https://hdl.handle.net/10023/11739 Log-linear models are widely used to analyse categorical variables arranged in a contingency table. Sampling zero entries in the table can cause the problem of large standard errors for some model parameter estimates. This thesis focuses on the reason for this problem and suggests a solution by utilising the parameter redundancy approach.
This approach detects whether a model is non-identifiable and parameter redundant, and specifies a smaller set of parameters or combinations of them that are all estimable. The parameter redundancy method is adapted here for Poisson log-linear models which are parameter redundant because of the number and pattern of the zero observations in the contingency table. Furthermore, it is shown that in some parameter redundant log-linear models, the presence of constraints referred to as esoteric constraints can make more parameters estimable. A theorem establishes which model parameters are not estimable for a saturated Poisson log-linear model fitted to an l × m table with one zero cell count. Three examples of real data in sparse contingency tables are presented to demonstrate the process of identifying the estimable parameters and reducing the model. An alternative approach is the existence of the MLE method, which checks for the existence of the maximum likelihood estimates of the cell means in the log-linear model after observing the zero entries. The method considers the log-linear model as a polyhedral cone and provides conditions to detect the estimability of the cell means. This method is compared here with the parameter redundancy approach and their similarities and differences are explained and illustrated using examples. In parameter redundant models with existent MLE, it is observed that the presence of the esoteric constraints makes all the parameters estimable. Thu, 07 Dec 2017 00:00:00 GMT https://hdl.handle.net/10023/11739 2017-12-07T00:00:00Z Sharifi Far, Serveh
Bayesian multi-species modelling of non-negative continuous ecological data with a discrete mass at zero https://hdl.handle.net/10023/9626 Severe declines in the number of some songbirds over the last 40 years have caused heated debate amongst interested parties. Many factors have been suggested as possible causes for these declines, including an increase in the abundance and distribution of an avian predator, the Eurasian sparrowhawk Accipiter nisus.
To test for evidence for a predator effect on the abundance of its prey, we analyse data on 10 species visiting garden bird feeding stations monitored by the British Trust for Ornithology in relation to the abundance of sparrowhawks. We apply Bayesian hierarchical models to data relating to averaged maximum weekly counts from a garden bird monitoring survey. These data are essentially continuous, bounded below by zero, but for many species show a marked spike at zero that many standard distributions would not be able to account for. We use the Tweedie distributions, which for certain areas of parameter space relate to continuous non-negative distributions with a discrete probability mass at zero, and are hence able to deal with the shape of the empirical distributions of the data. The methods developed in this thesis begin by modelling single prey species independently with an avian predator as a covariate, using MCMC methods to explore parameter and model spaces. This model is then extended to a multiple-prey species model, testing for interactions between species as well as synchrony in their response to environmental factors and unobserved variation. Finally, we use a relatively new methodological framework, namely the SPDE approach in the INLA framework, to fit a multi-species spatio-temporal model to the ecological data. The results from the analyses are consistent with the hypothesis that sparrowhawks are suppressing the numbers of some species of birds visiting garden feeding stations. Only the species most susceptible to sparrowhawk predation seem to be affected. Thu, 01 Jan 2015 00:00:00 GMT https://hdl.handle.net/10023/9626 2015-01-01T00:00:00Z Swallow, Ben
Random coefficient models for complex longitudinal data https://hdl.handle.net/10023/6386 Longitudinal data are common in biological research.
However, real data sets vary considerably in terms of their structure and complexity and present many challenges for statistical modelling. This thesis proposes a series of methods using random coefficients for modelling two broad types of longitudinal response: normally distributed measurements and binary recapture data. Biased inference can occur in linear mixed-effects modelling if subjects are drawn from a number of unknown sub-populations, or if the residual covariance is poorly specified. To address some of the shortcomings of previous approaches in terms of model selection and flexibility, this thesis presents methods for: (i) determining the presence of latent grouping structures using a two-step approach, involving regression splines for modelling functional random effects and mixture modelling of the fitted random effects; and (ii) flexible modelling of the residual covariance matrix using regression splines to specify smooth and potentially non-monotonic variance and correlation functions. Spatially explicit capture-recapture methods for estimating the density of animal populations have shown a rapid increase in popularity over recent years. However, further refinements to existing theory and fitting software are required to apply these methods in many situations. This thesis presents: (i) an analysis of recapture data from an acoustic survey of gibbons using supplementary data in the form of estimated angles to detections, (ii) the development of a multi-occasion likelihood including a model for stochastic availability using a partially observed random effect (interpreted in terms of calling behaviour in the case of gibbons), and (iii) an analysis of recapture data from a population of radio-tagged skates using a conditional likelihood that allows the density of animal activity centres to be modelled as functions of time, space and animal-level covariates.
Fri, 27 Jun 2014 00:00:00 GMT https://hdl.handle.net/10023/6386 2014-06-27T00:00:00Z Kidney, Darren
Novel methods for species distribution mapping including spatial models in complex regions https://hdl.handle.net/10023/4514 Species Distribution Modelling (SDM) plays a key role in a number of biological applications: assessment of temporal trends in distribution, environmental impact assessment and spatial conservation planning. From a statistical perspective, this thesis develops two methods for increasing the accuracy and reliability of maps of density surfaces and provides a solution to the problem of how to collate multiple density maps of the same region, obtained from differing sources. From a biological perspective, these statistical methods are used to analyse two marine mammal datasets to produce accurate maps for use in spatial conservation planning and temporal trend assessment. The first new method, Complex Region Spatial Smoother [CReSS; Scott-Hayward et al., 2013], improves smoothing in areas where the real distance an animal must travel ('as the animal swims') between two points may be greater than the straight line distance between them, a problem that occurs in complex domains with coastline or islands. CReSS uses estimates of the geodesic distance between points, model averaging and local radial smoothing.
Simulation is used to compare its performance with other traditional and recently-developed smoothing techniques: Thin Plate Splines (TPS, Harder and Desmarais [1972]), Geodesic Low rank TPS (GLTPS; Wang and Ranalli [2007]) and the Soap film smoother (SOAP; Wood et al. [2008]). GLTPS cannot be used in areas with islands and SOAP can be very hard to parametrise. CReSS outperforms all of the other methods on a range of simulations, based on their fit to the underlying function as measured by mean squared error, particularly for sparse data sets. Smoothing functions need to be flexible when they are used to model density surfaces that are highly heterogeneous, in order to avoid biases due to under- or over-fitting. This issue was addressed using an adaptation of a Spatially Adaptive Local Smoothing Algorithm (SALSA, Walker et al. [2010]) in combination with the CReSS method (CReSS-SALSA2D). Unlike traditional methods, such as Generalised Additive Modelling, the adaptive knot selection approach used in SALSA2D naturally accommodates local changes in the smoothness of the density surface that is being modelled. At the time of writing, there are no other methods available to deal with this issue in topographically complex regions. Simulation results show that CReSS-SALSA2D performs better than CReSS (based on MSE scores), except at very high noise levels where there is an issue with over-fitting. There is an increasing need for a facility to combine multiple density surface maps of individual species in order to make best use of meta-databases, to maintain existing maps, and to extend their geographical coverage. This thesis develops a framework and methods for combining species distribution maps as new information becomes available. The methods use Bayes Theorem to combine density surfaces, taking account of the levels of precision associated with the different sets of estimates, and kernel smoothing to alleviate artefacts that may be created where pairs of surfaces join.
The methods were used as part of an algorithm (the Dynamic Cetacean Abundance Predictor) designed for BAE Systems to aid in risk mitigation for naval exercises. Two case studies show the capabilities of CReSS and CReSS-SALSA2D when applied to real ecological data. In the first case study, CReSS was used in a Generalised Estimating Equation framework to identify a candidate Marine Protected Area for the Southern Resident Killer Whale population to the south of San Juan Island, off the Pacific coast of the United States. In the second case study, changes in the spatial and temporal distribution of harbour porpoise and minke whale in north-western European waters over a period of 17 years (1994-2010) were modelled. CReSS and CReSS-SALSA2D performed well in a large, topographically complex study area. Based on simulation results, maps produced using these methods are more accurate than if a traditional GAM-based method is used. The resulting maps identified particularly high densities of both harbour porpoise and minke whale in an area off the west coast of Scotland in 2010, that might be a candidate for inclusion into the Scottish network of Nature Conservation Marine Protected Areas. Tue, 05 Nov 2013 00:00:00 GMT https://hdl.handle.net/10023/4514 2013-11-05T00:00:00Z Scott-Hayward, Lindesay Alexandra Sarah
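As a rough illustration of the idea of combining density surfaces with Bayes' theorem while accounting for precision, the sketch below applies a cell-by-cell normal-normal update. This is not the algorithm from the thesis: it assumes both maps are on the same grid, that each cell's estimate is approximately Gaussian with a known variance, and that the two sources are independent. The function name `combine_surfaces` is hypothetical, and the kernel smoothing of joins is not shown.

```python
import numpy as np


def combine_surfaces(mean_a, var_a, mean_b, var_b):
    """Precision-weighted combination of two gridded density estimates.

    Assumption (not from the source): each cell estimate is Gaussian and the
    two maps are independent, so the combined precision is the sum of the
    individual precisions.
    """
    mean_a, var_a, mean_b, var_b = (
        np.asarray(x, dtype=float) for x in (mean_a, var_a, mean_b, var_b)
    )
    prec_a = 1.0 / var_a  # precision = inverse variance
    prec_b = 1.0 / var_b
    post_var = 1.0 / (prec_a + prec_b)
    post_mean = post_var * (prec_a * mean_a + prec_b * mean_b)
    return post_mean, post_var


# Two hypothetical 1 x 2 density maps; the second map is less precise in cell 2.
mean, var = combine_surfaces([2.0, 2.0], [1.0, 1.0], [4.0, 4.0], [1.0, 3.0])
print(mean, var)
```

Under these assumptions the more precise map dominates wherever its variance is smaller, which is the qualitative behaviour the abstract describes.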
Modelling catch sampling uncertainty in fisheries stock assessment : the Atlantic-Iberian sardine case https://hdl.handle.net/10023/4474 The statistical assessment of harvested fish populations, such as the Atlantic-Iberian sardine (AIS) stock, needs to deal with uncertainties inherent in fisheries systems. Uncertainties arising from sampling errors and stochasticity in stock dynamics must be incorporated in stock assessment models so that management decisions are based on realistic evaluation of the uncertainty about the status of the stock. The main goal of this study is to develop a stock assessment framework that accounts for some of the uncertainties associated with the AIS stock that are currently not integrated into stock assessment models. In particular, it focuses on accounting for the uncertainty arising from the catch data sampling process. The central innovation of the thesis is the development of a Bayesian integrated stock assessment (ISA) model, in which an observation model explicitly links stock dynamics parameters with statistical models for the various types of data observed from catches of the AIS stock. This allows for systematic and statistically consistent propagation of the uncertainty inherent in the catch sampling process across the whole stock assessment model, through to estimates of biomass and stock parameters. The method is tested by simulations and found to provide reliable and accurate estimates of stock parameters and associated uncertainty, while also outperforming existing design-based and model-based estimation approaches. The method is computationally very demanding and this is an obstacle to its adoption by fisheries bodies.
Once this obstacle is overcome, the ISA modelling framework developed and presented in this thesis could provide an important contribution to the improvement in the evaluation of uncertainty in fisheries stock assessments, not only of the AIS stock, but of any other fish stock with similar data and dynamics structure. Furthermore, the models developed in this study establish a solid conceptual platform to allow future development of more complex models of fish population dynamics. Tue, 01 Jan 2013 00:00:00 GMT https://hdl.handle.net/10023/4474 2013-01-01T00:00:00Z Caneco, Bruno
Estimating wildlife distribution and abundance from line transect surveys conducted from platforms of opportunity https://hdl.handle.net/10023/3727 Line transect data obtained from 'platforms of opportunity' are useful for the monitoring of long term trends in dolphin populations which occur over vast areas, yet analyses of such data are problematic due to violation of fundamental assumptions of line transect methodology. In this thesis we develop methods which allow estimates of dolphin relative abundance to be obtained when certain assumptions of line transect sampling are violated. Generalised additive models are used to model encounter rate and mean school size as a function of spatially and temporally referenced covariates. The estimated relationship between the response and the environmental and locational covariates is then used to obtain a predicted surface for the response over the entire survey region. Given those predicted surfaces, a density surface can then be obtained and an estimate of abundance computed by numerically integrating over the entire survey region.
This approach is particularly useful when search effort is not random, in which case standard line transect methods would yield biased estimates. Estimates of f(0) (the inverse of the effective strip (half-)width), an essential component of the line transect estimator, may also be biased due to heterogeneity in detection probabilities. We developed a conditional likelihood approach in which covariate effects are directly incorporated into the estimation procedure. Simulation results indicated that the method performs well in the presence of size-bias. When multiple covariates are used, it is important that covariate selection be carried out. As an example we applied the methods described above to eastern tropical Pacific dolphin stocks. However, uncertainty in stock identification has never been directly incorporated into methods used to obtain estimates of relative or absolute abundance. Therefore we illustrate an approach in which trends in dolphin relative abundance are monitored by small areas, rather than stocks. Mon, 01 Jan 2001 00:00:00 GMT https://hdl.handle.net/10023/3727 2001-01-01T00:00:00Z Marques, Fernanda F. C.
Bayesian point process modelling of ecological communities https://hdl.handle.net/10023/3710 The modelling of biological communities is important to further the understanding of species coexistence and the mechanisms involved in maintaining biodiversity. This involves considering not only interactions between individual biological organisms, but also the incorporation of covariate information, if available, in the modelling process. This thesis explores the use of point processes to model interactions in bivariate point patterns within a Bayesian framework, and, where applicable, in conjunction with covariate data. Specifically, we distinguish between symmetric and asymmetric species interactions and model these using appropriate point processes.
In this thesis we consider both pairwise and area interaction point processes to allow for inhibitory interactions and both inhibitory and attractive interactions. It is envisaged that the analyses and innovations presented in this thesis will contribute to the parsimonious modelling of biological communities. Fri, 28 Jun 2013 00:00:00 GMT https://hdl.handle.net/10023/3710 2013-06-28T00:00:00Z Nightingale, Glenna Faith
Animal population estimation using mark-recapture and plant-capture https://hdl.handle.net/10023/3655 Mark-recapture is a method of population estimation that involves capturing a number of animals from a population of unknown size on several occasions, and marking those animals that are caught each time. By observing the number of marked animals that are subsequently seen, estimates of the total population size can be made. There are various subclasses of the mark-recapture method called the Otis-class of models (Otis, Burnham, White & Anderson 1978).
These relate to the assumed behaviour of the individuals in the target population. More recent work has generalised the theory of mark-recapture to the so-called plant-capture, where a known number of animals are pre-inserted into the target population. Sampling is then carried out as normal, but with additional information coming from knowledge of the number of planted individuals. The theory underpinning plant-capture is less well-developed than mark-recapture, with the difference on population estimation of the former over the latter not often tested. This thesis shows that, under fixed and random sample-size models, the inclusion of plants can improve the mean point population estimation of various estimators. The estimator of Pathak (1964) is generalised to allow for the inclusion of plants into the target population. The results show that mean estimates from most estimators, under most models, can be improved with the inclusion of plants, and the sample standard deviations of the simulations can be reduced. This improvement in mean point population estimation is particularly pronounced when the number of animals captured is low. Sample coverage, which is the proportion of distinct animals caught during sampling, is also often sought by practitioners. Given here is a generalisation of the inverse population estimator of Pathak (1964) to plant-capture and a proposed new inverse population estimator, which can be used as estimates of the coverage of a sample. Sun, 01 Jan 2012 00:00:00 GMT https://hdl.handle.net/10023/3655 2012-01-01T00:00:00Z Gormley, Richard
Estimating anglerfish abundance from trawl surveys, and related problems https://hdl.handle.net/10023/3652 The content of this thesis was motivated by the need to estimate anglerfish abundance from stratified random trawl surveys of the anglerfish stock which occupies the northern European shelf (Fernandes et al., 2007).
The survey was conducted annually from 2005 to 2010 in order to obtain age-structured estimates of absolute abundance for this stock. An estimation method is considered to incorporate statistical models for herding, length-based net retention probability and missing age data and uncertainty from all of these sources in variance estimation. A key component of abundance estimation is the estimation of capture probability. Capture probability is estimated from the experimental survey data using various logistic regression models with haul as a random effect. Conditional on the estimated capture probability, a number of abundance estimators are developed and applied to the anglerfish data. The abundance estimators differ in the way that the haul effect is incorporated. The performance of these estimators is investigated by simulation. An estimator with form similar to that conventionally used to estimate abundance from distance sampling surveys is found to perform best. The estimators developed for the anglerfish survey data which incorporate random effects in capture probability have wider application than trawl surveys. We examine the analytic properties of these estimators when the capture/detection probability is known. We apply these estimators to three different types of survey data in addition to the anglerfish data, with different forms of random effects and investigate their performance by simulation. We find that a generalization of the form of estimator typically used on line transect surveys performs best overall. It has low bias, and also the lowest bias and mean squared error among all the estimators we considered. Sun, 01 Jan 2012 00:00:00 GMT https://hdl.handle.net/10023/3652 2012-01-01T00:00:00Z Yuan, Yuan
Mixed effect models in distance sampling https://hdl.handle.net/10023/3618 Recently, much effort has been expended for improving conventional distance sampling methods, e.g.
by replacing the design-based approach with a model-based approach where observed counts are related to environmental covariates (Hedley and Buckland, 2004) or by incorporating covariates in the detection function model (Marques and Buckland, 2003). While these models have generally been limited to include fixed effects, we propose four different methods for analysing distance sampling data using mixed effects models. These include an extension of the two-stage approach (Buckland et al., 2009), where we include site random effects in the second-stage count model to account for correlated counts at the same sites. We also present two integrated approaches which include site random effects in the count model. These approaches combine the analysis stages for the detection and count models and allow simultaneous estimation of all parameters. Furthermore, we develop a detection function model that incorporates random effects. We also propose a novel Bayesian approach to analysing distance sampling data which uses a Metropolis-Hastings algorithm for updating model parameters and a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm for assessing model uncertainty. Lastly, we propose using hierarchical centering as a novel technique for improving model mixing and hence facilitating an RJMCMC algorithm for mixed models. We analyse two case studies, both large-scale point transect surveys, where the interest lies in establishing the effects of conservation buffers on agricultural fields. For each case study, we compare the results from one integrated approach to those from the extended two-stage approach. We find that these may differ in parameter estimates for covariates that were both in the detection and the count model and in model probabilities when model uncertainty was included in inference. 
The performance of the random effects based detection function is assessed via simulation and when heterogeneity in the data is present, one of the new estimators yields improved results compared to conventional distance sampling estimators. Tue, 01 Jan 2013 00:00:00 GMT https://hdl.handle.net/10023/3618 2013-01-01T00:00:00Z Oedekoven, Cornelia Sabrina
Quantifying biodiversity trends in time and space https://hdl.handle.net/10023/3414 The global loss of biodiversity calls for robust large-scale diversity assessment. Biological diversity is a multi-faceted concept; defined as the “variety of life”, answering questions such as “How much is there?” or more precisely “Have we succeeded in reducing the rate of its decline?” is not straightforward. While various aspects of biodiversity give rise to numerous ways of quantification, we focus on temporal (and spatial) trends and their changes in species diversity. Traditional diversity indices summarise information contained in the species abundance distribution, i.e. each species' proportional contribution to total abundance. Estimated from data, these indices can be biased if variation in detection probability is ignored. We discuss differences between diversity indices and demonstrate possible adjustments for detectability. Additionally, most indices focus on the most abundant species in ecological communities. We introduce a new set of diversity measures, based on a family of goodness-of-fit statistics. A function of a free parameter, this family allows us to vary the sensitivity of these measures to dominance and rarity of species.
Their performance is studied by assessing temporal trends in diversity for five communities of British breeding birds based on 14 years of survey data, where they are applied alongside the current headline index, a geometric mean of relative abundances. Revealing the contributions of both rare and common species to biodiversity trends, these "goodness-of-fit" measures provide novel insights into how ecological communities change over time. Biodiversity is not only subject to temporal changes, but it also varies across space. We take first steps towards estimating spatial diversity trends. Finally, processes maintaining biodiversity act locally, at specific spatial scales. Contrary to abundance-based summary statistics, spatial characteristics of ecological communities may distinguish these processes. We suggest a generalisation to a spatial summary, the cross-pair overlap distribution, to render it more flexible to spatial scale. Fri, 30 Nov 2012 00:00:00 GMT https://hdl.handle.net/10023/3414 2012-11-30T00:00:00Z Studeny, Angelika C.
Finite and infinite ergodic theory for linear and conformal dynamical systems https://hdl.handle.net/10023/3220 The first main topic of this thesis is the thorough analysis of two families of piecewise linear maps on the unit interval, the α-Lüroth and α-Farey maps. Here, α denotes a countably infinite partition of the unit interval whose atoms only accumulate at the origin. The basic properties of these maps will be developed, including that each α-Lüroth map (denoted Lα) gives rise to a series expansion of real numbers in [0,1], a certain type of Generalised Lüroth Series. The first example of such an expansion was given by Lüroth. The map Lα is the jump transformation of the corresponding α-Farey map Fα.
The maps Lα and Fα share the same relationship as the classical Farey and Gauss maps which give rise to the continued fraction expansion of a real number. We also consider the topological properties of Fα and some Diophantine-type sets of numbers expressed in terms of the α-Lüroth expansion. Next we investigate certain ergodic-theoretic properties of the maps Lα and Fα. It will turn out that the Lebesgue measure λ is invariant for every map Lα and that there exists a unique Lebesgue-absolutely continuous invariant measure for Fα. We will give a precise expression for the density of this measure. Our main result is that both Lα and Fα are exact, and thus ergodic. The interest in the invariant measure for Fα lies in the fact that under a particular condition on the underlying partition α, the invariant measure associated to the map Fα is infinite. Then we proceed to introduce and examine the sequence of α-sum-level sets arising from the α-Lüroth map, for an arbitrary given partition α. These sets can be written dynamically in terms of Fα. The main result concerning the α-sum-level sets is to establish weak and strong renewal laws. Note that for the Farey map and the Gauss map, the analogue of this result has been obtained by Kesseböhmer and Stratmann. There the results were derived by using advanced infinite ergodic theory, rather than the strong renewal theorems employed here. This underlines the fact that one of the main ingredients of infinite ergodic theory is provided by some delicate estimates in renewal theory. Our final main result concerning the α-Lüroth and α-Farey systems is to provide a fractal-geometric description of the Lyapunov spectra associated with each of the maps Lα and Fα. The Lyapunov spectra for the Farey map and the Gauss map have been investigated in detail by Kesseböhmer and Stratmann. The Farey map and the Gauss map are non-linear, whereas the systems we consider are always piecewise linear. 
However, since our analysis is based on a large family of different partitions of U, the class of maps which we consider in this paper allows us to detect a variety of interesting new phenomena, including that of phase transitions. Finally, we come to the conformal systems of the title. These are the limit sets of discrete subgroups of the group of isometries of the hyperbolic plane. For these so-called Fuchsian groups, our first main result is to establish the Hausdorff dimension of some Diophantine-type sets contained in the limit set that are similar to those considered for the maps Lα. These sets are then used in our second main result to analyse the more geometrically defined strict-Jarník limit set of a Fuchsian group. Finally, we obtain a “weak multifractal spectrum” for the Patterson measure associated to the Fuchsian group. Wed, 30 Nov 2011 00:00:00 GMT https://hdl.handle.net/10023/3220 2011-11-30T00:00:00Z Munday, Sara
Spatial patterns and species coexistence : using spatial statistics to identify underlying ecological processes in plant communities https://hdl.handle.net/10023/3084 The use of spatial statistics to investigate ecological processes in plant communities is becoming increasingly widespread. In diverse communities such as tropical rainforests, analysis of spatial structure may help to unravel the various processes that act and interact to maintain high levels of diversity. In particular, a number of contrasting mechanisms have been suggested to explain species coexistence, and these differ greatly in their practical implications for the ecology and conservation of tropical forests. Traditional first-order measures of community structure have proved unable to distinguish these mechanisms in practice, but statistics that describe spatial structure may be able to do so. This is of great interest and relevance as spatially explicit data become available for a range of ecological communities and analysis methods for these data become more accessible. This thesis investigates the potential for inference about underlying ecological processes in plant communities using spatial statistics. Current methodologies for spatial analysis are reviewed and extended, and are used to characterise the spatial signals of the principal theorised mechanisms of coexistence. The sensitivity of a range of spatial statistics to these signals is assessed, and the strength of such signals in natural communities is investigated.
The spatial signals of the processes considered here are found to be strong and robust to modelled stochastic variation. Several new and existing spatial statistics are found to be sensitive to these signals, and offer great promise for inference about underlying processes from empirical data. The relative strengths of particular processes are found to vary between natural communities, with any one theory being insufficient to explain observed patterns. This thesis extends both understanding of species coexistence in diverse plant communities and the methodology for assessing underlying process in particular cases. It demonstrates that the potential of spatial statistics in ecology is great and largely unexplored. Thu, 01 Nov 2012 00:00:00 GMT https://hdl.handle.net/10023/3084 2012-11-01T00:00:00Z Brown, Calum
Estimating abundance of rare, small mammals: A case study of the Key Largo woodrat (Neotoma floridana smalli) https://hdl.handle.net/10023/2068 Estimates of animal abundance or density are fundamental quantities in ecology and conservation, but for many species such as rare, small mammals, obtaining robust estimates is problematic. In this thesis, I combine elements of two standard abundance estimation methods, capture-recapture and distance sampling, to develop a method called trapping point transects (TPT). In TPT, a "detection function", g(r) (i.e. the probability of capturing an animal, given it is r m from a trap when the trap is set) is estimated using a subset of animals whose locations are known prior to traps being set. Generalised linear models are used to estimate the detection function, and the model can be extended to include random effects to allow for heterogeneity in capture probabilities.
Standard point transect methods are modified to estimate abundance. Two abundance estimators are available. The first estimator is based on the reciprocal of the expected probability of detecting an animal, P̂, where the expectation is over r; whereas the second estimator is the expectation of the reciprocal of P̂. Performance of the TPT method under various sampling efforts and underlying true detection probabilities of individuals in the population was investigated in a simulation study. When underlying probability of detection was high (g(0) = 0.88) and between-individual variation was small, survey effort could be surprisingly low (c. 510 trap nights) to yield low bias (c. 4%) in the two estimators; but under certain situations, the second estimator can be extremely biased. Uncertainty and relative bias in population estimates increased with decreasing detectability and increasing between-individual variation. Abundance of the Key Largo woodrat (Neotoma floridana smalli), an endangered rodent with a restricted geographic range, was estimated using TPT. The TPT method compared well to other viable methods (capture-recapture and spatially-explicit capture-recapture), in terms of both field practicality and cost. The TPT method may generally be useful in estimating animal abundance in trapping studies and variants of the TPT method are presented. Sat, 01 Jan 2011 00:00:00 GMT https://hdl.handle.net/10023/2068 2011-01-01T00:00:00Z Potts, Joanne M.
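The difference between the two TPT abundance estimators described in the abstract above can be illustrated with a small numerical sketch. This is not the thesis's implementation: the logistic form of g(r), the coefficients b0 and b1, and the list of distances are all assumptions chosen for illustration (b0 = 2 gives g(0) of roughly 0.88). The sketch only shows why taking the reciprocal of a mean and taking the mean of reciprocals can disagree sharply when some detection probabilities are small.

```python
import math


def detection_prob(r, b0, b1):
    """Hypothetical logistic detection function g(r): capture probability at distance r."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * r)))


def reciprocal_of_mean(probs):
    """Reciprocal of the average detection probability."""
    return 1.0 / (sum(probs) / len(probs))


def mean_of_reciprocal(probs):
    """Average of the reciprocals of the detection probabilities."""
    return sum(1.0 / p for p in probs) / len(probs)


# Illustrative distances (in metres) and coefficients; all values are assumptions.
distances = [0.0, 10.0, 20.0, 40.0, 80.0]
probs = [detection_prob(r, b0=2.0, b1=-0.08) for r in distances]

first = reciprocal_of_mean(probs)    # analogue of the first estimator's scaling factor
second = mean_of_reciprocal(probs)   # analogue of the second estimator's scaling factor
print(first, second)
```

Because 1/p is convex for positive p, the mean of reciprocals is never smaller than the reciprocal of the mean, and a single very small probability inflates it heavily. That is one plausible reading of why the second estimator can be extremely biased in some situations.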
Bayesian modelling of integrated data and its application to seabird populations https://hdl.handle.net/10023/1635 Integrated data analyses are becoming increasingly popular in studies of wild animal populations where two or more separate sources of data contain information about common parameters. Here we develop an integrated population model using abundance and demographic data from a study of common guillemots (Uria aalge) on the Isle of May, southeast Scotland. A state-space model for the count data is supplemented by three demographic time series (productivity and two mark-recapture-recovery (MRR)), enabling the estimation of prebreeder emigration rate - a parameter for which there is no direct observational data, and which is unidentifiable in the separate analysis of MRR data. A Bayesian approach using MCMC provides a flexible and powerful analysis framework. This model is extended to provide predictions of future population trajectories. Adopting random effects models for the survival and productivity parameters, we implement the MCMC algorithm to obtain a posterior sample of the underlying process means and variances (and population sizes) within the study period. Given this sample, we predict future demographic parameters, which in turn allows us to predict future population sizes and obtain the corresponding posterior distribution. Under the assumption that recent, unfavourable conditions persist in the future, we obtain a posterior probability of 70% that there is a population decline of >25% over a 10-year period. Lastly, using MRR data we test for spatial, temporal and age-related correlations in guillemot survival among three widely separated Scottish colonies that have varying overlap in nonbreeding distribution. We show that survival is highly correlated over time for colonies/age classes sharing wintering areas, and essentially uncorrelated for those with separate wintering areas. 
These results strongly suggest that one or more aspects of winter environment are responsible for spatiotemporal variation in survival of British guillemots, and provide insight into the factors driving multi-population dynamics of the species. Tue, 30 Nov 2010 00:00:00 GMT https://hdl.handle.net/10023/1635 2010-11-30T00:00:00Z Reynolds, Toby J.
Statistical models for the long-term monitoring of songbird populations: a Bayesian analysis of constant effort sites and ring-recovery data https://hdl.handle.net/10023/885 To underpin and improve advice given to government and other interested parties on the state of Britain’s common songbird populations, new models for analysing ecological data are developed in this thesis. These models use data from the British Trust for Ornithology’s Constant Effort Sites (CES) scheme, an annual bird-ringing programme in which catch effort is standardised. Data from the CES scheme are routinely used to index abundance and productivity, and to a lesser extent estimate adult survival rates. However, two features of the CES data that complicate analysis were previously inadequately addressed, namely the presence in the catch of “transient” birds not associated with the local population, and the sporadic failure in the constancy of effort assumption arising from the absence of within-year catch data. The current methodology is extended, with efficient Bayesian models developed for each of these demographic parameters that account for both of these data nuances, and from which reliable and usefully precise estimates are obtained. Of increasing interest is the relationship between abundance and the underlying vital rates, an understanding of which facilitates effective conservation. CES data are particularly amenable to an integrated approach to population modelling, providing a combination of demographic information from a single source.
Such an integrated approach is developed here, employing Bayesian methodology and a simple population model to unite abundance, productivity and survival within a consistent framework. Independent data from ring-recoveries provide additional information on adult and juvenile survival rates. Specific advantages of this new integrated approach are identified, among which is the ability to determine juvenile survival accurately, disentangle the probabilities of survival and permanent emigration, and to obtain estimates of total seasonal productivity. The methodologies developed in this thesis are applied to CES data from Sedge Warbler, Acrocephalus schoenobaenus, and Reed Warbler, A. scirpaceus. Fri, 25 Jun 2010 00:00:00 GMT https://hdl.handle.net/10023/885 2010-06-25T00:00:00Z Cave, Vanessa M.
Topics in estimation of quantum channels https://hdl.handle.net/10023/869 A quantum channel is a mapping which sends density matrices to density matrices. The estimation of quantum channels is of great importance to the field of quantum information. In this thesis two topics related to estimation of quantum channels are investigated. The first of these is the upper bound of Sarovar and Milburn (2006) on the Fisher information obtainable by measuring the output of a channel. Two questions raised by Sarovar and Milburn about their bound are answered. A Riemannian metric on the space of quantum states is introduced, related to the construction of the Sarovar and Milburn bound. Its properties are characterized. The second topic investigated is the estimation of unitary channels. The situation is considered in which an experimenter has several non-identical unitary channels that have the same parameter.
It is shown that it is possible to improve estimation using the channels together, analogous to the case of identical unitary channels. Also, a new method of phase estimation is given based on a method sketched by Kitaev (1996). Unlike other phase estimation procedures which perform similarly, this procedure requires only very basic experimental resources. Wed, 23 Jun 2010 00:00:00 GMT https://hdl.handle.net/10023/869 2010-06-23T00:00:00Z O'Loan, Caleb J.
Multi-species state-space modelling of the hen harrier (Circus cyaneus) and red grouse (Lagopus lagopus scoticus) in Scotland https://hdl.handle.net/10023/864 State-space modelling is a powerful tool to study ecological systems.
The direct inclusion of uncertainty, unification of models and data, and ability to model unobserved, hidden states increases our knowledge about the environment and provides new ecological insights. I extend the state-space framework to create multi-species models, showing that the ability to model ecosystem interactions is limited only by data availability. State-space models are fit using both Bayesian and Frequentist methods, making them independent of a statistical school of thought. Bayesian approaches can have the advantage in their ability to account for missing data and fit hierarchical structures and models with many parameters to limited data; often the case in ecological studies. I have taken a Bayesian model fitting approach in this thesis. The predator-prey interactions between the hen harrier (Circus cyaneus) and red grouse (Lagopus lagopus scoticus) are used to demonstrate state-space modelling’s capabilities. The harrier data are believed to be known without error, while missing data make the cyclic dynamics of the grouse harder to model. The grouse-harrier interactions are modelled in a multi-species state-space model, rather than including one species as a covariate in the other’s model. Finally, models are included for the harriers’ alternate prey. The single- and multi-species state-space models for the predator-prey interactions provide insight into the species’ management. The models investigate aspects of the species’ behaviour, from the mechanisms behind grouse cycles to what motivates harrier immigration. The inferences drawn from these models are applicable to management, suggesting actions to halt grouse cycles or mitigate the grouse-harrier conflict. 
Overall, the multi-species models suggest that two popular ideas for grouse-harrier management, diversionary feeding and habitat manipulation to reduce alternate prey densities, will not have the desired effect, and in the case of reducing prey densities, may even increase the harriers’ impact on grouse chicks. Wed, 23 Jun 2010 00:00:00 GMT https://hdl.handle.net/10023/864 2010-06-23T00:00:00Z New, Leslie Frances
Embedding population dynamics in mark-recapture models https://hdl.handle.net/10023/718 Mark-recapture methods use repeated captures of individually identifiable animals to provide estimates of properties of populations. Different models allow estimates to be obtained for population size and rates of processes governing population dynamics. State-space models consist of two linked processes evolving simultaneously over time. The state process models the evolution of the true, but unknown, states of the population. The observation process relates observations on the population to these true states. Mark-recapture models specified within a state-space framework allow population dynamics models to be embedded in inference ensuring that estimated changes in the population are consistent with assumptions regarding the biology of the modelled population. This overcomes a limitation of current mark-recapture methods. Two alternative approaches are considered. The "conditional" approach conditions on known numbers of animals possessing capture history patterns including capture in the current time period. An animal's capture history determines its state; consequently, capture parameters appear in the state process rather than the observation process. There is no observation error in the model.
Uncertainty occurs only through the numbers of animals not captured in the current time period. An "unconditional" approach is considered in which the capture histories are regarded as observations. Consequently, capture histories do not influence an animal's state and capture probability parameters appear in the observation process. Capture histories are considered a random realization of the stochastic observation process. This is more consistent with traditional mark-recapture methods. Development and implementation of particle filtering techniques for fitting these models under each approach are discussed. Simulation studies show reasonable performance for the unconditional approach and highlight problems with the conditional approach. Strengths and limitations of each approach are outlined, with reference to Soay sheep data analysis, and suggestions are presented for future analyses. Wed, 24 Jun 2009 00:00:00 GMT https://hdl.handle.net/10023/718 2009-06-24T00:00:00Z Bishop, Jonathan R. B.
Using generalized estimating equations with regression splines to improve analysis of butterfly transect data https://hdl.handle.net/10023/488 Surveying animal populations is an important aspect of wildlife management. Distinguishing trend from random fluctuations and quantifying trend are key goals in any analysis. The aim of this thesis is to review analyses of Butterfly Monitoring Survey (BMS) data and to develop new methods which address some flaws in previous studies. The BMS was established in 1976 at Monks Wood, Cambridgeshire and sites were added over time throughout Britain in order to monitor butterfly population trends.
Weekly counts are made over the monitoring season and the main aims are to produce annual indices and compare these indices over time for any particular species. Originally, weekly counts were summed to produce relative indices and missing counts were estimated using linear interpolation. This thesis discusses the weaknesses of this basic method and suggests possible improvements. In recent years, with advancements in statistical methods and increased computer power, new methods can be applied to accommodate the longitudinal and flexible nature of ecological data. Mixed Models, Generalized Estimating Equations and Generalized Additive Models are used and the relative merits of each modelling approach discussed. These methods allow for correlation and non-linearity in data. Model selection is an important consideration when modelling and different tests are introduced and compared. Once a model is selected, site-level indices are estimated, which can be collated to produce regional and national indices. Different methods of estimating precision around indices are also contrasted. Bootstrapping is found to be a convenient and dependable approach. Abundance is difficult to disentangle from detectability when only counts of species are carried out. Methods for dealing with this problem are suggested. Once reliable annual abundance estimates are found, they can be compared over time using a variety of statistical techniques. The chain-ratio method is applied to a subset of real data. Sun, 01 Jun 2008 00:00:00 GMT https://hdl.handle.net/10023/488 2008-06-01T00:00:00Z Brewer, Ciara
Incorporating measurement error and density gradients in distance sampling surveys https://hdl.handle.net/10023/391 Distance sampling is one of the most commonly used methods for estimating density and abundance.
Conventional methods are based on the distances of detected animals from the center of point transects or the center line of line transects. These distances are used to model a detection function: the probability of detecting an animal, given its distance from the line or point. The probability of detecting an animal in the covered area is given by the mean value of the detection function with respect to the available distances to be detected. Given this probability, a Horvitz-Thompson-like estimator of abundance for the covered area follows, hence using a model-based framework. Inferences for the wider survey region are justified using the survey design. Conventional distance sampling methods are based on a set of assumptions. In this thesis I present results that extend distance sampling on two fronts. Firstly, estimators are derived for situations in which there is measurement error in the distances. These estimators use information about the measurement error in two ways: (1) a biased estimator based on the contaminated distances is multiplied by an appropriate correction factor, which is a function of the errors (PDF approach), and (2) cast into a likelihood framework that allows parameter estimation in the presence of measurement error (likelihood approach). Secondly, methods are developed that relax the conventional assumption that the distribution of animals is independent of distance from the lines or points (usually guaranteed by appropriate survey design). In particular, the new methods deal with the case where animal density gradients are caused by the use of non-random sampler allocation, for example transects placed along linear features such as roads or streams. This is dealt with separately for line and point transects, and at a later stage an approach for combining the two is presented. A considerable number of simulations and example analyses illustrate the performance of the proposed methods.
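The conventional estimator summarised in the abstract above (mean detection probability over available distances, followed by a Horvitz-Thompson-like abundance estimate for the covered area) can be sketched as follows. This is a minimal illustration, not the thesis's code: it assumes a line transect, a half-normal detection function g(x) = exp(-x^2 / (2 sigma^2)), perpendicular distances uniformly available on [0, w], and a known sigma, whereas in practice sigma would be estimated from the observed distances. The function names and numerical values are hypothetical.

```python
import math


def mean_detection_prob(sigma, w):
    """Mean of the half-normal detection function over distances uniform on [0, w].

    Uses the closed form of the integral of exp(-x^2 / (2 sigma^2)) from 0 to w.
    """
    integral = sigma * math.sqrt(math.pi / 2.0) * math.erf(w / (sigma * math.sqrt(2.0)))
    return integral / w


def covered_area_abundance(n_detected, sigma, w):
    """Horvitz-Thompson-like estimate: each detection is weighted by 1 / P."""
    return n_detected / mean_detection_prob(sigma, w)


# Illustrative values only: 60 detections, scale 20 m, truncation distance 50 m.
p = mean_detection_prob(sigma=20.0, w=50.0)
n_hat = covered_area_abundance(60, sigma=20.0, w=50.0)
print(p, n_hat)
```

If animal density varied with distance from the line, for example along roads or streams, the uniform-availability assumption built into `mean_detection_prob` would no longer hold, which is the situation the density-gradient methods in the thesis are designed to address.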
Thu, 01 Nov 2007 00:00:00 GMT https://hdl.handle.net/10023/391 2007-11-01T00:00:00Z Marques, Tiago Andre Lamas Oliveira
A Bayesian approach to modelling field data on multi-species predator prey-interactions https://hdl.handle.net/10023/174 Multi-species functional response models are required to model the predation of generalist predators, which consume more than one prey species. In chapter 2, a new model for the multi-species functional response is presented. This model can describe generalist predators that exhibit functional responses of Holling type II to some of their prey and of type III to other prey. In chapter 3, I review some of the theoretical distinctions between Bayesian and frequentist statistics and show how Bayesian statistics are particularly well-suited for the fitting of functional response models because uncertainty can be represented comprehensively. In chapters 4 and 5, the multi-species functional response model is fitted to field data on two generalist predators: the hen harrier Circus cyaneus and the harp seal Phoca groenlandica. I am not aware of any previous Bayesian model of the multi-species functional response that has been fitted to field data. The hen harrier's functional response fitted in chapter 4 is strongly sigmoidal to the densities of red grouse Lagopus lagopus scoticus, but no type III shape was detected in the response to the two main prey species, field vole Microtus agrestis and meadow pipit Anthus pratensis. The impact of using Bayesian or frequentist models on the resulting functional response is discussed. In chapter 5, no functional response could be fitted to the data on harp seal predation. Possible reasons are discussed, including poor data quality or a lack of relevance of the available data for informing a behavioural functional response model.
I conclude with a comparison of the role that functional responses play in behavioural, population and community ecology and emphasise the need for further research into unifying these different approaches to understanding predation with particular reference to predator movement. In an appendix, I evaluate the possibility of using a functional response for inferring the abundances of prey species from performance indicators of generalist predators feeding on these prey. I argue that this approach may be futile in general, because a generalist predator's energy intake does not depend on the density of any single one of its prey, so that the possibly unknown densities of all prey need to be taken into account.
Sun, 01 Jan 2006 00:00:00 GMT https://hdl.handle.net/10023/174 2006-01-01T00:00:00Z Asseburg, Christian
Reconstruction of foliations from directional information https://hdl.handle.net/10023/158 In many areas of science, especially geophysics, geography and meteorology, the data are often directions or axes rather than scalars or unrestricted vectors. Directional statistics considers data which are mainly unit vectors lying in two- or three-dimensional space ($\mathbb{R}^{2}$ or $\mathbb{R}^{3}$). One way in which directional data arise is as normals to foliations. A (codimension-1) foliation of $\mathbb{R}^{d}$ is a system of non-intersecting $(d-1)$-dimensional surfaces filling out the whole of $\mathbb{R}^{d}$.
At each point $z$ of $\mathbb{R}^{d}$, any given codimension-1 foliation determines a unit vector $v$ normal to the surface through $z$. The problem considered here is that of reconstructing the foliation from observations $(z_{i}, v_{i})$, $i = 1, \ldots, n$. One way of doing this is rather similar to fitting smoothing splines to data. That is, the reconstructed foliation has to be as close to the data as possible, while the foliation itself is not too rough. A tradeoff parameter is introduced to control the balance between smoothness and closeness. The approach used in this thesis is to take the surfaces to be surfaces of constant values of a suitable real-valued function $h$ on $\mathbb{R}^{d}$. The problem of reconstructing a foliation is translated into the language of Schwartz distributions and a deep result in the theory of distributions is used to give the appropriate general form of the fitted function $h$. The model parameters are estimated by a simplified Newton method. Under appropriate distributional assumptions on $v_{1}, \ldots, v_{n}$, confidence regions for the true normals are developed and estimates of concentration are given.
Fri, 01 Jun 2007 00:00:00 GMT https://hdl.handle.net/10023/158 2007-06-01T00:00:00Z Yeh, Shu-Ying
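As a rough illustration of the foliation abstract above, the sketch below takes the surfaces to be level sets of a real-valued function h and recovers the unit normal at a point as the normalised gradient of h. The example function `h`, the finite-difference step `eps` and the misfit measure `axial_misfit` are hypothetical choices made for illustration; the thesis's actual closeness and roughness criteria are not stated in the abstract and are not reproduced here.

```python
import math


def h(z):
    """Example function (an assumption): its level sets are circles about the origin."""
    x, y = z
    return x * x + y * y


def unit_normal(func, z, eps=1e-5):
    """Unit normal to the level set of func through z, via central differences."""
    grad = []
    for k in range(len(z)):
        up = list(z)
        up[k] += eps
        down = list(z)
        down[k] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        raise ValueError("gradient vanishes; the normal is undefined here")
    return [g / norm for g in grad]


def axial_misfit(func, points, normals):
    """Mean of 1 - (v_i . n(z_i))^2, a hypothetical closeness measure.

    It is zero when every observed normal is parallel (up to sign) to the
    normal of the level set of func at the same point.
    """
    total = 0.0
    for z, v in zip(points, normals):
        n = unit_normal(func, z)
        dot = sum(a * b for a, b in zip(v, n))
        total += 1.0 - dot * dot
    return total / len(points)


if __name__ == "__main__":
    points = [(3.0, 4.0), (0.0, 2.0)]
    observed = [(0.6, 0.8), (0.0, -1.0)]  # made-up observations for illustration
    print("normal at (3, 4):", unit_normal(h, points[0]))
    print("misfit:", axial_misfit(h, points, observed))
```

A penalised fit in the spirit of the abstract would minimise such a misfit plus a tradeoff parameter times a roughness measure of h; that step is omitted because its exact form is not given in the abstract.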