Computer Science (School of)

Processing clinical guideline text for formal verification

Rahman, Fahrurrozi — 2022-06-15T00:00:00Z

Clinical guidelines are evidence-based recommendations developed to assist practitioners in their decisions on appropriate care for patients with specific clinical circumstances. They provide succinct instructions such as what drugs should be given or taken for a particular condition, how long such treatment should be given, what tests should be conducted, or other situational clinical circumstances for certain diseases. However, as they are described in natural language, they are prone to problems such as ambiguity and incompleteness. As the guidelines are publicly accessible, we expect them to be free from inconsistencies and gaps. This thesis aims to answer two questions regarding the correctness of clinical guidelines: (1) How can we extract the main information in clinical guideline texts? (2) How can we check the guidelines for correctness and consistency? To answer these questions, first, we develop several methods to mark and capture the semantic information in the texts. We start by building a controlled natural language to reduce the complexity of the texts’ structure. We show that this approach is easy to set up but does not scale. We then consider machine learning approaches and use semantic role labelling, named-entity recognition and relation classification techniques. To support these tasks, we create a clinical guideline corpus tagged with process labels. We show that even with a small corpus, the baseline performance is promising. We then investigate fine-tuning some state-of-the-art neural model architectures and obtain better performance. Finally, we create a framework to transform the clinical guidelines into formal statements and check their correctness against some properties using model checkers or constraint solvers.
This thesis presents a study and analyses of entity labelling and relation classification in regard to clinical guidelines, as well as formally checking their correctness, providing insights and future research directions on the improvement of clinical guidelines.

Automatic evaluation of geopolitical risk

Burns, John Corcoran — 2024-12-03T00:00:00Z

This thesis aims to construct programs for automatically evaluating geopolitical risks. This project uses machine learning, specifically sentiment analysis, topic modeling and named-entity recognition (NER), to build new computer programs to evaluate and assess not just scheduled and predictable geopolitical events, but also unpredictable events. The gap in the current literature that this project intends to fill is the ability to respond to these risk events in real time. Thus, this project’s objective is to build programs that can digest the vast quantities of data generated by Twitter / X, the data source chosen for this project, focusing on keywords indicating a potential geopolitical event or crisis. Using Twitter / X, I found that information appeared more quickly in tweets than in traditional news sources, so I was able to identify emerging geopolitical topics, in some cases hours or days before they were discussed in the mainstream media. I also succeeded in building a geopolitical risk index program that uses sentiment analysis, and in relating the index to trends in the financial markets around the start of the Ukraine War in 2022 at the daily level. Using Granger causality, I found that the geopolitical risk index, created from the emotions gleaned from sentiment analysis of the relevant tweets, contained predictive information about the movement of various financial assets over time. In addition, with the NER program, I was able to visualize the different geopolitical risks on a world map. While I managed to create a program that produced a geopolitical risk index in real time, there was unfortunately little relationship between the real-time risk index and changes in the financial markets. In combination, the output of these various programs allowed for the automatic evaluation of geopolitical risk.
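
The Granger causality step named in this abstract can be illustrated with a small sketch. It is a simplified, single-lag version written with the standard library only, not the procedure used in the thesis: it compares a model that predicts a series from its own previous value against one that also uses the previous value of a second series, and reports the F statistic for the improvement. The function names `ols_rss` and `granger_f` are hypothetical, and no p-value is computed.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit.

    Solves the normal equations (X^T X) beta = X^T y by Gauss-Jordan
    elimination with partial pivoting; adequate for a handful of regressors.
    """
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y):
    """F statistic for 'the previous value of x helps predict y' (one lag)."""
    targets = y[1:]
    # Restricted model: y_t ~ 1 + y_{t-1}
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: y_t ~ 1 + y_{t-1} + x_{t-1}
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    n = len(targets)
    # One extra parameter in the unrestricted model, three parameters in total.
    return (rss_r - rss_u) / (rss_u / (n - 3))
```

A large F statistic for `granger_f(index, returns)` alongside a small one for `granger_f(returns, index)` would be consistent with the index carrying predictive information about the returns. In practice a statistics package with multiple lags and proper significance tests would be the more usual choice.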

Biologically-informed interpretable deep learning techniques for BMI prediction and gene interaction detection

Hequet, Chloe — 2024-06-12T00:00:00Z

The analysis of genetic point mutations at the population level can offer insights into the genetic basis of human traits, which in turn could potentially lead to new diagnostic and treatment options for heritable diseases. However, existing genetic data analysis methods tend to rely on simplifying assumptions that ignore nonlinear interactions between variants. The ability to model and describe nonlinear genetic interactions could lead to both improved trait prediction and enhanced understanding of the underlying biology. Deep Learning models offer the possibility of automatically learning complex nonlinear genetic architectures, but it is currently unclear how best to optimise them for genetic data. It is also essential that any models be able to “explain” what they have learned in order for them to be used for genetic discovery or clinical applications, which can be difficult due to the black-box nature of DL predictors. This thesis addresses a number of methodological gaps in applying explainable DL models end-to-end on variant-level genetic data. We propose novel methods for encoding genetic data for deep learning applications and show that feature encodings designed specifically for genetic variants offer the possibility of improved model efficiency and performance. We then benchmark a variety of models for the prediction of Body Mass Index using data from the UK Biobank, yielding insights into DL performance in this domain. We then propose a series of novel DL model interpretation methods with features optimised for biological insights. We first show how these can be used to validate that the network has automatically replicated existing knowledge, and then illustrate their ability to detect complex nonlinear genetic interactions that influence BMI in our cohort. Overall, we show that DL model training and interpretation procedures that have been optimised for genetic data can be used to yield new insights into disease aetiology.

Automating inventory composition management for bulk purchasing cloud brokerage strategy

Boonprasop, Chalee — 2024-06-12T00:00:00Z

Cloud providers offer end-users various pricing schemes to allow them to tailor VMs to their needs, e.g., a pay-as-you-go billing scheme, called on-demand, and a discounted contract scheme, called reserved instances. This work presents a cloud broker that offers users both the flexibility of on-demand instances and some of the discounts found in reserved instances. The broker employs a buy-low-and-sell-high strategy that places user requests into a resource pool of pre-purchased discounted cloud resources. A key challenge for buy-in-bulk-sell-individually cloud broker business models is to estimate user requests accurately and then optimise the stock level accordingly. Given the complexity and variety of the cloud computing market, the number of regression models and, consequently, the optimisation search space can be large. In this thesis, we propose two solutions to the problem. The first solution is a risk-based decision model. The broker takes a risk-oriented approach to dynamically adjust the resource pool by analysing user request time series data. This approach does not require a training process, which is useful when processing large data streams. The broker is evaluated with high-frequency real cloud datasets from Alibaba. The results show that the overall profit of the broker is closely related to the optimal case. Additionally, the risk factors work as intended. The system hires more reserved instances when it can afford them and leans towards on-demand instances otherwise. We can also conclude that there is a correlation between the risk factors and the profit. On the other hand, the risk-based approach has some limitations, namely manual risk configuration and a limited broker setting. Secondly, we propose a broker system that utilises the concept of causal discovery. From the risk-based solution, we can see that if there are parameters correlated with the profit, then by adjusting those parameters, we can manipulate the profit.
We infer a function mapping from the extracted key entities of broker data to an objective of the broker, e.g. profit. The technique is similar to the additive noise model, a causal discovery method. These functions are assumed to describe the actual underlying behaviour of the profit with respect to the parameters. As with the risk-based solution, we use the Alibaba trace data to simulate long-term user requests. Our results show that the system can infer the underlying interaction model between variables and so unlock the profit model behaviour of the broker system.
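
The stock-level trade-off described in this abstract can be made concrete with a toy cost model. The sketch below is an illustration under simplifying assumptions, not the risk-based or causal-discovery broker from the thesis: reserved instances are paid for in every period whether used or not, any overflow is served on demand, and the best pool size is found by exhaustive search. The names `broker_cost` and `best_reserved_level` and the prices in the usage note are hypothetical.

```python
from typing import Sequence


def broker_cost(
    demand: Sequence[int],
    reserved: int,
    reserved_price: float,
    on_demand_price: float,
) -> float:
    """Total cost of serving `demand` with a fixed pool of reserved instances.

    Assumption: each reserved instance costs `reserved_price` per period even
    when idle; requests beyond the pool cost `on_demand_price` each.
    """
    cost = 0.0
    for requests in demand:
        cost += reserved * reserved_price
        cost += max(0, requests - reserved) * on_demand_price
    return cost


def best_reserved_level(
    demand: Sequence[int],
    reserved_price: float,
    on_demand_price: float,
) -> int:
    """Pool size with the lowest total cost, found by trying every level."""
    if not demand:
        return 0
    levels = range(max(demand) + 1)
    return min(
        levels,
        key=lambda r: broker_cost(demand, r, reserved_price, on_demand_price),
    )
```

For example, with per-period demand `[2, 5, 3, 8, 4]`, a reserved price of 0.5 and an on-demand price of 1.0, `best_reserved_level` selects a pool of 4 instances at a total cost of 15.0. This search needs the full demand series in advance, whereas a real broker has to estimate future requests, which is where the difficulty described above arises.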

Equivalence-preserving preprocessing of propositional logic formulae using existential graphs and implication hypergraphs

Francès de Mas, Jordina — 2024-06-12T00:00:00Z

Current approaches to propositional logic (PL) solving and simplification are search-based, require flattening transformations and backtracking, and use many non-equivalence-preserving techniques. This thesis presents an alternative approach, based on the study of Peirce's existential graphs and the analysis of implication graphs, that is capable of simplifying PL formulae in arbitrary form by applying novel deep inference rules that can detect redundancies across nesting levels. In particular, we first introduce a set of novel simplification techniques based on the exploration of binary implication graphs, which are guaranteed to be equivalence-preserving, monotonically decrease the size of the problem, and result in terminating and confluent procedures (up to variable renaming). We next introduce a novel PL formula representation able to capture all of its implication information in a tractable manner, which we call the binary implication hypergraph. This novel PL formula encoding in the form of a directed hypergraph allows us to derive a suite of even more powerful simplification rules with similar guarantees. As a proof of concept, we provide an algorithm for a subset of our techniques and study its complexity, which is in line with the complexity limits found elsewhere in the literature. We then implement it and test it on SAT benchmarks with up to hundreds of thousands of variables, clauses and literals, which demonstrates the practical feasibility of our framework, and compare our results to, and in combination with, the state of the art, obtaining very promising results.
Both existential graphs and our novel hypergraph formula representation offer a fresh view of preprocessing never explored before. The result is a systematic, explainable, technology-independent, equivalence-preserving method for simplifying PL formulae in arbitrary form, which constitutes a step forward in the quest for greater reasoning automation and shorter proofs and opens the door to a whole new body of research.
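
To give a flavour of simplification driven by a binary implication graph, the sketch below applies one well-known equivalence-preserving rule to formulae in clausal form: if a literal in a longer clause implies another literal of the same clause through the binary clauses, the first literal is redundant and can be dropped. This is a generic illustration, not one of the rules introduced in the thesis, and it works on flat clauses rather than on formulae in arbitrary form. Literals are non-zero integers with negation as sign, and all function names are hypothetical.

```python
from itertools import product


def implication_graph(clauses):
    """Edges from binary clauses: (x or y) gives (not x -> y) and (not y -> x)."""
    graph = {}
    for clause in clauses:
        if len(clause) == 2:
            x, y = clause
            graph.setdefault(-x, set()).add(y)
            graph.setdefault(-y, set()).add(x)
    return graph


def reachable(graph, start):
    """All literals implied by `start` through one or more edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def simplify(clauses):
    """Drop literals that imply another literal still present in the clause.

    Binary clauses are kept unchanged, so the implications that justify each
    removal remain part of the formula and equivalence is preserved.
    """
    graph = implication_graph(clauses)
    result = []
    for clause in clauses:
        kept = list(clause)
        if len(clause) >= 3:
            for lit in clause:
                implied = reachable(graph, lit)
                if any(other != lit and other in implied for other in kept):
                    kept.remove(lit)
        result.append(tuple(kept))
    return result


def models(clauses, variables):
    """Brute-force set of satisfying assignments, used only to check equivalence."""
    return {
        values
        for values in product([False, True], repeat=len(variables))
        if all(
            any(values[variables.index(abs(l))] == (l > 0) for l in clause)
            for clause in clauses
        )
    }
```

For the formula `[(-1, 2), (-2, 3), (1, 3, 4)]`, literal 1 implies literal 3 via the two binary clauses, so the last clause shrinks to `(3, 4)` while the set of models stays the same.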

Towards fully automated analysis of sputum smear microscopy images

Zachariou, Marios — 2024-12-03T00:00:00Z

Sputum smear microscopy is used for diagnosis and treatment monitoring of pulmonary tuberculosis (TB). Automation of image analysis can make this technique less laborious and more consistent. This research employs artificial intelligence to improve automation of Mycobacterium tuberculosis (Mtb) cell detection, bacterial load quantification, and phenotyping from fluorescence microscopy images. I first introduce a non-learning, computer vision (CV) approach for bacteria detection, employing a ridge-based approach that uses the Hessian matrix to detect ridges of Mtb bacteria, complemented by geometric analysis. The effectiveness of this approach is assessed through a custom metric using the Hu moment vector. Results demonstrate lower performance relative to literature metrics, motivating the need for deep learning (DL) to capture bacterial morphology. Subsequently, I develop an automated pipeline for detection, classification, and counting of bacteria using DL techniques. Firstly, Cycle-GANs transfer labels from labelled to unlabelled fields of view (FOVs). Pre-trained DL models are used for subsequent classification and regression tasks. An ablation study confirms pipeline efficacy, with a count error within 5%. For downstream analysis, microscopy slides are divided into tiles, each of which is sequentially cropped and magnified. A subsequent filtering stage eliminates non-salient FOVs by applying pre-trained DL models along with a novel method that employs dual convolutional neural network (CNN)-based encoders for feature extraction: one encoder is dedicated to learning bacterial appearance, and the other focuses on bacterial shape, both of which feed into the bottleneck of a smaller CNN classifier network. The proposed model outperforms others in accuracy, yields no false positives, and excels across decision thresholds.
Mtb cell lipid content and length may be related to antibiotic tolerance, underscoring the need to locate bacteria within paired FOV images stained separately for cell identification and lipid detection, and to measure bacterial dimensions. I propose a UNet-like model for precise bacterial localization. By combining CNNs and feature descriptors, my method automates reporting of both lipid content and cell length. Application of the approaches described here may assist clinical TB care and therapeutics research.

Unsupervised domain adaptation in sensor-based human activity recognition

Rosales Sanabria, Andrea — 2022-06-15T00:00:00Z

Sensor-based human activity recognition (HAR) aims to recognise daily human activities through a collection of ambient and wearable sensors. It is having a significant impact on a wide range of applications in smart cities, smart homes, and personal healthcare. Such wide deployment of HAR systems often faces the annotation-scarcity challenge; that is, most HAR techniques, especially deep learning techniques, require a large amount of training data, while annotating sensor data is very time- and effort-consuming. Unsupervised domain adaptation has been successfully applied to tackle this challenge, where the activity knowledge from a well-annotated domain can be transferred to a new, unlabelled domain. However, existing techniques do not perform well on highly heterogeneous domains. To address this problem, this thesis proposes unsupervised domain adaptation models for human activity recognition. The first model presented is a new knowledge- and data-driven technique to achieve coarse- and fine-grained feature alignment using variational autoencoders. This proposed approach demonstrates high recognition accuracy and robustness against sensor noise, compared to state-of-the-art domain adaptation techniques. However, the limitations of this approach are that knowledge-driven annotation can be inaccurate and that the model incurs extra knowledge engineering effort to map the source and target domains. This limits the application of the model. To tackle these limitations, we then present two further data-driven unsupervised domain adaptation techniques. The first method is based on bidirectional generative adversarial networks (Bi-GAN) to perform domain adaptation. In order to improve the matching between the source and target domains, we employ Kernel Mean Matching (KMM) to enable covariate shift correction between transformed source data and original target data so that they can be better aligned.
This technique works well but it does not separate classes that have similar patterns. To tackle this problem, our second method includes contrastive learning during the adaptation process to minimise the intra-class discrepancy and maximise the inter-class margin. Both methods are validated with high accuracy results on various experiments using three HAR datasets and multiple transfer learning tasks in comparison with 12 state-of-the-art techniques.

The Virtual Time Travel Platform : engineering a generic framework for immersive cultural heritage scenes

McCaffery, John — 2015-06-24T00:00:00Z

This thesis presents the Virtual Time Travel Platform (VTTP), a flexible platform for creating, sharing and deploying interactive cultural heritage content across a diverse range of contexts. The interactive scenes created using the VTTP enable experiential learning on cultural heritage topics. The VTTP supports the creation of scenes using freely available tools. These scenes can then be deployed into museums, schools and across the Internet through a process of reconfiguration rather than redevelopment. The VTTP enables high tech, immersive exhibits to be produced with a budget and a flexibility that is suitable for co-creation with community museums as opposed to national institutions. To date the VTTP has been used to deploy three heterogeneous museum exhibits across Scotland which have been visited by more than 10,000 visitors in the 18 months since the first one went live. This thesis presents an evaluation of the success of these exhibits at engaging the public with the topics they present. The VTTP was created by augmenting existing Open Virtual World (OVW) software (OpenSim) to support installation into museums. The component which adds this functionality is a bespoke application called Chimera. Chimera supports immersive displays, Natural User Input (NUI) control and the embedding of experiential exploration as part of a larger context suitable for museums. The extension of OVW technology to enable museum deployment is the major contribution of this work. To support the museum deployments a quantitative analysis of the VTTP Viewer component is presented. This evaluates the impact of Viewer quality of service on user quality of experience and suggests heuristics for optimising VTTP deployments.

Proof-relevant resolution - the foundations of constructive proof automation

Farka, František — 2021-06-30T00:00:00Z

Dependent type theory is an expressive programming language. This language allows one to write programs that carry proofs of their properties. This in turn gives high confidence in such programs, making the software trustworthy. Yet, the trustworthiness comes at a price: type inference involves an increasing number of proof obligations. Automation of this process becomes necessary for any system with dependent types that aims to be usable in practice. At the same time, implementation of automation in a verified manner is prohibitively complex. Sometimes, external solvers are used to aid the automation. These solvers may be based on classical logic and may not be themselves verified, thus compromising the guarantees provided by the constructive nature of type theory. In this thesis, we explore the idea of proof-relevant resolution, which allows automation of type inference in type theory in a verifiable and constructive manner and hence restores the confidence in programs and the trustworthiness of software. The technical content of this thesis is threefold. First, we propose a novel framework for proof-relevant resolution. We take two constructive logics, Horn-clause logic and the logic of hereditary Harrop formulae, as a starting point. We formulate the standard big-step operational semantics of these logics. We expose their Curry-Howard nature by treating formulae of these logics as types and proofs as terms, thus developing a theory of proof-relevant resolution. We develop a small-step operational semantics of proof-relevant resolution and prove it sound with respect to the big-step operational semantics. Secondly, we demonstrate our approach on an example of type inference in the Logical Framework (LF). We translate a type-inference problem in LF into resolution in proof-relevant Horn-clause logic. Such resolution provides, besides an answer substitution to logic variables, a proof term that captures the resolution tree.
We interpret the proof term as a derivation of the well-formedness judgement of the object in the original problem. This allows for a straightforward implementation of type checking of the resolved solution, since type checking is reduced to verifying the derivation captured by the proof term. The theoretical development is substantiated by an implementation. Finally, we demonstrate that our approach allows one to reason about semantic properties of code. Type class resolution has long been known to be a proof-relevant fragment of Horn-clause logic, and recently its coinductive extensions were introduced. In this thesis, we show that all of these extensions fit within the theoretical framework we introduce. Our novel result here is to show that the coinductive extensions are actually based on hereditary Harrop logic, rather than Horn-clause logic. We establish a number of soundness and completeness results for them. We also discuss the soundness of program transformations that are allowed by the proof-relevant presentation of type class resolution.
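
The idea that resolution can return a proof term, which is then checked independently, can be sketched for a heavily simplified setting. The example below handles only propositional Horn clauses (no variables, unification or hereditary Harrop formulae), so it is a toy illustration of the general idea rather than the calculus developed in the thesis. Each clause has a label; resolving a goal yields a term built from those labels that records the resolution tree. The names `resolve`, `show` and `check` and the depth bound are assumptions made for this sketch.

```python
def resolve(goal, program, depth=25):
    """Return a proof term for `goal`, or None if none is found within `depth`.

    `program` maps a clause label to (head, body), where body is a tuple of
    atoms. A proof term is (label, subterms), one subterm per body atom.
    """
    if depth == 0:
        return None
    for label, (head, body) in program.items():
        if head != goal:
            continue
        subterms = []
        for atom in body:
            sub = resolve(atom, program, depth - 1)
            if sub is None:
                break
            subterms.append(sub)
        else:
            # Every body atom was resolved: record which clause was used.
            return (label, tuple(subterms))
    return None


def show(term):
    """Render a proof term such as k1(k2, k3(k2))."""
    label, args = term
    if not args:
        return label
    return f"{label}({', '.join(show(arg) for arg in args)})"


def check(term, goal, program):
    """Verify a proof term against a goal without performing any search."""
    label, args = term
    head, body = program[label]
    return (
        head == goal
        and len(args) == len(body)
        and all(check(arg, atom, program) for arg, atom in zip(args, body))
    )
```

With the program `{"k1": ("c", ("a", "b")), "k2": ("a", ()), "k3": ("b", ("a",))}`, resolving the goal `c` yields the term `k1(k2, k3(k2))`. The `check` function mirrors the point made above that checking reduces to verifying the derivation captured by the proof term, so the search procedure itself does not need to be trusted.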

Facilitating the analysis and management of data for cancer care

Silvina, Agastya — 2021-11-30T00:00:00Z

The Edinburgh Cancer Centre (ECC) is an institution containing the National Health Service (NHS) Lothian cancer patient data from multiple resources. These resources are scattered across different systems and platforms, making it difficult to use the information collected in a useful way. There is a lack of proxy between the different (sub)systems, and this thesis presents a series of applications/projects to promote data usage and interoperability. We develop both front-end and back-end applications to bring together several databases, such as ChemoCare, Trak, and Oncology database. We create the South East Scotland Oncology (SESO) Gateway to improve the quality and capability of reporting outcomes within South East Scotland Oncology databases in real-time using routinely captured and integrated electronic healthcare data. With SESO Gateway, we focus on cancer pathway data visualisation for both the personal timeline and the cohort summary for various treatments. We also carry out a database migration and evaluate several reporting services for the newly migrated database to accelerate data access. We then perform data analysis for the patient's treatment waiting time. By analysing the waiting time and comparing it to the intended pathway, we can simplify the auditing process of the first stage of patients' cancer care journey. Further, we use the patients' treatment data, recorded toxicity level, and various observations concerning breast cancer patients to create models to analyse the outcome of the treatments, mainly chemotherapy. We compare several different techniques applied to the same data set to predict the toxicity outcome of the treatment. Through analysis and evaluation of the performance of these techniques, we can determine which method is more suitable in different situations to assist the oncologists in real-time clinical practice. After training the models, we create a dashboard as a placeholder for the models. 
Lastly, we explore how to define rules for cancer data and use a constraint-based approach to fabricate a large cancer dataset, which will allow us to explore more techniques and further improve our system's capability in the future. With our proposed systems, healthcare professionals can directly access and analyse patient data to gain further insights regarding the treatment that is best suited for an individual patient.

Rethinking historical university records : provenance in visualization and digital humanities research

Vancisin, Tomas — 2024-06-12T00:00:00Z

The world’s oldest universities, including St Andrews (my case study), have started digitizing their historical student and staff records for their (in)valuable information about the ‘education-worthy’, and the institutions themselves. Current digitization comes in various forms, from scanning handwritten records and transcribing them, to applying handwritten text recognition (HTR). While text search interfaces facilitate quicker access to these collections – and protect fragile documents – they only provide a record-by-record view. By contrast, this thesis argues for representing historical university records through visualization which allows multi-perspective views on records and foregrounds their curation(s) over time by defining and showcasing the concept of Provenance-Driven Visualization (PDV). Provenance as a key parameter in the keeping of such collections has been overlooked by researchers in DH and VIS, despite emphasizing attribution as part of research ethics (trustworthiness, transparency, etc.). Even where provenance is disclosed, it is (a) partial, (b) presented through text at collection-level, or through homogenous diagrams (hiding more complex processes), and (c) typically separated from the visualization itself (in an ‘about’ page or as diagrams). By directly addressing provenance through PDV as central to the advancement of digital curation of historical university records, this thesis develops VIS and DH research by demonstrating how visualization is itself a means for knowledge discovery as well as knowledge recovery. 
Main chapters develop my theoretical, ethical, and applied approach to provenance visualization (PDV) using the Biographical Records of St Andrews University 1579-1897 as an indicative case to highlight (1) added transparency (to the accuracy, representation, and ‘facts’ of such collections), (2) greater inclusion and diversity of such research, when the curatorial processes and decisions behind them are visualized (to enlarge research ethics and fuel interdisciplinary research), and (3) added critical understanding of such historical collections. Conclusions present all three as key parameters for theoretical and applied VIS and DH research.

Effective player guidance in logic puzzles

Lynch, Alice May — 2024-06-12T00:00:00Z

Pen & paper puzzle games are an extremely popular pastime, often enjoyed by demographics normally not considered to be ‘gamers’. They are increasingly used as ‘serious games’ and there has been extensive research into computationally generating and efficiently solving them. However, there have been few academic studies that have focused on the players themselves. Presenting an appropriate level of challenge to a player is essential for both player enjoyment and engagement. Providing appropriate assistance is an essential mechanic for making a game accessible to a variety of players. In this thesis, we investigate how players solve Progressive Pen & Paper Puzzle Games (PPPPs) and how to provide meaningful assistance that allows players to recover from being stuck, while not reducing the challenge to trivial levels. This thesis begins with a qualitative in-person study of Sudoku solving. This study demonstrates that, in contrast to all existing assumptions used to model players, players were unsystematic, idiosyncratic and error-prone. We then designed an entirely new approach to providing assistance in PPPPs, which guides players towards easier deductions rather than, as current systems do, completing the next cell for them. We implemented a novel hint system using our design, with the assessment of the challenge being done using Minimal Unsatisfiable Sets (MUSs). We conducted four studies, using two different PPPPs, that evaluated the efficacy of the novel hint system compared to the current hint approach. The studies demonstrated that our novel hint system was as helpful as the existing system while also improving the player experience and feeling less like cheating. Players also chose to use our novel hint system significantly more often. We have provided a new approach to providing assistance to PPPP players and demonstrated that players prefer it over existing approaches.
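
A Minimal Unsatisfiable Set can be computed with a simple deletion-based loop, sketched below. The sketch assumes constraints are given as clauses over integer literals and uses brute-force satisfiability checking, so it is suitable only for tiny examples and is not the hint system built in the thesis. One plausible use, stated here as an assumption, is to add the negation of a candidate deduction to the puzzle constraints and treat the size of the resulting MUS as a proxy for how hard the deduction is. The function names are hypothetical.

```python
from itertools import product


def satisfiable(clauses):
    """Brute-force SAT check; literals are non-zero ints, negation is sign."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False


def minimal_unsat_subset(clauses):
    """Deletion-based MUS: drop each clause whose removal keeps the set unsat."""
    if satisfiable(clauses):
        raise ValueError("the clause set is satisfiable, so it has no MUS")
    core = list(clauses)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if satisfiable(trial):
            i += 1  # this clause is needed for the contradiction
        else:
            core = trial  # still unsatisfiable without it, so discard it
    return core
```

For the clauses `[(1,), (-1, 2), (-2,), (3, 4), (-3,)]`, the loop discards the last two clauses and returns `[(1,), (-1, 2), (-2,)]`. The result is minimal in the sense that removing any remaining clause makes the set satisfiable, though other MUSs of different sizes may exist in general.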

A clinical decision support system for detecting and mitigating potentially inappropriate medications

Redeker, Guilherme Alfredo — 2024-06-12T00:00:00Z

Background: Medication errors are a leading cause of preventable harm to patients. In older adults, the impact of ageing on the therapeutic effectiveness and safety of drugs is a significant concern, especially for those over 65. Consequently, certain medications called Potentially Inappropriate Medications (PIMs) can be dangerous in the elderly and should be avoided. Tackling PIMs can be time-consuming and error-prone for health professionals and patients, as the criteria underlying the definition of PIMs are complex and subject to frequent updates. Moreover, the criteria are not available in a representation that health systems can interpret and reason with directly. Objectives: This thesis aims to demonstrate the feasibility of using an ontology/rule-based approach in a clinical knowledge base to identify PIMs. In addition, it aims to show how constraint solvers can be used effectively to suggest alternative medications and administration schedules to resolve or minimise the undesirable side effects of PIMs. Methodology: To address these objectives, we propose a novel integrated approach using formal rules to represent the PIM criteria and inference engines to perform the reasoning, presented in the context of a Clinical Decision Support System (CDSS). The approach aims to detect, resolve, or minimise the undesirable side effects of PIMs through an ontology (knowledge base) and inference engines incorporating multiple reasoning approaches. Contributions: The main contribution lies in the framework to formalise PIMs, including the steps required to define guideline requisites and to create inference rules that detect inappropriate medications and propose alternative drugs. No formalisation of the selected guideline (Beers Criteria) can be found in the literature, and hence, this thesis provides a novel ontology for it.
Moreover, our process of minimising undesirable side effects offers a novel approach that enhances and optimises the drug rescheduling process, providing a more accurate way to minimise the effect of drug interactions in clinical practice.

Erasure in dependently typed programming

Tejiščák, Matúš — 2020-07-07T00:00:00Z

It is important to reduce the cost of correctness in programming. Dependent types and related techniques, such as type-driven programming, offer ways to do so. Some parts of dependently typed programs constitute evidence of their type-correctness and, once checked, are unnecessary for execution. These parts can easily become asymptotically larger than the remaining runtime-useful computation, which can cause linear-time algorithms to run in exponential time, or worse. It would be unacceptable, and contradict our goal of reducing the cost of correctness, to make programs run slower by only describing them more precisely. Current systems cannot erase such computation satisfactorily. By modelling erasure indirectly through type universes or irrelevance, they impose the limitations of these means on erasure. Some useless computation then cannot be erased and idiomatic programs remain asymptotically sub-optimal. This dissertation explains why we need erasure and how it differs from other concepts like irrelevance, and proposes two ways of erasing non-computational data. One is an untyped flow-based useless variable elimination, adapted for dependently typed languages, currently implemented in the Idris 1 compiler. The other is the main contribution of the dissertation: a dependently typed core calculus with erasure annotations, full dependent pattern matching, and an algorithm that infers erasure annotations from unannotated (or partially annotated) programs. I show that erasure in well-typed programs is sound in that it commutes with single-step reduction. Assuming the Church-Rosser property of reduction, I show that properties such as Subject Reduction hold, which extends the soundness result to multi-step reduction.
I also show that the presented erasure inference is sound and complete with respect to the typing rules; that this approach can be extended with various forms of erasure polymorphism; that it works well with monadic I/O and foreign functions; and that it is effective in that it not only removes the runtime overhead caused by dependent typing in the presented examples, but can also shorten compilation times.

Adaptive guidance in extended reality environments

Weerasinghe, Maheshya — 2023-11-28T00:00:00Z

Learning depends on the dynamics of one’s personal circumstances and immediate environment that provides hands-on experience. As a result, educators are constantly striving to create personalised learning experiences for learners. The increasing use of technology in education has led to the development of various e-learning systems. However, these systems are limited by their inability to create immersive and interactive learning environments that cater to each learner’s individual needs and preferences. Extended Reality (XR) technologies such as Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR) offer a new way of delivering Experiential Learning (ExL) that can meet these challenges. However, existing XR-based learning systems lack the ability to adapt to learners’ individual needs and preferences, which may reduce their learning performance. Moreover, there is a lack of research and guidance on effectively incorporating XR technologies to design adaptive experiential learning systems. Thus, this thesis aims to contribute new knowledge on how XR technologies can be used to design and develop interactive, adaptive ExL systems that can be integrated into future learning environments. This is accomplished by (i) presenting a comprehensive design space grounded in XR technology and the theoretical underpinnings of learning and instructional guidance, and by (ii) conducting three different user studies, each focusing on an interactive experiential learning system developed based on a particular configuration of the presented design space. In the first study, the focus is placed on how different representation methods of a future building (paper, desktop and VR HMD) would affect the user experience, dimensions of user engagement, the understanding of the space with minimum guidance, and users’ ability to project themselves into the future office space.
The second study explores how different factors of instructional guidance – i.e., the amount of guidance (fixed vs. adaptive-amount) and the type of guidance (fixed vs. adaptive-associations) – would affect the user experience, engagement and the learning outcomes of a language learning scenario. The final study looks in further detail at how different interfaces (AR vs. non-AR) and types of guidance (keyword only vs. keyword + visualisation) would affect user experience, engagement and consequently the learning performance in vocabulary learning. The results of this research will provide insights into the design and development of interactive XR-based experiential learning systems that can meet the diverse learning needs and preferences of individual learners, leading to improved learning outcomes. This work will be useful and of interest to researchers and practitioners who conduct research within the fields of Human-Computer Interaction (HCI), instructional design or education.

Automatic inference of latent emotion from spontaneous facial micro-expressions

Zhang, Liangfei — 2023-11-28T00:00:00Z

Emotional states exert a profound influence on individuals' overall well-being, impacting them both physically and psychologically. Accurate recognition and comprehension of human emotions represent a crucial area of scientific exploration. Facial expressions, vocal cues, body language, and physiological responses provide valuable insights into an individual's emotional state, with facial expressions being universally recognised as dependable indicators of emotions. This thesis centres around three vital research aspects concerning the automated inference of latent emotions from spontaneous facial micro-expressions, seeking to enhance and refine our understanding of this complex domain. Firstly, the research aims to detect and analyse activated Action Units (AUs) during the occurrence of micro-expressions. AUs correspond to facial muscle movements. Although previous studies have established links between AUs and conventional facial expressions, no such connections have been explored for micro-expressions. Therefore, this thesis develops computer vision techniques to automatically detect activated AUs in micro-expressions, bridging a gap in existing studies. Secondly, the study explores the evolution of micro-expression recognition techniques, ranging from early handcrafted feature-based approaches to modern deep-learning methods. These approaches have significantly contributed to the field of automatic emotion recognition. However, existing methods primarily focus on capturing local spatial relationships, neglecting global relationships between different facial regions. To address this limitation, a novel third-generation architecture is proposed. This architecture can concurrently capture both short and long-range spatiotemporal relationships in micro-expression data, aiming to enhance the accuracy of automatic emotion recognition and improve our understanding of micro-expressions. 
Lastly, the thesis investigates the integration of multimodal signals to enhance emotion recognition accuracy. Depth information complements conventional RGB data by providing enhanced spatial features for analysis, while the integration of physiological signals with facial micro-expressions improves emotion discrimination. By incorporating multimodal data, the objective is to enhance machines' understanding of latent emotions and improve latent emotion recognition accuracy in spontaneous micro-expression analysis.

Remote photoplethysmography (rPPG) to measure heart rate and blood oxygenation levels using colour, infrared and depth data from real home environments

Pirzada, Pireh — 2023-11-28T00:00:00Z

Heart Rate (HR) and Blood Oxygenation Level (SPO₂) are physiological signs that are critically important measurements in the assessment of emergent ill-health. These typically require physical contact and blood tests that are often prohibitive for people with certain incapacities, severe illnesses, or burns. Currently, there is no commercially available system for measuring HR and SPO₂ simultaneously and remotely, such as through Remote Photoplethysmography (rPPG). Furthermore, there is a gap in the literature on rPPG research as it is unclear which preprocessing techniques and noise reduction algorithms work best in a realistic scenario encompassing diverse demographic characteristics. This thesis addresses these gaps by answering the question ‘How can rPPG be used for unobtrusively measuring vital signs for diverse participants in uncontrolled (home) environments with a low Root Mean Square Error (RMSE)?’. The Automated Remote Pulse Oximetry System incorporates Red, Green, Blue, Depth and Infrared (IR) data to measure HR and SPO₂ remotely from Regions of Interest (ROIs) from the face. Various preprocessing and noise reduction algorithms for measuring vital signs have been evaluated across different skin pigmentation types using multispectral imaging of participants’ faces over time. This novel approach uses the frequency content to obtain the HR and a depth-calibrated ratiometric measurement from Red and IR to measure SPO₂. Additionally, this research with 40 participants identifies and reports factors from real-life environments that impact the system’s error rate. Detrending, interpolating, Hamming-windowing, and normalising the signal using a 15-second temporal window size with FastICA produced the lowest RMSE of 7.8 for HR, with an r-correlation value of 0.85, and an RMSE of 2.5 for SPO₂ across different skin pigmentation types; this configuration also has the lowest computation time of 1.75 ms per measurement.
This rPPG system has the potential for deployment in uncontrolled environments offering widespread benefits for those who require remote HR and SPO₂ measurement.
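
As an illustration of what "using the frequency content to obtain the HR" can mean in practice, the following minimal Python sketch picks the strongest frequency in a plausible heart-rate band of a one-dimensional signal and converts it to beats per minute. It is not the thesis pipeline: there is no detrending, windowing or FastICA here, and the function names, the 0.7 to 4.0 Hz band and the 0.01 Hz scan step are assumptions chosen only for the example. An RMSE helper is included because the abstract reports its results as RMSE.

```python
import math


def estimate_heart_rate_bpm(signal, fs, lo_hz=0.7, hi_hz=4.0, step_hz=0.01):
    """Return the dominant frequency in [lo_hz, hi_hz], in beats per minute.

    signal: evenly sampled intensity values (hypothetical input, e.g. the mean
            green-channel value of a face region per frame).
    fs:     sampling rate in Hz.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the constant (DC) component

    best_freq, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        f = lo_hz + k * step_hz
        # Correlate the signal with a cosine and a sine at frequency f.
        re = sum(x * math.cos(2 * math.pi * f * i / fs) for i, x in enumerate(centred))
        im = sum(x * math.sin(2 * math.pi * f * i / fs) for i, x in enumerate(centred))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = f, power
    return best_freq * 60.0


def rmse(predicted, reference):
    """Root Mean Square Error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    fs = 30.0  # assumed camera frame rate
    # Synthetic 15-second signal pulsing at 1.2 Hz (72 beats per minute).
    synthetic = [math.sin(2 * math.pi * 1.2 * i / fs) for i in range(int(15 * fs))]
    print(estimate_heart_rate_bpm(synthetic, fs))
```

On real recordings the raw signal is far noisier than this synthetic sine wave, which is why the choice of preprocessing and noise reduction steps matters.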

Ethnographically-informed distributed participatory design framework for sociotechnical change : co-designing a collaborative training tool to support real-time collaborative writing

Ardati, Abd Alsattar — 2023-11-28T00:00:00Z

Although Wikipedia’s immense success is partially due to its support of the asynchronous collaboration model, researchers argue that the bureaucratic rules and technical infrastructure enabling it feed into Wikipedia’s content bias. Attempts to introduce different collaboration models have so far failed, but the fact that they have occurred persistently over time suggests that at least part of the Wikipedia community favours incorporating features such as real-time collaborative editing. My research is founded on the argument that the advantageous aspects of the asynchronous model should be preserved, although the existing model needs to be complemented by real-time collaboration in settings such as Wikipedia training events. This thesis describes a Participatory Design process resulting in a prototype called WikiSync, a system that introduces real-time collaboration for the Wikipedia community using a responsible design approach that is respectful of Wikipedia’s rich social structure and history. Furthermore, my research has produced an adaptive methodology for co-designing sociotechnical solutions in a geographically distributed community. After an in-depth observation of online Wikipedia training and the existing community innovation processes, my participatory design sessions have helped create a mutual learning environment for co-designing WikiSync in tandem with the community, while addressing a wide range of their concerns about real-time collaboration. I also consulted the broader Wikipedia community using an online social ideation and voting tool to evaluate the desirability and applicability of the solution. Finally, the resulting ethnographically-informed distributed Participatory Design framework provides an innovation process for involving a diverse, widely distributed online community in co-designing sociotechnical solutions.

Representing constraint problems visually

Zhu, Xu — 2023-11-28T00:00:00Z

With the increasing complexity of the world and the explosion of both information and choices, people are faced with having to make more decisions and solve more problems. One typical type of problem that people are confronted with during their daily lives is the constraint problem. Common examples of such problems include scheduling, trip planning, table planning, or resource allocation. Existing constraint solvers can solve these kinds of problems quickly. However, using these solvers requires knowledge of constraint programming languages, which take significant time and effort to learn, and one cannot expect the general public to learn them. To address this issue, I investigated how people visually represent constraint problems, gaining insight into common techniques that people use to represent this kind of problem. This initial study showed that people often used a variety of different modalities and approached the problem in a non-linear way. From this study, a set of guidelines was developed for the design of visual constraint modelling languages. Using these guidelines, I designed a visual constraint modelling language. From this language, I designed and implemented a visual constraint modelling interface and system (Solvi), which can model various constraint problems and solve them using existing constraint solvers. Both the language and interface were evaluated through a user study to understand how people visually modelled both predefined and their own constraint problems. This indicated that the developed visual language and interface can model common constraint problems. The positive feedback received confirms that the visual modelling language is useful for dealing with the problems encountered in daily life and that the visual representation makes the relationships between the elements easier to understand.
However, some issues were identified, and possible future improvements to the language and interface, and further research into problem modelling are also discussed.
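
To make the notion of an everyday constraint problem concrete, here is a small Python sketch of the table planning example mentioned above. It is purely illustrative and unrelated to Solvi or its visual language: the function name, the guest names and the two constraint kinds ("must sit next to each other" and "must not sit next to each other") are assumptions made for this example, and the brute-force search only suits a handful of guests.

```python
from itertools import permutations


def seat_guests(guests, together, apart):
    """Find a left-to-right seating along one side of a table.

    guests:   list of guest names.
    together: pairs of guests who must sit in adjacent seats.
    apart:    pairs of guests who must not sit in adjacent seats.
    Returns the first seating (a tuple of names) that satisfies every
    constraint, or None if no such seating exists.
    """
    for order in permutations(guests):
        seat = {guest: index for index, guest in enumerate(order)}
        adjacent_ok = all(abs(seat[a] - seat[b]) == 1 for a, b in together)
        separated_ok = all(abs(seat[a] - seat[b]) > 1 for a, b in apart)
        if adjacent_ok and separated_ok:
            return order
    return None


if __name__ == "__main__":
    plan = seat_guests(
        ["Ann", "Bob", "Cat", "Dan"],
        together=[("Ann", "Bob")],
        apart=[("Bob", "Cat")],
    )
    print(plan)
```

A dedicated constraint solver explores the same space far more efficiently, but the problem statement has the same three ingredients: variables (seats), values (guests) and constraints between them.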

Investigation of accurate, fast, robust, neural gesture recognition involving a sequential approach

Almuallem, Zahida — 2023-11-28T00:00:00Z

Abstract redacted

Modelling energy consumption in multi-core systems using meta-heuristics and statistical modelling

Alguwaifli, Yasir — 2023-06-14T00:00:00Z

Controlling energy consumption has always been a necessity in many computing contexts as the resources that provide said energy are limited, be it a battery supplying power to an SBC/SOC, an embedded system, a drone, a phone, or another low/limited energy device, or a large cluster of machines that process extensive computations requiring multiple resources, such as a NUMA system. The need to accurately predict the energy consumption of such devices is crucial in many fields. Furthermore, different types of languages, e.g. Haskell and C/C++, exhibit different behavioural properties, such as strict vs. lazy evaluation, garbage collection vs. manual memory management, and different parallel runtime behaviours. In addition, most software developers do not write software with energy consumption as a goal, mostly due to the lack of generalised tooling to help them optimise and predict the energy consumption of their software. Therefore, there is a need to predict energy consumption in a generalised way, for different types of languages, that does not rely on specific program properties. We construct several statistical models based on parallel benchmarks from two different programming paradigms, represented by Haskell and C/C++, using regression modelling such as Non-negative Least Squares (NNLS), Random Forests, and Lasso and Elastic-Net Regularized Generalized Linear Models (GLMNET). Furthermore, the assessment of the statistical models is made over a complete set of benchmarks that behave similarly in both Haskell and C/C++. In addition to assessing the statistical models, we develop meta-heuristic algorithms to predict the energy consumed in parallel benchmarks from Haskell's Nofib and C/C++'s PARSEC suites for a range of implementations in PThreads, OpenMP and Intel's Threading Building Blocks (TBB).
The results show that benchmarks with high scalability and performance in parallel execution can have their energy consumption predicted and even optimised by selecting the best configuration for the desired results. We also observe that even in degraded performance benchmarks, high core count execution can still be predicted to the nearest configuration to produce the lowest energy sample. Additionally, the meta-heuristic technique can be employed using a language- and architecture-agnostic approach to energy consumption prediction rather than requiring hand-tuned models for specific architectures and/or benchmarks. Although meta-heuristic sampling provided acceptable levels of accuracy, the combination of the statistical model with the meta-heuristic algorithms proved to be challenging to optimise. Except for low to medium accuracy levels for the Genetic algorithm, combining meta-heuristics demonstrated limited to poor accuracy.

Title redacted

Jiang, Ai — 2023-06-14T00:00:00Z

Abstract redacted

Disruptive technologies for heritage preservation and promotion : strategies connecting heritage, community and museums through 3D digitisation

Cassidy, Catherine Anne — 2023-06-14T00:00:00Z

Heritage is the physical evidence of human existence and expression with resounding benefits to society and communities. However, an ever-changing world presents constant threats, risking heritage’s destruction or loss. 3D digitisation preserves physical heritage through its reconstitution to digital, augmenting the potential for dissemination within the digital domain. A disconnect between museums and emergent technologies presents a challenge for democratised 3D digitisation and management. Yet, museums and their communities are agents for transformation invested in heritage permanence, with digital literacies to suggest synergies with adapted 3D processes and engagement. An opportunity for intervention to support preservation of heritage through 3D digitisation while facilitating its promotion through engagement with emergent and immersive technologies is investigated in this thesis. Unexpectedly, COVID-19 caused major global disruption, threatening to destabilise connections to heritage and undermine functionalities of museums within their communities. Response to COVID-19 initiated innovation, leading to modified design for applicable stakeholder response. This dissertation investigates how application of accessible strategies empowers communities to actively engage with their heritage as 3D digital assets in a time of disruption. Practice-based research methodologies informed collaborative innovation with stakeholders and end users at various intersections of 3D preservation and promotion. Design of infrastructures and processes aid in engagement through laboratory research, deployed prototypes, and recognition of demands from COVID-19. Strategies for Preserving Heritage through Engagement Representation and Archiving (SPHERA) supports engagement with 3D assets through its Pillars, including creation of digital content, digital skills development and knowledge exchange, and infrastructures to facilitate digital heritage curation and engagement. 
Guiding Principles and best Practice workflows support implementation of SPHERA as part of original contributions for this dissertation. Leveraging emergent technologies for communities and museums to preserve and promote heritage improves engagement and accessibility with consequential social benefits and wellbeing, and is an area of potential for the future.

A user-centred approach to computer vision-assisted changeover operations

Rupp, Simon — 2023-06-14T00:00:00Z

This research project aimed to investigate the root causes of frequent mistakes in changeover operations and to design a technical solution to address these issues. The focus of the study was on designing an operator assistance system (OAS) to augment the abilities of factory workers. This project was done in collaboration with a manufacturing company, and a user-centred design approach was used to interact with company representatives and gather feedback on the OAS design. The research process included identifying concrete objectives for the OAS based on three identified problems and the design of a computer vision model and an OAS user interface. The computer vision model was trained using a smaller than usual, and therefore more realistic, amount of data collected by a worker in real factory conditions. We are the first to use a (tiny-) Yolo algorithm for the bolt elongation-based method of detecting loose screws. Using the Tiny Yolo v4 algorithm, which is suitable for local deployment on mobile devices, we reached a mean average precision (mAP) of 100% on our test dataset. The model is thus ready for initial factory deployment as it will not replace human judgement and any additional error detection is beneficial in industry. The OAS user interface was tested for usability through user studies with factory workers. The study's findings showed that workers found the navigation intuitive, thought the features were useful and valued the editability. They also provided recommendations for further improvement. Overall, this research contributes to both industry and research by addressing a pressing issue in manufacturing and supplying a proof-of-concept solution that could improve efficiency and reduce mistakes in changeover operations.

Mitigating phishing threats

Wang, Yunjia — 2023-06-14T00:00:00Z

Due to the rapid development of the Internet, modern daily behaviour has become more efficient and convenient. The Internet has become an indispensable element in our daily life, providing significant resources to people whether for play, work or education. In addition, with the increased universality of mobile devices, a multitude of services is at our fingertips, and the efficiency of our life and work has improved. However, the negative side of this is the increase in cybercrimes, with large losses for both individuals and enterprises. Phishing is currently defined as a criminal mechanism employing both social engineering and technical subterfuge to gather any useful information such as user personal data or financial account credentials. Phishing threats have been in existence for many years, since the establishment of the Internet, and they have continuously evolved and increased in application. So far, phishing attacks have accounted for a large proportion of all malicious attacks, and they are a globally growing threat with an increasing frequency of known attacks. Phishing attacks are a major current cyber threat to both individual users and organizations, as they are cheap to produce and easy to deploy, particularly given the development of E-commerce. For the individual, sensitive credentials are always of interest to phishers. For an enterprise, a successful phishing attack, such as a subdomain takeover attack, may affect the organization’s reputation as well as cause financial loss. Currently, most security vendors use different approaches to prevent phishing attacks. However, these solutions cannot keep up with the constant updating of phishing websites. In this thesis, web phishing attack types are classified into three different categories, from the shallower to the deeper. They are General Phishing Attack, Advanced Phishing Attack and Subdomain Takeover Attacks.
The purpose of this thesis is to present an effective mitigation to defend against these phishing threats. From the shallower approach to a deeper, more complex approach, according to our defined categories of phishing threats, the specific mitigations and contributions are presented.

Data science use cases in the manufacturing industry : from theory to practice

Arenas Contreras, Diego Alejandro — 2022-11-29T00:00:00Z

One of the main challenges organisations face today is supporting business decisions from the massive volumes of data they are continuously collecting. The problem for organisations is how to become a data-driven organisation using the data they collect to generate insights and repeatable solutions connecting information needs with usable data products. Our objectives during the doctorate were to research and implement high quality technological and methodological solutions following best practices from academia and industry and, at the same time, build internal capacity for the organisation from experience. We implemented a series of data-related projects. The projects can be classified into two types: foundational projects, which build infrastructure and processes to analyse data, and applied data projects. Our methods included practices from software engineering, data science, and data engineering. We designed and built data solutions based on the principles of scalability, automation, encapsulation and abstraction. We followed the principles mentioned above from the design phases of the projects; this allowed us to achieve good integration with the current systems and infrastructure of the organisation. We operationalised the technologies we explored for each project using a use-case driven approach. Users and stakeholders were involved early on in the projects, and we maintained excellent and continuous communication with them. The foundational projects implemented data architectures rather than specific ad-hoc solutions, so that the projects adjusted well to changing requirements and the solutions could be reused in future projects, either entirely or as components. We used the foundational projects in the applied data projects. We deployed an estimation model to quantify the number of technicians needed to support an on-site project.
We exposed the final model through a microservice architecture so that it can be queried via an API. We designed and implemented the analysis of estimating the lifespan of batteries using survival analysis and spectral clustering techniques. We ranked specific machines from best to worst performance based on fuel consumption to optimise resources on project sites. We designed and implemented a custom Python package to facilitate the exploration of databases for data science and data engineering projects. We designed and implemented a microservices architecture to support data streaming analytics. We made recommendations on using a machine learning framework to track and monitor machine learning models, wrote guidelines for best practices, and delivered internal tutorials about the use and benefits of these kinds of solutions. We implemented a data-driven architecture to support the analysis of telemetry data from multiple data sources. We implemented an alarm system on top of the solution using the analytical database of the project. Finally, we designed and implemented a custom Python package to handle repeatable data engineering tasks for the data engineering team. Data science and data engineering are new and essential roles in companies that aim to become data-driven organisations. We believe that using software engineering and software development techniques contributes significantly to this organisational change and accelerates internal innovation using data. We promptly provided data and information to the stakeholders to support their information needs and decision-making processes.

Computational analysis of tissue images in cancer diagnosis and prognosis : machine learning-based methods for the next generation of computational pathology

Dimitriou, Neofytos — 2023-06-14T00:00:00Z

The focus of this work is to develop machine learning systems capable of tissue image analysis in the context of cancer diagnosis and prognosis. Such a system can not only identify new prognostic markers, but can also serve as a standalone clinical prediction rule, the premise being that its non-linear, multivariate nature may be capable of identifying and employing complex patterns that collectively provide accurate cancer diagnosis and prognosis, better than the clinical gold standard. The task, however, is very challenging because of the extremely high resolution of the images, highly heterogeneous microenvironment, multiple sources of noise and artifacts, and low-granularity of ground truth. A starting point of related work which tackles the same task is the extraction of handcrafted features. I investigate the application of machine learning for prognosis using handcrafted features, and develop prognostic machine learning models that demonstrate better performances than baselines based on clinically employed prognostic systems, in two separate cohorts of colorectal and muscle-invasive bladder cancer patients. Moreover, analysis of the proposed methods provides insight behind the prognostic nature of characteristics within the microenvironment, not yet included in the clinical systems. The emergence of deep learning has enabled analysis with images directly. Given the laborious, expensive, and human bias inducing nature of designing and building pipelines for handcrafted feature extraction, I investigate the application of deep learning on tissue images directly. In particular, I propose a framework that allows the training of models directly from exhaustively-tiled whole slide images with only patient-level ground truth, and demonstrate its effectiveness on colorectal cancer prognosis. 
In my final work, I introduce a new type of CNN-based method, called Magnifying Networks, for gigapixel image analysis that does not require whole slide images to be patch-based preprocessed. Instead, MagNets dynamically extract patches from the tissue image based on the best magnification level, field-of-view, and location according to an optimizing task, and not based on generic, predefined or static ways. My results on the publicly available Camelyon16 and Camelyon17 datasets demonstrate the effectiveness of MagNets, as well as the proposed optimization framework, on the task of whole slide image classification. MagNets process far fewer patches from each slide than any of the existing end-to-end approaches (10 to 300 times fewer).

NLP-supervised stroke detection in medical images

Schrempf, Patrick Maurice — 2023-06-14T00:00:00Z

In the UK, around 100,000 people have a stroke each year, equating to one stroke every five minutes. Prompt treatment is required to give a good patient outcome. To help decide what the treatment options are, most suspected stroke patients in the UK have a brain scan performed on admission to hospital. Each scan is examined by a radiologist, and a report describing the images is written. An automatic artificial intelligence (AI) solution for brain scan analysis could support the radiologist to report scans quickly and accurately, in turn helping with the speed and accuracy of treatment. However, training AI models for medical image analysis requires large amounts of expertly annotated data which are time-consuming and expensive to obtain. Large unlabelled datasets such as those for stroke patients are currently difficult to use directly for training. An emerging solution is to leverage radiology reports as a source of expert information in order to construct imaging training annotations. In this thesis, a system for extracting suitable labels from radiology reports of suspected stroke patients is proposed and implemented in collaboration with NHS Greater Glasgow and Clyde. A state-of-the-art deep learning natural language processing (NLP) model is first trained to extract the labels from radiology reports automatically by combining per-label attention with a novel data augmentation approach that uses templates and knowledge bases. This NLP model is used to automatically create training annotations from an uncurated dataset of over 27,000 reports, larger than is realistically feasible using only manual annotations. Finally, an image analysis model is trained to detect haemorrhagic regions in the brain. Using automatically extracted annotations improves performance over manual annotations by 20%. 
Being able to use these data without the need for time-consuming and expensive annotation efforts to be carried out by already stretched healthcare professionals provides a promising avenue for transformative medical image analysis research.

All weather scheduling : towards effective scheduling across the edge, fog and cloud

Ekwe-Ekwe, Nnamdi Nzegwu — 2023-06-14T00:00:00Z

The cloud plays a crucial part in the deployment of many applications today. However, alternative processing layers such as the edge and the fog are increasingly being used in order to meet applications' low-latency requirements, save bandwidth or reduce monetary cost. The rise in edge/fog use means that schedulers that can successfully schedule applications and their tasks across the edge/fog/cloud are a growing area of research interest. However, the schedulers proposed to date are limited in their functionality: they are not adaptive (in terms of scheduling policies) to changing cluster state, consider applications only as a whole rather than their individual task requirements, have been validated on a minimal number of applications, or require significant a-priori knowledge by the end-user of their applications' characteristics and their affinity to resources in the cluster. We present the Weather-Adaptive-Scheduler, or WASCH, a novel, data-driven scheduler for the edge/fog/cloud that achieves minimal task makespan for an application's tasks. WASCH first profiles applications that a user wants to schedule across a cluster of edge/fog/cloud resources, using a set of representative tasks for that application. This data (application/task/node performance metrics) is used by WASCH to build a regression model. This model is then used to predict which resource in the cluster will deliver the minimal makespan of new end-user tasks. WASCH, in a new contribution to the state of the art, does all of this automatically without any need for end-user a-priori knowledge of the cluster, the characteristics of an application or the affinity of the application to the various resources. Via empirical evaluation of WASCH using 14 diverse, real-life applications, we show that WASCH successfully achieves minimal task makespan for those 14 applications' tasks when compared to two state-of-the-art schedulers, while taking an average of only 6 ms per prediction per resource.
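
The profile, fit and predict loop described in this abstract can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the WASCH implementation: it assumes a single hypothetical feature per task (its input size), fits one ordinary least-squares line per resource, and all names (`fit_line`, `build_models`, `pick_resource`) and the profiling numbers are invented for the example. The abstract does not say which regression model or features WASCH uses.

```python
def fit_line(sizes, makespans):
    """Ordinary least-squares fit of makespan = slope * size + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(makespans) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, makespans))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def build_models(profiles):
    """Fit one line per resource from profiling samples.

    profiles maps a resource name to a list of (task_size, makespan) pairs
    measured while running representative tasks on that resource.
    """
    models = {}
    for resource, samples in profiles.items():
        sizes = [size for size, _ in samples]
        makespans = [makespan for _, makespan in samples]
        models[resource] = fit_line(sizes, makespans)
    return models


def pick_resource(models, task_size):
    """Return (resource, predicted_makespan) with the smallest prediction."""
    predictions = {
        resource: slope * task_size + intercept
        for resource, (slope, intercept) in models.items()
    }
    best = min(predictions, key=predictions.get)
    return best, predictions[best]


if __name__ == "__main__":
    # Invented profiling data: the edge node has no start-up cost but scales
    # poorly; the cloud node has a fixed overhead but scales well.
    profiles = {
        "edge": [(1, 2.0), (2, 4.0), (3, 6.0)],
        "cloud": [(1, 5.0), (2, 5.5), (3, 6.0)],
    }
    models = build_models(profiles)
    print(pick_resource(models, 1))
    print(pick_resource(models, 10))
```

With this invented data, small tasks are routed to the edge node and large tasks to the cloud node, which is the kind of per-task decision the abstract describes.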

A conceptual framework for uncertainty in software systems and its application to software architectures

Lupafya, Chawanangwa — 2023-06-14T00:00:00Z

The development and operation of a software system involve many aspects including processes, artefacts, infrastructure and environments. Most of these aspects are vulnerable to uncertainty. Thus, the identification, representation and management of uncertainty in software systems is important and will be of interest to many stakeholders in software systems. The hypothesis of this work is that such consideration would benefit from an underlying conceptual framework that allows stakeholders to characterise, analyse and mitigate uncertainties. This PhD proposes a framework to provide a generic foundation for the systematic and explicit consideration of uncertainty in software systems by consolidating and extending existing approaches to dealing with uncertainty, which are typically tailored to specific domains or artefacts. The thesis applies the framework to software architectures, which are fundamental in determining the structure, behaviour and qualities of software systems and are thus suited to serve as an exemplar artefact. The framework is evaluated using the software architectures of case studies from 3 different domains. The contributions of the research to the study of uncertainty in software systems include a literature review of approaches to managing uncertainty in software architecture, a review of existing work on uncertainty frameworks related to software systems, a conceptual framework for uncertainty in software systems, a conceptualisation of the workbench infrastructure as a basis for building an uncertainty consideration workbench of tools for representing uncertainty as part of software architecture descriptions, and an evaluation of the uncertainty framework using three software architecture case studies.

High-level efficient constraint dominance programming for pattern mining problems

Koçak, Gökberk — 2023-06-14T00:00:00Z

Pattern mining is a sub-field of data mining that focuses on discovering patterns in data to extract knowledge. There are various techniques to identify different types of patterns in a dataset. Constraint-based mining is a well-known approach to this where additional constraints are introduced to retrieve only interesting patterns. However, in these systems, there are limitations on imposing complex constraints. Constraint programming is a declarative methodology where the problem is modelled using constraints. Generic solvers can operate on a model to find the solutions. Constraint programming has been shown to be a well-suited and generic framework for various pattern mining problems with a selection of constraints and their combinations. However, a system that handles arbitrary constraints in a generic way has been missing in this field. In this thesis, we propose a declarative framework where the pattern mining models can be represented in high-level constraint specifications with arbitrary additional constraints. These models can be efficiently solved using underlying optimisations. The first contribution of this thesis is to determine the key aspects of solving pattern mining problems by creating an ad-hoc solver system. We investigate this further and create Constraint Dominance Programming (CDP) to be able to capture certain behaviours of pattern mining problems in an abstract way. To that end, we integrate CDP into the high-level Essence pipeline. Early empirical evaluation shows that CDP is already competitive with existing techniques. The second contribution of this thesis is to exploit an additional behaviour, incomparability, in pattern mining problems. By adding the incomparability condition to CDP, we create CDP+I, a more explicit and even more efficient framework to represent these problems. We also prototype an automated system to deduce the optimal incomparability information for a given modelled problem.
The third contribution of this thesis is to focus on the underlying solving of CDP+I to bring further efficiency. By creating the Solver Interactive Interface (SII) on SAT and SMT back-ends, we highly optimise not only CDP+I but any iterative modelling and solving, such as optimisation problems. The final contribution of this thesis is to investigate creating an automated configuration selection system to determine the best performing solving methodologies of CDP+I and introduce a portfolio of configurations that can perform better than any single best solver. In summary, this thesis presents a highly efficient, high-level declarative framework to tackle pattern mining problems.
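As a small illustration of what a dominance relation between patterns can look like, the sketch below mines closed frequent itemsets by brute force. It is an assumption-laden example, not the CDP framework or its Essence encoding: it takes the well-known closedness condition (an itemset is dominated by a strict superset with the same support) as the dominance relation, and the transaction data is invented.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset contained in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def remove_dominated(patterns):
    """Keep a pattern only if no strict superset has the same support."""
    return {
        p: s for p, s in patterns.items()
        if not any(p < q and s == t for q, t in patterns.items())
    }


# Invented transactions, for illustration only.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = remove_dominated(frequent)
print(closed)
```

Here {a} and {b} are frequent but dominated by {a, b}, which has the same support, so only {a, b} and {c} survive. A constraint-based system would prune dominated candidates during search instead of enumerating them first.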

Evaluating data linkage algorithms with perfect synthetic ground truth

Dalton, Thomas Stanley — 2022-06-15T00:00:00Z

Data linkage algorithms join datasets by identifying commonalities between them. Evaluating the efficacy of different algorithms is a challenging problem that is often overlooked. If incorrect links are made or links are missed by a linkage algorithm then conclusions based on its linkage may be unfounded. Evaluating linkage quality is particularly challenging in domains where datasets are large and the number of links is low. Example domains include historical population data, bibliographic data, and administrative data. In these domains the evaluation of linkage quality is not well understood. A common approach to evaluating linkage quality is the use of metrics, most commonly precision, recall, and F-measure. These metrics indicate how often links are missed or false links are made. To calculate a metric, datasets are used where the true links and non-links are known. The linkage algorithm attempts to link the datasets and the constructed set of links is compared with the set of true links. In these domains we can rarely have confidence that the evaluation datasets contain all the true links and that no false links have been included. If such errors exist in the evaluation datasets, the calculated metrics may not truly reflect the performance of the linkage algorithm. This presents issues when making comparisons between linkage algorithms. To rigorously evaluate the efficacy of linkage algorithms, it is necessary to objectively measure an algorithm’s linkage quality with a range of different configuration parameters and datasets. These many datasets must be of appropriate scale and have ground truth which denotes all true links and non-links. Evaluating algorithms using shared standardised datasets enables objective comparisons between linkage algorithms. To facilitate objective linkage evaluation, a set of standardised datasets needs to be shared and widely adopted.
This thesis establishes an approach for the construction of synthetic datasets that can be used to evaluate linkage algorithms. This thesis addresses the following research questions: • What are appropriate approaches to the evaluation of linkage algorithms? • Is it feasible to synthesise realistic evaluation data? • Is synthetic evaluation data with perfect ground truth useful for evaluation? • How should synthesised data be statistically validated for correctness? • How should sets of synthesised data be used to evaluate linkage? • How can the evaluation of linkage algorithms be effectively communicated? This thesis makes a number of contributions, most notably a framework for the comprehensive evaluation of data linkage algorithms, thus significantly improving the comparability of linkage algorithms, especially in domains lacking evaluation data. The thesis demonstrates these techniques within the population reconstruction domain. Integral to the evaluation framework, approaches to synthesis and statistical validation of evaluation datasets have been investigated, resulting in a simulation model able to create many, characteristically varied, large-scale datasets.
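The precision, recall and F-measure calculation mentioned in this abstract can be sketched as follows. The record identifiers and links are invented for illustration, and the F-measure shown is the balanced F1 form; the abstract does not state which weighting the thesis uses.

```python
def linkage_metrics(predicted_links, true_links):
    """Compare the links an algorithm produced with the known true links."""
    predicted, truth = set(predicted_links), set(true_links)
    true_positives = len(predicted & truth)    # correct links made
    false_positives = len(predicted - truth)   # false links made
    false_negatives = len(truth - predicted)   # true links missed
    made = true_positives + false_positives
    expected = true_positives + false_negatives
    precision = true_positives / made if made else 0.0
    recall = true_positives / expected if expected else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical ground truth and algorithm output (pairs of record ids).
true_links = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted_links = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

precision, recall, f_measure = linkage_metrics(predicted_links, true_links)
print(precision, recall, f_measure)
```

If the supposed ground truth itself omitted a true link or contained a false one, the same function would silently report misleading values, which is the motivation the abstract gives for synthetic data with perfect ground truth.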

Sensorimotor interfaces : towards enactivity in HCI

Carson, Iain — 2022-06-15T00:00:00Z

This thesis explores the application of enactive techniques to human computer interaction, focusing on how devices following ‘sensorimotor’ principles can be blended with interface goals to lead to new perceptual experiences. Building sensorimotor interfaces is an exciting, emerging field of research facing challenges surrounding application, design, training and uptake. To tackle these challenges, this thesis cuts a line of investigation from a review of enactivity in the related field of sensory substitution and augmentation devices, to a schematic taxonomy, model and design guide of ‘the sensorimotor interface’; developed from a theoretically-grounded, enactive approach to cognition. Device, interaction and training guidelines are drawn from this model, formalising the application of the enactive approach to HCI. A readily-available consumer device is then characterised and calibrated in preparation for testing the model validity and associated insights. The process highlights the effects of accessible, easily-implemented calibrations, and the importance of mixed-method approaches in assessing sensorimotor interface potential. The calibrated device is utilised to conduct a detailed, methodological investigation into how concurrently available sensory information affects and contributes to uptake of novel sensorimotor skills. Robust statistical modelling concludes that sensory concurrency has a profound effect on the comprehension and integration of enactive haptic signals, and that efforts to carefully control the nature and degree of sensory concurrency improve user comprehension and enjoyability when engaging with novel sensorimotor tasks, while reducing confusion and stress. The work is concluded by speculation on how the presented derivations, methods and observations can be used to directly influence future sensorimotor interface design in HCI.
This thesis therefore constitutes a primer to the principles and history of sensory substitution and augmentation, details the requirements and limitations of the enactive approach in academia and industry, and brings enactivity forward as an accessible, viable and exciting methodology in interaction design.

Automatically exploiting high-level problem structure in local-search

Attieh, Saad — 2021-11-30T00:00:00Z

Constraint Programming is the study of modelling and solving complex combinatorial problems. Systematic-search and local-search are both well-researched approaches to solving constraint problems. Systematic-search exhaustively explores the entire search space and can be used to guarantee optimality, prove infeasibility or enumerate all possible solutions. Conversely, local-search is a heuristic-based approach to solving constraint problems. Often used in industrial applications, local-search is used to discover high-quality solutions quickly, usually sacrificing the ability to cover the entire search space. For this reason, it is preferred in applications where the scale of the problems being solved is beyond what can be feasibly searched using systematic methods. This work investigates methods of using information derived from high-level specifications of problems to augment the performance and scalability of local-search systems. Typically, abstract high-level constraint specifications or models are refined into low-level representations suitable for input to a constraint solver, erasing any knowledge of the specifications' high-level structures. We propose that whilst these lower-level models are equivalent in their description of the problems being solved, the original high-level specification, if retained, can be used to augment both the performance and scalability of local-search systems. In doing this, two approaches have been implemented and benchmarked. In the first approach, Structured Neighbourhood Search (SNS), a systematic solver is adapted to support declarative large neighbourhood search, using the high-level types such as sets, sequences and partitions in the original problem specification to automatically construct higher-quality, structured neighbourhoods. Our experiments demonstrate the performance of SNS when applied to structured problems.
In the second approach, a novel constraint-based local-search solver is designed to operate on the high-level structures without refining these structures into lower-level representations. The new solver Athanor can directly instantiate and operate on the types in the Essence abstract specification language, supporting arbitrarily nested types such as sets of partitions, multi-sets of sequences and so on. Athanor retains the performance of SNS but boasts a unique benefit; on some classes of problems, the high-level solver is shown to be able to efficiently operate on instances that are too large for low-level solvers to even begin search.

Streamlined constraint reasoning : an automated approach from high level constraint specifications

Spracklen, Patrick — 2022-11-29T00:00:00Z

Constraint Programming (CP) is a powerful technique for solving large-scale combinatorial (optimisation) problems. Solving a problem proceeds in two distinct phases: modelling and solving. Effective modelling has a huge impact on the performance of the solving process. Even with the advance of modern automated modelling tools, the search spaces involved can be so vast that problems can still be difficult to solve. To further constrain the model, a more aggressive step that can be taken is the addition of streamliner constraints, which are not guaranteed to be sound but are designed to focus effort on a highly restricted but promising portion of the search space. Previously, producing effective streamlined models was a manual, difficult and time-consuming task. This thesis presents a completely automated process for the generation, search and selection of streamliner portfolios to produce a substantial reduction in search effort across a diverse range of problems. First, we propose a method for the generation and evaluation of streamliner conjectures automatically from the type structure present in an Essence specification. Second, the possible streamliner combinations are structured into a lattice, and a multi-objective search method for searching the lattice of combinations and building a portfolio of streamliner combinations is defined. Third, the problem of "Streamliner Selection" is introduced, which deals with selecting from the portfolio an effective streamliner for an unseen instance. The work is evaluated by presenting two sets of experiments on a variety of problem classes. Lastly, we explore the effect of model selection in the context of streamlined specifications and discuss the process of streamlining for Constrained Optimization Problems.

Homeostatic action selection for simultaneous multi-tasking

Symons, David Andrew — 2020-07-29T00:00:00Z

Mobile robots are rapidly developing and gaining in competence, but the potential of available hardware still far outstrips our ability to harness it. Domain-specific applications are most successful due to customised programming tailored to a narrow area of application. Resulting systems lack extensibility and autonomy, leading to increased cost of development. This thesis investigates the possibility of designing and implementing a general framework capable of simultaneously coordinating multiple tasks that can be added or removed in a plug and play manner. A homeostatic mechanism is proposed for resolving the contentions inevitably arising between tasks competing for the use of the same robot actuators. In order to evaluate the developed system, demonstrator tasks are constructed to reach a goal location, prevent collision, follow a contour around obstacles and balance a ball within a spherical bowl atop the robot. Experiments show preliminary success with the homeostatic coordination mechanism but a restriction to local search causes issues that preclude conclusive evaluation. Future work identifies avenues for further research and suggests switching to a planner with sufficient foresight to continue evaluation.

Co-creating data protection solutions through a commons

Wong, Janis — 2022-11-29T00:00:00Z

In our data-driven society, personal data affecting individuals as data subjects is increasingly being collected and processed by sizeable, international companies. While data protection laws and privacy technologies attempt to limit the impact of data breaches and privacy scandals, they rely on individuals having a detailed understanding of the available recourse, resulting in the responsibilisation of data protection. Existing data stewardship frameworks incorporate data protection considerations and employ data-protection-by-design principles but may not include data subjects in the process itself, relying on supplementary legal doctrines to strengthen data protection enforcement. Current data protection solutions also lack support for protecting individual autonomy over personal data through co-creation and participation, particularly where there is socio-technical and communal value to collaborative data from which data subjects may not currently benefit. These challenges motivate the application of a theoretical and practical framework that can encourage co-creation of data protection solutions, increase awareness of different stakeholder interests, and rebalance power between data subjects and data controllers. In this thesis, we propose adapting the commons framework to create a data protection-focused data commons. We conduct interviews with commons experts to identify the institutional barriers to creating a commons and challenges of incorporating data protection principles into a commons. We propose requirements for establishing a data protection-focused data commons by applying our interview findings and data protection principles. We then deploy the data protection-focused data commons using an online learning use case. 
We conduct a study to explore the usefulness of the commons for supporting students' agency and co-creating data protection solutions in response to tutorial recordings, their consent preferences, and attitudes towards privacy and online learning. We find that a data protection-focused data commons as a socio-technical framework can support the collaboration and co-creation of data protection solutions for the benefit of data subjects.

Mobility multihoming duality for the Internet Protocol

Yanagida, Ryo — 2022-06-21T00:00:00Z

In the current Internet, mobile devices with multiple connectivity are becoming increasingly common; however, the Internet protocol itself has not evolved accordingly. Instead, add-on mechanisms have emerged, but they do not integrate well. Currently, the user suffers from disruption to communication on the end-host as the physical network connectivity changes. This is because the IP address changes when the point of attachment changes, breaking the transport layer end-to-end state. Furthermore, while a device can be connected to multiple networks simultaneously, the use of IP addresses prevents end-hosts from leveraging multiple network interfaces — a feature known as host multihoming, which can potentially improve the throughput or reliability. While solutions exist separately for mobility and multihoming, it is not possible to use them as a duality solution for the end-host. This work extended ILNPv6, an engineering solution of Identifier-Locator Network Protocol (ILNP) implemented as a superset of IPv6 on the Linux kernel. The existing implementation was extended to enable mobility and multihoming duality. First, the mobility implementation was enhanced to support continuous mobility; a comparative analysis against Mobile IPv6 (MIPv6) showed superior performance during a series of handoffs. Second, multihoming was implemented and integrated with mobility; the evaluation with a flexible multi-connectivity scenario with load-balancing showed negligible loss and consistent throughput. Finally, the impact of the combined mobility-multihoming mechanism was evaluated with a real-time video streaming application showing continuous uninterrupted real-time video playback up to 2160p (4K ultra high definition). Overall, this work has demonstrated that mobility-multihoming duality is possible for end-hosts over IPv6 for existing applications without changing the network infrastructure.

Visual analysis of arguments in video-based debates

Soares Mota Carneiro, Guilherme — 2022-06-15T00:00:00Z

The ability to understand, process and evaluate arguments made by others and ourselves is important in many personal and professional spheres, such as political debates. However, developing an understanding and communicating with others is often limited to passive viewing, textual discussion on social media and comments on a newspaper website. The analysis of arguments might help in developing a better understanding, but this typically appears in written form, such as debate articles written by journalists or experts in a newspaper. A growing number of argumentation tools favour diagram-based graphical representations over traditional text documents for argument analysis because arguments have non-linear structures that are difficult to convey simply through text. Such tools have been developed for different purposes such as education and decision analysis and are often used by experts in the field of argumentation and debate analysis. Despite the widespread use and development of argumentation systems, there is still little understanding of how to design and implement argumentation systems for non-experts in argumentation. This thesis investigates how to design and implement Deb8, a tool that allows collaborative analysis of video-based debates. This thesis presents the results of three studies that uncover to what extent non-experts in argumentation understand and use Deb8, what argument concepts non-experts apply in their analyses and what role the graphical representation of arguments plays in the analysis of video-based debates. The findings presented in this thesis can guide the design of better argumentation systems and shed light on the areas of debate analysis and argument visualisation with a better understanding of how to design systems to help the general public to argue better.

Single-handed interaction techniques for mobile and wearable computing

Yeo, Hui Shyong — 2021-06-30T00:00:00Z

The past decade has seen the proliferation of mobile and wearable computing devices into our everyday life. Such devices are now used throughout the day for both productivity and entertainment purposes. As a result, it is important that input techniques for these devices are efficient, effective and intuitive. Further, it is important that these techniques reflect the reality of common usage patterns. In particular, supporting single-handed usage is of paramount importance, given that in many scenarios only one hand is available. As the screen sizes of mobile devices are getting larger, single-handed usage becomes even more problematic. At the opposite end of the scale, using small wearable devices such as smartwatches or fitness trackers often requires two hands. This thesis is concerned with the exploration, design, and evaluation of input techniques that enable practical and effective single-handed interaction on mobile and wearable devices, which empower users to achieve more with their smart devices when only one hand is available. In particular, the thesis focuses on the practicability and actual implementation of such techniques, by using built-in or low-cost sensors that are readily available. The work first motivates the thesis topic that was encountered during the early phase of study. Then, the single-handed interaction problem is tackled with two types of device form factor, mobile and wearable. This thesis studies the problem on three types of input modalities (mid-air gesture, hand posture, on-surface gesture) and three types of interaction techniques (text input, gesture, pointing). This thesis provides several techniques, interaction methods and exemplars required to explore the single-handed interaction problem. The effectiveness and efficiency of the techniques are evaluated with rigorous studies.

Computing normalisers of highly intransitive groups

Chang, Mun See — 2021-06-30T00:00:00Z

We investigate the normaliser problem, that is, given 𝐺, 𝐻 ≤ 𝑆ₙ, compute 𝑁_𝐺(𝐻). The fastest known theoretical algorithm for this problem is simply exponential, but more efficient algorithms are known for some restricted classes of 𝐺 and 𝐻. In this thesis, we will focus on highly intransitive groups, which are groups with many orbits. We give new algorithms to compute 𝑁_{𝑆ₙ}(𝐻) for highly intransitive groups 𝐻 ≤ 𝑆ₙ and for some subclasses that perform substantially faster than previous implementations in the computer algebra system GAP.
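To make the problem statement concrete, the sketch below computes the normaliser of 𝐻 in 𝑆ₙ by brute force, directly from the definition (all 𝑔 with 𝑔𝐻𝑔⁻¹ = 𝐻). It is not one of the algorithms from the thesis and it is not how GAP works: it tests all n! permutations, so it is only usable for very small n. Permutations are represented as tuples of images of 0..n-1, which is a choice made for this example.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for point, image in enumerate(p):
        inv[image] = point
    return tuple(inv)


def generate(generators, n):
    """Close the generators under multiplication to list the whole group."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def normaliser_in_sym(generators, n):
    """All g in S_n with g H g^-1 = H, where H is generated by `generators`."""
    group = generate(generators, n)
    result = []
    for g in permutations(range(n)):
        g_inv = inverse(g)
        # For a finite H it suffices that every conjugated generator lies in H.
        if all(compose(compose(g, h), g_inv) in group for h in generators):
            result.append(g)
    return result


# H = <(0 1), (2 3)> has two orbits, {0, 1} and {2, 3}, on four points.
swap01 = (1, 0, 2, 3)
swap23 = (0, 1, 3, 2)
normaliser = normaliser_in_sym([swap01, swap23], 4)
print(len(normaliser))
```

In this example the normaliser consists of the permutations that either fix both orbits or swap them, which gives a group of order 8.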

Hidden Markov models with variational inference in marketing science

Danielson, Matthew — 2021-06-30T00:00:00Z

Hidden Markov Models (HMMs) are a well-known type of model for many varieties of sequential data. There exist several algorithms for learning HMMs: a variant of an expectation-maximization (EM) algorithm known as the Baum-Welch method, Markov Chain Monte Carlo (MCMC), and Variational Inference (VI). This third method is less frequently used, yet it has interesting properties with regard to convergence, sparsity, and interpretation that are worth further exploration. HMMs are used as explanatory models in the field of marketing science, where one of the goals is to interpret the model structure to understand customer behavior. This thesis will explore the use of HMMs trained with VI to build an interpretable classification model for customer churn on a dataset consisting of call data records from a mobile telecommunications company. In this thesis we first provide an introduction to VI for HMMs and then derive a mixture of HMMs (mHMMs) using VI. An mHMM is then shown to be quite capable of performing unsupervised clustering of sequential data. Next, we present the design and interface of a new open source library for training HMMs and mHMMs with VI and EM. We show that this library achieves excellent performance while still providing an intuitive interface in the Python programming language. We then examine the performance of classifiers using HMMs trained with VI and EM on several classification datasets. The results from these experiments are then used to build and test several simple classification models to predict churn for the provided dataset. As these models are shown to have poor performance, we train a more traditional machine learning model based on gradient boosted trees and evaluate the interpretability, stability and performance of this model over a subsequent 18 months of data.
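For readers unfamiliar with HMMs, the sketch below shows the standard forward pass, which computes the likelihood of an observation sequence and is a building block of Baum-Welch training. It does not implement variational inference and does not reflect the interface of the library described in the thesis; the parameter values are invented for illustration.

```python
from itertools import product


def forward_likelihood(observations, start, transition, emission):
    """Probability of the observation sequence under a discrete HMM.

    start[i]         : probability of starting in hidden state i
    transition[i][j] : probability of moving from state i to state j
    emission[i][o]   : probability of emitting symbol o in state i
    """
    n_states = len(start)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [start[i] * emission[i][observations[0]] for i in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states))
            * emission[j][symbol]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model over two observation symbols (0 and 1).
start = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [[0.9, 0.1], [0.3, 0.7]]

likelihood = forward_likelihood([0, 0, 1], start, transition, emission)
# Sanity check: likelihoods of all length-3 sequences should sum to 1.
total = sum(
    forward_likelihood(list(seq), start, transition, emission)
    for seq in product([0, 1], repeat=3)
)
print(likelihood, total)
```

For long sequences a practical implementation would work in log space or rescale alpha at each step to avoid numerical underflow.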

A grey-box approach to benchmarking and performance modelling of data-intensive applications

Ceesay, Sheriffo — 2021-06-30T00:00:00Z

The advent of big data about a decade ago, coupled with its processing and storage challenges gave rise to the development of a multitude of data-intensive frameworks. These distributed parallel processing frameworks can be used to process petabytes of data stored in a cluster of computing nodes. Companies and organisations can now process massive amounts of data to drive innovation and gain a competitive advantage. However, these new paradigms have resulted in several research challenges due to their inherent difference from the more mature traditional data processing and storage systems. Firstly, they are comparatively more modern, supporting the execution of a wide variety of new data-intensive workloads with varying performance requirements. Therefore, there is a clear need to study and standardise ways to benchmark and compare them to identify and improve performance bottlenecks. Secondly, they are highly configurable; enabling users the freedom to tune the execution environment based on the application's performance requirements. However, this freedom and the ubiquity of the configuration parameters present an additional challenge by shifting the tuning and optimisation responsibilities of these numerous configuration parameters to the users. To address the above broad challenges, in this research, we enabled a grey-box benchmarking and performance modelling framework focusing on two of the most common communication patterns for data-intensive applications. The use of communication patterns allowed us to classify and study varying but related data-intensive workloads using the same sets of requirements. Furthermore, we enabled a multi-objective performance prediction framework that can be used to answer various performance-related questions such as the time it takes to execute an application, the best configuration parameters to satisfy constraints such as deadline, and recommendation of optimal cloud instances to minimise monetary cost. 
To gauge the generality of this work, we have validated the results on two internal clusters, and the results are consistent across both setups. We have also provided a REST API and web implementation for validation. The primary takeaway is that the research showcases a comprehensive approach that can be used to benchmark and model the performance of data-intensive applications.

Computer assisted data acquisition in real time

Harland, David M. — 1978-01-01T00:00:00Z

A technique is outlined by which a normal time-shared operating system may be modified to permit concurrent processing of the time-synchronous data sampling operations in real time and the more typical time-insensitive functions of other user processes on that system. Guidance is given regarding the implementation of such a proposal and it is discussed in relation to the UNIX operating system, on which it has been demonstrated. It is pointed out that concurrent processing of a multi-user time-shared system and a single-user time-synchronous sampling routine offers a more efficient use of the computer processor, when real time sampling must be undertaken, while still offering a service to other users. Such a scheme may be an attractive alternative to implementing a full-scale Real Time Operating System for the exclusive use of a very limited sampling operation. Applications for such a technique in the field of observational astronomical research are briefly reviewed, and current ad hoc software criticised.

Implementation of a macro-processor for string handling

Stathopoulos, Constantin Nicholas — 1971-01-01T00:00:00Z

In this work TRAC is implemented in FORTRAN IV. This will enable various users to compile this FORTRAN version and use it in their own installations making only minor modifications to meet individual specifications. The implementation allows the addition of either existing TRAC functions which are not included in this work or completely new, primitive functions needed for specific, well-defined purposes. TRAC is a very flexible interactive language with versatile capabilities at execution time. The presented processor is programmed in FORTRAN IV using IBM 360 44PS and RAX facilities. It is compiled and intended to be used as a software package providing TRAC language facilities for the 360 RAX REMOTE ENTRY COMPUTING SYSTEM. It can be used under 44PS for special purposes. TRAC is a member of the set 'STRING MANIPULATION LANGUAGES'. A thorough examination of this set helps to understand the basics of operations and techniques for dealing with strings. Another member of the same set is described briefly: this is the 'SNOBOL LANGUAGE'. 'String manipulation languages' is a subset of the set 'SYMBOL MANIPULATION LANGUAGES'. Some of the fundamental ideas and principles for symbol manipulation are included. The best-known languages, techniques and applications are mentioned, followed by references to allow further research and investigation. The above are introduced in the following order: 1. Symbol Manipulation Languages, 2. String Manipulation Languages, 3. TRAC Language.

A comparison of certain simulation systems

Stavrakis, Constantine — 1973-01-01T00:00:00Z

In recent years the use of simulation in the world has expanded rapidly. The writing of a simulation program in one of the general purpose programming languages such as FORTRAN, ALGOL etc., was very difficult. This necessitated the development of a number of so-called "General Purpose Simulation Languages" that are aimed at simplifying the task of writing simulation programs for a variety of different types of models. Among the simulation languages that have been developed are the GPSS and SIMSCRIPT languages. At the present time there are many simulation languages available, but most are special-purpose types which are not widely used. The major simulation languages in the world today include GPSS and SIMSCRIPT. Of these GPSS is the most widely used. This book briefly describes the GPSS, SIMSCRIPT and FORTRAN languages used in computer simulation systems, as well as the requirements for programming simulation on a computer. Our major purpose is to solve a number of such problems in GPSS and FORTRAN in order to facilitate a comparison between these two languages. In addition we shall display a very powerful package of FORTRAN subroutines in order to simplify the work of doing simulation in FORTRAN. This book is divided into five chapters. Chapter one provides a general introduction to the General Purpose Simulation System (GPSS). It is to be noted that GPSS exists in several versions. Here we shall describe the GPSS/360 language as it is implemented on the IBM 360 computers. Chapter two describes four different simulation models in GPSS to demonstrate the use of this language. Chapter three contains a survey of the SIMSCRIPT simulation language and a comparison between GPSS and SIMSCRIPT. Chapter four describes in FORTRAN the same models as in chapter two in order to illustrate the applicability of FORTRAN in simulation models.
Finally, the last chapter, chapter five, compares the capabilities of the GPSS and FORTRAN languages and discusses the output results of all four models in both languages.

Applications of computer graphics to plant science

Norton, Colin R. — 1979-01-01T00:00:00Z

RAL : relational algebra language

Bocca Choy, Jorge — 1979-01-01T00:00:00Z

This thesis describes a Data Base Management System (DBMS) that has been designed and developed at the University of St. Andrews, using a PDP 11/40 computer and the UNIX (1) operating system. The system is a general purpose Data Base Management System supporting a relational view of data and has been developed for applications of small and medium complexity and size. At the level of the user interface, the system offers a sublanguage based on the relational algebra, and the chosen operators are similar to the ones suggested by Codd (2). This language decision contrasts with the sublanguage based on first-order predicate calculus which other relational DBMS of similar capabilities offer. The thesis discusses the concepts on which the system is based, the expressive power of the language offered to the user, and implementation techniques and related problems. Furthermore, an evaluation of the implementation of RAL suggests ways of improving the performance of the system and enhancing its power and flexibility.
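
As a rough illustration of what a sublanguage based on the relational algebra provides, the sketch below implements three of the classic operators (selection, projection and natural join) over relations held as lists of Python dictionaries. It is not RAL syntax and not the implementation described in the thesis; the function names and the sample relations are invented for illustration only.

```python
def select(relation, predicate):
    """Selection: keep the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


# Hypothetical sample relations (not taken from the thesis).
staff = [
    {"name": "Ann", "dept": "CS"},
    {"name": "Bob", "dept": "Maths"},
    {"name": "Cy", "dept": "CS"},
]
depts = [
    {"dept": "CS", "building": "A"},
    {"dept": "Maths", "building": "B"},
]

# "Names of staff whose department is in building A", written as a
# composition of algebra operators rather than a calculus-style formula.
in_building_a = project(
    select(natural_join(staff, depts), lambda r: r["building"] == "A"),
    ["name"],
)
```

The query is built by composing operators step by step, which is the procedural style that distinguishes an algebra-based sublanguage from one based on predicate calculus.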

Meaning and translation : theory and practice of machine translation as exemplified by applicative and cognitive grammars

Guerecheau, Chantal — 2003-01-01T00:00:00Z

This dissertation is articulated around machine translation theory and practice on the one hand, and translation theories and issues on the other hand. Machine translation, being based usually on a single natural language analysis, fails to encompass the complex operations performed on source and target texts belonging to source and target languages and cultures. This thesis compares machine and human translation of a large corpus of sentences (Le Petit Prince, by Antoine de Saint-Exupery), analyses how human beings translate, and how a translation programme processes the same text. The survey of the transfer processes - or shifts - displayed in several human translations (from French into English, German and Russian), as well as the analysis of the machine translation outputs (into English and German), show that machine translation should be reconsidered, not through source language analysis, but through the transfer operations performed on a source text to produce a target 'equivalent' text. Translation, seen as a cognitive operation, is here studied within the perspective of Applicative and Cognitive Grammars, formalisms rooted in Montague Grammar. More specifically, the Applicative and Cognitive Grammar, developed by Desclés, aims at determining a genotype language (a hypothetical universal semiotic system underlying all languages), and is developed around the theory of organised intermediate representation levels. Applicative and Cognitive Grammars can be seen as a step towards the setting-up of an interlingua architecture, which represents the next generation of machine translation systems. This research allows for an analysis of the deficiencies of a transfer translation system, as well as a better understanding of the processes of 'meaning transfer' in translation, seen as a semiotic operation.

Computer drawn perspective landscapes from contour data

Abramson, Cecile Sara — 1972-01-01T00:00:00Z

"Computer drawn perspective landscapes from contour data" describes a computer program which makes a plotter drawing of a landscape using map data. The user must supply 1) a matrix of heights in a certain area, 2) an observation point and 3) a point to indicate the boundary of the view and the direction the observer is facing. The user may also supply information about bodies of water, cities or towns. The program stores the input, calculates the lines of the landscape and draws them on the plotter. It also supplies a frame for the drawing. The program calculates the landscape lines by forming a field of vision, the left radius being formed by the observation point and the view point (both supplied by the user). The arc of vision is divided into 240 radiating lines. The angle of elevation for 80 points along each radiating line is calculated and the points with the largest angles are connected to form the outlines which are drawn. The first chapter is a general survey of computer graphics. The rest of the thesis is concerned with the program itself: first there is a general description of the project and the problems involved in going about it, and then a detailed description of the Fortran* program. The last chapter describes further work that would be relevant to this project. Also included are illustrations and the Fortran program itself.
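
The core step described above (sample points along each radiating line and keep the one with the largest angle of elevation) can be sketched as follows. This is a simplified reading of the abstract, not the thesis's Fortran program: the function name, the unit grid spacing, the nearest-cell height lookup, and the `left_angle`, `arc` and `max_range` parameters are all assumptions, and the sketch yields only one silhouette point per ray rather than the full set of outlines.

```python
import math


def horizon_points(heights, observer, eye_height, left_angle, arc,
                   n_rays=240, n_samples=80, max_range=None):
    """For each ray across the arc of vision, return the sampled grid
    point with the largest angle of elevation as seen from the observer.

    heights    : 2D list, heights[row][col], assumed unit grid spacing
    observer   : (x, y) position in grid coordinates
    eye_height : height of the observer's eye
    left_angle : direction of the left edge of the view, in radians
    arc        : angular width of the field of vision, in radians
    """
    rows, cols = len(heights), len(heights[0])
    ox, oy = observer
    if max_range is None:
        max_range = math.hypot(rows, cols)

    outline = []
    for k in range(n_rays):
        theta = left_angle + arc * k / max(n_rays - 1, 1)
        best = None  # (elevation angle, (col, row)) of the highest point
        for j in range(1, n_samples + 1):
            dist = max_range * j / n_samples
            col = round(ox + dist * math.cos(theta))
            row = round(oy + dist * math.sin(theta))
            if not (0 <= row < rows and 0 <= col < cols):
                continue  # sample falls outside the height matrix
            elevation = math.atan2(heights[row][col] - eye_height, dist)
            if best is None or elevation > best[0]:
                best = (elevation, (col, row))
        outline.append((theta, best))
    return outline
```

Connecting the returned points for neighbouring rays gives a skyline; the defaults of 240 rays and 80 samples per ray follow the figures quoted in the abstract.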

A more effective use of information in search for quantified Boolean formulae

Rowley, Andrew G. D. — 2006-01-01T00:00:00Z

The solving of Quantified Boolean Formulae (QBFs) has recently become of great interest. QBFs are an extension of the Satisfiability problem (SAT), which has been studied in depth. Many QBF techniques are built as extensions to SAT techniques. While this can be useful, it also means that QBF specific techniques have not received as much attention as they could. The contributions of this thesis are as follows. New data structures for QBF are introduced which use the information available to the QBF solver more effectively, reducing the amount of time taken to update the data structures. A new method of using a SAT solver within a QBF solver is described; unlike previous techniques, it does not ignore the results of the SAT solver. The use of an incomplete SAT solver in QBF search is also discussed, which gives rise to the first incomplete QBF solver. A detailed analysis of solution-directed backjumping is given, and the technique is shown to be less effective than was previously thought. New techniques are developed to build better solution sets that result in improved operation of solution-directed backjumping. A new technique for solution learning is developed, which increases the amount of information learned for each solution found without a large increase in the space required. An experimental analysis shows that this results in a reduced number of backtracks on many problems compared to other solution learning techniques. Overall, the better use of information is shown to lead to improvements in QBF solving techniques.
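
To make concrete how QBF extends SAT, here is a deliberately naive evaluator for closed QBFs in prenex form with a CNF matrix. It simply tries both values of each quantified variable in prefix order; it uses none of the data structures, SAT-solver integration, backjumping or learning techniques the thesis contributes. The representation (a prefix of `("forall" | "exists", variable)` pairs and DIMACS-style integer literals) is an assumption chosen for brevity.

```python
def eval_qbf(prefix, clauses, assignment=None):
    """Evaluate a closed prenex-CNF QBF by exhaustive search.

    prefix     : list of (quantifier, variable) pairs, outermost first,
                 where quantifier is "forall" or "exists"
    clauses    : list of clauses; each clause is a list of non-zero ints,
                 v meaning variable v and -v meaning its negation
    assignment : partial assignment built up during recursion
    Assumes every variable in the clauses appears in the prefix.
    """
    assignment = dict(assignment or {})
    if not prefix:
        # All variables assigned: check that every clause has a true literal.
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )
    (quantifier, var), rest = prefix[0], prefix[1:]
    branches = (
        eval_qbf(rest, clauses, {**assignment, var: value})
        for value in (False, True)
    )
    # A universal variable needs both branches true; an existential needs one.
    return all(branches) if quantifier == "forall" else any(branches)


# (x <-> y) as CNF with x = 1 and y = 2.
iff = [[-1, 2], [1, -2]]
forall_exists = eval_qbf([("forall", 1), ("exists", 2)], iff)
exists_forall = eval_qbf([("exists", 2), ("forall", 1)], iff)
```

With a prefix containing only existential quantifiers this reduces to plain SAT; the two example calls differ only in quantifier order, which is enough to change the truth value of the formula.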

An implementation of Reeve's syntax directed translator

Vashishta, Ambrish — 1978-01-01T00:00:00Z

This thesis describes an implementation of a syntax-directed translator originally due to C.M. Reeves. The translator program simulates a special-purpose stored-program digital computer. A program for this machine represents the syntax and semantics of some language, the source language. The translator is program-driven and backtracking, and performs the parsing of the source text in a top-down manner. It does not allow left-recursive grammars. The program is written in ALGOL W on the IBM 360/44 computer. The translator accepts language specifications (syntax and semantics) in extended BNF - a meta language described in the thesis - and eventually produces a recogniser for statements in the source language. The translator is composed of two machines - a parsing machine and an editor machine. The parsing machine has 17 machine instructions; some of these instructions are of single address type while others have no operand. Details of these are described. Briefly, they are used by the parsing machine to make decisions about whether or not a source text is grammatical. The output produced by the parsing machine is determined by the semantics embodied in the program. This output is then carried to the editor machine and is treated as its input. The input to the editor machine is an ordered sequence of symbols which contains some edit instructions. With the help of these edit codes, this input is interpreted on the editor machine. The editor machine has 6 machine instructions which normally operate on the top one or two elements of the editor stack. The output produced by the editor machine is then assembled. The program thus obtained can be run on the hypothetical computer as a compiler. A summary of related work is given.

A syntax analysis and manipulation tool

Milne, Allan C. — 1986-01-01T00:00:00Z

A software system for analysing and manipulating an LL(1) grammar of a programming language is described. An extended form of BNF notation is presented for the specification of the syntax. The transformations which can be performed upon the defined grammar are described. The system is particularly designed for the processing of LL(1) grammars, although many of the analysis and manipulation functions will be applicable to a wide range of grammar types. The software system (called LL1) performs various transformations upon the input grammar: to translate it into standard BNF, to remove useless productions and direct left recursion, and to attempt to arrive at an LL(1) definition. LL1 also provides various analyses of the grammar and will output the final syntax in human/machine readable form. A facility is provided to generate a recursive descent parser, incorporating a simple lexical analyser, which will recognise valid sentences of the defined language. A justification for these transformations and analyses is given and this is put in the context of the overall language design process. A definition and description of LL(1) grammars and the recursive descent compiling technique are given, together with a description of the structure and operation of LL1.
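
One of the transformations mentioned above, the removal of direct left recursion, has a standard textbook form: productions A -> A a | b are rewritten as A -> b A' and A' -> a A' | (empty). The sketch below applies that textbook rule; it is not taken from the LL1 system itself, and the grammar representation (a dictionary from nonterminal to a list of productions, each a list of symbols, with the empty list standing for the empty string) and the primed naming scheme are assumptions.

```python
def remove_direct_left_recursion(grammar):
    """Return a new grammar with direct left recursion removed.

    grammar maps each nonterminal to a list of productions; a production
    is a list of symbols and [] denotes the empty string.
    Assumes that nt + "'" is not already used as a symbol name.
    """
    result = {}
    for nt, productions in grammar.items():
        # Tails of left-recursive productions A -> A alpha (alpha non-empty);
        # a useless production A -> A is simply dropped.
        recursive = [p[1:] for p in productions if len(p) > 1 and p[0] == nt]
        others = [p for p in productions if not p or p[0] != nt]
        if not recursive:
            result[nt] = [list(p) for p in productions]
            continue
        tail = nt + "'"
        # A  -> beta A'
        result[nt] = [list(beta) + [tail] for beta in others]
        # A' -> alpha A' | empty
        result[tail] = [list(alpha) + [tail] for alpha in recursive] + [[]]
    return result


# Classic example: E -> E + T | T
expr = {"E": [["E", "+", "T"], ["T"]], "T": [["id"]]}
transformed = remove_direct_left_recursion(expr)
```

The transformed grammar generates the same language but can be parsed top-down without looping, which is a precondition for a recursive descent parser of the kind the system generates.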

Non-analytic shifts in smooth goal-directed human behaviour

Wale, Adrian Peter — 2006-01-01T00:00:00Z

The thesis is aimed at finding the form of explanation and creating the associated computing methodology required to provide an effective computational explanation of smooth goal-directed behaviour. Smooth behaviour has typically been explained using analytic components. It is hypothesised that goal-directed smooth behaviour would benefit from a new hybrid form of explanation involving non-analytic as well as analytic aspects in order to account better for the type of plastic and persistent adaptation seen in natural agent behaviour. The thesis investigates strategies used by animate agents to control the shape of their motor actions in pursuing goals with a view to establishing their components. The hypothesis that there are non-analytic components in natural smooth goal-directed behaviour is empirically tested in the arena of human hand movement kinematics in a variety of experimental settings. The presence of these components in the behaviour is demonstrated in various ways involving the agent constantly redirecting itself so as to remain projecting non-analytically through the goal. The demonstrations begin with an investigation of a simplest base case of a behaviour that involves a smooth merge between two parallel linear movements. A further series of experiments generalizes the methodology to provide successful predictions for cases involving different ratios for the central movement, different directions at the ends of the movement, and with smooth external perturbations added to the movement. Computing and cognitive applications of the methodology are given. It is concluded that the new hybrid form of explanation and methodology is supported by the empirical evidence as being an appropriate one in many cases for providing an effective computational explanation of goal-directed smooth behaviour.

Multisite adaptive computation offloading for mobile cloud applications

Sulaiman, Dawand Jalil — 2020-12-01T00:00:00Z

The sheer number of mobile devices and their fast adaptability have contributed to the proliferation of modern advanced mobile applications. These applications are often latency-critical and demand high availability. They also often require intensive computation resources and consume excessive energy for processing, while a mobile device has limited computation and energy capacity because of its physical size constraints. The heterogeneous mobile cloud environment consists of different computing resources such as remote cloud servers in faraway data centres, cloudlets whose goal is to bring the cloud closer to the users, and nearby mobile devices that can be utilised to offload mobile tasks. Heterogeneity in mobile devices and the different sites includes software, hardware, and technology variations. Resource-constrained mobile devices can leverage the shared resource environment to offload their intensive tasks to conserve battery life and improve the overall application performance. However, such a loosely coupled network dominated by mobile devices raises new challenges and problems, among others: how to seamlessly leverage mobile devices with all the offloading sites, how to simplify deploying a runtime environment for serving offloading requests from mobile devices, how to identify which parts of the mobile application to offload, how to decide whether to offload them, and how to select the best candidate offloading site. To overcome the aforementioned challenges, this research work contributes the design and implementation of MAMoC, a loosely coupled end-to-end mobile computation offloading framework. Mobile applications can be adapted to the client library of the framework while the server components are deployed to the offloading sites for serving offloading requests.
The evaluation of the offloading decision engine demonstrates the viability of the proposed solution for managing seamless and transparent offloading in distributed and dynamic mobile cloud environments. All the implemented components of this work are publicly available at the following URL: https://github.com/mamoc-repos

Bio-inspired multisensory integration of social signals

Mansouri Benssassi, Esma — 2020-07-29T00:00:00Z

Emotion understanding represents a core aspect of human communication. Our social behaviours are closely linked to expressing our emotions and understanding others’ emotional and mental states through social signals. Emotions are expressed in a multisensory manner, where humans use social signals from different sensory modalities such as facial expression, vocal changes, or body language. The human brain integrates all relevant information to create a new multisensory percept and derives emotional meaning. There is great interest in emotion recognition in various fields such as HCI, gaming, marketing, and assistive technologies. This demand is driving an increase in research on multisensory emotion recognition. The majority of existing work proceeds by extracting meaningful features from each modality and applying fusion techniques either at a feature level or decision level. However, these techniques are ineffective in translating the constant talk and feedback between different modalities. Such constant talk is particularly crucial in continuous emotion recognition, where one modality can predict, enhance and complete the other. This thesis proposes novel architectures for multisensory emotion recognition inspired by multisensory integration in the brain. First, we explore the use of bio-inspired unsupervised learning for unisensory emotion recognition for audio and visual modalities. Then we propose three multisensory integration models, based on different pathways for multisensory integration in the brain; that is, integration by convergence, early cross-modal enhancement, and integration through neural synchrony. The proposed models are designed and implemented using third generation neural networks, Spiking Neural Networks (SNN) with unsupervised learning. The models are evaluated using widely adopted, third-party datasets and compared to state-of-the-art multimodal fusion techniques, such as early, late and deep learning fusion.
Evaluation results show that the three proposed models achieve comparable results to state-of-the-art supervised learning techniques. More importantly, this thesis shows models that can translate a constant talk between modalities during the training phase. Each modality can predict, complement and enhance the other using constant feedback. The cross-talk between modalities adds an insight into emotions compared to traditional fusion techniques.

Heterogeneity-aware scheduling and data partitioning for system performance acceleration

Yu, Teng — 2020-06-01T00:00:00Z

Over the past decade, heterogeneous processors and accelerators have become increasingly prevalent in modern computing systems. Compared with previous homogeneous parallel machines, the hardware heterogeneity in modern systems provides new opportunities and challenges for performance acceleration. Classic operating systems optimisation problems such as task scheduling, and application-specific optimisation techniques such as the adaptive data partitioning of parallel algorithms, are both required to work together to address hardware heterogeneity. Significant effort has been invested in this problem, but it either focuses on a specific type of heterogeneous system or algorithm, or offers a high-level framework without insight into the difference in heterogeneity between different types of system. A general software framework is required, which can not only be adapted to multiple types of systems and workloads, but is also equipped with the techniques to address a variety of hardware heterogeneity. This thesis presents approaches to design general heterogeneity-aware software frameworks for system performance acceleration. It covers a wide variety of systems, including an OS scheduler targeting on-chip asymmetric multi-core processors (AMPs) on mobile devices, a hierarchical many-core supercomputer and multi-FPGA systems for high performance computing (HPC) centers. Considering heterogeneity from on-chip AMPs, such as thread criticality, core sensitivity, and relative fairness, it suggests a collaboration-based approach to co-design the task selector and core allocator in the OS scheduler. Considering the typical sources of heterogeneity in HPC systems, such as the memory hierarchy, bandwidth limitations and asymmetric physical connection, it proposes an application-specific automatic data partitioning method for a modern supercomputer, and a topological-ranking heuristic-based schedule for a multi-FPGA based reconfigurable cluster.
Experiments on both a full system simulator (GEM5) and real systems (Sunway Taihulight Supercomputer and Xilinx Multi-FPGA based clusters) demonstrate the significant advantages of the suggested approaches compared against the state-of-the-art on a variety of workloads.

In silico modelling of in-host tuberculosis dynamics : towards building the virtual patient

Pitcher, Michael John — 2020-06-24T00:00:00Z

Tuberculosis (TB) accounts for over 1 million deaths each year, despite effective treatment regimens being available. Improving the treatment of TB will require new regimens, each of which will need to be put through expensive and lengthy clinical trials, with no guarantee of success. The ability to predict which of many novel regimens to progress through the clinical trial stages would be an important tool for TB research, as current models are constrained in their ability to reflect the whole spectrum of pathophysiology, particularly as there remains uncertainty around the events that occur. This thesis explores the use of computational techniques to model a pulmonary human TB infection. We introduce the first in silico model of TB occurring over the whole lung that incorporates both the environmental heterogeneity that is exhibited within different spatial regions of the organ, and the different bacterial dissemination routes, in order to understand how bacteria move during infection and why post-primary disease is typically localised towards the apices of the lung. Our results show that including environmental heterogeneity within the lung can have profound effects on the results of an infection, by creating a region towards the apex which is preferential for bacterial proliferation. We also present a further iteration of the model, whereby the environment is made more granular to better understand the regions which are afflicted during infection, and show how sensitivity analysis can determine the factors that contribute most to disease outcomes. We show that in order to simulate TB disease within a human lung with sufficient accuracy, better understanding of the dynamics is required. The model presented in this thesis is intended to provide insight into these complicated dynamics, and thus make progress towards an end goal of a virtual clinical trial, consisting of a heterogeneous population of synthetic virtual patients.

Project management in social data science : integrating lessons from research practice and software engineering

Lvov, Ilia — 2019-12-03T00:00:00Z

Online platforms, transaction processing systems, mobile sensors and other novel sources of data have shaped many areas of social research. The emerging discipline of social data science is subject to questions of epistemology, politics, ethics and responsibility, while the practice of doing social data science raises significant project management issues that include logistics, team communication, software system integration and stakeholder engagement. Keeping track of such a multitude of individual concerns while maintaining an overview of a social data science project as a whole is not trivial. This calls for provision of appropriate guidance for holistic project management. The project management issues in social data science are strikingly similar to those arising in software engineering. In this thesis, I adapt a particular software engineering project management tool – the SEMAT Essence model (Jacobson et al., 2013) – to the needs of social data science. This model offers a holistic management approach by addressing key project aspects, including the often overlooked yet crucially important ones such as maintaining stakeholder engagement and establishing the ways of working. The SEMAT Essence is a progress tracking model and does not assume any specific work process, which is valuable given the great diversity of social data science projects. To achieve this goal, I study the practice of doing social data science through participant observation of social data science projects and by providing ethnographic accounts for those. Using the ethnographic findings and the basic content and structure of the SEMAT model, I develop the Social Science Scorecard Deck – an agile project management tool for social data science. To assess the Scorecard Deck, I use the tool in management of a social data science project and then subject the tool to external validation by interviewing experts in social data science.

Investigation of keyboard and speech based text entry on mobile devices

Reyal, Shyam Mehraaj — 2019-06-26T00:00:00Z

This work presents four in-depth empirical investigations on the performance and user experience of three popular mainstream mobile text entry methods: Touch Typing on a Software Keyboard (STK), Gesture Typing on a Software Keyboard (SGK), and Speech Based Text Entry. The first and third studies are lab-based longitudinal text entry experiments. In the second and fourth studies, we use a new text entry evaluation methodology based on the experience sampling method (ESM). In the ESM based studies, participants installed an Android app on their own mobile phones that periodically sampled their text entry performance and user experience amid their everyday activities for four weeks. The studies show that text can be entered at an average speed of 24 to 41 WPM using software keyboards, and 49 to 59 WPM using speech, depending on the method and the user's experience, with 0.9% to 3.6% character error rates remaining for software keyboard and 3.0% to 5.8% for speech. Error rates of SGK and speech based input are a major challenge, and reducing out-of-vocabulary errors is particularly important. Both typing and speech have strengths, weaknesses, and different individual awareness and preferences. Two-thumb touch typing in a focused setting is particularly effective on STK, whereas one-handed SGK typing with the thumb is particularly effective in more mobile situations. Speech is more effective when convenience and constraints take priority, whereas typing is preferable in public – due to social concerns, network latency issues and background noise. When exposed, users showed a trend to migrate from STK to SGK. We also conclude that studies in the lab and in the wild can both be informative in revealing different aspects of keyboard and speech based text entry, but using them in conjunction is more reliable for comprehensively assessing input technologies of current and future generations.
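
For readers unfamiliar with the two measures quoted above, the sketch below computes them using conventions that are common in text entry research: entry speed in words per minute with a "word" taken as five characters, and character error rate as the minimum string (Levenshtein) distance between presented and transcribed text divided by the longer length. These definitions are an assumption; the abstract does not state exactly how the thesis computes either measure, and the function names are illustrative.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def words_per_minute(transcribed, seconds):
    """Entry speed, counting five characters as one word. The first
    character is excluded because timing starts when it is entered."""
    return (len(transcribed) - 1) * 60.0 / (5.0 * seconds)


def character_error_rate(presented, transcribed):
    """Edit distance between presented and transcribed text, normalised
    by the length of the longer string."""
    longest = max(len(presented), len(transcribed))
    if longest == 0:
        return 0.0
    return levenshtein(presented, transcribed) / longest


speed = words_per_minute("hello world", 6.0)            # 11 characters in 6 s
error = character_error_rate("the quick", "the quik")   # one missing character
```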

Jeeves : a blocks-based approach to end-user development of experience sampling apps

Rough, Daniel John — 2018-12-06T00:00:00Z

Professional programmers are significantly outnumbered by end-users of software, and cannot possibly predict the diverse and dynamic needs of user groups in advance. This thesis is concerned with the provision of an end-user development (EUD) approach, allowing end-users to independently create and modify their own software. EUD activities are particularly applicable to the work practices of psychology researchers and clinicians, who are increasingly dependent on software for assessment of participants and patients, but must also depend on developers to realise their requirements. This thesis targets these professionals, with an EUD solution to creating assessment software. The Experience Sampling Method (ESM) is one such means of assessment that takes place in participants’ everyday lives. Through regular completion of subjective self-reports, participants provide rich detail of their ongoing physical and emotional well-being. However, lack of engagement with such studies remains a prevalent issue. This thesis investigates features for maximising engagement with experience sampling smartphone apps. Such apps are becoming accepted as standard practice for remote assessment, but researchers are stifled by the complexity and cost of implementation. Moreover, existing EUD tools are insufficient for development of ESM apps that include engaging features. This thesis presents the development of Jeeves, an EUD tool with a blocks-based programming paradigm that empowers non-programmers to rapidly develop tailored, context-sensitive ESM apps. The adoption of Jeeves is contingent on a number of factors, including its ease-of-use, real-world utility, and organisational conditions. Failure to incorporate the necessary functionality pertaining to these factors into Jeeves will lead to abandonment. This thesis is concerned with establishing the usability, utility, and external factors necessary for adoption of Jeeves.
Further, Jeeves is evaluated with respect to these factors through a series of rigorous studies from a range of application domains.

Adaptivity of 3D web content in web-based virtual museums : a quality of service and quality of experience perspective

Bakri, Hussein — 2019-12-03T00:00:00Z

The 3D Web emerged as an agglomeration of technologies that brought the third dimension to the World Wide Web. Its forms ranged from systems with limited 3D capabilities to complete and complex Web-Based Virtual Worlds. The advent of the 3D Web provided great opportunities to museums by giving them an innovative medium to disseminate collections' information and associated interpretations in the form of digital artefacts and virtual reconstructions, thus leading to a revolutionary new approach to cultural heritage curation, preservation and dissemination, and thereby reaching a wider audience. This audience consumes 3D Web material on a myriad of devices (mobile devices, tablets and personal computers) and network regimes (WiFi, 4G, 3G, etc.). Choreographing and presenting 3D Web components across all these heterogeneous platforms and network regimes presents a significant challenge yet to be overcome. The challenge is to achieve a good user Quality of Experience (QoE) across all these platforms. This means that different levels of fidelity of media may be appropriate. Therefore, servers hosting those media types need to adapt to the capabilities of a wide range of networks and devices. To achieve this, the research contributes the design and implementation of Hannibal, an adaptive QoS & QoE-aware engine that allows Web-Based Virtual Museums to deliver the best possible user experience across those platforms. In order to ensure effective adaptivity of 3D content, this research furthers the understanding of the 3D Web in terms of Quality of Service (QoS), through empirical investigations studying how 3D Web components perform and what their bottlenecks are, and in terms of QoE, by studying the subjective perception of fidelity of 3D Digital Heritage artefacts. Results of these experiments lead to the design and implementation of Hannibal.

Corpus linguistics for the exploration of legal precedent

Brown, Evan David — 2019-07-17T00:00:00Z

A deterioration of legal research skills has become a critical issue for lawyers. This thesis examines the causes of the problem under English law and specifically addresses how current technology for legal research contributes to or ameliorates the skills deficit. England has a "common law" system where the state of the law is determined by precedents laid down in previous cases. This linkage between cases means that lawyers have to assimilate a large amount of written language in order to understand the key elements of prior judgements. There are approximately two million criminal cases and another two million civil cases heard in England each year, from which the important decisions are reported. This thesis first analyses the way in which lawyers work in conducting legal research. The findings indicate that the move online has improved access to legal information but it has compromised the ability of practitioners to identify high-quality precedent and to discard information which is not relevant. A side-effect is the marginalisation of trained law librarians and their curation skills which contributes to the problem. Existing platforms prioritise comprehensive data coverage over delivering a curated research environment. A need to better train lawyers in the skills of critical thinking and linguistic analysis through computer-based tools is identified. This thesis proposes that linguistic analysis techniques from the domain of corpus linguistics can help. Single-context legal research tools which minimise the need to switch between different applications are also required. Effective working is currently compromised by having to navigate between many different tools. The development of effective collaboration skills is of particular importance. Lawyers must work well in teams in order to prepare cases effectively. The Legal Research and Collaboration platform is the prototype application which results from this research. 
It is a software system for legal research within teams of lawyers. Experiments establish how effectively the platform (LARC) works for both practising lawyers and law students. A foundation for future work is laid because the software is entirely open source and is based upon open access legal data. The results show that critical barriers which result in poor legal research skills can be ameliorated by well-designed computer-based tools.

Designing Spatially-Aware Indoor Visual Interfaces and Systems

Dostál, Jakub — 2016-06-22T00:00:00Z

The environments in which people interact with displays and other devices are changing. Interactions are no longer constrained by displays being tethered to a desk. As the variety and complexity of interactive environments increase, so does the importance of spatial aspects of interactions and the physical and visual constraints of people and other interactive entities. This thesis examines spatial relationships between entities and other characteristics of interactions through the lens of the Interaction Relationship Entity model, also introduced here. Moreover, the thesis demonstrates the viability of low-cost, high-availability hardware and software for exploration of novel interactive systems through a set of algorithms that can be used for spatial tracking. The presented work also includes three case studies, each of which explores different aspects of spatial interactions with displays. The first case study investigates the use of displays capable of simultaneously showing two different views from different angles for creating spatial interactions that do not require active tracking. The second case study explores dynamic manipulation of on-display content and prototyping spatial interactions with large displays. The third case study considers how visual changes on displays in a multi-display environment can be tracked during periods of inattention.

Applying named data networking in mobile ad hoc networks

Perez Aruni, Percy Dante — 2019-01-01T00:00:00Z

This thesis presents the Name-based Mobile Ad-hoc Network (nMANET) approach to content distribution, which ensures and enables responsible research on applying the named data networking protocol in mobile ad-hoc networks. The test framework of the nMANET approach allows reproducibility of experiments and validation of expected results based on analysis of experimental data. The area of application for nMANETs is the distribution of humanitarian information in emergency scenarios. Named-Data Networking (NDN) and ad-hoc mobile communication allow exchange of emergency information in situations where central services such as cellular towers and electric systems are disrupted. The implemented prototype enables researchers to reproduce experiments on content distribution that consider constraints on mobile resources, such as the remaining power of mobile devices and available network bandwidth. The nMANET framework validates a set of experiments by measuring network traffic and energy consumption from both real mobile devices and those in a simulated environment. Additionally, this thesis presents results from experiments in which the nMANET forwarding strategies and traditional wireless services, such as hotspot, are analysed and compared. This experimental data represents the evidence that supports and validates the methodology presented in this thesis. The design and implementation of an nMANET prototype, the Java NDN Forwarder Daemon (JNFD), is presented as a testing framework, which follows the principles of continuous integration, continuous testing and continuous deployment. This testing framework is used to validate JNFD and IP-based technologies, such as HTTP in a MANET using the OLSR routing protocol, as well as traditional infrastructure-mode wireless. The set of experiments executed in a small network of Android smartphones connected in ad-hoc mode and in a virtual ad-hoc network simulator shows the advantages of reproducibility using nMANET features. 
JNFD is open source, and all experiments are scripted, repeatable and scalable. Additionally, JNFD utilises real GPS traces to simulate mobility of nodes during experiments. This thesis provides experimental evidence to show that nMANET allows reproducibility and validation of a wide range of future experiments applying NDN on MANETs.

Verified programming with explicit coercions

Schwaab, Christopher — 2019-06-26T00:00:00Z

Type systems have proved to be a powerful means of specifying and proving important program invariants. In dependently typed programming languages, types can depend on values and hence express arbitrarily complicated propositions and their machine-checkable proofs. The type-based approach to program specification not only allows the programmer to transcribe their intentions, but also arranges for their direct involvement in the proving process, thus aiding the machine in its attempt to satisfy difficult obligations. In this thesis we develop a series of patterns for programming in a correct-by-construction style, making use of constraints and coercions to prove properties within a dependently typed host. This allows for the development of a verified kernel which can be built upon using the host system features. In particular, this should allow for the development of “tactics” or semi-automated solvers invoked when coercing types, all within a single language. The efficacy of this approach is given by the development of a system of expressions indexed by their, exposing a case analysis feature serving to generate value constraints. These constraints are directly reflected into the host, allowing for their involvement in the type-checking process. A motivating use case of this design shows how a term’s semantic index information admits an exact, formalized cost analysis amenable to reasoning within the host. Finally, we show how such a system is used to identify unreachable dead code, trivially admitting the design and verification of an SSA-style compiler with this optimization. We think such a design of explicitly proving the local correctness of type transformations in the presence of accumulated constraints can form the basis of a flexible language in concert with a variety of trusted solvers.

A seamless framework for formal reasoning on specifications : model derivation, verification and comparison

Mendoza Santana, Juan Jose — 2019-06-26T00:00:00Z

While formal methods have been demonstrated to be favourable to the construction of reliable systems, they also present us with several limitations. Most of the efforts regarding formal reasoning are concerned with model correctness for critical systems, while other properties, including model validity, have seen little development, especially in the context of non-critical systems. We set out to advance model validation by relating a software model with the corresponding requirements it is intended to capture. This requires us to express both requirements and models in a common formal language, which in turn will enable not only model validation, but also model generation and comparison. We present a novel framework (TOMM) that integrates the formalization of class diagrams and requirements, along with a set of formal theories to validate, infer, and compare class models. We introduce SpeCNL, a controlled, domain-independent subset of English sentences, and a document structure named ConSpec. The combination of both allows us to express and formalize functional requirements related to class models. Our formal framework is accompanied by a proof-of-concept tool that integrates language and image processing libraries, as well as formal methods, to aid the usage and evaluation of our theories. In addition, we provide an implementation that performs partial extraction of relevant information from the graphical representations of class diagrams. Though different approaches to model validation exist, they assume the existence of formal specifications for the model to be checked. In contrast, our approach has been shown to deal with informal specifications and seamlessly validate, generate and compare class models.

Defence against Denial of Service (DoS) attacks using Identifier-Locator Network Protocol (ILNP) and Domain Name System (DNS)

Shehzad, Khawar — 2019-06-26T00:00:00Z

This research considered a novel approach to network security by combining a new networking architecture based on the Identifier-Locator Network Protocol (ILNP) and the existing Domain Name System (DNS). Specifically, the investigations considered the mitigation of network-level and transport-level Denial of Service (DoS) attacks. The solutions presented for DoS are applicable to secure servers that are visible externally from an enterprise network. DoS was chosen as an area of concern because in recent years DoS attacks have become the most common attacks and are hard to defend against. The novelty of this approach was to consider the way the DNS and ILNP can work together, transparently to the application, within an enterprise scenario. This was achieved by the introduction of a new application-level access control function, the Capability Management System (CMS), which applies configuration at the application level (DNS data) and network level (ILNP namespaces). CMS provides dynamic, ephemeral identity and location information to clients and servers, in order to effectively partition legitimate traffic from attack traffic. This was achieved without modifying existing network components such as switches and routers, and by making standard use of existing functions, such as access control lists and DNS servers, all within a single trust domain that is under the control of the enterprise. The prime objectives of this research were:
• to defend against DoS attacks with the use of naming and DNS within an enterprise scenario.
• to increase the attacker’s effort in launching a successful DoS attack.
• to reduce the visibility of vulnerabilities that can be discovered by an attacker by active probing approaches.
• to practically demonstrate the effectiveness of ILNP and DNS working together to provide a solution for DoS mitigation.
The solution methodology is based on the use of network and transport level capabilities, dynamic changes to DNS data, and a Moving Target Defence (MTD) paradigm. Three solutions are presented which use ILNP namespaces. These solutions are referred to as identifier-based, locator-based, and combined identifier-locator based solutions, respectively. ILNP-based node identity values were used to provide transport-level per-client server capabilities, thereby providing per-client isolation of traffic. ILNP locator values were used to provide network-level traffic separation for externally accessible enterprise services. Then, the identifier and locator solutions were combined, showing the possibility of protecting the services with per-client traffic control and topological traffic path separation. All solutions were site-based solutions and did not require any modification in the core/external network, or the active cooperation of an ISP, therefore limiting the trust domain to the enterprise itself. Experiments were conducted to evaluate all the solutions on a test-bed consisting of off-the-shelf hardware, open-source software, and an implementation of the CMS written in C, all running on Linux. The discussion considers the efficacy of the solutions, comparisons with existing methods, the performance of each solution, and a critical analysis highlighting future improvements that could be made.

Transforming the museum-community nexus with technology : a virtual museum infrastructure for participatory engagement and management

Fabola, Adeola Ezekiel — 2018-12-06T00:00:00Z

Museums play an important role in society as the custodians of heritage, and advances in technology have brought about opportunities for curating, preserving and disseminating heritage through virtual museums. However, this is not matched by an understanding of how these technologies can support these functions, especially given the varying levels of resources that museums have at their disposal. To address this problem, a hybrid methodology which combines underpinning theory and practice has been adopted. Initial investigation of the problem takes place through a contextualisation of museology and heritage studies, followed by exploratory case studies that yield design objectives for a Virtual Museum Infrastructure (VMI). A design of the VMI is proposed based on these objectives, and the VMI is instantiated, deployed and evaluated in real-world scenarios using a combination of quantitative and qualitative techniques. The findings of this investigation demonstrate that the use of technology provides new opportunities for engagement with heritage, as experts and community members alike can create, curate and preserve content, which can then be disseminated in engaging ways using immersive, yet affordable technologies. This work therefore demonstrates how technology can be used to: (1) support museums in the creation, curation, preservation and dissemination of heritage, through a VMI that provides support for all the stages of the media life cycle, (2) facilitate active use, so that content that is created once can be reused on multiple platforms (for example on the web, on mobile apps and in on-site installations), and (3) encourage connectivity by linking up local museums using a location-aware interface and facilitating the consumption of content using digital literacies available to the public. The aforementioned points, coupled with the system instantiations that demonstrate them, represent the contributions of this thesis.

Detecting cloud virtual network isolation security for data leakage

Al Nasseri, Haifa Mohamed — 2019-06-26T00:00:00Z

This thesis considers information leakage in cloud virtually isolated networks. Virtual Network (VN) Isolation is a core element of cloud security yet research literature shows that no experimental work, to date, has been conducted to test, discover and evaluate VN isolation data leakage. Consequently, this research focussed on that gap. Deep Dives of the cloud infrastructures were performed, followed by (Kali) penetration tests to detect any leakage. This data was compared to information gathered in the Deep Dive, to determine the level of cloud network infrastructure being exposed. As a major contribution to research, this is the first empirical work to use a Deep Dive approach and a penetration testing methodology applied to both CloudStack and OpenStack to demonstrate cloud network isolation vulnerabilities. The outcomes indicated that Cloud manufacturers need to test their isolation mechanisms more fully and enhance them with available solutions. However, this field needs more industrial data to confirm if the found issues are applicable to non-open source cloud technologies. If the problems revealed are widespread then this is a major issue for cloud security. Due to the time constraints, only two cloud testbeds were built and analysed, but many potential future works are listed for analysing more complicated VN, analysing leveraged VN plugins and testing if system complexity will cause more leakage or protect the VN. This research is one of the first empirical building blocks in the field and gives future researchers the basis for building their research on top of the presented methodology and results and for proposing more effective solutions.

Full coverage displays for non-immersive applications

Petford, Julian — 2019-06-26T00:00:00Z

Full Coverage Displays (FCDs), which cover the interior surface of a room with display pixels, can create novel user interfaces taking advantage of natural aspects of human perception and memory which we make use of in our everyday lives. However, past research has generally focused on FCDs for immersive experiences, the required hardware is generally prohibitively expensive for the average potential user, configuration is complicated for developers and end users, and building applications which conform to the room layout is often difficult. The goals of this thesis are: to create an affordable, easy to use (for developers and end users) FCD toolkit for non-immersive applications; to establish efficient pointing techniques in FCD environments; and to explore suitable ways to direct attention to out-of-view targets in FCDs. In this thesis I initially present and evaluate my own "ASPECTA Toolkit" which was designed to meet the above requirements. Users during the main evaluation were generally positive about their experiences, all completing the task in less than three hours. Further evaluation was carried out through interviews with researchers who used ASPECTA in their own work. These revealed similarly positive results, with feedback from users driving improvements to the toolkit. For my exploration into pointing techniques, Mouse and Ray-Cast approaches were chosen as most appropriate for FCDs. An evaluation showed that the Ray-Cast approach was fastest overall, while a mouse-based approach showed a small advantage in the front hemisphere of the room. For attention redirection I implemented and evaluated a set of four visual techniques. The results suggest that techniques which are static and lead all the way to the target may have an advantage and that the cognitive processing time of a technique is an important consideration.

Simplified cloud instance selection

Boonprasop, Chalee — 2018-04-23T00:00:00Z

Cloud computing delivers computational services to anyone over the internet. The cloud providers offer these services through a simplified billing model where customers can rent services based on the types of computing power they require. However, given the vast choice, it is complicated for a user to select the optimal instance types for a given workload or application. In this paper, we propose a user-friendly cloud instance recommendation system which, given a set of weighted coefficients representing the relevance of CPU, memory, storage and network along with a price, recommends the best performing instances. The system only requires provider-specified data about instance types and doesn’t require costly cloud benchmarking. We evaluate our approach on Microsoft Azure across a number of different common workload types.
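A minimal sketch of the weighted recommendation idea described in this abstract, assuming a simple score: a weighted sum of normalised resource specifications divided by price. The scoring formula, the instance names and all figures in `CATALOGUE` are hypothetical illustrations, not the method or data evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    cpu: float      # vCPUs
    memory: float   # GiB
    storage: float  # GiB
    network: float  # Gbit/s
    price: float    # cost per hour


RESOURCES = ("cpu", "memory", "storage", "network")


def recommend(instances, weights, top_n=3):
    """Rank instances by weighted, normalised resources per unit price."""
    # Normalise each resource by its maximum so that units do not dominate.
    maxima = {r: max(getattr(i, r) for i in instances) for r in RESOURCES}

    def score(inst):
        weighted = sum(
            weights.get(r, 0.0) * getattr(inst, r) / maxima[r] for r in RESOURCES
        )
        return weighted / inst.price

    return sorted(instances, key=score, reverse=True)[:top_n]


# Hypothetical catalogue; real figures would come from provider-published data.
CATALOGUE = [
    Instance("small", cpu=2, memory=4, storage=50, network=1, price=0.10),
    Instance("compute", cpu=8, memory=8, storage=50, network=2, price=0.30),
    Instance("memory", cpu=4, memory=32, storage=100, network=2, price=0.35),
]

if __name__ == "__main__":
    cpu_heavy = {"cpu": 0.7, "memory": 0.1, "storage": 0.1, "network": 0.1}
    for inst in recommend(CATALOGUE, cpu_heavy):
        print(inst.name)
```

Dividing by price is only one way to bring cost into the ranking; a price cap applied as a filter before scoring would be an equally plausible reading of "along with a price".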

Anomaly-based network intrusion detection enhancement by prediction threshold adaptation of binary classification models

Al Tobi, Amjad Mohamed — 2018-10-19T00:00:00Z

Network traffic exhibits a high level of variability over short periods of time. This variability impacts negatively on the performance (accuracy) of anomaly-based network Intrusion Detection Systems (IDS) that are built using predictive models in a batch-learning setup. This thesis investigates how adapting the discriminating threshold of model predictions, specifically to the evaluated traffic, improves the detection rates of these Intrusion Detection models. Specifically, this thesis studied the adaptability features of three well known Machine Learning algorithms: C5.0, Random Forest, and Support Vector Machine. The ability of these algorithms to adapt their prediction thresholds was assessed and analysed under different scenarios that simulated real world settings using the prospective sampling approach. A new dataset (STA2018) was generated for this thesis and used for the analysis. This thesis has demonstrated empirically the importance of threshold adaptation in improving the accuracy of detection models when training and evaluation (test) traffic have different statistical properties. Further investigation was undertaken to analyse the effects of feature selection and data balancing processes on a model’s accuracy when evaluation traffic with different significant features was used. The effects of threshold adaptation on reducing the accuracy degradation of these models were statistically analysed. The results showed that, of the three compared algorithms, Random Forest was the most adaptable and had the highest detection rates. This thesis then extended the analysis to apply threshold adaptation on sampled traffic subsets, by using different sample sizes, sampling strategies and label error rates. This investigation showed the robustness of the Random Forest algorithm in identifying the best threshold. 
The Random Forest algorithm only needed a sample that was 0.05% of the original evaluation traffic to identify a discriminating threshold with an overall accuracy rate of nearly 90% of the optimal threshold.
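The idea of adapting a discriminating threshold to the evaluated traffic can be sketched as follows. This is an illustration only, assuming a model that outputs an attack score in [0, 1] and a small labelled sample of evaluation traffic; the function names, the exhaustive candidate search and the numbers in `SCORES` and `LABELS` are hypothetical and are not taken from the thesis or from STA2018.

```python
def accuracy(scores, labels, threshold):
    """Fraction of samples classified correctly when score >= threshold means 'attack'."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def adapt_threshold(scores, labels):
    """Pick the candidate threshold that maximises accuracy on a labelled sample."""
    # Every observed score is a candidate; infinity means "never flag anything".
    candidates = sorted(set(scores)) + [float("inf")]
    # max() keeps the first of equally good candidates, i.e. the lowest threshold.
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Hypothetical scores from a model whose outputs have drifted downwards on
# the evaluation traffic, so the default 0.5 threshold misses some attacks.
SCORES = [0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.6, 0.7]
LABELS = [0, 0, 0, 1, 1, 1, 1, 1]  # 1 = attack, 0 = normal

if __name__ == "__main__":
    adapted = adapt_threshold(SCORES, LABELS)
    print("default threshold accuracy:", accuracy(SCORES, LABELS, 0.5))
    print("adapted threshold:", adapted)
    print("adapted threshold accuracy:", accuracy(SCORES, LABELS, adapted))
```

In practice the labelled sample would be a small subset of the evaluation traffic, and the threshold found on it would then be applied to the remainder.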

Supervisor recommendation tool for Computer Science projects

Zemaityte, Gintare — 2019-01-09T00:00:00Z

In most Computer Science programmes, students are required to undertake an individual project under the guidance of a supervisor during their studies. With increasing student numbers, matching students to suitable supervisors is becoming an increasing challenge. This paper presents a software tool which assists Computer Science students in identifying the most suitable supervisor for their final year project. It does this by matching a list of keywords or a project proposal provided by the students to a list of keywords which were automatically extracted from freely available data for each potential supervisor. The tool was evaluated using both manual and user testing, with generally positive results and user feedback. 83% of respondents agree that the current implementation of the tool is accurate, with 67% saying it would be a useful tool to have when looking for a supervisor. The tool is currently being adapted for wider use in the School.
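A minimal sketch of keyword-based matching, assuming Jaccard similarity between the student's keywords and each supervisor's keyword list. The similarity measure, the function names and the supervisor data are hypothetical; the abstract does not say how the tool scores matches or how it extracts keywords.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_supervisors(student_keywords, supervisors):
    """Return supervisor names ordered by keyword overlap, best match first."""
    query = {k.lower() for k in student_keywords}
    scored = [
        (jaccard(query, {k.lower() for k in keywords}), name)
        for name, keywords in supervisors.items()
    ]
    # Sort by descending score, then by name so that ties are ordered predictably.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored if score > 0]


# Hypothetical keyword lists; the tool extracts these from freely available data.
SUPERVISORS = {
    "Supervisor A": ["machine learning", "networks", "security"],
    "Supervisor B": ["type theory", "functional programming"],
    "Supervisor C": ["HCI", "machine learning"],
}

if __name__ == "__main__":
    print(rank_supervisors(["Machine Learning", "HCI"], SUPERVISORS))
```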

Proof-carrying plans

Schwaab, Christopher Joseph — 2019-01-01T00:00:00Z

It is becoming increasingly important to verify safety and security of AI applications. While declarative languages (of the kind found in automated planners and model checkers) are traditionally used for verifying AI systems, a big challenge is to design methods that generate verified executable programs. A good example of such a “verification to implementation” cycle is given by automated planning languages like PDDL, where plans are found via a model search in a declarative language, but then interpreted or compiled into executable code in an imperative language. In this paper, we show that this method can itself be verified. We present a formal framework and a prototype Agda implementation that represent PDDL plans as executable functions that inhabit types that are given by formulae describing planning problems. By exploiting the well-known Curry-Howard correspondence, type-checking then automatically ensures that the generated program corresponds precisely to the specification of the planning problem.

The Sea of Stuff: a model to manage shared mutable data in a distributed environment

Conte, Simone Ivan — 2019-06-26T00:00:00Z

Managing data is one of the main challenges in distributed systems and computer science in general. Data is created, shared, and managed across heterogeneous distributed systems of users, services, applications, and devices without a clear and comprehensive data model. This technological fragmentation and lack of a common data model result in a poor understanding of what data is, how it evolves over time, how it should be managed in a distributed system, and how it should be protected and shared. From a user perspective, for example, backing up data over multiple devices is a hard and error-prone process, or synchronising data with a cloud storage service can result in conflicts and unpredictable behaviours. This thesis identifies three challenges in data management: (1) how to extend the current data abstractions so that content, for example, is accessible irrespective of its location, versionable, and easy to distribute; (2) how to enable transparent data storage relative to locations, users, applications, and services; and (3) how to allow data owners to protect data against malicious users and automatically control content over a distributed system. These challenges are studied in detail in relation to the current state of the art and addressed throughout the rest of the thesis. The artefact of this work is the Sea of Stuff (SOS), a generic data model of immutable self-describing location-independent entities that allow the construction of a distributed system where data is accessible and organised irrespective of its location, easy to protect, and can be automatically managed according to a set of user-defined rules. The evaluation of this thesis demonstrates the viability of the SOS model for managing data in a distributed system and using user-defined rules to automatically manage data across multiple nodes.

Employing domain specific discriminative information to address inherent limitations of the LBP descriptor in face recognition

Fan, Junjie — 2018-10-15T00:00:00Z

The local binary pattern (LBP) descriptor and its derivatives have a demonstrated track record of good performance in face recognition. Nevertheless, the original descriptor, the framework within which it is employed, and the aforementioned improvements of these in the existing literature all suffer from a number of inherent limitations. In this work we highlight these and propose novel ways of addressing them in a principled fashion. Specifically, we introduce (i) gradient-based weighting of local descriptor contributions to region-based histograms as a means of avoiding data smoothing by non-discriminative image loci, and (ii) Gaussian fuzzy region membership as a means of achieving robustness to registration errors. Importantly, the nature of these contributions allows the proposed techniques to be combined with the existing extensions to the LBP descriptor, thus making them universally recommendable. Effectiveness is demonstrated on the notoriously challenging Extended Yale B face corpus.
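As a rough illustration of contribution (i), the sketch below computes the basic 8-neighbour LBP code and a histogram in which each pixel votes with its gradient magnitude rather than with a count of one. The central-difference gradient and the choice of magnitude as the weight are assumptions made for this example; the abstract does not give the exact weighting, and the Gaussian fuzzy region membership of contribution (ii) is not sketched here.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def gradient_magnitude(image, r, c):
    """Central-difference gradient magnitude at the interior pixel (r, c)."""
    gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
    gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
    return (gx * gx + gy * gy) ** 0.5


def weighted_lbp_histogram(image):
    """256-bin LBP histogram where each pixel votes with its gradient magnitude."""
    hist = [0.0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += gradient_magnitude(image, r, c)
    return hist
```

With this weighting, pixels in flat image regions contribute nothing to the histogram, whereas in an unweighted histogram they would count as much as pixels on an edge.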

Enabling single-handed interaction in mobile and wearable computing

Yeo, Hui Shyong — 2018-10-11T00:00:00Z

Mobile and wearable computing are increasingly pervasive as people carry and use personal devices in everyday life. Screen sizes of such devices are becoming larger and smaller to accommodate both intimate and practical uses. Some mobile device screens are becoming larger to accommodate new experiences (e.g., phablet, tablet, eReader), whereas screen sizes on wearable devices are becoming smaller to allow them to fit into more places (e.g., smartwatch, wrist-band and eye-wear). However, these trends are making it difficult to use such devices with only one hand due to their placement, limited thumb reach and the fat-finger problem. This is especially true as there are many occasions when a user’s other hand is occupied (encumbered) or not available. This thesis work explores, creates and studies novel interaction techniques that enable effective single-hand usage on mobile and wearable devices, empowering users to achieve more with their smart devices when only one hand is available.

Diversity computing

Fletcher-Watson, Sue — 2018-08-22T00:00:00Z

Teaching data ethics : "We're going to ethics the heck out of this"

Henderson, Tristan — 2019-01-09T00:00:00Z

This paper outlines a new Data Ethics & Privacy module that was introduced to computer science students in 2018. The module aims to raise student awareness of current debates in computer science such as bias in artificial intelligence, algorithmic accountability, filter bubbles and data protection, and practical mechanisms for addressing these issues. To do this, the module includes interdisciplinary content from ethics, law and computer science, and also adopts some teaching methods from the law. I describe the format of the module, challenges with module design and approval, some initial comments on the first year’s cohort, and plans for future improvements. I believe that the topic is currently important and this discussion might be of interest to other computer science departments considering the introduction of similar content.

Automatic generation of proof terms in dependently typed programming languages

Slama, Franck — 2018-09-17T00:00:00Z

Dependent type theories are a kind of mathematical foundation investigated both for the formalisation of mathematics and for reasoning about programs. They are implemented as the kernel of many proof assistants and programming languages with proofs (Coq, Agda, Idris, Dedukti, Matita, etc.). Dependent types allow the universal and existential quantifications of higher-order logics to be encoded elegantly and constructively, and are therefore well suited to writing logical propositions and proofs. However, their usage is not limited to the area of pure logic. Indeed, some recent work has shown that they can also be powerful for driving the construction of programs. Using more precise types not only helps to gain confidence about the program built, but can also help its construction, giving rise to a new style of programming called Type-Driven Development. However, one difficulty with reasoning and programming with dependent types is that proof obligations arise naturally once programs become even moderately sized. For example, implementing an adder for binary numbers indexed over their natural number equivalents naturally leads to proof obligations for equalities of expressions over natural numbers. The need for these equality proofs comes, in intensional type theories (like CIC and ML), from the fact that in a non-empty context, the propositional equality allows us to prove as equal (with the induction principles) terms that are not judgementally equal, which implies that the typechecker can't always obtain equality proofs by reduction. As far as possible, we would like to solve such proof obligations automatically, and we absolutely need to do so if we want dependent types to be used more broadly, and perhaps one day to become the standard in functional programming. In this thesis, we show one way to automate these proofs by reflection in the dependently typed programming language Idris. 
However, the method that we follow is independent from the language being used, and this work could be reproduced in any dependently-typed language. We present an original type-safe reflection mechanism, where reflected terms are indexed by the original Idris expression that they represent, and show how it allows us to easily construct and manipulate proofs. We build a hierarchy of correct-by-construction tactics for proving equivalences in semi-groups, monoids, commutative monoids, groups, commutative groups, semi-rings and rings. We also show how each tactic reuses those from simpler structures, thus avoiding duplication of code and proofs. Finally, and as a conclusion, we discuss the trust we can have in such machine-checked proofs.
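The gap between judgemental and propositional equality mentioned in this abstract can be illustrated with a two-line example. It is written in Lean 4 rather than Idris, purely as an assumption for illustration, and it is not taken from the thesis. In a language whose addition recurses on the other argument, the roles of the two statements are swapped.

```lean
-- Holds by computation: `Nat.add` recurses on its second argument,
-- so `n + 0` reduces to `n` even when `n` is a variable.
example (n : Nat) : n + 0 = n := rfl

-- Does not hold by computation for a variable `n`: reduction is stuck.
-- It needs an inductive proof, here the core lemma `Nat.zero_add`.
example (n : Nat) : 0 + n = n := Nat.zero_add n
```

Obligations of the second kind are the ones a reflective tactic aims to discharge automatically, so that the programmer does not have to supply each lemma by hand.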

AIF-EL - an OWL2-EL-compliant AIF ontology

Cerutti, Federico — 2018-09-11T00:00:00Z

This paper briefly describes AIF-EL, an OWL2-EL compliant ontology for the Argument Interchange Format.

CISpaces.org : from fact extraction to report generation

Cerutti, Federico — 2018-09-11T00:00:00Z

We introduce CISpaces.org, a tool to support situational understanding in intelligence analysis that complements but does not replace human expertise. The system combines natural language processing, argumentation-based reasoning, and natural language generation to produce intelligence reports from social media data, and to record the process of forming hypotheses from relationships among information. In this paper, we show how CISpaces.org meets the desirable requirements elicited from senior professionals, and demonstrate its usage and capabilities to support analysts in delivering effective and tailored intelligence to decision makers.

Querying metric spaces with bit operations

Connor, Richard — 2018-01-01T00:00:00Z

Metric search techniques can be usefully characterised by the time at which distance calculations are performed during a query. Most exact search mechanisms use a “just-in-time” approach where distances are calculated as part of a navigational strategy. An alternative is to use a “one-time” approach, where distances to a fixed set of reference objects are calculated at the start of each query. These distances are typically used to re-cast data and queries into a different space where querying is more efficient, allowing an approximate solution to be obtained. In this paper we use a “one-time” approach for an exact search mechanism. A fixed set of reference objects is used to define a large set of regions within the original space, and each query is assessed with respect to the definition of these regions. Data is then accessed if, and only if, it is useful for the calculation of the query solution. As dimensionality increases, the number of defined regions must increase, but the memory required for the exclusion calculation does not. We show that the technique gives excellent performance over the SISAP benchmark data sets, and most interestingly we show how increases in dimensionality may be countered by relatively modest increases in the number of reference objects used. Funding: This work was supported by ESRC grant ES/L007487/1 “Administrative Data Research Centre—Scotland”.
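A minimal sketch of a "one-time" exact range search with bitwise exclusion, assuming the regions are balls around the reference objects. The abstract does not specify how regions are defined, so the ball regions, the radii, and the names `build_index` and `range_search` are assumptions for illustration. Each data object stores one bit per region, and the triangle inequality is used once per query to decide which regions cannot, or must, contain a solution.

```python
import math


def build_index(data, refs, radii, dist):
    """For each object, set bit i if it lies inside the ball (refs[i], radii[i])."""
    index = []
    for obj in data:
        bits = 0
        for i, (p, r) in enumerate(zip(refs, radii)):
            if dist(obj, p) <= r:
                bits |= 1 << i
        index.append(bits)
    return index


def range_search(query, t, data, index, refs, radii, dist):
    """Exact range search: return all objects within distance t of query."""
    must_be_out = 0  # balls that cannot contain any solution
    must_be_in = 0   # balls that must contain every solution
    for i, (p, r) in enumerate(zip(refs, radii)):
        d = dist(query, p)  # computed once per query ("one-time")
        if d > r + t:
            must_be_out |= 1 << i
        elif d <= r - t:
            must_be_in |= 1 << i
    results = []
    for obj, bits in zip(data, index):
        # Bitwise exclusion: no distance calculation for excluded objects.
        if bits & must_be_out or ~bits & must_be_in:
            continue
        if dist(query, obj) <= t:
            results.append(obj)
    return results


if __name__ == "__main__":
    data = [(0, 0), (1, 1), (5, 0), (9, 0), (10, 1)]
    refs, radii = [(0, 0), (10, 0)], [3, 3]
    index = build_index(data, refs, radii, math.dist)
    print(range_search((0.5, 0), 1.0, data, index, refs, radii, math.dist))
```

The exclusion is safe because an object inside a ball of radius r is more than t away from any query that is more than r + t from the ball's centre, and an object outside the ball is more than t away from any query that is within r - t of the centre.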

Structured arrows : a type-based framework for structured parallelism

Castro, David — 2018-06-27T00:00:00Z

This thesis deals with the important problem of parallelising sequential code. Despite the importance of parallelism in modern computing, writing parallel software still relies on many low-level and often error-prone approaches. These low-level approaches can lead to serious execution problems such as deadlocks and race conditions. Due to the non-deterministic behaviour of most parallel programs, testing parallel software can be both tedious and time-consuming. A way of providing guarantees of correctness for parallel programs would therefore provide significant benefit. Moreover, even if we ignore the problem of correctness, achieving good speedups is not straightforward, since this generally involves rewriting a program to consider a (possibly large) number of alternative parallelisations. This thesis argues that new languages and frameworks are needed. These languages and frameworks must not only support high-level parallel programming constructs, but must also provide predictable cost models for these parallel constructs. Moreover, they need to be built around solid, well-understood theories that ensure that: (a) changes to the source code will not change the functional behaviour of a program, and (b) the speedup obtained by making the necessary changes is predictable. Algorithmic skeletons are parametric implementations of common patterns of parallelism that provide good abstractions for creating new high-level languages, and also support frameworks for parallel computing that satisfy these correctness and predictability requirements. This thesis presents a new type-based framework, based on the connection between structured parallelism and structured patterns of recursion, that provides parallel structures as type abstractions that can be used to statically parallelise a program. 
Specifically, this thesis exploits hylomorphisms as a single, unifying construct to represent the functional behaviour of parallel programs, and to perform correct code rewritings between alternative parallel implementations, represented as algorithmic skeletons. This thesis also defines a mechanism for deriving cost models for parallel constructs from a queue-based operational semantics. In this way, we can provide strong static guarantees about the correctness of a parallel program, while simultaneously achieving predictable speedups.
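To make the notion of a hylomorphism concrete, here is a minimal sketch of merge sort written as an unfold (split) followed by a fold (merge). It is written in Python purely for illustration; the tagged-tuple encoding of the tree functor and the names `hylo`, `fmap_tree`, `split` and `merge` are assumptions of this example, not the thesis's framework.

```python
def hylo(fmap, algebra, coalgebra):
    """Generic hylomorphism: h = algebra . fmap(h) . coalgebra."""
    def h(seed):
        return algebra(fmap(h, coalgebra(seed)))
    return h


def fmap_tree(f, layer):
    """Apply f to the recursive positions of one layer of a binary tree."""
    if layer[0] == "leaf":
        return layer
    _, left, right = layer
    # The two calls are independent of each other, which is the point at which
    # a divide-and-conquer skeleton could evaluate them in parallel.
    return ("node", f(left), f(right))


def split(xs):
    """Coalgebra: unfold a list into one layer of a binary tree."""
    if len(xs) <= 1:
        return ("leaf", xs)
    mid = len(xs) // 2
    return ("node", xs[:mid], xs[mid:])


def merge(layer):
    """Algebra: fold one layer whose children are already sorted lists."""
    if layer[0] == "leaf":
        return layer[1]
    _, left, right = layer
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]


merge_sort = hylo(fmap_tree, merge, split)
```

Swapping `fmap_tree` for a version that evaluates its two children concurrently would change how the computation is scheduled without changing `split` or `merge`, which is the kind of separation between functional behaviour and parallel structure that the abstract describes.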

Biologically inspired vision for human-robot interaction

Saleiro, Mario — 2015-01-01T00:00:00Z

Human-robot interaction is an interdisciplinary research area that is becoming more and more relevant as robots start to enter our homes, workplaces, schools, etc. In order to navigate safely among us, robots must be able to understand human behavior, to communicate, and to interpret instructions from humans, either by recognizing their speech or by understanding their body movements and gestures. We present a biologically inspired vision system for human-robot interaction which integrates several components: visual saliency, stereo vision, face and hand detection and gesture recognition. Visual saliency is computed using color, motion and disparity. Both the stereo vision and gesture recognition components are based on keypoints coded by means of cortical V1 simple, complex and end-stopped cells. Hand and face detection is achieved by using a linear SVM classifier. The system was tested on a child-sized robot.

A parametric spectral model for texture-based salience

Terzić, Kasim — 2015-01-01T00:00:00Z

We present a novel saliency mechanism based on texture. Local texture at each pixel is characterised by the 2D spectrum obtained from oriented Gabor filters. We then apply a parametric model and describe the texture at each pixel by a combination of two 1D Gaussian approximations. This results in a simple model which consists of only four parameters. These four parameters are then used as feature channels and standard Difference-of-Gaussian blob detection is applied in order to detect salient areas in the image, similar to the Itti and Koch model. Finally, a diffusion process is used to sharpen the resulting regions. Evaluation on a large saliency dataset shows a significant improvement of our method over the baseline Itti and Koch model.

Chamber of Ideas 2.0 : a virtual collaborative system for organizational and group workflows of postgraduate students, academic staff, and support staff at the University of St Andrews

Schorr, Scott — 2017-01-01T00:00:00Z

The Chamber of Ideas is a virtual collaborative system designed to enhance the research experience for postgraduate students and academic staff at research universities, and to improve daily workflow efficiencies between researchers and support staff. It builds upon past literature and system development within the fields of e-Science and Computer-Supported Cooperative Work. Research is becoming increasingly interdisciplinary, multi-institutional, and digital, all trends which have contributed to increased levels of collaboration between researchers. This shift toward greater collaboration has been incentivized by host research institutions, public funding bodies, and private sponsors. It has been largely enabled by the presence and rapid growth of the World Wide Web. As a platform, the World Wide Web provides a communication infrastructure capable of linking all researchers from all disciplines from all research institutions across the globe. Yet, a widely-adopted, federated, and ubiquitous Web-based service does not presently exist to satisfy the evolving collaborative workflow needs of today’s researchers. This thesis focuses on the University of St Andrews as a local case-study to present a technical blueprint and project roadmap for the design and introduction of a new system that can fill this niche. Requirements were elicited from university stakeholders regarding organizational workflows for knowledge transfer, research funding, researcher communication with support units, and interdisciplinary research between schools. Primary institutional stakeholders include the Knowledge Transfer Centre, St Leonard's College, Postgraduate Society, and Vice-Principal for Enterprise & Engagement. A prototype was designed and engineered to support user research management, research group coordination, and team project management, incorporating unique sets of collaborative tools for user, group, and work object system perspectives. 
The thesis proposes a new theoretical framework for Large-Scale Complex Research Institutions inspired by LSCIT System and ULS System literature, and introduces concepts of institutional genealogy and social research data for system preservation and curation.

Ten simple rules for measuring the impact of workshops

Sufi, Shoaib — 2018-08-30T00:00:00Z

Workshops are used to explore a specific topic, transfer knowledge, solve identified problems or create something new. In funded research projects and other research endeavours, workshops are the mechanism to gather the wider project, community or interested people together around a particular topic. However, natural questions arise: how do we measure the impact of these workshops? Do we know whether they are meeting the goals and objectives we set for them? What indicators should we use? In response to these questions, this paper will outline rules that will improve the measurement of the impact of workshops. SS, AN, RS, IE, and OP acknowledge the support of EPSRC, BBSRC and ESRC Grant EP/N006410/1 for the UK Software Sustainability Institute, http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/N006410/1. IS, CS, and JdB acknowledge support from Cancer Research UK (grant C5529/A16895).

Automatic generation and selection of streamlined constraint models via Monte Carlo search on a model lattice

Spracklen, Patrick — 2018-01-01T00:00:00Z

Streamlined constraint reasoning is the addition of uninferred constraints to a constraint model to reduce the search space, while retaining at least one solution. Previously it has been established that it is possible to generate streamliners automatically from abstract constraint specifications in Essence and that effective combinations of streamliners can allow instances of much larger scale to be solved. A shortcoming of the previous approach was the crude exploration of the power set of all combinations using depth and breadth first search. We present a new approach based on Monte Carlo search over the lattice of streamlined models, which efficiently identifies effective streamliner combinations. Funding: EPSRC EP/P015638/1.

Solving the threat of LSB steganography within data loss prevention systems

Wang, Yunjia — 2017-01-01T00:00:00Z

With the recent spate of data loss breaches from industry and commerce, especially with the large number of Advanced Persistent Threats, companies are increasing their network boundary security. As network defences are enhanced through the use of Data Loss Prevention systems (DLP), attackers seek new ways of exploiting and extracting confidential data. This is often done by internal parties in large-scale organisations through the use of steganography. The successful utilisation of steganography makes the exportation of confidential data hard to detect, equipped with the ability of escaping even the most sophisticated DLP systems. This thesis provides two effective solutions to prevent data loss from effective LSB image steganographic behaviour, with the potential to be applied in industrial DLP systems.
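For readers unfamiliar with the threat this abstract addresses, the sketch below shows the basic LSB technique on a flat sequence of 8-bit pixel values: each message bit overwrites the least significant bit of one pixel. The final function, `scrub_lsb`, is one generic countermeasure included only for illustration; the abstract does not say what the thesis's two solutions are, so it should not be read as either of them. All function names are assumptions of this example.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of `pixels`, one bit per pixel."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = bytearray(pixels)
    for k, bit in enumerate(bits):
        stego[k] = (stego[k] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read back `n_bytes` of hidden data from the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


def scrub_lsb(pixels: bytes) -> bytes:
    """Illustrative countermeasure: zero every LSB, destroying any LSB payload."""
    return bytes(p & 0xFE for p in pixels)
```

Each pixel value changes by at most 1, which is why this kind of embedding is hard to spot by eye and why a boundary system has to treat outgoing images as a possible channel.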

Using machine learning to select and optimise multiple objectives in media compression

Murashko, Oleksandr — 2018-01-01T00:00:00Z

The growing complexity of emerging image and video compression standards means additional demands on computational time and energy resources in a variety of environments. Additionally, the steady increase in sensor resolution, display resolution, and the demand for increasingly high-quality media in consumer and professional applications also mean that there is an increasing quantity of media being compressed. This work focuses on a methodology for improving and understanding the quality of media compression algorithms using an empirical approach. Consequently, the outcomes of this research can be deployed on existing standard compression algorithms, but are also likely to be applicable to future standards without substantial redevelopment, increasing productivity and decreasing time-to-market. Using machine learning techniques, this thesis proposes a means of using past information about how images and videos are compressed in terms of content, and leveraging this information to guide and improve industry standard media compressors in order to achieve the desired outcome in a time and energy efficient way. The methodology is implemented and evaluated on JPEG, WebP and x265 codecs, allowing the system to automatically target multiple performance characteristics like file size, image quality, compression time and efficiency, based on user preferences. Compared to previous work, this system is able to achieve a prediction error three times smaller for quality and size for JPEG, and a speed up of compression of four times for WebP, targeting the same objectives. For x265 video compression, the system allows multiple objectives to be considered simultaneously, allowing speedier encoding for similar levels of quality.

Stepping into the clouds : enabling companies to adapt their capabilities to cloud computing to succeed under uncertain conditions

Werfs, Marc — 2016-06-22T00:00:00Z

Recent technologies have changed the way companies acquire and use computing resources. Companies have to adapt their capabilities, which combine business processes, skills, etc., to exploit the opportunities presented by these technologies whilst avoiding adverse effects. The latter part is, however, becoming increasingly difficult due to the uncertain long-term impact recent technologies have. This thesis argues that companies are required to adapt their capabilities in a way that increases the company’s resilience so that they are robust yet flexible enough to succeed under uncertain conditions. By focusing on cloud computing as one recent technology, this thesis first identifies the underlying processes of adapting capabilities to cloud computing by investigating how software vendors migrated their products into the cloud. The results allow the definition of viewpoints that influence the adaptation of capabilities to cloud computing. Furthermore, the Functional Resonance Analysis Method (FRAM) is applied to one software vendor after the migration of their product into the cloud. FRAM enables the analysis of ‘performance variabilities’ that need to be dampened to increase the resilience of systems. The results show that FRAM appropriately informs steps to increase and measure resilience when migrating products into the cloud. The final part develops cFRAM which extends FRAM through the viewpoints to enable the analysis of capabilities within FRAM. The goal of cFRAM is to enable companies to (1) identify existing capabilities, (2) investigate the impact of cloud computing on them, and (3) inform steps to adapt them to cloud computing whilst dampening performance variabilities. The results of the cFRAM evaluation study are unequivocal and show cFRAM is a novel method that achieves its goal of enabling companies to adapt their capabilities to cloud computing in a way that increases the company’s resilience. 
cFRAM can be easily adapted to other technologies like smartphones by changing the viewpoints.

A domain-driven method for creating self-adaptive application architecture

Huang, Jin — 2017-01-01T00:00:00Z

Following the increasing complexity of modern software systems, software engineers have introduced self-adaptation techniques from the field of control theory into software development. However, it is still difficult to construct self-adaptive software systems. Recognising the importance of software architecture, this dissertation addresses how to design a domain-specific self-adaptive software application architecture in a principled way. Specifically, there is still a lack of methods that help software engineers generate a software architecture that is consistent with the domain knowledge. To achieve the research goal, this dissertation has: 1) investigated the existing definitions of software architecture; 2) proposed a framework for understanding self-adaptive software application architecture via appropriate architectural patterns; 3) proposed a novel high-order language, and the tools, to specify domain-specific uncertainty; 4) proposed an improved version of Grasp, and the tools, so that users can describe the dynamism of a self-adaptive application; 5) proposed a novel architectural pattern by selecting architectural patterns in a principled way; 6) evaluated this work by applying these methods to a business project.

Optimising the usage of cloud resources for execution bag-of-tasks applications

Thai, Long Thanh — 2017-01-01T00:00:00Z

Cloud computing has been widely adopted by many organisations, due to its flexibility in resource provisioning and on-demand pricing models. Entire clusters of machines can now be dynamically provisioned to meet the computational demands of users. By moving operations to the cloud, users hope to reduce the costs of building and maintaining a computational cluster without sacrificing the quality of service. However, cloud computing has presented challenges in scheduling and managing the usage of resources, which users of more traditional resource pooling models, such as grids and clusters, have never encountered before. Firstly, the costs associated with resource usage change dynamically and are based on the type and duration of resources used; this prevents users from greedily acquiring as many resources as possible. Secondly, the cloud computing marketplace offers an assortment of on-demand resources with a wide range of performance capabilities. This variety makes it difficult for users to construct a cluster which is suitable for their applications. As a result, it is challenging for users to ensure the desired quality of service while running applications on the cloud. The research in this thesis focuses on optimising the usage of cloud computing resources. We propose approaches for scheduling the execution of applications on the cloud, such that the desired performance is met whilst the incurred monetary cost is minimised. Furthermore, this thesis presents a set of mechanisms which manage the execution at runtime, in order to detect and handle unexpected events with undesirable consequences, such as the violation of quality of service, or cost overheads. Using both simulated and real world experiments, we validate the feasibility of the proposed research by executing applications on the cloud with low costs without sacrificing performance. 
The key result is that it is possible to optimise the usage of cloud resources for user applications by using the research reported in this thesis.

Pattern discovery for parallelism in functional languages

Barwell, Adam David — 2018-01-01T00:00:00Z

No longer the preserve of specialist hardware, parallel devices are now ubiquitous. Pattern-based approaches to parallelism, such as algorithmic skeletons, simplify traditional low-level approaches by presenting composable high-level patterns of parallelism to the programmer. This allows optimal parallel configurations to be derived automatically, and facilitates the use of different parallel architectures. Moreover, parallel patterns can be swap-replaced for sequential recursion schemes, thus simplifying their introduction. Unfortunately, there is no guarantee that recursion schemes are present in all functional programs. Automatic pattern discovery techniques can be used to discover recursion schemes. Current approaches are limited by both the range of analysable functions, and by the range of discoverable patterns. In this thesis, we present an approach based on program slicing techniques that facilitates the analysis of a wider range of explicitly recursive functions. We then present an approach using anti-unification that expands the range of discoverable patterns. In particular, this approach is user-extensible; i.e. patterns developed by the programmer can be discovered without significant effort. We present prototype implementations of both approaches, and evaluate them on a range of examples, including five parallel benchmarks and functions from the Haskell Prelude. We achieve maximum speedups of 32.93x on our 28-core hyperthreaded experimental machine for our parallel benchmarks, demonstrating that our approaches can discover patterns that produce good parallel speedups. Together, the approaches presented in this thesis enable the discovery of more loci of potential parallelism in pure functional programs than currently possible. This leads to more possibilities for parallelism, and so more possibilities to take advantage of the potential performance gains that heterogeneous parallel systems present.
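Anti-unification, mentioned in the abstract above, computes the most specific common generalisation of two terms. The sketch below is a minimal first-order version over terms encoded as nested tuples; the term encoding, the `?x` variable naming and the two example function bodies are assumptions made for illustration and are not taken from the thesis's implementation.

```python
def anti_unify(s, t, subst=None):
    """Return a generalisation of terms s and t.

    Terms are atoms (strings, numbers) or tuples (head, arg1, ..., argN).
    Differing subterms are replaced by variables; the same pair of subterms
    always maps to the same variable.
    """
    if subst is None:
        subst = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and len(s) == len(t) and s[0] == t[0]):
        # Same head and arity: generalise argument by argument.
        args = tuple(anti_unify(a, b, subst) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    key = (s, t)
    if key not in subst:
        subst[key] = f"?x{len(subst)}"
    return subst[key]


# Hypothetical bodies of two explicitly recursive functions over a list xs.
sum_body = ("if", ("null", "xs"), 0,
            ("app", "+", ("head", "xs"), ("rec", ("tail", "xs"))))
product_body = ("if", ("null", "xs"), 1,
                ("app", "*", ("head", "xs"), ("rec", ("tail", "xs"))))

pattern = anti_unify(sum_body, product_body)
```

Here `pattern` keeps the shared recursive structure and abstracts the base value and the combining operator into two variables, which is the shape of a fold. Matching a function body against the result of generalising known instances of a scheme is one way to see how such a technique could recognise recursion schemes.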

Fidelity perception of 3D models on the web

Bakri, Hussein — 2018-01-01T00:00:00Z

Cultural heritage artefacts act as a gateway helping people learn about their social traditions and history. However, preserving these artefacts faces many difficulties, including potential destruction or damage from global warming, wars and conflicts, and degradation from day-to-day use. In addition, artefacts can only be present in one place at a time, and many of them cannot be exhibited due to the limited physical space of museums. The digital domain offers opportunities to capture and represent the form and texture of these artefacts and to overcome the previously mentioned constraints by allowing people to access and interact with them on multiple platforms (mobile devices, tablets and personal computers) and network regimes. Through two experiments we study the subjective perception of the fidelity of 3D models in web browsers in order to discover perceptible resolution thresholds. This helps us create models of reasonable graphical complexity that could be fetched on the widest range of end devices. It also enables us to design systems which efficiently optimise the user experience by adapting their behaviour based upon user perception, model characteristics and digital infrastructure.

Using metric space indexing for complete and efficient record linkage

Akgün, Özgür — 2018-01-01T00:00:00Z

Record linkage is the process of identifying records that refer to the same real-world entities in situations where entity identifiers are unavailable. Records are linked on the basis of similarity between common attributes, with every pair being classified as a link or non-link depending on their similarity. Linkage is usually performed in a three-step process: first, groups of similar candidate records are identified using indexing, then pairs within the same group are compared in more detail, and finally classified. Even state-of-the-art indexing techniques, such as locality sensitive hashing, have potential drawbacks. They may fail to group together some true matching records with high similarity, or they may group records with low similarity, leading to high computational overhead. We propose using metric space indexing (MSI) to perform complete linkage, resulting in a parameter-free process combining indexing, comparison and classification into a single step delivering complete and efficient record linkage. An evaluation on real-world data from several domains shows that linkage using MSI can yield better quality than current indexing techniques, with similar execution cost, without the need for domain knowledge or trial and error to configure the process.

Deriving distributed garbage collectors from distributed termination algorithms

Norcross, Stuart John — 2004-01-01T00:00:00Z

This thesis concentrates on the derivation of a modularised version of the DMOS distributed garbage collection algorithm and the implementation of this algorithm in a distributed computational environment. DMOS appears to exhibit a unique combination of attractive characteristics for a distributed garbage collector but the original algorithm is known to contain a bug and, previous to this work, lacks a satisfactory, understandable implementation. The relationship between distributed termination detection algorithms and distributed garbage collectors is central to this thesis. A modularised DMOS algorithm is developed using a previously published distributed garbage collector derivation methodology that centres on mapping centralised collection schemes to distributed termination detection algorithms. In examining the utility and suitability of the derivation methodology, a family of six distributed collectors is developed and an extension to the methodology is presented. The research work described in this thesis incorporates the definition and implementation of a distributed computational environment based on the ProcessBase language and a generic definition of a previously unimplemented distributed termination detection algorithm called Task Balancing. The role of distributed termination detection in the DMOS collection mechanisms is defined through a process of step-wise refinement. The implementation of the collector is achieved in two stages; the first stage defines the implementation of two distributed termination mappings with the Task Balancing algorithm; the second stage defines the DMOS collection mechanisms.

Using subgoal chaining to address the local minimum problem

Lewis, Jonathan Peter — 2002-01-01T00:00:00Z

A common problem in the area of non-linear function optimisation is that of not being able to guarantee finding the global optimum of the function in a feasible time especially when local optima exist. This problem applies to various areas of heuristic search. One of these areas concerns standard training techniques for feedforward neural networks. The element of heuristic search consists of attempting to find a neural weight state corresponding to the lowest training error. This problem may be termed the local minimum problem. The local minimum problem is addressed for feedforward neural networks. This is done by first establishing the conditions under which local minimum interference for the training process is to be expected. A target based approach to subgoal chaining in supervised learning is then investigated. This is a method to improve travel for neural networks by directing it more precisely through local subgoals than may be achieved through a more distant goal. It is shown however that linear subgoal chains are not sufficient to overcome the local minimum problem. Two novel training techniques are presented which use non-linear subgoal chains and are examined for their capability to address the local minimum problem. It is found that attempting to target a neural network to do something it cannot may lead to suboptimal training. It is also found that targeting a network to do something it is capable of generally leads to successful training. A novel system is presented which is designed to create optimal realisable targets for unrealisable goals. This allows neural networks to subsequently achieve the optimal weight state through a sufficiently powerful training method such as subgoal chaining. The results are shown to be consistent with the theoretical expectations.

Genetic programming with context-sensitive grammars

Paterson, Norman R. — 2003-01-01T00:00:00Z

This thesis presents Genetic Algorithm for Deriving Software (Gads), a new technique for genetic programming. Gads combines a conventional genetic algorithm with a context-sensitive grammar. The key to Gads is the ontogenic mapping, which converts a genome from an array of integers to a correctly typed program in the phenotype language defined by the grammar. A new type of grammar, the reflective attribute grammar (rag), is introduced. The rag is an extension of the conventional attribute grammar, which is designed to produce valid sentences, not to recognize or parse them. Together, Gads and rags provide a scalable solution for evolving type-correct software in independently-chosen context-sensitive languages. The statistics of performance comparison is investigated. A method for representing a set of genetic programming systems or problems on a cladogram is presented. A method for comparing genetic programming systems or problems on a single rational scale is proposed.
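The idea of mapping an integer genome to a program through a grammar can be sketched as follows. This is a deliberately simplified, context-free version: each gene selects a production for the leftmost nonterminal, and leftover nonterminals are closed with a default. The toy grammar, the modulo selection rule, the defaults and the name `ontogenic_map` are all assumptions of this example; it does not reproduce the thesis's reflective attribute grammars or its type checking.

```python
# Toy grammar for arithmetic expressions over a single variable x (assumed).
GRAMMAR = {
    "<expr>": [
        ["<expr>", "+", "<expr>"],
        ["<expr>", "*", "<expr>"],
        ["x"],
        ["1"],
    ],
}

# Terminal-only fallback used when the genome runs out (assumed).
DEFAULT = {"<expr>": ["x"]}


def ontogenic_map(genome, start="<expr>"):
    """Turn a list of integers into a sentence of the grammar."""
    sentence = [start]
    for gene in genome:
        # Locate the leftmost nonterminal still to be expanded.
        idx = next((i for i, sym in enumerate(sentence) if sym in GRAMMAR), None)
        if idx is None:
            break  # the sentence is already complete; remaining genes are unused
        options = GRAMMAR[sentence[idx]]
        sentence[idx:idx + 1] = options[gene % len(options)]
    # Genome exhausted: replace any remaining nonterminal with its default.
    finished = []
    for sym in sentence:
        finished.extend(DEFAULT[sym] if sym in GRAMMAR else [sym])
    return " ".join(finished)
```

Because every genome maps to a complete sentence of the grammar, a conventional genetic algorithm can mutate and recombine the integer arrays freely without ever producing a syntactically invalid program.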

Symmetry in constraint programming

McDonald, Iain — 2004-01-01T00:00:00Z

Constraint programming is an invaluable tool for solving many of the complex NP-complete problems that we need solutions to. These problems can be easily described as Constraint Satisfaction Problems (CSPs) and then passed to constraint solvers: complex pieces of software written to solve general CSPs efficiently. Many of the problems we need solutions to are real world problems: planning (e.g. vehicle routing), scheduling (e.g. job shop schedules) and timetabling problems (e.g. staff rotas) to name but a few. In the real world, we place structure on objects to make them easier to deal with. This manifests itself as symmetry. The symmetry in these real world problems makes them easier for humans to deal with. However, it leads to a great deal of redundancy when using computational methods of problem solving. Thus, this thesis examines some of the many aspects of utilising the symmetry of CSPs to reduce the amount of computation needed by constraint solvers. In this thesis we look at the ease of use of previous symmetry breaking methods. We introduce a novel method of describing the symmetries of CSPs. We look at previous methods of symmetry breaking and show how we can drastically reduce their computation while still breaking all symmetry. We give the first detailed investigation into the behaviour of breaking only subsets of all symmetry. We look at how this affects the performance of constraint solvers before discovering the properties of a good symmetry. We then present an original method for choosing the best symmetries to use. Finally, we look at areas of redundant computation in constraint solvers that no other research has examined. New ways of dealing with this redundancy are proposed with results of an example implementation which improves efficiency by several orders of magnitude.
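The redundancy that symmetry causes can be seen in a small graph-colouring example: if the colours are interchangeable, every solution has many equivalent copies that differ only by renaming colours. The sketch below enumerates colourings by brute force and optionally adds a simple value-symmetry-breaking rule. It is a generate-and-test illustration, not a constraint solver, and the rule shown is a standard textbook device rather than one of the methods developed in the thesis; the function names are assumptions.

```python
from itertools import product


def colours_in_order(assignment):
    """True if colour c is used only after colours 0..c-1 have already appeared."""
    next_new = 0
    for c in assignment:
        if c > next_new:
            return False
        if c == next_new:
            next_new += 1
    return True


def colourings(n_vertices, edges, n_colours, break_symmetry=False):
    """List all proper colourings, optionally keeping one per symmetry class."""
    solutions = []
    for assignment in product(range(n_colours), repeat=n_vertices):
        if any(assignment[u] == assignment[v] for u, v in edges):
            continue  # adjacent vertices share a colour
        if break_symmetry and not colours_in_order(assignment):
            continue  # a renamed copy of a solution that is already kept
        solutions.append(assignment)
    return solutions


triangle = [(0, 1), (1, 2), (0, 2)]
all_solutions = colourings(3, triangle, 3)
canonical = colourings(3, triangle, 3, break_symmetry=True)
```

For the triangle, `all_solutions` holds the six permutations of three colours while `canonical` holds a single representative, so the extra constraint removes symmetric duplicates without losing any essentially different solution.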

Systems support for distributed learning environments

Allison, Colin — 2003-01-01T00:00:00Z

This thesis contends that the growing phenomenon of multi-user networked "learning environments" should be treated as distributed interactive systems and that their developers should be aware of the systems and networks issues involved in their construction and maintenance. Such environments are henceforth referred to as distributed learning environments, or DLEs. Three major themes are identified as part of systems support: i) shared resource coherence in DLEs; ii) Quality of Service for the end-users of DLEs; and iii) the need for an integrating framework to develop, deploy and manage DLEs. The thesis reports on several distinct implementations and investigations that are each linked by one or more of those themes. Initially, responsiveness and coherence emerged as potentially conflicting requirements, and although a system was built that successfully resolved this conflict it proved difficult to move from the "clean room" conditions of a research project into a real world learning context. Accordingly, subsequent systems adopted a web-based approach to aid deployment in realistic settings. Indeed, production versions of these systems have been used extensively in credit-bearing modules in several Scottish Universities. Interactive responsiveness then emerged as a major Quality of Service issue in its own right, and motivated a series of investigations into the sources of delay, as experienced by end users of web-oriented distributed learning environments. Investigations into this issue provided insight into the nature of web-oriented interactive distributed learning and highlighted the need to be QoS-aware. As the volume and the range of usage of distributed learning applications increased, the need for an integrating framework emerged. This required identifying and supporting a wide variety of educational resource types and also the key roles occupied by users of the system, such as tutors, students, supervisors, service providers, administrators, examiners. 
The thesis reports on the approaches taken and lessons learned from researching, designing and implementing systems which support distributed learning. As such, it constitutes a documented body of work that can inform the future design and deployment of distributed learning environments.

Guest editorial: High-level programming for heterogeneous parallel systems

Brown, Christopher Mark — 2018-05-18T00:00:00Z

Murray polygons as a tool in image processing

Pharasi, Bhuwan — 1990-07-01T00:00:00Z

This thesis reports on some applications of murray polygons, which are a generalization of space filling curves and of Peano polygons in particular, to process digital image data. Murray techniques have been used on 2-dimensional and 3-dimensional images, which are in cartesian/polar co-ordinates. Attempts have been made to resolve many associated aspects of image processing, such as connected components labelling, hidden surface removal, scaling, shading, set operations, smoothing, superimposition of images, and scan conversion. Initially, different techniques for processing images, which involve quadtree, octree, and linear run length encoding, are reviewed. Several image processing problems which are solved using different techniques are described in detail. The steps of the development from Peano polygons via multiple radix arithmetic to murray polygons are described. An outline of a software implementation of the basic and fast algorithms is given and some hints for a hardware implementation are described. The application of murray polygons to scan arbitrary images is explained. The use of murray run length encodings to resolve some image processing problems is described. The problems of finding connected components, scaling an image, hidden surface removal, shading, set operations, superimposition of images, and scan conversion are discussed. Most of the operations described in this work are on murray run lengths. Some operations on the images themselves are explained. The results obtained by using murray scan techniques are compared with those obtained by using standard methods such as linear scans, quadtrees, and octrees. All the algorithms obtained using murray scan techniques are finally presented in a menu format work bench. Algorithms are coded in PS-algol and the C language.

The theory and implementation of a secure system

Robb, David S. S. — 1992-01-01T00:00:00Z

Computer viruses pose a very real threat to this technological age. As our dependence on computers increases so does the incidence of computer virus infection. Like their biological counterparts, complete eradication is virtually impossible. Thus all computer viruses which have been injected into the public domain still exist. This coupled with the fact that new viruses are being discovered every day is resulting in a massive escalation of computer virus incidence. Computer viruses covertly enter the system and systematically take control, corrupt and destroy. New viruses appear each day that circumvent current means of detection, entering the most secure of systems. Anti-Virus software writers find themselves fighting a battle they cannot win: for every hole that is plugged, another leak appears. Presented in this thesis is both method and apparatus for an Anti-Virus System which provides a solution to this serious problem. It prevents the corruption, or destruction of data, by a computer virus or other hostile program, within a computer system. The Anti-Virus System explained in this thesis will guarantee system integrity and virus containment for any given system. Unlike other anti-virus techniques, security can be guaranteed, as at no point can a virus circumvent, or corrupt the action of the Anti-Virus System presented. It requires no hardware modification of the computer or the hard disk, nor software modification of the computer's operating system. Whilst being largely transparent to the user, the System guarantees total protection against the spread of current and future viruses.

Extension polymorphism

Balasubramaniam, Dharini — 1998-01-01T00:00:00Z

Any system that models a real world application has to evolve to be consistent with its changing domain. Dealing with evolution in an effective manner is particularly important for those systems that may store large amounts of data such as databases and persistent languages. In persistent programming systems, one of the important issues in dealing with evolution is the specification of code that will continue to work in a type safe way despite changes to type definitions. Polymorphism is one mechanism which allows code to work over many types. Inclusion polymorphism is often said to be a model of type evolution. However, observing type changes in persistent systems has shown that types most commonly exhibit additive evolution. Even though inclusion captures this pattern in the case of record types, it does not always do so for other type constructors. The confusion of subtyping, inheritance and evolution often leads to unsound or at best, dynamically typed systems. Existing solutions to this problem do not completely address the requirements of type evolution in persistent systems. The aim of this thesis is to develop a form of polymorphism that is suitable for modelling additive evolution in persistent systems. The proposed strategy is to study patterns of evolution for the most generally used type constructors in persistent languages and to define a new relation, called extension, which models these patterns. This relation is defined independent of any existing relations used for dealing with evolution. A programming language mechanism is then devised to provide polymorphism over this relation. The polymorphism thus defined is called extension polymorphism. This thesis presents work involving the design and definition of extension polymorphism and an implementation of a type checker for this polymorphism. A proof of soundness for a type system which supports extension polymorphism is also presented.

On the integration of concurrency, distribution and persistence

Munro, D. S. — 1994-01-01T00:00:00Z

The principal tenet of the persistence model is that it abstracts over all the physical properties of data such as how long it is stored, where it is stored, how it is stored, what form it is kept in and who is using it. Experience with programming systems which support orthogonal persistence has shown that the simpler semantics and reduced complexity can often lead to a significant reduction in software production costs. Persistent systems are relatively new and it is not yet clear which of the many models of concurrency and distribution best suit the persistence paradigm. Previous work in this area has tended to build one chosen model into the system which may then only be applicable to a particular set of problems. This thesis challenges the orthodoxy by designing a persistent framework in which all models of concurrency and distribution can be integrated in an add-on fashion. The provision of such a framework is complicated by a tension between the conceptual ideas of persistence and the intrinsics of concurrency and distribution. The approach taken is to integrate the spectra of concurrency and distribution abstractions into the persistence model in a manner that does not prevent the user from being able to reason about program behaviour. As examples of the reference model a number of different styles of concurrency and distribution have been designed and incorporated into the persistent programming system Napier88. A detailed treatment of these models and their implementations is given.

On the construction of persistent programming environments

Dearle, Alan — 1988-01-01T00:00:00Z

This thesis presents research into the construction of persistent programming systems. Much of the thesis is concerned with the design and implementation of persistent programming languages, in particular PS-algol and Napier. Both languages support machine independent vector and raster graphics data types. Napier provides an environment mechanism that enables the incremental construction and binding of programs. Napier has a powerful type system featuring parametric polymorphism and abstract data types. The machine supporting Napier, the Persistent Abstract Machine, is investigated. The machine supports an efficient implementation of parametric polymorphism and abstract data types. The Persistent Abstract Machine has a layered architecture which permits experimentation into language implementation and store design. The construction of compilers in a persistent environment is explored. A flexible compiler architecture is developed. With it, a family of compilers may be constructed at relatively little cost. One such compiler is the callable compiler; this is a first class data object in the persistent environment. The uses of such a compiler are explored, in particular in the construction of an object browser. The persistent object browser introduces a new software architecture that permits adaptive programs to be constructed incrementally. This is achieved by writing, compiling and linking new procedures into an executing program. The architecture has been successfully applied to the construction of adaptive databases and bootstrap compilers.

Delivering the benefits of persistence to system construction and execution

Cutts, Q. I. — 1993-01-01T00:00:00Z

In an orthogonally persistent programming system the longevity of data is independent of its other attributes. The advantages of persistence may be seen primarily in the areas of data modelling and protection resulting from simpler semantics and reduced complexity. These have been verified by the first implementations of persistent languages, typically consisting of a persistent store, a run-time system and a compiler that produces programs that may access and manipulate the persistent environment. This thesis demonstrates that persistence can deliver many further benefits to the programming process when applied to software construction and execution. To support the thesis, a persistent environment has been extended with all the components necessary to support program construction and execution entirely within the persistent environment. This is the first known example of a strongly-typed integrated persistent programming environment. The keystone of this work is the construction of a compiler that operates entirely within the persistent environment. During its construction, persistence has been exploited in the development of a new methodology for the construction of applications from components and in the optimisation of the widespread use of type information throughout the environment. Further enhancements to software construction and execution have been developed that can only be supported within an integrated persistent programming environment. It is shown how persistence forms the basis of a new methodology for dynamic optimisation of code and data. In addition, new interfaces to the compiler are described that offer increased functionality over traditional compilers. Extended by the ability to manipulate structured values within the persistent environment, the interfaces increase the simplicity, flexibility and efficiency of software construction and execution. Reflective and hyper-programming techniques are also supported. 
The methodologies and compilation facilities evolved together as the compiler was developed and so the first uses of both were applied to one another. It is these applications that have been described in this thesis as examples of its validity. However, the methodologies and the compilation facilities need not be inter-twined. The benefits derived from each of them are general and they may be used in many areas of the persistent environment.

Types and polymorphism in persistent programming systems

Connor, R. C. H. — 1991-01-01T00:00:00Z

Persistent object stores

Brown, A L — 1989-01-01T00:00:00Z

The design and development of a type secure persistent object store is presented as part of an architecture to support experiments in concurrency, transactions and distribution. The persistence abstraction hides the physical properties of data from the programs that manipulate it. Consequently, a persistent object store is required to be of unbounded size, infinitely fast and totally reliable. A range of architectural mechanisms that can be used to simulate these three features is presented. Based on a suitable selection of these mechanisms, two persistent object stores are presented. The first store is designed for use with the programming language PS-algol. Its design is evolved to yield a more flexible layered architecture. The layered architecture is designed to provide each distinct architectural mechanism as a separate architectural layer conforming to a specified interface. The motivation for this design is two-fold. Firstly, the particular choice of layers greatly simplifies the resulting implementation and secondly, the layered design can support experimental architecture implementations. Since each layer conforms to a specified interface, it is possible to experiment with the implementation of an individual layer without affecting the implementation of the remaining architectural layers. Thus, the layered architecture is a convenient vehicle for experimenting with the implementation of persistent object stores. An implementation of the layered architecture is presented together with an example of how it may be used to support a distributed system. Finally, the architecture's ability to support a variety of storage configurations is presented.

Modelling continuous sequential behaviour to enhance training and generalization in neural networks

Chen, Lihui — 1993-01-01T00:00:00Z

This thesis is a conceptual and empirical approach to embody modelling of continuous sequential behaviour in neural learning. The aim is to enhance the feasibility of training and capacity for generalisation. By examining the sequential aspects of the passing of time in a neural network, it is suggested that an alteration to the usual goal weight condition may be made to model these aspects. The notion of a goal weight path is introduced, with a path-based backpropagation (PBP) framework being proposed. Two models using PBP have been investigated in the thesis. One is called Feedforward Continuous BackPropagation (FCBP) which is a generalization of conventional BackPropagation; the other is called Recurrent Continuous BackPropagation (RCBP) which provides a neural dynamic system for I/O associations. Both models make use of the continuity underlying analogue-binary associations and analogue-analogue associations within a fixed neural network topology. A graphical simulator cbptool for Sun workstations has been designed and implemented for supporting the research. The capabilities of FCBP and RCBP have been explored through experiments. The results for FCBP and RCBP confirm the modelling theory. The fundamental alteration made on conventional backpropagation brings substantial improvement in training and generalization to enhance the power of backpropagation.

Bicontexts and structural induction

Livesey, Mike — 1987-01-01T00:00:00Z

This thesis introduces and explores the notion of bicontext, an order-enriched category equipped with a unary endofunctor of order two called reverse. The purpose is threefold. First, the important categories that arise in Scott-Strachey denotational semantics have this additional structure, whereby the constructions used to solve "data-type equations" are both limits and colimits simultaneously. Second, it yields a pleasant "set-theoretic" treatment of algebraic data-types in terms of bicontexts of (1, 1) relations rather than pairs of continuous functions. The theory provides a general way of relating bicontexts which serves to connect these particular ones. Third, the least solutions of data-type equations often have an associated principle of structural induction. Properties in such solutions become arrows in the appropriate bicontext, making the defining functor directly applicable to them. In this way the structural induction can be derived systematically from the functor.

Modelling recovery in database systems

Scheuerl, S. — 1998-01-01T00:00:00Z

The execution of modern database applications requires the co-ordination of a number of components such as: the application itself, the DBMS, the operating system, the network and the platform. The interaction of these components makes understanding the overall behaviour of the application a complex task. As a result the effectiveness of optimisations is often difficult to predict. Three techniques commonly available to analyse system behaviour are empirical measurement, simulation-based analysis and analytical modelling. The ideal technique is one that provides accurate results at low cost. This thesis investigates the hypothesis that analytical modelling can be used to study the behaviour of DBMSs with sufficient accuracy. In particular the work focuses on a new model for costing recovery mechanisms called MaStA and determines if the model can be used effectively to guide the selection of mechanisms. To verify the effectiveness of the model a validation framework is developed. Database workloads are executed on the flexible Flask architecture on different platforms. Flask is designed to minimise the dependencies between DBMS components and is used in the framework to allow the same workloads to be executed on various recovery mechanisms. Empirical analysis of executing the workloads is used to validate the assumptions about CPU, I/O and workload that underlie MaStA. Once validated, the utility of the model is illustrated by using it to select the mechanisms that provide optimum performance for given database applications. By showing that analytical modelling can be used in the selection of recovery mechanisms, the work presented makes a contribution towards a database architecture in which the implementation of all components may be selected to provide optimum performance.

The combinatorics of abstract container data types

Tulley, Dominic H. — 1997-01-01T00:00:00Z

The study of abstract machines such as Turing machines, push down automata and finite state machines has played an important role in the advancement of computer science. It has led to developments in the theory of general purpose computers, compilers and string manipulation as well as many other areas. The language associated with an abstract machine characterises an important aspect of the behaviour of that machine. It is therefore the principal object of interest when studying such a machine. In this thesis we consider abstract container data types to be abstract machines. We define the concept of a language associated with an abstract container data type and investigate this in the same spirit as for other abstract machines. We also consider a model which allows us to describe various abstract container data types. This model is studied in a similar manner. There is a rich selection of problems to investigate. For instance, the data items which the abstract container data types operate on can take many forms. The input stream could consist of distinct data items, say 1, 2,..., n, or it could be a word over the binary alphabet. Alternatively it could be a sequence formed from the data items in some arbitrary multiset. Another consideration is whether or not an abstract data type has a finite storage capacity. It is shown how to construct a regular grammar which generates (an encoded form of) the set of permutations which can be realised by moving tokens through a network. A one to one correspondence is given between ordered forests of bounded height and members of the language associated with a bounded capacity priority queue operating on binary data. A number of related results are also proved; in particular for networks operating on binary data, and priority queues of capacity 2.
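
To make the idea of a language associated with a container concrete, here is a small Python sketch for the simplest familiar case, a single stack fed with the distinct items 1, 2, ..., n. It is an illustration only, not a construction from the thesis; the names `stack_realisable` and `stack_language` and the optional `capacity` parameter are assumptions introduced for the example.

```python
from itertools import permutations


def stack_realisable(perm, capacity=None):
    """Return True if the output order `perm` can be produced from the input
    1, 2, ..., n by a single stack (push in input order, pop at any time).
    `capacity`, if given, bounds how many items the stack may hold."""
    stack = []
    next_input = 1
    n = len(perm)
    for wanted in perm:
        # Push inputs until the wanted item is on top of the stack.
        while next_input <= n and (not stack or stack[-1] != wanted):
            if capacity is not None and len(stack) >= capacity:
                return False              # a forced push would overflow
            stack.append(next_input)
            next_input += 1
        if not stack or stack[-1] != wanted:
            return False                  # wanted item is buried or missing
        stack.pop()
    return True


def stack_language(n, capacity=None):
    """All output permutations of 1..n that the stack can realise."""
    return [p for p in permutations(range(1, n + 1))
            if stack_realisable(p, capacity)]
```

For an unbounded stack the sizes of these sets follow the Catalan numbers (5 permutations for n = 3, 14 for n = 4); bounding the capacity shrinks the set, which mirrors the finite-storage question raised in the abstract.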

The imperative implementation of algebraic data types

Thomas, Muffy — 1988-01-01T00:00:00Z

The synthesis of imperative programs for hierarchical, algebraically specified abstract data types is investigated. Two aspects of the synthesis are considered: the choice of data structures for efficient implementation, and the synthesis of linked implementations for the class of ADTs which insert and access data without explicit key. The methodology is based on an analysis of the algebraic semantics of the ADT. Operators are partitioned according to the behaviour of their corresponding operations in the initial algebra. A family of relations, the storage relations of an ADT, is defined. They depend only on the operator partition and reflect an observational view of the ADT. The storage relations are extended to storage graphs: directed graphs with a subset of nodes designated for efficient access. The data structures in our imperative language are chosen according to properties of the storage relations and storage graphs. Linked implementations are synthesised in a stepwise manner by implementing the given ADT first by its storage graphs, and then by linked data structures in the imperative language. Some circumstances under which the resulting programs have constant time complexity are discussed.

Robustness and generalisation : tangent hyperplanes and classification trees

Fernandes, Antonio Ramires — 1997-01-01T00:00:00Z

The issue of robust training is tackled for fixed multilayer feedforward architectures. Several researchers have proved the theoretical capabilities of Multilayer Feedforward networks, but in practice the robust convergence of standard methods like standard backpropagation, conjugate gradient descent and Quasi-Newton methods may be poor for various problems. It is suggested that the common assumptions about the overall surface shape break down when many individual component surfaces are combined and robustness suffers accordingly. A new method to train Multilayer Feedforward networks is presented in which no particular shape is assumed for the surface and where an attempt is made to optimally combine the individual components of a solution for the overall solution. The method is based on computing Tangent Hyperplanes to the non-linear solution manifolds. At the core of the method is a mechanism to minimise the sum of squared errors and as such its use is not limited to Neural Networks. The set of tests performed for Neural Networks shows that the method is very robust regarding convergence of training and has a powerful ability to find good directions in weight space. Generalisation is also a very important issue in Neural Networks and elsewhere. Neural Networks are expected to provide sensible outputs for unseen inputs. A framework for hyperplane based classifiers is presented for improving average generalisation. The framework attempts to establish a trained boundary so that there is an optimal overall spacing from the boundary to training points closest to this boundary. The framework is shown to provide results consistent with the theoretical expectations.

Using constraints to improve generalisation and training of feedforward neural networks : constraint based decomposition and complex backpropagation

Draghici, Sorin — 1996-01-01T00:00:00Z

Neural networks can be analysed from two points of view: training and generalisation. The training is characterised by a trade-off between the 'goodness' of the training algorithm itself (speed, reliability, guaranteed convergence) and the 'goodness' of the architecture (the difficulty of the problems the network can potentially solve). Good training algorithms are available for simple architectures which cannot solve complicated problems. More complex architectures, which have been shown to be able to solve potentially any problem do not have in general simple and fast algorithms with guaranteed convergence and high reliability. A good training technique should be simple, fast and reliable, and yet also be applicable to produce a network able to solve complicated problems. The thesis presents Constraint Based Decomposition (CBD) as a technique which satisfies the above requirements well. CBD is shown to build a network able to solve complicated problems in a simple, fast and reliable manner. Furthermore, the user is given a better control over the generalisation properties of the trained network with respect to the control offered by other techniques. The generalisation issue is addressed, as well. An analysis of the meaning of the term "good generalisation" is presented and a framework for assessing generalisation is given: the generalisation can be assessed only with respect to a known or desired underlying function. The known properties of the underlying function can be embedded into the network thus ensuring a better generalisation for the given problem. This is the fundamental idea of the complex backpropagation network. This network can associate signals through associating some of their parameters using complex weights. It is shown that such a network can yield better generalisation results than a standard backpropagation network associating instantaneous values.

Abstract machine design for increasingly more powerful ALGOL-languages

Gunn, Hamish Iain Elston — 1985-01-01T00:00:00Z

This thesis presents the work and results of an investigation into language implementation. Some work on language design has also been undertaken. Three languages have been implemented which may be described as members of the Algol family with features and constructs typical of that family. These include block structure, nested routines, variables, and dynamic allocation of data structures such as vectors and user-defined structures. The underlying technique behind these implementations has been that of abstract machine modelling. For each language an abstract intermediate code has been designed. Unlike other such codes we have raised the level of abstraction so that the code lies closer to the language than that of the real machine on which the language may be implemented. Each successive language is more powerful than the previous by the addition of constructs which were felt to be useful. These were routines as assignable values, dynamically initialised constant locations, types as assignable values and lists. The three languages were: Algol R, a "typical" Algol based on Algol W; h, an Algol with routines as assignable values, enumerated types, restriction of pointers to sets of user-defined structures, and constant locations; and nsl, a polymorphic Algol with types as assignable values, routines as assignable values, lists, and type- and value-constant locations. The intermediate code for Algol R was based on an existing abstract machine. The code level was raised and designed so that it should be used as the input to a code generator. Such a code generator was written improving a technique called simulated evaluation. The language h was designed and a recursive descent compiler written for it which produced an intermediate code similar in level to the previous one. Again a simulated evaluation code generator was written, this time generating code for an interpreted abstract machine which implemented routines as assignable and storable values. 
Finally the language nsl was designed. The compiler for it produced code for an orthogonal, very high level tagged architecture abstract machine which was implemented by interpretation. This machine implemented polymorphism, assignable routine values and type- and value-constancy. Descriptions of the intermediate codes/abstract machines are given in appendices.

A testbed for embedded systems

Burgess, Peter — 1994-01-01T00:00:00Z

Testing and Debugging are often the most difficult phase of software development. This is especially true of embedded systems which are usually concurrent, have real-time performance and correctness constraints and which execute in the field in an environment which may not permit internal scrutiny of the software behaviour. Although good software engineering practices help, they will never eliminate the need for testing and debugging. This is because failings in the specification and design are often only discovered through testing and understanding these failings and how to correct them comes from debugging. These observations suggest that embedded software should be designed in a way which makes testing and debugging easier and that tools which support these activities are required. Due to the often hostile environment in which the finished embedded system will function, it is necessary to have a platform which allows the software to be developed and tested "in vitro". The Testbed system achieves these goals by providing dynamic modification and process migration facilities for use during development as well as powerful monitoring and background debugging support. These facilities are built on a basic run-time harness supporting an event-driven programming model with a global communication mechanism. This programming model is well suited to the reactive nature of embedded systems. The main research contributions of this work are in the areas of finding deadlock-free, path-optimal routings for networks and of dynamic modification with automated conversion of data which may include pointers.

Effective termination techniques

Cropper, Nick I. — 1997-01-01T00:00:00Z

An important property of term rewriting systems is termination: the guarantee that every rewrite sequence is finite. This thesis is concerned with orderings used for proving termination, in particular the Knuth-Bendix and polynomial orderings. First, two methods for generating termination orderings are enhanced. The Knuth-Bendix ordering algorithm incrementally generates numeric and symbolic constraints that are sufficient for the termination of the rewrite system being constructed. The KB ordering algorithm requires an efficient linear constraint solver that detects the nature of degeneracy in the solution space, and for this a revised method of complete description is presented that eliminates the space redundancy that crippled previous implementations. Polynomial orderings are more powerful than Knuth-Bendix orderings, but are usually much harder to generate. Rewrite systems consisting of only a handful of rules can overwhelm existing search techniques due to the combinatorial complexity. A genetic algorithm is applied with some success. Second, a subset of the family of polynomial orderings is analysed. The polynomial orderings on terms in two unary function symbols are fully resolved into simpler orderings. Thus it is shown that most of the complexity of polynomial orderings is redundant. The order type (logical invariant), either r or A (numeric invariant), and precedence are calculated for each polynomial ordering. The invariants correspond in a natural way to the parameters of the orderings, and so the tabulated results can be used to convert easily between polynomial orderings and more tangible orderings. The orderings of order type are two of the recursive path orderings. All of the other polynomial orderings are of order type ω or ω² and each can be expressed as a lexicographic combination of r (weight), A (matrix), and lexicographic (dictionary) orderings. 
The thesis concludes by showing how the analysis extends to arbitrary monadic terms, and discussing possible developments for the future.
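
A polynomial ordering on monadic terms can be sketched in a few lines of Python. The example assumes the simplest case, where each unary symbol is interpreted as a linear map x -> a*x + b over the natural numbers with a >= 1 and b >= 0; the encoding of terms as strings (so that 'fg' stands for f(g(x))) and all function names are assumptions made for illustration, not the thesis's own notation or analysis.

```python
def compose(outer, inner):
    """Compose two linear maps x -> a*x + b, each given as an (a, b) pair."""
    a1, b1 = outer
    a2, b2 = inner
    return (a1 * a2, a1 * b2 + b1)


def interpret(word, interpretation):
    """Interpret a monadic term such as 'fgg', read as f(g(g(x))), as one
    linear map by composing the maps assigned to its symbols."""
    result = (1, 0)                       # identity map for the variable x
    for symbol in reversed(word):         # innermost symbol is applied first
        result = compose(interpretation[symbol], result)
    return result


def is_admissible(interpretation):
    """Each map must send naturals to naturals and be strictly monotone."""
    return all(a >= 1 and b >= 0 for a, b in interpretation.values())


def strictly_decreases(lhs, rhs, interpretation):
    """True if [lhs](x) > [rhs](x) for every natural number x >= 0.
    For linear maps this holds exactly when a_l >= a_r and b_l > b_r."""
    a_l, b_l = interpret(lhs, interpretation)
    a_r, b_r = interpret(rhs, interpretation)
    return a_l >= a_r and b_l > b_r


def proves_termination(rules, interpretation):
    """Check that an admissible interpretation orients every rule."""
    return is_admissible(interpretation) and all(
        strictly_decreases(lhs, rhs, interpretation) for lhs, rhs in rules)
```

With f interpreted as x -> 2x and g as x -> x + 1, the rule f(g(x)) -> g(f(x)) is oriented, since 2x + 2 > 2x + 1 for all x; the reversed rule is not. Finding such coefficients automatically is the search problem the abstract describes as combinatorially hard.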

A parallel functional language compiler for message-passing multicomputers

Junaidu, Sahalu B. — 1998-01-01T00:00:00Z

The research presented in this thesis is about the design and implementation of Naira, a parallel, parallelising compiler for a rich, purely functional programming language. The source language of the compiler is a subset of Haskell 1.2. The front end of Naira is written entirely in the Haskell subset being compiled. Naira has been successfully parallelised and it is the largest successfully parallelised Haskell program, having achieved good absolute speedups on a network of SUN workstations. Having the same basic structure as other production compilers of functional languages, Naira's parallelisation technology should carry forward to other functional language compilers. The back end of Naira is written in C and generates parallel code in the C language which is envisioned to be run on distributed-memory machines. The code generator is based on a novel compilation scheme specified using a restricted form of Milner's π-calculus which achieves asynchronous communication. We present the first working implementation of this scheme on distributed-memory message-passing multicomputers with split-phase transactions. Simulated assessment of the generated parallel code indicates good parallel behaviour. Parallelism is introduced using explicit, advisory user annotations in the source program and there are two major aspects of the use of annotations in the compiler. First, the front end of the compiler is parallelised so as to improve its efficiency at compilation time when it is compiling input programs. Secondly, the input programs to the compiler can themselves contain annotations based on which the compiler generates the multi-threaded parallel code. These, therefore, make Naira, unusually and uniquely, both a parallel and a parallelising compiler. We adopt a medium-grained approach to granularity where function applications form the unit of parallelism and load distribution. 
We have experimented with two different task distribution strategies, deterministic and random, and have also experimented with thread-based and quantum-based scheduling policies. Our experiments show that there is little efficiency difference for regular programs but the quantum-based scheduler performs best for programs with irregular parallelism. The compiler has been successfully built, parallelised and assessed using both idealised and realistic measurement tools: we obtained significant compilation speed-ups on a variety of simulated parallel architectures. The simulated results are supported by the best results obtained on real hardware for such a large program: we measured an absolute speedup of 2.5 on a network of 5 SUN workstations. The compiler has also been shown to have good parallelising potential, based on popular test programs. Results of assessing Naira's generated unoptimised parallel code are comparable to those produced by other successful parallel implementation projects.

A parallel implementation of SASL

Corovessis, Jiannis — 1983-01-01T00:00:00Z

The applicative or functional language SASL is investigated from the point of view of an implementation. The aim is to determine and experiment with a run-time environment (SASL parallel machine) which incorporates parallelism so that constituent parts of a program (its sub-expressions) can be processed concurrently. The introduction of parallelism is characterised by two fundamental issues: the type of programs, referred to as parallel, and the so-called strategy of parallelism employed by the parallel machine. The former concerns deriving a graph from the program text indicating the order in which things must be done and the notion of "worthwhile" parallelism. In order to obtain a parallel program the original (sequential) program is transformed and/or modified. Certain programs are found to be essentially sequential. Parallelism is expressed as a call-by-parallel parameter passing mechanism and by a parallel conditional operator, suggesting speculative parallelism. The issue of the strategy of parallelism concerns the scheme under which a regime of SASL processors combine their effort in processing a parallel program, the objective being to shorten the length of computation carried out by the sequential machine on the initial program. The class of parallel programs seems to be non-trivial and it includes both non-numerical and numerical programs. The "speed-up" by appealing to parallelism for such programs is found to be substantial.

FORUM and its implementation

Urban, Christian — 1997-01-01T00:00:00Z

Miller presented Forum as a specification logic: Forum extends several existing logic programming languages, for example Prolog, LO and Lolli. The crucial change in Forum is the extension from single succedent sequents, as in intuitionistic logic, to multiple succedent sequents, as in classical logic, with a corresponding extension of the notion of uniform proof. Forum uses the connectives of linear logic. Languages based on linear logic offer extra expressivity (in comparison with traditional logic languages), but also present new implementation challenges. One such challenge is that of context management, because the multiplicative linear connectives 'R', ''S'' and '-o' require context splitting. Hodas and Miller presented a solution (the IO model) to this in 1991 for the language Lolli based on minimal linear logic. This thesis presents a technique which adapts the aforementioned approach to the language Forum and follows a suggestion of Miller that the '.' constant be treated as primitive in order to avoid looping problems arising from its use as a derived symbol. Cervesato, Hodas and Pfenning have presented a technique for managing the '⊤' constant, dividing each input context into a "slack" part and a "strict" part; the main novel contribution of this thesis is to modify this technique, by dividing instead the output context. This leads to a proof system with fewer rules (and consequent ease of implementation) and enhanced performance, for which we present some experimental evidence.

Models for persistence in lazy functional programming systems

McNally, David J. — 1993-01-01T00:00:00Z

Research into providing support for long term data in lazy functional programming systems is presented in this thesis. The motivation for this work has been to reap the benefits of integrating lazy functional programming languages and persistence. The benefits are: the programmer need not write code to support long term data since this is provided as part of the programming system; persistent data can be used in a type safe way since the programming language type system applies to data with the whole range of persistence; the benefits of lazy evaluation are extended to the full lifetime of a data value. Whilst data is reachable, any evaluation performed on the data persists. A data value changes monotonically from an unevaluated state towards a completely evaluated state over time. Interactive data intensive applications such as functional databases can be developed. These benefits are realised by the development of models for persistence in lazy functional programming systems. Two models are proposed which make persistence available to the functional programmer. The first, persistent modules, allows values named in modules to be stored in persistent storage for later reuse. The second model, stream persistence allows dynamic, interactive access to persistent storage. These models are supported by a system architecture which incorporates a persistent abstract machine, PCASE, integrated with a persistent object store. The resulting persistent lazy functional programming system, Staple, is used in prototyping and functional database modelling experiments.

Computational techniques applied to group presentations

Rutherford, Kevin — 1989-01-01T00:00:00Z

Designs for a collection of re-usable software modules are developed. The modules are implemented in C and expressed in a tool-kit for the Unix operating system. Each tool is an expert in some aspect of the manipulation by computer of group presentations. The granularity of the tool-kit has been chosen so that common usages of the Todd-Coxeter and Reidemeister-Schreier methods can be expressed in various ways using any tool composition language (e.g. shell scripts), and running as a collection of co-operating processes. Data file formats for the interchange of group-theoretic information between processes are described. The tools are tested on well-known examples, and are used to prove a long-standing conjecture. Use of the tools as the basis for a rule-based "expert system" is discussed.

Proof-theoretic investigations into integrated logical and functional programming

Pinto, Luis Filipe Ribeiro — 1997-01-01T00:00:00Z

This thesis is a proof-theoretic investigation of logic programming based on hereditary Harrop logic (as in lambdaProlog). After studying various proof systems for the first-order hereditary Harrop logic, we define the proof-theoretic semantics of a logic LFPL, intended as the basis of logic programming with functions, which extends higher-order hereditary Harrop logic by providing definition mechanisms for functions in such a way that the logical specification of the function rather than the function may be used in proof search. In Chap. 3, we define, for the first-order hereditary Harrop fragment of LJ, the class of uniform linear focused (ULF) proofs (suitable for goal-directed search with backchaining and unification) and show that the ULF-proofs are in 1-1 correspondence with the expanded normal deductions, in Prawitz's sense. We give a system of proof-term annotations for LJ-proofs (where proof-terms uniquely represent proofs). We define a rewriting system on proof-terms (where rules represent a subset of Kleene's permutations in LJ) and show that: its irreducible proof-terms are those representing ULF-proofs; it is weakly normalising. We also show that the composition of Prawitz's mappings between LJ and NJ, restricted to ULF-proofs, is the identity. We take the view of logic programming where: a program P is a set of formulae; a goal G is a formula; and the different means of achieving G w.r.t. P correspond to the expanded normal deductions of G from the assumptions in P (rather than the traditional view, whereby the different means of goal-achievement correspond to the different answer substitutions). LFPL is defined in Chap. 4, by means of a sequent calculus.
As in LeFun, it extends logic programming with functions and provides mechanisms for defining names for functions, maintaining proof search as the computation mechanism (contrary to languages such as ALF, Babel, Curry and Escher, based on equational logic, where the computation mechanism is some form of rewriting). LFPL also allows definitions for declaring logical properties of functions, called definitions of dependent type. Such definitions are of the form: (f,x) =def(A, w) : EX:RF, where f is a name for A and x is a name for w, a proof-term witnessing that the formula [A/x]F holds (i.e. A meets the specification Ex:rF). When searching for proofs, it may suffice to use the formula [A/x]F rather than A itself. We present an interpretation of LFPL into NNlambdanorm, a natural deduction system for hereditary Harrop logic with lambda-terms. The means of goal-achievement in LFPL are interpreted in NNlambdanorm essentially by cut-elimination, followed by an interpretation of cut-free sequent calculus proofs as normal deductions. We show that the use of definitions of dependent type may speed up proof search because the equivalent proofs using no such definitions may be much longer and because normalisation may be done lazily, since not all parts of the proof need to be exhibited. We sketch two methods for implementing LFPL, based on goal-directed proof search, differing in the mechanism for selecting definitions of dependent type on which to backchain. We discuss techniques for handling the redundancy arising from the equivalence of each proof using such a definition to one using no such definitions.

Parallel functional programming for message-passing multiprocessors

Ostheimer, Gerald — 1993-01-01T00:00:00Z

We propose a framework for the evaluation of implicitly parallel functional programs on message passing multiprocessors with special emphasis on the issue of load bounding. The model is based on a new encoding of the lambda-calculus in Milner's pi-calculus and combines lazy evaluation and eager (parallel) evaluation in the same framework. The pi-calculus encoding serves as the specification of a more concrete compilation scheme mapping a simple functional language into a message passing, parallel program. We show how and under which conditions we can guarantee successful load bounding based on this compilation scheme. Finally we discuss the architectural requirements for a machine to support our model efficiently and we present a simple RISC-style processor architecture which meets those criteria.

The application of message passing to concurrent programming

Harland, David M. — 1981-01-01T00:00:00Z

The development of concurrency in computer systems will be critically reviewed and an alternative strategy proposed. This is a programming language designed along semantic principles, and it is based upon the treatment of concurrent processes as values within that language's universe of discourse. An asynchronous polymorphic message system is provided to enable co-existent processes to communicate freely. This is presented as a fundamental language construct, and it is completely general purpose, as all values, however complex, can be passed as messages. Various operations are also built into the language so as to permit processes to discover and examine one another. These permit the development of robust systems, where localised failures can be detected, and action can be taken to recover. The orthogonality of the design is discussed and its implementation in terms of an incremental compiler and abstract machine interpreter is outlined in some detail. This thesis hopes to demonstrate that message-oriented communication in a highly parallel system of processes is not only a natural form of expression, but is eminently practical, so long as the entities performing the communication are values in the language.

An experiment in high-level microprogramming

Sommerville, John F. — 1977-01-01T00:00:00Z

This thesis describes an experiment in developing a true high-level microprogramming language for the Burroughs B1700 series of computers. Available languages for machine description both at a behavioural level and at a microprogramming level are compared and the conclusion drawn that none were suitable for our purpose and that it was necessary to develop a new language which we call SUILVEN. SUILVEN is a true high-level language with no machine-dependent features. It permits the exact specification of the size of abstract machine data areas (via the BITS declaration) and allows the user to associate structure with these data areas (via the TEMPLATE declaration). SUILVEN only permits the use of structured control statements (if-then-else, while-do etc.); the go to statement is not a feature of the language. SUILVEN is compiled into microcode for the B1700 range of machines. The compiler is written in SNOBOL4 and uses a top-down recursive descent analysis technique. Using abstract machines for PASCAL and the locally developed SASL, SUILVEN was compared with other high and low level languages. The conclusions drawn from this comparison were as follows: (i) SUILVEN was perfectly adequate for describing simple S-machines; (ii) SUILVEN lacked certain features for describing higher-level machines; (iii) the needs of a machine description language and a microprogram implementation language are different, and it is unrealistic to attempt to combine these in a single language.

Translation of APL to other high-level languages

Jacobs, Margaret M. — 1975-01-01T00:00:00Z

The thesis describes a method of translating the computer language APL to other high-level languages. Particular reference is made to FORTRAN, a language widely available to computer users. Although gaining in popularity, APL is not at present so readily available, and the main aim of the translation process is to enable the more desirable features of APL to be at the disposal of a far greater number of users. The translation process should also speed up the running of routines, since compilation in general leads to greater efficiency than interpretive techniques. Some inefficiencies of the APL language have been removed by the translation process. The above reasons for translating APL to other high-level languages are discussed in the introduction to the thesis. A description of the method of translation forms the main part of the thesis. The APL input code is first lexically scanned, a process whereby the subsequent phases are greatly simplified. An intermediate code form is produced in which bracketing is used to group operators and operands together, and to assign priorities to operators such that sub-expressions will be handled in the correct order. By scanning the intermediate code form, information is stacked until required later. The information is used to make possible a process of macro expansion. Each of the above processes is discussed in the main text of the thesis. The format of all information which can or must be supplied at translation time is clearly outlined in the text.

An extensible system for the automatic transmission of a class of programming languages

Perwaiz, Najam — 1975-01-01T00:00:00Z

This thesis deals with the topic of programming linguistics. A survey of the current techniques in the fields of syntax analysis and semantic synthesis is given. An extensible automatic translator has been described which can be used for the automatic translation of a class of programming languages. The automatic translator consists of two major parts: the syntax analyser and the semantic synthesizer. The syntax analyser is a generalised version of LL(k) parsers, the theoretical study of which has already been published by Lewis and Stearns and also by Rosenkrantz and Stearns. It accepts the grammar of a given language in a modified version of the Backus Normal Form (MBNF) and parses the source language statements in a top down, left to right process without ever backing up. The semantic synthesizer is a table driven system which is called by the parser and performs semantic synthesis as the parsing proceeds. The semantics of a programming language is specified in the form of semantic productions. These are used by the translator to construct semantic tables. The system is implemented in SNOBOL4 (SPITBOL version 2.0) on an IBM 360/44 and its description is supported by various examples. The automatic translator is an extensible system and SNOBOL4, the implementation language, appears as its subset. It can be used to introduce look ahead in the parser, so that backup can be avoided. It can also be used to introduce new facilities in the semantic synthesizer.
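
As a rough illustration of parsing top down, left to right, without ever backing up, the following Python sketch drives a predictive parser from a hand-written table with a single token of lookahead. The toy expression grammar, the table contents, and the names `ll1_parse`, `TABLE` and `TERMINALS` are assumptions made for this example only; they are not taken from the thesis, which describes a generalised LL(k) analyser driven by MBNF grammars.

```python
# Toy grammar (an assumption for illustration):
#   E  -> T E'
#   E' -> + T E' | (empty)
#   T  -> id | ( E )
TERMINALS = {"id", "+", "(", ")"}

# Parse table: (nonterminal, lookahead token) -> right-hand side to expand.
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],
    ("E'", "$"): [],
    ("T", "id"): ["id"],
    ("T", "("): ["(", "E", ")"],
}


def ll1_parse(tokens, table, start, terminals):
    """Return True if tokens can be derived from start, using one lookahead token.

    The input is read strictly left to right and a chosen production is
    never undone, so the parser never backs up.
    """
    stack = ["$", start]            # "$" marks the bottom of the stack
    tokens = list(tokens) + ["$"]   # "$" also marks the end of the input
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top == "$" or top in terminals:
            if top != look:         # terminal mismatch: reject
                return False
            pos += 1
        else:
            production = table.get((top, look))
            if production is None:  # no table entry: reject
                return False
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(production))
    return pos == len(tokens)


print(ll1_parse(["id", "+", "(", "id", ")"], TABLE, "E", TERMINALS))
print(ll1_parse(["id", "+"], TABLE, "E", TERMINALS))
```

The table lookup is what removes the need for backup: for each nonterminal and lookahead token there is at most one production to try.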

The effective application of syntactic macros to language extensibility

Campbell, William R. — 1978-01-01T00:00:00Z

Starting from B M Leavenworth's proposal for syntactic macros, we describe an extension language LE with which one may extend a base language LB for defining a new programming language LP. The syntactic macro processor is designed to minimise the overheads required for implementing the extensions and for carrying the syntax and data type error diagnostics of LB through to the extended language LP. Wherever possible, programming errors are flagged where they are introduced in the source text, whether in a macro definition or in a macro call. LE provides a notation, similar to popular extended forms of BNF, for specifying alternative syntaxes for new linguistic forms in the macro template, a separate assertion clause for imposing context sensitive restrictions on macro calls which cannot be imposed by the template, and a non-procedural language which reflects the nested structure of the template for prescribing conditional text replacement in the macro body. A super user may use LE for introducing new linguistic forms to LB and redefining, replacing or deleting existing forms. The end user is given the syntactic macro in terms of an LP macro declaration with which he may define new forms which are local to the lexical environments in which they are declared in his LP program. Because the macro process is embedded in and directed by a deterministic top down parse, the user can be sure that his extensions are unambiguous. Examples of macro definitions are given using a base language LB which has been designed to be rich enough in syntax and data types for illustrating the problems encountered in extending high level languages. An implementation of a compiler/processor for LB and LE is also described. A survey of previous work in this area, summaries of LE and LB, and a description of the abstract target machine are contained in appendices.

On the development of Algol

Morrison, Ronald — 1979-01-01T00:00:00Z

The thesis outlines the major problems in the design of high level programming languages. The complexity of these languages has caused the user problems in intellectual manageability. Part of this complexity is caused by lack of generality which also causes loss of power. The maxim of power through simplicity, simplicity through generality is established. To achieve this simplicity a number of ground rules, the principle of abstraction, the principle of correspondence and the principle of data type completeness are discussed and used to form a methodology for programming language design. The methodology is then put into practice and the language S-algol is designed as the first member of a family of languages. The second part of the thesis describes the implementation of the S-algol language. In particular a simple and effective method of compiler construction based on the technique of recursive descent is developed. The method uses a hierarchy of abstractions which are implemented as layers to define the compiler. The simplicity and success of the technique depends on the structuring of the layers and the choice of abstractions. The compiler is itself written in S-algol. An abstract machine to support the S-algol language is then proposed and implemented. This machine, the S-code machine, has two stacks and a heap with a garbage collector and a unique method of procedure entry and exit. A detailed description of the S-code machine for the PDP11 computer is given in the Appendices. The thesis then describes the measurement tools used to aid the implementer and the user. The results of improvements in efficiency when these tools are used on the compiler itself are discussed. Finally, the research is evaluated and a discussion of how it may be extended is given.

Designing digital constructive visualization tools

Méndez, Gonzalo Gabriel — 2018-06-27T00:00:00Z

The emergence of tools that support fast and easy creation of visualizations has made the benefits of Information Visualization (InfoVis) more accessible. The predominant design for visualization authoring tools often includes features such as automated mappings and visualization templates, which make tools effective and easy-to-use. These features, however, still impose barriers to non-experts (i.e., people with no formal training on visualization concepts). The paradigm of Constructive Visualization (ConstructiveVis) has shown potential to overcome some of these barriers, but it has only been investigated through the use of physical tokens that people manipulate to create representations of data. This dissertation investigates how the principles of ConstructiveVis can be applied in the design and implementation of digital constructive visualization tools. This thesis presents the results of several observational studies that uncover how tools that promote a constructive approach to visualization compare to more conventional ones. It also sheds light on what kind of benefits and limitations digital ConstructiveVis brings into non-experts' visualization design process. The investigations here presented lay the foundations for the design of better visualization tools that not only allow people to create effective visualizations but also promote critical reflection on design principles.

Change blindness in proximity-aware mobile interfaces

Brock, Michael Oliver — 2018-04-21T00:00:00Z

Interface designs on both small and large displays can encourage people to alter their physical distance to the display. Mobile devices support this form of interaction naturally, as the user can move the device closer or further away as needed. The current generation of mobile devices can employ computer vision, depth sensing and other inference methods to determine the distance between the user and the display. Once known, a system can adapt the rendering of display content accordingly and enable proximity-aware mobile interfaces. The dominant method of exploiting proximity-aware interfaces is to remove or superimpose visual information. In this paper, we investigate change blindness in such interfaces. We present the results of two studies. In our first study we show that a proximity-aware mobile interface results in significantly more change blindness errors than a non-moving interface. The absolute difference in error rates was 13.7%. In our second study we show that within a proximity-aware mobile interface, gradual changes induce significantly more change blindness errors than instant changes, confirming expected change blindness behavior. Based on our results we discuss the implications of either exploiting change blindness effects or mitigating them when designing mobile proximity-aware interfaces. Funding: Google Faculty award and EPSRC grants EP/N010558/1 and EP/N014278/1 (P.O.K.).

TAPping into mental models with blocks

Rough, D. — 2017-10-10T00:00:00Z

Trigger-Action Programming (TAP) has been shown to support end-users' rule-based mental models of context-aware applications. However, when desired behaviours increase in complexity, this can lead to ambiguity that confuses events, states, and how they can be combined in meaningful ways. Blocks programming could provide a solution, through constrained editing of visual triggers, conditions and actions. We observed slips and mistakes by users performing TAP with Jeeves, our domain-specific blocks environment, and propose solutions.

Guaranteeing generalisation in neural networks

Polhill, John Gareth — 1995-01-01T00:00:00Z

Neural networks need to be able to guarantee their intrinsic generalisation abilities if they are to be used reliably. Mitchell's concept and version spaces technique is able to guarantee generalisation in the symbolic concept-learning environment in which it is implemented. Generalisation, according to Mitchell, is guaranteed when there is no alternative concept that is consistent with all the examples presented so far, except the current concept, given the bias of the user. A form of bidirectional convergence is used by Mitchell to recognise when the no-alternative situation has been reached. Mitchell's technique has problems of search and storage feasibility in its symbolic environment. This thesis aims to show that by evolving the technique further in a neural environment, these problems can be overcome. Firstly, the biasing factors which affect the kind of concept that can be learned are explored in a neural network context. Secondly, approaches for abstracting the underlying features of the symbolic technique that enable recognition of the no-alternative situation are discussed. The discussion generates neural techniques for guaranteeing generalisation and culminates in a neural technique which is able to recognise when the best fit neural weight state has been found for a given set of data and topology.

Knowledge-based interoperability for mathematical software systems

Kohlhase, Michael — 2017-01-01T00:00:00Z

There is a large ecosystem of mathematical software systems. Individually, these are optimized for particular domains and functionalities, and together they cover many needs of practical and theoretical mathematics. However, each system specializes in one area, and it remains very difficult to solve problems that need to involve multiple systems. Some integrations exist, but they are ad hoc and have scalability and maintainability issues. In particular, there is not yet an interoperability layer that combines the various systems into a virtual research environment (VRE) for mathematics. The OpenDreamKit project aims at building a toolkit for such VREs. It suggests using a central system-agnostic formalization of mathematics (Math-in-the-Middle, MitM) as the needed interoperability layer. In this paper, we conduct the first major case study that instantiates the MitM paradigm for a concrete domain as well as a concrete set of systems. Specifically, we integrate GAP, Sage, and Singular to perform computation in group and ring theory. Our work involves massive practical efforts, including a novel formalization of computational group theory, improvements to the involved software systems, and a novel mediating system that sits at the center of a star-shaped integration layout between mathematical software systems. Funding: OpenDreamKit Horizon 2020 European Research Infrastructures project (#676541) and DFG project RA-18723-1 OAF.

Radar sensing in human-computer interaction

Yeo, Hui-shyong — 2018-01-01T00:00:00Z

Plug and Play Bench : simplifying big data benchmarking using containers

Ceesay, Sheriffo — 2017-12-11T00:00:00Z

The recent boom of big data, coupled with the challenges of its processing and storage, gave rise to the development of distributed data processing and storage paradigms like MapReduce, Spark, and NoSQL databases. With the advent of cloud computing, processing and storing such massive datasets on clusters of machines is now feasible with ease. However, there are limited tools and approaches which users can rely on to gauge and comprehend the performance of their big data applications deployed locally on clusters, or in the cloud. Researchers have started exploring this area by providing benchmarking suites suitable for big data applications. However, many of these tools are fragmented, complex to deploy and manage, and do not provide transparency with respect to the monetary cost of benchmarking an application. In this paper, we present Plug And Play Bench (PAPB): an infrastructure-aware abstraction built to integrate and simplify the deployment of big data benchmarking tools on clusters of machines. PAPB automates the tedious process of installing, configuring and executing common big data benchmark workloads by containerising the tools and settings based on the underlying cluster deployment framework. Our proof of concept implementation utilises HiBench as the benchmark suite, HDP as the cluster deployment framework and Azure as the cloud platform. The paper further illustrates the inclusion of cost metrics based on the underlying Microsoft Azure cloud platform. This research was supported by a Microsoft Azure Award.

Reprogramming embedded systems at run-time

Oliver, Richard — 2014-09-02T00:00:00Z

The dynamic re-programming of embedded systems is a long-standing problem in the field. With the advent of wireless sensor networks and the 'Internet of Things' it has now become necessary to be able to reprogram at run-time due to the difficulty of gaining access to such systems once deployed. The issues of power consumption, flexibility, and operating system protections are examined for a range of approaches, and a critical comparison is given. A combination of approaches is recommended for the implementation of real-world systems and areas where further work is required are highlighted.

The mat sat on the cat : investigating structure in the evaluation of order in machine translation

McCaffery, Martin — 2017-09-28T00:00:00Z

We present a multifaceted investigation into the relevance of word order in machine translation. We introduce two tools, DTED and DERP, each using dependency structure to detect differences between the structures of machine-produced translations and human-produced references. DTED applies the principle of Tree Edit Distance to calculate edit operations required to convert one structure into another. Four variants of DTED have been produced, differing in the importance they place on words which match between the two sentences. DERP represents a more detailed procedure, making use of the dependency relations between words when evaluating the disparities between paths connecting matching nodes. In order to empirically evaluate DTED and DERP, and as a standalone contribution, we have produced WOJ-DB, a database of human judgments. Containing scores relating to translation adequacy and more specifically to word order quality, this is intended to support investigations into a wide range of translation phenomena. We report an internal evaluation of the information in WOJ-DB, then use it to evaluate variants of DTED and DERP, both to determine their relative merit and their strength relative to third-party baselines. We present our conclusions about the importance of structure to the tools and their relevance to word order specifically, then propose further related avenues of research suggested or enabled by our work.
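
To make the principle of Tree Edit Distance concrete, here is a minimal Python sketch of a unit-cost edit distance between two ordered labelled trees, computed by a memoised recursion over forests. It is a generic textbook-style formulation and not the DTED implementation: the unit costs, the tree encoding, and the names `tree`, `forest_distance` and `tree_edit_distance` are all assumptions made for this illustration, and the toy trees are not real dependency parses.

```python
from functools import lru_cache


def tree(label, *children):
    """Build an ordered tree as a hashable (label, children) pair."""
    return (label, tuple(children))


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        # Insert the rightmost root of g; its children take its place.
        _, kids = g[-1]
        return forest_distance(f, g[:-1] + kids) + 1
    if not g:
        # Delete the rightmost root of f; its children take its place.
        _, kids = f[-1]
        return forest_distance(f[:-1] + kids, g) + 1
    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    delete = forest_distance(f[:-1] + f_kids, g) + 1
    insert = forest_distance(f, g[:-1] + g_kids) + 1
    # Pair the two rightmost roots: relabel if needed, then solve the
    # children and the remaining forests independently.
    match = (
        forest_distance(f_kids, g_kids)
        + forest_distance(f[:-1], g[:-1])
        + (0 if f_label == g_label else 1)
    )
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


# Toy structures: the same head word with its two dependents swapped.
reference = tree("sat", tree("cat"), tree("mat"))
candidate = tree("sat", tree("mat"), tree("cat"))
print(tree_edit_distance(reference, candidate))
```

Because the trees are ordered, swapping two sibling words costs edits even though the bag of words is unchanged, which is the kind of structural difference an order-sensitive metric is meant to expose.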

Automatic vertebrae localization from CT scans using volumetric descriptors

Karsten, Juan — 2017-09-14T00:00:00Z

The localization and identification of vertebrae in spinal CT images plays an important role in many clinical applications, such as spinal disease diagnosis, surgery planning, and post-surgery assessment. However, automatic vertebrae localization presents numerous challenges due to partial visibility, appearance similarity of different vertebrae, varying data quality, and the presence of pathologies. Most existing methods require prior information on which vertebrae are present in a scan, and perform poorly on pathological cases, making them of little practical value. In this paper we describe three novel types of local information descriptors which are used to build more complex contextual features, and train a random forest classifier. The three features are progressively more complex, systematically addressing a greater number of limitations of the current state of the art.

Two variants of the froidure-pin algorithm for finite semigroups

Jonusas, Julius — 2018-02-08T00:00:00Z

In this paper, we present two algorithms based on the Froidure-Pin Algorithm for computing the structure of a finite semigroup from a generating set. As was the case with the original algorithm of Froidure and Pin, the algorithms presented here produce the left and right Cayley graphs, a confluent terminating rewriting system, and a reduced word of the rewriting system for every element of the semigroup. If U is any semigroup, and A is a subset of U, then we denote by ⟨A⟩ the least subsemigroup of U containing A. If B is any other subset of U, then, roughly speaking, the first algorithm we present describes how to use any information about ⟨A⟩ that has been found using the Froidure-Pin Algorithm to compute the semigroup ⟨A ∪ B⟩. More precisely, we describe the data structure for a finite semigroup S given by Froidure and Pin, and how to obtain such a data structure for ⟨A ∪ B⟩ from that for ⟨A⟩. The second algorithm is a lock-free concurrent version of the Froidure-Pin Algorithm.
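
For orientation, the Python sketch below enumerates a finite semigroup of transformations from a generating set by plain breadth-first closure, recording a word in the generators for each element and the right Cayley graph. It is deliberately naive: it is not the Froidure-Pin Algorithm (it performs a multiplication for every element and generator, and builds no rewriting system or left Cayley graph), and the representation of transformations as tuples and the names `compose` and `enumerate_semigroup` are assumptions made for this example.

```python
from collections import deque


def compose(f, g):
    """Return the transformation 'first f, then g' on the points 0..n-1."""
    return tuple(g[f[i]] for i in range(len(f)))


def enumerate_semigroup(gens):
    """Enumerate the semigroup generated by gens (tuples of equal length).

    Returns (elements, words, right_cayley), where words[s] is a tuple of
    generator indices whose product is s, and right_cayley[(s, a)] is the
    product of s with generator number a.
    """
    elements = []
    words = {}
    right_cayley = {}
    queue = deque()
    for a, g in enumerate(gens):
        if g not in words:          # skip duplicate generators
            words[g] = (a,)
            elements.append(g)
            queue.append(g)
    while queue:
        s = queue.popleft()
        for a, g in enumerate(gens):
            t = compose(s, g)
            if t not in words:      # a new element: remember how it was reached
                words[t] = words[s] + (a,)
                elements.append(t)
                queue.append(t)
            right_cayley[(s, a)] = t
    return elements, words, right_cayley


# A 3-cycle, a transposition and a rank-2 map on three points.
gens = [(1, 2, 0), (1, 0, 2), (0, 0, 2)]
elements, words, right_cayley = enumerate_semigroup(gens)
print(len(elements))
```

Since elements are discovered in breadth-first order, each recorded word is a shortest word for its element; the appeal of the Froidure-Pin approach is that it obtains comparable information while avoiding many of the multiplications this sketch performs.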

Verification of a lazy cache coherence protocol against a weak memory model

Banks, Christopher — 2017-10-02T00:00:00Z

In this paper, we verify a modern lazy cache coherence protocol, TSO-CC, against the memory consistency model it was designed for, TSO. We achieve this by first showing a weak simulation relation between TSO-CC (with a fixed number of processors) and a novel finite-state operational model which exhibits the laziness of TSO-CC and satisfies TSO. We then extend this by an existing parameterisation technique, allowing verification for an unbounded number of processors. The approach is executed entirely within a model checker; no external tool is required, and very little in-depth knowledge of formal verification methods is required of the verifier. Funding: EPSRC grant EP/M027317/1

Overcoming mental blocks : a blocks-based approach to Experience Sampling studies

Rough, Daniel John — 2017-12-01T00:00:00Z

Experience Sampling Method (ESM) studies repeatedly survey participants on their behaviours and experiences as they go about their everyday lives. Smartphones afford an ideal platform for ESM study applications as devices seldom leave their users, and can automatically sense surrounding context to augment subjective survey responses. ESM studies are employed in fields such as psychology and social science where researchers are not necessarily programmers and require tools for application creation. Previous tools using web forms, text files, or flowchart paradigms are either insufficient to model the potential complexity of study protocols, or fail to provide a low threshold to entry. We demonstrate that blocks programming simultaneously lowers the barriers to creating simple study protocols, while enabling the creation of increasingly sophisticated protocols. We discuss the design of Jeeves, our blocks-based environment for ESM studies, and explain advantages that blocks afford in ESM study design.

Intuitive and interpretable visual communication of a complex statistical model of disease progression and risk

Li, Jieyi — 2017-07-11T00:00:00Z

Computer science and machine learning in particular are increasingly lauded for their potential to aid medical practice. However, the highly technical nature of state-of-the-art techniques can be a major obstacle to their usability by health care professionals and thus to their adoption and actual practical benefit. In this paper we describe a software tool which focuses on the visualization of predictions made by a recently developed method which leverages data in the form of large-scale electronic records for making diagnostic predictions. Guided by risk predictions, our tool allows the user to explore interactively different diagnostic trajectories, or display cumulative long-term prognostics, in an intuitive and easily interpretable manner.

Recommending privacy preferences in location-sharing services

Zhao, Yuchen — 2017-06-21T00:00:00Z

Location-sharing services have become increasingly popular with the proliferation of smartphones and online social networks. People share their locations with each other to record their daily lives or satisfy their social needs. At the same time, inappropriate disclosure of location information poses threats to people's privacy. One of the reasons why people fail to protect their location privacy is the difficulty of using the current mechanisms to manually configure location-privacy settings. Since people's location-privacy preferences are context-aware, manual configuration is cumbersome. People's incapability and unwillingness to do so lead to unexpected location disclosures that violate their location privacy. In this thesis, we investigate the feasibility of using recommender systems to help people protect their location privacy. We examine the performance of location-privacy recommender systems and compare it with the state-of-the-art. We also conduct online user studies to understand people's acceptance of such recommender systems and their concerns. We revise our design of the systems according to the results of the user studies. We find that user-based collaborative filtering can accurately recommend location-privacy preferences and outperform the state-of-the-art when training data are insufficient. From users' perspective, their acceptance of location-privacy recommender systems is affected by the openness and the context of recommendations and their privacy concerns about the systems. It is feasible to use data obfuscation or decentralisation to alleviate people's concerns and meanwhile keep the systems robust against malicious data attacks.

Augmenting visual perception with gaze-contingent displays

Mauderer, Michael — 2017-06-21T00:00:00Z

Cheap and easy to use eye tracking can be used to turn a common display into a gaze-contingent display (GCD): a system that can react to the user’s gaze and adjust its content based on where an observer is looking. This can be used to enhance the rendering on screens based on perceptual insights and the knowledge about what is currently seen. This thesis investigates how GCDs can be used to support aspects of depth and colour perception. It presents experiments that investigate the effects of simulated depth of field and chromatic aberration on depth perception. It also investigates how changing the colours surrounding the attended area can be used to influence the perceived colour, and how this can be used to increase colour differentiation and potentially increase the perceived gamut of the display. The presented investigations and empirical results lay the foundation for future investigations and development of gaze-contingent technologies, as well as for general applications of colour and depth perception. The results show that GCDs can be used to support the user in tasks that are related to visual perception. The presented techniques could be used to facilitate common tasks like distinguishing the depth of objects in virtual environments or discriminating similar colours in information visualisations.

Towards a holistic framework for software artefact consistency management

Pete, Ildiko — 2017-06-21T00:00:00Z

A software system is represented by different software artefacts ranging from requirements specifications to source code. As the system evolves, artefacts are often modified at different rates and times resulting in inconsistencies, which in turn can hinder effective communication between stakeholders, and the understanding and maintenance of systems. The problem of the differential evolution of heterogeneous software artefacts has not been sufficiently addressed to date as current solutions focus on specific sets of artefacts and aspects of consistency management and are not fully automated. This thesis presents the concept of holistic artefact consistency management and a proof-of-concept framework, ACM, which aim to support the consistent evolution of heterogeneous software artefacts while minimising the impact on user choices and practices and maximising automation. The ACM framework incorporates traceability, change impact analysis, change detection, consistency checking and change propagation mechanisms and is designed to be extensible. The thesis describes the design, implementation and evaluation of the framework, and an approach to automate trace link creation using machine learning techniques. The framework evaluation uses six open source systems and suggests that managing the consistency of heterogeneous artefacts may be feasible in practical scenarios.

Child-centred technologies as learning tools within the primary classroom : exploring the role of tablets and the potential of digital pens in schools

Mann, Anne-Marie — 2017-06-21T00:00:00Z

This thesis provides insights into how technology can be and is used as child-centric learning tools within primary school classrooms. The conducted studies look closely at how tablet technology is integrated into the modern classroom, and consider how existing digital writing technologies could support handwriting-based learning exercises in future. This is achieved by conducting three in-the-wild studies, using different approaches, with a total of seventy-four children in school classrooms. In the first study, focus is placed on how tablets integrate into and with existing classroom practices, documenting when and how children use tablets in class. Relevant and complementary to this, the use of traditional writing tools is questioned and two further studies explore the potential and suitability of digital pens to support children’s handwriting-based learning. One looks in detail at how children’s handwriting is affected by different existing digital pen technologies. The other study, conducted through a creative, participatory design session, asks children to provide their opinions regarding desirable features for digital writing technology. The findings from this research classify and exemplify the role of tablets in the classroom, and explore potential design directions of digital writing tools which could be used by children in the future. This work may be useful and of interest to others who conduct research with children within the fields of Human Computer Interaction, Child Computer Interaction or education.

Seastar: a comprehensive framework for telemetry data in HPC environments

Weidner, Ole — 2017-06-27T00:00:00Z

A large number of 2nd generation high-performance computing applications and services rely on adaptive and dynamic architectures and execution strategies to run efficiently, resiliently, and at scale on today’s HPC infrastructures. They require information about applications and their environment to steer and optimize execution. We define this information as telemetry data. Current HPC platforms do not provide the infrastructure, interfaces and conceptual models to collect, store, analyze, and access such data. Today, applications depend on application- and platform-specific techniques for collecting telemetry data, introducing significant development overheads that inhibit portability and mobility. The development and adoption of adaptive, context-aware strategies is thereby impaired. To facilitate 2nd generation applications, more efficient application development, and swift adoption of adaptive applications in production, a comprehensive framework for telemetry data management must be provided by future HPC systems and services. We introduce Seastar, a conceptual model and a software framework to collect, store, analyze, and exploit streams of telemetry data generated by HPC systems and their applications. We show how Seastar can be integrated with HPC platform architectures and how it enables common application execution strategies.

Visualization of patient specific disease risk prediction

Osuala, Richard — 2017-02-16T00:00:00Z

The increasing trend of systematic collection of medical data (diagnoses, hospital admission emergencies, blood test results, scans, etc.) by health care providers offers an unprecedented opportunity for the application of modern data mining, pattern recognition, and machine learning algorithms. The ultimate aim is invariably that of improving outcomes, be it directly or indirectly. Notwithstanding the successes of recent research efforts in this realm, a major obstacle to making the developed models usable by medical professionals (rather than computer scientists or statisticians) remains largely unaddressed. Yet, a mounting amount of evidence shows that the ability to understand and easily use novel technologies is a major factor governing how widely adopted by the target users (doctors, nurses, and patients, amongst others) they are likely to be. In this work we address this technical gap. In particular, we describe a portable, web-based interface that allows health care professionals to interact with recently developed machine learning and data-driven prognostic algorithms. Our application interfaces with a statistical disease progression model and displays its predictions in an intuitive and readily understandable manner. Different types of geometric primitives and their visual properties (such as size or colour) are used to represent abstract quantities such as probability density functions, the rate of change of relative probabilities, and a series of other relevant statistics which the health care professional can use to explore patients' risk factors or provide personalized, evidence- and data-driven incentivization to the patient.

Light curve analysis from Kepler spacecraft collected data

Nigri, Eduardo — 2017-06-06T00:00:00Z

Although scarce, previous work on the application of machine learning and data mining techniques on large corpora of astronomical data has produced promising results. For example, on the task of detecting so-called Kepler objects of interest (KOIs), a range of different ‘off the shelf’ classifiers has demonstrated outstanding performance. These rather preliminary research efforts motivate further exploration of this data domain. In the present work we focus on the analysis of threshold crossing events (TCEs) extracted from photometric data acquired by the Kepler spacecraft. We show that the task of classifying TCEs as being caused by actual planetary transits as opposed to confounding astrophysical phenomena is significantly more challenging than that of KOI detection, with different classifiers exhibiting vastly different performances. Nevertheless, the best performing classifier type, the random forest, achieved excellent accuracy, predicting correctly in approximately 96% of the cases. Our results and analysis should illuminate further efforts into the development of more sophisticated, automatic techniques, and encourage additional work in the area. The authors would like to thank CNPq-Brazil and the University of St Andrews for their kind support.

TiTAN: exploring midair text entry using freehand input

Yeo, Hui Shyong — 2017-05-06T00:00:00Z

TiTAN is a spatial user interface that enables freehand, midair text entry with a distant display while only requiring a low-cost depth sensor. Our system aims to leverage one’s familiarity with the QWERTY layout. It allows users to input text, in midair, by mimicking the typing action they typically perform on a physical keyboard or touchscreen. Here, both hands and ten fingers are individually tracked, along with click action detection which enables a wide variety of interactions. We propose three midair text entry techniques and evaluate the TiTAN system with two different sensors.

Impact of cell load on 5GHz IEEE 802.11 WLAN

Abu-Tair, Mamoun — 2017-03-27T00:00:00Z

We have conducted an empirical study of the latest 5GHz IEEE 802.11 wireless LAN (WLAN) variants of 802.11n (5GHz) and 802.11ac (Wave 1), under different cell load conditions. We have considered typical configurations of both protocols on a Linux testbed. Under light load, there is no clear difference between 802.11n and 802.11ac in terms of performance and energy consumption. However, in some cases of high cell load, we have found that there may be a small advantage with 802.11ac. Overall, we conclude that there may be little benefit in upgrading from 802.11n (5GHz) to 802.11ac in its current offering, as the benefits may be too small.

Information and knowing when to forget it

Sharma, Rohit — 2017-05-14T00:00:00Z

In this paper we propose several novel approaches for incorporating forgetting mechanisms into sequential prediction based machine learning algorithms. The broad premise of our work, supported and motivated in part by recent findings stemming from neurology research on the development of human brains, is that knowledge acquisition and forgetting are complementary processes, and that learning can (perhaps unintuitively) benefit from the latter too. We demonstrate that if forgetting is implemented in a purposeful and data driven manner, there are a number of benefits which can be gained from discarding information. The framework we introduce is a general one and can be used with any baseline predictor of choice. Hence in this sense it is best described as a meta-algorithm. The method we described was developed through a series of steps which increase the adaptability of the model, while being data driven. We first discussed a weakly adaptive forgetting process which we termed passive forgetting. A fully adaptive framework, which we termed active forgetting, was developed by enveloping a passive forgetting process with a monitoring, self-aware module which detects contextual changes and makes a statistically informed choice when the model parameters should be abruptly rather than gradually updated. The effectiveness of the proposed metaframework was demonstrated on a real world data set concerned with a challenge of major practical importance: that of predicting currency exchange rates. Our approach was shown to be highly effective, reducing prediction errors by nearly 40%.

Glycaemic index prediction : a pilot study of data linkage challenges and the application of machine learning

Li, Jingyuan — 2017-02-18T00:00:00Z

The glycaemic index (GI) is widely used to characterize the effect that a food has on blood glucose, which is of major importance to diabetic individuals as well as the general population at large. At present, its applicability is severely limited by the labour involved in its measurement and the lack of understanding about how different foods interact to produce the GI of the meal comprising them. In this pilot study we examine if readily available biochemical properties of foods can be used to predict their GI, thus opening possibilities for practicable use of the GI in the management of blood glucose in everyday life. We also examine practical challenges in the cross-linking of food information sources collected by different organizations, and highlight the need for the development of a universal standard which would facilitate automatic and error-free data integration.

Reading small scalar data fields: color scales vs. Detail on Demand vs. FatFonts

Manteau, Constant — 2017-05-16T00:00:00Z

We empirically investigate the advantages and disadvantages of color- and digit-based methods to represent small scalar fields. We compare two types of color scales (one brightness-based and one that varies in hue, saturation and brightness) with an interactive tooltip that shows the scalar value on demand, and with a symbolic glyph-based approach (FatFonts). Three experiments tested three tasks: reading values, comparing values, and finding extrema. The results provide the first empirical comparisons of color scales with symbol-based techniques. The interactive tooltip enabled higher accuracy and shorter times than the color scales for reading values but showed slow completion times and low accuracy for value comparison and extrema finding tasks. The FatFonts technique showed better speed and accuracy for reading and value comparison, and high accuracy for the extrema finding task at the cost of being the slowest for this task.

Model selection and testing for an automated constraint modelling toolchain

Hussain, Bilal Syed — 2017-06-21T00:00:00Z

Constraint Programming (CP) is a powerful technique for solving a variety of combinatorial problems. Automated modelling using a refinement-based approach abstracts over modelling decisions in CP by allowing users to specify their problem in a high-level specification language such as ESSENCE. This refinement process produces many models resulting from different choices that can be selected, each with its own strengths. A parameterised specification represents a problem class where the parameters of the class define the instance of the class we wish to solve. Since each model has different performance characteristics, the model chosen is crucial to being able to solve the instance effectively. This thesis presents a method to generate instances automatically for the purpose of choosing a subset of the available models that have superior performance across the instance space. The second contribution of this thesis is a framework to automate the testing of a toolchain for automated modelling. This process includes a generator of test cases that covers all aspects of the ESSENCE specification language. This process utilises our first contribution, namely instance generation, to generate parameterised specifications. This framework can detect errors such as inconsistencies in the model produced during the refinement process. Once we have identified a specification that causes an error, this thesis presents our third contribution: a method for reducing the specification to a much simpler form, which still exhibits a similar error. Additionally, this process can generate a set of complementary specifications, including specifications that do not cause the error, to help pinpoint the root cause.

Opportunistic visualization with iVoLVER

Méndez, Gonzalo Gabriel — 2016-11-08T00:00:00Z

Proposed as 'data analysis anywhere, anytime, from anything', Opportunistic Information Visualization (Opportu-Vis) [1] seeks to provide analytical support in scenarios where the data of interest is not explicitly available and has to be retrieved from digital artifacts that are not traditionally used as data sources. Examples include raster images, web pages, vector files, and photographs. This showpiece presents how iVoLVER, the Interactive Visual Language for Visualization Extraction and Reconstruction, provides support in such settings. We briefly describe the overall construction approach of the tool in scenarios where different digital artifacts are used to compose interactive visuals. All of this becomes possible by using the data extraction capabilities of iVoLVER together with the elements of its visual language.

Some problems in the theory and application of the methods of numerical taxonomy

Wishart, David — 1970-01-01T00:00:00Z

Several of the methods of numerical taxonomy are compared and shown to be variants of a tripartite grouping procedure associated with a generalised intercluster similarity function involving ten computational parameters. Clustering by the techniques of hierarchic fusion, monothetic division and iterative relocation is obtained using different arithmetic combinations of the function parameters to both compute similarities and effect changes in cluster membership. The combinatorial solution for Ward's method is found, and the centroid sorting combinatorial solution is extended for size difference, shape difference, dispersion and dot product coefficients. It is suggested that clusters are characterised more by the choice of similarity criterion than by the choice of method, and it is demonstrated that some common criteria such as distance and the error sum of squares are inclined to force spherical 'minimum-variance' classes. These are contrasted by 'natural' classes, which correspond to closed density surfaces defined for a multi-variate sample space by the underlying probability density function. A method for mode-seeking is developed from this probabilistic model through various theoretical and experimental phases, and it is shown to perform slightly better than iterative relocation with the minimum-variance criteria using several Gaussian test populations. A fast algorithm is proposed for the solution of the Jardine-Sibson method for generating overlapping classes, and it is observed that this technique finds natural classes and is closely related to the probabilistic model. Some aspects of computational procedures are discussed, and in particular, it is proposed that a generalised system involving a statistical language, conversational mode package and program suite could be developed from a basic subroutine system. 
Paging and simulation techniques for the organisation of direct-access data files are suggested, and a comprehensive package of computer programs for cluster analysis is described.

Survey on data fragmentation issues for users

Balekundargi, Pooja Basavaraj — 2016-01-01T00:00:00Z

Information is just data that is processed and given meaning. We live in a world where we use information to carry out the simplest tasks. We normally have different types of information, some very important and some less so. Some of it is used for entertainment, some for the purpose of decision making, or just as a knowledge improvement tool. With new technologies being invented every day, we tend to use different methods to store this information. We store information across various locations and across different services depending on personal preferences and needs. This causes information to get dispersed, a phenomenon most commonly known as data fragmentation. It is important to have efficient access to information at the time of need. However, although storing information in different places might aid in accessibility beyond geographical boundaries, it also hinders the process of finding and remembering the location of the right information. The following research study aims to gain insight into the methods used by individuals while storing information that is valuable to their daily activities. We also look at the mindsets of the users while they make decisions regarding storage methods. The empirical research carried out over the course of this dissertation provides insight into the causes and consequences of data fragmentation with regard to personal information. The findings have been analyzed and reported in readable format.

Algorithms for optimising heterogeneous Cloud virtual machine clusters

Thai, Long Thanh — 2016-12-12T00:00:00Z

It is challenging to execute an application in a heterogeneous cloud cluster, which consists of multiple types of virtual machines with different performance capabilities and prices. This paper aims to mitigate this challenge by proposing a scheduling mechanism to optimise the execution of Bag-of-Task jobs on a heterogeneous cloud cluster. The proposed scheduler considers two approaches to select suitable cloud resources for executing a user application while satisfying pre-defined Service Level Objectives (SLOs) both in terms of execution deadline and minimising monetary cost. Additionally, a mechanism for dynamic re-assignment of jobs during execution is presented to resolve potential violation of SLOs. Experimental studies are performed both in simulation and on a public cloud using real-world applications. The results highlight that our scheduling approaches result in cost saving of up to 31% in comparison to naive approaches that only employ a single type of virtual machine in a homogeneous cluster. Dynamic reassignment completely prevents deadline violation in the best-case and reduces deadline violations by 95% in the worst-case scenario. This research was supported by an Amazon Web Services Education Research grant.
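
To make the selection problem concrete, here is a minimal Python sketch of one way a cost-minimising, deadline-respecting choice of virtual machines could be posed. It is not the scheduler proposed in the paper: it is a brute-force search under a deliberately simplified model, and the function name `cheapest_mix`, the VM types, task times and prices are all hypothetical values chosen for illustration. The model assumes identical tasks, that each VM runs whole tasks back to back, and that every VM is billed in whole hours up to the deadline.

```python
from itertools import product
from math import ceil


def cheapest_mix(n_tasks, deadline_s, vm_types, max_per_type=4):
    """Return (cost_cents, counts) for the cheapest feasible VM mix, or None.

    counts[i] is the number of VMs of vm_types[i] to start.
    """
    hours_billed = ceil(deadline_s / 3600)  # assumed whole-hour billing
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(vm_types)):
        # Whole tasks each mix can finish before the deadline.
        capacity = sum(
            c * (deadline_s // vm["task_s"]) for c, vm in zip(counts, vm_types)
        )
        if capacity < n_tasks:
            continue  # this mix would miss the deadline
        cost = hours_billed * sum(
            c * vm["cents_per_h"] for c, vm in zip(counts, vm_types)
        )
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best


# Hypothetical VM types: seconds per task and price in cents per hour.
vm_types = [
    {"name": "small", "task_s": 60, "cents_per_h": 10},
    {"name": "large", "task_s": 20, "cents_per_h": 25},
]
print(cheapest_mix(n_tasks=600, deadline_s=3600, vm_types=vm_types))
```

With these made-up numbers the search prefers a heterogeneous mix (one small and three large VMs, 85 cents) over the cheapest feasible single-type cluster (four large VMs, 100 cents). Exhaustive search is only workable for a handful of VM types and small counts.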

Timing properties and correctness for structured parallel programs on x86-64 multicores

Hammond, Kevin — 2016-01-01T00:00:00Z

This paper determines correctness and timing properties for structured parallel programs on x86-64 multicores. Multicore architectures are increasingly common, but real architectures have unpredictable timing properties, and even correctness is not obvious above the relaxed-memory concurrency models that are enforced by commonly-used hardware. This paper takes a rigorous approach to correctness and timing properties, examining common locking protocols from first principles, and extending this through queues to structured parallel constructs. We prove functional correctness and derive simple timing models, and both extend for the first time from low-level primitives to high-level parallel patterns. Our derived high-level timing models for structured parallel programs allow us to accurately predict upper bounds on program execution times on x86-64 multicores.

Achieving stable subspace clustering by post-processing generic clustering results

Pham, Duc-Son — 2016-10-31T00:00:00Z

We propose an effective subspace selection scheme as a post-processing step to improve results obtained by sparse subspace clustering (SSC). Our method starts with the computation of stable subspaces using a novel random sampling scheme. The preliminary subspaces thus constructed are used to identify the initially incorrectly clustered data points and then to reassign them to more suitable clusters based on their goodness-of-fit to the preliminary model. To improve the robustness of the algorithm, we use a dominant nearest subspace classification scheme that controls the level of sensitivity against reassignment. We demonstrate that our algorithm is convergent and superior to the direct application of a generic alternative such as principal component analysis. On several popular datasets for motion segmentation and face clustering pervasively used in the sparse subspace clustering literature, the proposed method is shown to greatly reduce the incidence of clustering errors while introducing negligible disturbance to the data points already correctly clustered.

Towards sophisticated learning from EHRs : increasing prediction specificity and accuracy using clinically meaningful risk criteria

Vasiljeva, Ieva — 2016-08-16T00:00:00Z

Computer based analysis of Electronic Health Records (EHRs) has the potential to provide major novel insights of benefit both to specific individuals in the context of personalized medicine, as well as on the level of population-wide health care and policy. The present paper introduces a novel algorithm that uses machine learning for the discovery of longitudinal patterns in the diagnoses of diseases. Two key technical novelties are introduced: one in the form of a novel learning paradigm which enables greater learning specificity, and another in the form of a risk driven identification of confounding diagnoses. We present a series of experiments which demonstrate the effectiveness of the proposed techniques, and which reveal novel insights regarding the most promising future research directions.

Identification of promising research directions using machine learning aided medical literature analysis

Andrei, Victor — 2016-08-16T00:00:00Z

The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aid in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora, and of tracking complex temporal changes within them.

Enabling energy-awareness for internet video

Ejembi, Oche Omobamibo — 2016-03-30T00:00:00Z

Continuous improvements to the state of the art have made it easier to create, send and receive vast quantities of video over the Internet. Catalysed by these developments, video is now the largest, and fastest growing type of traffic on modern IP networks. In 2015, video was responsible for 70% of all traffic on the Internet, with an compound annual growth rate of 27%. On the other hand, concerns about the growing energy consumption of ICT in general, continue to rise. It is not surprising that there is a significant energy cost associated with these extensive video usage patterns. In this thesis, I examine the energy consumption of typical video configurations during decoding (playback) and encoding through empirical measurements on an experimental test-bed. I then make extrapolations to a global scale to show the opportunity for significant energy savings, achievable by simple modifications to these video configurations. Based on insights gained from these measurements, I propose a novel, energy-aware Quality of Experience (QoE) metric for digital video - the Energy - Video Quality Index (EnVI). Then, I present and evaluate vEQ-benchmark, a benchmarking and measurement tool for the purpose of generating EnVI scores. The tool enables fine-grained resource-usage analyses on video playback systems, and facilitates the creation of statistical models of power usage for these systems. I propose GreenDASH, an energy-aware extension of the existing Dynamic Adaptive Streaming over HTTP standard (DASH). GreenDASH incorporates relevant energy-usage and video quality information into the existing standard. It could enable dynamic, energy-aware adaptation for video in response to energy-usage and user ‘green’ preferences. I also evaluate the subjective perception of such energy-aware, adaptive video streaming by means of a user study featuring 36 participants. 
I examine how video may be adapted to save energy without a significant impact on the Quality of Experience of these users. In summary, this thesis highlights the significant opportunities for energy savings if Internet users gain an awareness of their energy usage, and presents a technical discussion of how this can be achieved by straightforward extensions to the current state of the art.
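To make the quoted 27% compound annual growth rate concrete, the short sketch below shows how such a rate compounds over several years. The function names are illustrative and not taken from the thesis, and the calculation assumes the rate stays constant, which the abstract does not claim.

```python
import math


def project_compound_growth(initial: float, annual_rate: float, years: int) -> float:
    """Value after `years` of compounding at a constant `annual_rate` (0.27 means 27%)."""
    return initial * (1.0 + annual_rate) ** years


def doubling_time_years(annual_rate: float) -> float:
    """Number of years for a quantity to double at a constant compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


if __name__ == "__main__":
    # Relative video traffic volume, normalised to 1.0 in the base year.
    for year in range(4):
        print(year, round(project_compound_growth(1.0, 0.27, year), 3))
    print("doubling time (years):", round(doubling_time_years(0.27), 2))
```

Under that constant-rate assumption, a volume growing at 27% a year roughly doubles every three years.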

Predicting and optimizing image compression

Murashko, Oleksandr — 2016-10-01T00:00:00Z

Image compression is a core task for mobile devices, social media and cloud storage backend services. Key evaluation criteria for compression are: the quality of the output, the compression ratio achieved and the computational time (and energy) expended. Predicting the effectiveness of standard compression implementations like libjpeg and WebP on a novel image is challenging, and often leads to non-optimal compression. This paper presents a machine learning-based technique to accurately model the outcome of image compression for arbitrary new images in terms of quality and compression ratio, without requiring significant additional computational time and energy. Using this model, we can actively adapt the aggressiveness of compression on a per image basis to accurately fit user requirements, leading to a more optimal compression.

WatchMI: applications of watch movement input on unmodified smartwatches

Yeo, Hui Shyong — 2016-09-06T00:00:00Z

In this demo, we show that it is possible to enhance touch interaction on unmodified smartwatches to support continuous pressure touch, twist and pan gestures, by analyzing only the real-time data of the Inertial Measurement Unit (IMU). Our evaluation results show that the three proposed input interfaces are accurate, noise-resistant, easy to use and can be deployed to a variety of smartwatches. We then showcase the potential of this work with seven example applications. During the demo session, users can try the prototype.

Using quantified-self for future remote health monitoring

Khorakhun, Chonlatee — 2015-01-01T00:00:00Z

Remote monitoring is an essential part of future mHealth systems for the delivery of personal and pervasive healthcare, especially to allow the collection of personal bio-data outside clinical environments. mHealth involves the use of mobile technologies including sensors and smart phones with Internet connectivity to collect personal bio-data. Yet, by its very nature, it presents considerable challenges: (1) it will be a highly distributed task, (2) requiring collection of bio-data from a myriad of sources, (3) to be gathered at the clinical site, and (4) via secure communication channels. To address these challenges, we propose the use of an online social network (OSN) based on the quantified-self, i.e. the use of wearable sensors to monitor, collect and distribute personal bio-data, as a key component of a near-future remote health monitoring system. Additionally, the use of a social media context allows existing social interactions within the healthcare regime to be modeled within a carer network, working in harmony with, and providing support for, existing relationships and interactions between patients and healthcare professionals. We focus on the use of an online social media platform (OSMP) to enable two primitive functions of quantified-self which we consider essential for mHealth, and on which larger personal healthcare services could be built: remote health monitoring of personal bio-data, and an alert system for asynchronous notifications. We analyse the general requirements in a carer network for these two primitive functions, in terms of four different viewpoints within the carer network: the patient, the doctor in charge, a professional carer, and a family member (or friend) of the patient. We propose that a wellbeing remote monitoring scenario can act as a suitable proxy for mHealth monitoring by the use of an OSN.
To allow rapid design, experimentation and evaluation of mHealth systems, we describe our experience of creating an mHealth system based on a wellbeing scenario, exploiting the quantified-self approach of measurement and monitoring. The use of wellbeing data in this manner is particularly valuable to researchers and systems developers, as key development work can be completed within a realistic scenario, but without risk to sensitive patient medical data. We discuss the suitability of using wellbeing monitoring as a proxy for mHealth monitoring with OSMPs in terms of functionality, performance and the key challenge in ensuring appropriate levels of security and privacy. We find that OSMPs based on quantified-self offer great potential for enabling personal and pervasive healthcare in an mHealth scenario.

Quantifying scribal behavior : a novel approach to digital paleography

Sampath, Vinodh Rajan — 2016-11-30T00:00:00Z

We propose a novel approach for analyzing scribal behavior quantitatively using information about the handwriting of characters. To implement this approach, we develop a computational framework that recovers this information and decomposes the characters into primitives (called strokes) to create a hierarchically structured representation. We then propose a number of intuitive metrics quantifying various facets of scribal behavior, which are derived from the recovered information and character structure. We further propose the use of techniques modeling the generation of handwriting to directly study the changes in writing behavior. We then present a case study in which we use our framework and metrics to analyze the development of four major Indic scripts. We show that our framework and metrics coupled with appropriate statistical methods can provide great insight into scribal behavior by discovering specific trends and phenomena with quantitative methods. We also illustrate the use of handwriting modeling techniques in this context to study the divergence of the Brahmi script into two daughter scripts. We conduct a user study with domain experts to evaluate our framework and salient results from the case study, and we elaborate on the results of this evaluation. Finally, we present our conclusions and discuss the limitations of our research along with future work that needs to be done.

Client-side energy costs of video streaming

Ejembi, Oche — 2016-02-04T00:00:00Z

Through measurements on our testbed, we show how users of Netflix could make energy savings of up to 34% by adjusting video quality settings. We estimate the impacts of these quality settings on the energy consumption of client systems and the network. If users exercise choice in their video streaming habits, over 100 GWh of energy a year could be saved on a global scale. We discuss how providing energy usage information to users of digital video could enable them to make choices of video settings to reduce energy usage, and we estimate savings on associated electricity costs and carbon emissions.

Adaptive pattern recognition in a real-world environment

Bairaktaris, Dmitrios — 1991-01-01T00:00:00Z

This thesis introduces and explores the notion of a real-world environment with respect to adaptive pattern recognition and neural network systems. It then examines the individual properties of a real-world environment and proposes Continuous Adaptation, Persistence of information and Context-sensitive recognition to be the major design criteria a neural network system in a real-world environment should satisfy. Based on these criteria, it then assesses the performance of Hopfield networks and Associative Memory systems and identifies their operational limitations. This leads to the introduction of Randomized Internal Representations, a novel class of neural network systems which stores information in a fully distributed way yet is capable of encoding and utilizing context. It then assesses the performance of Competitive Learning and Adaptive Resonance Theory systems and again having identified their operational weakness, it describes the Dynamic Adaptation Scheme which satisfies all three design criteria for a real-world environment.

Descriptor transition tables for object retrieval using unconstrained cluttered video acquired using a consumer level handheld mobile device

Rieutort-Louis, Warren — 2016-11-03T00:00:00Z

Visual recognition and vision based retrieval of objects from large databases are tasks with a wide spectrum of potential applications. In this paper we propose a novel recognition method from video sequences suitable for retrieval from databases acquired in highly unconstrained conditions e.g. using a mobile consumer-level device such as a phone. On the lowest level, we represent each sequence as a 3D mesh of densely packed local appearance descriptors. While image plane geometry is captured implicitly by a large overlap of neighbouring regions from which the descriptors are extracted, 3D information is extracted by means of a descriptor transition table, learnt from a single sequence for each known gallery object. These allow us to connect local descriptors along the 3rd dimension (which corresponds to viewpoint changes), thus resulting in a set of variable length Markov chains for each video. The matching of two sets of such chains is formulated as a statistical hypothesis test, whereby a subset of each is chosen to maximize the likelihood that the corresponding video sequences show the same object. The effectiveness of the proposed algorithm is empirically evaluated on the Amsterdam Library of Object Images and a new highly challenging video data set acquired using a mobile phone. On both data sets our method is shown to be successful in recognition in the presence of background clutter and large viewpoint changes.

Open Badges : a best-practice framework

Voogt, Lennert — 2016-07-13T00:00:00Z

The widespread adoption of online education is severely challenged by issues of verifiability, reliability, security and credibility. Open Badges exist to address these challenges, but there is no consensus as to what constitutes best practice for implementing an Open Badge system within an educational context. In this paper we survey the current landscape of Open Badges from educational and technological perspectives. We analyze a broad set of openly-reported pilot projects and case studies, and derive a comprehensive best practice framework that tries to capture the requirements for successful implementation within educational institutions. We conclude by identifying some significant gaps in the technology and some possible future research directions.

Machine checkable design patterns using dependent types and domain specific goal-oriented modelling languages

de Muijnck-Hughes, Jan — 2016-06-22T00:00:00Z

Goal-Oriented Modelling Languages such as the Goal Requirements Language (GRL) have been used to reason about Design Patterns. However, the GRL is a general purpose modelling language that does not support concepts bespoke to the pattern domain. This thesis has investigated how advanced programming language techniques, namely Dependent Types and Domain Specific Languages, can be used to enhance the design and construction of Domain Specific Modelling languages (DSMLs), and applied the results to Design Pattern Engineering. This thesis presents Sif, a DSML for reasoning about design patterns as goal-oriented requirements problems. Sif presents modellers with a modelling language tailored to the pattern domain but leverages the GRL for realisation of the modelling constructs. Dependent types have influenced the design and implementation of Sif to provide correctness guarantees, and have led to the development of NovoGRL, a novel extension of the GRL. A technique for DSML implementation called Types as (Meta) Modellers was developed in which the interpretation between a DSML and its host language is implemented directly within the type-system of the DSML. This provides correctness guarantees of DSML model instances during model construction. Models can be constructed if and only if the DSML’s type-system can build a valid representation of the model in the host language. This thesis also investigated design pattern evaluation, developing PREMES, an evaluation framework that uses tailorable testing techniques to provide demonstrable reporting on pattern quality. Linking PREMES with Sif are: Freyja - an active pattern document schema in which Sif models are embedded within pattern documents; and Frigg - a tool for interacting with pattern documents. The proof-of-concept tools in this thesis demonstrate: machine enhanced interactions with design patterns; reproducible automation in the PREMES framework; and machine checking of pattern documents as Sif models.
With the tooling and techniques presented, design pattern engineering can become a more rigorous, demonstrable, and machine checkable process.

Extending cloud-based applications in challenged environments with mobile opportunistic networks

Pal, Shantanu — 2016-01-01T00:00:00Z

With the tremendous growth of mobile devices (e.g., smartphones, tablets and PDAs) in recent years, users are looking for more advanced platforms in order to use their computational applications (e.g., processing and storage) in a faster and more convenient way. In addition, mobile devices are capable of using cloud-based applications and the use of such technology is growing in popularity. However, one major concern is how to efficiently access these cloud-based applications when using a resource-constrained mobile device. Essentially, applications require a continuous Internet connection, which is difficult to obtain in challenged environments that lack an infrastructure for communication (e.g., sparse or rural areas), in areas with infrastructure (e.g., urban or high-density areas) whose access networks are restricted or full of interference, and even in areas with high Internet roaming costs. In these situations, mobile opportunistic networks may be used to make cloud-based applications available to the user. In this thesis we explore extending cloud-based applications with mobile opportunistic networks in challenged environments and observe how local users' social interactions and collaborations help to improve the overall message delivery performance in the network. With real-world trace-driven simulations, we compare and contrast different users' behaviours in message forwarding, the impact of various network loads (e.g., number of messages) and of long messages, and the impact of different wireless networking technologies, across various opportunistic routing protocols in a challenged environment.

All across the circle : using auto-ordering to improve object transfer between mobile devices

Li, Chengzhao — 2016-06-01T00:00:00Z

People frequently form small groups in many social and professional situations: from conference attendees meeting at a coffee break, to siblings gathering at a family barbecue. These ad-hoc gatherings typically form into predictable geometries based on circles or circular arcs (called F-Formations). Because our lives are increasingly stored and represented by data on handheld devices, the desire to be able to share digital objects while in these groupings has increased. Using the relative position in these groups to facilitate file sharing can enable intuitive techniques such as passing or flicking. However, there is no reliable, lightweight, ad-hoc technology for detecting and representing relative locations around a circle. In this paper, we present two systems that can auto-order locations about a circle based on sensors that are standard on commodity smartphones. We tested these systems using an object-passing task in a laboratory environment against unordered and proximity-based systems, and show that our techniques are faster, are more accurate, and are preferred by users.
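As a rough illustration of what auto-ordering around a circle can mean, the sketch below orders devices by compass heading. It is not the authors' systems: it assumes, hypothetically, that every participant faces the centre of the circle and that each phone reports a compass heading in degrees. Under that assumption, sorting by heading recovers the order around the circle. The function and device names are invented for the example.

```python
def circular_order(headings_deg: dict[str, float]) -> list[str]:
    """Order device ids around the circle by compass heading (assumed to point at the centre)."""
    return sorted(headings_deg, key=lambda device: headings_deg[device] % 360.0)


def neighbours(order: list[str], device: str) -> tuple[str, str]:
    """Return the (previous, next) devices around the circle, wrapping at the ends."""
    i = order.index(device)
    return order[i - 1], order[(i + 1) % len(order)]


if __name__ == "__main__":
    # Hypothetical headings reported by three phones in one group.
    headings = {"alice": 10.0, "bob": 250.0, "carol": 130.0}
    order = circular_order(headings)
    print(order)
    print(neighbours(order, "alice"))
```

Once each device knows its two neighbours, a pass or flick gesture can be routed to the adjacent device rather than to an arbitrary member of the group.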

Traffic visualization - applying information visualization techniques to enhance traffic planning

Picozzi, Matteo — 2013-02-21T00:00:00Z

In this paper, we present a space-time visualization that gives city decision-makers the ability to analyse and uncover important “city events” in an understandable manner for city planning activities. An interactive Web mashup visualization is presented that integrates several visualization techniques to give a rapid overview of traffic data. We illustrate our approach as a case study for traffic visualization systems, using datasets from the city of Oulu, and the approach can be extended to other city planning activities. We also report the feedback of real users (traffic management employees, traffic police officers, city planners) to support our arguments.

Adult dental anxiety : recent assessment approaches and psychological management in a dental practice setting

Humphris, Gerald Michael — 2016-05-01T00:00:00Z

Dental Anxiety of patients is a common feature of the everyday experience of dental practice. This article advocates the use of regular assessment of this psychological construct to assist in patient management. Various tools, such as the Modified Dental Anxiety Scale (MDAS), are available to monitor dental anxiety that are quick to complete and easy to interpret. Patient burden is low. A new mobile phone assessment system (DENTANX) is being developed for distribution. This application and other psychological interventions are being investigated to assist patients to receive dental care routinely.

Multihoming with ILNP in FreeBSD

Simpson, Bruce — 2016-06-21T00:00:00Z

Multihoming allows nodes to be multiply connected to the network. It forms the basis of features which can improve network responsiveness and robustness; e.g. load balancing and fail-over, which can be considered as a choice between network locations. However, IP today assumes that IP addresses specify both network location and node identity. Therefore, these features must be implemented at routers. This dissertation considers an alternative based on the multihoming approach of the Identifier Locator Network Protocol (ILNP). ILNP is one of many proposals for a split between network location and node identity. However, unlike other proposals, ILNP removes the use of IP addresses as they are used today. To date, ILNP has not been implemented within an operating system stack. I produce the first implementation of ILNP in FreeBSD, based on a superset of IPv6 – ILNPv6 – and demonstrate a key feature of ILNP: multihoming as a first class function of the operating system, rather than being implemented as a routing function as it is today. To evaluate the multihoming capability, I demonstrate one important application of multihoming – load distribution – at three levels of network hierarchy including individual hosts, a singleton Site Border Router (SBR), and a novel, dynamically instantiated, distributed SBR (dSBR). For each level, I present empirical results from a hardware testbed; metrics include latency, throughput, loss and reordering. I compare performance with unmodified IPv6 and NPTv6. Finally, I evaluate the feasibility of dSBR-ILNPv6 as an alternative to existing multihoming approaches, based on measurements of the dSBR’s responsiveness to changes in site connectivity. We find that multihoming can be implemented by individual hosts and/or SBRs, without requiring additional routing state as is the case today, and without any significant additional load or overhead compared to unicast IPv6.
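The split between network location and node identity can be illustrated with a small sketch. In the ILNP specifications, an ILNPv6 address-sized value is read as a 64-bit Locator in the upper bits and a 64-bit Node Identifier in the lower bits. The helper names below are hypothetical, and the code only demonstrates this split; it is not part of the FreeBSD implementation described above.

```python
import ipaddress


def split_ilnpv6(address: str) -> tuple[int, int]:
    """Split a 128-bit IPv6-format address into (locator, identifier), 64 bits each."""
    value = int(ipaddress.IPv6Address(address))
    locator = value >> 64                  # upper 64 bits: names the network
    identifier = value & ((1 << 64) - 1)   # lower 64 bits: names the node
    return locator, identifier


def same_node(addr_a: str, addr_b: str) -> bool:
    """Two addresses denote the same node if their identifiers match, whatever the locator."""
    return split_ilnpv6(addr_a)[1] == split_ilnpv6(addr_b)[1]


if __name__ == "__main__":
    # Documentation-prefix addresses standing in for two upstream links of one multihomed host.
    via_link_1 = "2001:db8:1:2::abcd"
    via_link_2 = "2001:db8:9:9::abcd"
    print(hex(split_ilnpv6(via_link_1)[0]), hex(split_ilnpv6(via_link_1)[1]))
    print(same_node(via_link_1, via_link_2))
```

Because the identifier stays fixed while locators vary, a multihomed node can be reached through several networks without its identity changing.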

Parallel reality : tandem exploration of real and virtual environments

Davies, C. J. — 2016-01-18T00:00:00Z

Alternate realities have fascinated mankind since early prehistory and with the advent of the computer and the smartphone we have seen the rise of many different categories of alternate reality that seek to augment, diminish, mix with or ultimately replace our familiar real world in order to expand our capabilities and our understanding. This thesis presents parallel reality as a new category of alternate reality which further addresses the vacancy problem that manifests in many previous alternate reality experiences. Parallel reality describes systems comprising two environments that the user may freely switch between, one real and the other virtual, both complete unto themselves. Parallel reality is framed within the larger ecosystem of previously explored alternate realities through a thorough review of existing categorisation techniques and taxonomies, leading to the introduction of the combined Milgram/Waterworth model and an extended definition of the vacancy problem for better visualising experience in alternate reality systems. Investigation into whether an existing state of the art alternate reality modality (Situated Simulations) could allow for parallel reality investigation via the Virtual Time Windows project was followed by the development of a bespoke parallel reality platform called Mirrorshades, which combined the modern virtual reality hardware of the Oculus Rift with the novel indoor positioning system of IndoorAtlas. Users were thereby granted the ability to walk through their real environment and to at any point switch their view to the equivalent vantage point within an immersive virtual environment. The benefits that such a system provides by granting users the ability to mitigate the effects of the extended vacancy problem and explore parallel real and virtual environments in tandem were experimentally shown through application to a use case within the realm of cultural heritage at a 15th century chapel.
Evaluation of these user studies led to the establishment of a number of best practice recommendations for future parallel reality endeavours.

On the construction of decentralised service-oriented orchestration systems

Jaradat, Ward — 2016-06-22T00:00:00Z

Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. Orchestrating scientific workflows presents a significant research challenge: they are typically executed in a manner such that all data pass through a centralised computer server known as the engine, which causes unnecessary network traffic that leads to a performance bottleneck. These workflows are commonly composed of services that perform computation over geographically distributed resources, and involve the management of dataflows between them. Centralised orchestration is clearly not a scalable approach for coordinating services dispersed across distant geographical locations. This thesis presents a scalable decentralised service-oriented orchestration system that relies on a high-level data coordination language for the specification and execution of workflows. This system’s architecture consists of distributed engines, each of which is responsible for executing part of the overall workflow. It exploits parallelism in the workflow by decomposing it into smaller sub-workflows, and determines the most appropriate engines to execute them using computation placement analysis. This permits the workflow logic to be distributed closer to the services providing the data for execution, which reduces the overall data transfer in the workflow and improves its execution time. This thesis provides an evaluation of the presented system which concludes that decentralised orchestration provides scalability benefits over centralised orchestration, and improves the overall performance of executing a service-oriented workflow.

Mobility as first class functionality : ILNPv6 in the Linux kernel

Phoomikiattisak, Ditchaphong — 2016-06-01T00:00:00Z

Mobility is an increasingly important aspect of communication for the Internet. The usage of handheld computing devices such as tablets and smartphones is increasingly popular among Internet users. However, the current Internet protocol, IP, was not originally designed to support mobility over the Internet. Mobile users currently suffer from connection disruption when they move around. Once a device changes its point of attachment between different wireless technologies (vertical handoff), e.g. from WiFi to 3G, the IP address changes, and the bound session (e.g. TCP session) breaks. While the IETF Mobile IPv4 (MIPv4) and Mobile IPv6 (MIPv6) solutions have been defined for some time, and implementations are available, they have seen little deployment due to their complexity and performance. This thesis has examined how IP mobility can be supported as first class functionality, i.e. mobility can be enabled through the end hosts only, without changing the current network infrastructure. Current approaches such as MIPv6 require the use of proxies and tunnels which introduce protocol overhead and impact transport layer performance. The Identifier-Locator Network Protocol (ILNP) is an alternative approach which potentially works end-to-end, but this is yet to be tested. This thesis shows that ILNP provides mobility support as first class functionality, is implemented in an operating system kernel, and is accessible from the standard API without requiring changes to applications. Mobility management is controlled and managed by the end-systems, and does not require additional network-layer entities; only the end hosts need to be upgraded for ILNP to operate. This work demonstrates an instance of ILNP that is a superset of IPv6, called ILNPv6, that is implemented by extending the current IPv6 code in the Linux kernel.
A direct performance comparison of ILNPv6 and MIPv6 is presented, showing the improved control and performance of ILNPv6, in terms of flow continuity, packet loss, handoff delay, and signalling overhead.

Applying contextual integrity to the study of social network sites

Hutton, Luke — 2015-11-30T00:00:00Z

Social network sites (SNSs) have become very popular, with more than 1.39 billion people using Facebook alone. The ability to share large amounts of personal information with these services, such as location traces, photos, and messages, has raised a number of privacy concerns. The popularity of these services has enabled new research directions, allowing researchers to collect large amounts of data from SNSs to gain insight into how people share information, and to identify and resolve issues with such services. There are challenges to conducting such research responsibly, ensuring studies are ethical and protect the privacy of participants, while ensuring research outputs are sustainable and can be reproduced in the future. These challenges motivate the application of a theoretical framework that can be used to understand, identify, and mitigate the privacy impacts of emerging SNSs, and the conduct of ethical SNS studies. In this thesis, we apply Nissenbaum's model of contextual integrity to the study of SNSs. We develop an architecture for conducting privacy-preserving and reproducible SNS studies that upholds the contextual integrity of participants. We apply the architecture to the study of informed consent to show that contextual integrity can be leveraged to improve the acquisition of consent in such studies. We then use contextual integrity to diagnose potential privacy violations in an emerging form of SNS.

A Linked Data scalability challenge : concept reuse leads to semantic decay

Pareti, Paolo — 2015-06-28T00:00:00Z

The increasing amount of available Linked Data resources is laying the foundations for more advanced Semantic Web applications. One of their main limitations, however, remains the general low level of data quality. In this paper we focus on a measure of quality which is negatively affected by the increase of the available resources. We propose a measure of semantic richness of Linked Data concepts and we demonstrate our hypothesis that the more a concept is reused, the less semantically rich it becomes. This is a significant scalability issue, as one of the core aspects of Linked Data is the propagation of semantic information on the Web by reusing common terms. We prove our hypothesis with respect to our measure of semantic richness and we validate our model empirically. Finally, we suggest possible future directions to address this scalability problem.

Evaluating the effects of fluid interface components on tabletop collaboration

Hinrichs, Uta — 2006-05-23T00:00:00Z

Tabletop displays provide exciting opportunities to support individual and collaborative activities such as planning, organizing, and storyboarding. It has been previously suggested that continuous flow of interface items can ease information access and exploration on a tabletop workspace, yet this concept has not been adequately studied. This paper presents an exploratory user study of Interface Currents, a reconfigurable and mobile tabletop interface component that offers a controllable flow for interface items placed on its surface. Our study shows that Interface Currents supported information access and sharing on a tabletop workspace. The study findings also demonstrate that mobility, flexibility, and general adjustability of Interface Currents are important factors in providing interface support for variations in task and group interactions.

Large displays in urban life : from exhibition halls to media facades

Hinrichs, Uta — 2011-05-07T00:00:00Z

Recent trends show an increasing prevalence of large interactive displays in public urban life. For example, museums, libraries, public plazas, or architectural facades take advantage of interactive technologies that present information in a highly visual and interactive way. Studies confirm the potential of large interactive display installations for educating, entertaining, and providing evocative experiences. This workshop will provide a platform for researchers and practitioners from different disciplines to exchange insights on current research questions in the area. The workshop will focus on how to design large interactive display installations that promote engaging experiences that go beyond playful interaction, and how to evaluate their impact. The goal is to cross-fertilize insights from different disciplines, establish a more general understanding of large interactive displays in public urban contexts, and to develop an agenda for future research directions in this area.

Using unsupervised machine learning for fault identification in virtual machines

Schneider, Christopher — 2015-01-01T00:00:00Z

Self-healing systems promise operating cost reductions in large-scale computing environments through the automated detection of, and recovery from, faults. However, at present there appears to be little known empirical evidence comparing the different approaches, or demonstrations that such implementations reduce costs. This thesis compares previous and current self-healing approaches before demonstrating a new, unsupervised approach that combines artificial neural networks with performance tests to perform fault identification in an automated fashion, i.e. the correct and accurate determination of which computer features are associated with a given performance test failure. Several key contributions are made in the course of this research including an analysis of the different types of self-healing approaches based on their contextual use, a baseline for future comparisons between self-healing frameworks that use artificial neural networks, and a successful, automated fault identification in cloud infrastructure, and more specifically virtual machines. This approach uses three established machine learning techniques: Naïve Bayes, Baum-Welch, and Contrastive Divergence Learning. The latter demonstrates minimisation of human-interaction beyond previous implementations by producing a list in decreasing order of likelihood of potential root causes (i.e. fault hypotheses) which brings the state of the art one step closer toward fully self-healing systems. This thesis also examines the impact that different types of faults have on their respective identification. This helps to understand the validity of the data being presented, and how the field is progressing, whilst examining the differences in impact to identification between emulated thread crashes and errant user changes – a contribution believed to be unique to this research.
Lastly, future research avenues and conclusions in automated fault identification are described along with lessons learned throughout this endeavor. This includes the progression of artificial neural networks, how learning algorithms are being developed and understood, and possibilities for automatically generating feature locality data.

Perceptual and social challenges in body proximate display ecosystems

Quigley, Aaron John — 2015-08-24T00:00:00Z

Coordinated multi-display environments from the desktop, second-screen to gigapixel display walls are increasingly common. Personal and intimate display devices such as head-mounted displays, smartwatches, smartphones and tablets are rarely part of such a multi-display ecosystem. This presents an opportunity to realise “body proximate” display environments, employing displays on and around the body. These can be formed by combining multiple handheld, head-mounted, wrist-worn or other personal or appropriated displays. However, such an ecosystem, encapsulating ever more interaction points, is not yet well understood. For example, does this trap the user in an “interaction bubble” even more than interaction with individual displays such as smartphones? Within this paper, we investigate the perceptual and social challenges that could inhibit the adoption and acceptance of interactive proximate display ecosystems. We conclude with a series of research questions raised in the consideration of such environments.

Design and technology challenges for body proximate display ecosystems

Grubert, Jens — 2015-08-24T00:00:00Z

Body proximate display environments can be formed by combining multiple hand-held, head-mounted, wrist-worn or other displays. Wearable displays such as smartwatches and smartglasses have the potential to extend the interaction capabilities of mobile users beyond a single display. However, the display ecosystem formed by multiple personal displays on and around users’ bodies is not yet well understood. Within this paper, we investigate the design and technology challenges that could inhibit the creation and the use of interactive display ecosystems.

Some challenges for ethics in social network research

Hutton, Luke — 2015-08-21T00:00:00Z

Social network sites (SNSes) comprise one of the most popular networked applications of late, with hundreds of millions of users. Collecting and analysing data from such systems creates myriad ethical issues and challenges for researchers both in networked systems and other fields, as highlighted by recent media sensitivity about research studies that have used data from Facebook. In our workshop contribution we discuss recent work that we have been carrying out in the area of responsible SNS research, revolving around themes of reproducibility, consent, incentives, and creating ethical workflows. This work was supported by the Engineering and Physical Sciences Research Council [grant numbers EP/J500549/1, EP/M506631/1].

Executing Bag of Distributed Tasks on virtually unlimited Cloud resources

Thai, Long Thanh — 2015-05-20T00:00:00Z

A Bag-of-Distributed-Tasks (BoDT) application is a collection of identical and independent tasks, each of which requires a piece of input data located around the world. As a result, Cloud computing offers an effective way to execute BoDT applications, as it not only consists of multiple geographically distributed data centres but also allows a user to pay for what is actually used. In this paper, BoDT on the Cloud using virtually unlimited cloud resources is investigated. To this end, a heuristic algorithm is proposed to find an execution plan that takes budget constraints into account. Compared with other approaches, for the same given budget, the proposed algorithm is able to reduce the overall execution time by up to 50%. This research is supported by the EPSRC grant ‘Working Together: Constraint Programming and Cloud Computing’ (EP/K015745/1), a Royal Society Industry Fellowship, an Impact Acceleration Account Grant (IAA) and an Amazon Web Services (AWS) Education Research Grant.

Enabling energy awareness of ICT users to improve energy efficiency during use of systems

Yu, Yi — 2015-01-01T00:00:00Z

Data centres have been the primary focus of energy efficiency research due to their expanding scale and increasing energy demands. On the other hand, there are several orders of magnitude more end-users and personal computing devices worldwide. Even modest energy savings from these users would scale up and yield significant impact. As a result, we approach energy saving by working with end-users. We recognise that users of ICT systems are often unaware of their power usage, and are therefore unable to take effective actions even if they wanted to save energy. Apart from energy awareness, the majority of end-users often lack sufficient knowledge or skills to reduce their energy consumption while using computing devices. Moreover, there is no incentive for them to save energy, especially in public environments where they do not have financial responsibilities for their energy use. We propose a flexible energy monitor that gathers detailed energy usage across complex ICT systems, and provides end-users with accurate and timely feedback on their individual energy usage per workstation. We tailored our prototype energy monitor for a 2-year empirical study, with 83 student users of a university computer lab, and showed that end-users will change their use of computers to be more energy efficient when sufficient feedback and incentives (rewards) are provided. In our measurements, weekly mean group power consumption as a whole reduced by up to 16%, and weekly individual user power usage reduced by up to 56% during active use. Based on our observations and collected data, we see possibilities of energy saving from both hardware and software components of personal computers. Maximising energy savings requires coordination and collaboration between system administrators and end-users. Institutional ‘green’ policies are potentially helpful to enforce and regulate energy efficient use of ICT devices.

On dots in boxes, or Permutation pattern classes and regular languages

Hoffmann, Ruth — 2015-01-01T00:00:00Z

Wireless sensor network control through statistical methods

Fang, Lei — 2015-01-01T00:00:00Z

Wireless Sensor Networks (WSNs) form a new paradigm of computing that allows the physical world to be measured at an unprecedented resolution, and the importance of the technology has been increasingly recognised. However, WSNs still face critical challenges, including low data quality and high energy consumption. In this thesis, formal statistical models are employed to address these two practical problems. With a properly designed formalism, sound statistical inferences can be made to guide sensor nodes to make reasonable and timely decisions at the local level in the face of uncertainties. To improve data reliability, we introduce a formal Bayesian statistical method to form two on-line in-network fault detectors. The two detection techniques are well integrated with existing data collection protocols. Experimental results demonstrate that the techniques have good detection accuracy with limited computational and communication overhead. To improve energy efficiency, we propose a novel data collection framework that features both energy conservation and data fault filtering by exploiting Hidden Markov Models (HMMs). Another data collection framework, a Dynamic Linear Model (DLM) based solution, featuring both adaptive sampling and efficient data collection, is also proposed. Experimental results show that the two solutions effectively suppress unnecessary packet transmission while satisfying users’ precision requirements. To prove feasibility, we show that all the proposed solutions are lightweight, by either real-world implementation or formal complexity analysis.

Efficient monitoring of large scale infrastructure as a service clouds

Ward, Jonathan Stuart — 2015-01-01T00:00:00Z

Cloud computing has had a transformative effect upon distributed systems research. It has been one of the precursors of the supposed big data revolution and has amplified the scale of software, networks, data and deployments. Monitoring tools have not, however, kept pace with these developments. Scale is central to cloud computing but it is not its chiefly defining property. Elasticity, the ability of a cloud deployment to rapidly and regularly change in scale and composition, is what differentiates cloud computing from alternative paradigms of computation. Older tools originating from cluster, grid and enterprise computing predominantly lack designs which allow them to tolerate huge scale and rapid elasticity. This has led to the development of monitoring as a service tools: third party tools which abstract the intricacies of the monitoring process from the end user. These tools rely upon an economy of scale in order to deploy large numbers of VMs or servers which monitor multiple users’ infrastructure. These tools have restricted functionality and trust critical operations to third parties, which often lack reliable SLAs and which often charge significant costs. We therefore contend that an alternative is necessary. This thesis investigates the domain of cloud monitoring and proposes Varanus, a new cloud monitoring tool, which eschews conventional architectures in order to outperform current tools in a cloud setting. We compare a number of aspects of performance including monitoring latency, resource usage and elasticity tolerance. Through investigation of current monitoring approaches in conjunction with a thorough examination of cloud computing we derive a design for a new tool which leverages peer to peer and autonomic computing in order to build a tool well suited to the requirements of cloud computing.
Through a detailed evaluation we demonstrate how this tool withstands the effects of scale and elasticity which impair current tools and how it employs a novel architecture which reduces fiscal costs. We demonstrate that Varanus maintains a low, near 1 second monitoring latency, regardless of both scale and elasticity, and does so without imparting significant computational costs. We conclude that the design embodied by this tool represents a successful alternative to current conventional and monitoring as a service tools.

From missions to systems : generating transparently distributable programs for sensor-oriented systems

Porter, Barry — 2012-12-04T00:00:00Z

Early Wireless Sensor Networks aimed simply to collect as much data as possible for as long as possible. While this remains true in selected cases, the majority of future sensor network applications will demand much more intelligent use of their resources as networks increase in scale and support multiple applications and users. Specifically, we argue that a computational model is needed in which the ways that data flows through networks, and the ways in which decisions are made based on that data, are transparently distributable and relocatable as requirements evolve. In this paper we present an approach to achieving this using high-level mission specifications from which we can automatically derive transparently distributable programs.

Children’s Creativity Lab : creating a ‘pen of the future’

Mann, Anne-Marie — 2014-11-11T00:00:00Z

Technology is changing the way we acquire new skills and proficiencies and handwriting is no exception to this. However, while some technological advancements exist in this area, the question of how we can digitally enhance the process of learning handwriting remains under-explored. We believe that school-aged children, being immersed in this process on an everyday basis, can provide valuable ideas and insights into the design of future writing tools for learners developing their (hand)writing skills. We explore including children, as end-users of the proposed technology, in a form of informed participatory design during a creativity lab in which we invited 12 children, aged 11–12, to put themselves into the shoes of product designers and create a Pen of the Future using prototyping materials. In this paper we describe our methodology and discuss the design ideas that children came up with and how these may inform the design of future writing tools. This work is funded by EPSRC and SICSA.

A web-oriented framework for the development and deployment of academic facing administrative tools and services

Nicoll, J. Ross — 2015-06-24T00:00:00Z

The demand for higher education has increased dramatically in the last decade. At the same time, institutions have faced continual pressure to reduce costs and increase quality of education, while delivering that education to greater numbers of students. The introduction of software systems such as virtual learning environments, online learning resources and centralised student record systems has become routine in attempts to address these demands. However, these approaches suffer from a variety of limitations: They do not take all stakeholders’ needs into account. They do not seek to reduce administrative overheads in academic processes. They do not reflect institution-specific academic policies. They do not integrate readily with other information systems. They are not capable of adequately modelling the complex authorisation roles and organisational structure of a real institution. They are not well suited to rapidly changing policies and requirements. Their implementation is not informed by sound software engineering practices or data architecture design. Crucially, as a consequence of these drawbacks such systems can increase administrative workload for academic staff. This thesis describes the research, development and deployment of a system which seeks to address these limitations, the Module Management System (MMS). MMS is a collaborative web application targeted at streamlining and minimising administrative tasks. MMS encapsulates a number of user-facing tools for tasks including coursework submission and marking, tutorial attendance tracking, exam mark recording and final grade calculation. These tools are supported by a framework which acts as a form of “university operating system”. This framework provides a number of different services including an institution abstraction layer, role-based views and privileges, security policy support, and integration with external systems.

Self managing monitoring for highly elastic large scale Cloud deployments

Ward, Jonathan Stuart — 2014-06-23T00:00:00Z

Infrastructure as a Service computing exhibits a number of properties which are not found in conventional server deployments. Elasticity is among the most significant of these properties and has wide reaching implications for applications deployed in cloud hosted VMs. Among the applications affected by elasticity is monitoring. In this paper we investigate the challenges of monitoring large cloud deployments and how these challenges differ from previous monitoring problems. In order to meet these unique challenges we propose Varanus, a highly scalable monitoring tool resistant to the effects of rapid elasticity. This tool breaks with many of the conventions of previous monitoring systems and leverages a multi-tier P2P architecture in order to achieve in situ monitoring without the need for dedicated monitoring infrastructure. We then evaluate Varanus against current monitoring architectures. We find that conventional monitoring tools perform acceptably for small, non changing cloud deployments. However, in the case of large or highly elastic deployments, current tools perform unacceptably, incurring increased latencies, high load and slowed operation, necessitating that a new, alternative tool be used. Further, we demonstrate that Varanus maintains low latency and low resource monitoring state propagation at scale and during periods of high elasticity.

BigExcel : a web-based framework for exploring big data in Social Sciences

Saleem, Muhammed Asif — 2015-01-07T00:00:00Z

This paper argues that there are three fundamental challenges that need to be overcome in order to foster the adoption of big data technologies in non-computer science related disciplines: addressing issues of accessibility of such technologies for non-computer scientists, supporting the ad hoc exploration of large data sets with minimal effort and the availability of lightweight web-based frameworks for quick and easy analytics. In this paper, we address the above three challenges through the development of 'BigExcel', a three tier web-based framework for exploring big data to facilitate the management of user interactions with large data sets, the construction of queries to explore the data set and the management of the infrastructure. The feasibility of BigExcel is demonstrated through two Yahoo Sandbox datasets. The first dataset is the Yahoo Buzz Score data set we use for quantitatively predicting trending technologies and the second is the Yahoo n-gram corpus we use for qualitatively inferring the coverage of important events. A demonstration of the BigExcel framework and source code is available at http://bigdata.cs.st-andrews.ac.uk/projects/bigexcel-exploring-big-data-for-social-sciences/. This research was pursued through an Amazon Web Services Education Research Grant. The first author was the recipient of an Erasmus Mundus scholarship.

Adaptive dissemination of network state knowledge in structured peer-to-peer networks

Hajiarabderkani, Masih — 2015-06-24T00:00:00Z

One of the fundamental challenges in building Peer-to-Peer (P2P) applications is to locate resources across a dynamic set of nodes without centralised servers. Structured overlay networks solve this challenge by providing a key-based routing (KBR) layer that maps keys to nodes. The performance of KBR is strongly influenced by the dynamic and unpredictable conditions of P2P environments. To cope with such conditions a node must maintain its routing state. Routing state maintenance directly influences both lookup latency and bandwidth consumption. The more vigorously that state information is disseminated between nodes, the greater the accuracy and completeness of the routing state and the lower the lookup latency, but the more bandwidth that is consumed. Existing structured P2P overlays provide a set of configuration parameters that can be used to tune the trade-off between lookup latency and bandwidth consumption. However, the scale and complexity of the configuration space makes the overlays difficult to optimise. Further, it is increasingly difficult to design adaptive overlays that can cope with the ever increasing complexity of P2P environments. This thesis is motivated by the vision that adaptive P2P systems of tomorrow would not only optimise their own parameters, but also generate and adapt their own design. This thesis studies the effects of using an adaptive technique to automatically adapt state dissemination cost and lookup latency in structured overlays under churn. In contrast to previous adaptive approaches, this work investigates the algorithmic adaptation of the fundamental data dissemination protocol rather than tuning the parameter values of a protocol with fixed design. This work illustrates that such a technique can be used to design parameter-free structured overlays that outperform other structured overlays with fixed design such as Chord in terms of lookup latency, bandwidth consumption and lookup correctness.
A large amount of experimentation was performed, more than space allows us to report. This thesis presents a set of key findings. The full set of experiments and data is available online at: http://trombone.cs.st-andrews.ac.uk/thesis/analysis.
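
The key-based routing idea mentioned in the abstract above (a layer that maps keys to nodes) can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a Chord-style identifier ring in which a key belongs to the first node at or after it, and the names `RING_BITS`, `ring_id` and `successor` are hypothetical, chosen only for this illustration. Real overlays resolve the owner through routing tables over several hops; here the full node list is assumed to be known locally.

```python
import hashlib
from bisect import bisect_left
from typing import Sequence

# Hypothetical identifier-space size, chosen only for illustration.
RING_BITS = 16


def ring_id(name: str) -> int:
    """Hash a name onto the identifier ring [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


def successor(node_ids: Sequence[int], key_id: int) -> int:
    """Return the first node identifier at or after key_id, wrapping around the ring."""
    if not node_ids:
        raise ValueError("at least one node is required")
    ordered = sorted(node_ids)
    index = bisect_left(ordered, key_id)
    # An index past the end means the key wraps around to the lowest node.
    return ordered[index % len(ordered)]
```

For example, `successor([ring_id(n) for n in nodes], ring_id("some-key"))` would give the identifier of the node responsible for `"some-key"` under these assumptions. When nodes join or leave, only the keys between the affected node and its neighbour change owner, which is why stale routing state shows up as incorrect or slow lookups.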

Making social media research reproducible

Hutton, Luke — 2015-05-26T00:00:00Z

The huge numbers of people using social media make online social networks an attractive source of data for researchers. But in order for the resultant huge numbers of research publications that involve social media to be credible and trusted, their methodologies, considerations of data handling and sensitivity, analysis, and so forth must be appropriately documented. We believe that one way to improve standards and practices in social media research is to encourage such research to be made reproducible, that is, to have sufficient documentation and sharing of research to allow others to either replicate or build on research results. Enabling this fundamental part of the scientific method will benefit the entire social media ecosystem, from the researchers who use data, to the people that benefit from the outcomes of research.

Coupled complex networks : structure, adaptation and processes

Shai, Saray — 2014-11-01T00:00:00Z

In the last 15 years, network science has established itself as a leading scientific tool for the study of complex systems, describing how components in a system interact with one another. Understanding the structure and dynamics of these networks of interactions is the key to understanding the global behaviour of the systems they represent, with a wide range of applications to fundamental societal problems, from designing stable and resilient infrastructures which are critical to our sustainability, to identifying topological patterns in interactome networks that are associated with breast cancer. Most studies so far have focused on isolated single networks that do not interact with or depend upon other networks, while in reality networks rarely live in isolation and are often just one component in a much larger complex multilevel network. Together with the increased availability of richer, bigger and multi-relational datasets, the analysis of coupled networks has been recently attracting many researchers, and has exposed a multitude of new features and phenomena that were not observed for isolated networks. In this thesis, we present analytical, numerical and empirical studies of coupled complex networks, aiming to understand the implications of coupling to the functionality and behaviour of complex systems. First, we present a theoretical framework for studying the robustness of modular or interconnected networks, providing the critical concentration of interconnections between modules, above which the internal structure of each module is inseparable from the system as a whole. Second, we present another theoretical framework to study epidemic spreading in interconnected adaptive networks, discovering a new stationary state that only emerges in the case of weakly coupled networks, where the epidemic localises in the coupled nodes.
In order to obtain the exact quantitative behaviour of the new state from the analytical model, one must account for the actual second-order moments of the system, even for homogeneous networks, where in single networks it is usually sufficient to treat such higher-order terms by a uniform approximation. Third, we present a numerical study on the effect of correlated coupling on spreading dynamics in the presence of resource constraints, finding that positive correlation between coupled nodes can impede the flow process through contention, and thus constitute a less spreading-efficient structure than negatively correlated networks. Finally, we complete the thesis with a large-scale empirical study of interacting transportation networks in the entire metropolitan areas of both London and New York. We find that coupling can strongly affect the structure, and consequently the behaviour, of such multilayer transportation systems.

Fault detection for binary sensors in smart home environments

Ye, Juan — 2015-03-23T00:00:00Z

Experiments in assisted living confirm that such systems can provide context-aware services that enable occupants to remain active and independent. They also demonstrate that abnormal sensor events hamper the correct identification of critical (and potentially life-threatening) situations, and that existing learning, estimation, and time-based approaches are inaccurate and inflexible when applied to multiple people sharing a living space. We propose a technique that integrates the semantics of sensor readings with statistical outlier detection. We evaluate the technique against four real-world datasets that include multiple individuals, and show consistent rates of anomaly detection across different environments.

An approach to situation recognition based on learned semantic models

Stevenson, Graeme — 2015-06-24T00:00:00Z

A key enabler of pervasive computing is the ability to drive service delivery through the analysis of situations: semantically meaningful classifications of system state, identified through analysing the readings from sensors attached to the everyday objects that people interact with. Situation recognition is a mature area of research, with techniques primarily falling into two categories. Knowledge-based techniques use inference rules crafted by experts; however, they often compensate poorly for sensing peculiarities. Learning-based approaches excel at extracting patterns from noisy training data; however, their lack of transparency can make it difficult to diagnose errors. In this thesis we propose a novel hybrid approach to situation recognition that combines both techniques. This offers improvements over each used individually, through not sacrificing the intelligibility of the decision processes that the use of machine learning alone often implies, and through providing better recognition accuracy through robustness to noise typically unattainable when developers use knowledge-based techniques in isolation. We present an ontology model and reasoning framework that supports the uniform modelling of pervasive environments, and infers additional knowledge from that which is specified, in a principled way. We use this as a basis from which to learn situation recognition models that exhibit comparable performance with more complex machine learning techniques, while retaining intelligibility. Finally, we extend the approach to construct ensemble classifiers with either improved recognition accuracy, intelligibility or both. To validate our approach, we apply the techniques to real-world data sets collected in smart-office and smart-home environments. We analyse the situation recognition performance and intelligibility of the decision processes, and compare the results to standard machine learning techniques and results published in the literature.

Extensible automated constraint modelling via refinement of abstract problem specifications

Akgün, Özgür — 2014-06-25T00:00:00Z

Constraint Programming (CP) is a powerful technique for solving large-scale combinatorial (optimisation) problems. Constraint solving a given problem proceeds in two phases: modelling and solving. Effective modelling has a huge impact on the performance of the solving process. This thesis presents a framework in which the users are not required to make modelling decisions; instead, concrete CP models are automatically generated from a high level problem specification. In this framework, modelling decisions are encoded as generic rewrite rules applicable to many different problems. First, modelling decisions are divided into two broad categories. This categorisation guides the automation of each kind of modelling decision and also leads us to the architecture of the automated modelling tool. Second, a domain-specific declarative rewrite rule language is introduced. Thanks to the rule language, automated modelling transformations and the core system are decoupled. The rule language greatly increases the extensibility and maintainability of the rewrite rules database. The database of rules represents the modelling knowledge acquired after analysis of expert models. This database must be easily extensible to best benefit from the active research on constraint modelling. Third, the automated modelling system Conjure is implemented as a realisation of these ideas; having an implementation enables empirical testing of the quality of generated models. The ease with which rewrite rules can be encoded to produce good models is shown. Furthermore, thanks to the generality of the system, one needs to add a very small number of rules to encode many transformations. Finally, the work is evaluated by comparing the generated models to expert models found in the literature for a wide variety of benchmark problems. This evaluation confirms the hypothesis that expert models can be automatically generated starting from high level problem specifications.
A method of automatically identifying good models is also presented. In summary, this thesis presents a framework to enable the automatic generation of efficient constraint models from problem specifications. It provides a pleasant environment for both problem owners and modelling experts. Problem owners are presented with a fully automated constraint solution process, once they have a precise description of their problem. Modelling experts can now encode their precious modelling expertise as rewrite rules instead of merely modelling a single problem, resulting in reusable constraint modelling knowledge.

Investigating performance and energy efficiency on a private cloud

Smith, James William — 2014-06-25T00:00:00Z

Organizations are turning to private clouds due to concerns about security, privacy and administrative control. They are attracted by the flexibility and other advantages of cloud computing but are wary of breaking decades-old institutional practices and procedures. Private clouds can help to alleviate these concerns by retaining security policies, in-organization ownership and providing increased accountability when compared with public services. This work investigates how it may be possible to develop an energy-aware private cloud system able to adapt workload allocation strategies so that overall energy consumption is reduced without loss of performance or dependability. Current literature focuses on consolidation as a method for improving the energy-efficiency of cloud systems, but if consolidation is undesirable due to the performance penalties, dependability or latency then another approach is required. Given a private cloud in which the machines are constant, with no machines being powered down in response to changing workloads, and a set of virtual machines to run, each with different characteristics and profiles, it is possible to mix the virtual machine placement to reduce energy consumption or improve performance of the VMs. Through a series of experiments this work demonstrates that workload mixes can have an effect on energy consumption and the performance of applications running inside virtual machines. These experiments took the form of measuring the performance and energy usage of applications running inside virtual machines. The arrangement of these virtual machines on their hosts was varied to determine the effect of different workload mixes. The insights from these experiments have been used to create a proof-of-concept custom VM Allocator system for the OpenStack private cloud computing platform.
Using CloudMonitor, a lightweight monitoring application to gather data on system performance and energy consumption, the implementation uses a holistic view of the private cloud state to inform workload placement decisions.

MOOCs with attitudes : Insights from a practitioner based investigation

Chadaj, Monika — 2014-10-01T00:00:00Z

In the current educational landscape of shrinking public budgets and increasing costs, MOOCs have become one of the most dominant discourses in higher education (HE). However, due to their short history, they are only just beginning to be systematically investigated. In an attempt to shed more light on the MOOC phenomenon, this study complements other approaches by eliciting institutional attitudes to MOOC provision using qualitative content analysis on responses captured in a series of semi-structured interviews with participants who hold senior positions in universities and who are involved in creating institutional policy and/or the design and delivery of MOOCs. A context for these interviews was created by looking at MOOCs from historical, pedagogical, monetary and technological perspectives. Five topics emerged that were subsequently used as common points of reference for comparisons across the interviews: motivation, monetization, pedagogy, traditional universities and public access to higher education. Attitudes to, and the importance of, these topics are summarized, and also illustrated through quotes from the participants. Interestingly, it does not appear that MOOCs are regarded by insiders as being as disruptive as the media presents them, but rather are seen primarily as marketing vehicles for global education brands.

Augmented learning roads for internet routing

McCaffery, John Philip — 2014-10-01T00:00:00Z

As the Internet continues to establish itself as a utility, like power, transport or water, it becomes increasingly important to provide an engaging educational experience about its operation for students in related STEM disciplines such as Computer Science and Electrical Engineering. Routing is a core functionality of the global Internet. It can be used as an example of where theory meets practice, where algorithms meet protocols and where science meets engineering. Routing protocols can be included in the Computer Science curriculum in distributed systems, computer networking, algorithms, data structures, and graph theory. While there is a plethora of computer networking textbooks, and copious information of varying quality about the Internet spread across the Web, there is still an essential need for exploratory learning facilities of the type that support group work, experimentation and experiential learning. This paper reports on work using open virtual worlds to provide a multi-user interactive learning environment for Internet routing which exemplifies the capabilities of emerging immersive education technologies to augment conventional practice. The functionality of the learning environment is illustrated through examples and the underlying system which was built to support the routing simulations is explained.

Observation-driven configuration of complex software systems

Sage, Aled — 2004-06-01T00:00:00Z

The ever-increasing complexity of software systems makes them hard to comprehend, predict and tune due to emergent properties and non-deterministic behaviour. Complexity arises from the size of software systems and the wide variety of possible operating environments: the increasing choice of platforms and communication policies leads to ever more complex performance characteristics. In addition, software systems exhibit different behaviour under different workloads. Many software systems are designed to be configurable so that policies (e.g. communication, concurrency and recovery strategies) can be chosen to meet the needs of various stakeholders. For complex software systems it can be difficult to accurately predict the effects of a change and to know which configuration is most appropriate. This thesis demonstrates that it is useful to run automated experiments that measure a selection of system configurations. Experiments can find configurations that meet the stakeholders’ needs, find interesting behavioural characteristics, and help produce predictive models of the system’s behaviour. The design and use of ACT (Automated Configuration Tool) for running such experiments is described, in combination with a number of search strategies for deciding on the configurations to measure. Design Of Experiments (DOE) is discussed, with emphasis on Taguchi Methods. These statistical methods have been used extensively in manufacturing, but have not previously been used for configuring software systems. The novel contribution here is an industrial case study, applying the combination of ACT and Taguchi Methods to DC-Directory, a product from Data Connection Ltd (DCL). The case study investigated the applicability of Taguchi Methods for configuring complex software systems. Taguchi Methods were found to be useful for modelling and configuring DC-Directory, making them a valuable addition to the techniques available to system administrators and developers.
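As a rough illustration of how an orthogonal-array experiment of the Taguchi kind selects configurations, the sketch below uses the standard L4 array for three two-level factors and computes per-level mean responses. It is not ACT and does not model DC-Directory; the three factors are left unnamed and the throughput figures are invented purely for the example.

```python
# L4 orthogonal array: 4 runs cover 3 two-level factors so that every pair
# of factors sees each combination of levels exactly once.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]


def main_effects(array, responses):
    """Return, for each factor, the mean response at level 0 and at level 1."""
    effects = []
    for factor in range(len(array[0])):
        means = []
        for level in (0, 1):
            vals = [r for row, r in zip(array, responses) if row[factor] == level]
            means.append(sum(vals) / len(vals))
        effects.append(tuple(means))
    return effects


def best_levels(effects, larger_is_better=True):
    """Pick, per factor, the level with the better mean response.
    Ties resolve to level 0."""
    pick = max if larger_is_better else min
    return [pick((0, 1), key=lambda lvl: e[lvl]) for e in effects]


# Hypothetical throughput measurements (requests/s), one per run of the array.
responses = [100.0, 120.0, 150.0, 170.0]
effects = main_effects(L4, responses)
chosen = best_levels(effects)
print(effects)  # per-factor (mean at level 0, mean at level 1)
print(chosen)   # suggested level for each factor
```

With four measured runs instead of the eight a full factorial would need, the main effects suggest which level of each factor to prefer. The approach assumes that interactions between factors are small, which would need checking for any real system.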

Using online social media platforms for ubiquitous, personal health monitoring

Khorakhun, C. — 2014-10-15T00:00:00Z

We propose the use of an open and publicly accessible online social media platform (OSMP) as a key component for ubiquitous and personal remote health monitoring. Remote monitoring is an essential part of future mHealth systems for the delivery of personal healthcare, allowing the collection of personal bio-data outside clinical environments. Previous mHealth projects focused on building private and custom platforms using closed architectures, which have a high cost for implementation, take a long time to develop, and may provide limited access and usability. By exploiting existing and publicly accessible infrastructure using an OSMP, initial costs can be reduced, at the same time as allowing fast and flexible application development at scale, whilst presenting users with interfaces and interactions that they are familiar with. We survey and discuss the suitability of OSMPs in terms of functionality, performance and the key challenge of ensuring appropriate levels of security and privacy. Date of Acceptance: 29/08/2014

Wellbeing as a proxy for a mHealth study

Khorakhun, C. — 2014-11-02T00:00:00Z

The quantified-self is a key enabler for mHealth. We propose that a wellbeing remote monitoring scenario can act as a suitable proxy for mHealth monitoring by the use of an online social network (OSN). We justify our position by discussing the parallelism in the scenario between purpose-driven wellbeing and mHealth scenarios. The similarity between these two scenarios in terms of privacy and data sharing is discussed. By using such a proxy, some of the legal and ethical complexity can be removed from experimentation on new technologies and systems for mHealth. This enables technology researchers to carry out investigation and focus on testing new technologies, system interactions as well as security and privacy in healthcare in pre-clinical experiments, without loss of context. The analogy between two purpose-driven scenarios, i.e. fitness monitoring in wellbeing scenario and remote monitoring in mHealth, is discussed in terms of a practical example: we present a prototype using a wellbeing device -- Fitbit -- and an open source online social media platform (OSMP) -- Diaspora. Date of Acceptance: 21/09/2014

Mapping parallel programs to heterogeneous CPU/GPU architectures using a Monte Carlo Tree Search

Goli, Mehdi — 2013-06-20T00:00:00Z

The single core processor, which has dominated for over 30 years, is now obsolete with recent trends increasing towards parallel systems, demanding a huge shift in programming techniques and practices. Moreover, we are rapidly moving towards an age where almost all programming will be targeting parallel systems. Parallel hardware is rapidly evolving, with large heterogeneous systems, typically comprising a mixture of CPUs and GPUs, becoming the mainstream. Additionally, with this increasing heterogeneity comes increasing complexity: not only does the programmer have to worry about where and how to express the parallelism, they must also express an efficient mapping of resources to the available system. This generally requires in-depth expert knowledge that most application programmers do not have. In this paper we describe a new technique that derives, automatically, optimal mappings for an application onto a heterogeneous architecture, using a Monte Carlo Tree Search algorithm. Our technique exploits high-level design patterns, targeting a set of well-specified parallel skeletons. We demonstrate that our MCTS technique, applied to a convolution example, obtains speedups that are within 5% of the speedups achieved by a hand-tuned version of the same application.

Workflow partitioning and deployment on the cloud using Orchestra

Jaradat, Ward — 2015-02-15T00:00:00Z

Orchestrating service-oriented workflows is typically based on a design model that routes both data and control through a single point -- the centralised workflow engine. This causes scalability problems that include the unnecessary consumption of the network bandwidth, high latency in transmitting data between the services, and performance bottlenecks. These problems are highly prominent when orchestrating workflows that are composed from services dispersed across distant geographical locations. This paper presents a novel workflow partitioning approach, which attempts to improve the scalability of orchestrating large-scale workflows. It permits the workflow computation to be moved towards the services providing the data in order to garner optimal performance results. This is achieved by decomposing the workflow into smaller sub workflows for parallel execution, and determining the most appropriate network locations to which these sub workflows are transmitted and subsequently executed. This paper demonstrates the efficiency of our approach using a set of experimental workflows that are orchestrated over Amazon EC2 and across several geographic network regions.
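The decomposition step can be pictured with a deliberately simplified sketch: an ordered workflow is split into sub-workflows wherever the hosting region of the invoked service changes, and each part is tagged with the region it could be shipped to. This is not the Orchestra partitioning algorithm; the service names, region labels and the purely sequential workflow are assumptions, and the real approach presumably also accounts for data flow and measured network conditions.

```python
from itertools import groupby


def partition_by_region(steps, service_region):
    """Split an ordered list of service invocations into sub-workflows,
    one per run of consecutive steps whose services share a region."""
    parts = []
    for region, run in groupby(steps, key=lambda s: service_region[s]):
        parts.append((region, list(run)))
    return parts


# Hypothetical services and the regions that host them.
service_region = {
    "s1": "eu-west",
    "s2": "eu-west",
    "s3": "us-east",
    "s4": "us-east",
    "s5": "eu-west",
}
steps = ["s1", "s2", "s3", "s4", "s5"]

parts = partition_by_region(steps, service_region)
for region, sub_workflow in parts:
    print(region, sub_workflow)
```

Each sub-workflow could then be executed by an engine close to its services, so that intermediate data crosses regions only at partition boundaries instead of passing through one central engine on every step.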

RBioCloud : a light-weight framework for bioconductor and R-based jobs on the Cloud

Varghese, Blesson — 2014-10-02T00:00:00Z

Large-scale ad hoc analytics of genomic data is popular using the R-programming language supported by over 700 software packages provided by Bioconductor. More recently, analytical jobs are benefitting from on-demand computing and storage, their scalability and their low maintenance cost, all of which are offered by the cloud. While biologists and bioinformaticists can take an analytical job and execute it on their personal workstations, it remains challenging to seamlessly execute the job on the cloud infrastructure without extensive knowledge of the cloud dashboard. This paper explores how analytical jobs can be executed on the cloud with minimum effort, and how both the resources and data required by the job can be managed. An open-source light-weight framework for executing R-scripts using Bioconductor packages, referred to as ‘RBioCloud’, is designed and developed. RBioCloud offers a set of simple command-line tools for managing the cloud resources, the data and the execution of the job. Three biological test cases validate the feasibility of RBioCloud. The framework is available from http://www.rbiocloud.com.

Data citation practices in the CRAWDAD wireless network data archive

Henderson, Tristan — 2015-01-01T00:00:00Z

CRAWDAD (Community Resource for Archiving Wireless Data At Dartmouth) is a popular research data archive for wireless network data, archiving over 100 datasets used by over 6,500 users. In this paper we examine citation behaviour amongst 1,281 papers that use CRAWDAD datasets. We find that (in general) paper authors cite datasets in a manner that is sufficient both for providing credit to dataset authors and for providing access to the datasets that were used. Only 11.5% of papers did not do so; common problems included (1) citing the canonical papers rather than the dataset, (2) describing the dataset using unclear identifiers, and (3) not providing URLs or pointers to datasets. We are thankful for the generous support of our current funders ACM SIGCOMM and ACM SIGMOBILE, and our past funders Aruba Networks, Intel and the National Science Foundation.

Designing the Unexpected : Endlessly Fascinating Interaction for Interactive Installations

MacDonald, Lindsay — 2015-01-15T00:00:00Z

We present A Delicate Agreement, an interactive art installation designed to intrigue viewers by offering them an unfolding story that is endlessly fascinating. To achieve this, we set our story in the liminal space of an elevator, and populated this elevator with a set of unique characters. Viewers watch the story unfold through peepholes in the elevator’s doors, where in turn their gaze can trigger changes in the storyline. This storyline’s interactive response was created via a complex adaptive system using simple rules based on Goffman’s performance theory. This research was supported in part by SSHRC, NSERC, SMART Technologies, AITF, SurfNet and GRAND.

Repeating history : execution replay for Parallel Haskell programs

Ferrerio, Henrique — 2013-01-01T00:00:00Z

Parallel profiling tools, such as ThreadScope for Parallel Haskell, allow programmers to obtain information about the performance of their parallel programs. However, the information they provide is not always sufficiently detailed to precisely pinpoint the cause of some performance problems. Often, this is because the cost of obtaining that information would be prohibitive for a complete program execution. In this paper, we adapt the well-known technique of execution replay to make it possible to simulate a previous run of a program. We ensure that the non-deterministic parallel behaviour of the application is properly emulated while the deterministic functional code is run unmodified. In this way, we can gather additional data about the behaviour of a parallel program by replaying some parts of it with more detailed profiling information. We exploit this ability to identify performance bottlenecks in a quicksort implementation, and to derive a version that gives better speedups on multicore machines.

The Haptic Touch toolkit : enabling exploration of haptic interactions

Ledo, David — 2012-02-19T00:00:00Z

In the real world, touch based interaction relies on haptic feedback (e.g., grasping objects, feeling textures). Unfortunately, such feedback is absent in current tabletop systems. The previously developed Haptic Tabletop Puck (HTP) aims at supporting experimentation with and development of inexpensive tabletop haptic interfaces in a do-it-yourself fashion. The problem is that programming the HTP (and haptics in general) is difficult. To address this problem, we contribute the Haptictouch toolkit, which enables developers to rapidly prototype haptic tabletop applications. Our toolkit is structured in three layers that enable programmers to: (1) directly control the device, (2) create customized combinable haptic behaviors (e.g., softness, oscillation), and (3) use visuals (e.g., shapes, images, buttons) to quickly make use of these behaviors. In our preliminary exploration we found that programmers could use our toolkit to create haptic tabletop applications in a short amount of time. This work is partially funded by the AITF/NSERC/SMART Chair in Interactive Technologies, Alberta Innovates Tech. Futures, NSERC, and SMART Technologies.

Self-management of self-organising mobile computing applications : a separation of concerns approach

Fernandez Marquez, Jose Luis — 2014-03-24T00:00:00Z

Although the research area of self-organising systems is well established, their construction is often ad hoc. Consequently, such software is difficult to reuse across applications that require similar functionality or have similar goals. The development of self-organising applications and, a fortiori, self-organising mobile applications is therefore limited to developers who are experts in specific self-organising mechanisms. As a first step towards addressing this, this paper discusses the notion of self-organising mechanisms provided as services for building higher level functionality in a modular way. This eases reuse and thus provides separation of concerns. Additionally, because of the dynamic and heterogeneous nature of mobile networks, services need to adapt themselves in order to ensure both functional and non-functional requirements. This paper discusses whether the self-management of self-organising mobile applications can be achieved in a modular fashion, via the self-management of the low level self-organising services they employ, rather than considering the management of the complex system as a whole. We empirically investigate two non-functional aspects: resource optimisation and accuracy.

The cost of virtue : reward as well as feedback are required to reduce user ICT power consumption

Yu, Yi — 2014-06-11T00:00:00Z

We show that students in a school lab environment will change their behaviour to be more energy efficient, when appropriate incentives are in place, and when measurement-based, real-time feedback about their energy usage is provided. Rewards incentivise 'non-green' users to be 'green' as well as encouraging those users who already claim to be 'green'. Measurement-based feedback improves user energy awareness and helps users to explore and adjust their use of computers to become 'greener', but is not sufficient by itself. In our measurements, weekly mean group energy use as a whole reduced by up to 16%; and weekly individual user energy consumption reduced by up to 56% during active use. The findings are drawn from our longitudinal study that involved 83 Computer Science students; lasted 48 weeks across 2 academic years; monitored a total of 26778 hours of active computer use; collected approximately 2TB of raw data. This work was partly supported by the IU-AC project, funded by grant EP/J016756/1 from the Engineering and Physical Sciences Research Council (EPSRC).

Practical privacy and security for opportunistic networks

Parris, Iain — 2014-12-01T00:00:00Z

When the mobile devices people carry are in physical proximity, data can be directly exchanged between them - for example over Bluetooth. If people cooperate to store, carry and forward messages on one another's behalf, then an opportunistic network may be formed, independent of any fixed infrastructure. To enable performant routing within opportunistic networks, use of social network information has been proposed for social network routing protocols. However, the decentralised and cooperative nature of the networks can expose users of such protocols to privacy and security threats, which may in turn discourage participation in the network. In this thesis, we examine how to mitigate privacy and security threats in opportunistic networks while maintaining network performance. We first demonstrate that privacy-aware routing protocols are required in order to maintain network performance while respecting users' privacy preferences. We then demonstrate novel social network routing protocols that mitigate specific threats to privacy and security while maintaining network performance.

Towards controlling software architecture erosion through runtime conformance monitoring

de Silva, Lakshitha R. — 2014-12-01T00:00:00Z

The software architecture of a system is often used to guide and constrain its implementation. While the code structure of an initial implementation is likely to conform to its intended architecture, its dynamic properties cannot always be fully checked until deployment. Routine maintenance and changing requirements can also lead to a deployed system deviating from this architecture over time. Dynamic architecture conformance checking plays an important part in ensuring that software architectures and corresponding implementations stay consistent with one another throughout the software lifecycle. However, runtime conformance checking strategies often force changes to the software, demand tight coupling between the monitoring framework and application, impact performance, require manual intervention, and lack flexibility and extensibility, affecting their viability in practice. This thesis presents a dynamic conformance checking framework called PANDArch, which aims to address these issues. PANDArch is designed to be automated, pluggable, non-intrusive, performance-centric, extensible and tolerant of incomplete specifications. The thesis describes the concept and design principles behind PANDArch, and its current implementation, which uses an architecture description language to specify architectures and Java as the target language. The framework is evaluated using three open source software products of different types. The results suggest that dynamic architectural conformance checking with the proposed features may be a viable option in practice.

An elastic virtual infrastructure for research applications (ELVIRA)

Voss, Alexander — 2013-11-25T00:00:00Z

Cloud computing infrastructures provide a way for researchers to source the computational and storage resources they require to conduct their work and to collaborate within distributed research teams. We provide an overview of a cloud-based elastic virtual infrastructure for research applications that we have established to provide researchers with a collaborative research environment that automatically allocates cloud resources as required. We describe how we have used this infrastructure to support research on the Sun’s corona and how the elasticity provided by cloud infrastructures can be leveraged to provide high-throughput computing resources using a set of off-the-shelf technologies and a small number of additional tools that are simple to deploy and use. The resulting infrastructure has a number of advantages for the researchers compared to traditional clusters or grid computing environments that we discuss in the conclusions.

Virtual worlds, real traffic : interaction and adaptation

Oliver, Iain Angus — 2010-01-01T00:00:00Z

Metaverses such as Second Life (SL) are a relatively new type of Internet application. Their functionality is similar to online 3D games but differs in that users are able to construct the environment their avatars inhabit and are not constrained by predefined goals. From the network perspective metaverses are similar to games in that timeliness is important but differ in that their traffic is much less regular and requires more bandwidth. This paper contributes to our understanding of metaverse traffic by validating previous studies and offering new insights. In particular we analyse the relationships between application functionality, SL's traffic control system and the wider network environment. Two sets of studies have been carried out: one of the traffic generated by a hands-on workshop which used SL; and a follow up set of controlled experiments to clarify some of the findings from the first study. The interplay between network latency, SL's traffic throttle settings, avatar density, and the errors in the client's estimation of avatar positions are demonstrated. These insights are of particular interest to those designing traffic management schemes for metaverses and help explain some of the oddities in the current user experience. Proceeding MMSys '10 Proceedings of the first annual ACM SIGMM conference on Multimedia systems

Managing humanitarian emergencies : Teaching and learning with a virtual humanitarian disaster tool

Ajinomoh, Olatokunbo — 2012-01-01T00:00:00Z

The importance of specialist intervention in the form of humanitarian aid from governments, NGOs and other aid agencies during a humanitarian emergency cannot be over-emphasised. Humanitarian aid is the assistance provided in response to a humanitarian crisis. Humanitarian aid may be logistical, financial or material and its central aim is to alleviate human suffering and save lives. This paper describes an inter-disciplinary project that created the Virtual Humanitarian Disaster learning and teaching resource (VHD) that is centred on the events occurring in the aftermath of an earthquake. To facilitate learning, scenarios with integrated task dilemmas have been modelled which will provide the opportunity for users of the resource to explore the inter-relationships between the key areas of activities which are important to the NGOs and other bodies which deliver humanitarian aid. Such areas include geo-political relationships, legal and regulatory requirements, information management, logistic, financial and human resource management imperatives. The VHD is primarily aimed at students. It creates a more flexible learning and teaching environment when compared with traditional classroom methods. The resource enables students to make decisions concerning critical situations within the controlled environment of a virtual world, where the consequences of any wrong decisions will not directly impact on lives and property. The VHD has been embedded within an undergraduate module of the School of Management as it specifically relates to the final thematic area within which the module engages, namely the strategic and operational challenges faced by NGOs operating in the “humanitarian relief industry”. We demonstrate that virtual worlds can be used to enhance learning and make it more engaging.
The VHD affords students the opportunity to explore given scenarios in accordance with a specified budget and in so doing, they realise module outcomes in a more active and authentic learning environment. The project received start-up funding in the form of a University of St Andrews FILTA award.

Virtual machines for virtual worlds

Sanatinia, Amirali — 2012-01-01T00:00:00Z

Multi User Virtual Worlds provide a simulated immersive 3D environment that is similar to the real world. Popular examples include Second Life and OpenSim. The multi-user nature of these simulations means that there are significant computational demands on the processes that render the different avatar-centric views of the world for each participant, which change with every movement or interaction each participant makes. Maintaining quality of experience can be difficult when the density of avatars within the same area suddenly grows beyond a relatively small number. As such, virtual worlds have a dynamic resource-on-demand need that could conceivably be met by Cloud technologies. In this paper we make a start on assessing the feasibility of using the Cloud for virtual worlds by measuring the performance of virtual worlds in virtual machines of the type used for Clouds. A suitable benchmark is researched and formulated and the construction of a test-bed for carrying out load experiments is described. The system is then used to evaluate the performance of virtual worlds running in virtual machines. The results are presented and analysed before presenting the design of a system that we have built for managing virtual worlds in the cloud.

Exploring heritage through time and space : Supporting community reflection on the highland clearances

McCaffery, John Philip — 2013-10-01T00:00:00Z

On the two hundredth anniversary of the Kildonan clearances, when people were forcibly removed from their homes, the Timespan Heritage centre has created a program of community-centred work aimed at challenging preconceptions and encouraging reflection on this important historical process. This paper explores the innovative ways in which virtual world technology has facilitated community engagement, enhanced visualisation and encouraged reflection as part of this program. An installation where users navigate through a reconstruction of pre-clearance Caen township is controlled through natural gestures and presented on a 300 inch six megapixel screen. This environment allows users to experience the past in new ways. The platform has value as an effective way for an educator, artist or hobbyist to create large scale virtual environments using off the shelf hardware and open source software. The result is an exhibit that also serves as a platform for experimentation into innovative ways of community co-creation and co-curation.

Mobile cross reality for cultural heritage

Davies, Christopher John — 2013-10-01T00:00:00Z

Widespread adoption of smartphones and tablets has enabled people to multiplex their physical reality, where they engage in face-to-face social interaction, with Web-based social networks and apps, whilst emerging 3D Web technologies hold promise for networks of parallel 3D virtual environments to emerge. Although current technologies allow this multiplexing of physical reality and 2D Web, in a situation called PolySocial Reality, the same cannot yet be achieved with 3D content. Cross Reality was proposed to address this issue; however so far it has focused on the use of fixed links between physical and virtual environments in closed lab settings, limiting investigation of the explorative and social aspects. This paper presents an architecture and implementation that addresses these shortcomings using a tablet computer and the Pangolin virtual world viewer to provide a mobile interface to a corresponding 3D virtual environment. Motivation for this project stemmed from a desire to enable students to interact with existing virtual reconstructions of cultural heritage sites in tandem with exploration of the corresponding real locations, avoiding the adverse temporal separation caused otherwise by interacting with the virtual content only within the classroom. The accuracy of GPS tracking emerged as a constraint on this style of interaction.

Open Virtual Worlds : A serious platform for experiential and game based learning

Miller, Alan Henry David — 2012-01-01T00:00:00Z

This paper presents our experiences of and reflections on five years' work in using virtual worlds to support exploratory learning across a range of disciplines and educational contexts. Both educational and systems aspects are considered. Experiential learning enriches education by allowing exploration of a subject. However, often barriers of time, place, cost or scale make it difficult to conduct real world experiential learning. This paper presents experiences in utilizing virtual worlds to support experiential learning in the arts, humanities and sciences. The work presented here draws upon several years of experience in designing, developing and deploying Virtual World applications, which address the concrete needs of specific subject areas in a range of educational contexts. The work was motivated by the observation that 3D educational environments could leverage digital literacy developed playing console games and provide an engaging learning experience where users navigate a virtual environment much as they would the real world. Furthermore, developments in computer hardware and networking mean that 3D applications run on standard computers found in offices and educational institutions. The initial application developed was a simulation of an archeological excavation. Prototypes were developed in a First Person Shooter game, a Virtual Reality environment and a Virtual World. We found that Virtual World technology offered users presence through the proxy of avatars and powerful support for shaping and programming the environment. Initially a simulation of an archeological dig, a virtual teaching space for a management course, a virtual laboratory for wireless networking and a lab for exploring Human Computer Interaction were developed on a Second Life island. The experience was positive and students engaged in valuable learning activities that would not otherwise have been possible.
However, we came up against constraints that were not inherent in Virtual Worlds per se, but rather flowed from Second Life's service model. This led us to migrate our development platform from Second Life to OpenSim. The ability for institutions to manage their own virtual world servers offers benefits in the areas of content creation, application development, cost and scalability. However, in providing a virtual world service a number of challenges arise, which must be met if the potential of educational virtual worlds is to be realised. These challenges lie in the realms of application design, support for resource creation, and system support. The power of Open Virtual Worlds is illustrated here by presenting three exemplar applications developed on OpenSim. These are a virtual laboratory for experimenting with Internet routing protocols, a reconstruction of Scotland's largest and most important religious building, St Andrews Cathedral, and a tool for learning about intervention in humanitarian disasters. A number of subjects and educational contexts are considered: contexts include PhD and masters research projects, laboratory sessions as part of accredited degree programs, open days for aspiring entrants, an exhibition held in a science center attended by the “interested public”, parties of primary school students with their teachers as well as scouts and cubs on a day's expedition. Subject areas include computer science, archeology, art history, history and management. Taken together this work demonstrates the power of virtual worlds as a platform for developing 3D applications that support heterogeneous exploratory learning. There are still challenges to be met for the potential to be realised but the potential is considerable.

Towards the 3D Web with Open Simulator

Oliver, Iain Angus — 2013-03-25T00:00:00Z

Continuing advances and reduced costs in computational power, graphics processors and network bandwidth have led to 3D immersive multi-user virtual worlds becoming increasingly accessible while offering an improved and engaging Quality of Experience. At the same time the functionality of the World Wide Web continues to expand alongside the computing infrastructure it runs on and pages can now routinely accommodate many forms of interactive multimedia components as standard features - streaming video for example. Inevitably there is an emerging expectation that the Web will expand further to incorporate immersive 3D environments. This is exciting because humans are well adapted to operating in 3D environments and it is challenging because existing software and skill sets are focused around competencies in 2D Web applications. Open Simulator (OpenSim) is a freely available open source tool-kit that empowers users to create and deploy their own 3D environments in the same way that anyone can create and deploy a Web site. Its characteristics can be seen as a set of references as to how the 3D Web could be instantiated. This paper describes experiments carried out with OpenSim to better understand network and system issues, and presents experience in using OpenSim to develop and deliver applications for education and cultural heritage. Evaluation is based upon observations of these applications in use and measurements of systems both in the lab and in the wild.

An evaluation of user support strategies for managed learning in a multi user virtual environment

Perera, Galhenage Indika Udaya Shantha — 2013-01-01T00:00:00Z

The management of online learning environments so that they are effective and efficient presents a significant challenge for institutions and lecturers due to the complexity of requirements in the learning and teaching domain. The use of 3D Multi User Virtual Environments (MUVEs) for education introduces a novel set of management challenges. MUVEs were designed to cater for entertainment and commercial needs and as such do not intrinsically support managed learning. When MUVEs are used for educational purposes, forming 3D Multi User Learning Environments (MULEs), user support for learning management becomes an important factor. This thesis highlights the importance of managed learning in MULEs. It proposes a coordinated approach which accommodates the existing education institutional infrastructure. The research has focused on two very widely used and closely compatible MUVEs, Second Life (SL) and OpenSim. The thesis presents system and user studies that have been carried out on these selected MUVEs. The findings reveal the challenges that academics and students can experience if they do not have sufficient knowhow to manage learning activities in SL/OpenSim. User guidance and training tools were then developed for supporting learning management strategies in the context of SL/OpenSim and demonstrated in exemplar use-case scenarios. The user support models and tools which were developed have been extensively evaluated for their usability and educational value using diverse participant groups. The results validate the efficacy of these contributions, defending the research thesis. These contributions can be used in future research on managing MUVE supported education.

Correct model-to-model transformation for formal verification

Meedeniya, Dulani Apeksha — 2013-06-26T00:00:00Z

Modern software systems face increasingly high expectations of reliability, in particular if the systems are critical and real-time. The development of these complex software systems requires strong modelling and analysis methods including quantitative modelling and formal verification. Unified Modelling Language (UML) is a widely used and intuitive graphical modelling language to design complex systems, while formal models provide theoretical support to verify system design models. However, UML models are not sufficient to guarantee correct system designs, and formal models, on the other hand, are often restrictive and complex to use. It is believed that a combined approach comprising the advantages of both models can offer better designs for modern complex software development needs. This thesis focuses on the design and development of a rigorous framework based on Model Driven Development (MDD) that facilitates transformations of non-formal models into formal models for design verification. This thesis defines and describes the transformation from UML2 sequence diagrams to coloured Petri nets and proves syntactic and semantic correctness of the transformation. Additionally, we explore ways of adding information (time, probability, and hierarchy) to a design and how it can be added onto extensions of a target model. Correctness results are extended in this context. The approach in this thesis is novel and significant both in how to establish semantic and syntactic correctness of transformations, and how to explore semantic variability in the target model for formal analysis. Hence, this thesis establishes that UML behavioural models can be validated by correctly transforming them into formal models that can be formally analysed and verified.

Supporting system deployment decisions in public clouds

Khajeh-Hosseini, Ali — 2013-06-26T00:00:00Z

Decisions to deploy IT systems on public Infrastructure-as-a-Service clouds can be complicated as evaluating the benefits, risks and costs of using such clouds is not straightforward. The aim of this project was to investigate the challenges that enterprises face when making system deployment decisions in public clouds, and to develop vendor-neutral tools to inform decision makers during this process. Three tools were developed to support decision makers: 1. Cloud Suitability Checklist: a simple list of questions to provide a rapid assessment of the suitability of public IaaS clouds for a specific IT system. 2. Benefits and Risks Assessment tool: a spreadsheet that includes the general benefits and risks of using public clouds; this provides a starting point for risk assessment and helps organisations start discussions about cloud adoption. 3. Elastic Cost Modelling: a tool that enables decision makers to model their system deployment options in public clouds and forecast their costs. These three tools collectively enable decision makers to investigate the benefits, risks and costs of using public clouds, and effectively support them in making system deployment decisions. Data was collected from five case studies and hundreds of users to evaluate the effectiveness of the tools. This data showed that the cost effectiveness of using public clouds is situation dependent rather than universally less expensive than traditional forms of IT provisioning. Running systems on the cloud using a traditional 'always on' approach can be less cost effective than on-premise servers, and the elastic nature of the cloud has to be considered if costs are to be reduced. Decision makers have to model the variations in resource usage and their systems' deployment options to obtain accurate cost estimates. Performing upfront cost modelling is beneficial as there can be significant cost differences between different cloud providers, and different deployment options within a single cloud. 
During such modelling exercises, the variations in a system's load (over time) must be taken into account to produce more accurate cost estimates, and the notion of elasticity patterns that is presented in this thesis provides one simple way to do this.
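
The contrast drawn above between an 'always on' deployment and one that follows the load can be sketched in a few lines of Python. This is a minimal illustration only, not the Elastic Cost Modelling tool from the thesis: the hourly price, the per-server capacity, the load profile and the function names are all assumptions made for the example.

```python
import math

def servers_needed(load, capacity_per_server):
    """Servers required to serve `load`, never fewer than one."""
    return max(1, math.ceil(load / capacity_per_server))

def always_on_cost(hourly_load, capacity_per_server, price_cents_per_hour):
    """Cost when enough servers for the peak hour run for every hour."""
    peak = max(servers_needed(l, capacity_per_server) for l in hourly_load)
    return peak * len(hourly_load) * price_cents_per_hour

def elastic_cost(hourly_load, capacity_per_server, price_cents_per_hour):
    """Cost when the server count is resized every hour to match the load."""
    return sum(
        servers_needed(l, capacity_per_server) * price_cents_per_hour
        for l in hourly_load
    )

# Hypothetical daily pattern: 16 quiet hours and 8 busy hours (requests/s).
daily_load = [50] * 16 + [400] * 8
CAPACITY = 100       # assumed requests/s one server can handle
PRICE_CENTS = 10     # assumed price of one server-hour, in cents

fixed = always_on_cost(daily_load, CAPACITY, PRICE_CENTS)
elastic = elastic_cost(daily_load, CAPACITY, PRICE_CENTS)
print(f"always-on: {fixed} cents/day, elastic: {elastic} cents/day")
```

With this assumed profile the always-on deployment pays for four servers around the clock, while the elastic one pays for four servers only during the eight busy hours. A flatter load profile would narrow the gap, which is why the variation in load has to be modelled.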

Using self-reported social networks to improve opportunistic networking

Bigwood, Greg — 2012-06-19T00:00:00Z

Opportunistic networks provide an ad hoc communication medium without the need for an infrastructure network, by leveraging human encounters and mobile devices. Routing protocols in opportunistic networks frequently rely upon encounter histories to build up meaningful data to use for informed routing decisions. This thesis shows that it is possible to use pre-existing social-network information to improve existing opportunistic routing protocols, and that these self-reported social networks have a particular benefit when used to bootstrap an opportunistic routing protocol. Frequently, opportunistic routing protocols require users to relay messages on behalf of one another: an act that incurs a cost to the relaying node. Nodes may wish to avoid this forwarding cost by not relaying messages. Opportunistic networks therefore need to incentivise participation and discourage selfish behaviour. This thesis further presents an incentive mechanism that uses self-reported social networks to construct and maintain reputation and trust relationships between participants, and demonstrates its superior performance over existing incentive mechanisms.

Enabling network mobility support

Rehunathan, Devan — 2012-11-30T00:00:00Z

As computing devices become increasingly portable, it is becoming necessary to support Mobility as a core network functionality. The availability of devices such as smartphones, tablets and laptops, as well as wireless network infrastructure, is opening up the possibility of using Network Mobility to cater for multiple mobile nodes simultaneously. Network mobility may be useful in a number of mobile scenarios, where a large number of mobile nodes are moving in unison. A number of operational benefits stand to be gained by aggregating these nodes into a single mobile unit. Unfortunately, the current state of network mobility support, especially in terms of network layer protocols, is limited. This is in part due to the inherent complexity of mobile network scenarios, the high cost of testing mobile network protocols in operational environments and the difficulties in implementing such protocols. This thesis looks at how network mobility support may be better enabled by making experimentation with mobile networks more accessible. It does this by first showing how analytical approaches can be useful in mobile network applications, as they abstract away from experimental details and allow for more straightforward protocol comparisons. It then goes on to look at the tools available to study mobile network protocols, where it introduces and extends an existing tool that uses virtual machines to allow for the study of mobile network protocols. Finally, it demonstrates a method by which mobile network support may be easily enabled in a practical setting.

The architecture of an autonomic, resource-aware, workstation-based distributed database system

Macdonald, Angus — 2012-11-30T00:00:00Z

Distributed software systems that are designed to run over workstation machines within organisations are termed workstation-based. Workstation-based systems are characterised by dynamically changing sets of machines that are used primarily for other, user-centric tasks. They must be able to adapt to and utilize spare capacity when and where it is available, and ensure that the non-availability of an individual machine does not affect the availability of the system. This thesis focuses on the requirements and design of a workstation-based database system, which is motivated by an analysis of existing database architectures that are typically run over static, specially provisioned sets of machines. A typical clustered database system — one that is run over a number of specially provisioned machines — executes queries interactively, returning a synchronous response to applications, with its data made durable and resilient to the failure of machines. There are no existing workstation-based databases. Furthermore, other workstation-based systems do not attempt to achieve the requirements of interactivity and durability, because they are typically used to execute asynchronous batch processing jobs that tolerate data loss — results can be re-computed. These systems use external servers to store the final results of computations rather than workstation machines. This thesis describes the design and implementation of a workstation-based database system and investigates its viability by evaluating its performance against existing clustered database systems and testing its availability during machine failures.

Cross-display attention switching in mobile interaction with large displays

Rashid, Umar — 2012-11-30T00:00:00Z

Mobile devices equipped with features (e.g., camera, network connectivity and media player) are increasingly being used for different tasks such as web browsing, document reading and photography. While the portability of mobile devices makes them desirable for pervasive access to information, their small screen real-estate often imposes restrictions on the amount of information that can be displayed and manipulated on them. On the other hand, large displays have become commonplace in many outdoor as well as indoor environments. While they provide an efficient way of presenting and disseminating information, they provide little support for digital interactivity or physical accessibility. Researchers argue that mobile phones provide an efficient and portable way of interacting with large displays, and the latter can overcome the limitations of the small screens of mobile devices by providing a larger presentation and interaction space. However, distributing user interface (UI) elements across a mobile device and a large display can cause switching of visual attention and that may affect task performance. This thesis specifically explores how the switching of visual attention across a handheld mobile device and a vertical large display can affect a single user's task performance during mobile interaction with large displays. It introduces a taxonomy based on the factors associated with the visual arrangement of Multi Display User Interfaces (MDUIs) that can influence visual attention switching during interaction with MDUIs. It presents an empirical analysis of the effects of different distributions of input and output across mobile and large displays on the user's task performance, subjective workload and preference in the multiple-widget selection task, and in visual search tasks with maps, texts and photos. 
Experimental results show that the selection of multiple widgets replicated on the mobile device as well as on the large display, versus those shown only on the large display, is faster despite the cost of initial attention switching in the former. On the other hand, a hybrid UI configuration where the visual output is distributed across the mobile and large displays is the worst, or equivalent to the worst, configuration in all the visual search tasks. A mobile device-controlled large display configuration performs best in the map search task and equal to best (i.e., tied with a mobile-only configuration) in text- and photo-search tasks.

Socio-technical analysis of system-of-systems using responsibility modelling

Greenwood, David — 2012-01-01T00:00:00Z

Society is challenging systems engineers by demanding increasingly complex and integrated IT systems (Northrop et al., 2006; RAE, 2004), e.g. integrated enterprise resource planning systems, integrated healthcare systems and business-critical services provisioned using cloud-based resources. These types of IT system are often systems-of-systems (SoS). That is to say, they are composed of multiple systems that are operated and managed by independent parties and are distributed across multiple organisational boundaries, geographies or legal jurisdictions (Maier, 1998). SoS are notorious for becoming problematic due to interconnected technical and social issues. Practitioners claim that they are ill-equipped to deal with the socio-technical challenges posed by systems-of-systems. One of these challenges is to identify the socio-technical threats associated with building, operating and managing systems whose parts are distributed across organisational boundaries. Another is how to troubleshoot these systems when they exhibit undesirable behaviour. This thesis aims to provide a modelling abstraction and an extensible technique that enables practitioners to identify socio-technical threats prior to implementation and troubleshoot SoS post-implementation. This thesis evaluates existing modelling abstractions for their suitability to represent SoS and suggests that an agent-responsibility based modelling abstraction may provide a practical and scalable way of representing SoS for socio-technical threat identification and troubleshooting. The practicality and scalability of the abstraction are explored through the use of case studies that motivate the extension of existing responsibility-based techniques so that new classes of system (coalitions-of-systems) and new classes of threat (agent-related threats) may be analysed. 
This thesis concludes that the notion of ‘responsibility’ is a promising abstraction for representing and analysing systems that are composed of parts that are independently managed and maintained by agents spanning multiple organisational boundaries e.g. systems-of-systems, enterprise-scale systems.

Interfacing Coq + SSReflect with GAP

Komendantsky, Vladimir — 2012-09-19T00:00:00Z

We report on an extendable implementation of the communication interface connecting the Coq proof assistant to the computational algebra system GAP using the Symbolic Computation Software Composability Protocol (SCSCP). It allows Coq to issue OpenMath requests to local or remote GAP instances and represent server responses as Coq terms. Presentation slides and preprint both provided by author. Preprint published in Electronic Notes in Theoretical Computer Science: Proceedings of the 9th International Workshop On User Interfaces for Theorem Provers (UITP10).

Collaborative and evolutionary ontology development & its application in IM system for enhanced presence

Zhai, Ying — 2012-08-28T00:00:00Z

This research contributes to the field of ontology-based semantic matching techniques and also to the field of Instant Messaging (IM) based enhanced presence. It aims to achieve a mutually beneficial development of the two fields through interactions in their use of data and their functionality. With respect to semantic matching, this research has developed a collaborative and self-evolutionary approach based on user involvement in order to overcome disadvantages of traditional ontology-based approaches. At the same time, enhanced semantic matching algorithms were also explored and developed to achieve better performance when searching and querying through the ontology. In order to realize this automatic, dynamic and collaborative approach, a Jabber-based IM system was built to support its development with specific data and to evaluate its performance. In the prototype of the system, the Computer Science area is selected to be the domain of the ontology in order to demonstrate the practicability of the new approach. With respect to enhanced presence, an efficient semantic-based contact search engine which can feature context-based search ranking is provided to support academic researchers. It is especially designed to help new academic researchers to find potential contacts who share a common research interest. It enriches the IM system’s presence information, and helps the user to pick the most suitable contacts and conveniently organize meetings or co-operate with others. Consequently, this research improves the efficiency of users’ academic research, and extends users’ relationship radius during their academic research careers. The contributions are particularly highlighted by the comprehensive support during the academic user’s self-educational process.

Adaptive network traffic management for multi user virtual environments

Oliver, Iain Angus — 2011-11-30T00:00:00Z

Multi User Virtual Environments (MUVEs) are a new class of Internet application with a significant user base. This thesis adds to our understanding of how MUVE network traffic fits into the mix of Internet traffic, and how this relates to the application's needs. MUVEs differ from established Internet traffic types in their requirements from the network. They differ from traditional data traffic in that they have soft real-time constraints, from game traffic in that their bandwidth requirements are higher, and from audio and video streaming traffic in that their data streams can be decomposed into elements that require different qualities of service. This work shows how real-time adaptive measurement-based congestion control can be applied to MUVE streams so that they can be made more responsive to changes in network conditions than other real-time traffic and existing MUVE clients. It is shown that a combination of adaptive congestion control and differential Quality of Service (QoS) can increase the range of conditions under which MUVEs both get sufficient bandwidth and are Transmission Control Protocol (TCP) fair. The design, implementation and evaluation of an adaptive traffic management system are described. The system has been implemented in a modified client, which allows the MUVE to be made TCP fair without changing the server.

Visual ageing of human faces in three dimensions using morphable models and projection to latent structures

Hunter, David William — 2009-02-01T00:00:00Z

We present an approach to synthesising the effects of ageing on human face images using three-dimensional modelling. We extract a set of three-dimensional face models from a set of two-dimensional face images by fitting a Morphable Model. We propose a method to age these face models using Partial Least Squares to extract from the data-set those factors most related to ageing. These ageing-related factors are used to train an individually weighted linear model. We show that this is an effective means of producing an aged face image and compare it with two other linear methods for ageing face models. This is demonstrated both quantitatively and with perceptual evaluation using human raters.

On algorithm selection, with an application to combinatorial search problems

Kotthoff, Lars — 2012-06-20T00:00:00Z

Facebook or Fakebook? : The effects of simulated mobile applications on simulated mobile networks

Parris, Iain Siraj — 2012-01-01T00:00:00Z

The credibility of mobile ad hoc network simulations depends on accurate characterisations of user behaviour, e.g., mobility and application usage. If simulated nodes communicate at different rates to real nodes, or move in an unrealistic fashion, this may have a large impact on the network protocols being simulated and tested. Many future mobile network protocols, however, may also depend on future mobile applications. Different applications may be used at different rates or in different manners. But how can we determine realistic user behaviour for such applications that do not yet exist? One common solution is again simulation, but this time simulation of these future applications. This paper examines differences in user behaviour between a real and simulated mobile social networking application through a user study (n=80). We show that there are distinct differences in privacy behaviour between the real and simulated groups. We then simulate a mobile opportunistic network application using two real-world traces to demonstrate the impact of using real and simulated applications. We find large differences between using real and synthetic models of privacy behaviour, but smaller differences between models derived from the real and simulated applications. This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/G002606/1].

Load balancing of irregular parallel applications on heterogeneous computing environments

Janjic, Vladimir — 2012-01-01T00:00:00Z

Large-scale heterogeneous distributed computing environments (such as Computational Grids and Clouds) offer the promise of access to a vast amount of computing resources at a relatively low cost. In order to ease application development and deployment on such complex environments, high-level parallel programming languages exist that need to be supported by sophisticated runtime systems. One of the main problems that these runtime systems need to address is dynamic load balancing, which ensures that no resources in the environment are underutilised or overloaded with work. This thesis deals with the problem of obtaining good speedups for irregular applications on heterogeneous distributed computing environments. It focuses on work-stealing techniques that can be used for load balancing during the execution of irregular applications. It specifically addresses two problems that arise during work-stealing: where thieves should look for work during the application execution and how victims should respond to steal attempts. In particular, we describe and implement a new Feudal Stealing algorithm and new granularity-driven task selection policies in the SCALES simulator, which is a work-stealing simulator developed for this thesis. In addition, we present a comprehensive evaluation of the Feudal Stealing algorithm and the granularity-driven task selection policies using simulations of a large class of regular and irregular parallel applications on a wide range of computing environments. We show how the Feudal Stealing algorithm and the granularity-driven task selection policies bring significant improvements in speedups of irregular applications, compared to the state-of-the-art work-stealing algorithms. 
Furthermore, we also present the implementation of the task selection policies in the Grid-GUM runtime system [AZ06] for Glasgow Parallel Haskell (GpH) [THLPJ98], in addition to the implementation in SCALES, and we also present the evaluation of this implementation on a large set of synthetic applications.

Numerical evidence for phase transitions of NP-complete problems for instances drawn from Lévy-stable distributions

Connelly, Abram — 2011-11-01T00:00:00Z

Random NP-Complete problems have come under study as an important tool used in the analysis of optimization algorithms and as a help in our understanding of how to properly address issues of computational intractability. In this thesis, the Number Partition Problem and the Hamiltonian Cycle Problem are taken as representative NP-Complete classes. Numerical evidence is presented for a phase transition in the probability of solution when a modified Lévy-Stable distribution is used in instance creation for each. Numerical evidence is presented that shows hard random instances exist near the critical threshold for the Hamiltonian Cycle Problem. A choice of order parameter for the Number Partition Problem’s phase transition is also given. Finding Hamiltonian Cycles in Erdős-Rényi random graphs is well known to have almost sure polynomial time algorithms, even near the critical threshold. To the author’s knowledge, the graph ensemble presented is the first candidate, without specific graph structure built in, to generate graphs whose Hamiltonicity is intrinsically hard to determine. Random graphs are chosen via their degree sequence generated from a discretized form of Lévy-Stable distributions. Graphs chosen from this distribution still show a phase transition and appear to have a pickup in search cost for the algorithms considered. Search cost is highly dependent on the particular algorithm used, and the graph ensemble is presented only as a potential graph ensemble to generate intrinsically hard graphs that are difficult to test for Hamiltonicity. Number Partition Problem instances are created by choosing each element in the list from a modified Lévy-Stable distribution. The Number Partition Problem has no known good approximation algorithms and so only numerical evidence to show the phase transition is provided without considerable focus on pickup in search cost for the solvers used. 
The failure of current approximation algorithms and potential candidate approximation algorithms are discussed.
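
The "probability of solution" measured for the Number Partition Problem can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the solvers or the instance generator used in the thesis: the function names are hypothetical, the instances are hand-written lists of positive integers rather than draws from a modified Lévy-Stable distribution, and the exact check uses a simple set of reachable subset sums, which is only practical for small numbers.

```python
def has_perfect_partition(numbers):
    """True if positive integers `numbers` split into two parts whose
    sums differ by at most 1 (0 when the total is even)."""
    total = sum(numbers)
    reachable = {0}                      # subset sums seen so far
    for n in numbers:
        reachable |= {s + n for s in reachable}
    # A subset summing to floor(total / 2) leaves ceil(total / 2) behind.
    return total // 2 in reachable

def solution_fraction(instances):
    """Fraction of instances that admit a perfect partition."""
    solved = sum(1 for inst in instances if has_perfect_partition(inst))
    return solved / len(instances)

# Hand-written example instances (assumed, for illustration only).
instances = [
    [3, 1, 1, 2, 2, 1],   # 3 + 2 = 5 = 1 + 1 + 2 + 1
    [10, 1, 1],           # the large element dominates, no even split
    [1, 2, 4],            # odd total 7: {1, 2} vs {4}, difference 1
]
print(solution_fraction(instances))
```

Repeating such a count over many random instances while varying a control parameter of the generating distribution is the kind of experiment in which a phase transition in the solution probability would show up.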

Computation of infix probabilities for probabilistic context-free grammars

Nederhof, Mark Jan — 2011-07-01T00:00:00Z

The notion of infix probability has been introduced in the literature as a generalization of the notion of prefix (or initial substring) probability, motivated by applications in speech recognition and word error correction. For the case where a probabilistic context-free grammar is used as language model, methods for the computation of infix probabilities have been presented in the literature, based on various simplifying assumptions. Here we present a solution that applies to the problem in its full generality.

Reliable online social network data collection

Abdesslem, Fehmi Ben — 2012-01-01T00:00:00Z

Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, these data may not properly describe users’ behaviours. For instance, the data collected often include only content shared by the users, or content accessible to the researchers, hence obscuring a large amount of data that would help in understanding users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also have an effect on data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for the study of social networks; for understanding user behaviour; for designing socially-aware applications and services; and for mining data collected from such social networks and applications. This chapter reviews previous research which has looked at social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies, and introduce our own methodology and user study based on the Experience Sampling Method; we claim our methodology leads to the collection of more reliable data by capturing both those data which are shared and those which are not shared. We conclude with suggestions for collecting and mining data from online social networks.

Facial feature detection and tracking with a 3D constrained local model

Yu, Meng — 2010-01-01T00:00:00Z

This thesis establishes a framework for facial feature detection and human face movement tracking. Statistical models of shape and appearance are built to represent the human face structure and interpret target images of human faces. The approach is a patch-based method derived from an earlier proposed method, the constrained local model (CLM) [1] algorithm. In order to increase the ability to track face movements with large head rotations, a 3D shape model is used in the system, and multiple texture models from different viewpoints are used to model the appearance. During fitting or tracking, the current estimate of pose (shape coordinates) is used to select the appropriate texture model. The algorithm uses the shape model and a texture model to generate a set of region template detectors. A search is then performed in the global pose / shape space using these detectors. Different optimisation frameworks are used in the implementation. The training images are created by rendering expressive 3D face models with different scales, rotations, expressions, brightness, etc. Experimental results are demonstrated by fitting the model to image sequences with large head rotations to evaluate the performance of the algorithm. Further experiments are carried out to evaluate the stability of the algorithm and the selection of its factors. The results show that the proposed 3D constrained local model algorithm improves the performance of the original CLM algorithm for videos with large out-of-plane head rotations.

Improving the efficiency of learning CSP solvers

Moore, Neil C.A. — 2011-05-01T00:00:00Z

Backtracking CSP solvers provide a powerful framework for search and reasoning. The aim of constraint learning is to increase global reasoning power by learning new constraints to boost reasoning and hopefully reduce search effort. In this thesis constraint learning is developed in several ways to make it faster and more powerful. First, lazy explanation generation is introduced, where explanations are generated as needed rather than continuously during propagation. This technique is shown to be effective in substantially reducing the number of explanations generated, and consequently the amount of time taken to complete a search, over a wide selection of benchmarks. Second, a series of experiments is undertaken investigating constraint forgetting, where constraints are discarded to prevent the time and space costs associated with learning new constraints from becoming too large. A major empirical investigation into the overheads introduced by unbounded constraint learning in CSP is conducted. This is the first such study in either CSP or SAT. Two significant results are obtained. The first is that typically a small percentage of learnt constraints do most propagation. While this is conventional wisdom, it has not previously been the subject of empirical study. The second is that even constraints that do no effective propagation can incur significant time overheads. Moreover, the use of forgetting techniques from the literature is shown to significantly improve the performance of modern learning CSP solvers, contradicting some previous research. Finally, learning is generalised to use disjunctions of arbitrary constraints, where before only disjunctions of assignments and disassignments have been used in practice (g-nogood learning). The details of the implementation undertaken show that major gains in expressivity are available, and this is confirmed by a proof that it can save an exponential amount of search in practice compared with g-nogood learning. 
Experiments demonstrate the promise of the technique.

A new, robust, and generic method for the quick creation of smooth paths and near time-optimal path tracking

Bott, M. P. — 2011-11-30T00:00:00Z

Robotics has been the subject of academic study from as early as 1948. For much of this time, study has focused on very specific applications in very well controlled environments. For example, the first commercial robots (1961) were introduced in order to improve the efficiency of production lines. The tasks undertaken by these robots were simple, and all that was required of a control algorithm was speed, repetitiveness and reliability in these environments. Now, however, robots are being used to move around autonomously in increasingly unpredictable environments, and the need for robotic control algorithms that can successfully react to such conditions is ever increasing. In addition, there is an ever-increasing array of robots available, the control algorithms for which are often incompatible. This can result in extensive redesign and large sections of code being re-written for use on different architectures. The thesis presented here is that a new generic approach can be created that provides robust high quality smooth paths and time-optimal path tracking to substantially increase applicability and efficiency of autonomous motion plans. The control system developed to support this thesis is capable of producing high quality smooth paths, and following these paths to a high level of accuracy in a robust and near time-optimal manner. The system can control a variety of robots in environments that contain 2D obstacles of various shapes and sizes. The system is also resilient to sensor error, spatial drift, and wheel-slip. In achieving the above, this system provides previously unavailable functionality by generically creating and tracking high quality paths so that only minor and clear adjustments are required between different robots, and also by being capable of operating in environments that contain high levels of perturbation. 
The system is comprised of five separate novel component algorithms in order to cater for five different motion challenges facing modern robots. Each algorithm provides guaranteed functionality that has previously been unavailable in respect to its challenges. The challenges are: high quality smooth movement to reach n-dimensional goals in regions without obstacles, the navigation of 2D obstacles with guaranteed completeness, high quality smooth movement for ground robots carrying out 2D obstacle navigation, near time-optimal path tracking, and finally, effective wheel-slip detection and compensation. In meeting these challenges the algorithms have tackled adherence to non-holonomic constraints, applicability to a wide range of robots and tasks, fast real-time creation of paths and controls, sensor error compensation, and compensation for perturbation. This thesis presents each of the above algorithms individually. It is shown that existing methods are unable to produce the results provided by this thesis, before detailing the operation of each algorithm. The methodology employed is varied in accordance with each of the five core challenges. However, a common element of methodology throughout the thesis is that of gradient descent within a new type of potential field, which is dynamic and capable of the simultaneous creation of high-quality paths and the controls required to execute them. By relating global to local considerations through subgoals, this methodology (combined with other elements) is shown to be fully capable of achieving the aims of the thesis. It is concluded that the produced system represents a novel and significant contribution as there is no other system (to the author’s knowledge) that provides all of the functionality given. For each component algorithm there are many control systems that provide one or more of its features, but none that are capable of all of the features. 
Applications for this work are wide ranging as it is comprised of five component algorithms each applicable in their own right. For example, high quality smooth paths may be created and followed in any dimensionality of space if time optimality and obstacle avoidance are not required. Broadly speaking, and in summary, applications are to ground-based robotics in the areas of smooth path planning, time optimal travel, and compensation for unpredictable perturbation.

Scaling measurement experiments to planet-scale: ethical, regulatory and cultural considerations

Henderson, Tristan Nicholas Hoang — 2009-06-01T00:00:00Z

Conducting planet-scale mobility experiments and measurements is of great interest to network researchers for building the next generation of wireless networking technologies, or for studying inter-disciplinary problems in complex networks. There are many technical challenges that need to be addressed before such experiments can take place. But at the same time, there are many non-technical issues that need to be tackled in order to preserve the welfare of participants in these studies. While some of these issues have been addressed in previous small-scale studies, they become increasingly complex when differences between countries need to be taken into account. This position paper highlights some of these issues and argues that they need to be addressed before planet-scale measurement experiments can be conducted. We discuss ethical, regulatory, cultural and privacy issues, and consider how to design measurement systems that will scale up to planet-wide experiments. We motivate our approach by discussing work in measurement of mobile and online social networks. Workshop held as part of 7th Annual International Conference on Mobile Systems, Applications and Services (MobiSys 2009)

Practical privacy-aware opportunistic networking

Parris, Iain — 2011-07-05T00:00:00Z

Opportunistic networks have been studied extensively, in particular with a view to making end-to-end routing efficient. Users’ privacy concerns, however, have not been the subject of much research. What privacy concerns might opportunistic network users have? Is it possible to build opportunistic networks that can mitigate users’ privacy concerns while maintaining routing performance? Our work to date has tackled the problem of creating privacy-preserving routing protocols, with less emphasis on discovering users’ actual privacy concerns. We summarise our current results, and describe a future experiment that we have planned to better understand users’ privacy concerns.

PoRAP : an energy aware protocol for cyclic monitoring WSNs

Khemapech, Ittipong — 2011-06-23T00:00:00Z

This work starts from the proposition that it is beneficial to conserve communication energy in Wireless Sensor Networks (WSNs). For WSNs there is an added incentive for energy-efficient communication. The power supply of a sensor is often finite and small. Replenishing the power may be impractical and is likely to be costly. Wireless Sensor Networks are an important area of research. Data about the physical environment may be collected from hostile or friendly environments. Data is then transmitted to a destination without the need for communication cables. There are power and resource constraints upon WSNs; in addition, WSNs are often application specific. Different applications will often have different requirements. Further, WSNs are a shared medium system. The features of the MAC (Medium Access Control) protocol together with the application behaviour shape the communication states of the node. As each of these states has different power requirements, the MAC protocol impacts upon the operation and power consumption efficiency. This work focuses on the development of an energy conservation protocol for WSNs where direct communication between sources and a base station is feasible. Whilst the multi-hop approach has been regarded as the underlying communication paradigm in WSNs, there are some scenarios where direct communication is applicable and a significant amount of communication energy can be saved. The Power & Reliability Aware Protocol (PoRAP) has been developed. Its main objectives are to provide efficient data communication by means of energy conservation without sacrificing required reliability. This has been achieved by using direct communication, transmission power adaptation and intelligent scheduling. The results of simulations illustrate the significance of communication energy and adaptive transmission. 
The relationship between Received Signal Strength Indicator (RSSI) and Packet Reception Rate (PRR) metrics is established and used to identify when power adaptation is required. The experimental results demonstrate an optimal region where lower power can be used without further reduction in the PRR. Communication delays depend upon the packet size whilst two-way propagation delay is very small. Accurate scheduling is achieved through monitoring the clock drift. A set of experiments was carried out to study the benefits of direct vs. multi-hop communication. Significant transmitting current can be conserved if direct communication is used. PoRAP is compared to Sensor-MAC (S-MAC), Berkeley-MAC (B-MAC) and Carrier Sense Multiple Access (CSMA). Parameter settings used in Great Duck Island (GDI), a production habitat-monitoring WSN, were applied. PoRAP consumes the least amount of energy.

A scalable architecture for the demand-driven deployment of location-neutral software services

MacInnis, Robert F. — 2010-01-01T00:00:00Z

This thesis presents a scalable service-oriented architecture for the demand-driven deployment of location-neutral software services, using an end-to-end or ‘holistic’ approach to address identified shortcomings of the traditional Web Services model. The architecture presents a multi-endpoint Web Service environment which abstracts over Web Service location and technology and enables the dynamic provision of highly-available Web Services. The model describes mechanisms which provide a framework within which Web Services can be reliably addressed, bound to, and utilized, at any time and from any location. The presented model eases the task of providing a Web Service by consuming deployment and management tasks. It eases the development of consumer agent applications by letting developers program against what a service does, not where it is or whether it is currently deployed. It extends the platform-independent ethos of Web Services by providing deployment mechanisms which can be used independent of implementation and deployment technologies. Crucially, it maintains the Web Service goal of universal interoperability, preserving each actor’s view upon the system so that existing Service Consumers and Service Providers can participate without any modifications to provider agent or consumer agent application code. Lastly, the model aims to enable the efficient consumption of hosting resources by providing mechanisms to dynamically apply and reclaim resources based upon measured consumer demand.

On the selection of connectivity-based metrics for WSNs using a classification of application behaviour

Boyd, Alan — 2010-06-07T00:00:00Z

This paper addresses a subset of Wireless Sensor Network (WSN) applications in which data is produced by a set of resource-constrained source nodes and forwarded to one or more sink nodes. The performance of such applications is affected by the connectivity of the WSN, since nodes must remain connected in order to transfer data from sources to sinks. Designers use metrics to measure and improve the efficacy of WSN applications. We aim to facilitate the choice of connectivity-based metrics by introducing a classification of WSN applications based on their data collection behaviour and indicating the metrics best suited to the evaluation of particular application classes. We argue that no suitable metric currently exists for a significant class of applications with the following characteristics: 1) application data is periodically routed or disseminated from source nodes to one or more sink nodes, and 2) the application can continue to function with the loss of source nodes although its useful network lifetime diminishes as a result. We present a new metric, known as Connectivity Weighted Transfer, which may be used to evaluate WSN applications with these characteristics.

A component-based model and language for wireless sensor network applications

Dearle, Alan — 2008-07-01T00:00:00Z

Wireless sensor networks are often used by experts in many different fields to gather data pertinent to their work. Although their expertise may not include software engineering, these users are expected to produce low-level software for a concurrent, real-time and resource-constrained computing environment. In this paper, we introduce a component-based model for wireless sensor network applications and a language, Insense, for supporting the model. An application is modelled as a composition of interacting components and the application model is preserved in the Insense implementation where active components communicate via typed channels. The primary design criteria for Insense include: to abstract over low-level concerns for ease of programming; to permit worst-case space and time usage of programs to be determinable; to support the fractal composition of components whilst eliminating implicit dependencies between them; and, to facilitate the construction of low footprint programs suitable for resource-constrained devices. This paper presents an overview of the component model and Insense, and demonstrates how they meet the above criteria.

A collaborative wireless sensor network routing scheme for reducing energy wastage

Boyd, Alan — 2010-05-01T00:00:00Z

A Wireless Sensor Network (WSN) is a network of battery-powered nodes in which data is routed from sources to sinks. Each node consumes energy in order to transmit or receive on its radio. Consequently, an intermediate node that is used by multiple sources will quickly expire. If some sources are unable to route without the presence of that node, any remaining energy they have is wasted. We present a new routing scheme known as node reliance, which rates the degree to which nodes are relied upon in routing. The use of node reliance reduces the contention for intermediate nodes, permitting sources to route to sinks for longer and thus maximising the useful lifetime of the network.

Reflection and reification in process system evolution : experience and opportunity

Greenwood, RM — 2001-01-01T00:00:00Z

Process systems aim to support many people involved in many processes over a long period of time. They provide facilities for storing and manipulating processes in both the representation and enactment domains. This paper argues that process systems should support ongoing transformations between these domains, at any level of granularity. The notion of creating an enactment model instance from a representation is merely one restricted transformation. Especially when process evolution is considered, the case for thinking in terms of model instances is weak. This argument is supported by our experience of the ProcessWeb process system facilities for developing and evolving process models. The idea of hyper-code, which supports very general transformations between representation and enactment domains, is described. This offers the prospect of further improvements in this area.

A persistent hyper-programming system

Kirby, Graham Njal Cameron — 1997-01-01T00:00:00Z

We demonstrate the use of a hyper-programming system in building persistent applications. This allows program representations to contain type-safe links to persistent objects embedded directly within the source code. The benefits include improved efficiency and potential for static program checking, reduced programming effort and the ability to display meaningful source-level representations for first-class procedure values. Hyper-programming represents a completely new style of programming which is only possible in a persistent programming system.

Linguistic reflection in Java

Kirby, Graham Njal Cameron — 1998-08-01T00:00:00Z

Reflective systems allow their own structures to be altered from within. Here we are concerned with a style of reflection, called linguistic reflection, which is the ability of a running program to generate new program fragments and to integrate these into its own execution. In particular we describe how this kind of reflection may be provided in the compiler-based, strongly typed object-oriented programming language Java. The advantages of the programming technique include attaining high levels of genericity and accommodating system evolution. These advantages are illustrated by an example taken from persistent programming which shows how linguistic reflection allows functionality (program code) to be generated on demand (Just-In-Time) from a generic specification and integrated into the evolving running program. The technique is evaluated against alternative implementation approaches with respect to efficiency, safety and ease of use. This work is partially supported by the EPSRC through Grant GR/J 67611 ‘Delivering the Benefits of Persistence to System Construction’

Support for evolving software architectures in the ArchWare ADL

Morrison, Ron — 2004-01-01T00:00:00Z

Software that cannot evolve is condemned to atrophy: it cannot accommodate the constant revision and re-negotiation of its business goals nor intercept the potential of new technology. To accommodate change in software systems we have defined an active software architecture to be: dynamic in that the structure and cardinality of the components and interactions are changeable during execution; updatable in that components can be replaced; decomposable in that an executing system may be (partially) stopped and split up into its components and interactions; and reflective in that the specification of components and interactions may be evolved during execution. Here we describe the facilities of the ArchWare architecture description language (ADL) for specifying active architectures. The contribution of the work is the unique combination of concepts including: a pi-calculus based communication and expression language for specifying executable architectures; hyper-code as an underlying representation of system execution that can be used for introspection; a decomposition operator to incrementally break up executing systems; and structural reflection for creating new components and binding them into running systems.

A framework for constraint-based deployment and autonomic management of distributed applications (extended abstract)

Dearle, Alan — 2004-05-01T00:00:00Z

We propose a framework for the deployment and subsequent autonomic management of component-based distributed applications. An initial deployment goal is specified using a declarative constraint language, expressing constraints over aspects such as component-host mappings and component interconnection topology. A constraint solver is used to find a configuration that satisfies the goal, and the configuration is deployed automatically. The deployed application is instrumented to allow subsequent autonomic management. If, during execution, the manager detects that the original goal is no longer being met, the satisfy/deploy process can be repeated automatically in order to generate a revised deployment that does meet the goal. This work is supported by EPSRC Grants GR/M78403, GR/R51872, GR/S44501 and by EC Framework V IST-2001-32360

A framework for constraint-based deployment and autonomic management of distributed applications

Dearle, Alan — 2004-01-01T00:00:00Z

We propose a framework for deployment and subsequent autonomic management of component-based distributed applications. An initial deployment goal is specified using a declarative constraint language, expressing constraints over aspects such as component-host mappings and component interconnection topology. A constraint solver is used to find a configuration that satisfies the goal, and the configuration is deployed automatically. The deployed application is instrumented to allow subsequent autonomic management. If, during execution, the manager detects that the original goal is no longer being met, the satisfy/deploy process can be repeated automatically in order to generate a revised deployment that does meet the goal. Submitted to ICAC-04. Extended abstract available from IEEE at DOI:10.1109/ICAC.2004.1301386 This work is supported by EPSRC Grants GR/M78403, GR/R51872, GR/S44501 and by EC Framework V IST-2001-32360
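The satisfy/deploy cycle described above can be illustrated with a minimal sketch. It assumes a brute-force search in place of a real constraint solver, and the component names, host names and constraints are all hypothetical; it is not the framework's declarative constraint language.

```python
from itertools import product


def find_deployment(components, hosts, constraints):
    """Return the first component-to-host mapping satisfying every constraint.

    Brute-force stand-in for a constraint solver (an assumption of this
    sketch): every possible mapping is tried in turn.
    """
    for assignment in product(hosts, repeat=len(components)):
        mapping = dict(zip(components, assignment))
        if all(constraint(mapping) for constraint in constraints):
            return mapping
    return None  # the goal cannot be met with the hosts available


# Hypothetical deployment goal: three components and three constraints.
components = ["web", "db", "cache"]
constraints = [
    lambda m: m["web"] != m["db"],     # web and db on different hosts
    lambda m: m["cache"] == m["web"],  # cache co-located with web
    lambda m: m["db"] == "h2",         # db pinned to host h2
]

# Initial deployment.
initial = find_deployment(components, ["h1", "h2"], constraints)

# If monitoring reports that h1 has failed, the same goal is solved again
# over the hosts that remain, here with a replacement host h3.
revised = find_deployment(components, ["h2", "h3"], constraints)
```

Because the goal is stated declaratively, the revised deployment comes from re-running the same search over the surviving hosts rather than from hand-written recovery logic.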

A methodology for developing and deploying distributed applications

Kirby, Graham Njal Cameron — 2005-01-01T00:00:00Z

We describe a methodology for developing and deploying distributed Java applications using a reflective middleware system called RAFDA. We illustrate the methodology by describing how it has been used to develop a peer-to-peer infrastructure, and explain the benefits relative to other techniques. The strengths of the approach are that the application logic can be designed and implemented completely independently of distribution concerns, easing the development task, and that this gives great flexibility to alter distribution decisions late in the development cycle.

A flexible and secure deployment framework for distributed applications

Dearle, Alan — 2004-05-20T00:00:00Z

This paper describes an implemented system which is designed to support the deployment of applications offering distributed services, comprising a number of distributed components. This is achieved by creating high level placement and topology descriptions which drive tools that deploy applications consisting of components running on multiple hosts. The system addresses issues of heterogeneity by providing abstractions over host-specific attributes yielding a homogeneous run-time environment into which components may be deployed. The run-time environments provide secure binding mechanisms that permit deployed components to bind to stored data and services on the hosts on which they are running.

Automated static symmetry breaking in constraint satisfaction problems

Grayland, Andrews — 2011-01-01T00:00:00Z

Variable symmetries in constraint satisfaction problems can be broken by adding lexicographic ordering constraints. Existing general methods of generating such sets of ordering constraints can produce a huge number of additional constraints. This adds an unacceptable overhead to the solving process. Methods exist by which this large set of constraints can be reduced to a much smaller set automatically, but their application is also prohibitively costly. In contrast, this thesis takes a bottom-up approach to generating symmetry breaking constraints. This involves examining some commonly-occurring families of mathematical groups and deriving a general formula to produce a minimal set of ordering constraints which are sufficient to break all of the symmetry that each group describes. In some cases it is known that there exist no manageably sized sets of constraints to break all symmetries. One example of this occurs with matrix row and column symmetries. In such cases, incomplete symmetry breaking has been used to great effect. Double lex is a commonly used incomplete symmetry breaking technique for row and column symmetries. This thesis also describes another similar method which compares favourably to double lex. The general formulae investigated are used as building blocks to generate small sets of ordering constraints for more complex groups, constructed by combining smaller groups. Through the utilisation of graph automorphism tools and the groups and permutations software GAP we provide a method of defining variable symmetries in a problem as a group. Where this group can be described as the product of smaller groups, with known general formulae, we can construct a minimal set of ordering constraints for that problem automatically. In summary, this thesis provides the theoretical background necessary to apply efficient static symmetry breaking to constraint satisfaction problems. 
It also goes further, describing how this process can be automated to remove the necessity of having an expert CP practitioner, thus opening the field to a larger number of potential users.
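As a small illustration of how a large set of lexicographic ordering constraints can collapse to a much smaller one, the sketch below uses the full symmetric group on three variables. It relies on the standard observation that, for this group, one lex-leader constraint per group element accepts exactly the assignments accepted by the chain x0 <= x1 <= x2. The function names are hypothetical and the code does not reproduce the formulae derived in the thesis.

```python
from itertools import permutations, product


def is_lex_leader(assignment, group):
    """True if the assignment is lexicographically <= each of its images.

    One ordering constraint is checked per group element, so the number
    of constraints grows with the size of the group.
    """
    return all(
        assignment <= tuple(assignment[i] for i in perm) for perm in group
    )


def satisfies_chain(assignment):
    """True if x0 <= x1 <= ... holds: only n - 1 ordering constraints."""
    return all(a <= b for a, b in zip(assignment, assignment[1:]))


n = 3
domain = range(3)
group = list(permutations(range(n)))  # full symmetric group: n! elements
assignments = list(product(domain, repeat=n))

by_full_lex_leader = [a for a in assignments if is_lex_leader(a, group)]
by_chain = [a for a in assignments if satisfies_chain(a)]
```

Both filters keep the same assignments, one per symmetry class, but the chain needs two constraints where the full lex-leader set needs six.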

Exploratory learning for wireless networking

Sturgeon, Thomas — 2010-11-30T00:00:00Z

This dissertation highlights the importance of computer networking education and the challenges in engaging and educating students. An exploratory learning approach is discussed with reference to other learning models and taxonomies. It is felt that an exploratory learning approach to wireless networks improves student engagement and perceived educational value. In order to support exploratory learning and improve the effectiveness of computer networking education the WiFi Virtual Laboratory (WiFiVL) has been developed. This framework enables students to access a powerful network simulator without the barrier of learning a specialised systems programming language. The WiFiVL has been designed to provide “anytime anywhere” access to a self-paced or guided exploratory learning environment. The initial framework was designed to enable users to access a network simulator using an HTML form embedded in a web page. Users could construct a scenario wherein multiple wireless nodes were situated. Traffic links between the nodes were also specified using the form interface. The scenario is then translated into a portable format, a URL, and simulated using the WiFiVL framework detailed in this dissertation. The resulting simulation is played back to the user on a web page, via a Flash animation. This initial approach was extended to exploit the greater potential for interaction afforded by a Rich Internet Application (RIA), referred to as WiFiVL II. The dissertation also details the expansion of WiFiVL into the realm of 3-dimensional, immersive, virtual worlds. It is shown how these virtual worlds can be exploited to create an engaging and educational virtual laboratory for wireless networks. Throughout each development the supporting framework has been re-used and has proved capable of supporting multiple interfaces and views. 
Each of the implementations described in this dissertation has been evaluated with learners in undergraduate and postgraduate degrees at the University of St Andrews. The results validate the efficacy of a virtual laboratory approach for supporting exploratory learning for wireless networks.

Reflection and hyper-programming in persistent programming systems

Kirby, Graham N. C. — 1992-01-01T00:00:00Z

In an orthogonally persistent programming system, data is treated in a manner independent of its persistence. This gives simpler semantics, allows the programmer to ignore details of long-term data storage and enables type checking protection mechanisms to operate over the entire lifetime of the data. The ultimate goal of persistent programming language research is to reduce the costs of producing software. The work presented in this thesis seeks to improve programmer productivity in the following ways: • by reducing the amount of code that has to be written to construct an application; • by increasing the reliability of the code written; and • by improving the programmer’s understanding of the persistent environment in which applications are constructed. Two programming techniques that may be used to pursue these goals in a persistent environment are type-safe linguistic reflection and hyper-programming. The first provides a mechanism by which the programmer can write generators that, when executed, produce new program representations. This allows the specification of programs that are highly generic yet depend in non-trivial ways on the types of the data on which they operate. Genericity promotes software reuse which in turn reduces the amount of new code that has to be written. Hyper-programming allows a source program to contain links to data items in the persistent store. This improves program reliability by allowing certain program checking to be performed earlier than is otherwise possible. It also reduces the amount of code written by permitting direct links to data in the place of textual descriptions. Both techniques contribute to the understanding of the persistent environment through supporting the implementation of store browsing tools and allowing source representations to be associated with all executable programs in the persistent store. 
This thesis describes in detail the structure of type-safe linguistic reflection and hyper-programming, their benefits in the persistent context, and a suite of programming tools that support reflective programming and hyper-programming. These tools may be used in conjunction to allow reflection over hyper-program representations. The implementation of the tools is described.

Design, implementation and deployment of state machines using a generative approach

Kirby, Graham Njal Cameron — 2008-01-01T00:00:00Z

We describe an approach to designing and implementing a distributed system as a family of related finite state machines, generated from a single abstract model. Various artefacts are generated from each state machine, including diagrams, source-level protocol implementations and documentation. The state machine family formalises the interactions between the components of the distributed system, allowing increased confidence in correctness. Our methodology facilitates the application of state machines to problems for which they would not otherwise be suitable. We illustrate the technique with the example of a Byzantine-fault-tolerant commit protocol used in a distributed storage system, showing how an abstract model can be defined in terms of an abstract state space and various categories of state transitions. We describe how such an abstract model can be deployed in a concrete system, and propose a general methodology for developing systems in this style.

Orthogonal persistence revisited

Dearle, Alan — 2009-07-01T00:00:00Z

The social and economic importance of large bodies of programs and data that are potentially long-lived has attracted much attention in the commercial and research communities. Here we concentrate on a set of methodologies and technologies called persistent programming. In particular we review programming language support for the concept of orthogonal persistence, a technique for the uniform treatment of objects irrespective of their types or longevity. While research in persistent programming has become unfashionable, we show how the concept is beginning to appear as a major component of modern systems. We relate these attempts to the original principles of orthogonal persistence and give a few hints about how the concept may be utilised in the future.

Probabilistic parsing

Nederhof, Mark Jan — 2008-01-01T00:00:00Z

Computing partition functions of PCFGs

Nederhof, Mark Jan — 2008-10-01T00:00:00Z

We investigate the problem of computing the partition function of a probabilistic context-free grammar, and consider a number of applicable methods. Particular attention is devoted to PCFGs that result from the intersection of another PCFG and a finite automaton. We report experiments involving the Wall Street Journal corpus. Acknowledgement provided in erratum at DOI:10.1007/s11168-009-9062-1
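
As an illustration of what computing a partition function involves, the following minimal sketch uses plain fixed-point iteration, one standard method and not necessarily one of those the paper evaluates. The partition function is the least non-negative solution of the system of polynomial equations induced by the rules, and iterating from zero converges to it. The function name, the rule encoding and the toy grammar are assumptions made for this example.

```python
def partition_function(rules, iterations=1000):
    """Approximate the partition function of a PCFG by iterating from zero.

    rules maps each nonterminal to a list of (probability, rhs) pairs, where
    rhs is a tuple of symbols. Symbols that are not keys of rules are treated
    as terminals and contribute a factor of 1.
    """
    z = {nt: 0.0 for nt in rules}
    for _ in range(iterations):
        new_z = {}
        for nt, alternatives in rules.items():
            total = 0.0
            for prob, rhs in alternatives:
                term = prob
                for sym in rhs:
                    if sym in rules:
                        term *= z[sym]
                total += term
            new_z[nt] = total
        z = new_z
    return z


# Hypothetical toy grammar: S -> S S (0.6) | 'a' (0.4).
# Its equation is Z = 0.6 * Z**2 + 0.4, whose least solution is 2/3.
toy = {"S": [(0.6, ("S", "S")), (0.4, ("a",))]}
Z = partition_function(toy)
```

In this toy grammar the value is below 1, which means the grammar assigns part of its probability mass to derivations that never terminate.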

On the relationship between hypersequent calculi and labelled sequent calculi for intermediate logics with geometric Kripke semantics

Rothenberg, Robert — 2010-06-04T00:00:00Z

Node reliance : an approach to extending the lifetime of wireless sensor networks

Boyd, Alan W. F. — 2010-01-01T00:00:00Z

A Wireless Sensor Network (WSN) consists of a number of nodes, each typically having a small amount of non-replenishable energy. Some of the nodes have sensors, which may be used to gather environmental data. A common network abstraction used in WSNs is the (source, sink) architecture in which data is generated at one or more sources and sent to one or more sinks using wireless communication, possibly via intermediate nodes. In such systems, wireless communication is usually implemented using radio. Transmitting or receiving, even on a low power radio, is much more energy-expensive than other activities such as computation and consequently, the radio must be used judiciously to avoid unnecessary depletion of energy. Eventually, the loss of energy at each node will cause it to stop operating, resulting in the loss of data acquisition and data delivery. Whilst the loss of some nodes may be tolerable, albeit undesirable, the loss of certain critical nodes in a multi-hop routing environment may cause network partitions such that data may no longer be deliverable to sinks, reducing the usefulness of the network. This thesis presents a new heuristic known as node reliance and demonstrates its efficacy in prolonging the useful lifetime of WSNs. The node reliance heuristic attempts to keep as many sources and sinks connected for as long as possible. It achieves this using a reliance value that measures the degree to which a node is relied upon in routing data from sources to sinks. By forming routes that avoid high reliance nodes, the usefulness of the network may be extended. The hypothesis of this thesis is that the useful lifetime of a WSN may be improved by node reliance routing in which paths from sources to sinks avoid critical nodes where possible.
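
The abstract does not define how the reliance value is calculated, so the sketch below uses an illustrative stand-in: a node's score is the number of (source, sink) pairs that lose all connectivity when that node is removed. The function names, the scoring rule and the toy topology are assumptions for this example and are not the thesis's definition.

```python
from collections import deque


def reachable(graph, start, removed=None):
    """Return the set of nodes reachable from start, ignoring the removed node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reliance(graph, sources, sinks):
    """Score each node by how many connected (source, sink) pairs it cuts off."""
    baseline = {s: reachable(graph, s) for s in sources}
    scores = {}
    for node in graph:
        count = 0
        for s in sources:
            if s == node:
                continue
            seen = reachable(graph, s, removed=node)
            for t in sinks:
                if t != node and t in baseline[s] and t not in seen:
                    count += 1
        scores[node] = count
    return scores


# Hypothetical topology: source A reaches sink D via B or C, but only through E.
graph = {
    "A": ["B", "C"],
    "B": ["A", "E"],
    "C": ["A", "E"],
    "E": ["B", "C", "D"],
    "D": ["E"],
}
scores = reliance(graph, sources=["A"], sinks=["D"])
```

Here E is the only node whose loss partitions the network, so a routing scheme in this spirit would prefer to spread traffic over B and C and spare E where a choice exists.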

Privacy-enhanced social network routing in opportunistic networks

Parris, Iain Siraj — 2010-03-01T00:00:00Z

Opportunistic networking (forwarding messages in a disconnected mobile ad hoc network via any encountered nodes) offers a new mechanism for exploiting the mobile devices that many users already carry. Forwarding messages in such a network often involves the use of social network routing, that is, sending messages via nodes in the sender's or recipient's social network. Simple social network routing, however, may broadcast these social networks, which introduces privacy concerns. This paper introduces two methods for enhancing privacy in social network routing by obfuscating the social network graphs used to inform routing decisions. We evaluate these methods using two real-world datasets, and find that it is possible to obfuscate the social network information without leading to a significant decrease in routing performance.

Effective compilation of constraint models

Rendl, Andrea — 2010-01-01T00:00:00Z

Constraint Programming is a powerful technique for solving large-scale combinatorial (optimisation) problems. However, it is often inaccessible to users without expert knowledge in the area, precluding the widespread use of Constraint Programming techniques. This thesis addresses this issue in three main contributions. First, we propose a simple ‘model-and-solve’ approach, consisting of a framework where the user formulates a solver-independent problem model, which is then automatically tailored to the input format of a selected constraint solver (a process similar to compiling a high-level modelling language to machine code). The solver is then executed on the tailored input, and solutions (if they exist) are returned to the user. This allows the user to formulate constraint models without requiring any particular background knowledge of the respective solver and its solving technique. Furthermore, since the framework can target several solvers, the user can explore different types of solvers. Second, we extend the tailoring process with model optimisations that can compensate for a wide selection of poor modelling choices that novices (and experts) in Constraint Programming often make and that result in redundancies. The elimination of these redundancies by the proposed optimisation techniques can result in solving time speedups of over an order of magnitude, in both naive and expert models. Furthermore, the optimisations are particularly light-weight, adding negligible overhead to the overall translation process. The third contribution is the implementation of this framework in the tool TAILOR, which currently translates 2 different solver-independent modelling languages to 3 different solver formats and is freely available online. It performs almost all of the optimisation techniques proposed in this thesis, and our empirical analysis demonstrates its significance.
In summary, this thesis presents a framework that facilitates modelling for both experts and novices: problems can be formulated in a clear, high-level fashion, without requiring any particular background knowledge about constraint solvers and their solving techniques, while (sometimes naturally occurring) redundancies in the model are eliminated for practically no additional cost, improving the respective model in solving performance by up to an order of magnitude.

Tools and techniques for formalising structural proof theory

Chapman, Peter — 2010-06-01T00:00:00Z

Whilst results from Structural Proof Theory can be couched in many formalisms, it is the sequent calculus which is the most amenable of the formalisms to metamathematical treatment. Constructive syntactic proofs are filled with bureaucratic details; rarely are all cases of a proof completed in the literature. Two intermediate results can be used to drastically reduce the amount of effort needed in proofs of Cut admissibility: Weakening and Invertibility. Indeed, whereas there are proofs of Cut admissibility which do not use Invertibility, Weakening is almost always necessary. Use of these results simply shifts the bureaucracy, however; Weakening and Invertibility, whilst easier to prove, are still not trivial. We give a framework under which sequent calculi can be codified and analysed, which then allows us to prove general conditions for a calculus to admit Weakening and for a rule to be invertible in a calculus. For the latter, even though many calculi are investigated, the general condition is simple and easily verified. The results have been applied to G3ip, G3cp, G3s, G3-LC and G4ip. Invertibility is important in another respect: that of proof-search. Should all rules in a calculus be invertible, then terminating root-first proof search gives a decision procedure for formulae without the need for back-tracking. To this end, we present some results about the manipulation of rule sets. It is shown that the transformations do not affect the expressiveness of the calculus, yet may render more rules invertible. These results can guide the design of efficient calculi. When using interactive proof assistants, every case of a proof, however complex, must be addressed and proved before one can declare the result formalised. To do this in a human-readable way adds a further layer of complexity; most proof assistants give output which is only legible to a skilled user of that proof assistant.
We give human-readable formalisations of Cut admissibility for G3cp and G3ip, Contraction admissibility for G4ip and Craig's Interpolation Theorem for G3i using the Isar vernacular of Isabelle. We also formalise the new invertibility results, in part using the package for reasoning about first-order languages, Nominal Isabelle. Examples are given showing the effectiveness of the formalisation. The formal proof of invertibility using the new methods is drastically shorter than the traditional, direct method.
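
The remark that invertible rules make root-first proof search a decision procedure without back-tracking can be illustrated with a small prover for G3cp, the classical propositional calculus in which every rule is invertible. This is only a sketch written for this listing: the tuple encoding of formulae and the function name are assumptions, and it is unrelated to the Isabelle formalisation described in the thesis.

```python
def provable(left, right):
    """Root-first proof search for the sequent left => right in G3cp.

    Formulae are tuples: ("atom", name), ("bot",), ("and", A, B),
    ("or", A, B) and ("imp", A, B). The first applicable rule is applied
    and never reconsidered, which is sound because every rule is invertible.
    """
    left, right = list(left), list(right)
    for i, f in enumerate(left):
        rest = left[:i] + left[i + 1:]
        if f[0] == "bot":
            return True  # falsum on the left closes the branch
        if f[0] == "and":
            return provable(rest + [f[1], f[2]], right)
        if f[0] == "or":
            return provable(rest + [f[1]], right) and provable(rest + [f[2]], right)
        if f[0] == "imp":
            return provable(rest, right + [f[1]]) and provable(rest + [f[2]], right)
    for i, f in enumerate(right):
        rest = right[:i] + right[i + 1:]
        if f[0] == "and":
            return provable(left, rest + [f[1]]) and provable(left, rest + [f[2]])
        if f[0] == "or":
            return provable(left, rest + [f[1], f[2]])
        if f[0] == "imp":
            return provable(left + [f[1]], rest + [f[2]])
    # Only atoms remain on the left: the sequent is an axiom iff one is shared.
    return any(f in right for f in left)


p, q = ("atom", "p"), ("atom", "q")
peirce = ("imp", ("imp", ("imp", p, q), p), p)
excluded_middle = ("or", p, ("imp", p, ("bot",)))
```

Each rule application removes one connective, so the search terminates, and a failed branch is a genuine refutation rather than a cue to try another rule.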

An investigation into the use of social network sites to support project communications

Harvey, Natalie — 2010-06-23T00:00:00Z

System deployment projects are extremely complex and with more and more organisations now choosing to configure and deploy off-the-shelf systems, the project teams are presented with new challenges. The aim of this study was to gain an understanding of the issues faced during such configuration and deployment projects and see if support could be provided. A year-long observational study of one of these projects was carried out. While it was initially assumed that technical issues related to the system’s configuration would be the primary problems, the study revealed communication issues to be at the heart of a large number of the issues. Online social networks such as Facebook are extremely popular, allowing users to stay in touch with large numbers of distributed people. Private social network sites were applied to projects to see if they could replicate the benefits the sites provide and support project communications. A social network site was created for both a distributed research project and an administrative systems project and their use observed. Statistical data on the use of the sites and qualitative feedback from users are presented to assess the viability of the approach. The experiments showed social network sites to have many benefits when used as a complementary mechanism to traditional channels for project communications. It is clear, however, that social network sites cannot solve all the problems projects may encounter. If the use of a site is to be a success it is vital it gains a critical mass of users. The approach taken to the site’s configuration and introduction will be hugely influential in its success. In order to choose the right approach a clear understanding of what the project’s communication needs are and the possible uses of the site is needed. A process of configuration and development with a small group of potential users is recommended to ensure it is as user friendly as possible before going live to a large user base.

Autonomic management in a distributed storage system

Tauber, Markus — 2010-06-23T00:00:00Z

This thesis investigates the application of autonomic management to a distributed storage system. Effects on performance and resource consumption were measured in experiments, which were carried out in a local area test-bed. The experiments were conducted with components of one specific distributed storage system, but seek to be applicable to a wide range of such systems, in particular those exposed to varying conditions. The perceived characteristics of distributed storage systems depend on their configuration parameters and on various dynamic conditions. For a given set of conditions, one specific configuration may be better than another with respect to measures such as resource consumption and performance. Here, configuration parameter values were set dynamically and the results compared with a static configuration. It was hypothesised that under non-changing conditions this would allow the system to converge on a configuration that was more suitable than any that could be set a priori. Furthermore, the system could react to a change in conditions by adopting a more appropriate configuration. Autonomic management was applied to the peer-to-peer (P2P) and data retrieval components of ASA, a distributed storage system. The effects were measured experimentally for various workload and churn patterns. The management policies and mechanisms were implemented using a generic autonomic management framework developed during this work. The motivation for both groups of experiments was to test management policies with the objective to avoid unsatisfactory situations with respect to resource consumption and performance. Such unsatisfactory situations occur when either the P2P layer or the data retrieval mechanism is configured statically. In a statically configured P2P system two unsatisfactory situations can be identified. The first arises when the frequency with which P2P node states are verified is low and membership churn is high. 
The P2P node state becomes inaccurate due to a high membership churn, leading to errors during the routing process and a reduction in performance. In this situation it is desirable to increase the frequency to increase P2P state accuracy. The converse situation arises when the frequency is high and churn is low. In this situation network resources are used unnecessarily, which may also reduce performance, making it desirable to decrease the frequency. In ASA’s data retrieval mechanism similar unsatisfactory situations can be identified with respect to the degree of concurrency (DOC). The DOC controls the eagerness with which multiple redundant replicas are retrieved. An unsatisfactory situation arises when the DOC is low and there is a large variation in the times taken to retrieve replicas. In this situation it is desirable to increase the DOC, because by retrieving more replicas in parallel a result can be returned to the user sooner. The converse situation arises when the DOC is high, there is little variation in retrieval time and there is a network bottleneck close to the requesting client. In this situation it is desirable to decrease the DOC, since the low variation removes any benefit in parallel retrieval, and the bottleneck means that decreasing parallelism reduces both bandwidth consumption and elapsed time for the user. The experimental evaluations of autonomic management show promising results, and suggest several future research topics. These include optimisations of the managed mechanisms, alternative management policies, different evaluation methods, and the application of developed management mechanisms to other facets of a distributed storage system. The findings of this thesis could be exploited in building other distributed storage systems that focus on harnessing storage on user workstations, since these are particularly likely to be exposed to varying, unpredictable conditions.
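
The two unsatisfactory P2P situations described above suggest a simple feedback policy, sketched here under stated assumptions. The function name, the thresholds, the scaling factor and the use of a stale-entry fraction as a proxy for churn are all hypothetical; the abstract does not specify the policies implemented for ASA.

```python
def adapt_interval(interval, stale_fraction, low=0.05, high=0.2,
                   factor=2.0, min_interval=1.0, max_interval=300.0):
    """Return the next interval (in seconds) between P2P node state checks.

    stale_fraction is the share of recent checks that found an out-of-date
    entry, used here as a stand-in measure of membership churn.
    """
    if stale_fraction > high:
        # High churn: check more often to keep node state accurate.
        interval = interval / factor
    elif stale_fraction < low:
        # Low churn: check less often to save network resources.
        interval = interval * factor
    # Keep the interval within configured bounds.
    return max(min_interval, min(max_interval, interval))
```

A statically configured system corresponds to never calling such a function, which is exactly the case in which the two unsatisfactory situations can persist.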

Enabling exploratory learning through virtual fieldwork

Getchell, Kristoffer M. — 2010-06-23T00:00:00Z

This dissertation presents a framework which supports a group-based exploratory approach to learning and integrates 3D gaming methods and technologies with an institutional learning environment. This provides learners with anytime-anywhere access to interactive learning materials, thereby supporting a self paced and personalised approach to learning. A simulation environment based on real world data has been developed, with a computer games methodology adopted as the means by which users are able to progress through the system. Within a virtual setting users, or groups of users, are faced with a series of dynamic challenges with which they engage until such time as they have shown a certain level of competence. Once a series of domain specific objectives have been met, users are able to progress forward to the next level of the simulation. Through the use of Internet and 3D visualisation technologies, an excavation simulator has been developed which provides the opportunity for students to engage in a virtual excavation project, applying their knowledge and reflecting on the outcomes of their decisions. The excavation simulator enhances the student learning experience by providing opportunities for students to engage with the archaeological excavation process in a customisable, virtual environment. Not only does this provide students with an opportunity to put some of the theories they are familiar with into practice, but it also allows for archaeology courses to place a greater emphasis on the practical application of knowledge that occurs during the excavation process. Laconia Acropolis Virtual Archaeology (LAVA) is a co-operative exploratory learning environment that addresses the need for students to engage with archaeological excavation scenarios. 
By leveraging the immersive nature of gaming technologies and 3D multi-user virtual environments (MUVEs), LAVA facilitates the adoption of exploratory learning practices in environments which have previously been inaccessible due to barriers of space, time or cost.

Synthesis of facial ageing transforms using three-dimensional morphable models

Hunter, David W. — 2009-11-30T00:00:00Z

The ability to synthesise the effects of ageing in human faces has numerous uses, from aiding the search for missing people to improving recognition algorithms and aiding surgical planning. The principal contribution of this thesis is a novel method for synthesising the visual effects of facial ageing using a training set of three-dimensional scans to train a statistical ageing model. This database is constructed by fitting a statistical face model known as a Morphable Model to a set of two-dimensional photographs of a set of subjects at different age points in their lives. We verify the effectiveness of this algorithm with both quantitative and psychological evaluation. Most ageing research has concentrated on building models using two-dimensional images. This has two major shortcomings: firstly, some of the information related to shape change may be lost by the projection to two dimensions; secondly, the algorithms are very sensitive to even slight variations in pose and lighting. By using standard face-fitting methods to fit a statistical face model to the image we overcome these problems by reconstructing the lost shape information, and can use a model of physical rotations and light transfer to overcome the issues of pose and rotation. We show that the three-dimensional models captured by face-fitting offer an effective method of synthesising facial ageing. The second contribution is a new algorithm for ageing a face model based on Projection to Latent Structures, also known as Partial Least Squares (PLS). This method attempts to separate the training set into a set of basis vectors that best explains the shape and colour changes related to ageing from those factors within the training set that are unrelated to ageing. We show that this method is more accurate than other linear techniques at producing a face model that resembles the individual at the target age and at producing a face image of the correct perceived age.
The third contribution is a careful evaluation of three well known ageing methods. We use both quantitative evaluation to determine the accuracy of the ageing method, and perceptual evaluation to determine how well the model performs in terms of perceived age increase and also identity retention. We show that linear methods more accurately capture ageing and identity information if they are trained using an individualised model, and that ageing is more accurately captured if PLS is used to train the model.

Consistency and the quantified constraint satisfaction problem

Nightingale, Peter — 2007-11-30T00:00:00Z

Constraint satisfaction is a very well studied and fundamental artificial intelligence technique. Various forms of knowledge can be represented with constraints, and reasoning techniques from disparate fields can be encapsulated within constraint reasoning algorithms. However, problems involving uncertainty, or which have an adversarial nature (for example, games), are difficult to express and solve in the classical constraint satisfaction problem. This thesis is concerned with an extension to the classical problem: the Quantified Constraint Satisfaction Problem (QCSP). QCSP has recently attracted interest. In QCSP, quantifiers are allowed, facilitating the expression of uncertainty. I examine whether QCSP is a useful formalism. This divides into two questions: whether QCSP can be solved efficiently; and whether realistic problems can be represented in QCSP. In attempting to answer these questions, the main contributions of this thesis are the following: the definition of two new notions of consistency; four new constraint propagation algorithms (with eight variants in total), along with empirical evaluations; two novel schemes to implement the pure value rule, which is able to simplify QCSP instances; a new optimization algorithm for QCSP; the integration of these algorithms and techniques into a solver named Queso; and the modelling of the Connect 4 game, and of faulty job shop scheduling, in QCSP. These are set in context by a thorough review of the QCSP literature.
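
To make the role of quantifiers concrete, here is a minimal brute-force evaluator for a QCSP instance. It is a sketch of the problem definition only and bears no relation to the propagation algorithms or to the Queso solver; the function name, the instance encoding and the example constraint are assumptions for this illustration.

```python
def qcsp_true(prefix, domains, constraint, assignment=None):
    """Decide a quantified CSP by exhaustive search.

    prefix is an ordered list of (quantifier, variable) pairs with quantifier
    "forall" or "exists", domains maps each variable to its values, and
    constraint takes a complete assignment (a dict) and returns a bool.
    """
    assignment = dict(assignment or {})
    if not prefix:
        return constraint(assignment)
    (quantifier, var), rest = prefix[0], prefix[1:]
    outcomes = (
        qcsp_true(rest, domains, constraint, {**assignment, var: value})
        for value in domains[var]
    )
    # A universal variable needs every value to succeed, an existential only one.
    return all(outcomes) if quantifier == "forall" else any(outcomes)


domains = {"x": range(4), "y": range(4)}


def sums_to_three(a):
    return a["x"] + a["y"] == 3


forall_exists = qcsp_true([("forall", "x"), ("exists", "y")], domains, sums_to_three)
exists_forall = qcsp_true([("exists", "y"), ("forall", "x")], domains, sums_to_three)
```

The two calls differ only in quantifier order, yet the first holds and the second does not, which is the sense in which a universal variable behaves like an adversary's move.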

Patterns of cooperative interaction: Linking ethnomethodology and design

Sommerville, I. — 2004-01-01T00:00:00Z

Patterns of Cooperative Interaction are regularities in the organisation of work, activity, and interaction amongst participants, and with, through and around artefacts. These patterns are organised around a framework and are inspired by how such regularities are highlighted in ethnomethodologically-informed ethnographic studies of work and technology. They comprise a high level description and two or more comparable examples drawn from specific studies. Our contention is that these patterns form a useful resource for re-using findings from previous field studies, for enabling analysis and considering design in new settings. Previous work on the relationship between ethnomethodology and design has been concerned primarily with providing presentation frameworks and mechanisms, practical advice, schematisations of the ethnomethodologist's role, different possibilities of input at different stages in development, and various conceptualisations of the relationship between study and design. In contrast, this paper seeks first to discuss the position of patterns relative to emergent major topics of interest of these studies. Subsequently it seeks to describe the case for the collection of patterns based on findings, their comparison across studies and their general implications for design problems, rather than the concerns of practical and methodological interest outlined in the other work. Special attention is paid to our evaluations and to how they inform how the patterns collection may be read, used and contributed to, as well as to reflections on the composition of the collection as it has emerged. The paper finishes with a discussion of how our work relates to other work on patterns, before some closing comments are made on the role of our patterns and ethnomethodology in systems design. © ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution.
The definitive version was published in ACM Transactions on Computer-Human Interaction, 11(1): 59-89, 10730516, (March 2004) http://doi.acm.org/10.1145/972648.972651

Space cost analysis using sized types

Vasconcelos, Pedro B. — 2008-11-01T00:00:00Z

Programming resource-sensitive systems, such as real-time embedded systems, requires guaranteeing both the functional correctness of computations and also that time and space usage fits within constraints imposed by hardware limits or the environment. Functional programming languages have proved very good at meeting the former logical kind of guarantees but not the latter resource guarantees. This thesis contributes to demonstrating the applicability of functional programming in resource-sensitive systems with an automatic program analysis for obtaining guaranteed upper bounds on dynamic space usage of functional programs. Our analysis is developed for a core subset of Hume, a domain-specific functional language targeting resource-sensitive systems (Hammond et al. 2007), and presented as a type and effect system that builds on previous sized type systems (Hughes et al. 1996, Chin and Khoo 2001) and effect systems for costs (Dornic et al. 1992, Reistad and Gifford 1994, Hughes and Pareto 1999). It extends previous approaches by using abstract interpretation techniques to automatically infer linear approximations of the sizes of recursive data types and the stack and heap costs of recursive functions. The correctness of the analysis is formally proved with respect to an operational semantics for the language, and an inference algorithm that automatically reconstructs size and cost bounds is presented. A prototype implementation of the analysis and operational semantics has been constructed and used to experimentally assess the quality of the cost bounds with some examples, including implementations of textbook functional programming algorithms and simplified embedded systems.

SNOOPIE : development of a learning support tool for novice programmers within a conceptual framework

Coull, Natalie J. — 2008-06-25T00:00:00Z

Learning to program is recognised nationally and internationally as a complex task that novices find challenging. There exist many endeavours to support the novice in this activity, including software tools that aim to provide a more supportive environment than that provided by standard software facilities, together with schemes that reduce the underlying complexity of programming by providing accessible micro-worlds in which students develop program code. Existing literature recognises that learning to program is difficult because of the need to learn the rules and operation of the language (program formulation), and the concurrent need to interpret problems and recognise the required components for that problem (problem formulation). This thesis describes a new form of learning support that addresses that dual task of program and problem formulation. A review of existing teaching tools that support the novice programmer leads to a set of requirements for a support tool that encompasses the processes of both program and problem formulation. This set of requirements is encapsulated in a conceptual framework for software tool development. The framework demonstrates how the requirements of a support tool can be met by performing a series of automated analyses at different stages in the student's development of a solution. An extended series of observations demonstrates the multi-faceted nature of problems that students encounter whilst they are learning to program and how these problems can be mapped onto the different levels of program and problem formulation. These observations and the framework were used to inform the development of SNOOPIE, a sample instantiation of the framework for learning Java programming. This software tool has been fully evaluated and demonstrated to have a significant impact on the learning process for novice Java programmers.
SNOOPIE is fully integrated into a current introductory programming module and a future programme of work is being established that will see SNOOPIE integrated with other established software tools.

Applications of Lie methods to computations with polycyclic groups

Assmann, Björn — 2007-11-30T00:00:00Z

In this thesis we demonstrate the algorithmic usefulness of the so-called Mal'cev correspondence for computations with infinite polycyclic groups. This correspondence between Q-powered nilpotent groups and rational nilpotent Lie algebras was discovered by Anatoly Mal'cev in 1951. We show how the Mal'cev correspondence can be realized on a computer. We explore two possibilities for this purpose and compare them: the first one uses matrix embeddings and the second the Baker-Campbell-Hausdorff formula. Then, we describe a new collection algorithm for polycyclically presented groups, which we call Mal'cev collection. Algorithms for collection lie at the heart of most methods dealing with polycyclically presented groups. The current state of the art is "collection from the left" as recently studied by Gebhardt, Leedham-Green/Soicher and Vaughan-Lee. Mal'cev collection is in some cases dramatically faster than collection from the left, while using less memory. Further, we explore how the Mal'cev correspondence can be used to describe symbolically the collection process in polycyclically presented groups. In particular, we describe an algorithm that computes the collection functions for splittable polycyclic groups. This algorithm is based on work by du Sautoy. We apply it to the computation of pro-p-completions of polycyclic groups. Finally we describe a practical algorithm for testing polycyclicity of finitely generated rational matrix groups. Previously, not only did no such method exist but it was not clear whether this question was decidable at all. Most of the methods described in this thesis are implemented in the computer algebra system GAP and publicly available as part of the GAP packages Guarana and Polenta. Reports on the implementation including runtimes for some examples are given at the appropriate places.
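
The matrix-embedding route to the Mal'cev correspondence can be illustrated on upper unitriangular rational matrices, where the logarithm and exponential series terminate because the matrices involved are nilpotent. This is a sketch for intuition only, written in Python with exact rationals and not taken from the GAP packages Guarana or Polenta; all function names are assumptions.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b, scale=Fraction(1)):
    """Return a + scale * b, entry by entry."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def log_unitriangular(g):
    """Group element to Lie algebra: log(g) = u - u^2/2 + u^3/3 - ..., u = g - I."""
    n = len(g)
    u = mat_add(g, identity(n), Fraction(-1))  # strictly upper triangular, u^n = 0
    result = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, n):
        power = mat_mul(power, u)
        result = mat_add(result, power, Fraction((-1) ** (k + 1), k))
    return result


def exp_nilpotent(x):
    """Lie algebra element to group: exp(x) = I + x + x^2/2! + ... (finite sum)."""
    n = len(x)
    result = identity(n)
    power = identity(n)
    factorial = 1
    for k in range(1, n):
        power = mat_mul(power, x)
        factorial *= k
        result = mat_add(result, power, Fraction(1, factorial))
    return result


g = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]
x = log_unitriangular(g)
round_trip = exp_nilpotent(x)
```

Working with exact fractions keeps the round trip between group and Lie algebra free of rounding error, which matters because the correspondence is defined over the rationals.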

Normalisation & equivalence in proof theory & type theory

Lengrand, Stéphane J. E. — 2006-12-08T00:00:00Z

At the heart of the connections between Proof Theory and Type Theory, the Curry-Howard correspondence provides proof-terms with computational features and equational theories, i.e. notions of normalisation and equivalence. This dissertation contributes to extending its framework in the directions of proof-theoretic formalisms (such as sequent calculus) that are appealing for logical purposes like proof-search, powerful systems beyond propositional logic such as type theories, and classical (rather than intuitionistic) reasoning. Part I is entitled Proof-terms for Intuitionistic Implicational Logic. Its contributions use rewriting techniques on proof-terms for natural deduction (Lambda-calculus) and sequent calculus, and investigate normalisation and cut-elimination, with call-by-name and call-by-value semantics. In particular, it introduces proof-term calculi for multiplicative natural deduction and for the depth-bounded sequent calculus G4. The former gives rise to the calculus Lambdalxr with explicit substitutions, weakenings and contractions that refines the Lambda-calculus and Beta-reduction, and preserves strong normalisation with a full notion of composition of substitutions. The latter gives a new insight into cut-elimination in G4. Part II, entitled Type Theory in Sequent Calculus, develops a theory of Pure Type Sequent Calculi (PTSC), which are sequent calculi that are equivalent (with respect to provability and normalisation) to Pure Type Systems but better suited for proof-search, in connection with proof-assistant tactics and proof-term enumeration algorithms. Part III, entitled Towards Classical Logic, presents some approaches to classical type theory. In particular it develops a sequent calculus for a classical version of System F_omega.
Beyond such a type theory, the notion of equivalence of classical proofs becomes crucial and, with such a notion based on parallel rewriting in the Calculus of Structures, we compute canonical representatives of equivalent proofs.

Practical pollsterless remote electronic voting

Storer, Timothy W. — 2007-06-19T00:00:00Z

This thesis describes the design of a novel class of pollsterless voting schemes. Many cryptographic voting schemes necessitate a pollster because the client-side computations are beyond the understanding or ability of the voter. Such interactions require that the voter trust the software to perform operations on their behalf, and in effect, the pollster acts as the voter. Conversely, the pollsterless schemes presented here permit voters to interact with an election authority directly, without complex computations. Pollsterless schemes have the additional advantage of permitting voting on virtually any networked device, increasing the potential mobility of voting. The proposed pollsterless schemes are implemented and then evaluated with respect to the particular requirements of the UK public election context. In particular, the flexibility of pollsterless schemes is demonstrated to fulfil the diverse requirements that may arise in this context, whilst their mobility is demonstrated to fulfil requirements to improve the convenience of voting.

A flexible, policy-aware middleware system

Walker, Scott Mervyn — 2006-01-01T00:00:00Z

Middleware augments operating systems and network infrastructure to assist in the creation of distributed applications in a heterogeneous environment. Current middleware systems exhibit some or all of the following five main problems: 1. Decisions must be made early in the design process. 2. Applications are inflexible to dynamic changes in their distribution. 3. Application development is complex and error-prone. 4. Existing systems force an unnatural encoding of application-level semantics. 5. Approaches to the specification of distribution policy are limited. This thesis defines a taxonomy of existing middleware systems and describes their limitations. The requirements that must be met by a third-generation middleware system are defined and implemented by a system called the RAFDA Run-Time (RRT). The RRT allows control over the extent to which inter-address-space communication is exposed to programmers, aiding the creation, maintenance and evolution of distributed applications. The RRT permits the introduction of distribution into applications quickly and with minimal programmer effort, allowing for quick application prototyping. Programmers can conceal or expose the distributed nature of applications as required. The RRT allows instances of arbitrary application classes to be exposed to remote access as Web Services, provides control over the parameter-passing semantics applied to remote method calls and permits the creation of flexible distribution policies. The design of the RRT is described and evaluated qualitatively in the context of a case study based around the implementation of a peer-to-peer overlay network. A prototype implementation of the RRT is examined and evaluated quantitatively. Programmers determine the trade-off between flexibility and simplicity offered by the RRT on a per-application basis, by concealing or exposing inter-address-space communication. 
The RRT is a middleware system that adapts to the needs of applications, rather than forcing distributed applications to adapt to the needs of the middleware system.