St Andrews Research Repository

Data science use cases in the manufacturing industry : from theory to practice

Thesis-Diego-Arenas-Contreras-complete-version.pdf (3.817Mb)
Thesis-Diego-Arenas-latex-version-pre-pdf.zip (19.92Kb)
Date
29/11/2022
Author
Arenas Contreras, Diego Alejandro
Supervisor
Dobson, Simon
Funder
Aggreko
Data Lab
Grant ID
REG-17465
Keywords
Data science
Abstract
One of the main challenges organisations face today is supporting business decisions with the massive volumes of data they continuously collect. The problem is how to become a data-driven organisation: using the collected data to generate insights and repeatable solutions that connect information needs with usable data products. Our objectives during the doctorate were to research and implement high-quality technological and methodological solutions following best practices from academia and industry and, at the same time, to build internal capacity for the organisation from that experience. We implemented a series of data-related projects of two types: foundational projects, which build infrastructure and processes to analyse data, and applied data projects. Our methods included practices from software engineering, data science, and data engineering. We designed and built data solutions based on the principles of scalability, automation, encapsulation, and abstraction. Following these principles from the design phase of each project allowed us to achieve good integration with the organisation's existing systems and infrastructure. We operationalised the technologies we explored for each project using a use-case-driven approach. Users and stakeholders were involved early in the projects, and we maintained excellent and continuous communication with them. The foundational projects implemented data architectures rather than specific ad hoc solutions, so they adjusted well to changing requirements and were general enough for the solutions to be reused, in whole or in part, in future projects. We used the foundational projects in the applied data projects.
We deployed an estimation model to quantify the number of technicians needed to support an on-site project, exposing the final model through a microservice architecture so that it can be queried via an API. We designed and implemented an analysis to estimate the lifespan of batteries using survival analysis and spectral clustering techniques. We ranked specific machines from best to worst performance based on fuel consumption to optimise resources on project sites. We designed and implemented a custom Python package to facilitate the exploration of databases for data science and data engineering projects. We designed and implemented a microservices architecture to support data streaming analytics. We made recommendations on using a machine learning framework to track and monitor machine learning models, wrote best-practice guidelines, and delivered internal tutorials on the use and benefits of these kinds of solutions. We implemented a data-driven architecture to support the analysis of telemetry data from multiple data sources, and built an alarm system on top of it using the project's analytical database. Finally, we designed and implemented a custom Python package to handle repeatable data engineering tasks for the data engineering team. Data science and data engineering are new and essential roles in companies that aim to become data-driven organisations. We believe that using software engineering and software development techniques contributes significantly to this organisational change and accelerates internal innovation using data. We promptly provided data and information to the stakeholders to support their information needs and decision-making processes.
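As an illustration of the survival-analysis idea mentioned in the abstract, the following is a minimal sketch of a Kaplan-Meier survival estimate. The abstract does not say which estimator the thesis uses, so the choice of Kaplan-Meier, the function name `kaplan_meier`, and the battery data are all assumptions made for illustration only.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct failure time.

    durations: time each unit (e.g. a battery) was in service.
    observed:  True if the unit failed at that time, False if it was still
               working when observation stopped (right-censored).
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")

    # Count observed failures at each distinct time.
    failures = Counter(t for t, failed in zip(durations, observed) if failed)

    survival = 1.0
    curve = []
    for t in sorted(failures):
        # Units still in service just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Multiply by the conditional probability of surviving past t.
        survival *= 1.0 - failures[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: months in service for five batteries.
months = [2, 3, 3, 5, 8]
failed = [True, True, False, True, False]

for time, prob in kaplan_meier(months, failed):
    print(f"after {time} months: estimated survival {prob:.2f}")
```

Censored units (those still working when observation ended) stay in the at-risk count until their last recorded time, which is what distinguishes this estimate from a simple failure fraction.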
DOI
https://doi.org/10.17630/sta/361
Type
Thesis, DEng Doctor of Engineering
Collections
  • Computer Science Theses
URI
http://hdl.handle.net/10023/27244

Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

© University of St Andrews Library

University of St Andrews is a charity registered in Scotland, No SC013532.
