The following project opportunities are open for Scottish businesses or public sector organisations interested in collaborating with Scottish Universities and benefiting from an applied research project with a doctoral student. Details on the next call and how to apply for funding. If any of the following projects are of interest to you or your organisation, please get in touch.
Explainable Machine Learning for predicting asset integrity, condition and faults
Surface/subsea structures are sometimes hard to monitor, and processes involved can be very laborious and expensive, especially when located in remote areas. Remote sensing is a common practice for collecting data, either time-series or images, but scaling the analytics up to produce insights about the asset integrity can be transformative for making the right decisions promptly. Managing assets, e.g. offshore infrastructures, could benefit from extracting as much knowledge as possible from already collected data that otherwise go wasted.
Detecting faults early can prevent very expensive damages from occurring. The various data generated by the monitoring systems need to be properly organised, aggregated and analysed to ascertain the integrity of the systems and decide what intervention(s) is(are) needed. A major aspect of this relates to detecting abnormalities and predicting future faults by leveraging and interrogating all data available, including sensor/acoustic/image/video data.
Machine Learning (ML) can create innovative solutions, e.g. by automating decision-making processes using all sorts of available historical data. Such a system could detect abnormalities, e.g. from images and time series, and predict future faults along with the condition of assets. If we consider subsea pipelines as an example, an ML-based monitoring system can be used to reduce the risk of environmental damage due to hydrocarbon breaches.
Such capabilities can be used to better optimise the assets’ life span, reduce costs involved due to redundant monitoring of the asset integrity, prevent environmental damage (considering the creation of artificial reefs) and enhance the decision-making process (explainability) via incorporating uncertainty estimates of the predictions.
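As a minimal illustration of detecting abnormalities in a sensor time series, the Python sketch below flags readings that deviate strongly from a rolling baseline. It is a simple statistical stand-in, not the ML approach this project proposes; the function name `flag_anomalies`, the window size, the threshold and the pressure readings are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that lie more than `threshold` standard
    deviations from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip perfectly flat windows to avoid dividing the signal by zero spread.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

# Hypothetical pipeline pressure readings with one obvious spike.
PRESSURE = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 14.5, 10.0, 9.9, 10.1]
anomalies = flag_anomalies(PRESSURE)
print(anomalies)
```

A learned model would replace the rolling mean and standard deviation with a prediction and an uncertainty estimate, but the decision rule (flag what falls outside the expected range) has the same shape.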
Our approach will produce novel technical outcomes in two areas:
a) Incorporating expert knowledge into the ML model to improve the predictive process. In many application domains such as environmental and asset monitoring, data do not come in abundance and/or might not be fully representative of the problem one aims to tackle. Therefore, the contribution of the industry sponsor would be paramount for understanding the domain better as well as for incorporating expert knowledge into the models.
b) Developing self-supervised learning techniques to enhance the learning process and make the resulting methods robust and generalisable. The main idea is that the data available for solving a task can often be used in an alternative manner, in conjunction with the main task, to improve the performance of the techniques developed, e.g. fault detection. How data is used, processed and interrogated can change the performance, usability and usefulness of the ML techniques developed. Usability and usefulness, alongside explainability and trustworthiness, are major factors that prevent many companies from adopting ML techniques, and this project aims to make a step change here.
Scotland, through its prestigious universities and innovative industry sector, is leading the way in producing innovative and impactful research. This project will develop technologies that can be first applied to Scottish-based industries before being adopted more widely. Given that economies and industries need to find alternative ways of boosting their financial robustness, ML can be an assistive technology for achieving this.
Collaboration Sought for the project: We are looking for an Industry Sponsor who has access to large amounts of data, which can be anything from numerical data to time series, images and 3D volume data. Our proposed novel research investigations can provide tools and technologies that can transform and enhance the decision-making process, leverage data that might be collected but not utilised, and optimise actions that might currently rely purely on manual intervention. Our approach aims not only at using the available data for effective decision-making, but also at providing uncertainty estimates so that the confidence of the technique can be evaluated and taken into account when making a decision. Although the proposal focuses on asset management and fault detection/prediction, the proposed technical innovations can be applied to other areas, including but not limited to the environment, healthcare, energy optimisation for households and retail, and the Oil & Gas industry.
Benefit to the Industrial Sponsor: We are currently going through a transformative period, whereby new technologies are needed to drive innovation and sustainability. Being able to use the data and knowledge we already have to extract more information and identify patterns is paramount for boosting productivity, investigating new areas, expanding portfolios and becoming more resilient. An Industry Sponsor, through the proposed project, can further innovate by optimising asset management, forecasting and decision making, which can a) make them more financially resilient, b) identify new areas of activity by automating processes that currently rely on human intervention, hence enabling staff to focus on more important issues that can be used to expand their activities, and c) make operational processes more streamlined by adding artificial intelligence in the loop.
Precision agriculture: Machine Learning- and Expert- based approaches to forecast fruit production
Europe’s food and drink sector employs 4.57 million people and has a turnover of €1.1 trillion, making it the largest manufacturing industry in the EU (Source: Data & Trends. EU Food & Drink Industry 2018. FoodDrink Europe 2018). Given the current financial uncertainty and the lack of manpower to pick fruit, it becomes even more apparent that the agri-food sector needs a step change.
More specifically, in the soft fruit industry, farmers are required to provide the retailers with accurate numbers for their produce to make sure the market demand is being met. Failing to do so accurately can lead to a) underestimating production, hence risking having to waste their produce, adding to food waste and increasing the carbon footprint, or b) overestimating production, hence not meeting the expectations of the retailer, leading to hefty penalties or damage to their reputation. Accurate yield forecasting has long been the holy grail of the fruit industry, which still suffers from high predictive errors due to the various uncertainties that forecasting entails, such as weather and seasonal variations.
Another problem growers have to consider is intra-field fruit crop variation, i.e. specific areas of the field performing worse than others. Modern farms and greenhouses collect various types of data for monitoring purposes. With the advent of big data analytics and Machine Learning, this information no longer needs to go unexploited: it can be used to develop algorithms and models that serve as useful and inexpensive assistive tools for farmers. Currently, farmers rely on experience-based/empirical predictions, which sometimes do not reflect reality because of factors/patterns they fail to consider. That means they might provide wrong estimates to the retailers, leading to the issues we touched upon previously.
The overarching aim of this project is to develop a novel machine learning approach to forecast fruit production and quantify uncertainty, whilst considering intra-field variations. From a technical point of view, we aim to develop new techniques in Deep Learning optimisation and causal inference, through the use of Bayesian approaches, domain adaptation and self-supervised learning. Our goal will be to leverage, collate and incorporate data pertaining to the growth of the plant as well as various environmental parameters. We will also incorporate expert knowledge into the data-driven model to improve the forecasting process, which will be very useful for quantifying and understanding the variations observed within a field.
Scotland, and especially Aberdeenshire, is very strong in soft fruit production, in particular strawberries. Considering the financial uncertainty that the current coronavirus situation has brought about, along with Brexit, it is even more evident that the sector requires the use of advanced technologies to innovate and become more resilient. The proposed project can provide this step change, which can have a very high impact in the agri-food sector.
Collaboration Sought for the project: This project is looking for an Industry Sponsor that can provide large amounts of data corresponding to plant growth, yield, energy/water costs and environmental data (these could be obtained through the Met Office as well). Ideally the Industry Sponsor would be a grower/farmer, a consortium of growers/farmers or any organisation that can provide such datasets. This project can offer the technical innovation needed to extract knowledge and patterns from the data that a human might miss. Therefore, for the success of the project, we are looking for a partner who can actively support the project not only by providing the datasets needed but also by providing their expertise, which can be used to inform the models and studies.
Benefit to the Industrial Sponsor: Given the highly competitive nature of the sector, farmers could benefit from tools not only for monitoring field performance but also for providing them with insights, such as accurate yield predictions and a better understanding of their fields. This project will provide farmers and growers with a powerful tool to assist them with yield forecasting and production estimation at various time horizons, e.g. 1, 2 or 3 weeks ahead, with corresponding uncertainty estimates, so that they can provide retailers with more accurate information about their produce. Financially, considering the current situation, this can be transformative, as it can enable them to achieve better prices for their produce, capture a bigger share of the market through further expansion, and also enhance their reputation in the field by adopting innovative solutions.
Advanced Data Analytics for Condition Assessment of Future Smart Grid Power Electronics
Power converters are becoming an integral part of future smart grids. These provide a necessary interface between the source of power available and the load/network by conditioning the energy flow to meet pre-defined system specs and/or grid code requirements. Hence, their seamless operation is crucial to the healthy operation of future smart grids. For this reason, condition monitoring of power converters is gaining more attention to provide means of predicting component failures and scheduling preventive maintenance schemes. Typically, semiconductor device stresses, capacitor ageing, and inductor/transformer insulation need to be monitored and assessed.
This project will study the condition assessment of a typical power electronic converter interface used in smart grids with the scope of providing an intelligent framework for early warning of potential failures or end-of-life. The project will perform advanced data analytics on measurements obtained from a lab prototype to assess the state-of-health of the power electronics under different operation scenarios for prolonged operational durations. Scenarios will also include timed overload and simulated fault conditions. The aim is to monitor the performance of the converter semiconductor devices and passive components using various sensors and data acquisition means, and to process the large data sets generated through a dedicated Big Data server for machine learning. The data will be analysed and used to provide system intelligence for:
* Early warnings of component failures
* Establishing scheduled preventive maintenance schemes
* Predicting behaviour of power converter operation for periods of time beyond the typical experimental test time.
Companies working in the domain of energy, smart grids, renewable energy and energy storage technologies should be interested.
This is an innovative research area that integrates data science into power electronics interfaces, something which is fairly new to both academia and industry, with the scope of predicting failures, end-of-life and remaining useful life, the last being particularly interesting to end users.
The potential outcomes of this project are the data science driven “engines” that can be applied to condition assessment of power electronics in smart grids and a theoretical framework providing the intelligence behind this procedure.
Since Scotland is one of the global leaders in the renewable energy sector, this is highly likely to present an impactful solution to several industries working in the areas of smart grid solutions. It is definitely a strongly contributing step towards the UK’s net zero economy target.
Collaboration Sought for the project: It is crucial to this project to have an industry sponsor from the energy/smart grid sector to provide use case data/information to inform the research. This includes data on:
* Failure rates
* Failure causes
* Typical end-of-life
* Typical operating conditions (including device related and environmental)
Input will also be sought in the form of samples of failed devices and components to inform system modelling.
The project could also be tailored towards a specific power-electronics-based interface operating at the company premises with the objective of predicting failures/end-of-life, scheduling preventive maintenance schemes, and predicting behaviour during extreme conditions such as faults. This would be of direct economic benefit to the industry sponsor.
Benefit to the Industrial Sponsor: The project will help the industry sponsor predict failures/end-of-life, schedule preventive maintenance schemes, and predict behaviour during extreme conditions such as faults. This would be of direct economic benefit to the industry sponsor.
The project will also help them enter new markets with this powerful predictive tool, as this tool is generic and can be applied to similar and new technologies incorporating power electronic devices and components.
Contact the project supervisor, Ahmed Aboushady email@example.com, for more information.
FABULON: Faster Data Analytics through Better Language-level Management of NUMA
Currently most software is constructed in programming languages like Dart, Java, Go and Scala that use automatic dynamic memory management. Many applications, especially in data analytics and AI, are memory intensive, placing significant demands on the memory management system. A second challenge for memory managers is that memory access latency and bandwidth are becoming increasingly non-uniform as the number of general purpose cores in architectures continues to grow.
While there are good hardware and operating system tools for understanding non-uniform memory access, programming language implementations have far more information about the memory access patterns of an (analytics) program than the operating system, and the ability to dynamically adapt memory management. Hence they are better placed to both profile and optimise access to non-uniform memory. Unfortunately there has been little systematic study of the impact of Non-Uniform Memory Architectures (NUMAs), and especially emergent larger NUMAs, on automatic memory management, and many modern language implementations have limited NUMA adaptation.
High-performance data analytics is an important application domain, and increasingly executes on shared-memory NUMA servers. Our research shows that poor NUMA locality can degrade performance on state-of-the-art machines by a factor of 3. Moreover, current techniques for improving NUMA performance have neither been systematically combined, nor scaled to the massive data volumes in modern data analytics applications. In consequence there are significant potential benefits for data analytics applications on server-size NUMAs.
The FABULON project aims (1) to reach a deep understanding of the performance challenges posed by emergent mid-size NUMAs for modern language implementations, (2) to investigate how to exploit this knowledge to develop better language implementation technologies, and (3) to evaluate the effectiveness of these technologies in improving the performance of real-life data analytics applications.
Collaboration Sought for the project: We are looking for an industry partner that (1) has a compute-intensive data analytics application, (2) runs it on a managed runtime environment on server-size NUMA architectures, and (3) is interested in improving its performance. We envision a collaboration on the characteristics of the application and related workloads, to tune our technologies and improve performance.
Benefit to the Industrial Sponsor: The sponsor will benefit from (1) expert code review of their application; (2) faster execution of the application on large NUMA architectures; (3) faster execution of other applications with similar characteristics. That is, as our techniques operate on the runtime-environment, the improvements directly apply to other applications with similar workloads.
Contact the project supervisors, Hans-Wolfgang Loidl, Heriot-Watt University; Phil Trinder, Jeremy Singer, Glasgow University for more information.
Data matching in absence of unique identifiers for enhancing predictive modelling
In today’s data-driven economy, a massive volume of data opens new opportunities for combining different databases to enhance the accuracy of different predictive models, such as predicting cross-selling opportunities or customer churn. Unfortunately, often there are no unique identifiers that can be used to link different sources of information together, or they cannot be used for privacy-preserving reasons. This limits the data that can be used for model training.
There are various approaches that have been developed in different fields of knowledge to overcome the problem, e.g. propensity score matching in statistics and econometrics or fuzzy matching in computer science. Propensity score matching (PSM) is a collection of statistical algorithms that estimates the missing or unobserved outcome or some aspect of behaviour resulting from some intervention. Fuzzy matching combines numerous variables to create approximate matches which then can be evaluated and ranked.
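To make the idea of fuzzy matching concrete, the following Python sketch combines per-field string similarities into one weighted score and ranks the candidates that clear a cut-off. It assumes the standard library's `difflib.SequenceMatcher` as the similarity measure; the field names, weights, threshold and records are hypothetical and purely illustrative.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two strings, ignoring case and padding."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def record_score(rec_a: dict, rec_b: dict, weights: dict) -> float:
    """Weighted average of the per-field similarities of two records."""
    total = sum(weights.values())
    return sum(w * similarity(str(rec_a[f]), str(rec_b[f]))
               for f, w in weights.items()) / total

def rank_matches(record: dict, candidates: list, weights: dict, threshold: float = 0.8):
    """Return (score, candidate) pairs at or above `threshold`, best first."""
    scored = [(record_score(record, c, weights), c) for c in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)

# Hypothetical records from two databases that share no unique identifier.
WEIGHTS = {"name": 0.6, "postcode": 0.4}
QUERY = {"name": "Jon Smith", "postcode": "AB10 1XG"}
CANDIDATES = [
    {"name": "Jane Smyth", "postcode": "EH1 1YZ"},
    {"name": "John Smith", "postcode": "AB10 1XG"},
]
matches = rank_matches(QUERY, CANDIDATES, WEIGHTS)
print(matches)
```

The weights and threshold control the trade-off between missed links and false links, so in practice they would be tuned against a sample of records whose true match status is known.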
Nevertheless, there is no investigation as to the advantages each approach can offer, and how they can be adapted to suit specific problems. This PhD project will close this gap by providing a comprehensive comparison of existing methods and proposing a new methodology that will incorporate their strengths, in particular by building on Dr Andreeva’s work on clustering combined with collaborative filtering.
Collaborative filtering is the algorithm widely used in recommender systems, e.g. Amazon or Netflix, where new products are offered based on the similarity of a new user to existing users. Clustering splits the data into homogeneous groups and can potentially reduce the time and effort associated with collaborative filtering.
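A minimal sketch of user-based collaborative filtering in Python may help here. It scores unseen items for a target user by a similarity-weighted average of the ratings given by the most similar users. Cosine similarity, the neighbourhood size `k` and the toy ratings are assumptions made for illustration, not the method the project will develop.

```python
from math import sqrt

def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(target: str, ratings: dict, k: int = 2) -> list:
    """Rank items the target has not rated, using the k most similar users."""
    me = ratings[target]
    neighbours = sorted(
        ((cosine(me, other), name) for name, other in ratings.items() if name != target),
        reverse=True,
    )[:k]
    weighted, weight_sum = {}, {}
    for sim, name in neighbours:
        for item, rating in ratings[name].items():
            if item not in me:
                weighted[item] = weighted.get(item, 0.0) + sim * rating
                weight_sum[item] = weight_sum.get(item, 0.0) + sim
    predicted = {i: weighted[i] / weight_sum[i] for i in weighted if weight_sum[i] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)

# Hypothetical ratings: "ann" resembles "bob" far more than "cat".
RATINGS = {
    "ann": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 5, "D": 1},
    "cat": {"A": 1, "B": 5, "C": 1, "D": 5},
}
suggestions = recommend("ann", RATINGS)
print(suggestions)
```

Clustering would enter at the neighbour search: restricting the candidates to users in the target's own cluster avoids comparing the target against every other user.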
There are various measures of similarity and various clustering approaches that can be evaluated in the context of a specific task/problem. The innovation consists in combining the benefits of existing approaches into a new algorithm.
The project will use vast amounts of data, and data integration/ management is one of the key problems of Data Science. The outcomes will be useful to all businesses/ public organisations that rely on predictive modelling in their operations. It will improve the accuracy of predictive modelling, which in turn will lead to increased efficiency and enhanced customer satisfaction.
Collaboration sought: An industrial partner should be willing to offer some proprietary data for analysis that can be matched to other sources. Some sources can be public. The data can be protected by non-disclosure agreement, incorporating the partner’s preferences – this is a standard practice for Dr Andreeva’s work. There is a slight preference for financial services, given Dr Andreeva’s expertise in this field, but any other sectors are welcome.
Benefit to the industry sponsor: The project will develop a tailored solution to suit any needs/problems of the industrial partner. Examples include, but are not limited to, improving the estimates of financial risk, identifying cross-selling opportunities, and predicting which customers are most likely to switch to a competitor. The sponsor will get first-hand access to any project results ahead of any public presentations or publications. There is a possibility of redacting the project outputs in order to preserve confidentiality and commercial interests.
Contact the project supervisor, Dr Galina Andreeva for more information.
Predicting DNA variants causal for altered disease risk using Machine Learning
Genetic studies reveal causal links between DNA and disease risk. However, such links are to DNA variants and not to genes, and do not reveal the molecular mechanisms of disease. Our group is combining machine learning applied to Transcription Factor (TF) binding with Mendelian Randomisation in order to pinpoint individual DNA variants that alter both (1) TF binding affinity and (2) disease risk, and are thus causal. In this project, we seek to apply machine learning, as applied to experimental TF binding data, in order to improve the precision by which TF binding affinity is inferred. This is an essential step toward predicting variants causal of disease risk change.
Collaboration sought: We are looking for either a drug development company (which is interested in investigating the genetic support for drug targets) or an AI/ML company (which is interested in applying new AI/ML approaches to functional genomics data).
Benefit to the industry sponsor: Dialogue and two-way engagement with a research group working at the fertile interface between population size data analysis (e.g. UK Biobank) and functional genomics data (e.g. transcription factor binding data in human primary and cancer cell lines).
Contact the project supervisor, Chris Ponting for more information.
Economic feasibility and environmental impacts of bioenergy in supporting net-zero energy building (NZEB+Bio) in the UK
The energy consumed by a net-zero energy building (NZEB) is as much as the renewable energy generated onsite or elsewhere. NZEB is expected to play an important role in mitigating greenhouse gas (GHG) emissions, and it has received significant attention in recent years. Biomass accounts for around 12% of the world’s renewable energy resources. Distributed bioenergy production serves as a potential way of fulfilling NZEB.
It is important to understand the economic feasibility and environmental impacts of bioenergy on the design of NZEB. This project will design a novel configuration of bioenergy-supported NZEB and will determine the profitability and carbon footprint of the configuration using big-data supported cost-benefit analysis and life cycle assessment. The results will enable policymakers to make informed decisions for the fulfilment of NZEB in the UK.
The project is looking for an industry sponsor in the sector of (not limited to) sustainable/green building development and design, bioenergy technology development, or distributed bioenergy application, that could potentially provide input data on the design of bioenergy-supported net-zero energy building (NZEB). The partnership will enable the PhD candidate to receive training from an industry supervisor and to design a bioenergy-supported NZEB configuration driven by future building industry standards and market demands.
Contact the project supervisor, Siming You for more information.
Mining Arguments from Natural Language Text
Giving machines the ability to understand natural language has been an AI goal for decades. A recent research direction in this area has focussed on “Argument Mining”. This is the automatic identification, extraction, and reuse of arguments from textual resources.
This project will involve a detailed study of the structure of natural language arguments from the industrial partner’s domain, with the aim of devising new and effective computational mining techniques. The successful candidate will be expected to further focus their project, and may choose, for example, to focus on the effective application or extension of existing natural language or machine learning techniques applied to the argument mining domain of the industrial partner.
The core research themes of this project would be to:
- develop & evaluate automated argument mining techniques that can be applied to real-world problems
- extend extant tools for manual argument analysis through the addition of automated mining features so that they can be applied at scale to the creation of training data for supervised machine learning approaches
- research novel techniques for visualising and presenting mined argumentative data to support sense-making of the target domain.
Contact the project supervisor Dr Simon Wells for more information.
A unified approach based on semantic models and continuous deep learning to data uncertainty and inconsistency in smart IoT systems
Smart IoT-based applications, such as smart city and smart factory, are characterised as sensor-driven technology, which tends to produce huge volumes of data with increasing velocity. The resulting data produced by these applications are mostly used to support organisation, planning, interpretation and decision-making activities. However, these data come with a number of quality issues that collectively result in uncertainties and inconsistencies.
In this project, we aim to innovatively integrate semantics-based data modelling and analysis with continuous deep learning to provide a novel effective solution to the above problem.
The semantic data model will provide a machine-understandable foundation for the IoT data and its analysis, and will be able to produce a near real-time solution for the detection and correction of IoT data uncertainties. However, this model may be too static and imprecise to cope with the highly dynamic nature of IoT systems and the data they generate. Therefore, we propose to use deep learning to support the continuous evolution of the semantic model and its data analysis algorithms.
Collaboration sought: We are looking for an industrial partner in the following area(s):
1. Provider of smart IoT applications, e.g. smart city, smart building, smart factory, smart transport, smart vehicle, etc.;
2. Developer of smart IoT applications, e.g. smart city, smart building, smart factory, smart transport, smart vehicle, etc.;
3. Company specializing in data modeling and analysis;
4. Company specializing in smart sensors, IoT networks and devices.
Contact the project supervisor Prof Xiaodong Liu for more information.
Artificial Intelligence Based Communication System for Collaborative and Fault-tolerant Multicast Music Distribution
Advances in data communication make it easy today for a group of users to work remotely and collaboratively, produce rich multimedia content and distribute it to a large audience over the Internet. IP multicast constitutes an effective communication method that saves both network bandwidth and processing overhead, especially when different sources are involved. However, real-time communications are highly sensitive to packet loss. This is especially the case for live music concerts. To address this issue, several traffic engineering and fault tolerance approaches have been devised. These techniques include, but are not limited to, audio/video compression, network buffer management, queuing algorithms, and traffic classification and prioritisation.
Artificial intelligence could further improve the reliability of the transmission, especially when multiple sources are involved in a single broadcast application. By monitoring the communication pattern and the network performance, artificial intelligence processes could be introduced into the communication framework to address any audio/video quality degradation or loss. A possible solution consists of creating virtual packets inside the network infrastructure or injecting artificially generated ones at the user’s end to replace missing critical data packets.
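The idea of replacing missing packets at the user's end can be sketched with a very simple baseline in Python. Here a lost audio packet is filled in by averaging, sample by sample, the nearest packets received before and after it. This interpolation is a stand-in for the learned models the project envisages; the function name `conceal_losses`, the packet layout (a list of samples, with `None` marking a loss) and the sample values are hypothetical.

```python
def conceal_losses(packets):
    """Return a copy of `packets` in which each lost packet (None) is replaced.

    A gap between two received packets is filled with their sample-wise
    average; a gap at the start or end of the stream repeats the nearest
    received packet. If nothing was received at all, gaps stay as None.
    """
    filled = list(packets)
    for i, packet in enumerate(packets):
        if packet is not None:
            continue
        # Nearest packets actually received on either side of the gap.
        before = next((packets[j] for j in range(i - 1, -1, -1)
                       if packets[j] is not None), None)
        after = next((packets[j] for j in range(i + 1, len(packets))
                      if packets[j] is not None), None)
        if before is not None and after is not None:
            filled[i] = [(a + b) / 2 for a, b in zip(before, after)]
        elif before is not None:
            filled[i] = list(before)
        elif after is not None:
            filled[i] = list(after)
    return filled

# Hypothetical stream of two-sample packets with the middle packet lost.
STREAM = [[0.0, 0.5], None, [1.0, 0.25]]
repaired = conceal_losses(STREAM)
print(repaired)
```

A learned approach would replace the averaging step with a model that predicts the missing samples from a longer context, while keeping the same detect-and-substitute structure.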
This thesis project therefore aims to explore how artificial intelligence and deep learning techniques could be applied to improve the reliability and the quality of multicast distributed concerts where a set of musicians collaborate remotely to record or play an album. The new proposed techniques could be embedded into the advanced audio-visual streaming technology LOLA that has been developed by Edinburgh Napier University and tested with musicians in Edinburgh, London, and Boston [World first for transatlantic real-time album recording].
Collaboration sought: A multimedia content publisher or distributor is required. Their expertise in audio/video compression and transmission over IP networks will help to develop new reliability and quality of service techniques to broadcast real-time multimedia content to a large audience.
Contact the project supervisor Dr Imed Romdhani for further information.
Smart algorithms to solve large-scale optimisation problems
Optimisation problems can be found everywhere. Examples include: finding good parameters for a model or process, scheduling and logistics, resource allocation, or finding the shortest paths for a vehicle. Sometimes there is more than one goal (such as improving both monetary cost and efficiency), and here the optimisation problem is about finding the trade-off between these goals so an informed choice of solution can be made. These problems are usually also rooted in an underlying data set, capturing the specifics of the application (e.g. databases of orders that need to be satisfied, resource demand over time, or regional-scale maps of locations).
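The trade-off between competing goals can be illustrated with a small Python sketch that keeps only the non-dominated (Pareto-optimal) solutions, assuming cost is to be minimised and efficiency maximised. The plan names and numbers are hypothetical, and this brute-force filter is only meant to show the concept; it would not scale to the large problems this project targets.

```python
def dominates(a: dict, b: dict) -> bool:
    """True if `a` is at least as good as `b` on both goals and strictly
    better on at least one (lower cost, higher efficiency)."""
    no_worse = a["cost"] <= b["cost"] and a["efficiency"] >= b["efficiency"]
    better = a["cost"] < b["cost"] or a["efficiency"] > b["efficiency"]
    return no_worse and better

def pareto_front(solutions: list) -> list:
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# Hypothetical candidate plans for the same scheduling problem.
PLANS = [
    {"name": "A", "cost": 100, "efficiency": 0.70},
    {"name": "B", "cost": 120, "efficiency": 0.85},
    {"name": "C", "cost": 130, "efficiency": 0.80},  # worse than B on both goals
    {"name": "D", "cost": 90, "efficiency": 0.60},
    {"name": "E", "cost": 150, "efficiency": 0.85},  # same efficiency as B, costs more
]
front = pareto_front(PLANS)
print([p["name"] for p in front])
```

Each plan left on the front is a defensible choice: moving from one to another improves one goal only by giving up some of the other, which is the informed choice the decision maker is then asked to make.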
The core research themes of this project would be to:
(1) devise methods to intelligently search through possible answers to an optimisation problem; exploiting what human experts already know about the problem; and specifically for large-scale problems how to break the space of possibilities down to make it easier to solve
(2) develop approaches for communicating the answers to large-scale problems in an intuitive way
(3) research ways of explaining why particular solutions were chosen.
Contact the project supervisor Sandy Brownlee firstname.lastname@example.org for more information.
Realising a Flexible Quality Framework for Managing Data Assets
This project will explore the following questions:
– How can data veracity measures (metrics) be encoded and enacted within a data ecosystem?
– How can data provenance be used to support new forms of veracity checking and anomaly detection?
– How can data policies be framed to reason about data veracity, and recommend appropriate decision-making actions?
Transparent & Accountable Data Management for the Internet of Things
Building on an existing portfolio of research into data transparency and provenance, the proposed project will examine the following questions: What characteristics of IoT devices and their behaviours are necessary to formulate a model of transparency? How do we represent norms against which devices (and the ecosystems of which they are a part) can be held to account?