Data to Discovery Seminar Series
Data to Discovery is a monthly seminar series organised by the Data Science Unit where you tell us about your data.
We often see data as just characters on our computer screen. However, each character has a story to tell: an ingenious machine that breaks the genome down into pieces and reads its sequence; an amazing person who fights incurable disease and volunteers for case studies; and a visionary who spends a lifetime looking at these data to find hidden patterns and meaning.
In this seminar series, we will learn about data from different fields of science: how they are collected - the technology and the human efforts behind it - and how they contribute to a better understanding of the world around us.
We will hear from experts in their respective fields of study, and learn about:
what field of science they work in (e.g., gene therapy or earthquake modelling),
what the nature of the data associated with it is (e.g., tabular, text, images, time series, networks, structured/non-structured, big/small, readily available/very rare etc.),
where the data comes from (e.g., the data might come from a global effort in understanding a socio-economic issue, or it might come from a patient in critical care),
how the data is collected (e.g., data collection often uses cutting-edge techniques such as sequencing the human genome, or observing a material in extreme conditions),
how the data is processed and analysed (e.g., it might involve simple statistical tools with careful prior choices or state-of-the-art machine learning on big data),
what has been done with this kind of data (e.g., the questions that have been answered so far, such as the functionality of a gene, or the responsiveness of a drug),
what the open/ground-breaking questions are that need answering (e.g., what the respective field of study is trying to achieve as a community),
whether the data is publicly available (e.g., whether the data and/or similar data are easily available; if so, where and how one can access it; and if not, what the alternative is).
The seminar series is supported by the CDT in Data Science.
List of upcoming seminars
We are working out the agenda for the coming months; watch this space!
List of past seminars
Scotland is home to one of the world’s fastest tidal channels at the Pentland Firth, and a recent government announcement (the CFD AR4 renewable energy contracts) is accelerating activity to establish the world’s largest tidal energy farm. However, high levels of uncertainty in the hydrodynamic characterisations of our dynamic marine environments lead to technical and commercial risk for developers and operators of offshore renewable energy devices.
The combined action of waves, currents and turbulence leads to 3D velocity fields that vary strongly in both time and space, on timescales ranging from seconds (both random turbulence and deterministic wave orbital motion) to days, months and years (based on the predictable interaction of the sun-earth-moon tidal system). This talk will give an introduction to the School of Engineering’s efforts on combined measurement and modelling of these energetic sites, which have been conducted as part of multiple industrial-academic research projects over the ten years from 2012 to 2022.
Brian Sellar is a Chancellor’s Fellow in Offshore Renewable Energy (ORE) with a PhD from the University of Edinburgh (2013) in novel distributed sensing for real-time wave field measurement and over 10 years of research experience working on industrial and academic projects in the area of wave and tidal energy. As work-package lead of a major European tidal energy project (www.realtide.eu) - where he was responsible for data capture, processing and provision - he delivered multi-year measurement and modelling campaigns, together with novel sensor demonstration. Brian recently led the UK Supergen ORE Hub FASTWATER tidal resource modelling project that combined University of Edinburgh expertise across the School of Engineering and the School of Maths. He currently leads the University’s involvement in the EPSRC HAPiWEC project and is Co-I on the £14m European ILIAD (https://www.ocean-twin.eu/) project that seeks to build tools to enable “Digital Twins of the Ocean”.
Many of the world's buildings are unmapped and are not included in census data. Missing data about human settlements is a problem, for example, in disaster response or in delivering basic services such as electricity or vaccination. This talk will go through methods for identifying buildings in high-resolution satellite imagery, including challenging settings such as informal urban settlements, and how these were applied to create the Open Buildings dataset, containing the geometry of 817M buildings across Africa and South/Southeast Asia.
John Quinn is a Senior Research Software Engineer at Google Research in Ghana and Research Director of Sunbird AI in Uganda. He was previously technical lead for Africa projects at United Nations Global Pulse, and Senior Lecturer in Computer Science at Makerere University in Uganda. He has worked on a number of large-scale AI projects across the African continent, in the fields of remote sensing, speech and language, agriculture and health. He holds a BA in Computer Science from the University of Cambridge (2000), and a PhD in Machine Learning from the University of Edinburgh (2007).
Theory and simulation suggest that the very first generation of stars to form in the Universe were predominantly much more massive than the sun. However, these predictions are in stark contrast to observations of the local Universe, where the vast majority of stars are smaller than the sun and those significantly more massive are vanishingly rare. This transition in star formation took place in the first few hundred million years after the Big Bang when galaxies were first starting to form, an era known as Cosmic Dawn. Studying this epoch requires modelling several nonlinear physical processes and is thus primarily the domain of complex computer simulations. In this talk, I will show how we are able to simulate the formation of the first "modern" stars from initial conditions that represent the state of the Universe shortly after the Big Bang. These simulations produce large and complex data, including 3D multi-resolution "snapshots" with extreme dynamic range (the ratio of the largest to smallest sizes) and "merger trees" which describe the hierarchical growth of cosmic structures. I will discuss some of the open-source tools that we have developed to improve and simplify our analysis of this data and ways in which they can be extended to other scientific domains.
Britton Smith is a computational astrophysicist in the Institute for Astronomy at the University of Edinburgh. He got his PhD in Astronomy and Astrophysics in 2007 from Penn State University and held postdoctoral and research scientist positions in Colorado, Michigan, Edinburgh and California before returning to Edinburgh in 2019 as a Chancellor's Fellow. Britton studies the formation of the first stars and galaxies using computer simulations and is a developer of several open-source scientific software packages.
Crystalline materials form the basis of many of the technologies we rely on, ranging from the silicon transistors powering our computers to the sugar powering us. The principal tool for determining the arrangement of atoms in a material (X-ray diffraction) was discovered just over a century ago, and in that time chemists have built up vast databases of materials which continue to grow exponentially. In this talk, I will discuss the information that is contained within diffraction data, its impact on our understanding of the physical properties of materials, and how it is changing the way chemists make new materials. In relation to my own research, I will highlight how atomic structure data can be used to discover new materials, and some of the challenges in harnessing these data.
James Cumby is a lecturer in inorganic chemistry at the University of Edinburgh. He gained his PhD in materials chemistry from the University of Birmingham (2010-2014) followed by a postdoctoral research position in the Centre for Science at Extreme Conditions (CSEC) at Edinburgh. His research group is focussed on developing new materials with useful (and unusual) physical properties such as magnetism or electronic conduction, using a combination of experimental techniques and computational modelling. A particular area of interest is in using data-driven learning to predict new materials.
Much of the routine data generated during the provision of medical care in critical care units is discarded rather than fully used to help clinicians improve treatments. Multi-centre ‘intensive-care big-data’ initiatives such as the adult BrainIT group have successfully improved adult brain trauma care with new research ideas and data-driven improvement interventions. With the support of a prestigious EU grant, I have successfully set up a new paediatric brain trauma ‘big-data’ initiative (KidsBrainIT) that uses high-quality bedside physiological data from patients recruited in 15 PICUs in 5 countries to better understand the importance of bespoke management improvements (e.g. treatment of increased brain pressure from brain swelling). I am using KidsBrainIT as a proof-of-concept to demonstrate the benefits of data-intensive informatics in improvement research and to translate this concept into IMPACT-ACE, which will ultimately improve patient care, safety and outcome in the broader critical-care setting and other medical specialities in the future. In this talk, I will summarise the challenges we have overcome using this research approach and how we may make better use of big data generated routinely within critical care units.
Dr. Milly Lo is a consultant paediatric intensivist and research lead at the Royal Hospital for Sick Children in Edinburgh. Dr. Lo’s research training included successful completion of a PhD degree in Edinburgh and post-doctoral research training in Toronto (Canada) to better understand how acute physiological insults, genetic, and biochemical factors influence life-threatening childhood brain trauma outcome. This research training prepared Dr. Lo for the role of Hon. Reader at the University of Edinburgh. Dr. Lo is the first paediatric intensivist to be awarded a prestigious EU research grant to set up and lead an international data informatics paediatric brain trauma research initiative called KidsBrainIT. Dr. Lo was also the first paediatric intensivist to be awarded an NRS Career Research Fellowship in 2013 to set up and lead the first data informatics improvement research programme in paediatric critical care in the UK (IMPACT-ACE). Dr. Lo’s current research focuses on applying a data informatics approach to big data generated from routine clinical care, with the aim of improving patient treatments, outcome, and safety in the paediatric critical care setting. This approach minimises data wastage and helps to unlock vital information and research ideas that allow a more personalised approach to tailor treatment for our patients.
Perfectly ordered structures have been reported to drastically outperform traditional packing in a variety of applications in chemistry and engineering. While this used to be a rather theoretical concept, 3D printing now enables the fabrication of such ordered structures, with complex geometry, and with resolution at the micron scale. In this lecture I will present a holistic toolbox to design, manufacture and characterize such structures. In my research group we blend a range of modelling and experimental methods, from fluid dynamics to machine learning, from materials science to engineering practice. I will demonstrate how our approach to 3D printing delivers optimized structures and materials with improved performance, with specific focus on applications in the separation sciences (e.g. chromatography) and biotechnology sectors (e.g. bioreactors). Hopefully this talk will spark your interest in this topic, and make you realize how 3D printed structures could complement and boost your research, regardless of its background and scope!
Simone Dimartino is a Senior Lecturer at the Institute for Bioengineering at the University of Edinburgh. He did his PhD at the University of Bologna on membrane-based separations in the biopharmaceutical industry (2009), followed by an academic position at the University of Christchurch, New Zealand, where he explored new separation methods for the production of biologics. He now employs 3D printing methods for the fabrication of devices with perfectly ordered internal morphology, with applications ranging from bioseparations and biocatalysis to heat transfer. To learn more about his research, please watch the fun science communication video and the interview on the future of 3D printing and chromatography.
We admit around 10,000 patients to Intensive Care Units (ICUs) in Scotland every year, with conditions such as sepsis, cardiac arrest and trauma. These patients are critically unwell, and despite best care, approximately 18% of patients admitted to ICU will die during their hospital admission. Significant amounts of data are collected on every patient admitted to the ICU, including not only characteristics such as age, sex, and comorbidity, but also beat-to-beat information on heart rate, blood pressure and other markers of critical illness. This information helps us as clinicians in our diagnosis and management of each individual critically ill patient.
However, there are as yet unrealised potential uses for this data to improve outcomes for our whole population. Analysis of routine text healthcare data will enable us to characterise different healthcare phenotypes, identifying patients at risk of adverse events, or who may benefit from different interventions. Analysis of high-frequency physiological patient data could enable us to detect and potentially predict adverse events early and implement management changes quickly and effectively.
Collaboration between data scientists and clinicians using this routinely collected healthcare data could transform the way that patients are managed in the ICU and ultimately improve outcomes for this critically unwell population.
Annemarie Docherty is a Wellcome Clinical Research Development Fellow based in the Usher Institute, and Consultant in Critical Care at the Royal Infirmary Edinburgh. Her PhD, in myocardial injury in critically ill patients with co-existing cardiovascular disease (CVD), found that a quarter of patients with co-existing CVD have a heart attack in the first ten days of their ICU admission. Fewer than 5% of these heart attacks were diagnosed clinically, and these patients had greater mortality. She leads ICU-HEART (Intensive care - Cardiovascular disease: Use of HEAlthcare Routine data to inform Trial design), a programme of work that aims to improve outcomes for critically ill patients with co-existing cardiovascular disease using routine national healthcare data in Scotland and England.
Tracking the progress of the Sustainable Development Goals and targeting interventions requires frequent, up-to-date data on social, economic and ecosystem conditions. My research seeks to examine the role that remotely sensed satellite data could have in mapping and monitoring socioeconomic conditions by exploring how household wellbeing and deprivation can be predicted from land use maps and building roof material type, both derived from fine spatial resolution satellite data. We demonstrate that satellite data can predict wellbeing in Kenya with between 51 and 62% accuracy. Prediction accuracy was higher when using a multi-level approach to linking households to landscapes, and most land use changes between 2005 and 2014 were observed in homesteads of the poorest households. High-resolution satellite data could provide a faster and cheaper way to track several SDGs, but work so far across several research groups and countries has been based on secondary data analysis. A further challenge lies in upscaling the work to regional and national levels to make it relevant to policy makers. We are exploring various approaches, including CNNs, to identify how we might best move forward.
Gary Watmough is an Interdisciplinary Lecturer in Land use and socio-ecological systems and deputy director of the MSc in Earth Observation at the School of Geosciences, University of Edinburgh. Prior to this position he was a Marie Skłodowska-Curie Postdoctoral Research Fellow in Biosciences at Aarhus University in Denmark (2015-2017) and a Postdoctoral Research Fellow in the Earth Institute at Columbia University, New York, USA (2012-2015). Dr. Watmough has worked as an interdisciplinary scientist on several projects linking Earth Observation data with socioeconomic datasets in Nepal, India, Kenya and Mozambique. He has also worked with the International Fund for Agricultural Development (IFAD) examining the role that Earth Observation data can have in monitoring and evaluating development interventions. He gained his PhD in Remote Sensing and Spatial Analysis from the University of Southampton, UK.
X-ray microtomography is an imaging technique that enables reconstructing the internal structure of objects based on variations in x-ray absorption and phase contrast. The x-rays produced by synchrotron light sources are even bright enough to see through pressure vessels and capture changes to the internal structure of rock samples during geological processes in experiments. Time-resolved (4D) microtomography, which produces vast amounts of image data, is currently revolutionising experimental geosciences. These data allow the quantification and interpretation of grain (i.e. micron-) scale processes in rocks. When combined with more conventional mechanical, chemical, hydraulic and thermal data, they enable significant advances in our understanding of tectonic processes. These advances are currently curtailed by our limited ability to optimize tomographic data acquisition, streamline data processing and, most importantly, mine, combine and interpret the experimental data. In this talk I will outline our group’s experimental work with (synchrotron-based) 4D x-ray microtomography and describe our interfaces with data science. I will report on our current data analysis strategies and discuss our stumbling blocks in data processing and interpretation.
Florian Fusseis is a Senior Lecturer in Structural Geology at the School of Geosciences, UoE. Together with Ian Butler, he runs the 4D x-ray microtomography group at the School. Florian’s research interests include fluid-rock interaction and rock deformation as well as synchrotron x-ray imaging. Ian and Florian are considered pioneers in in-situ microtomography experiments at geological conditions and have studied rock deformation, reaction and fluid flow in dozens of synchrotron experiments over the past ten years.
Modern society is unsustainable. This fact highlights the imperative for transformational change to the industrial system and the materials basis of modern society, i.e., the ‘total materials system’. ‘Computational industrial ecology’ is an emerging approach that aims to leverage data and modelling tools to quantify the total materials system in order to understand how to improve its efficiency and reduce its environmental burdens. The Industrial Ecology Team is currently working towards this aim by developing a data structure for sustainability science data, a relational database to efficiently contain these data for querying, and an algorithm to unify these data into a model of the total materials system. In this seminar, we will introduce industrial ecology, discuss current research activities, and identify some potential areas for future research.
Rupert J. Myers is a Lecturer in Chemical Engineering: Industrial Ecology at the University of Edinburgh. His scholarly journey through various engineering and science disciplines, from Melbourne to Sheffield, EMPA, Berkeley, Yale, MIT, and Edinburgh, has been driven by a mission to reduce environmental burdens via sustainable engineering. He currently champions this mission by leading University learning in industrial ecology, and by focussing his research on globally pervasive materials that are virtually unmatched in importance to society, such as cement and metals. In 2015 he was awarded the Mike Sellars Medal for best PhD thesis in the University of Sheffield’s Department of Materials Science & Engineering.
The discovery of new classes of porous adsorbents such as metal-organic frameworks (MOFs) has opened access to a very large number of porous structures with a wide range of functionalities, which can potentially be exploited in different separation applications. Experimental evaluation of all these materials for specific applications is not feasible, which has prompted the development of high-throughput computational screening methods.
In this presentation, I will reflect on the development of the multiscale strategies that combine molecular simulations and pressure swing adsorption modelling and optimization to predict the performance of materials on the process scale. Specifically, I will focus on the challenges associated with the interface between molecular and process levels of description and demonstrate that the emerging picture is quite complex. As a case study, a well-known 4-step vacuum swing adsorption (VSA) cycle with light product pressurization (LPP), with Zeolite 13X as the adsorbent, applied to carbon dioxide removal from a typical flue gas stream (15% CO2, 85% N2, 1 atm) will be considered. I will discuss (a) the effect of the protocol for fitting experimental adsorption data with analytical adsorption models (e.g. dual-site Langmuir model), (b) the influence of the pellet porosity and (c) the influence of the pellet size on the process performance and material ranking. Another aspect of the multiscale strategies we intend to explore is the accuracy of the molecular force fields, particularly in reproducing nitrogen isotherms, and how this affects predictions for the performance of the material in a process and the resulting ranking.
Prof. Lev Sarkisov obtained his PhD in Chemical Engineering from the University of Massachusetts (Amherst, USA) in 2001. Following postdoctoral research posts at Northwestern University (2001-2003) and Yale University (2003-2005), he joined the University of Edinburgh in 2005 as a Lecturer in Chemical Engineering. He was promoted to Senior Lecturer in 2010 and Professor in 2017 with the Personal Chair in Molecular Thermodynamics, and became Director of Discipline in Chemical Engineering (equivalent to the Head of the Department) in 2018. Prof. Sarkisov’s group specializes in multi-scale approaches to the design of novel, functional porous materials for carbon capture, sensing, energy storage and drug delivery; multi-scale approaches to engineering chemical processes; adsorption and membrane separation processes; and molecular simulations applied to chemical engineering problems.