# Study with us

A list of all the possible topics of research within ICSA.

For information on postgraduate study within Informatics please refer to the Postgraduate section on the Informatics website.

## Possible PhD Topics in ICSA

This is the list of possible PhD topics suggested by members of staff in ICSA. These topics are meant to give PhD applicants an idea of the scope of the work in the Institute. Of course applicants can also suggest their own topic. In both cases, they should contact the potential supervisor before submitting an application.

### NEW! Dynamic Code Analysis and Optimisation

Prospective Supervisors: Björn Franke

While static analysis attempts to derive code properties from source code or some intermediate representation, much more information about a program becomes available during its execution. For example, many concrete values of variables are not known at compilation time, but only become available once the program is running. Dynamic code analysis attempts to extract useful program information at runtime, either in an offline profiling stage or in a runtime system much like a just-in-time compiler. In the latter case, there is the additional challenge that code instrumentation and analysis must not impact performance too much. Dynamic information can be used to drive code optimisations, possibly speculatively, including parallelisation.
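As a flavour of what dynamic analysis can expose, the sketch below (a hypothetical Python example, not an ICSA tool) instruments a function to record the concrete argument values it sees at runtime. A parameter the compiler must treat as unknown turns out to be constant in practice, making it a candidate for speculative specialisation:

```python
from collections import Counter

def value_profile(func):
    """Instrument func to record the concrete values of each parameter."""
    profile = Counter()

    def wrapper(*args):
        for index, value in enumerate(args):
            profile[(index, value)] += 1  # count values seen per parameter
        return func(*args)

    wrapper.profile = profile
    return wrapper

@value_profile
def scale(x, factor):
    return x * factor

# Simulated workload: statically, 'factor' could be anything;
# dynamically, it is always 3.
results = [scale(i, 3) for i in range(100)]

# Parameter 1 took a single value in all 100 calls, so a runtime
# system could speculatively specialise 'scale' for factor == 3.
```

A just-in-time compiler acting on such a profile would also need a guard to fall back to generic code if the speculation ever fails.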

### NEW! High-level Hardware Synthesis of Neural Networks in Lift

Prospective Supervisors: Christophe Dubach

Lift is a novel approach to achieving performance portability on parallel accelerators. Lift combines a high-level functional data parallel language with a system of rewrite rules which encode algorithmic and hardware-specific optimisation choices. The goal of Lift is to target many different types of accelerators from a single high-level source code. This project will investigate the ability to synthesise neural networks onto FPGAs using Lift.

### Optimisation for Deep Learning on Embedded Devices

Prospective Supervisors: Michael O'Boyle

Two PhD studentships are available under the supervision of Prof. Michael O'Boyle within the Institute for Computing Systems Architecture at the School of Informatics, University of Edinburgh, to begin in 2017 (start date flexible). Both studentships are in collaboration with ARM. The projects are concerned with the efficient implementation of deep learning networks on constrained devices. While there has been much activity on how to efficiently learn a network from large training data, there is much less on how to deploy it efficiently on a resource-constrained device. The best network and code structure depends on the scenario, and there will be a trade-off between space, time, energy and accuracy. The projects will investigate code optimisations such as code specialisation, hyper-parameter exploration, auto-tuned libraries and reduced bit-width data representations to explore these trade-offs. How to update and adapt the network to new data could also be an area of research.

Bonseyes PhD places

### Advanced JIT compilation for mobile devices

Prospective Supervisor: Bjoern Franke

Just-in-time (JIT) compilation is a widely used dynamic compilation strategy. It aims to provide application portability whilst minimising the compilation overhead on the target device. In this project we aim to adapt and extend JIT compilation technology to mobile devices, which typically contain less powerful processors than desktop or notebook computers and, being battery powered, need to meet tight energy constraints.

### Auto-Parallelisation

Prospective Supervisors: Michael O'Boyle, Bjoern Franke

The aim of this project is to develop advanced compiler technology that can take emerging applications and automatically map them on to the next generation multi-core processors such as the IBM Cell. This PhD will involve new research into discovering parallelism within multimedia and streaming applications going beyond standard data parallel analysis. The project will also investigate cost-effective mapping of parallelism to processors which may include dynamic or adaptive compilation.

### Design and Optimisation of Multi-core Heterogeneous Systems

Prospective Supervisors: Nigel Topham

Multi-core DSP systems offer challenges in the design of the memory and interconnect architecture so that the DSPs can optimally access system resources without conflict. This PhD project will focus on optimisation strategies for designing the hardware architecture and on efficient ways of partitioning software across cores, to obtain the highest performance for typical use-cases.

### Compilation of DSP Algorithms via a High-level Algorithm Description Language

Prospective Supervisors: Nigel Topham

One of the challenges of DSP and compiler design is the difficulty of writing compilers that can translate high-level C code to DSP instruction sets without using low-level directives or intrinsic functions. This means that porting code from one DSP to another can be very time consuming. Code development is further complicated by hardware resource constraints and last-minute algorithm adjustments. This PhD project will look at developing a high-level algorithm description language that is easy to write and maps well onto different DSP instruction sets. The project will then look at techniques to allow the performance of DSP algorithms to be tuned automatically according to the available hardware and a set of performance constraints.

### Co-design of DSP Cores and Compilers

Prospective Supervisors: Nigel Topham

Designing compilers to compile code efficiently onto DSP cores is an ongoing challenge, particularly if the processor has external hardware acceleration. Most compilers require the use of machine-specific directives or intrinsics in order to optimally compile for accelerators, which means that porting code from one DSP core to another is very time consuming. The task of scheduling software is also made more difficult due to the latency of hardware accelerators. This PhD project will first focus on the co-optimisation of the DSP core and compiler technology to allow high-level C to be used. The project will then look at application-specific methods to optimise accelerator hardware in terms of execution speed, power consumption and silicon area.

### Compilers that Learn to Optimise

Prospective Supervisors: Michael O'Boyle

Develop a compiler framework that can automatically learn how to optimise programs.

Rather than hard-coding a compiler strategy for each platform, we aim to develop a novel portable compiler approach that can automatically tune itself to any fixed hardware and can improve its performance over time. This is achieved by employing machine learning approaches to optimisation, where the machine learning algorithm first learns the optimisation space and then automatically derives a compilation strategy that attempts to generate the "best" optimised version of any user program. Such an approach, if successful, will have a wide range of applications. It will allow portability and performance of compilers across platforms, eliminating the human compiler-development bottleneck.
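A toy sketch of the idea, under the assumption that programs are summarised by numeric feature vectors and that the best optimisation for each training program has already been found offline; the features and optimisation names below are invented for illustration, not drawn from an actual compiler framework:

```python
def nearest_neighbour(training, features):
    """Pick the best-known optimisation of the most similar training program."""
    def dist(a, b):
        # squared Euclidean distance between feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, best_opt = min(training, key=lambda item: dist(item[0], features))
    return best_opt

# (program features, best optimisation found offline by exhaustive search);
# feature 0 ~ loop intensity, feature 1 ~ branch intensity (hypothetical)
training = [
    ((0.9, 0.1), "unroll"),      # loop-heavy, few branches
    ((0.2, 0.8), "if-convert"),  # branch-heavy
    ((0.5, 0.5), "vectorise"),
]

# A new, unseen program: loop-heavy, so the learned heuristic
# predicts loop unrolling without any search on the machine itself.
choice = nearest_neighbour(training, (0.85, 0.15))
```

Real systems replace the nearest-neighbour lookup with richer models and feature sets, but the structure (learn offline, predict for unseen programs) is the same.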

### Design, Analysis and Optimisation of Next-Generation Mobile Cellular Networks

Prospective Supervisors: Mahesh Marina

Mobile cellular networks are witnessing dramatic changes owing to the exponential increase in mobile data usage and the emergence of devices like smartphones and tablets in recent years. At the same time, mobile network operators are faced with declining average revenue per user (ARPU) in view of greater competition as well as the increasing infrastructure investment needed to keep up with the rising demand for high-performance mobile data networks. This has given rise to several new approaches, currently under investigation, to better cope with the mismatch between user demand and ARPU trends. Embracing heterogeneity in the cellular network infrastructure, with the inclusion of consumer- and third-party-owned base stations (femto cells and pico cells), is one such approach towards denser infrastructure at lower cost to the operators. Re-architecting radio access networks (RANs) to make base stations low cost, low power and easy to deploy, by centralising as much of the base station processing as possible, is another approach, sometimes referred to as cloud RANs. Yet another approach involves offloading traffic at times of peak load to other co-located wireless networks (e.g., WiFi), and the related concept of opportunistic secondary use of other licensed spectrum (e.g., TV white spaces) via cognitive radios. There is also much scope for innovation through network monitoring and analysis for service and infrastructure optimisation: the former to generate new revenue streams, the latter to reduce mobile network operating expenditure (OPEX) via self-organising network (SON) functionalities.

The aim of this project is to investigate cutting-edge research issues concerning network architectures, performance analysis, optimisations and holistic resource management protocols and algorithms (including interference coordination) within the context of the aforementioned approaches. Exploring the benefit of software-defined radios (SDRs) and networking platforms in addressing these issues is also within the scope of this work.

### Distributed algorithms and protocols for mobile and wireless networks

Prospective supervisor: Rik Sarkar

The goal is to develop algorithms and protocols for processing data inside a network, and for answering questions about this data. This is useful in sensor and mobile networks, where local computation abilities can be used to make the system more efficient and reduce communication. Mobile networks have additional properties, such as moving nodes and GPS capabilities, which create additional challenges as well as opportunities in protocol design. These topics can be explored both analytically, through algorithmics, and experimentally, through simulations and implementations.
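A minimal sketch of one such in-network technique, tree-based aggregation: assuming a sensor network organised as a tree rooted at a sink, each node combines its children's partial sums with its own reading and forwards a single message, so communication is one message per link rather than one per raw reading relayed hop-by-hop (topology and readings below are invented for illustration):

```python
def aggregate(tree, readings, node):
    """Return (partial_sum, messages_sent) for the subtree rooted at node."""
    total, msgs = readings[node], 0
    for child in tree.get(node, []):
        subtotal, sub_msgs = aggregate(tree, readings, child)
        total += subtotal
        # one message from child to parent carries the child's partial sum
        msgs += sub_msgs + 1
    return total, msgs

# Example topology: node 0 is the sink.
tree = {0: [1, 2], 1: [3, 4], 2: [5]}
readings = {n: n * 10 for n in range(6)}

total, messages = aggregate(tree, readings, 0)
# Every non-sink node sends exactly one message (5 messages in total),
# whereas relaying each raw reading to the sink would cost one message
# per hop per reading.
```

Questions such as "what is the average sensor reading?" then reduce to carrying an extra count alongside the partial sum.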

### Energy and Area Modelling for Architecture Synthesis

Prospective Supervisors: Nigel Topham

Investigate ways in which physically accurate wire length models can be constructed from high level hardware specifications.

As on-chip transistors shrink, the contribution of inter-gate wiring delays to the overall timing and power consumption of logic circuits becomes more pronounced. This will have a profound effect on the design of future microprocessors, and on the ways in which automated design methodologies model delay and power consumption. For example, the inaccuracies of statistical wire-load models become so severe below 0.18μm that they are of little practical use. The alternative, which is to use some form of physical synthesis, is too computationally expensive to use in the kinds of iterative design-space exploration where hundreds or thousands of competing hardware structures must be evaluated.

### Low-power Multi-threaded Architectures

Prospective Supervisors: Nigel Topham

Explore the possibilities for low-power multi-threaded architectures, particularly those aimed at real-time embedded micro-controller applications.

Multi-threading is an established technique for time-slicing the micro-architectural structures in a deeply-pipelined processor. It allows the hardware resources to achieve better utilisation, yielding higher overall processor throughput, and for real-time computing it offers a good solution for guaranteed response times. However, the replication of resources and the provision of hardware thread schedulers typically lead to complex processor designs that are physically large and consume lots of energy.

### Memory Consistency Models and Cache Coherency for Parallel Architectures

Prospective Supervisor: Vijay Nagarajan

Parallel architectures (e.g. multicores, manycores and GPUs) are here. Since performance on parallel architectures is contingent on programmers writing parallel software, it is crucial that parallel architectures are "programmable". The memory consistency model, which essentially specifies what a memory read can return, is at the heart of concurrency semantics. The cache coherence sub-system, which provides the view of a coherent shared memory, is at the heart of shared memory programming. The goal of this project is to design and implement memory consistency models and cache coherence sub-systems for future parallel architectures.
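The classic store-buffering litmus test illustrates what a consistency model decides. Under sequential consistency (SC) the outcome r1 == r2 == 0 is forbidden, while a weaker model such as x86-TSO permits it, because a store can sit in a store buffer past a subsequent load. The sketch below (illustrative only) enumerates every SC interleaving and confirms the forbidden outcome never appears:

```python
# Thread 1: x = 1; r1 = y        Thread 2: y = 1; r2 = x
T1 = [("write", "x", 1), ("read", "y", "r1")]
T2 = [("write", "y", 1), ("read", "x", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(trace):
    """Execute one interleaving against a single coherent memory (the SC view)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, loc, val in trace:
        if op == "write":
            mem[loc] = val
        else:
            regs[val] = mem[loc]
    return (regs["r1"], regs["r2"])

outcomes = {run(t) for t in interleavings(T1, T2)}
# SC yields (0, 1), (1, 0) and (1, 1), but never (0, 0);
# a TSO store buffer could additionally produce (0, 0).
```

Deciding which outcomes to allow, and building a coherence protocol that implements that decision efficiently, is exactly the design space this project targets.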

### Parallelism Discovery

Prospective Supervisor: Bjoern Franke

Most legacy applications are written in a sequential programming language and expose very little scope for immediate parallelisation. The broad availability of multicore computers, however, necessitates the parallelisation of such applications if the users want to further improve application performance. In this project we investigate dynamic methods for the discovery of parallelism within sequential legacy applications.

### Patterns and Skeletons in Parallel Programming

Prospective Supervisors: Murray Cole

The skeletal approach to parallel programming advocates the use of program forming constructs which abstract commonly occurring patterns of parallel computation and interaction.

Many parallel programs can be expressed as instances of more generic patterns of parallelism, such as pipelines, stencils, wavefronts and divide-and-conquer. In our work we call these patterns "skeletons". Providing a skeleton API simplifies programming: the programmer only has to write code which customizes selected skeletons to the application. This also makes the resulting programs more performance portable: the compiler and/or run-time can exploit structural information provided by the skeleton to choose the best implementation strategy for a range of underlying architectures, from GPU, through manycore, and on to large heterogeneous clusters.

Opportunities for research in this area include the full integration of skeletons into the language and compilation process, dynamic optimization of skeletons for diverse heterogeneous systems, the extension of skeleton approaches to applications which are "not quite" skeleton instances, the automatic discovery of new (and old) skeletons in existing applications, and the design and implementation of skeleton languages in domain-specific contexts.
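As a flavour of the approach, two classic skeletons can be sketched in a few lines of Python (sequential here for clarity; a real skeleton library would exploit the exposed structure to choose a parallel implementation):

```python
from functools import reduce

def pipeline(*stages):
    """Skeleton: compose stages so data flows through them in order."""
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)

def farm(worker):
    """Skeleton: apply worker to every item; trivially parallel by structure."""
    return lambda items: [worker(x) for x in items]

# The programmer only customises the skeletons to the application:
# first scale every value, then offset it.
app = pipeline(farm(lambda x: x * 2), farm(lambda x: x + 1))
result = app([1, 2, 3])
```

Because the run-time sees `pipeline` and `farm` rather than arbitrary loops, it knows, for instance, that the two farms can each be distributed across cores or fused into one pass, without any further analysis of the worker code.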

### Processor Design

Prospective Supervisors: Michael O'Boyle

Investigating new ways to explore the design space of multi-core low power scalable architectures suitable for mobile embedded devices.

Future embedded systems will have the capacity to deploy many CPUs in a single chip. There are a large number of variables in the complex process of optimising the design of such multi-core systems. Automated design space exploration is therefore becoming an important area of research for future embedded systems.

### Reconfigurable Caches

Prospective Supervisors: Michael O'Boyle

Investigate reconfigurable cache designs to provide scalable high-performance configurations for future multi-core systems.

The trend toward chip-multiprocessors with increasing numbers of processors, combined with the comparatively smaller increase in on-chip cache sizes, places increasing pressure on the cache hierarchy. Reconfigurable cache hierarchies allow the system to dynamically adapt to the workload to maximise the effectiveness of caching.

### Reconfigurable Data-Parallel Structures for Embedded Computation

Prospective Supervisors: Nigel Topham

Search for reconfigurable structures that do not rely on relatively inefficient Field Programmable Gate Arrays.

One of the defining characteristics of computationally-demanding embedded computations, such as high definition video codecs, is the large quantity of data parallelism in many of their algorithms. The aim of this project is to explore the characteristics of a range of data-parallel algorithms from typical embedded computations, and to identify new and novel ways to provide reconfigurable micro-architectural structures that exploit the available data parallelism.

### Searching the Embedded Program Optimisation Space

Prospective Supervisors: Michael O'Boyle

Investigate the use of automatically generated performance predictors based on machine learning to act as proxies for the machine.

Efficient implementation is critical for embedded systems. Current optimising compiler approaches based on static analysis fail to deliver performance because they rely on a hardwired, idealised model of the processor. This project is concerned with feedback-directed search techniques, which can dramatically outperform standard approaches. It will investigate the use of automatically generated performance predictors, based on machine learning, to act as proxies for the machine. This will allow extremely rapid determination of good optimisations and greater coverage of the optimisation space.