Computer Science, Mathematics & Statistics

Computer Science, Mathematics & Statistics resources.

Teach Yourself Computer Science

Teach Yourself Computer Science is a good starting guide for people who want to learn computer science on their own. It also links to many useful resources.

Website: https://teachyourselfcs.com/

Introduction to Computing for Data Analysis

A hands-on introduction to basic programming principles and practice relevant to modern data analysis, data mining, and machine learning.

Website: https://www.edx.org/course/introduction-to-computing-for-data-analysis

Foundations of Data Science: Computational Thinking with Python

Learn the basics of computational thinking, an essential skill in today’s data-driven world, using the popular programming language, Python.

Website: https://www.edx.org/course/foundations-data-science-computational-uc-berkeleyx-data8-1x

Mathematics for Machine Learning

This book provides great coverage of all the basic mathematical concepts for machine learning.

Link: https://mml-book.github.io/

Probability and Statistics

Textbook by Morris H. DeGroot and Mark J. Schervish.

Publisher website: https://www.pearson.com/us/higher-education/product/De-Groot-Probability-and-Statistics-3rd-Edition/9780201524888.html

Machine learning

Machine learning is the science of getting computers to act without being explicitly programmed. This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). The course also draws on numerous case studies and applications, so you will learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.

Website: https://www.coursera.org/learn/machine-learning

Deep learning

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.

Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design and board game programs, where they have produced results comparable to and in some cases superior to human experts.

Deep learning models are vaguely inspired by information processing and communication patterns in biological nervous systems, yet they differ in various ways from the structural and functional properties of biological brains, which makes them incompatible with neuroscience evidence.

Andrew Ng's Deep Learning courses provide a good overview of this field of research.

Website: https://www.deeplearning.ai/

Data visualisation

The difficulty of extracting information from high-dimensional data is the main motivation for renewed interest in the problems of dimensionality reduction, feature extraction and clustering. High-dimensional data take many different forms: from digital image libraries to gene expression microarrays and financial time series. Researchers in fields as diverse as finance, physics, medicine, and bioinformatics have to deal with such large data sets. By formulating the problem of dimensionality reduction and clustering in a general setting, however, many different types of data can be analysed in the same underlying mathematical framework. This course explores this general framework and introduces several methods for dimensionality reduction, feature selection, and clustering.

Website: http://www.math.uwaterloo.ca/~aghodsib/courses/f06stat890/f06stat890.html

R Programming

In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

Website: https://www.coursera.org/learn/r-programming

Data Science: R Basics

Build a foundation in R and learn how to wrangle, analyse, and visualise data. This course covers common programming commands, how to operate on vectors, and when to use advanced functions such as sorting.

Website: https://www.edx.org/course/data-science-r-basics

Introduction to Data Science

These are the class notes used in the HarvardX Data Science Series. The code to generate the notes is available on GitHub. For updates, follow @rafalab.

Website: https://rafalab.github.io/dsbook/index.html

A Course for Visualization in R

A course on data visualisation in R taking you from beginner to advanced.

Website: http://flowingdata.com/2015/05/06/introducing-a-course-for-visualization-in-r/

Introduction to Probability and Statistics

This course provides an elementary introduction to probability and statistics with applications. Topics include: basic combinatorics, random variables, probability distributions, Bayesian inference, hypothesis testing, confidence intervals, and linear regression.

Website: https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/

Modern Statistics for Modern Biology

The aim of this book by Susan Holmes and Wolfgang Huber is to enable scientists working in biological research to quickly learn many of the important ideas and methods that they need to make the best of their experiments and of other available data.

Website: http://web.stanford.edu/class/bios221/book/introduction.html#introduction

Statistical Thinking for the 21st Century

This book by Russell A. Poldrack describes the approaches that are increasingly used in real statistical practice in the 21st century. The methods described take advantage of today’s increased computing power to solve statistical problems in ways that go far beyond the more standard methods usually taught in undergraduate statistics courses.

Website: https://statsthinking21.org/

Seeing Theory

A visual introduction to probability and statistics.

Website: https://students.brown.edu/seeing-theory/

Statistical Thinking and Data Analysis

This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and nonparametric statistics.

Website: https://ocw.mit.edu/courses/sloan-school-of-management/15-075j-statistical-thinking-and-data-analysis-fall-2011/

Statistics for Applications

This course offers an in-depth theoretical foundation for statistical methods that are useful in many applications. The goal is to understand the role of mathematics in the research and development of efficient statistical methods.

Website: https://ocw.mit.edu/courses/mathematics/18-650-statistics-for-applications-fall-2016/

Selective Inference and False Discovery Rate I

An interesting lecture on multiple hypothesis testing.

Website: https://www.youtube.com/watch?v=oONHlua2gBY

Statistics and R

An introduction to basic statistical concepts and R programming skills necessary for analysing data in the life sciences.

Website: https://www.edx.org/course/statistics-r-harvardx-ph525-1x-1

Explore Statistics with R

Learn basic statistics in a practical, experimental way, through statistical programming with R, using examples from the health sciences.

Website: https://www.edx.org/course/statistics-and-r

Data Science in Stratified Healthcare and Precision Medicine

In this course, you will learn about some of the different types of data and computational methods involved in stratified healthcare and precision medicine. Topics include: (i) Sequence Processing, (ii) Image Analysis, (iii) Network Modelling, (iv) Probabilistic Modelling, (v) Machine Learning, (vi) Natural Language Processing, (vii) Process Modelling and (viii) Graph Data.

Website: https://www.coursera.org/learn/datascimed

Introduction to Statistical Methods for Gene Mapping

This course is a primer on statistical genetics and covers an approach called linkage disequilibrium mapping, which analyses non-familial data and has been successfully used to identify genetic variants associated with common and complex genetic traits.

Website: https://www.edx.org/course/introduction-to-statistical-methods-for-gene-mapping

Introduction to Mathematical Thinking

The goal of the course is to help you develop a valuable mental ability – a powerful mathematical way of thinking that our ancestors have developed over three thousand years.

Website: https://online.stanford.edu/courses/hstar-y0001-introduction-mathematical-thinking

Applications of Linear Algebra Part 1

Learn to use linear algebra in computer graphics, by making images disappear in an animation or creating a mosaic or fractal, and in data mining, to measure similarities between movies, songs, or friends.

Website: https://www.edx.org/course/applications-linear-algebra-part-1-davidsonx-d003x-1

Applications of Linear Algebra Part 2

Explore applications of linear algebra in data mining, by learning the fundamentals of search engines and clustering movies into genres, and in computer graphics, by posterizing an image.

Website: https://www.edx.org/course/applications-linear-algebra-part-2-davidsonx-d003x-2

Matrix Methods in Data Analysis, Signal Processing, and Machine Learning

Linear algebra concepts are key for understanding and creating machine learning algorithms, especially as applied to deep learning and neural networks. This course reviews linear algebra with applications to probability, statistics, and optimization, and above all offers a full explanation of deep learning.

Website: https://ocw.mit.edu/courses/mathematics/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/

A First Course in Machine Learning - second edition

A First Course in Machine Learning covers the core mathematical and statistical techniques needed to understand some of the most popular machine learning algorithms. The algorithms presented span the main problem areas within machine learning: classification, clustering and projection.

Link: http://www.dcs.gla.ac.uk/~srogers/firstcourseml/

Pattern Recognition and Machine Learning - Christopher Bishop

This textbook provides a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.

Publisher website: https://www.springer.com/gp/book/9780387310732