Beautiful web of data science

The web is a beautiful place: you can learn so much on it, for free. This document puts together some resources on (or related to) data science which are absolutely brilliant. We also list books, which might not be freely available but are definitely worth owning.

Topic-specific resources are already listed on each page, for details, further insight and citation purposes. Here, we list material of any format (online courses, books, blogs) which is good across the board and spans several of the topics we touch on in the book.

Statistics and Probability

  • C Bergstrom and J West, Calling Bullshit in the age of Big Data, a course on spotting manipulative use of data, wrong conclusions and overall misunderstandings

  • D Huff, How to Lie with Statistics, a lovely little book on the common mistakes and misunderstandings around the use of numbers for reaching conclusions. Old (1954), but very valuable and entertaining

  • T Vigen, Spurious Correlations, a project which visually displays improbable correlations between sets of data to demonstrate that, well, correlation is not causation

  • Think Stats, an O'Reilly book by A B Downey, which is freely available online

  • Seeing Theory, a visual introduction to probability and statistics

Machine Learning - general material

  • Notes from the Stanford CS229 Course on Machine Learning (all lecture notes are open-access)

  • R2D3, a visual introduction to Machine Learning, project by S Yee and T Chu

  • Explained Visually, a project by V Powell and L Lehe

  • C Molnar, Interpretable Machine Learning, a book, freely available

Neural Networks

Computer Vision

  • The Hypermedia Image Processing Reference, a website built by the University of Edinburgh, School of Informatics

  • Pyimagesearch, a website curated by A Rosebrock on Computer Vision and Machine/Deep Learning on images, with tutorials for OpenCV and lots of good material

Coding and Computer Science

  • Sorting Algorithms interactives, by Toptal

  • Practical Business Python, a very good website by C Moffitt devoted to best practices on using Python for practical purposes

  • G Laakmann McDowell, Cracking the Coding Interview, a book published by CareerCup

Python references relevant to Data Science

  • SciPy lecture notes - they're pretty brilliant and obviously focussed on Python, but you can learn lots of data science concepts from them

  • scikit-learn has tutorials and extensive explanations for every algorithm it supports, as well as general notes on Machine Learning

  • The Hitchhiker's Guide to Python (not particularly targeted at data science, but a very useful reference)