Resources
Last updated: Jan 2, 2024
Research utilities
Books/Survey
 Harvard CS197  AI research experiences
 Stanford CS197  Computer Science research
 CS Research 101 by Neeldhara Misra and Shashank Srikant
 How to read research papers by Aaditya Ramdas
 AI research journey and advice by Jason Wei
 You and Your research by Richard Hamming
 A few words on research for graduate students by Fan Chung Graham
 Four golden lessons by Steven Weinberg
 How I prepared for DeepMind and Google AI research internship interviews in 2019 by David Stutz
 Tips for Success as a New Researcher by Alex Tamkin
 Research as a Stochastic Decision Process by Jacob Steinhardt
 Ph.D students must break away from undergraduate mentality by Jason Hong
 Research Taste Exercises by Christopher Olah
 Better Saving and Logging for Research experiments by Daniel Seita
 CS PhD Statements of Purpose
 Personal Statement Advice by Suchin Gururangan
 PhD Statement of Purpose by Nelson Liu
 Writing by Eric Zhang
 AI paper Feed
 Tips for Writing Technical Papers by Jennifer Widom
 How to avoid ML pitfalls: a guide for academic researchers by Michael A. Lones
 Write the paper first by Jason Eisner
 An opinionated guide to ML research by John Schulman
 Lessons learned the hard way in grad school (so far) by Andrey Kurenkov
 De-Mystifying Good Research and Good Papers by Fei-Fei Li
 What You Know Matters More Than What You Do by Cal Newport
 Collaboration and Credit principles by Christopher Olah
 The PhD Grind: Philip Guo’s Ebook by Daniel Takeshi
 SOP  Rishabh Ranjan
 SOP  Ameya Daigavane
 SOP  Siddartha Devic
 SOP  Aaron Dharna
 SOP  Naveen Raman
 3 qualities of successful Ph.D. students: Perseverance, tenacity and cogency by Matt Might
 Console productivity hack: Discover the frequent; then make it the easy by Matt Might
 Classroom Fortress: The Nine Kinds of Students by Matt Might
 10 easy ways to fail a Ph.D. by Matt Might
 Productivity tips, tricks and hacks for academics (2015 edition) by Matt Might
 What every computer science major should know by Matt Might
 Fan Pu Zeng
 Troubling trends in machine learning scholarship by Zachary Lipton, Jacob Steinhardt
 Career examples: proposals+comments
 How to organize your files by Jason Eisner
 How to find research problems by Jason Eisner
 3 shell scripts to improve your writing, or “My Ph.D. advisor rewrote himself in bash.” by Matt Might
 Problem-Solving Strategies by Arthur Engel
Links
 Tuning playbook
 Jia-Bin Huang research advice
 ML Contests
 Arxiv-sanity
 Graduate Fellowship Opportunities 2023-24 Academic Year
 Applied ML
 How to Do Great Research
Bloggers
 Sebastian Ruder: NLP-focused
 Lil’Log: Many introductions to new topics, well-written, easy to understand, highly recommended!
 Jay Alammar: Many introductions to new topics, good visualizations
 Andrej Karpathy: A legend in NLP, many good tips
 Distill: Many introductions to new topics, good visualizations
 Colah’s blog: Focus on computer vision, many good tips, collaboration with Distill.
 I’m a bandit: Optimization, statistics, probability theory, ML theory.
 Sudeep Raja: Online learning
 Bounded Regret: Introductions to new topics, many opinions, many tips
 inFERENCe: Statistics, various topics, many tutorials, insights and opinions
 Mike Bostock: Software tips
 Michael Nielsen: Focused on CV, touches on quantum.
 Daniel Takeshi’s blog: Various introductory topics, class reviews @Berkeley, lots of tips
 Gregory Gundersen: Focused on statistics, time-series data, GP, Bayesian inference
 Massimiliano (Max) Patacchiola’s blog: RL, few-shot, variational inference
 Machine Learning Research Blog – Francis Bach: Focused on theoretical ML
 arg min blog: Focused on theoretical ML, a lot of experience
 Kiran Vodrahalli: Huge resources
 Machine Thoughts: Various topics, lots of opinions, AI general
 Off convex: ML theory
 Hunch: ML theory
 Cosma Rohilla Shalizi: CMU statistics prof, very good notebooks
 Windows on theory: Original thought on many topics
 Karl Stratos
 Machine Learning Blog  ML@CMU
 FastML
 OpenAI Blog
 Pinterest Engineering Blog – Medium
Math ML
Courses
 CMU 10-606  Math ML
 Theoretical Toolkit for CS
 Berkeley EE227BT  Convex Optimization
 Berkeley EE227C  Convex optimization and approximation
 Berkeley EECS 127/227 AT  Optimization models and applications
 Cornell CS 4783/5783 Mathematical foundations of machine learning
 Princeton ORF523  Convex and Conic optimization
 Princeton ELE522  Large-Scale Optimization for Data Science
 Cornell ORIE 6300  Mathematical Programming I
 CMU 15-859(E)  Linear and Semidefinite Programming (Advanced Algorithms)
 CMU MATH 720  Measure Theory and Integration
 NYU Mathematical tools for data science
 TTIC 31150/CMSC 31150  Mathematical Toolkit
 CMPUT 340  Introduction to Numerical Methods
Books/Survey
 Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning by Jean Gallier and Jocelyn Quaintance
 Mathematics for Machine Learning (UC Berkeley) by Garrett Thomas
 Real Not Complex
 Real and Complex analysis by Walter Rudin
 Real Analysis: Measure Theory, Integration and Hilbert spaces by Elias M. Stein and Rami Shakarchi
 Matrix Cookbook by Kaare Brandt Petersen and Michael Syskind Pedersen
 Linear algebra review by Zico Kolter
 Probability and Measure  Course notes for STAT 571
 Convex Optimization: Algorithms and Complexity by Sébastien Bubeck
 Introductory Lectures on Convex Optimization: A Basic Course by Yurii Nesterov
 Convex Optimization by Stephen Boyd, Lieven Vandenberghe
 All the math you missed (but need to know for graduate school) by Thomas Garrity
 Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong
 Group theory by Mark McConnell
 Topology by James Munkres
 Proximal algorithms by Neal Parikh, Stephen Boyd
Papers
 Matrix Calculus You need for DL by Terence Parr and Jeremy Howard
 The math of AI by Gitta Kutyniok
 The modern math of DL by Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
Computational ML
Courses
 CMU 10-607  Computational ML
 Numerics of Machine Learning
Books/Survey
Papers
Deep Learning
Courses
 TTIC 31230  Fundamentals of DL
 CMU 10-417  Intermediate Deep Learning
 CMU 10-414  Deep Learning System
 CMU 10-707  Deep Learning
 Illinois  DL theory
 Princeton CS597B  Theoretical DL
 Uni Maryland CMSC 828W  Foundations of DL
 Stanford STATS 385  Analyses of DL
 CS W182/282  Designing, visualizing, and understanding deep neural networks
 MIT 6.5930  Hardware architecture for DL
 UC Berkeley Stat212b  Topics Course on Deep Learning
 MIT 18.177  Mathematical Aspects of Deep Learning
 MIT 6.883  Science of Deep Learning: Bridging Theory and Practice
Books/Survey
 Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville
 Neural Networks and Deep Learning by Michael Nielsen
 The Little Book of Deep Learning by Francois Fleuret
 Deep learning theory lecture notes by Matus Telgarsky
 Challenges in DL by Razvan Pascanu
 The principles of DL Theory by Daniel A. Roberts and Sho Yaida
 Theory of deep learning by Raman Arora, Sanjeev Arora, Joan Bruna, Nadav Cohen, Rong Ge, Suriya Gunasekar, Chi Jin, Jason Lee, Tengyu Ma, Behnam Neyshabur, Zhao Song
 Deep learning: a statistical viewpoint by Peter Bartlett, Andrea Montanari, Alexander Rakhlin
 Mathematical introduction to deep learning: methods, implementations, and theory by Arnulf Jentzen, Benno Kuckuck, Philippe von Wurstemberger
Papers
 Open Problems in Applied Deep Learning by Maziar Raissi
 A statistician teaches deep learning by G. Jogesh Babu, David Banks, Hyunsoon Cho, David Han, Hailin Sang, Shouyi Wang
 Toward Theoretical Understanding of DL by Sanjeev Arora
 Recent advances in deep learning theory by Fengxiang He, Dacheng Tao
Machine Learning
Courses
 CMU 10-716  Advanced Machine Learning
 Cornell CS6780  Advanced Machine Learning
 Caltech CS159  Advanced ML
 Stanford STATS214/CS229M  Machine Learning Theory
 Cornell CS6783  Machine Learning Theory
 Princeton CS 511  Theoretical ML
 UofA CMPUT 654  Theoretical foundations of ML
 TTIC 31250  An Introduction to the Theory of Machine Learning
 CMU 10-715  Advanced Intro to ML
 CMU 10-605  ML with large datasets
 CMU 10-418  ML for structured data
 Cornell ECE 5545  ML hardware and systems
 Stanford CS 229s  System for ML
 Gaussian Process Summer School
 Berkeley EECS 208  Computational Principles for High-Dimensional Data Analysis
 CMU 36-708  Statistical methods for machine learning
 Washington STAT 928  Statistical Learning theory
 TTIC 31120: Computational and Statistical Learning Theory
 MIT 9.520/6.860  Statistical Learning Theory and Applications
 CMU 36-708  The ABCDE of Statistical methods for ML
 Columbia COMS 4252  Intro to computational learning theory
 CMU 36-465/665  Conceptual Foundations of Statistical Learning
 Caltech CS/CNS/EE/IDS 165  Foundations of Machine Learning and Statistical Inference
 CMU 15-859  Algorithms for Big Data
 A Hands-on Approach for Implementing Stochastic Optimization Algorithms from Scratch
 UPenn  The algorithmic foundations of adaptive data analysis
 MIT 18.408  Algorithmic Aspects of ML
 Cornell 6781  Foundations of Modern Machine Learning
 Columbia COMS 4772  ML Theory
 Columbia COMS 4774  Unsupervised Learning
 Princeton COS 598  Unsupervised Learning: Theory and Practice
 A course in ML by Hal Daume
 Duke COMPSCI 590.2  Algorithmic Aspects of Machine Learning
 Princeton CS 597A  New Directions in Theoretical Machine Learning
 Simons  Foundations of Machine Learning
 Simons  Foundations of Data Science
 Machine Learning with kernel methods
 Stanford CS364A  Algorithmic Game Theory
 Michigan EECS 598  Theoretical foundations of ML
Books/Survey
 Pen and Paper Exercises in ML by Michael U. Gutmann
 Patterns, Predictions, and Actions  A story about machine learning by Moritz Hardt and Benjamin Recht
 Mathematical Analysis of Machine Learning Algorithms by Tong Zhang
 Artificial Intelligence: A modern approach by Stuart Russell and Peter Norvig
 Deep Learning Cheatsheet by Afshine Amidi and Shervine Amidi
 Recent Advances in Bayesian Optimization
 The Algorithmic Foundations of Differential Privacy by Cynthia Dwork, Aaron Roth
 Foundations of Data science by Avrim Blum, John Hopcroft, and Ravindran Kannan
 Provable Algorithms for machine learning problems by Rong Ge
 CS229T/STAT231: Statistical Learning Theory (Winter 2016) by Percy Liang (https://web.stanford.edu/class/cs229t/2017/Lectures/percynotes.pdf)
 Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David
 Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar
 Learning with Kernels by Bernhard Schölkopf and Alexander J. Smola
 Linear Dimensionality Reduction: Survey, Insights, and Generalizations by John P. Cunningham, Zoubin Ghahramani
 Computational optimal transport by Gabriel Peyré, Marco Cuturi
 Pattern Recognition and Machine Learning by Christopher M. Bishop
 Machine Learning: A Probabilistic Perspective by Kevin Murphy
 Probabilistic Machine Learning: An Introduction by Kevin Murphy
 Probabilistic Machine Learning: Advanced Topics by Kevin Murphy
 Sampling algorithms
Papers
 Theory of classification: A Survey of some recent advances by Stephane Boucheron, Olivier Bousquet, and Gabor Lugosi
 Bayesian nonparametrics and the probabilistic approach to modelling by Zoubin Ghahramani
 An introduction to Hidden Markov Models and Bayesian networks by Zoubin Ghahramani
 A Unifying Review of Linear Gaussian Models by Sam Roweis, Zoubin Ghahramani
 Bayesian Nonparametric Models by Peter Orbanz, Yee Whye Teh
 Dirichlet Process by Yee Whye Teh
 Hierarchical Bayesian Nonparametric Models with Applications by Yee Whye Teh
Reinforcement Learning
Courses
 CMU 10-703  Deep RL
 Berkeley CS285  Deep RL
 Stanford CS224R  Deep RL
 CMPUT 655  RL 1  Graduate
 CMPUT 605 RL Theory Grad
 Washington CSE 599  RL and Bandits
 MIT 6.7950  RL Foundations and Methods
 Cornell CS 6789  Foundations of RL
 Illinois CS 542  Statistical RL
 Princeton COS 597R  Probabilistic Topics in RL
 McGill ECSE 506  Stochastic control and decision theory
 Stanford EE263: Introduction to Linear Dynamical Systems
 Columbia Dynamic Programming and RL
 David Silver RL course
 Purdue CS58300: RL
Books/Survey
 A Succinct Summary of Reinforcement Learning by Sanjeevan Ahilan
 An Introduction to Deep RL by Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare and Joelle Pineau
 Empirical Design in Reinforcement Learning
 Towards Continual RL: A review and perspectives by Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
 A survey of Meta-RL by Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson
 Reinforcement Learning: An Introduction by Richard S. Sutton, Andrew G. Barto
 Reinforcement Learning: Theory and Algorithms by Alekh Agarwal, Nan Jiang, Sham M. Kakade, Wen Sun
 Algorithms for Reinforcement Learning by Csaba Szepesvari
 Dynamic Programming and Optimal Control by Dimitri P. Bertsekas
 Simulation and the Monte Carlo Method by Reuven Y. Rubinstein, Dirk P. Kroese
 Practical Methods for Optimal Control and Estimation using nonlinear programming by John T. Betts
 Multi-Agent Reinforcement Learning: Foundations and Modern Approaches by Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer
 RL, bit by bit by Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband and Zheng Wen
 A Tour of Reinforcement Learning: The View from Continuous Control by Benjamin Recht
Papers
 Mathematical Foundations of Monte Carlo Methods
Online Learning
Courses
 Southern Cali CSCI 659  Intro to Online Optimization/Learning
 Berkeley EE290/CS194  ML for sequential decision making under uncertainty
 Victoria CSC 482/581  Intro to Online Learning
 Washington CSE599  Online Learning
 Berkeley EE 290  Theory of Multi-armed Bandits and RL
 Washington CSE 599M  Interactive Machine Learning in Nonstochastic Environments
 Illinois IE 498: Online Learning and Decision Making
 Columbia COMS E6998.001: Bandits and RL
 MIT 6.883: Online methods in ML
 Remark: These courses are closely related to Reinforcement Learning.
Books/Survey
 Introduction to Online Optimization by Sebastien Bubeck
 A modern Introduction to Online Learning by Francesco Orabona
 Introduction to Online Convex Optimization by Elad Hazan
 Online Learning and Online Convex Optimization by Shai Shalev-Shwartz
 Online Learning: A comprehensive Survey by Steven C.H. Hoi, Doyen Sahoo, Jing Lu, Peilin Zhao
 Bandit algorithms by Tor Lattimore and Csaba Szepesvari
 Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems by Sebastien Bubeck and Nicolo Cesa-Bianchi
 Introduction to Multi-Armed Bandits by Aleksandrs Slivkins
 A tutorial on Thompson Sampling by Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband and Zheng Wen
 Online Evaluation for Information Retrieval by Katja Hofmann, Lihong Li, Filip Radlinski
 Prediction, Learning, and Games by Nicolo Cesa-Bianchi, Gabor Lugosi
 Bandit lecture notes by Kevin Jamieson
 From Bandits to Monte-Carlo Tree Search: The optimistic principle applied to optimization and planning by Remi Munos
Papers
Geometric DL/ Graph NN
Courses
 AMMI Geometric DL
 Stanford CS224W  Machine Learning with Graphs
 UPenn  GNN
 Stanford CS246: Mining Massive Data Sets
Books/Survey
 Geometric Deep Learning Grids, Groups, Graphs, Geodesics, and Gauges by Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković
Papers
Meta Learning
Courses
 Stanford CS 330  Deep Multi-Task and Meta Learning
Books/Survey
Papers
Representation Learning
Courses
 Mila IFT 6135 Representation Learning
Books/Survey
Papers
PGM
Courses
 Stanford CS228  PGM
 UofT CSC 412  Probabilistic ML
 CMU 10-708  PGM
Books/Survey
 Probabilistic Graphical Models: Principles and Techniques by Daphne Koller and Nir Friedman
Papers
Statistics
Courses
 UofT STA314H1F  Statistical Learning theory I
 UofT STA414  Statistical Learning theory II
 Berkeley EECS 126  Probability and Random processes
 Berkeley Stat210B  Theoretical Statistics
 CMU 36-705  Intermediate Statistics
 Stanford STATS 300B: Theory of Statistics II
 Statistics 311/Electrical Engineering 377: Information Theory and Statistics
 Stanford EE 378B – Inference, Estimation, and Information Processing
 Harvard CS 229r  Information Theory in Computer Science
 Harvard CS 229r  Essential Coding Theory
 MIT 18.657  High dimensional probability
 Princeton MAT 589  Topics in Probability, Statistics and Dynamics: Modern discrete probability theory
Books/Survey
 Probability Theory Survey by Arian Maleki and Tom Do
 Statistics Proof book
 Probability with Martingales by David Williams
 High-dimensional statistics: A non-asymptotic viewpoint by Martin J. Wainwright
 High-dimensional data analysis with low-dimensional models: principles, computation, and applications by John Wright and Yi Ma
 High-dimensional Probability: An introduction with Applications in Data Science by Roman Vershynin
 Probability in High dimension by Ramon van Handel (Princeton)
 Introduction to Statistical Learning Theory by Olivier Bousquet, Stephane Boucheron, and Gabor Lugosi
 Illinois ECE 543 Statistical Learning Theory by Bruce Hajek and Maxim Raginsky
 Oxford Modern Statistical Theory by George Deligiannidis
 All of statistics: A concise Course in statistical inference by Larry Wasserman
 All of nonparametric statistics by Larry Wasserman
 Introduction to Nonparametric Estimation by Alexandre B. Tsybakov
 Mathematical Statistics by Jun Shao
 Asymptotic Statistics by A. W. van der Vaart
 A Survey on Distribution Testing: Your Data is Big. But is it Blue? by C. Canonne
 Introduction to the non-asymptotic analysis of random matrices by Roman Vershynin
 Concentration inequalities: A non-asymptotic theory of independence by Stephane Boucheron, Gabor Lugosi, Pascal Massart
 An Introduction to Matrix Concentration Inequalities by Joel A. Tropp
 Modern Discrete Probability: An Essential Toolkit by Sebastien Roch
 MATH 170A  Probability theory by Steven Heilman
 MATH 170B  Probability theory by Steven Heilman
 The elements of statistical learning: Data mining, inference and prediction by Trevor Hastie, Robert Tibshirani, Jerome Friedman
 A first course in probability by Sheldon Ross
 A second course in probability by Sheldon Ross
 Mathematical Statistics with applications by Wackerly, Mendenhall, Scheaffer
 Introduction to mathematical statistics by Hogg, McKean, Craig
 Elements of information theory by Thomas Cover, Joy Thomas
 Applied linear statistical models by Michael Kutner, Christopher Nachtsheim, John Neter, William Li
 Fundamentals of Statistical Exponential Families with Applications in Statistical Decision Theory by Lawrence D. Brown
Papers
NLP
Courses
 Stanford CS224d  Deep Learning for Natural Language Processing
 Harvard CS287  Machine Learning for Natural Language
 CMU CS 11-747  Neural Networks for NLP
 CMU CS 11-731  Machine Translation and Sequence-to-sequence Models
 Princeton COS495 Natural Language Processing
Books/Survey
 A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg
Papers
Foundations
Courses
 CMU 15-451  Algorithms
 CMU 15-213  Introduction to computer systems
 CMU 15-213/15-513/14-513  Introduction to Computer Systems
 Stanford CS 111  OS
 Harvard CS 125: Algorithms and Complexity
 Columbia COMS 6998-006  Foundations of Blockchains
 Columbia COMS 4995  Randomized Algorithms
Books/Survey
 Notes on Randomized Algorithms by James Aspnes

Papers