
IamTimNguyen OP t1_j3cybcu wrote

Part I. Introduction

00:00:00 : Biography

00:02:36 : Harvard hiatus 1: Becoming a DJ

00:07:40 : I really want to make AGI happen (back in 2012)

00:09:00 : Harvard math applicants and culture

00:17:33 : Harvard hiatus 2: Math autodidact

00:21:51 : Friendship with Shing-Tung Yau

00:24:06 : Landing a job at Microsoft Research: Two Fields Medalists are all you need

00:26:13 : Technical intro: The Big Picture

00:28:12 : Whiteboard outline

Part II. Classical Probability Theory

00:37:03 : Law of Large Numbers

00:45:23 : Tensor Programs Preview

00:47:25 : Central Limit Theorem

00:56:55 : Proof of CLT: Moment method

01:02:00 : Moment method explicit computations
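
For anyone who wants the punchline of Part II up front: the moment method shows that every moment of the normalized sum converges to the corresponding Gaussian moment, and the Gaussian moments determine the distribution. A minimal sketch, assuming iid mean-zero, unit-variance X_i with all moments finite (see the video for the full argument):

```latex
% Moment method for the CLT (sketch): under the assumptions above, every
% moment of the normalized sum converges to the Gaussian moment, and since
% the Gaussian is determined by its moments, the CLT follows.
\[
  \mathbb{E}\!\left[\left(\frac{X_1 + \cdots + X_N}{\sqrt{N}}\right)^{k}\right]
  \;\longrightarrow\;
  \mathbb{E}\!\left[Z^{k}\right]
  = \begin{cases}
      (k-1)!! & k \text{ even},\\
      0       & k \text{ odd},
    \end{cases}
  \qquad Z \sim \mathcal{N}(0,1),\quad N \to \infty.
\]
```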

Part III. Random Matrix Theory

01:12:45 : Setup

01:16:55 : Moment method for RMT

01:21:21 : Wigner semicircle law
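
If you'd like to see the semicircle before watching, it's easy to check numerically. A minimal sketch (my own toy example with arbitrary sizes, not code from the episode):

```python
import numpy as np

# Empirical check of the Wigner semicircle law: the eigenvalues of a large
# random symmetric matrix with iid mean-0, variance-1 entries, scaled by
# 1/sqrt(N), fill out the semicircle density on [-2, 2].
N = 2000
rng = np.random.default_rng(0)
A = rng.normal(size=(N, N))
H = (A + A.T) / np.sqrt(2)              # symmetrize; off-diagonal variance 1
eigs = np.linalg.eigvalsh(H / np.sqrt(N))

# Compare an eigenvalue histogram to rho(x) = sqrt(4 - x^2) / (2 pi).
hist, edges = np.histogram(eigs, bins=50, range=(-2.2, 2.2), density=True)
centers = (edges[:-1] + edges[1:]) / 2
rho = np.sqrt(np.clip(4 - centers**2, 0.0, None)) / (2 * np.pi)
print(np.max(np.abs(hist - rho)))       # shrinks as N grows
```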

Part IV. Tensor Programs

01:31:04 : Segue using RMT

01:44:22 : TP punchline for RMT

01:46:22 : The Master Theorem (the key result of TP)

01:55:02 : Corollary: Reproof of RMT results

01:56:52 : General definition of a tensor program

Part V. Neural Networks and Machine Learning

02:09:09 : Feed-forward neural network (3 layers) example

02:19:16 : Neural network Gaussian Process

02:23:59 : Many large-N limits for neural networks

02:27:24 : abc parametrizations (Note: "a" is absorbed into "c" here): variance and learning rate scalings (see the definition sketched after this outline)

02:36:54 : Geometry of the space of abc parametrizations

02:39:50 : Kernel regime

02:41:35 : Neural tangent kernel

02:43:40 : (No) feature learning

02:48:42 : Maximal feature learning

02:52:33 : Current problems with deep learning

02:55:01 : Hyperparameter transfer (muP)

03:00:31 : Wrap up
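
Since the abc parametrizations are probably the least googleable item above, here is the definition in the form I find easiest to state (a sketch of the setup from Greg's papers; the whiteboard notation may differ slightly):

```latex
% abc parametrization of a width-n network (sketch): the weights, their
% initialization, and the learning rate each carry their own power of n:
\[
  W^{\ell} = n^{-a_{\ell}}\, w^{\ell}, \qquad
  w^{\ell}_{\alpha\beta} \sim \mathcal{N}\!\left(0,\, n^{-2 b_{\ell}}\right)
  \text{ at init}, \qquad
  \eta = \eta_0\, n^{-c}.
\]
% Different choices of the exponents give different infinite-width limits
% (e.g. the kernel/NTK regime vs. feature learning). In the talk, "a" is
% absorbed into "c" (see the note at 02:27:24 above).
```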

24

JustARandomNoob165 t1_j5e4vke wrote

I know it's a very broad question, but do you maybe have recommendations for materials/resources to better prepare yourself to digest the topics covered in this talk? Thank you very much in advance!

1

ThatInternetGuy t1_j3ejkr3 wrote

To be honest, even though I've coded in many ML repos and did well in college math, this video outline looks like an alien language to me. Tangent kernel, kernel regime (is AI getting into politics?), punchline for Random Matrix Theory (who's trying to get a date here?), etc.

−15

sentient-machine t1_j3eqajw wrote

Seems like a completely normal technical outline to me. I suspect you just lack the mathematical sophistication here?

19

kastbort2021 t1_j3fea3l wrote

Or they could just be quite recent topics? The neural tangent kernel (NTK), for example, was only introduced in 2018. If you're not actively reading ML research papers, you'll probably have a hard time getting exposed to those topics.
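
For context: the NTK of a network f(x; θ) is just the inner product of parameter gradients, Θ(x, x′) = ∇_θ f(x; θ) · ∇_θ f(x′; θ), which becomes deterministic and frozen in a particular infinite-width limit (Jacot et al., 2018). A minimal numpy sketch on a toy one-hidden-layer net (my own example, not from the talk):

```python
import numpy as np

# Empirical NTK of a toy net f(x) = v . relu(W x) / sqrt(n).
# Theta(x, x') = grad_theta f(x) . grad_theta f(x'); for large width n
# it concentrates around a deterministic kernel.
rng = np.random.default_rng(0)
n, d = 4096, 3                                    # width, input dim (arbitrary)
W = rng.normal(size=(n, d))
v = rng.normal(size=n)

def grad_f(x):
    h = W @ x                                     # pre-activations
    dv = np.maximum(h, 0.0) / np.sqrt(n)          # df/dv
    dW = np.outer(v * (h > 0.0), x) / np.sqrt(n)  # df/dW (relu mask)
    return np.concatenate([dv, dW.ravel()])

def ntk(x, xp):
    return grad_f(x) @ grad_f(xp)

x1, x2 = rng.normal(size=d), rng.normal(size=d)
print(ntk(x1, x2))                                # ~stable across random seeds
```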

3

ReginaldIII t1_j3elb8t wrote

I've read this several times and I don't really understand what it is you are trying to say. Where does politics or dating come into it?

10

ThatInternetGuy t1_j3emjfs wrote

I was just saying that ML researchers use terms that are way too technical to infer any meaning from, or use a common word such as punchline to mean something else entirely. What does a punchline have to do with ML?

−15

ReginaldIII t1_j3epizn wrote

No need to downvote; it was an honest question, not an attack. Have you studied the literature and background mathematics of this area much?

Regime is a well-established term in mathematics and many other fields; one example of a "regime" (a domain under given rules or constraints) is what you're likely familiar with as a political regime.

With respect to "punchline", I'm going to assume you didn't look at the video at the timestamp listed? Here it is: https://youtu.be/1aXOXHA7Jcw?t=6105. All he is saying is that, after a tangent lasting a few minutes, the "punchline" is him circling back around to the point he was trying to make.

It isn't a literal haha punchline, and it's not a mathematical term. The punchline comes at the end of a joke, and a joke often takes you on a journey before circling back to some kind of point. He used the word to mean that here too.

Timothy Nguyen, OP of this post and the host of the video, made a light-hearted chapter title within a long video based on a term that Greg Yang used on his whiteboard.

18

madrury83 t1_j3epnqt wrote

Repurposing common words to have technical meanings is a basic trope in mathematics: kernel, neuron, limit, derivative, spectrum, manifold, atlas, chart, model, group, ring, ideal, field, topology, open, closed, compact, exotic, neighborhood, domain, immerse, embed, fibre, bundle, flow, section, measure, category, scheme, torsion, ...

... and typing Natural Transformation into Google shows you skinny dudes who got buff.

14

cdsmith t1_j3evg7e wrote

Punchline is just common vernacular for "here's where all the parts come together in a moment of realization". It's a metaphor for a joke, where you have all the setup, and then there's the moment when you "get it" and laugh.

4

cdsmith t1_j3ev3je wrote

This is definitely a theory presentation, though it does end with some applications to hyperparameter transfer when scaling model size. But if your main experience with ML is building models and applications, I'm not surprised it looks unfamiliar.

That said, give it a chance if you're interested. Some parts of the outline didn't look familiar to me either, but the video is well made and stops to explain most of the background knowledge. And you can always gloss over the bits you don't understand.

1