master3243
master3243 t1_irhd9dj wrote
When the article was first posted here, the top comment was insulting the paper and suggesting that Nature's reviewers hadn't scrutinized it enough simply because it was by Google.
I spent long discussions defending the paper, and hopefully a video like this by Yannic will make people recognize that it's not all fluff and hype but actually a big deal.
master3243 t1_irdoyzz wrote
Reply to comment by ReginaldIII in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
> You only care about the contribution to matmul
False, which is why I said it would have been better if they had released everything. Personally, I care more about the model/code/training process than the matmul result.
However, people are not one-dimensional thinkers. I can simultaneously say that DeepMind should release all their resources AND that this work is worthy of a Nature publication and isn't missing any critical requirements.
master3243 t1_irdoq0o wrote
Reply to comment by dkangx in [P] Stable-DreamFusion: A working implementation of text-to-3D DreamFusion, powered by Stable Diffusion by hardmaru
Empirical results don't necessarily prove theoretical results. In fact, most deep learning research (mine included) is trying out different things based on intuition and past experience of what worked, until you have something that achieves really good results.
Then you attempt to show formally why the thing you did is mathematically justified.
And often enough, once you start going through the formal math, you get ideas for further improvements or different paths to take with your model, so it's a back-and-forth.
However, someone could just as easily get good results with a certain architecture/loss and then fail to justify it formally, skip certain steps, or make an invalid jump from one step to another, which results in theoretical work that is wrong even though the method works great empirically.
master3243 t1_irdi8o7 wrote
Reply to [P] Stable-DreamFusion: A working implementation of text-to-3D DreamFusion, powered by Stable Diffusion by hardmaru
In the paper, Appendix A.4 (deriving the loss and gradients):
I don't see how this step is true (eq. 14): https://i.imgur.com/ZuN2RC2.png
The RHS seems to equal (2 * alpha_t) * LHS, not the LHS itself.
I'm also unsure how this step in the same equation follows: https://i.imgur.com/DHixElF.png
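For context, the end point of that derivation should be the SDS gradient from the main text, which (writing it from memory, so treat this as an assumption rather than the paper's exact statement) is

$$\nabla_\theta \mathcal{L}_{\mathrm{SDS}}\big(\phi, x = g(\theta)\big) = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,\big(\hat\epsilon_\phi(z_t; y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta}\right].$$

My best guess is that constant factors like the 2 * alpha_t above are meant to be absorbed into the weighting w(t), but eq. 14 doesn't seem to say so explicitly.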
master3243 t1_irddrxw wrote
Reply to [P] Stable-DreamFusion: A working implementation of text-to-3D DreamFusion, powered by Stable Diffusion by hardmaru
Is this an implementation of the model architecture/training, or does it include a final checkpoint model that I can use for generation right now?
master3243 t1_irc6to3 wrote
Reply to comment by [deleted] in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
I literally cannot tell if you're joking or not!
If I release an algorithm that beats SOTA, along with a full and complete proof, would I also need to attach all my notes and the intuitions that led me to the decisions I made???
I can 100% tell you've never worked on publishing improvements to algorithms or math proofs, because NO ONE DOES THAT. All anyone needs is (1) the theorem/algorithm and (2) a proof that it's correct/beats SOTA.
master3243 t1_irc4wwf wrote
Reply to comment by [deleted] in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
What are you talking about? They definitely don't need to release that (it would be nice, but it's not required). By that metric, almost ALL papers in ML fail to meet the standard. Even the papers that go above and beyond and RELEASE THE FULL MODEL don't meet your arbitrary standard.
Sure, the full code would be nice, but ALL THEY NEED to show us is a PROVABLY CORRECT, SOTA matrix multiplication algorithm, which by itself proves their claim.
Even the most advanced breakthrough in DL (in my opinion), AlphaFold, where we have the full model, doesn't meet your standard, since (as far as I know) we don't have the code for training the model.
There are 4 levels of code release:
Level 0: No code released.
Level 1: Code/data for the output obtained (this only applies to outputs that couldn't otherwise be produced by a human or a machine, such as folds for previously uncomputed proteins, matrix factorizations, or solutions to large NP problems that classical techniques can't solve).
Level 2: Full final model released.
Level 3: Full training code / hyperparameters / everything.
On the above scale, as long as a paper achieves Level 1, it proves that the results are real and we don't need to take the authors' word for it, and thus it should be published.
If you want to talk about openness, then sure, I would like Level 3 (or even 2).
But the claim that the results aren't replicable is rubbish. It's akin to a mathematician showing you the FULL, provably correct matrix multiplication algorithm he came up with that beats SOTA, and you claiming it's "not reproducible" because you want all the steps he took to reach that algorithm.
The steps taken to reach an algorithm are NOT required to show that an algorithm is provably correct and SOTA.
EDIT: I think you're failing to see the difference between this paper (and similarly AlphaFold) and papers that claim a new architecture or model that achieves SOTA on a dataset. In that case I'd agree with you: showing us the results is NOT ENOUGH for me to believe that your algorithm/architecture/model actually does what you claim it does. But in this case, literally the result in itself (i.e. the matrix factorization) is enough to prove the claim, since that kind of result is impossible to fake. Imagine I release a groundbreaking paper saying I used deep learning to prove P≠NP, attaching a PDF with a FULL, 100% correct PROOF that P≠NP (or of any other unsolved problem). Would I also need to release my model? Would I need to release the code I used to train it? No! All I'd need to release for my publication is the PDF containing the proof.
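To make the "the result proves itself" point concrete, here's a minimal sketch (mine, not from the paper) of checking a published bilinear matmul algorithm — Strassen's, in this case — against the naive definition. Nothing about how the algorithm was discovered is needed:

```python
import numpy as np

def strassen_2x2(A, B):
    # One level of Strassen: 7 multiplications instead of 8 for a 2x2 product.
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

# Verifying the claim requires nothing beyond the algorithm itself.
rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
assert np.allclose(strassen_2x2(A, B), A @ B)
```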
master3243 t1_irbx3a6 wrote
Reply to comment by ReginaldIII in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
What???
I have no idea what you're talking about; their code and contribution are right here: https://github.com/deepmind/alphatensor/blob/main/recombination/sota.py
Their contributions are on lines 35, 80, and 88.
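For anyone skimming, here's a rough sketch (mine, not from that file; the vectorization conventions are an assumption and may differ from the ones deepmind/alphatensor actually uses) of how a rank-R factorization (U, V, W) of the matmul tensor is executed as an algorithm — the rank R is exactly the number of scalar multiplications, which is what's being improved:

```python
import numpy as np

def matmul_from_factorization(U, V, W, A, B):
    # R products of linear combinations of A's and B's entries,
    # then a linear recombination of those products into the output entries.
    prods = (U.T @ A.flatten()) * (V.T @ B.flatten())
    return (W @ prods).reshape(A.shape[0], B.shape[1])

# Sanity check with the trivial rank-8 factorization of 2x2 matmul
# (the naive algorithm); AlphaTensor's factorizations plug in the same way.
n, m, p = 2, 2, 2
R = n * m * p
U = np.zeros((n * m, R)); V = np.zeros((m * p, R)); W = np.zeros((n * p, R))
for r, (i, j, k) in enumerate(np.ndindex(n, m, p)):
    U[i * m + j, r] = 1   # pick A[i, j]
    V[j * p + k, r] = 1   # pick B[j, k]
    W[i * p + k, r] = 1   # add the product into C[i, k]

A, B = np.random.randn(n, m), np.random.randn(m, p)
assert np.allclose(matmul_from_factorization(U, V, W, A, B), A @ B)
```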
master3243 t1_iraqxox wrote
Reply to comment by ReginaldIII in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
I will repeat the same sentiment: it was released yesterday.
> publicly released in NATURE that it is replicable
It is replicable; they literally have the code.
master3243 t1_iraj7rp wrote
Reply to comment by ReginaldIII in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
But the paper was released literally yesterday?!
How did you already conclude that "no one can [...] actually apply it"?
Nowhere else in science do we apply this kind of scrutiny, and it's ridiculous to judge how useful a paper is without waiting at least 1-2 years to see what comes out of it.
ML is currently suffering from the fact that people expect each paper to be a huge leap on its own; that's not how science works or has ever worked. Science is a step-by-step process, and each paper is expected to be a single step forward, not the entire mile.
master3243 t1_ir9bp3h wrote
Reply to comment by ThePerson654321 in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
What about a coherent 30-second silent clip, generated from a short description, that is as difficult to distinguish from real footage as current SOTA generated images are from real images?
master3243 t1_ir9a5wt wrote
Reply to comment by ThePerson654321 in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
Image generation is by definition an easier task, so video generation will never fully catch up.
But do you not think that at some point in the future, video generation in the year 20XX will be better than image generation in 2022?
Even in the year 2050 or 2100?
master3243 t1_ir9a1st wrote
Reply to comment by ThatInternetGuy in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
Absolutely nothing you said contradicts my point that the optimal algorithm is an unsolved problem, and thus you can't claim it's impossible for an RL agent to improve on current methods.
master3243 t1_ir99ne0 wrote
Reply to comment by purplebrown_updown in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
I have, and I always have skepticism about DL.
But the post above doesn't even raise any theoretical or practical problems with the paper. Claiming that it's dense or that it's missing a GitHub repo is not a criticism that weakens a research paper. Sure, those are nice to have, but they're definitely not requirements.
master3243 t1_ir96avy wrote
These look quite trippy but amazing nonetheless.
This one in particular is quite impressive:
> Prompt: A bunch of autumn leaves falling on a calm lake to form the text "imagen Video". Smooth.
master3243 t1_ir95zxe wrote
Reply to comment by Unicycldev in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
It reminds me of image generation in the early days (a few years ago lol) when it wasn't yet super realistic.
Although this progress is faster than I expected, it's still obviously not at the level of Imagen for image generation.
master3243 t1_ir94pxk wrote
Reply to comment by Ulfgardleo in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
I don't think you're right, unless DeepMind is lying in the abstract of a Nature paper, which I highly doubt.
> Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago
(Concretely, as I recall from the paper: two-level Strassen does a 4 × 4 product in 7² = 49 multiplications, while AlphaTensor's algorithm in modular arithmetic does it in 47.)
master3243 t1_ir94he8 wrote
Reply to comment by ThatInternetGuy in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
It IS an unsolved problem; there's no known optimal algorithm yet.
Unless you have a proof you're hiding from the rest of the world?
> The optimal number of field operations needed to multiply two square n × n matrices up to constant factors is still unknown. This is a major open question in theoretical computer science.
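For concreteness, here's where the matmul exponent $\omega$ stands, with the decimals written from memory, so treat them as approximate:

$$2 \le \omega \le 2.373\ldots, \qquad \text{naive: } O(n^3), \qquad \text{Strassen: } O\big(n^{\log_2 7}\big) \approx O\big(n^{2.807}\big).$$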
master3243 t1_ir9457u wrote
Reply to comment by purplebrown_updown in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
They claim it's provably correct and faster. Matmul is one of the most used algorithms and is heavily researched (and has major open problems).
Would you like to step up and prove yourself in that competitive area?
master3243 t1_irhlkct wrote
Reply to [P] Stable-DreamFusion: A working implementation of text-to-3D DreamFusion, powered by Stable Diffusion by hardmaru
Pretty cool, I just tested with the prompt
"a DSLR photo of a teddy bear riding a skateboard"
Here's the result:
https://media.giphy.com/media/eTQ5gDgbkD0UymIQD6/giphy.gif
Having read the paper and understood the basics of how it works, I would have guessed it would tend to create a Neural Radiance Field where the front of the object is duplicated across many different camera angles, since when the NeRF is updated from a new angle, the diffusion model will output an image that closely matches an angle already created before.
I think Imagen can prevent this simply because of its sheer power: even given a noisy image of the backside of a teddy bear, it can figure out that it truly is the backside and not just the front again. Not sure if that made sense; I did a terrible job articulating the point.
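To put it differently, here's a toy sketch (mine, with made-up stand-ins for the renderer and diffusion model; it mirrors only the structure of the SDS update, not the repo's actual code) of why each update only "sees" one random camera at a time, which is what makes the duplicated-front failure mode plausible:

```python
import torch

# Hypothetical stand-ins so the sketch runs end to end; a real setup would use
# an actual NeRF renderer and a pretrained text-conditioned diffusion model.
H = W = 64
scene = torch.randn(3, H, W, requires_grad=True)   # toy "NeRF" parameters

def render(params, camera):                        # toy differentiable renderer
    return (params * camera).unsqueeze(0)          # (1, 3, H, W)

def diffusion_eps(z_t, t, text_emb):               # toy frozen noise predictor
    return torch.randn_like(z_t)

opt = torch.optim.Adam([scene], lr=1e-2)
T = 1000
alphas = torch.linspace(0.999, 0.01, T)
sigmas = (1 - alphas**2).sqrt()
text_emb = torch.randn(1, 768)                     # placeholder text embedding

for step in range(100):
    camera = torch.rand(())                 # a single random viewpoint per step
    x = render(scene, camera)
    t = torch.randint(0, T, (1,))
    eps = torch.randn_like(x)
    z_t = alphas[t] * x + sigmas[t] * eps   # forward-diffuse the current render
    with torch.no_grad():
        eps_hat = diffusion_eps(z_t, t, text_emb)
    # SDS: skip the U-Net Jacobian and push (eps_hat - eps), optionally scaled
    # by a weight w(t), back through the renderer. The diffusion model never
    # sees other angles, so only a strong enough model keeps the back of the
    # object from collapsing into another copy of the front.
    opt.zero_grad()
    x.backward(gradient=(eps_hat - eps))
    opt.step()
```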