
starstruckmon t1_j6l1k5l wrote

>take a human and show them 4 or 5 images of an animal they've never seen before they'll generally be able to draw it quite well

4-5 images is actually enough to fine-tune a pretrained SD model (that's roughly what DreamBooth-style fine-tuning works with). That's the correct comparison, since we're already pretrained too. Even if you ignore all the data up to that point in your life, even newborn brains are pretrained by evolution. They aren't initialised from random weights. This is easier to notice in other animals that can start walking right after birth.
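For a sense of scale, here's a minimal sketch of that few-image fine-tuning, assuming the diffusers library. It follows the DreamBooth recipe (fine-tune the UNet on a few images paired with a prompt containing a rare placeholder token), simplified to skip prior-preservation loss; the model name, file names and hyperparameters are just illustrative:

```python
import torch
from PIL import Image
from torchvision import transforms
from diffusers import StableDiffusionPipeline, DDPMScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
unet, vae, text_encoder, tokenizer = pipe.unet, pipe.vae, pipe.text_encoder, pipe.tokenizer
noise_scheduler = DDPMScheduler.from_config(pipe.scheduler.config)

vae.requires_grad_(False)           # only the UNet gets fine-tuned
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=5e-6)

prompt = "a photo of sks animal"    # "sks" = rare placeholder token for the new concept
ids = tokenizer(prompt, padding="max_length", truncation=True,
                max_length=tokenizer.model_max_length, return_tensors="pt").input_ids

to_tensor = transforms.Compose([transforms.Resize((512, 512)), transforms.ToTensor(),
                                transforms.Normalize([0.5], [0.5])])
images = torch.stack([to_tensor(Image.open(p).convert("RGB"))
                      for p in ["1.jpg", "2.jpg", "3.jpg", "4.jpg"]])  # the 4-5 images

for step in range(400):
    with torch.no_grad():           # VAE is frozen; encode images to latents
        latents = vae.encode(images).latent_dist.sample() * 0.18215  # SD v1 latent scale
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = noise_scheduler.add_noise(latents, noise, t)
    cond = text_encoder(ids.repeat(latents.shape[0], 1))[0]
    pred = unet(noisy, t, encoder_hidden_states=cond).sample
    loss = torch.nn.functional.mse_loss(pred, noise)  # standard noise-prediction loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```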

7

starstruckmon t1_j6l0a56 wrote

Very few patients are going to be okay with robotic surgery that isn't supervised by a doctor. It doesn't even matter if it's technically better; patients are just not going to trust it. Same with pilots: even with full autopilot, passengers will want someone who can take control if something goes wrong. Trains are much easier to automate, yet they still have an engineer/driver.

3

starstruckmon t1_j6kygds wrote

I can't really speculate on that topic. It's currently an active area of research.

To be honest, this problem is so widely known that I hadn't considered finding sources to support the claim. Here's the best authoritative source I could find on short notice:

https://arxiv.org/abs/2012.15613

It may seem counter-intuitive to link a paper that supposedly fixes this issue, but that's naturally the context in which a paper is most likely to discuss it. Also, if you read it carefully, you'll see that while the authors managed to narrow the gap, it still persists.

1

starstruckmon t1_j6jw3kl wrote

It seems like you're talking about a model trained on both languages. There are two issues with that, though. First, Chinese labs generally prefer to train models solely on Chinese data, or with only a limited amount of English data mixed in. Second, multilingual models currently perform significantly worse than models trained on a single language.

1

starstruckmon t1_j6j4jxi wrote

Even if the LLMs themselves never become perfect at generating Parcel pseudocode, having a compiler LM that can reliably convert Parcel (or something similar) into actual code would be a massive win. Imagine coding in natural-language pseudocode: a higher-level programming language.
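To make that concrete, here's a hypothetical sketch of such a "compiler" as a thin wrapper around an LLM chat API (using the openai client; the prompt, model choice and example pseudocode are my own illustration, not anything Parcel specifies):

```python
import openai

def compile_pseudocode(pseudocode: str, target_lang: str = "python") -> str:
    """Translate natural-language pseudocode into code in `target_lang`."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # a compiler should be deterministic
        messages=[
            {"role": "system",
             "content": f"You are a compiler. Translate the user's pseudocode into "
                        f"correct, idiomatic {target_lang}. Output only code."},
            {"role": "user", "content": pseudocode},
        ],
    )
    return response["choices"][0]["message"]["content"]

# Hypothetical usage: a "higher-level programming language" in action.
print(compile_pseudocode(
    "read numbers from data.txt, drop any negatives, then print their average"))
```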

3

starstruckmon t1_j6izowe wrote

I would be very surprised. Technically speaking (as per benchmarks), they have one of the best text-to-image generators right now, yet the practical output is far below what we have in quality, due to the limited dataset.

It would probably be even worse for text. Wikipedia, Reddit, all the coding forums like Stack Overflow, documentation and manuals, the vast majority of scientific papers: they'd be leaving so much out.

4

starstruckmon t1_j6d3lsr wrote

I can guarantee the next paper out of this Google team is going to be a diffusion model (instead of AudioLM) conditioned on MuLan embeddings.

The strength of the Google model is its text understanding, which comes from the MuLan embeddings, while the strength of the work you highlighted is the output quality, which comes from the diffusion model.

It's the obvious next step, following the same path as DALL-E 1 → DALL-E 2.
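For illustration, a toy sketch of that conditioning setup using the diffusers library. MuLan isn't public, so random tensors stand in for both the joint text/audio embedding and the training spectrogram:

```python
import torch
from diffusers import UNet2DConditionModel, DDPMScheduler

# A tiny denoiser over 1-channel mel-spectrogram "images", conditioned via
# cross-attention on a 128-dim embedding (standing in for MuLan's output).
unet = UNet2DConditionModel(
    sample_size=64, in_channels=1, out_channels=1,
    block_out_channels=(32, 64, 64, 64),  # kept small, just for the sketch
    cross_attention_dim=128,
)
scheduler = DDPMScheduler(num_train_timesteps=1000)

spectrogram = torch.randn(1, 1, 64, 64)   # stand-in training example
mulan_embedding = torch.randn(1, 1, 128)  # stand-in for a MuLan text/audio embedding

noise = torch.randn_like(spectrogram)
t = torch.randint(0, 1000, (1,))
noisy = scheduler.add_noise(spectrogram, noise, t)

# The DALL-E 2 style setup: denoise conditioned on the joint embedding.
pred = unet(noisy, t, encoder_hidden_states=mulan_embedding).sample
loss = torch.nn.functional.mse_loss(pred, noise)
```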

1

starstruckmon OP t1_j501y7y wrote

From the paper

>One natural avenue for future work would be to investigate fine-tuning mechanisms for such large-scale models, which would allow further accuracy recovery. We conjecture that this should be possible, and that probably at least 80-90% sparsity can be achieved with progressive pruning and fine-tuning.

So, that comes next. Though I doubt the 80-90% guesstimate.
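For reference, the generic recipe they're gesturing at looks something like this hedged sketch of iterative magnitude pruning using PyTorch's pruning utilities (not the paper's actual method; `train_step` is an assumed user-supplied fine-tuning callback):

```python
import torch
import torch.nn.utils.prune as prune

def progressive_prune(model, train_step, rounds=5, per_round=0.3, ft_steps=100):
    """Alternate pruning and fine-tuning so accuracy can recover between rounds."""
    linears = [m for m in model.modules() if isinstance(m, torch.nn.Linear)]
    for _ in range(rounds):
        for layer in linears:
            # Prune `per_round` of the *remaining* weights by L1 magnitude.
            prune.l1_unstructured(layer, name="weight", amount=per_round)
        for _ in range(ft_steps):   # brief fine-tuning between pruning rounds
            train_step(model)
    for layer in linears:           # bake the masks into the weights
        prune.remove(layer, "weight")
    return model
```

Five rounds at 30% each leaves roughly 0.7^5 ≈ 17% of the weights, i.e. ~83% sparsity, which lands right in that 80-90% range.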

1

starstruckmon t1_j4uufbc wrote

You don't really need a separate extension, do you? Your bot can just be another user submitting the timestamps.

Though it would help if the extension developer provided a list of videos that are being watched by their users but have no timestamps yet, so your bot isn't wasting time scraping through unpopular videos.
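Purely as illustration, the division of labour could be as simple as this (every endpoint, field name and helper below is invented):

```python
import requests

API = "https://example-extension-api.invalid"  # hypothetical extension backend

def run_bot_on(video_id: str) -> list[dict]:
    """Placeholder for the bot's actual timestamp-detection logic."""
    return [{"start": 12.0, "end": 45.5}]

# Hypothetical new endpoint: videos users are watching that still lack timestamps.
pending = requests.get(f"{API}/videos/missing-timestamps").json()

for video in pending:
    segments = run_bot_on(video["id"])
    # The bot submits through the same endpoint as any human user would.
    requests.post(f"{API}/timestamps",
                  json={"video_id": video["id"], "segments": segments})
```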

1

starstruckmon t1_j41dgsk wrote

There's no way for us to tell for certain, but since Google has used it for creativity-oriented projects/papers like Dramatron, I don't think so. I feel the researchers would have said something rather than intentionally leading the whole world astray, given that everyone is now following Chinchilla's scaling laws.

Chinchilla isn't just a smaller model. It's adequately trained, unlike GPT-3, which is severely undertrained, so similar, if not superior (as officially claimed), capabilities aren't unexpected.
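The back-of-the-envelope numbers behind "severely undertrained", using the rough ~20-tokens-per-parameter rule of thumb people take from the Chinchilla paper (an approximation, not its exact fit):

```python
gpt3_params = 175e9
gpt3_tokens = 300e9          # GPT-3's reported training set
chinchilla_params = 70e9
chinchilla_tokens = 1.4e12   # Chinchilla's reported training set

print(gpt3_tokens / gpt3_params)              # ~1.7 tokens/param, far below ~20
print(chinchilla_tokens / chinchilla_params)  # ~20 tokens/param, on target
print(20 * gpt3_params / 1e12)                # GPT-3 "should" have seen ~3.5T tokens
```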

1

starstruckmon t1_j3wvdqt wrote

>I think you may be underestimating the compute cost. It’s about $6M of compute (A100 servers) to train a GPT-3 level model from scratch. So with a billion dollars, that’s about 166 models.

I was actually overestimating the cost to train. I honestly don't see how these numbers don't further demonstrate my point. Even if it cost a whole billion (and that's a lot of experimental models), that's still a tenth of what they're paying.

>Considering experimentation, scaling upgrades, etc., that money will go quickly. Additionally, the cost to host the model to perform inference at scale is also very expensive. So it may be the case that the $10B investment isn’t all cash, but maybe partially paid in Azure compute credits. Considering they are already running on Azure.

I actually expect every last penny to go into the company. They definitely aren't buying anyone's shares (other than maybe a portion of employees' vested shares; that's not the bulk of it). It's mostly for newly created shares. But $10B for ~50% still gives you a pre-money valuation of ~$10B. That's a lot.
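Spelled out, the valuation arithmetic is just this (figures are the thread's rough numbers):

```python
investment = 10e9   # the reported $10B
stake = 0.50        # ~50% of the company
post_money = investment / stake           # $20B implied post-money valuation
pre_money = post_money - investment       # $10B pre-money, as stated above
print(post_money / 1e9, pre_money / 1e9)  # 20.0 10.0 (in billions)
```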

1