Recent comments in /f/MachineLearning

ChuckSeven t1_jeeae4o wrote

Look, it doesn't matter. You can't claim that LLMs are better if you don't demonstrate it on an established benchmark with a wide variety of translations. How should I know if those Japanese anime translations are correct? For what it's worth, it might just be "prettier" text but a wrong translation.
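For instance, a minimal sketch of what that kind of evaluation looks like, assuming sacrebleu is installed and you have line-aligned system outputs and references (the filenames here are made up):

```python
# Sketch: scoring MT output against a reference set with sacreBLEU.
import sacrebleu

# One hypothesis and one reference per line (filenames are hypothetical).
with open("system_output.txt", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.txt", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# corpus_bleu takes a list of hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.1f}")
```

Run that over a standard test set like WMT or FLORES for both systems and you get a comparable number instead of a vibe.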

It's sad to get downvoted on this subreddit for insisting on very basic academic principles.

2

light24bulbs t1_jee9lvm wrote

I agree with you. Looking at papers like Toolformer and so on, we are very close.

We are only a couple years away from AGI, which is what I've been saying for YEARS and getting yelled at here. The WaitButWhy article in 2016 was dead right

−5

MysteryInc152 t1_jee5zba wrote

It's not cherry-picked lol.

Wild how everyone reaches for that word when they clearly haven't tested the model themselves. I'm just showing you what anyone who's actually used these models for translation will tell you:

https://youtu.be/5KKDCp3OaMo

https://www.reddit.com/r/visualnovels/comments/11rty62/gpt4_ai_vs_human_translation_on_the_opening_scene/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button

1

a_beautiful_rhind t1_jee547c wrote

512 context? I used alpaca-native, and even LLaMA + an Alpaca LoRA, at the full 2048 context. It worked fine.

>We plan to release the model weights by providing a version of delta weights that build on the original LLaMA weights, but we are still figuring out a proper way to do so.

This is where the weights currently "are".

Also.. do 30b next!

Edit.. playing with the demo:

>YOUR INPUT VIOLATES OPENAI CONTENT MODERATION API. PLEASE TRY AGAIN.

And "as a language model" replies.. including about me saying that I want to download the weights. Model says it can't be downloaded and has no "physical form". Ayy lmao.

Please stop training "openAI-isms" into models.
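(If you're fine-tuning on ChatGPT-derived data, a crude but common mitigation is to just filter those phrases out of the training set beforehand. A hypothetical sketch, assuming an Alpaca-style JSON layout and a made-up filename:)

```python
# Hypothetical sketch: strip "openAI-isms" from an instruction dataset.
import json

BANNED_PHRASES = (
    "as a language model",
    "as an ai language model",
    "i cannot fulfill",
)

def keep(example):
    """Drop examples whose output contains refusal boilerplate."""
    text = example["output"].lower()
    return not any(phrase in text for phrase in BANNED_PHRASES)

with open("instruction_data.json", encoding="utf-8") as f:  # made-up filename
    data = json.load(f)

clean = [ex for ex in data if keep(ex)]
print(f"kept {len(clean)}/{len(data)} examples")
```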

2

Tostino t1_jee40if wrote

The possibility of creating an autonomous agent with current-level hardware is not as far-fetched as it may seem. A single adept engineer could plausibly build one by combining insights from papers already published in the field: novel algorithms, techniques, and architectures that could be integrated into a coherent, functional system. The open-source tooling available today, such as langchain/flow and pinecone db (or similar), provides the frameworks needed to assemble an architecture that is self-augmenting and self-refining. Such a system could leverage distributed computing, natural language processing, and machine learning to improve its own performance and capabilities over time, potentially matching or even surpassing optimal human capability at most tasks.
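As a rough illustration (a sketch, not a real implementation; `llm`, `embed`, and the tool layer are stand-ins for whatever you actually wire up), the core loop of such an agent is small: retrieve relevant memories from a vector store, ask the model for the next action, execute it, and write the result back:

```python
# Hypothetical sketch of a self-refining agent loop with vector-store memory.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorMemory:
    """Toy stand-in for pinecone db or similar."""
    def __init__(self, embed):
        self.embed = embed   # embedding function, supplied by the caller
        self.items = []      # list of (vector, text) pairs

    def add(self, text):
        self.items.append((self.embed(text), text))

    def search(self, query, k=3):
        qv = self.embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(it[0], qv), reverse=True)
        return [text for _, text in ranked[:k]]

def execute_tool(action):
    """Stub tool layer; a real agent would dispatch to search, code exec, etc."""
    return f"(result of: {action})"

def run_agent(goal, llm, memory, max_steps=10):
    for _ in range(max_steps):
        context = memory.search(goal)
        action = llm(f"Goal: {goal}\nRelevant memory: {context}\nNext action?")
        if action.strip() == "DONE":
            break
        result = execute_tool(action)
        memory.add(f"{action} -> {result}")  # self-refinement: learn from outcomes
```

The "self-augmenting" part is the last line: results get written back into memory, so later retrievals condition the model on its own past attempts.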

−2

hapliniste t1_jee3gvr wrote

I tried some things in the web demo and it is really good.

What people haven't realised yet is that Koala (another model they haven't published anything about yet) is also available in the web demo, and it's CRAZY GOOD! It's also really fast, because I guess I'm the only one using it right now haha.

I really recommend trying it. It looks like Vicuna is a bit below GPT-3.5 and Koala a bit above, but I haven't tested enough to be sure yet.

2

FermiAnyon t1_jee34lx wrote

Glad you're here. This would be a really interesting chat for like a bar or a meetup or something ;)

But yeah, I'm just giving my impressions. I don't want to claim any authority or anything, as I'm self-taught with this stuff...

But yeah, I have no idea how our brains do it, but when you're building a model, whether it's a neural net or you're just factoring a matrix, you end up with a high-dimensional representation that gets used as input to another layer, or gets used straight away for classification. It may be overly broad, but I think of all of those high-dimensional representations as embeddings, and the dimensionality available for encoding them as the embedding space.

Like if you were into sports and you wanted to organize your room so that distance represents the relationships between your equipment: maybe the baseball sits right next to the softball, and the tennis racket is close to the table tennis paddle but a little farther from the baseball stuff. Then you've got some golf clubs, and they're in that same general area of the room, because all of these involve hitting things with another thing. Meanwhile your kite flying stuff, your fishing stuff, and your street luge stuff end up about as far from everything else as possible, because it's not obvious (to me anyway) that they're related to anything. Your room is a two-dimensional embedding space.

When models do it, they just do it with more dimensions and more concepts, but they learn where to put things so that the relationships are properly represented and they just learn all that from lots of cleverly crafted examples.
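To make that concrete, here's a toy version of the room analogy with made-up 2D coordinates; only the relative distances matter:

```python
# Toy 2D "embedding space": the room from the analogy above.
import math

room = {
    "baseball":      (1.0, 1.0),
    "softball":      (1.2, 1.1),
    "tennis racket": (2.0, 1.5),
    "tt paddle":     (2.1, 1.6),
    "golf clubs":    (3.0, 1.2),
    "kite":          (9.0, 9.0),
    "fishing rod":   (0.5, 9.5),
    "street luge":   (9.5, 0.5),
}

def dist(a, b):
    return math.dist(room[a], room[b])  # Euclidean distance

print(dist("baseball", "softball"))        # tiny: closely related
print(dist("tennis racket", "tt paddle"))  # tiny: closely related
print(dist("baseball", "kite"))            # large: unrelated
```

A learned embedding does the same thing, except the coordinates come out of training instead of being placed by hand, and there are hundreds of dimensions instead of two.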

4