MysteryInc152 t1_jee5zba wrote
Reply to comment by ChuckSeven in [D] Can large language models be applied to language translation? by matthkamis
It's not cherry-picked lol.
Wild how everyone will just use that word even when they've clearly never tested the model themselves. I'm just showing you what anyone who's actually used these models for translation will tell you.
MysteryInc152 t1_jeanjj9 wrote
Reply to comment by ChuckSeven in [D] Can large language models be applied to language translation? by matthkamis
>LLM can do translation but they are significantly worse than translation models trained on translation data.
This is not true at all lol. They're better by a wide margin.
MysteryInc152 t1_jeanb01 wrote
Reply to comment by ChuckSeven in [D] Can large language models be applied to language translation? by matthkamis
>LLM trained on a multi-lingual corpus can be prompted to translate but they are far inferior to actual translation models.
No lol. You would know this if you'd ever actually tried to translate with GPT-4 and the like. They're far superior to the current SOTA.
https://github.com/ogkalu2/Human-parity-on-machine-translations
MysteryInc152 t1_je9s41k wrote
Bilingual LLMs are much better translators than traditional SOTA.
https://github.com/ogkalu2/Human-parity-on-machine-translations
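If you want to try the comparison yourself, here's a minimal sketch of prompting a chat LLM to translate via the OpenAI API (pre-1.0 `openai` client assumed; the model name, prompt wording, and example sentence are purely illustrative):

```python
import openai  # pip install openai (the pre-1.0 client is assumed here)

openai.api_key = "YOUR_API_KEY"

def translate(text, source="Japanese", target="English"):
    """Plain chat-completion prompt; the wording is illustrative, not a fixed recipe."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": f"You are an expert {source}-to-{target} translator."},
            {"role": "user",
             "content": f"Translate the following into natural {target}:\n\n{text}"},
        ],
        temperature=0.3,  # low temperature keeps the output close to the source text
    )
    return response["choices"][0]["message"]["content"]

print(translate("猫が好きです。"))  # -> something like "I like cats."
```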
MysteryInc152 t1_je9rwv5 wrote
Reply to comment by ZestyData in [D] Can large language models be applied to language translation? by matthkamis
He's talking about unsupervised, predict-the-next-token GPTs. That's definitely not how Google Translate and the like work.
And GPT-like models far outperform traditional SOTA translators.
https://github.com/ogkalu2/Human-parity-on-machine-translations
MysteryInc152 t1_je2qmez wrote
Reply to comment by WarProfessional3278 in GPT's Language Interpretation will make traveling so much better by BlackstockTy476
Not GPT-4, but I made some comparisons a while back. Even before GPT-4, bilingual LLMs were way ahead of the game.
https://github.com/ogkalu2/Human-parity-on-machine-translations
MysteryInc152 t1_jdvqj47 wrote
Reply to comment by was_der_Fall_ist in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
That's not what I meant with regard to calibration. It's not about saying an answer x% of the time or not. It's about being able to correctly estimate gaps in its own knowledge.
Good calibration is what you want.
MysteryInc152 t1_jdu4v0n wrote
In the GPT-4 technical report, we see that base GPT-4 has really good calibration. That is, its confidence correlates directly with its ability to solve problems. But apparently the RLHF they did knocked that down some.
MysteryInc152 t1_jdu4sl2 wrote
Reply to comment by [deleted] in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
In the GPT-4 technical report, we see that base GPT-4 has really good calibration. That is, its confidence correlates directly with its ability to solve problems. But apparently the RLHF they did knocked that down some.
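For anyone unfamiliar, "good calibration" means stated confidence tracks actual accuracy. Here's a rough sketch of how you'd measure that with expected calibration error; the binning scheme and toy data are purely illustrative, not from the report:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; ECE is the accuracy-vs-confidence gap
    in each bin, weighted by the fraction of predictions landing there."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# A well-calibrated model's 70%-confident answers are right ~70% of the time.
conf = np.array([0.9, 0.8, 0.7, 0.6, 0.95])
hits = np.array([1, 1, 0, 1, 1])
print(expected_calibration_error(conf, hits))
```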
MysteryInc152 t1_jdruv58 wrote
Reply to comment by ecnecn in Why is maths so hard for LLMs? by RadioFreeAmerika
I didn't say you couldn't. I said it's not highly encoded in language. Not everything that can be extracted from language can be extracted with the same ease.
MysteryInc152 t1_jdrpjd4 wrote
Reply to comment by ecnecn in Why is maths so hard for LLMs? by RadioFreeAmerika
Sorry, I'm hijacking the top comment so people will hopefully see this.
Humans learn language and concepts through sentences, and in most cases semantic understanding can be built up just fine this way. It doesn't work quite the same way for math.
When I look at an arbitrary set of numbers, I have no idea whether they are prime or factors of anything, because the numbers themselves don't carry much semantic content. Determining whether they are actually requires stopping to perform some specific analysis on them, using rules internalized through a specialized learning process. Humans don't learn math just by talking to one another about it; they actually have to do it in order to internalize it.
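To make that concrete, here's the kind of procedure a primality judgment actually requires; nothing in the surface form of "97" tells you the answer, you have to run the check (a standard trial-division sketch):

```python
def is_prime(n: int) -> bool:
    """Trial division: the answer comes from computation,
    not from the surface form of the number."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

print(is_prime(97))  # True
print(is_prime(91))  # False: 7 * 13
```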
In other words, mathematics or arithmetic is not highly encoded in language.
The encouraging thing is that this does improve with more scale. GPT-4 is much, much better than 3.5.
MysteryInc152 t1_jdj8x5e wrote
Reply to comment by loopuleasa in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
>they mentioned an image takes 30 seconds to "comprehend" by the model...
Wait, really? Can you link a source or something? There's no reason a native implementation should take that long.
Now I'm wondering if they're just doing something like this: https://github.com/microsoft/MM-REACT
MysteryInc152 t1_jdhe8g1 wrote
Reply to [R] Artificial muses: Generative Artificial Intelligence Chatbots Have Risen to Human-Level Creativity by blabboy
General Purpose Technologies (from the Jobs Paper), General Artificial Intelligence. The skirting around the word is really funny. They've figured it out, but no one wants to call a spade a spade yet.
MysteryInc152 t1_jd9vmd4 wrote
Traditional NLP is out the door, yes. There isn't anything bespoke models can do that large enough LLMs can't do better.
MysteryInc152 t1_jd3v3kp wrote
Reply to comment by Zealousideal_Ad3783 in Bing chat’s new feature: turning text into images! by Marcus_111
There are foundation models that do these kinds of things. You can connect them to a language model to get the kind of effect you're thinking about.
MysteryInc152 OP t1_jcrz16i wrote
Reply to comment by MisterManuscript in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
Yeah, I'm wrong, it seems. I'd read a few articles using "bootstrapping" in the sense I used, so I assumed that was generally it.
MysteryInc152 t1_jcro0q5 wrote
Reply to comment by Either-Job-341 in [P] The next generation of Stanford Alpaca by [deleted]
He's talking about the Playground, which is billed per token: https://platform.openai.com/playground
MysteryInc152 t1_jcrnqc8 wrote
Reply to [P] The next generation of Stanford Alpaca by [deleted]
You can try training ChatGLM: 6B parameters, initially trained on 1T English/Chinese tokens, and completely open source. However, it's already been fine-tuned and had RLHF, but that was optimized for Chinese Q&A. It could use some English work.
Another option is RWKV. There are 7B and 14B models (I would go with the 14B; it's the better of the two) fine-tuned to a context length of 8192 tokens. He plans on increasing the context further too.
MysteryInc152 OP t1_jcpzgd4 wrote
Reply to comment by Temporary-Warning-34 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
Bootstrapping is basically taking a model's best/better outputs on a certain task and finetuning on that.
EDIT: Seems I'm wrong on that
MysteryInc152 OP t1_jcpxcn5 wrote
Reply to comment by Temporary-Warning-34 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
Oh, for sure. I changed it to "long context"; I think that's better. I just meant there's no hard context limit.
MysteryInc152 OP t1_jcputc0 wrote
Reply to [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
It uses relative positional encoding, so the context is unbounded in theory, but because it was trained on 2048 tokens of context, performance gradually declines beyond that. Fine-tuning for longer context wouldn't be impossible, though.
You can run it with FP16 (13 GB of VRAM), 8-bit (10 GB), or 4-bit (6 GB) quantization.
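Loading it quantized looks something like this with Hugging Face transformers (a sketch based on the repo's README; `quantize()` comes from the repo's custom modeling code pulled in via `trust_remote_code`):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# FP16 needs ~13 GB; the repo's custom modeling code exposes quantize()
# for 8-bit (~10 GB) and 4-bit (~6 GB) weights.
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()
    .quantize(4)
    .cuda()
)
model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```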
MysteryInc152 t1_jclpjzi wrote
Reply to comment by FallUpJV in [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
It's predicting language. As long as the architecture can properly learn to predict language, you're good to go.
MysteryInc152 t1_jeecbeq wrote
Reply to comment by ChuckSeven in [D] Can large language models be applied to language translation? by matthkamis
I didn't downvote you, but it's probably because you're being obtuse. Anyway, whatever. If you don't want to take the evidence in plain sight, then don't; the baseline human comparisons are right there. Frankly, it's not my problem if you're so suspicious of the results and not bilingual enough to test them yourself. It's not really my business whether you believe me or not.