Recent comments in /f/MachineLearning
IntrepidTieKnot t1_jebtzyj wrote
Reply to [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
Isn't this basically the description of Langchain?
FermiAnyon t1_jebpsg3 wrote
Reply to comment by mattsverstaps in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Yeah, isotropic as in being the same in all directions. So we're probably all familiar with embedding space and the fact that the positional relationships between concepts in embedding space basically encodes information about those relationships. Isotropy in language models refers to the extent to which concepts which are actually unrelated appear unrelated in embedding space.
In other words, a model without this property might have an embedding space that isn't large enough. If you keep teaching it things anyway, you end up cramming concepts into an embedding space that's too small, so unrelated concepts are no longer equidistant from other unrelated concepts. That implies relationships that don't really exist, with the result that the language model confuses things that shouldn't be confused.
Case in point: I asked ChatGPT to give me an example build order for Terrans in Brood War, and it proceeded to give me a reasonable-sounding build order, except that it was mixing in units from StarCraft 2. No human familiar with the games would confuse units like that. I chalk that up to a lack of relevant training data, possibly mixed with an embedding space that's not large enough for the model to be isotropic.
That's my take anyway. I'm still learning ;) please someone chime in and fact check me :D
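The crowding effect FermiAnyon describes can be sketched numerically. This toy example (illustrative only; the dimensions and the `mean_abs_cosine` helper are made up for the sketch, not from the thread) compares the average pairwise cosine similarity of random "concept" embeddings in a roomy space versus a cramped one — in too few dimensions, unrelated vectors can't all stay near-orthogonal:

```python
import numpy as np

def mean_abs_cosine(embeddings: np.ndarray) -> float:
    """Average absolute pairwise cosine similarity; near 0 = near-isotropic."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(sims)
    off_diag = sims[~np.eye(n, dtype=bool)]  # drop self-similarity (1.0)
    return float(np.mean(np.abs(off_diag)))

rng = np.random.default_rng(0)
n_concepts = 100
roomy = rng.standard_normal((n_concepts, 512))  # plenty of dimensions
cramped = rng.standard_normal((n_concepts, 8))  # far too few dimensions

print(mean_abs_cosine(roomy))    # small: unrelated concepts stay near-orthogonal
print(mean_abs_cosine(cramped))  # larger: spurious similarity from crowding
```

The same random, unrelated points look much more "related" in 8 dimensions than in 512 — which is the anisotropy-from-crowding intuition above.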
WokeAssBaller t1_jebpjog wrote
Reply to comment by lgastako in [D] The best way to train an LLM on company data by jaxolingo
Fine-tuning is just additional training, so if it works from scratch, it works with fine-tuning. And no, it may not be as effective as other methods, but the poster was claiming it was impossible.
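The "fine-tuning is just additional training" point can be shown with a toy model (a minimal sketch with made-up tasks and a generic gradient-descent loop, not anything from the thread): pretraining and fine-tuning use the identical update loop, the only difference being whether the weights start from zero or from the pretrained values:

```python
import numpy as np

rng = np.random.default_rng(42)

def train(w, X, y, lr=0.1, steps=300):
    """One generic gradient-descent loop for linear regression,
    reused unchanged for both the pretraining and fine-tuning phases."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w = w - lr * grad
    return w

X = rng.standard_normal((100, 3))
y_pretrain = X @ np.array([1.0, -2.0, 0.5])   # "pretraining" task
y_finetune = X @ np.array([1.2, -1.8, 0.4])   # nearby "fine-tuning" task

w = np.zeros(3)
w = train(w, X, y_pretrain)   # training from scratch
w = train(w, X, y_finetune)   # fine-tuning: same loop, warm-started weights

print(np.round(w, 2))
```

Whether this beats retrieval or other methods on company data is a separate question; the sketch only illustrates that the mechanism is the same optimization loop.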
yaosio t1_jebomzk wrote
Reply to comment by waxroy-finerayfool in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
A world model doesn't mean a model of the physical world; it means a model of the world implied by the data it's been given. Despite never being told what an Othello board looks like, the network maintains an internal representation of an Othello board.
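The way researchers check for such an internal representation is typically a probe: a small classifier trained to read a hidden property (like a board square's state) out of the network's activations. Here is a toy stand-in (synthetic activations and a least-squares probe, purely illustrative — not the Othello-GPT paper's actual setup or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: 500 "hidden states" of dim 64 that linearly encode a
# hidden binary feature (think "this square is occupied"), plus noise.
n, d = 500, 64
feature = rng.integers(0, 2, size=n).astype(float)  # hidden board property
mixing = rng.standard_normal(d)                     # how it's embedded
hidden = np.outer(feature, mixing) + 0.1 * rng.standard_normal((n, d))

# Linear probe: least-squares readout of the feature from the activations.
w, *_ = np.linalg.lstsq(hidden, feature, rcond=None)
preds = (hidden @ w) > 0.5
accuracy = float(np.mean(preds == feature.astype(bool)))
print(accuracy)  # high accuracy -> the feature is linearly decodable
```

If a probe this simple recovers the board state from activations, the representation is there even though the model was only ever trained to predict move tokens.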
[deleted] t1_jebnyhc wrote
Reply to comment by ZestyData in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
[removed]
ZestyData t1_jebmes7 wrote
Reply to comment by waxroy-finerayfool in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
...
bruh
currentscurrents t1_jebm09t wrote
Reply to comment by tvetus in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
No, do you have a link?
mattsverstaps t1_jeblz4h wrote
Reply to comment by FermiAnyon in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Isotropic? Not isomorphic? Please elaborate
[deleted] t1_jeblsuz wrote
Reply to comment by Im_Unlucky in [D] The best way to train an LLM on company data by jaxolingo
[deleted]
tvetus t1_jebkax4 wrote
Reply to comment by currentscurrents in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
Did you see the paper on voice to voice for talking with whales?
[deleted] t1_jebk4w0 wrote
Reply to comment by theotherquantumjim in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
[removed]
qncapper t1_jebjy9s wrote
Reply to comment by Im_Unlucky in [D] The best way to train an LLM on company data by jaxolingo
Cool, but how can I be confident about my model not spewing sh*t or making things up on the fly, because what it gives out has an impact on my stakeholders?
[deleted] t1_jebilrl wrote
Reply to comment by FermiAnyon in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
[removed]
[deleted] t1_jebi227 wrote
[removed]
icm76 t1_jebgew9 wrote
!remind me in 3 days
dancingnightly t1_jebf9zn wrote
Reply to comment by Jadien in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Incredibly interesting, given that humans represent some quantities this way too (numbers span left-to-right in the brain).
Zealousideal-Ice9957 t1_jebdm73 wrote
Reply to comment by EvilMegaDroid in [D] FOMO on the rapid pace of LLMs by 00001746
They just completed the data collection a few days ago, and they claim the prompts are of really high quality due to a strict filtering algorithm and the community's propensity to create a better open-source alternative to OAI.
trajo123 t1_jebbxaf wrote
Reply to comment by Tight-Lettuce7980 in [D] Improvements/alternatives to U-net for medical images segmentation? by viertys
Sufficient to train a model from scratch? Unlikely. Sufficient to fine-tune a model pre-trained on 1 million+ images (ImageNet, etc.)? Probably yes. As mentioned, some extra performance can be squeezed out with some smart data augmentation.
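For segmentation specifically, the augmentation has to transform the image and its mask identically, or the labels stop lining up with the pixels. A minimal numpy sketch of that idea (illustrative helper names; real pipelines would use a library like albumentations or torchvision transforms):

```python
import numpy as np

def augment(image: np.ndarray, mask: np.ndarray, rng):
    """Randomly flip/rotate an (image, mask) pair identically.
    For segmentation, the mask must undergo the same transform."""
    if rng.random() < 0.5:
        image, mask = np.fliplr(image), np.fliplr(mask)
    if rng.random() < 0.5:
        image, mask = np.flipud(image), np.flipud(mask)
    k = int(rng.integers(0, 4))  # random multiple of 90 degrees
    return np.rot90(image, k), np.rot90(mask, k)

rng = np.random.default_rng(0)
img = np.arange(16.0).reshape(4, 4)
msk = (img > 7).astype(np.uint8)
aug_img, aug_msk = augment(img, msk, rng)

# The mask still labels exactly the same pixels as the transformed image.
assert np.array_equal(aug_msk, (aug_img > 7).astype(np.uint8))
```

For medical images, flips and 90-degree rotations are usually safe choices since they don't interpolate pixel intensities.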
Builder992 t1_jebamry wrote
!Remindme 3 days
waxroy-finerayfool t1_jeba68s wrote
It's a model of a two-dimensional array, not the world.
TehDing t1_jeb7rvq wrote
Like tensorboard graph view?
currentscurrents t1_jeb4shv wrote
Reply to comment by DigThatData in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
Yeah, I think that's why they're starting with whales - they're an easy subject since their vocalizations can be heard through the water from miles away. They also seem to have a fairly complex vocal language, unlike for example songbirds with memorized mating calls.
DigThatData t1_jeb49b8 wrote
Reply to comment by currentscurrents in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
This is probably not a concern for whale vocalizations, but an issue for attempting to decode animal communications generally via LLMs is that animals are probably communicating as much information (if not more) non-vocally. For example, if we wanted to train an LLM to "understand" dog communication, it'd probably be more important to provide it with signals corresponding to changes in body and face pose than with vocalizations. Interesting stuff in any event.
glichez t1_jeb28mx wrote
dagster
bliblufra t1_jebv54y wrote
Reply to [D] Directed Graph-based Machine Learning Pipeline tool? by Driiper
Kubeflow? There are two more tools related to it: Elyra (open source) and Vertex AI (GCP, basically built on top of Kubeflow and TFX)
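All of the tools suggested in this thread (Dagster, Kubeflow, Elyra, Vertex AI) share the same core abstraction: a pipeline is a directed acyclic graph of steps, executed in dependency order. A minimal sketch of that idea using only the Python standard library (step names are invented for illustration; this is the concept, not any of those tools' APIs):

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Each pipeline step maps to the set of steps it depends on.
deps = {
    "load": [],
    "clean": ["load"],
    "features": ["clean"],
    "train": ["features"],
    "evaluate": ["train", "features"],
}

steps_run = []
for step in TopologicalSorter(deps).static_order():
    steps_run.append(step)  # a real orchestrator would execute the step here

print(steps_run)  # every dependency runs before its dependents
```

The orchestration tools add scheduling, retries, caching, and UI on top, but the DAG-of-steps model underneath is the same.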