Imnimo t1_j9x01v0 wrote

Overfitting is just one among many possible optimization failures. While these models might over-memorize portions of training data, they're also badly underfit in many other respects (as evidenced by their frequent inability to answer questions humans would find easy).

If Bing is so well-optimized that it has learned these strange outputs as some sort of advanced behavior to succeed at the LM or RLHF tasks, why is it so weak in so many other respects? Is simulating personalities either so much more valuable or so much easier than simple multi-step reasoning, which these models struggle terribly with?

1

Imnimo t1_j9ux0jn wrote

Well, I don't really think this is a semantic disagreement. I'm using their definition of the term.

If the issue is the danger of an AI arms race, what does a poorly-trained model have to do with it? Isn't the danger supposed to be that the model will be too strong, not too weak?

1

Imnimo t1_j9upa4x wrote

My point is that this isn't even misalignment in the first place. No more than an Imagenet classifier with 40% accuracy is misaligned. Misalignment is supposed to be when a model's learned objective is different from the human designer's objective. In their desperation to see threats everywhere, EZ et al resort to characterizing poor performance as misalignment.

1

Imnimo t1_j9rvl16 wrote

No, a lot of his arguments strike me as similar to arguments from the 1800s about how some social trend or another spells doom in a generation or two. And then his followers spend their time confusing "Bing was mean to me" with "Bing is misaligned" (as opposed to "Bing is bad at its job") and start shouting "See? See? Alignment is impossible and it's already biting us!"

14

Imnimo t1_iu50biy wrote

I don't use that sort of thing as part of a normal process, but I did run into a situation where I had an image dataset with small objects on potentially distracting backgrounds. Regular old CAM helped me check whether, on misclassified images, the model was finding the right object and just not understanding what it was, or missing the object altogether (it was mostly the former).

5
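The CAM check described above can be sketched minimally. This assumes the standard CAM setup (a network whose head is global average pooling followed by a single linear layer); the function name and shapes are illustrative, not from the original comment:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a class activation map for one class.

    feature_maps: (C, H, W) activations from the last conv layer
    fc_weights:   (num_classes, C) weights of the GAP -> linear head
    Returns an (H, W) map of spatial evidence for class_idx.
    """
    # Weight each feature map by its connection to the target class, then sum.
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)      # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()       # normalize to [0, 1] for visualization
    return cam
```

Upsampling the returned map to the input resolution and overlaying it on the image shows whether the high-activation region sits on the small object or on the distracting background, which is exactly the distinction the comment is drawing.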