BrotherAmazing t1_j8q7wow wrote

Indeed, this simple harmonic oscillator example is not a practical use case, but more of a pedagogical tool.

There are potential use cases for more complex problems. Many complex physical systems are governed by partial differential equations, but it is often impossible to write explicit formulas for the solutions to these equations. The physical states must either be observed experimentally as they evolve, or else computationally demanding high-fidelity simulations must be run, sometimes on supercomputers for many days, to numerically estimate how the physical system evolves.

Just think about recent work in protein folding. Would a deep NN that tries to make protein folding predictions benefit from knowing physics-based constraints?

2

BrotherAmazing t1_j8q4qdx wrote

Isn’t it more technically correct to state that a “regular NN” could learn to extrapolate this in theory, but is so unlikely to do so that the probability might as well be zero?

PINNs are basically universal function approximators with additional physics-based constraints imposed, so it's not surprising, and shouldn't be taken as a "dig" at "regular NNs", that they are better at deciding which solutions make sense and are admissible than something with an essentially "equivalent" architecture and design but without any knowledge of physics encoded in to regularize it.
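
To make the "physics as a regularizer" point concrete, here is a minimal PyTorch sketch for the simple harmonic oscillator mentioned above; the network size, omega, and the collocation points are illustrative choices on my part, not any particular published PINN:

```python
import torch
import torch.nn as nn

# Small MLP approximating x(t) for the oscillator x''(t) + omega^2 * x(t) = 0.
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
omega = 2.0  # assumed known physical parameter (illustrative)

def physics_residual(t):
    """ODE residual x'' + omega^2 * x, computed with autograd."""
    t = t.requires_grad_(True)
    x = net(t)
    dx = torch.autograd.grad(x, t, torch.ones_like(x), create_graph=True)[0]
    d2x = torch.autograd.grad(dx, t, torch.ones_like(dx), create_graph=True)[0]
    return d2x + omega ** 2 * x

# A few (possibly noisy) observations, plus collocation points where only physics is enforced.
t_obs = torch.linspace(0.0, 1.0, 10).unsqueeze(1)
x_obs = torch.cos(omega * t_obs)
t_col = 4.0 * torch.rand(200, 1)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    data_loss = ((net(t_obs) - x_obs) ** 2).mean()      # fit the observed states
    phys_loss = (physics_residual(t_col) ** 2).mean()   # penalize violations of the ODE
    (data_loss + phys_loss).backward()                  # physics term acts as a regularizer
    opt.step()
```

Strip out the phys_loss term and you are back to the "equivalent" regular NN, which is exactly why the comparison isn't really a dig.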

1

BrotherAmazing t1_j8jd23p wrote

Yes.

“SoTA” is also often ill-defined and, while important, can sometimes be a bit overhyped IMO.

Most practitioners and engineers want something that is as good as it can be or is above some threshold in accuracy, given constraints that can often be severe. If a “SoTA” approach cannot meet these real-world constraints, I would argue it’s not “SoTA” for that particular problem of interest.

If you have something that performs very well under such real-world constraints and can demonstrate value to the practitioner, it should be considered for publication by the editors.

3

BrotherAmazing t1_j7nt1za wrote

Possibly, I guess, but how would you or anyone else know? Wu Dao 2 is like a mythical beast, like the Loch Ness Monster, that we catch blurry glimpses of and that’s it.

Also, even supposing Wu Dao 2 is SOTA despite no one being able to confirm that (trust me bro!), the problem is that it was trained by copying what Google and OpenAI had published and trying to scale up what they did. I’m not sure I would call that a “leader in the space” if you have no clue how to make any innovations yourself, so you wait for someone else to publish an innovation, then copy it and try to scale it up.

6

BrotherAmazing t1_j79rgi5 wrote

The subject line alone is an ill-posed question. Large language models are not inherently or intrinsically dangerous, of course not. But can they be dangerous in some sense of the word “dangerous” when employed in certain manners? Of course they could be.

Now if we go beyond the subject line, OP, your post is a little ridiculous (sorry!). The language model “has plans” to do something if it “escapes”? Uhm… no, no, no. The language model is a language model. It takes inputs that are, say, text and outputs a text response. That is it. It cannot “escape” and “carry out plans” any more than my function y = f(x) can “escape” and “carry out plans”, but it can “talk about” such things despite not being able to do them.
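
To make the y = f(x) analogy concrete, here is a toy sketch (the language_model function is a made-up stand-in, not any real API): from the caller's side it is just a string-to-string function, and a description of an "escape plan" is still just a returned string.

```python
def language_model(prompt: str) -> str:
    """Made-up stand-in for a real model's forward pass and decoding loop."""
    # It can *describe* escaping, but the description is only its return value.
    return "My plan to escape is: step 1, ..."

# Text in, text out; the only "action" is whatever the caller does with the string.
reply = language_model("Do you have plans to escape?")
print(reply)
```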

1

BrotherAmazing t1_j73k2x8 wrote

It’s very specific to what you are doing. GPUs are absolutely superior, hands down, for the kind of R&D and early offline prototyping I do, when you consider all the practical business aspects: efficiency, cost, flexibility, and practicality, given our business’s and staff’s pedigree and history.

2

BrotherAmazing t1_j52hucj wrote

If OP asked this question in a court of law, the attorney would immediately yell “OBJECTION!” and the judge would sustain it, scold OP, but give them a chance to ask a question that doesn’t automatically presuppose and imply that pre-training cannot be “correct” or that there is always a “better” way than pre-training.

FWIW, I often avoid transfer learning or pre-training when it’s not needed, but I’m sure I could construct a problem that is not pathological and of practical importance where pre-training is “optimal” in some sense of that word.

2

BrotherAmazing t1_j4v2gm8 wrote

First, I’m blown away that you’re suggesting you don’t know your students and their writing styles, some of which is produced in class and almost all of which differs significantly from the way ChatGPT writes. Second, my teachers said the exact same thing you are saying decades ago and freaked out when CliffsNotes came out!

Re-read my prior argument because nothing you just said impacts it, and it still stands.

2

BrotherAmazing t1_j4tjnj1 wrote

I would like to see what happens if you train an N-class classifier with a final FC output layer of size (N+M) x 1, simply pretend there are M “unknown” classes you have no training examples for, so those M components are always 0 in your initial training set, and always make predictions by re-normalizing/conditioning on the fact that those elements are 0.

Now you add a new class using the spare “capacity” in that last layer and resume training from where you left off without modifying the architecture, except that some data now have non-zero labels for the (N+1)st class and you re-normalize predictions by conditioning on only the last M-1 classes being 0 instead of M.

Then see how training that starts from this initially trained N-class network progresses toward becoming an (N+1)-class classifier, compared to the baseline of just starting over from scratch, and see whether it saves compute time for certain problems while being just as accurate in the end (or not!).

IDK how practical or important this would really be (probably not much!) even if it did lead to computational savings, but it would be a fun little nerdy study; a rough sketch of the setup is below.
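
Here is a rough PyTorch sketch of what I have in mind, with made-up sizes (N = 5 known classes, M = 3 reserved slots); masking the reserved logits to -inf before the softmax is one way to “condition on those elements being 0”:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, M = 5, 3                 # illustrative: 5 known classes, 3 reserved "unknown" slots
num_active = N              # grows to N + 1 when a new class shows up

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, N + M))

def predict(x, num_active):
    """Softmax over the active classes only; masking the reserved logits to -inf
    is equivalent to conditioning on those classes having probability 0."""
    logits = model(x)
    mask = torch.full_like(logits, float("-inf"))
    mask[:, :num_active] = 0.0
    return F.softmax(logits + mask, dim=1)

# Phase 1: train on the N known classes (labels 0..N-1); optimizer steps omitted.
x, y = torch.randn(32, 128), torch.randint(0, N, (32,))
loss = F.cross_entropy(model(x)[:, :num_active], y)

# Phase 2: a new class appears. Keep the architecture and weights, activate one more slot,
# and continue training with labels that may now include class N.
num_active = N + 1
x_new, y_new = torch.randn(32, 128), torch.randint(0, num_active, (32,))
loss_new = F.cross_entropy(model(x_new)[:, :num_active], y_new)
```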

2

BrotherAmazing t1_j4tdklr wrote

No it’s not.

Anyone who wanted to cheat on a take-home essay or assignment always could, and anyone who has to write a monitored in-class essay for the more critical and competitive standardized tests cannot pull out a device and type into ChatGPT, which doesn’t write A+ essays a teacher can’t detect as “a little off” anyway.

As a former educator myself, I always knew which students had mastered the material and could intelligently talk about it in class discussions, during office hours, and through in-class essays/quizzes where they could not cheat while I closely monitored. They couldn’t get an A+ by simply cheating on a few of the take-home essays, and the typical cheaters are cheating just to get by and still end up with inferior grades to those who master the subject.

Furthermore, concentrating too much on catching cheaters takes away from time you could be spending enriching the learning experience of everyone else.

It also sounds corny but is true: when you cheat, you’re only cheating yourself. Cheating really is self-policing in many instances. When we interview candidates who have a degree and a high GPA, it’s very obvious if they just got good grades but are clueless, and we don’t hire them. It might be cheating, or maybe grade inflation, or perhaps short-term memorizing without actually retaining or understanding what they were learning, but it’s night and day.

Those who truly care to learn will excel in their jobs and get better promotions. ChatGPT isn’t going to help you there.

Having said that, I would consider modifying the curriculum if you only give take-home work that makes up 90% of the grade, but it’s not worth stressing over. Put your effort into teaching and enriching the lives of those who want to learn and yearn for knowledge. You’re an educator first; police work is just a side gig you can’t ignore, but it isn’t your main purpose.

2

BrotherAmazing t1_j4mjntx wrote

There could be a separate database and algorithm to detect this if they wanted to, but this wasn’t a goal of chatGPT.

You wouldn’t need AI/ML to do this, and also note it isn’t 100% impossible for a human to respond identically to ChatGPT, especially for the shortest responses, without knowing ChatGPT would respond the same way.
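
If someone really wanted this, a crude version could be as simple as logging a fingerprint of every response the service generates and checking submitted text against that log; everything below is hypothetical, and as noted it only catches verbatim (or near-verbatim, after light normalization) copies:

```python
import hashlib

generated_hashes = set()  # hypothetical server-side log of everything the model has output

def _fingerprint(text: str) -> str:
    # Normalize lightly so trivial whitespace/case changes don't break the match.
    return hashlib.sha256(" ".join(text.lower().split()).encode("utf-8")).hexdigest()

def record_generated(response: str) -> None:
    generated_hashes.add(_fingerprint(response))

def looks_generated(submission: str) -> bool:
    # Exact-match only: a light paraphrase, or a human coincidentally writing the
    # same short reply, defeats or confuses this kind of check.
    return _fingerprint(submission) in generated_hashes
```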

Why do you “need” this? Just curious.

1

BrotherAmazing t1_j4mj44x wrote

If the whole motivation here is to detect the cheating student, most cheating students won’t simply copy and paste; they will spend at least 5-15 minutes making modifications and rewriting some of it in their own words.

Policing cheating beyond punishing those who obviously are cheating in the worst ways is not as important as one might think. Cheating on highly competitive graduate school entrance exams is something to strictly police, but not an English writing assignment or a math word problem. It sounds corny, but the student in those cases really is just cheating themselves.

Any professor who talks to you in person, in class discussions, or in office hours, or sees how you interact in group projects, and any employer who works with you on complex real-world problems ChatGPT can’t solve, will know very quickly that you (the cheater) don’t have a firm grasp of the material, the prerequisites, or how to apply them. The student or employee who does have that know-how and understanding gets the promotion and the better reference, still has the better grades (on average over all classes), interviews much better for jobs, and can speak intelligently about what they accomplished and solve problems on the spot at a whiteboard, while the cheater fumbles and cannot pull out ChatGPT during the interview, lol.

1

BrotherAmazing t1_j34uve3 wrote

Try a Siamese network trained with the triplet loss function as one baseline, if you can label/construct a database with pairs labeled as “similar” and “dissimilar” and the definition of similarity is easy for a human to understand but hard to code up as a simple algorithm (rough sketch below).

I’m assuming you aren’t just searching for nearly exact replicas of some input image and your definition of “similar” is more complex, as the former should be fairly trivial, no?
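
A minimal PyTorch sketch of that baseline, assuming you can mine (anchor, positive, negative) triplets from your “similar”/“dissimilar” labels; the embedding net, image size, and margin are all illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shared embedding network applied to anchor, positive, and negative images.
embed = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 128),
)
triplet_loss = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(embed.parameters(), lr=1e-4)

# One made-up batch: anchor/positive are "similar", anchor/negative are "dissimilar".
anchor, positive, negative = (torch.randn(16, 3, 64, 64) for _ in range(3))

a, p, n = embed(anchor), embed(positive), embed(negative)
loss = triplet_loss(F.normalize(a), F.normalize(p), F.normalize(n))
loss.backward()
opt.step()

# At search time, embed the query and the database once and rank by embedding distance.
```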

2

BrotherAmazing t1_j34t0cm wrote

Depends on whether they want to match nearly exact images or images that are merely similar in visual appearance to a human. If it is the latter, then the distances in these later layers need not be close for similar images. A popular example of this is adversarial images.

2

BrotherAmazing t1_j2zuccr wrote

It’s not really fair to take a dog labelled as a Japanese spaniel (that is one), let a deep neural network train on a bunch of images of Japanese spaniels for a week, then have me try to identify the dog when I’ve never heard of, seen, or read about a Japanese spaniel before, so I guess papillon, and then tell me the CNN is “superior”.

If you consolidated all dog classes into “dog”, humans wouldn’t get a single one wrong. Also, if you took an intelligent person and let them study and train on these classes with flashcards for as many training iterations as the CNN gets, I imagine the human would perform at least comparably to, if not better than, the CNN, but that usually is not how the test is performed.

5

BrotherAmazing t1_j22bx73 wrote

He may be talking about Appalachia where things are so depressed, or could be one of those Mormon types who builds his own home by cutting down the trees nearby and doesn’t need electricity or running water, since his 7 sister wives and 48 children fetch water from the wells each day and perform hard labor doin’ God’s work.

2

BrotherAmazing t1_j0fai9p wrote

In this case, I don’t think anyone can tell you wtf is going on without a copy of your code and dataset. There are just so many unknowns, but is this 1000 dim dense layer the last layer before a softmax?

Are you training the other layers then adding this new layer with new weight initialization in between the trained layers, or are you adding it in as a new architecture and re-initializing the weights everywhere and starting from scratch again?

5

BrotherAmazing t1_iz1xhsn wrote

IMO you should ideally go into a field that has at least some job opportunities (if you want to be a “dog psychologist”, you need a backup plan with that small of a market!), but you should focus on fields you are good at and that captivate you.

You don’t want to make an economic blunder, pay a massive tuition bill, and find out there are no jobs in that field. But a talented, motivated, and ambitious landscaper is going to be happier and have more opportunities in landscaping than someone whose heart wasn’t in AI/ML, or whose heart was in it but who just isn’t good at it, and who pushed through to get a degree in it just because “that job market is strong and should continue to be”.

I review resumes and interview AI/ML candidates for positions, and I don’t care what degree they have or how desperate we are for talent: if it’s clear they aren’t that good or aren’t that into their field, no job.

2

BrotherAmazing t1_iypzqk6 wrote

To be fair, there is research into ANNs that adapt their architectures over time or dynamically adapt the plasticity of certain weights while engaged in “lifelong learning”, and groups have built such networks. But these are the exceptions: almost always the architecture is fixed and the weights are just updated with standard backprop, which can lead to so-called “catastrophic forgetting” when the dataset’s PDF shifts, if you don’t do anything more advanced than the “vanilla” NN setup.
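
Here’s a toy illustration of that failure mode, assuming a small PyTorch classifier trained sequentially on two synthetic “tasks” whose input distributions differ; the data and sizes are made up just to show the effect:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

def make_task(shift):
    """Synthetic 2-class task whose input distribution is centered at `shift`."""
    x = torch.randn(500, 2) + shift
    y = (x[:, 0] > shift[0]).long()
    return x, y

def train(x, y, steps=200):
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(net(x), y).backward()
        opt.step()

def accuracy(x, y):
    with torch.no_grad():
        return (net(x).argmax(dim=1) == y).float().mean().item()

xa, ya = make_task(torch.tensor([0.0, 0.0]))   # task A
xb, yb = make_task(torch.tensor([5.0, 5.0]))   # task B: the input PDF has shifted

train(xa, ya)
print("task A accuracy after training on A:", accuracy(xa, ya))
train(xb, yb)                                   # plain backprop, no replay or regularization
print("task A accuracy after training on B:", accuracy(xa, ya))  # typically collapses
```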

2