ReginaldIII t1_iznu5ry wrote
Reply to comment by satireplusplus in [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
If people were constantly crunching an LLM every time they got a stack trace, and this became a normal development practice despite being largely unnecessary, then, given it is all completely avoidable, would it not be a waste of energy?
> It's a tool like any other, you're using a computer too to avoid doing basic tasks by hand.
That's a nonstarter. There are plenty of tasks more efficiently performed by computers. Reading an already very simple stack trace is not one of them.
ReginaldIII t1_iznsvav wrote
Reply to comment by satireplusplus in [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Honest question, do you consider the environmental impact of how you are using this to avoid very basic and easy tasks?
ReginaldIII t1_iznsdsj wrote
Reply to [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
That's such an unnecessarily wordy explanation. The error message literally explained it to you concisely.
If it produces such unnecessarily verbose output for such a simple error message, god help you when it is more complicated.
Furthermore, ChatGPT cannot do deductive reasoning. It can only take existing chains of thought from its training set and swap out the keywords consistently to apply the same logic template to something else, which may or may not fit correctly.
This is a bad idea. And if I'm perfectly honest, a waste of electricity. Save the planet and don't push this as a legitimate usage.
ReginaldIII t1_izl0quh wrote
Reply to comment by FerretDude in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist
This is a really nice write up, thank you.
I'm interested in what your thoughts are on prompt manipulation and "reasoning" your way around ChatGPT's ethical responses (and how those responses were even added during training). What direction do you see being best to combat these issues?
Also, have you looked at incorporating queries to external sources for information by decomposing problems in order to reason about them? The quality of ChatGPT made me think of Binder https://lm-code-binder.github.io/ and how powerful a combination they could be. A benefit of Binder is that the chain of reasoning is encoded in the intermediate steps and queries, which can be debugged and audited.
Something ChatGPT lacks is the ability to properly explain itself. You can ask it to explain its last output, but you can also ask it to lie to you, and it does.
If you ask it to lie to you convincingly, who is to say it isn't?
Can a conversationally trained LLM ever be used in a production application (as many are beginning to do) without a more rigorous rule based framework around it?
ReginaldIII t1_izf02ey wrote
Reply to [D] We're the Meta AI research team behind CICERO, the first AI agent to achieve human-level performance in the game Diplomacy. We’ll be answering your questions on December 8th starting at 10am PT. Ask us anything! by MetaAI_Official
> "A strange game. The only winning move is not to play. How about a nice game of chess?"
ReginaldIII t1_ize0b4z wrote
Reply to comment by suedepaid in [R] The Forward-Forward Algorithm: Some Preliminary Investigations [Geoffrey Hinton] by shitboots
That's good. I will keep an eye out :)
ReginaldIII t1_iz4ries wrote
Reply to comment by katprop in [R] The Forward-Forward Algorithm: Some Preliminary Investigations [Geoffrey Hinton] by shitboots
Do you have access to the video of his presentation still?
It bothers me greatly that they paywall their presentations even after the conference has ended.
By all means have exclusivity for the duration of the actual conference, and limit commenting and discussion to conference attendees. But as soon as the conference ends they should flip the switch and make everything public. There's literally no reason not to, it isn't going to stop people wanting to attend.
ReginaldIII t1_iz1f43w wrote
Reply to comment by link0007 in [P] Save your sklearn models securely using skops by unofficialmerve
Tidymodels is a specific example of an R extension package with its own file format. That would be like saying you are quite happy with the Python infrastructure for saving PyTorch models. It's still specific to that thing.
There are plenty of good ways of storing model weights. Those based on HDF5 archives are a great choice since they are optimized for block tensor operations and on-disk chunking, support lazy slicing, and support nested groups of tensors. Keras uses HDF5 for its save_weights and load_weights functions.
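As a rough illustration of what those properties buy you, here's a minimal h5py sketch (the tensor names are made up and it isn't tied to any particular framework):

```python
import h5py
import numpy as np

# Toy weight tensors standing in for real model parameters.
weights = {
    "encoder/dense1/kernel": np.random.randn(16, 64).astype(np.float32),
    "encoder/dense1/bias": np.zeros(64, dtype=np.float32),
}

# Nested groups and on-disk chunking are native HDF5 features;
# the "/" in the dataset name creates the group hierarchy.
with h5py.File("weights.h5", "w") as f:
    for name, tensor in weights.items():
        f.create_dataset(name, data=tensor, chunks=True)

# Lazy slicing: only the requested block is read back from disk.
with h5py.File("weights.h5", "r") as f:
    first_rows = f["encoder/dense1/kernel"][:4, :]
```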
If your models are getting huge you need a different strategy anyway, and this is where S3 object-store backed systems like TensorStore become a better fit.
ReginaldIII t1_iydnekt wrote
Reply to comment by philthechill in Does anyone uses Intel Arc A770 GPU for machine learning? [D] by labloke11
Also worth considering how many years it is going to take to offset the sizeable cost of such a migration.
Forget the price of the hardware, how long is it going to take to offset the cost of the programming and administration labour to pull off this sort of move?
What about maintenance? We've got years of experience with Nvidia cards in datacentres, we understand the failure modes pretty well, we understand the tooling needed to monitor and triage these systems at scale.
What guarantees do I have that if I fill my racks with this hardware they won't be dying or catching on fire within a year?
What guarantees do I have that Intel won't unilaterally decide this is a dead cat for them and scrap the project, like they have for almost every GPU-adjacent project they've had?
ReginaldIII t1_iydl00l wrote
Reply to comment by AtomKanister in Does anyone uses Intel Arc A770 GPU for machine learning? [D] by labloke11
> That's exactly how innovation is made
It's also how companies overextend and go out of business.
ReginaldIII t1_iyctmv8 wrote
Reply to comment by philthechill in Does anyone uses Intel Arc A770 GPU for machine learning? [D] by labloke11
90% of the time my job is to be a small and consistently performing cog in a much bigger machine, because I am there to help drive downstream science outcomes for other scientists (often in a different discipline).
We need to get X done within Y timeframe.
> "Lets consider upending our infrastructure and putting millions of pounds worth or existing and battle proven code and hardware up in flux so we can fuck around seeing if Intel has actually made a viable GPU-like product on their umpteenth attempt"
... is not exactly an easy sell to my board of governance.
I was in the first wave of people who got access to Xeon Phi Knights Corner Co-Processor cards. Fuck my life did we waste time on that bullshit. The driver support was abysmal, even with Intel's own ICC compiler and their own MPI distribution.
ReginaldIII t1_iycqo8c wrote
Reply to comment by trajo123 in Does anyone uses Intel Arc A770 GPU for machine learning? [D] by labloke11
> People replying with "don't bother, just use Nvidia&CUDA" only make the problem worse ...music for Nvidia's ears.
My job is to get a trained model out the door so we can run experiments.
My job is not to revolutionize the frameworks and tooling available so that competing hardware can be made a feasible alternative for everyone.
There are only so many hours in the day. I get paid for a very specific job. I have to work within the world that exists around me right now.
ReginaldIII t1_iy5w27q wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
I would argue, for the images in the blue car post, that while the cars themselves reached a good fidelity and stopped improving, the backgrounds really improved and grounded the cars in their scenes better.
I think because this is treading into human subjective perception and aesthetic and compositional preferences, this sort of idea can only be tested by a wide-scale blind comparative user study.
Similar to how such studies are conducted in lossy compression research.
> It's entirely possible that putting in "close up photo of a plate of food, potatoes, meat stew, green beans, meatballs, indian women dressed in traditional red clothing, a red rug, donald trump, naked people kissing" will amplify some of what you want and cut out some of what's (presumably) a bunch of irrelevant or low-quality SEO spam.
I think the nature of the datasets and language models is always going to mean a specialized negative prompt, for where your image is located in the latent space, will be needed to tune that image to its optimum output for whatever composition you are aiming for. It lets you nudge it around. How much wiggle room that area of the latent manifold has to give for variation will vary greatly.
ReginaldIII t1_iy5si81 wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
I do think there's something interesting here, the presence of a negative prompt does seem beneficial.
I wonder if having "any" negative prompt is almost taking up some of the "slack" in the latent manifold. A better-defined negative prompt might have diminishing returns with regard to quality, but it does seem to have the ability to significantly influence the style, colour palette, and composition of the images.
ReginaldIII t1_iy5nmcv wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
I hate that I'm going to be "that guy", but it's not obvious enough that it's just a joke, because it does actually produce reasonably similar results. "Improved" is, at least from this, somewhat plausible, so I would be careful saying it when you don't actually mean it seriously, because that isn't clear.
You'd have been dunking on them just as well if you'd said a bullshit random prompt performs comparably.
ReginaldIII t1_iy5motj wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
What metric are you using to say this is an improved prompt? I think it's fair to say it is somewhat comparable but I think you'd need a set of metrics to define an improvement.
For example, the proportion of N generated images where the hands were correct, or a comparative user study where participants see the image pairs side by side, randomly swapped, and choose which they prefer.
And it definitely needs a comparison to a baseline of no negative prompt.
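To be concrete about the user study, here's a minimal sketch of the pairing logic I mean (rate_pair and show_and_ask are hypothetical names; the point is the random left/right swapping and how you'd tally preferences):

```python
import random

def rate_pair(baseline_img, negative_prompt_img, show_and_ask):
    # Randomly swap sides so raters can't learn which side is which condition.
    swapped = random.random() < 0.5
    left, right = (negative_prompt_img, baseline_img) if swapped else (baseline_img, negative_prompt_img)
    chose_left = show_and_ask(left, right)  # hypothetical UI call: True if the rater picks the left image
    # Map the rater's left/right choice back to "preferred the negative-prompt image?"
    return chose_left == swapped

# Preference rate over many pairs; ~0.5 means no clear benefit from the negative prompt.
# preference = sum(rate_pair(b, n, show_and_ask) for b, n in pairs) / len(pairs)
```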
It will also be interesting to see if this still applies to SD 2 since it uses a different language model.
ReginaldIII t1_iy5li80 wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
That is both hilarious and a really interesting result. Thanks for sharing this.
How did you search for that improved negative prompt?
It would be interesting to see a third column for the same prompts/seeds without any negative prompt as a baseline.
ReginaldIII t1_ixyifz3 wrote
Reply to comment by [deleted] in [P] Memory Profiling for Pandas and Python by thapasaan
You are literally just mocking someone because English is not their first language, when the point of their comment was as simple as "I think PyCharm is better than VSCode".
You are a rude person. That was unnecessary.
ReginaldIII t1_ixi0bfl wrote
Reply to comment by new_name_who_dis_ in [D] Schmidhuber: LeCun's "5 best ideas 2012-22” are mostly from my lab, and older by RobbinDeBank
Look up how often people like LeCun actively avoid citing his "most cited papers in the field" out of little more than unprofessional spite.
> it'd be much better if he was actually encouraging people to adopt them the proper way
He is and does. That's literally why they are highly cited papers in the first place.
His argument for not being cited isn't against the wider community who do cite him. It's against the major players who actively refuse to cite him.
ReginaldIII t1_ixhgfbo wrote
Reply to comment by Insighteous in [D] Schmidhuber: LeCun's "5 best ideas 2012-22” are mostly from my lab, and older by RobbinDeBank
Imagine your peer reviewed publications were routinely uncited by your immediate peers and they often claimed novelty in ideas you had already published about.
Schmidhuber literally just wants to be cited when people refer to work that they did.
ReginaldIII t1_ixhg0a6 wrote
Reply to comment by chaosmosis in [D] Schmidhuber: LeCun's "5 best ideas 2012-22” are mostly from my lab, and older by RobbinDeBank
Title isn't even misleading.
And given that the content behind the title is the length of a tweet, is it not reasonable for a person to actually just read the tweet to get the full content?
How is it a "more" arrogant claim? LeCun, of his own accord, identified the most important works, and they align with works that Schmidhuber's group has put out in that time frame. That is a matter of fact, not opinion. Those peer-reviewed and published works exist.
ReginaldIII t1_ixhfqha wrote
Reply to comment by [deleted] in [D] Schmidhuber: LeCun's "5 best ideas 2012-22” are mostly from my lab, and older by RobbinDeBank
Dude has been consistently publishing good ideas the whole time and continues to.
His criticism isn't that he should be lauded solely for past successes. It's that people, like LeCun, actively go out of their way not to cite his lab's contributions out of personal and petty grievances with him as an individual, and often actively espouse the novelty of ideas he has already established.
ReginaldIII t1_ixelgkf wrote
Reply to [R] Human-level play in the game of Diplomacy by combining language models with strategic reasoning — Meta AI by hughbzhang
A strange game. The only winning move is not to play. How about a nice game of chess?
E: -7? It was a movie quote guys...
ReginaldIII t1_ivqkw78 wrote
Reply to comment by doyougitme in [P] Serverless Jupyter Lab with GPUs and persistent storage by doyougitme
TOS > Prohibited Uses > 3. there's a typo on "Umpersonate"
I couldn't find a formal statement on your data policy for data uploaded to "uwstore", which I assume from your stack and how it appears in the notebooks is AWS EBS?
Do you give guarantees about data sandboxing between users? Have you penetration tested the ability to leak data out of another user's environment or persistent storage?
Do you reserve the right to audit data a user has uploaded into uwstore, or do you promise not to look at users' data? This has a pretty big impact on what kinds of data people can process on your service, as we often need a clearly written data policy to know we're operating within our rules of governance.
You allow Docker images, which is a nice feature, but are there size limits on images? You provide a build service based on a provided Dockerfile; how many cores, and how much RAM and local storage, does that build process run with?
If I have pushed a (potentially large) image to a repository, and I base my Dockerfile FROM that image, is this allowed or do you have a set of curated base images we need to use?
Do you allow users to provide an access token so you can pull images from a private repository? If you've pulled a private image from a repository and it is now in the cache of your services, are other users able to base their Dockerfiles FROM that private image without providing an access token?
ReginaldIII t1_iznvys0 wrote
Reply to comment by satireplusplus in [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Okay. But if you didn't do this you would not need to crunch a high-end GPU for a couple of seconds. And if many people were doing this as part of their normal development practices, then that would be many high-end GPUs crunching for a considerable amount of time.
At what scale does the combined environmental impact become concerning?
It literally consumes a lot more energy than interpreting the error yourself, or than Googling and then reading a doc page or Stack Overflow thread. And that energy gets consumed every time anyone gets that error, regardless of whether an explanation for it has already been generated for someone else.
> Ever played a video game? You probably wasted 1000x as much energy in just one hour.
In terms of what value you get out of the hardware for the energy you put into it, the game is considerably more efficient than an LLM.
> The real advantage is that this can really speed up your programming and it can program small functions all by itself. It is much better than stackoverflow.
If an otherwise healthy person insists on walking with crutches all day, every day, will they be as strong as someone who just walks?