farmingvillein t1_iw1av5m wrote
Reply to comment by Peantoo in [D] Current Job Market in ML by diffusion-xgb
Yes, understood--what I was trying to get at is that you're aiming at the wrong market.
farmingvillein t1_iw0yjja wrote
Reply to comment by Peantoo in [D] Current Job Market in ML by diffusion-xgb
> but I keep hitting that wall of, "we use AWS and need someone to help with the production side of things."
That doesn't sound like big tech? Meta, Alphabet, etc. heavily use their own internal tools.
> but I keep hitting that wall of, "we use AWS and need someone to help with the production side of things."
Also doesn't sound like "truly" ML roles.
farmingvillein t1_iw0pgz2 wrote
Reply to comment by Peantoo in [D] Current Job Market in ML by diffusion-xgb
> but not having CI/CD and cloud experience is basically gatekeeping me
Big tech doesn't really care about this. Are you passing leetcode & ML skills screens? This is where I would focus.
(Well, would have, prior to the current crash...)
farmingvillein t1_iw0ozbu wrote
Reply to comment by [deleted] in [D] Current Job Market in ML by diffusion-xgb
Doesn't this say ~-30%, not 80%? What am I reading wrong here?
farmingvillein t1_ivvkfko wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
This is the right 80-20 starting answer.
farmingvillein t1_ivpuvwj wrote
Reply to comment by EmbarrassedHelp in [D] Video: The New AI Model Licenses have a Legal Loophole (OpenRAIL-M of BLOOM, Stable Diffusion, etc.) by ykilcher
Pretty gnarly.
Some quick observations:
- Not clear if this is an immediate concern, since you can make a "Derivative of the Model", which arguably gets you out from under this clause.
- "Version" is poorly defined--I could see someone trying to take an aggressive stance that a sufficiently degraded update is not an updated "version", but something else.
- "Reasonable efforts" is potentially (depending on jurisdiction) a hole large enough to drive a truck through. E.g., if you build a service that depends on their model, and they then release a degraded one that means you can no longer serve a large % of your users, you could argue that there is no "reasonable effort" available that allows you to transition to the new model (given the corresponding commercial costs).
If you want to rely on any of the above, though, you definitely should get your counsel's blessing...
farmingvillein t1_iv38uzt wrote
Reply to comment by hybridteory in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
That is my point? I'm not sure how to square your (correct) statement with your prior statement:
> Dall-E 2 is not there yet, but we are close to prompting "Original Mona Lisa painting" and be given back the original Mona Lisa painting with striking similarities
farmingvillein t1_iv2vqmx wrote
Reply to comment by hybridteory in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
> Codex is not technically copy pasting; it is generating a new output that is (almost) exactly the same, or indistinguishable on the eyes of a human, to the input.
Nah, it is literally generating duplicates. This is copying, in the eyes of the law. Whether this is an actual legal problem remains to be seen.
> Dall-E 2 is not there yet, but we are close to prompting "Original Mona Lisa painting" and be given back the original Mona Lisa painting with striking similarities.
This is confused. Dall-E 2 is "not there yet", as a general statement, because they specifically have trained it not to do this.
farmingvillein t1_iv2bbw6 wrote
Reply to comment by hybridteory in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
- If there can be a lawsuit, there eventually certainly will be one.
- The issues here are--for now--different. The current claim is that Codex is copy-pasting things that need licenses attached. (Whether this is true will of course be played out in court.) For image generation, no one has made the claim--yet--that these systems are emitting straight copies (at any meaningful scale) of someone else's original pictures.
farmingvillein t1_iv29u6c wrote
Reply to comment by Takahashi_Raya in [N] Class-action lawsuit filed against GitHub, Microsoft, and OpenAI regarding the legality of GitHub Copilot, an AI-using tool for programmers by Wiskkey
Getty and Shutterstock literally turned around and partnered with generative AI companies--who do exactly what you flag as a problem--to sell images on their platforms.
farmingvillein t1_iv1ye88 wrote
Reply to comment by chasingourselves in [N] Class-action lawsuit filed against GitHub, Microsoft, and OpenAI regarding the legality of GitHub Copilot, an AI-using tool for programmers by Wiskkey
Which seems like a solvable, albeit terribly painful, problem?
If this is the direction that things end up going, honestly this will ultimately work massively in favor of OpenAI (and a small # of very well-funded competitors), as it will create a very, very painful barrier to entry.
farmingvillein t1_iv1x778 wrote
Reply to comment by Takahashi_Raya in [N] Class-action lawsuit filed against GitHub, Microsoft, and OpenAI regarding the legality of GitHub Copilot, an AI-using tool for programmers by Wiskkey
> It hasn't that is why ghetty has blocked ai
You are right that OP is wrong (re: whether this is a settled legal issue)...but let's not pretend that ghetty [sic] doing so has to do with anything other than attempted revenue maximization on their part.
Successful, prolific AI art devalues their portfolio of images, and they know that.
farmingvillein t1_iupoett wrote
Reply to comment by OnceReturned in [N] Meta AI | Evolutionary-scale prediction of atomic level protein structure with a language model by xutw21
To be super clear, I'm not questioning the overall utility! Strictly a statement that I can't square this with Meta's mission statement.
farmingvillein t1_iuowrov wrote
Reply to [N] Meta AI | Evolutionary-scale prediction of atomic level protein structure with a language model by xutw21
I'm not sad that they are doing this, in the sense that it is almost certainly net-good for humanity, but it is bizarre to me that Meta AI is investing here.
farmingvillein t1_itefjav wrote
Reply to comment by LetterRip in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
> Note that 540B parameters is more than 2 TB for float 32
They only provide checkpoints up to the 11B model, however (unless I'm reading things wrong), so this is a moot point, at the moment.
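For a rough sense of the sizes involved, here is a back-of-the-envelope sketch (assuming plain fp32 storage at 4 bytes per parameter and ignoring optimizer state--actual released checkpoint formats may differ):

```python
# Back-of-the-envelope checkpoint sizes, assuming plain fp32 storage
# (4 bytes per parameter) and ignoring optimizer state.
BYTES_PER_PARAM = 4  # float32

for name, n_params in [("540B model", 540e9), ("11B checkpoint", 11e9)]:
    size_gb = n_params * BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{size_gb:,.0f} GB")

# 540B model: ~2,160 GB (~2.16 TB)
# 11B checkpoint: ~44 GB
```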
farmingvillein t1_iteevh5 wrote
I can't vouch for it personally--and it may or may not meet your needs (noting that this is about repurposing existing non-retrieval models to meet a maybe-similar need)--but maybe check out: https://twitter.com/bencmejla/status/1583499775789654016
farmingvillein t1_is36n5p wrote
Reply to comment by suflaj in [D] Wide Attention Is The Way Forward For Transformers by SuchOccasion457
Sorry, didn't mean to imply that you were saying that it was useless--that is in response to my own criticism of the paper's title (versus the paper itself).
> I find it funny that someone would actually say that instead of "they perform roughly the same"
Yeah...for better or worse, though, if you say something performs "on parity", people assume (because it is frequently true...) that what you really mean is "-0.1% but that totally isn't a big deal".
I don't fault them for highlighting the 0.3% as a light pushback on the above, but I do blame 1) OP for highlighting this point in their post (which, to your point, is at best misleading about the key claims of the paper) and 2) the authors for picking the ludicrous title.
farmingvillein t1_is35lpb wrote
Reply to comment by suflaj in [D] Wide Attention Is The Way Forward For Transformers by SuchOccasion457
Well, the key claim of the paper (which OP should have instead reflected in the top-level post) is not that there is a big accuracy increase, but that performance is equal or better, while being computationally advantaged:
> We show an in-depth evaluation and demonstrate how wide models require a far smaller memory footprint and can run faster on commodity hardware, in addition, these wider models are also more interpretable
I.e., get ~equal performance at lower cost (what's not to like?).
That said, the real issue with this paper is that they only look at very small datasets...which makes the paper basically useless for making grandiose claims like:
> WIDE ATTENTION IS THE WAY FORWARD FOR TRANSFORMERS
That doesn't mean that the paper itself is useless, of course...it is an interesting data point...but they absolutely should not have chosen that title.
farmingvillein t1_iqs51o3 wrote
Reply to Do companies/teams accept ppl coming from a completely different field into AI or ML? [D] by ritheshgirish9
If you're gunning for ML research, that will be tough.
ML engineer, though? Don't overthink it--just find opportunities that you find interesting and apply. Yes, every company is of course going to prefer unicorn candidates who have "been there, done that", but the need for ML engineers has exploded (demand > supply), so they can't be so choosy.
And, in reality, for most companies hiring ML engineers, what they really need, first and foremost, is people who will be excited doing SWE work on ML pipelines...which is really another way to say SWE work (which happens to involve ML). So if you're a strong software engineer, a lot of places will immediately be interested in considering you for such a role.
(To be clear, I don't say the above to "knock" ML engineering, companies doing such work, etc.--rather just commenting that the reality is that at many companies shipping "real" ML products to production, the day-to-day pains are often about data pipelines breaking, scaling problems, software version incompatibilities, etc.
I.e., it isn't about exclusively solving deep Pytorch voodoo or similar.
Rather, much of it is "classic" SWE concerns that touch ML systems.
This is a good thing for you, as they know that they can hire someone smart and you can be productive pretty quickly--and then you can learn the more obnoxious ins-and-outs of paper implementation / why CUDA hates you / etc. as you grow.
Good luck!)
farmingvillein t1_iw1vkd1 wrote
Reply to comment by Peantoo in [D] Current Job Market in ML by diffusion-xgb
Sounds like you are looking for a role advertised with more of an R&D slant--"data scientist", "ML research engineer", etc.
Role names are fluid/arbitrary, but "ML Engineer" at anywhere other than the largest shops (or very AI-specialized startups) is generally going to be someone who can help get an ML system into production (hence the cloud concerns).
Your general options (other than somehow boning up on cloud and passing the interviews now) would be to more narrowly tailor the roles you apply for (per above); continue to apply for MLE but be aware of the issues (per above); and/or apply to some general SWE roles so that you can get some more modern commercial/cloud stack experience.
As a general statement, 1-2 years of a generic SWE role would probably do wonders for your infrastructure-level knowledge; i.e., you'll probably be able to write your own ticket after this.
That said, if you have zero interest in expanding into this, I'd just focus on applying to roles that are more narrowly written.