abnormal_human t1_jdyxteq wrote
Model weights are not currently considered to be copyrightable, and there is no DMCA/RIAA/MPAA machinery providing additional consequences for "pirating" them. At least for the moment, it's not a big risk to use LLaMA/Alpaca models commercially so long as you haven't made an agreement with Facebook not to.
The OpenAI policy is about competing models, and comes from the TOS of using their API. Stanford agreed to that TOS, then released the text (which, again, is not copyrightable). Random people downloading that data set aren't party to that agreement or bound by it.
I'm sure that Google, Facebook, Amazon, Netflix, etc will be cautious here, but for a random smaller org, this is a risk/benefit tradeoff, not an absolute.
A person who takes a torrented LLaMA and finetunes it using the Stanford data set didn't necessarily engage in any contracts prohibiting that.
The original leaker of LLaMA weights broke the rules. That's about it. Tsk tsk.
abnormal_human t1_jdywyac wrote
Reply to comment by antonivs in [D] FOMO on the rapid pace of LLMs by 00001746
I'm in the midst of a similar project. It also doesn't require massively expensive compute, because for domain-specific tasks you often don't need models with gajillions of parameters to achieve business-interesting results.
abnormal_human t1_jc3j3ah wrote
Reply to [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Things are moving fast these days. Hopefully I can get some models trained before the technology leapfrogs me again.
abnormal_human t1_jb9kyzr wrote
Reply to comment by ReginaldIII in [R] Created a Discord server with LLaMA 13B by ortegaalfredo
Actually, it doesn't. GPLv3 just requires that if OP distributes a binary to someone, the source used to produce that binary must also be made available. With server-side code, the binary isn't being distributed, so there's no obligation to distribute the source. (That network-use gap is exactly what the AGPLv3 was written to close.)
abnormal_human t1_jacjmrj wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Am I reading right that this is a 1.6B parameter model?
abnormal_human t1_j9gjycf wrote
Reply to comment by pyepyepie in [D] What would be the ideal map for "learning" machine learning? by Ashb0rn3_
I guarantee that you have used stochastic gradient descent before if you've done any significant amount of ML work. This technique and other optimization methods like it are rooted in calculus and differential equations: gradient descent is essentially a discretized gradient flow.
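For a concrete feel, here's a minimal sketch of plain SGD fitting a one-weight linear model in NumPy. The toy data, learning rate, and epoch count are placeholder choices for illustration, not anything from the comment above:

```python
import numpy as np

# Toy data: y = 3x + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

w = 0.0    # single weight to fit
lr = 0.05  # learning rate; a placeholder choice

for epoch in range(20):
    for i in rng.permutation(len(X)):       # "stochastic": one shuffled sample at a time
        pred = w * X[i, 0]
        grad = 2 * (pred - y[i]) * X[i, 0]  # d/dw of the squared error
        w -= lr * grad                      # step against the gradient

print(f"learned w = {w:.3f}")  # should land near 3.0
```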
abnormal_human t1_j9fhvd5 wrote
I went through this about five years ago.
For me, the main job was learning all of the terminology and getting a feel for which techniques are used to solve what kinds of problems. At the time, I spent many hours listening to podcasts. Just listening to people talk about the stuff helped me get a map of the territory and decide where to dive deeper.
Then as soon as I had even the slightest grasp of a possible solution to a problem in my domain, I would go try to attack it. In this early era I made hundreds of Jupyter notebooks. Each one was me spending a few hours trying out a technique on some data from my business. Some worked, some didn't, but I got a lot of experience in a short time.
I had a strong math and SWE background to begin with. If you don't, you may have some extra catching up to do. As far as math goes, Linear Algebra is the most important. Probability Theory and Differential Equations are also very applicable. Most SWE work tied to Machine Learning is pretty basic. Lots of Python, but it helps to understand how computers work because you do get into data at scale pretty often.
At this point, I've deployed many ML systems to production that serve hundreds of thousands of users daily, and I can keep up with experts when conversing, designing stuff, etc.
abnormal_human t1_ita5okt wrote
Reply to Massive clog of waste in the line has all drains in the house backing up, what should I do? by Caprine-Evisc
You can try a bigger snake. Rent one. It's worth a shot, since it's the exact tool a drain person would try first, and renting one for a few hours will cost less than a plumber coming out to do it.
If that doesn't work, you'll soon be spending more than a plumber would charge, buying stuff you don't know how to use in the hope of maybe fixing the problem.
A drain company will have several snakes, an inspection camera, a pressure-washer-based drain-clearing device, etc., and they can change techniques each time one fails. Buying or renting that same equipment yourself at every step of the way will quickly get expensive.
One of the other problems with drain clearing is that there is some voodoo involved in reasoning about how the system is put together and where your snake is going. Plumbers have intuition about this based on the age of the building, materials used, construction details, etc, but you likely don't and it's not really easy to teach that knowledge. Sometimes the right way to get to a certain part of the system with a snake is not intuitive.
Good luck.
abnormal_human t1_istxc9l wrote
As a person who hires people, these interviews sound like bullshit and you shouldn't work there or feel bad about this.
My experience hiring DS/ML people is that technical skill is rarely the problem. At this point, 80% of what I am measuring is whether you are product-oriented enough to deliver something without a ton of hand holding. When the interview process is too technical-focused, you end up hiring a bunch of hermits who fail because of communication/collaboration problems or take too narrow a view on the product side.
abnormal_human t1_irull9o wrote
Reply to comment by [deleted] in Meeting your daily step goal really does work to prevent important illnesses. Taking more than 8,200 steps a day – the equivalent of walking around four miles – was found to protect against the likes of obesity, sleep apnoea, high blood pressure and major depressive disorder by Wagamaga
I walk around the house, ideally outdoors, when on conference calls. Can get hours of walking in that way.
abnormal_human t1_je60s31 wrote
Reply to [D] The best way to train an LLM on company data by jaxolingo
Yes, it's totally possible to train an LLM to understand tabular data. It's a very general-purpose architecture. With enough resources it is well suited to a wide range of problems, and yes, Azure/Snowflake can do everything you need (at some price, assuming you know what to do with them).
You need to make a decision about whether you want to bake the info into the LLM, or whether you want to teach the LLM to find the answers and then format them for humans.
That choice will depend on your use case, budget, team size, competencies, data set size, and time-to-market requirements. Baking the info into the LLM (fine-tuning or continued pretraining) is a lot harder than the retrieval approach, potentially 100x-1000x harder and more expensive, and without people who have experience doing it, you will waste a lot of time and energy getting there.
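To make the retrieval approach concrete, here's a minimal sketch: embed each table row, retrieve the rows most relevant to a question, and hand them to the LLM as context instead of retraining it. The rows, the hashed bag-of-words embed function, and the prompt format are all hypothetical stand-ins; in practice you'd use a real embedding model and a vector store:

```python
import re
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag-of-words,
    unit-normalized so dot products act like cosine similarity."""
    v = np.zeros(64)
    for tok in re.findall(r"\w+", text.lower()):
        v[hash(tok) % 64] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

# 1. Index: embed each table row once, offline.
rows = [
    "region=EMEA quarter=Q3 revenue=1.2M",  # hypothetical rows, not real data
    "region=AMER quarter=Q3 revenue=2.4M",
    "region=APAC quarter=Q3 revenue=0.9M",
]
index = np.stack([embed(r) for r in rows])

# 2. Query: score every row against the question, keep the best matches.
question = "What was EMEA revenue in Q3?"
scores = index @ embed(question)
top = [rows[i] for i in np.argsort(scores)[::-1][:2]]

# 3. Hand the retrieved rows to the LLM as context instead of retraining it.
prompt = "Answer using only this data:\n" + "\n".join(top) + f"\nQ: {question}\nA:"
print(prompt)  # in production, send this prompt to your LLM endpoint
```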