Recent comments in /f/deeplearning
geoffroy_lesage OP t1_jdbz94x wrote
Reply to comment by Jaffa6 in Question for use of ML in adaptive authentication by geoffroy_lesage
I see. I like the black-box aspect, but I understand it makes things difficult when we need consistent output... What kind of "key" would you be able to generate, and with which models? What about mathematical or statistical ways to reduce the output and make it more stable? This might be a dumb idea, but imagine the model spits out floats: we get 1.1 but expect 1, so we could apply rounding to get integers, in which case we'd more often get 1... or we could do multiple runs and average them out... or use fancier math like finite fields, modular arithmetic, different number bases, etc.
And yeah, I get that we could use something that is on-device, but unfortunately that's not something I want to rely on... nothing that is hard-coded anywhere can be used.
The goal here is to generate this key and use it to encrypt/decrypt stuff. I never want to store this key anywhere; it needs to be generated from the user data fed into the model.
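The rounding-then-hashing idea from the comment above could be sketched like this. Everything here is hypothetical (`derive_key`, the quantization scheme, the rounding precision); a real key-from-noisy-data system would need an error-tolerant construction such as a fuzzy extractor, since plain rounding fails whenever a value lands near a rounding boundary.

```python
import hashlib

def derive_key(model_outputs, decimals=0):
    """Quantize noisy model outputs so small run-to-run jitter collapses
    to the same values, then hash the result into a fixed-length key.

    decimals: rounding precision. Coarser rounding gives more stability
    but less entropy in the derived key.
    """
    quantized = tuple(round(x, decimals) for x in model_outputs)
    return hashlib.sha256(repr(quantized).encode()).hexdigest()

# Two noisy runs whose values round to the same integers
# produce the same key:
k1 = derive_key([1.1, 2.9, 0.2])
k2 = derive_key([0.8, 3.1, 0.3])
assert k1 == k2  # both quantize to (1.0, 3.0, 0.0)
```

Note the failure mode this sketch glosses over: an output of 0.49 vs 0.51 rounds to different integers, so the "same" user can still get two different keys.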
Jaffa6 t1_jdbysmg wrote
Broadly speaking, machine learning models are huge black boxes that you can't really explain the behaviour of.
It's going to be very difficult (if it's even possible) to guarantee that a certain user's behaviour will create a unique key because it would really just be multiplying and adding some different numbers (which come from the factors you mentioned).
You can certainly generate a key, though.
Much simpler is, as someone else suggested, just using something like the device's MAC address. But then you'll run into issues with them being locked out if they change address.
geoffroy_lesage OP t1_jdbwzjj wrote
Reply to comment by the_Wallie in Question for use of ML in adaptive authentication by geoffroy_lesage
I'm not quite sure I understand: "Some unique fingerprint has to come from some sort of behavioral or bio data that can reasonably be assumed to uniquely identify a user"
--> You mean "you have to get something unique from the user directly"? Because there are many ways to acquire unique things about a user... how they type on a keyboard is a very distinctive one, for example, and there are many metrics that can be measured to capture that:
- Pressure, duration of press, duration between presses, speed
- Accuracy of presses
- Use of backspace, use of auto-correct
- Use of emojis, punctuation
- Length of phrases, length of text, etc.
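The metrics listed above could be turned into a feature vector along these lines. The event format and the specific features are illustrative, not from any real keystroke-dynamics library:

```python
def keystroke_features(events):
    """events: list of (key, press_time, release_time) tuples in seconds.

    Returns a small feature dict covering hold times, inter-key gaps,
    backspace usage, and typing speed.
    """
    durations = [release - press for _, press, release in events]
    # Gap between releasing one key and pressing the next:
    gaps = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    backspaces = sum(1 for key, _, _ in events if key == "backspace")
    total_time = events[-1][2] - events[0][1]
    return {
        "mean_hold": sum(durations) / len(durations),
        "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
        "backspace_rate": backspaces / len(events),
        "chars_per_sec": len(events) / total_time,
    }

# Example: three keypresses, one of them a backspace.
events = [("h", 0.0, 0.1), ("i", 0.3, 0.4), ("backspace", 0.6, 0.7)]
features = keystroke_features(events)
```

A real system would also need to handle pressure and touch-accuracy signals, which depend on platform APIs not shown here.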
the_Wallie t1_jdbwjfb wrote
Reply to comment by geoffroy_lesage in Question for use of ML in adaptive authentication by geoffroy_lesage
it depends on what your users are doing in your app. Some unique fingerprint has to come from some sort of behavioral or bio data that can reasonably be assumed to uniquely identify a user. Encoding data in some meaningful way (ml or otherwise) can only happen after you choose what you're encoding.
geoffroy_lesage OP t1_jdbw2s7 wrote
Reply to comment by the_Wallie in Question for use of ML in adaptive authentication by geoffroy_lesage
Yea no worries, I can authenticate them differently at first and start tracking data for a while before it becomes important to have this key generated.
But the process you're describing is just to identify users individually using a standard test, not to generate a unique key per user... Is there some way I could achieve that: generating a unique key from a machine learning model?
the_Wallie t1_jdbvvue wrote
Reply to comment by geoffroy_lesage in Question for use of ML in adaptive authentication by geoffroy_lesage
I would probably ask the user to draw a particular shape or set of shapes with their finger and record where they start and how they deviate from the perfect outline of that shape. Then, using a vector that represents those deviations over time, their speed, the total time to completion, and the starting position, build a database of users and loosely identify a user with a nearest-neighbour algorithm, or with a deep classifier that has the users as its output layer. What's challenging is that you need to start collecting the data before you can apply it to logins, unless you already have a good proxy task in the context of your app that doesn't require logins (or that you can run after authenticating users by different means).
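The nearest-neighbour lookup described above could be sketched as follows. The profile names and the three-element feature layout (start position, mean deviation, completion time) are made up for illustration; a real enrolment database would store many samples per user rather than one vector:

```python
import math

def nearest_user(profiles, sample):
    """1-nearest-neighbour lookup: return the enrolled user whose stored
    feature vector is closest (Euclidean distance) to the new sample.

    profiles: {user_id: feature_vector}
    sample:   feature vector from a new drawing attempt
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(profiles, key=lambda uid: dist(profiles[uid], sample))

profiles = {
    "alice": [0.1, 0.9, 1.2],  # e.g. [start_x, mean_deviation, seconds]
    "bob":   [0.7, 0.2, 2.5],
}
nearest_user(profiles, [0.15, 0.85, 1.3])  # → "alice"
```

In practice you would also want a distance threshold, so a sample far from every enrolled user is rejected instead of being assigned to whoever happens to be closest.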
geoffroy_lesage OP t1_jdbv9zh wrote
Reply to comment by the_Wallie in Question for use of ML in adaptive authentication by geoffroy_lesage
I see, I appreciate the advice. That being said, if the use case made sense for it and it were possible to do, how would you do it, just as a thought experiment, if you had enough data and power?
the_Wallie t1_jdbuwai wrote
Reply to comment by geoffroy_lesage in Question for use of ML in adaptive authentication by geoffroy_lesage
Then either make it a 'stay logged in' experience or use bio info (facial recognition, fingerprints), depending on your security requirements.
Custom machine learning models are difficult to maintain and integrate compared to out-of-the-box standard IT solutions and API integrations with external (ML) services. We should really only apply them when it makes sense (i.e., when we have an important, complex problem we can't navigate with simple heuristics, and a large amount of relevant data).
geoffroy_lesage OP t1_jdbuonn wrote
Reply to comment by the_Wallie in Question for use of ML in adaptive authentication by geoffroy_lesage
Right, yes, it will require their consent, but this information stays on the device since the ML happens on-device as well. The full picture is that I'm trying to make a passwordless experience where the key generated by the ML model is their password and is used to encrypt and decrypt some data on the device as well (: Idk if that makes sense.
the_Wallie t1_jdbujh4 wrote
Reply to comment by geoffroy_lesage in Question for use of ML in adaptive authentication by geoffroy_lesage
OK, I understand the 'what' now, but not the 'why'. If you're processing personal information to recognize users, that requires their consent. If you have their consent, and we're talking about an installed app on an iPhone or Android, why not just use the user ID or device ID as the identifier? No ML required. Are you trying to identify different users of the same device?
geoffroy_lesage OP t1_jdbu15i wrote
Reply to comment by the_Wallie in Question for use of ML in adaptive authentication by geoffroy_lesage
Oh my bad, I've just re-written the post for more user-story type explanation...
the_Wallie t1_jdbtm3l wrote
... What? I read this twice and still had no idea what it is you're trying to achieve or why. Could you try to explain it as a user story?
Stunning-Butterfly89 t1_jdbqn4s wrote
Reply to Anyone have any good alternatives to Paperspace? My account got closed for unauthorized access. by FermatsLastAccount
You can try JarvisLabs. You only pay for storage (which is pretty cheap) when the VM is off, and you can run VMs as long as you want, with no 6-hour limit. I absolutely love it. It offers 6000s, A100s, etc.
Note: I am not linked with them
GrandDemand t1_jdbjfmw wrote
Palit is a good brand from what I've heard! And yeah, I'm not as familiar with it, but googling the same thing regarding power limiting/undervolting the 4090 will bring up similar results about optimizing its efficiency. You can likely get it to not exceed 350W under load, and likely even see a performance boost rather than a regression. The 4090 is incredibly efficient given its performance, and it also draws way less power when idling than the 3090 or 3090 Ti. I had no idea the price gap was that small, haha; otherwise I would've recommended the 4090 straight away, especially given the energy price increase you've more than likely experienced.
Numerous_Talk7940 OP t1_jdbifz9 wrote
Reply to comment by GrandDemand in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
I see, thanks for your help! The 4090 is from Palit, though, while the used 3090/3090 Ti cards are from MSI or Gigabyte, which are more expensive in general. For my use case, however, the brand should not matter. Does the undervolting advice still hold for the 4090?
GrandDemand t1_jdbi5kd wrote
Reply to comment by Numerous_Talk7940 in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
I'd say in that case just go with a new 4090, unless you can get a used 3090 for half that, or a used 3090 Ti for a little over half. I'm surprised the difference is that small; I guess I'm accustomed to the used market in the US, which I imagine is quite a bit larger.
fhadley t1_jdbbak8 wrote
Reply to Anyone have any good alternatives to Paperspace? My account got closed for unauthorized access. by FermatsLastAccount
Sorry for not coming here with an actual answer but why would they ban stockfish?
Numerous_Talk7940 OP t1_jdb6dt1 wrote
Reply to comment by GrandDemand in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
3090: 900-1000, 3090 Ti: 1200-1300, 4090 new: 1750. Which is why I still considered a new 4090, because the price gaps between those three options seem odd.
GrandDemand t1_jd9w67s wrote
Reply to comment by GrandDemand in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
Although since you are in Europe, what does pricing look like roughly for used 3090s/3090Tis versus new 4090s?
GrandDemand t1_jd9w0n4 wrote
Reply to comment by Numerous_Talk7940 in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
Yes, that's likely better actually, since it's a much newer card (less likely to have been mined on). Just make sure you look into tuning it for efficiency, as the card is designed to run at 450W (which is insane). I can't direct you to any guides for the 3090 Ti specifically, but I'd just google "3090 Ti undervolt". The 3090 should probably be undervolted too; you really don't need these cards hitting their power limits of 450W and 350W respectively, and tuning them to a more reasonable 325-350W and 280-300W makes way more sense.
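For the power-capping part of the advice above (as opposed to true voltage-curve undervolting, which needs a tool like MSI Afterburner), `nvidia-smi` can set a board power limit directly. A minimal sketch that builds the command, assuming a Linux/Windows box with the NVIDIA driver installed and admin rights:

```python
def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi invocation that caps GPU power draw.

    `nvidia-smi -i <index> -pl <watts>` sets the board power limit;
    run the returned list with subprocess.run() as root/administrator.
    Note this caps power, it does not undervolt the voltage curve.
    """
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]

# e.g. cap GPU 0 (a 3090) at 300W, per the 280-300W suggestion above:
cmd = power_limit_command(0, 300)
```

The driver clamps requested values to the card's supported min/max range, so an out-of-range wattage fails rather than being applied.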
GrapplingHobbit t1_jd9c32x wrote
Reply to comment by viperx7 in Alpaca Turbo : A chat interface to interact with alpaca models with history and context by viperx7
Also... it doesn't seem to generate at all without an internet connection. Is that the expected behaviour?
Numerous_Talk7940 OP t1_jd990x3 wrote
Reply to comment by GrandDemand in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
I see, and I took a look; the 3090 Ti is sometimes available for only 100 euros more (used, that is). Is that something I could consider as well?
GrapplingHobbit t1_jd98i9i wrote
Reply to comment by viperx7 in Alpaca Turbo : A chat interface to interact with alpaca models with history and context by viperx7
Hoping you manage to figure out what is slowing things down on Windows! In the direct command-line interface, the 7B model's responses are almost instant for me, but they take around 2 minutes via Alpaca-Turbo, which is a shame, because the ability to edit the persona and have memory of the conversation would be great.
GrandDemand t1_jd96065 wrote
Reply to comment by Numerous_Talk7940 in How noticeable is the difference training a model 4080 vs 4090 by Numerous_Talk7940
No. I'd get a used 3090. Save the rest of your money for when you have more experience and a better grasp of the kinds of problems you'd like to solve and the hardware needed to run the corresponding models. Then you'll realize either that your hardware is sufficient as-is (with a 3090), that a 4090 would actually benefit you, that a card with 48GB of VRAM is essential (i.e. you need either an Ada Titan, if it comes out, or 2x 3090s), or that it's way too expensive to run on consumer hardware and you should just use cloud GPU instances with A100s or H100s instead. But the 3090 will be a great card for now, and a used one in great condition (sometimes even open-box or with an active warranty) can easily be found on the hardwareswap subreddit for $800 or even less.
Jaffa6 t1_jdbzs22 wrote
Reply to comment by geoffroy_lesage in Question for use of ML in adaptive authentication by geoffroy_lesage
This is unfortunately going to be a bit harsh, but it's worth knowing sooner rather than later: cryptography (which this essentially is) is a VERY difficult field, and building a secure encryption scheme is notoriously hard to get right.
Wanting to encrypt and decrypt without the key being stored anywhere is an admirable goal, but this is certainly not the way I'd recommend doing it and it's not likely to be secure this way.
If you're dead set on doing it like this, then pretty much any neural network can do it. You're just inputting numbers and wanting numbers out.
I guess your training data would be many sets of behavioural data from each user (say, at least 50 users), and you'd train the model to predict the user from that data, while heavily penalising it if it matches another user too.
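A toy version of that training setup might look like this. The synthetic clusters stand in for per-user behavioural data, and the model is the simplest possible "classifier with the users as its output layer": a single softmax layer trained by gradient descent. All numbers are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for behavioural data: each user's samples cluster
# around a per-user centre in feature space.
n_users, n_features, samples_per_user = 3, 4, 30
centres = rng.normal(size=(n_users, n_features))
X = np.vstack([c + 0.1 * rng.normal(size=(samples_per_user, n_features))
               for c in centres])
y = np.repeat(np.arange(n_users), samples_per_user)

# Softmax classifier: one weight column per user, trained with
# cross-entropy gradient descent.
W = np.zeros((n_features, n_users))
onehot = np.eye(n_users)[y]
for _ in range(300):
    logits = X @ W
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    W -= 0.5 * X.T @ (probs - onehot) / len(X)

accuracy = (np.argmax(X @ W, axis=1) == y).mean()
```

Even when this classifies users well, note the point from the comments above: high classification accuracy does not make the model's raw outputs stable or unique enough to serve as a cryptographic key.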