Recent comments in /f/deeplearning

Jaffa6 t1_jdbzs22 wrote

This is unfortunately going to be a bit harsh, but it's worth knowing sooner rather than later: cryptography (which this essentially is) is a VERY difficult field, and creating a secure encryption scheme is notoriously hard to get right.

Wanting to encrypt and decrypt without the key being stored anywhere is an admirable goal, but this is certainly not the way I'd recommend doing it and it's not likely to be secure this way.

If you're dead set on doing it like this, then pretty much any neural network can do it. You're just inputting numbers and wanting numbers out.

I guess your training data would be many sets of behavioural data from each user (say at least 50 users), and you'd train it to predict the user from that data while heavily penalising it for matching another user too.
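A minimal sketch of that setup, using synthetic "behavioural" feature vectors and a plain softmax classifier (whose cross-entropy loss already rewards matching the right user and penalises matching anyone else). All the numbers and cluster shapes here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, samples_per_user, n_features = 5, 50, 8

# Synthetic "behavioural" data: each user is a cluster around their own mean.
centers = rng.normal(0, 2, (n_users, n_features))
X = np.vstack([c + rng.normal(0, 0.3, (samples_per_user, n_features)) for c in centers])
y = np.repeat(np.arange(n_users), samples_per_user)

# One-layer softmax classifier trained with cross-entropy.
W = np.zeros((n_features, n_users))
b = np.zeros(n_users)
onehot = np.eye(n_users)[y]
for _ in range(300):
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = (p - onehot) / len(X)      # gradient of mean cross-entropy
    W -= 1.0 * X.T @ grad
    b -= 1.0 * grad.sum(axis=0)

acc = (np.argmax(X @ W + b, axis=1) == y).mean()
```

With clusters this well separated a linear classifier identifies the users easily; the hard part in practice is that real behavioural data is far noisier and drifts over time.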

1

geoffroy_lesage OP t1_jdbz94x wrote

I see, I like the black box aspect but I understand it makes things difficult when we need consistent output... What kind of "key" would you be able to generate, and with what models? What about mathematical or statistical ways to reduce the output and make it more stable? This might be a dumb idea, but imagine the model spits out floats: if we get 1.1 but expect 1, we could apply rounding to get integers, in which case we'd more often get 1... Or we could do multiple runs and average them out, or use fancier math: finite fields, modular arithmetic, different number bases, etc...
And yeah, I get that we could use something that's on the device, but unfortunately that's not something I want to rely on... nothing that is hard-coded anywhere can be used.
The goal here is to generate this key and use it to encrypt/decrypt stuff. I never want to store this key anywhere; it needs to be generated from the user data fed into the model.
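The rounding idea above can be sketched as quantise-then-hash: snap each float to a coarse grid so small run-to-run noise rounds away, then hash the result into a fixed-size key. This is only a sketch (the step size and the SHA-256 choice are my assumptions, and values that land near a grid boundary can still flip, which is exactly the problem fuzzy extractors exist to solve):

```python
import hashlib

def derive_key(features, step=0.5):
    # Snap each float to a coarse grid so small noise rounds away...
    quantized = tuple(round(f / step) for f in features)
    # ...then hash the stable integers into a fixed-size hex key.
    return hashlib.sha256(repr(quantized).encode()).hexdigest()

# Two noisy runs of the same hypothetical model land on the same key.
key_a = derive_key([1.1, 2.4, -0.7])
key_b = derive_key([0.9, 2.6, -0.6])
```

Both inputs quantise to the same integers, so both runs produce an identical key even though no key is ever stored.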

2

Jaffa6 t1_jdbysmg wrote

Broadly speaking, machine learning models are huge black boxes that you can't really explain the behaviour of.

It's going to be very difficult (if it's even possible) to guarantee that a certain user's behaviour will create a unique key because it would really just be multiplying and adding some different numbers (which come from the factors you mentioned).

You can certainly generate a key, though.

A much simpler approach, as someone else suggested, is just using something like the device's MAC address. But then you'll run into issues with users being locked out if the address changes.

1

geoffroy_lesage OP t1_jdbwzjj wrote

I'm not quite sure I understand: "Some unique fingerprint has to come from some sort of behavioral or bio data that can reasonably be assumed to uniquely identify a user"
--> You mean "you have to get something unique from the user directly"? Because there are many ways to acquire unique signals about a user... how they type words on a keyboard is a very distinctive one, for example, and there are many metrics that can be measured to figure that out:
- Pressure, Duration of press, Duration between presses, Speed
- Accuracy of presses
- use of backspace, use of auto-correct
- use of emojis, punctuation
- length of phrases, length of text, etc
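For what it's worth, most of those metrics fall out of a simple event log of (key, press time, release time) tuples. A toy feature extractor (the event format and feature names here are my own invention):

```python
def keystroke_features(events):
    """events: list of (key, press_time, release_time) tuples, in order."""
    dwells = [release - press for _, press, release in events]   # duration of press
    flights = [events[i + 1][1] - events[i][2]                   # gap between presses
               for i in range(len(events) - 1)]
    return {
        "mean_dwell": sum(dwells) / len(dwells),
        "mean_flight": sum(flights) / len(flights) if flights else 0.0,
        "backspaces": sum(1 for k, _, _ in events if k == "BACKSPACE"),
        "keys_per_sec": len(events) / (events[-1][2] - events[0][1]),
    }

profile = keystroke_features([
    ("h", 0.00, 0.08), ("i", 0.20, 0.27), ("BACKSPACE", 0.50, 0.55),
])
```

A per-user profile like this is what you'd feed into whatever model or matching scheme sits behind the key generation.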

1

geoffroy_lesage OP t1_jdbw2s7 wrote

Yea no worries, I can authenticate them differently at first and start tracking data for a while before it becomes important to have this key generated.

But this process you are describing is just to identify users individually using a standard test, not to generate a unique key per user... Is there some way I could achieve this? Generating a unique key from a machine learning model?

1

the_Wallie t1_jdbvvue wrote

I would probably ask the user to draw a particular shape or set of shapes with their finger, and record where they start and how they deviate from the perfect outline of that shape. Then, using a vector that represents those deviations over time, their speed, the total time to completion, and the starting position, I'd build a database of users and loosely identify a user with a nearest-neighbour algorithm, or with a deep classifier that has the users as its output layer. What's challenging is that you need to start collecting the data before you can apply it to logins, unless you already have a good proxy task in the context of your app that doesn't require logins (or that you can run after authenticating users by different means).
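The nearest-neighbour half of that could be as small as this sketch: each enrolled user gets a stored deviation vector, and a login attempt is matched to the closest one within some tolerance (the vectors, names, and threshold are invented placeholders):

```python
import math

def nearest_user(db, attempt, threshold=1.0):
    """db: user -> enrolled deviation vector; returns best match or None."""
    best_user, best_dist = None, float("inf")
    for user, vec in db.items():
        dist = math.dist(vec, attempt)   # Euclidean distance
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user if best_dist <= threshold else None

db = {
    "alice": [0.1, 0.9, 0.3],
    "bob":   [0.8, 0.2, 0.7],
}
match = nearest_user(db, [0.2, 0.8, 0.4])
```

The threshold is what keeps an unenrolled stranger from simply matching whoever happens to be closest.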

1

the_Wallie t1_jdbuwai wrote

Then either make it a 'stay logged in' experience or use bio info (facial recognition, fingerprints), depending on your security requirements.

Custom machine learning models are difficult to maintain and integrate compared to out-of-the-box standard IT solutions and API integrations with external (ML) services. We should really only apply them when it makes sense (i.e. when we have an important, complex problem we can't handle with simple heuristics, plus a large amount of relevant data).

1

geoffroy_lesage OP t1_jdbuonn wrote

Right, yes, it will require their consent, but this information stays on the device since the ML happens on-device as well. The full picture is that I'm trying to make a passwordless experience where the key generated by the ML model is their password and is used to encrypt and decrypt some data on the device as well (: Idk if that makes sense
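As a sketch of that last step: once the model's output is stable, it can be hashed into a key and used for on-device encryption. This toy version uses a SHA-256 counter-mode keystream purely for illustration; a real app should use a vetted AEAD (e.g. AES-GCM or ChaCha20-Poly1305 via the platform's crypto library), and the input string standing in for the model output is a placeholder:

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # SHA-256 in counter mode as a toy keystream; NOT for production use.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice decrypts.
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = hashlib.sha256(b"stable-model-output-placeholder").digest()
ct = xor_crypt(key, b"secret note")
pt = xor_crypt(key, ct)
```

The key only ever exists transiently in memory, which is the property being asked for; the open question remains making the model output stable enough to regenerate it.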

1

the_Wallie t1_jdbujh4 wrote

OK, I understand the 'what' now, but not the 'why'. If you're processing personal information to recognize users, that requires their consent. If you have their consent, and we're talking about an installed app on an iPhone or Android, why not just use the user ID or device ID as the identifier? No ML required. Are you trying to identify different users of the same device?

2

GrandDemand t1_jdbjfmw wrote

Palit is a good brand from what I've heard! And yeah, I'm not as familiar with it, but googling the same thing regarding power limiting/undervolting the 4090 will bring up similar results about optimizing its efficiency. You can likely get it to stay under 350W under load, and likely even see a performance boost rather than a regression. The 4090 is incredibly efficient given its performance, and it also draws way less power when idling than the 3090 or 3090Ti. I had no idea the price gap was that small haha, otherwise I would've recommended the 4090 straight away, especially given the increase in energy prices you've more than likely experienced.

1

GrandDemand t1_jd9w0n4 wrote

Yes, that's likely better actually, since it's a much newer card (less likely to have been mined on). Just make sure you look into tuning it for efficiency, as the card is designed to run at 450W (which is insane). I can't direct you to any guides for the 3090Ti specifically, but I'd just google "3090Ti undervolt". The 3090 should probably be undervolted too; you really don't need these cards to be hitting their power limits of 450W and 350W respectively. Tuning them to a more reasonable 325-350W and 280-300W makes way more sense.

0

GrapplingHobbit t1_jd98i9i wrote

Hoping you manage to figure out what is slowing things down on Windows! In the direct command-line interface on the 7B model, the responses are almost instant for me, but they take around 2 minutes via Alpaca-Turbo, which is a shame because the ability to edit the persona and have memory of the conversation would be great.

1

GrandDemand t1_jd96065 wrote

No. I'd get a used 3090. Save the rest of your money for when you have more experience and a better grasp of the kinds of problems you'd like to solve and the hardware needed to run the corresponding models. Then you'll realize either that your hardware is sufficient as is (with a 3090), that a 4090 would actually benefit you, that a card with 48GB of VRAM is essential (i.e. you need either an Ada Titan, if it comes out, or 2x 3090s), or that it's way too expensive to run on consumer hardware and you should just use cloud GPU instances with A100s or H100s instead. But the 3090 will be a great card for now, and a used one in great condition (sometimes even open box or with an active warranty) can be found easily on the hardwareswap subreddit for $800 or even less.

0