Recent comments in /f/deeplearning

suflaj t1_jc6n8v1 wrote

Just apply an aggregation function over the 0th axis. This can be sum, mean, min, max, whatever. Sum is best, since your loss function will naturally regularise the weights to be smaller and it's the easiest to differentiate. That's for the case where you know you have 18 images; for the scenario where you'll have a variable number of images, use mean. The rest are non-differentiable and might give you problems.

If you use sum, make sure you do gradient clipping so the gradients don't explode in the beginning.
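A minimal PyTorch sketch of what this could look like (the 128-dim feature size and the stand-in loss are illustrative assumptions, not from the thread):

```python
import torch

# Hypothetical setup: 18 per-image feature vectors, aggregated over the
# 0th (image) axis as the comment suggests.
features = torch.randn(18, 128, requires_grad=True)

pooled_sum = features.sum(dim=0)    # fixed number of images
pooled_mean = features.mean(dim=0)  # variable number of images

loss = pooled_sum.pow(2).sum()  # stand-in loss, just for illustration
loss.backward()

# With sum pooling, clip gradients so they don't explode early in training
torch.nn.utils.clip_grad_norm_([features], max_norm=1.0)
```

Note that mean pooling is just sum pooling scaled by the (possibly variable) image count, which is why it behaves better when the count changes between samples.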

2

Kuchenkiller t1_jc4t81a wrote

If you have enough images of defects but are just lacking the labelling (probably easier to come by), one approach is to generate random morphological structures on your bottles (e.g. just some random circles and ellipses) and then apply CycleGAN or CUT to translate from this "segmented" image domain to the domain of real images. As said, you still need a lot of data, but you don't need labelling. Just generating useful data from noise (the basic GAN idea) can work in theory but is extremely hard to train. I had way more success with the domain transfer approach (my case was in medical imaging).
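A sketch of the first step described above, generating random elliptical blobs as a fake "segmented defect" mask (the 90x90 size and blob count are assumptions; the mask would then go through CycleGAN/CUT to become a realistic image):

```python
import numpy as np

def random_defect_mask(h=90, w=90, n_blobs=3, rng=None):
    """Draw a few random filled ellipses as a synthetic defect mask."""
    rng = rng or np.random.default_rng()
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.zeros((h, w), dtype=bool)
    for _ in range(n_blobs):
        cy, cx = rng.integers(0, h), rng.integers(0, w)   # random centre
        ry, rx = rng.integers(3, 12), rng.integers(3, 12) # random radii
        mask |= ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    return mask

mask = random_defect_mask(rng=np.random.default_rng(0))
```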

2

ats678 t1_jc4lem7 wrote

In the same fashion as LLMs, I think Large Vision Models and their multimodal intersections with LLMs are the next big thing.

Apart from that, I think techniques such as model quantisation and model distillation are going to become extremely relevant in the short term. If the trend of making models larger keeps running at this pace, it will be necessary to find solutions to run them without a ridiculous amount of resources. In particular, I can see people pre-training large multimodal models and then distilling them for specific tasks.
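The distillation idea mentioned here can be sketched with the classic softened-KL objective (a generic Hinton-style loss, not anything specific from this thread; the temperature T and shapes are assumptions):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    distributions; the T*T factor keeps gradient scale comparable."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

student = torch.randn(8, 10)  # small student's logits (assumed shapes)
teacher = torch.randn(8, 10)  # large pre-trained teacher's logits
loss = distillation_loss(student, teacher)
```

In the scenario the comment describes, the teacher would be the large pre-trained multimodal model and the student a small task-specific one.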

1

gradientic t1_jc430rz wrote

Not really a direct answer to your question, but the general problem you are trying to solve is called image anomaly detection. There are well-known approaches to it, some of them unsupervised (assuming you have a significant dataset of images without anomalies: learn the inliers, look for the outliers). Check https://towardsdatascience.com/an-effective-approach-for-image-anomaly-detection-7b1d08a9935b for some ideas and pointers (sorry if I'm pointing out obvious things).
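One common unsupervised variant of the "learn inliers, look for outliers" idea is reconstruction-based scoring; a minimal sketch, assuming 90x90 grayscale images and an illustrative 3-sigma threshold rule (both assumptions, and the autoencoder here is untrained, just to show the mechanics):

```python
import torch
import torch.nn as nn

# Tiny autoencoder; in practice you would train it on defect-free images
# only, so anomalies reconstruct poorly and get high error scores.
autoencoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(90 * 90, 64), nn.ReLU(),
    nn.Linear(64, 90 * 90),
    nn.Unflatten(1, (90, 90)),
)

def anomaly_scores(images):  # images: (N, 90, 90)
    with torch.no_grad():
        recon = autoencoder(images)
    return ((images - recon) ** 2).mean(dim=(1, 2))  # per-image MSE

normal = torch.rand(4, 90, 90)
scores = anomaly_scores(normal)
# Example decision rule: flag anything above mean + 3*std on clean data
threshold = scores.mean() + 3 * scores.std()
```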

5

GufyTheLire t1_jc3evhf wrote

I've tried that once. Asked ChatGPT why the L0, L1, …, Ln norms, seemingly so different, are all named in a similar way. It correctly listed the norms' definitions and use cases, but failed to generalise the concept and made up some bullshit reason for the naming. It took me some time down the Wikipedia and Google rabbit hole to find out about Lp spaces and to substitute different p values into the definition of the p-norm to get the real reason.
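The "real reason" in code: every Lp norm is the same formula, ||x||_p = (Σ|x_i|^p)^(1/p), with a different p plugged in, and L-infinity is its limit as p grows:

```python
import numpy as np

def p_norm(x, p):
    """The general p-norm; L1, L2, etc. are just special cases of p."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
print(p_norm(x, 1))    # 7.0 -- L1: sum of absolute values
print(p_norm(x, 2))    # 5.0 -- L2: Euclidean length
print(p_norm(x, 100))  # ~4.0 -- large p approaches max|x_i|, i.e. L-infinity
```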

1

notgettingfined t1_jc33bc3 wrote

Reply to comment by grid_world in Image reconstruction by grid_world

I don’t understand how that prevents learning from the individual images. I think you would need to explain the problem better. You could also stack all the images together as channels, so you would have a 36x90x90 input and then a 3x90x90 output
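A sketch of that channel-stacking idea (the 12 RGB images used here to get 36 channels are an assumption; the thread doesn't pin down the exact count):

```python
import torch

images = torch.rand(12, 3, 90, 90)    # 12 separate RGB 90x90 images (assumed)
stacked = images.reshape(-1, 90, 90)  # concat along channels -> (36, 90, 90)
batch = stacked.unsqueeze(0)          # add batch dim -> (1, 36, 90, 90)

# A conv layer then maps the 36 stacked channels to one 3-channel output
conv = torch.nn.Conv2d(in_channels=36, out_channels=3,
                       kernel_size=3, padding=1)
out = conv(batch)                     # -> (1, 3, 90, 90)
```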

2