master3243 t1_iyxcdtw wrote
Reply to comment by deepbootygame in [D] NeurIPS 2022 Outstanding Paper modified results significantly in the camera ready by Even_Stay3387
It's very easy to point and criticize, but what exactly do you propose be done in this type of situation?
Ban the authors because they acknowledged and rectified their error? Good job, you've just guaranteed that no author will ever speak up about a mistake they legitimately made.
Not to mention that their updated results are still a massive improvement.
master3243 t1_iyxbtij wrote
Reply to comment by lemlo100 in [D] NeurIPS 2022 Outstanding Paper modified results significantly in the camera ready by Even_Stay3387
As someone who mainly researches AI but previously worked in software engineering, I have never seen AI and unit testing together in the same room... sadly
master3243 t1_iw1x35r wrote
Reply to comment by VinnyVeritas in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
> They theoretically show that, different from naive identity mapping, their initialization methods can avoid training degeneracy when the network dimension increases. In addition, they empirically show that they can achieve better performance than random initializations on image classification tasks, such as CIFAR-10 and ImageNet. They also show some nice properties of the model trained by their initialization methods, such as low-rank and sparse solutions.
master3243 t1_iw1mpgb wrote
Reply to comment by bluevase1029 in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
Definitely, if each person has a completely different setup.
But that's why we containerize our setups and use a shared environment configuration.
master3243 t1_iw1hggt wrote
Reply to comment by VinnyVeritas in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
The problem is not random variance between trained models.
Check out the abstract, it answers why this work is useful.
master3243 t1_iw1h2h7 wrote
Reply to comment by elcric_krej in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
> potentially removes a lot of random variance from the process of training
You don't need the results of this paper for that.
One of my teams had a pipeline where every single script would initialize the seed of all random number generators (numpy, torch, Python's random) to 42.
This essentially removed non-machine-precision stochasticity between different training iterations with the same inputs.
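A minimal sketch of that kind of helper (stdlib only; our actual helper also seeded numpy via `np.random.seed(seed)` and torch via `torch.manual_seed(seed)`, omitted here to keep the sketch dependency-free):

```python
import random

def seed_everything(seed: int = 42) -> None:
    # Reset the RNG so every script starts from the same state.
    # The real pipeline also called np.random.seed(seed) and
    # torch.manual_seed(seed) at this point.
    random.seed(seed)

seed_everything()
a = [random.random() for _ in range(3)]
seed_everything()
b = [random.random() for _ in range(3)]  # identical draws: a == b
```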
master3243 t1_iv2h0c5 wrote
Reply to comment by yaosio in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
How did you literally just ignore the two articles above that show the US copyright office granting the copyright?
master3243 t1_iuzrd2n wrote
Reply to comment by yaosio in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
> In the US AI created art can't be covered by copyright
What? Literally the answer was one google search away
Kashtanova obtained a US copyright registration for art created with Midjourney and compiled into an 18-page comic.
Source:
Artist receives first known US copyright registration for latent diffusion AI art
master3243 t1_iuz0d6h wrote
Reply to [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
Do they implicitly mean DALLE 2 or do they actually mean 1?
I can't tell anymore, and I feel it's definitely possible they're trying to push the generic name "DALL-E" to refer to their newest model.
I still sometimes jokingly refer to it as "unCLIP" as that is what they called their model in the original paper.
master3243 t1_iutotkj wrote
Reply to comment by DaltonSC2 in [D] Graph neural networks by No_Captain_856
It's not just about learning different categories.
Imagine you're trying to study a social network of people, take Twitter users for example. The individual nodes would probably be the users and the data associated with them (past tweets, bio, etc.), while the edges would be the connections between users that you care about (e.g. A follows B, A tweeted at B, A retweeted a post by B, etc.). You can see how each of those connections carries information beyond a binary yes or no (e.g. When did A follow B? How many of B's previous tweets did A see? How many followers did B have at the time? How many tweets did B have at that time?)
You can see how an individual edge can carry an extremely rich feature vector between nodes A and B, where those features are separate from the features belonging to either node A or B themselves. Thus, it's possible that a binary adjacency matrix would not be enough to capture the intrinsic properties of that system.
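As a toy illustration (plain Python, with made-up feature names), compare what a feature-rich edge stores versus what a binary adjacency matrix keeps:

```python
# Node features: data attached to each user themselves.
node_features = {
    "A": {"num_past_tweets": 1520, "bio_length": 88},
    "B": {"num_past_tweets": 430, "bio_length": 40},
}

# Edge features: data attached to the *relationship*, keyed by
# (source, target). All field names here are illustrative.
edge_features = {
    ("A", "B"): {
        "follows": 1,
        "days_since_follow": 210,
        "tweets_of_B_seen_by_A": 57,
        "followers_of_B_at_follow_time": 950,
    },
}

# A binary adjacency matrix collapses that whole vector into one bit.
adjacency_bit = int(("A", "B") in edge_features)
```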
master3243 t1_iujfxwi wrote
Reply to comment by boyetosekuji in [News] The Stack: 3 TB of permissively licensed source code - Hugging Face and ServiceNow Research Denis Kocetkov et al 2022 by Singularian2501
very many and very much
master3243 t1_itwsc56 wrote
Reply to comment by Jean-Porte in [D]Cheating in AAAI 2023 rebuttal by [deleted]
It might also be informative to know the details of the communication. No matter what, it's wrong, and I believe the paper should be rejected. But the repercussions for the reviewer might depend on their intentions, which should be inferable from the details of the communication.
master3243 t1_itwrjl8 wrote
Reply to comment by DeepGamingAI in [D]Cheating in AAAI 2023 rebuttal by [deleted]
Most likely a reviewer or meta-reviewer is collaborating with the author
master3243 t1_it65913 wrote
Reply to comment by 3kberockin in [D] AAAI 2023 Reviews by CauseRevolutionary59
A single reviewer on a paper?! I thought the minimum was 2.
master3243 t1_it655sq wrote
Reply to comment by RedTachyon in [D] AAAI 2023 Reviews by CauseRevolutionary59
Is there a "spin again" button?
I think you want (7, 7, 7) for the award.
master3243 t1_isv7ezg wrote
Reply to comment by Cheap_Meeting in [D] How frustrating are the ML interviews these days!!! TOP 3% interview joke by Mogady
People in ML (of all people) should know that when you rank on a crappy metric, the top 3 models are probably crappy models that generalize poorly to real-world data.
master3243 t1_iss7aiv wrote
Reply to comment by Red-Portal in [D] Machine Learning conferences/journals with a mathematical slant? by vajraadhvan
I remember trying to publish a theory paper (statistical learning theory) at ICML and getting criticized by two reviewers who complained that the paper had no experimental justification (despite it being a pure information-theoretic lower bound on any learned algorithm, which is impossible to justify experimentally??). My professor and I doubt they understood what was happening.
The third reviewer was extremely knowledgeable in this area and we truly appreciated their comments which definitely helped better the paper.
master3243 t1_isharj4 wrote
Impressive! How big is the dataset? Huggingface says n<2k, which seems incredibly small.
Also, what is an individual sample point? A Gundam image and its name?
master3243 t1_isch3m9 wrote
Reply to comment by Co0k1eGal3xy in [R] Mind's Eye: Grounded Language Model Reasoning through Simulation - Google Research 2022 by Singularian2501
Great
master3243 t1_iscb9mu wrote
Reply to comment by Co0k1eGal3xy in [R] Mind's Eye: Grounded Language Model Reasoning through Simulation - Google Research 2022 by Singularian2501
My link also says that heavier objects can fall slower than lighter objects, e.g. the styrofoam board, which was heavier than the small ball yet fell slower.
In the absence of more detail, such as the dynamics of the shapes and whether air drag is included, it is fair to say that the most correct answer to the "which" question is "both". I would only count the "heavy first" answer as correct IF it included a discussion of air drag; otherwise the correct answer is "both". But that's my opinion and not objectively the only way to interpret this.
Especially given a model that has so much physics material included in its dataset, it's a pretty big fail that it can't answer this properly.
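A quick numerical sanity check of the air-drag point (a toy Euler integration I put together; the drag constant is arbitrary): with quadratic drag and identical shapes, the heavier object lands first, while with drag switched off, mass makes no difference at all:

```python
def fall_time(mass, height=10.0, drag=0.5, g=9.81, dt=1e-4):
    # m * dv/dt = m*g - drag * v**2; `drag` bundles air density,
    # cross-section, and drag coefficient into one made-up constant.
    v = y = t = 0.0
    while y < height:
        v += (g - (drag / mass) * v * v) * dt
        y += v * dt
        t += dt
    return t

t_light = fall_time(mass=0.1)   # light object
t_heavy = fall_time(mass=10.0)  # heavy object, same shape/drag
# with drag: t_heavy < t_light; with drag=0.0 both times are identical
```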
master3243 t1_isc3kp7 wrote
Reply to comment by Co0k1eGal3xy in [R] Mind's Eye: Grounded Language Model Reasoning through Simulation - Google Research 2022 by Singularian2501
They fall at the same rate
https://www.wired.com/2013/10/do-heavier-objects-really-fall-faster/
master3243 t1_irq9k6g wrote
Reply to comment by aviisu in [D] What kind of mental framework/thought process the researchers have when working on solving/proving the math of the new algorithms? by aviisu
That's just how math is done in research. If you don't like that, you'll hate pure math papers even more, where they state a theorem and then show the steps proving it is a true statement.
The intuition behind the path of thinking that led the author to the final theorem is (justifiably) left entirely in the author's scratch paper or notebooks.
Some authors do give out insight onto the steps they took or their general intuition which is always nice, but not a requirement.
It's also worth mentioning that a lot of us like doing research but don't like writing research papers (that's just a necessity due to humans lacking telepathic communication), so giving out more info is an optional step in a disliked process, which makes it understandable why it's skipped.
master3243 t1_iz2f181 wrote
Reply to [R] The Forward-Forward Algorithm: Some Preliminary Investigations [Geoffrey Hinton] by shitboots
Interesting read, I'm always interested in research about alternatives to backprop.
One important paragraph (for the curious who won't read the full paper):
> The forward-forward algorithm is somewhat slower than backpropagation and does not generalize quite as well on several of the toy problems investigated in this paper so it is unlikely to replace backpropagation for applications where power is not an issue. The exciting exploration of the abilities of very large models trained on very large datasets will continue to use backpropagation.
> The two areas in which the forward-forward algorithm may be superior to backpropagation are as a model of learning in cortex and as a way of making use of very low-power analog hardware without resorting to reinforcement learning (Jabri and Flower, 1992).
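For a rough flavor of the idea (my own toy numpy sketch from reading the paper, not the paper's code): each layer gets its own local objective, pushing a "goodness" score (sum of squared activations) above a threshold for real "positive" inputs and below it for corrupted "negative" ones, so no gradient ever flows backward across layers:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 4))  # weights of one ReLU layer
theta, lr = 2.0, 0.05                   # goodness threshold, step size

def goodness(x):
    h = np.maximum(W @ x, 0.0)          # layer activations
    return h, float(h @ h)              # goodness = sum of squares

def local_step(x, positive):
    # One layer-local update on the logistic loss of p = sigma(g - theta);
    # everything needed is available inside this layer.
    global W
    h, g = goodness(x)
    p = 1.0 / (1.0 + np.exp(-(g - theta)))
    dg = (p - 1.0) if positive else p   # dL/dg for pos / neg samples
    # dg/dz = 2h (ReLU derivative folded in, since h = 0 where z <= 0)
    W -= lr * dg * np.outer(2.0 * h, x)

x_pos = np.array([1.0, 0.5, -0.2, 0.3])    # stand-in "real" sample
x_neg = np.array([-0.5, 1.0, 0.8, -1.0])   # stand-in "corrupted" sample
g_before = goodness(x_pos)[1]
for _ in range(50):
    local_step(x_pos, positive=True)
    local_step(x_neg, positive=False)
g_after = goodness(x_pos)[1]  # goodness of the positive sample rises
```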