actualsnek
actualsnek t1_j495737 wrote
Reply to comment by ReasonablyBadass in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
Crazy that this has even a single upvote, and proof that this subreddit is no longer the community for academic discourse it once was. Do you know who Paul Smolensky is? He practically invented the term "neuro-symbolic" and was virtually the only researcher seriously working on it in the 20 years leading up to the deep learning revolution. Harmonic Grammar, Optimality Theory, Tensor Product Representations. Please pick up nearly any article on connectionism from before 2010.
No, this is not a new term for neuro-symbolic computing (which is now just a buzzword applicable to half the field); it's a specific theoretical take on how compositional structure could be captured by vectorial representations.
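For anyone unfamiliar with Smolensky's Tensor Product Representations, here's a minimal numpy sketch of the core idea: bind filler vectors to role vectors via outer products and sum the bindings into one vector space object. The specific fillers, roles, and dimensions here are made up purely for illustration; with orthonormal role vectors, unbinding recovers each filler exactly.

```python
import numpy as np

# Hypothetical filler vectors (e.g. word embeddings) -- illustrative values only.
fillers = {"raccoon": np.array([1.0, 0.0, 0.0]),
           "spacesuit": np.array([0.0, 1.0, 0.0])}

# Orthonormal role vectors for syntactic/semantic roles.
roles = {"agent": np.array([1.0, 0.0]),
         "patient": np.array([0.0, 1.0])}

# Bind each filler to its role with an outer product, then superpose (sum)
# the bindings into a single structured representation.
s = (np.outer(fillers["raccoon"], roles["agent"])
     + np.outer(fillers["spacesuit"], roles["patient"]))

# Unbind: because the roles are orthonormal, s @ role recovers the filler.
agent_filler = s @ roles["agent"]    # recovers the "raccoon" vector
```

The point is that structure (who fills which role) lives in the geometry of the representation itself, rather than in discrete symbols, which is what distinguishes this from generic "neuro-symbolic" hand-waving.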
actualsnek t1_j493haq wrote
Reply to comment by visarga in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
We're exploring some data augmentation approaches right now (see my response to u/giga-chad99) but how would you propose generating those problems with compositional structure?
actualsnek t1_j4931de wrote
Reply to comment by giga-chad99 in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
Text2image generation models do anecdotally appear to be better than image-text matching models at compositional tasks, but if you look closely at some generated images, you'll notice compositional failures. They often apply properties to entities the text did not attribute them to, or misinterpret the described relation between entities as a more common relation between those entities.
Try a prompt like "man with dog ears running in the park", and it'll generate images of a man with a dog (sometimes with amplified ears) running in the park. Why? Because these models don't have the underlying ability to form compositional representations; instead, they simply approximate their training data distribution.
Examples like "a raccoon in a spacesuit playing poker" often do well because spacesuits are only ever worn and poker is only ever played (i.e. these are relations that dominate the training distribution). Try a prompt like "a raccoon sitting on a poker chip and holding a spacesuit" and you'll see pretty drastic failures.
All this being said, generative models *still* appear better than discriminative models for vision-language compositionality tasks, and our current work is exploring approaches to impart this ability onto discriminative models to solve tasks like Winoground.
actualsnek t1_j44m1z9 wrote
Reply to [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
Compositionality is increasingly a significant area of concern across many subfields of deep learning. Winoground recently showed that all state-of-the-art vision-language models drastically fail to comprehend compositional structure, a feature which many linguists would argue is fundamental to the expressive power of language.
Smolensky is also a great guy and was affiliated with the PDP group that developed backprop in the '80s. The best path to neuro-symbolic computing & compositional reasoning remains unclear, but Smolensky and his student Tom McCoy have done some great work over the last few years exploring how symbolic structures are implicitly represented in neural nets.
actualsnek t1_j4neego wrote
Reply to comment by Acrobatic-Name5948 in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
NECSTransformer appears to be a generalization of the TP-Transformer presented by Schlag et al. 2019 with implementation available at this GitHub repo.