currentscurrents t1_j4ijvez wrote
Reply to comment by RuairiSpain in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
A Snappy Headline Is All You Need
currentscurrents t1_j4ijlqv wrote
You can fine-tune image generator models and some smaller language models.
You can also do tasks that don't require super large models, like image recognition.
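For example, transfer learning with torchvision fits easily on one consumer card. A rough sketch (the 10-class head and hyperparameters are arbitrary, and it assumes a reasonably recent torchvision):

```python
import torch
import torch.nn as nn
from torchvision import models

# Minimal transfer-learning sketch: fine-tune a pretrained ResNet-18.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone and retrain only the classifier head,
# which fits comfortably on a single consumer GPU.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 10)  # e.g. 10 classes

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# one training step on an (images, labels) batch from your DataLoader
def train_step(images, labels):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```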
>that's beyond just some toy experiment?
Don't knock toy experiments too much! I'm having a lot of fun trying to build a differentiable neural computer or memory-augmented network in pytorch.
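For a taste, here's roughly what the core of it looks like - a content-addressed memory read, which is the heart of both DNCs and memory-augmented networks (a toy sketch with made-up sizes, not a full implementation):

```python
import torch
import torch.nn.functional as F

# Toy sketch of the content-based read at the heart of memory-augmented
# networks / differentiable neural computers. Everything is differentiable,
# so a controller can learn what to store and retrieve by backprop.
class ContentMemory(torch.nn.Module):
    def __init__(self, slots=16, width=32):
        super().__init__()
        self.memory = torch.nn.Parameter(torch.randn(slots, width) * 0.1)

    def read(self, key):
        # cosine similarity between the query key and every memory slot
        sim = F.cosine_similarity(key.unsqueeze(0), self.memory, dim=-1)
        weights = F.softmax(sim, dim=-1)  # soft addressing over slots
        return weights @ self.memory      # weighted sum of slot contents

mem = ContentMemory()
reading = mem.read(torch.randn(32))
print(reading.shape)  # torch.Size([32])
```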
currentscurrents t1_j4a2las wrote
Reply to comment by iamnotlefthanded666 in [D] Is MusicGPT a viable possibility? by markhachman
>Specifically, 1) we design an expert system to generate a melody by developing musical elements from motifs to phrases then to sections with repetitions and variations according to pre-given musical form; 2) considering the generated melody is lack of musical richness, we design a Transformer based refinement model to improve the melody without changing its musical form. MeloForm enjoys the advantages of precise musical form control by expert systems and musical richness learning via neural models.
currentscurrents t1_j49x0ev wrote
Reply to comment by blueSGL in [D] Is MusicGPT a viable possibility? by markhachman
Also MeloForm, a Microsoft project that composes music using expert systems.
currentscurrents t1_j49u28o wrote
Reply to comment by mycall in [D] Is MusicGPT a viable possibility? by markhachman
I don't think it's that simple - whether or not generative AI is considered "transformative" has not yet been tested by the courts.
Until somebody actually gets sued over this and it goes to court, we don't know how the legal system will handle it. There is currently a lawsuit against GitHub Copilot, so we will probably know within the next couple of years.
currentscurrents t1_j499l3p wrote
Reply to [D] Combining Machine Learning + Expert Knowledge (Question for Agriculture Research) by Tigmib
Are you trying to do research, or solve a problem? Building expert systems out of neural networks is still a new, experimental idea. If you just want to get the job done you may want to pick more proven methods.
currentscurrents t1_j490rvn wrote
Reply to comment by BarockMoebelSecond in [D] Bitter lesson 2.0? by Tea_Pearce
It's meaningful right now because there's a threshold where LLMs become awesome, but getting there requires expensive specialized GPUs.
I'm hoping in a few years consumer GPUs will have 80GB of VRAM or whatever and we'll be able to run them locally. While datacenters will still have more compute, it won't matter as much since there's a limit where larger models would require more training data than exists.
currentscurrents t1_j48csbo wrote
Reply to comment by RandomCandor in [D] Bitter lesson 2.0? by Tea_Pearce
If it is true that performance scales infinitely with compute power - and I kinda hope it is, since that would make superhuman AI achievable - datacenters will always be smarter than PCs.
That said, I'm not sure that it does scale infinitely. You need not just more compute but also more data, and there's only so much data out there. GPT-4 reportedly won't be any bigger than GPT-3 because even terabytes of scraped internet data isn't enough to train a larger model.
currentscurrents t1_j4716tp wrote
Reply to comment by ml-research in [D] Bitter lesson 2.0? by Tea_Pearce
Try to figure out systems that can generalize from smaller amounts of data? It's the big problem we all need to solve anyway.
There's a bunch of promising ideas that need more research:
- Neurosymbolic computing
- Expert systems built out of neural networks
- Memory augmented neural networks
- Differentiable neural computers
currentscurrents t1_j4702g0 wrote
Reply to comment by mugbrushteeth in [D] Bitter lesson 2.0? by Tea_Pearce
Compute is going to get cheaper over time though. My phone today has the FLOPs of a supercomputer from 1999.
Also, if LLMs become the next big thing, you can expect GPU manufacturers to include more VRAM and more hardware acceleration aimed at them.
currentscurrents OP t1_j44ycdz wrote
Reply to comment by Farconion in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
From what I've seen, it's a promising direction that should be possible. But so far nobody's made it work on more than toy problems.
currentscurrents t1_j44pu0u wrote
Reply to I just started out guys, wish me luck by 47153
Is it though? These days it seems like even a lot of research papers are just "we stuck together a bunch of pytorch components like lego blocks" or "we fed a transformer model a bunch of data".
Math is important if you want to invent new kinds of neural networks, but for end users it doesn't seem very important.
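Case in point - here's a perfectly serviceable little image classifier snapped together entirely from stock PyTorch blocks, no math in sight:

```python
import torch.nn as nn

# The "lego blocks" in question: an image classifier assembled
# purely by composing off-the-shelf modules.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),
)
```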
currentscurrents OP t1_j44nngb wrote
Reply to comment by omniron in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
The paper does talk about this and calls transformers "first generation compositional systems" - but limited ones.
>Transformers, on the other hand, use graphs, which in principle can encode general, abstract structure, including webs of inter-related concepts and facts.
> However, in Transformers, a layer’s graph is defined by its data flow, yet this data flow cannot be accessed by the rest of the network—once a given layer’s data-flow graph has been used by that layer, the graph disappears. For the graph to be a bona fide encoding, carrying information to the rest of the network, it would need to be represented with an activation vector that encodes the graph’s abstract, compositionally-structured internal information.
>The technique we introduce next—NECST computing—provides exactly this type of activation vector.
They then talk about a more advanced variant called NECSTransformers, which they consider a 2nd generation compositional system. But I haven't heard of this system before and I'm not clear if it actually performs better.
currentscurrents OP t1_j43s8ki wrote
Reply to comment by Diffeologician in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
In the paper they talk about "first generation compositional systems" and I believe they would include differentiable programming in that category. It has some compositional structure, but the structure is created by the programmer.
Ideally the system would be able to create its own arbitrarily complex structures and systems to understand abstract ideas, like humans can.
currentscurrents t1_j3jst6n wrote
Reply to comment by Immarhinocerous in [Discussion] Is there any alternative of deep learning ? by sidney_lumet
I know there's a whole field of decision tree learning, but I'm not super up to date on it.
I assume neural networks are better or else we'd be using trees instead.
currentscurrents t1_j3fop2j wrote
Reply to comment by tdgros in [Discussion] Is there any alternative of deep learning ? by sidney_lumet
You can represent any neural network as a decision tree, and I believe you can represent any decision tree as a series of if statements...
But the interesting bit about neural networks is the training process, automatically creating that decision tree.
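To make that concrete, here's a tiny ReLU network written out as plain if statements (the weights are made up for illustration). Each hidden unit either fires or doesn't, so the branches carve up the input space exactly like a decision tree:

```python
# A 1-hidden-layer ReLU net, y = w2 · relu(W1 x + b1), unrolled into
# branches. Each ReLU either fires or doesn't, so the network partitions
# the input space like a decision tree. Weights are illustrative only.
def tiny_net(x1, x2):
    h1 = 0.9 * x1 - 0.4 * x2 + 0.1   # pre-activation of hidden unit 1
    h2 = -0.3 * x1 + 0.8 * x2 - 0.2  # pre-activation of hidden unit 2
    if h1 > 0:
        if h2 > 0:
            return 1.2 * h1 + 0.5 * h2   # both units active
        return 1.2 * h1                  # only unit 1 active
    if h2 > 0:
        return 0.5 * h2                  # only unit 2 active
    return 0.0                           # both units off
```

Training is just searching for the weights - which simultaneously decides where the branch boundaries go and what each region outputs.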
currentscurrents t1_j3epeo7 wrote
Reply to comment by junetwentyfirst2020 in [Discussion] Is there any alternative of deep learning ? by sidney_lumet
Transformers are just deep learning with attention.
And attention is just another neural network telling the first one where to look.
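The "where to look" part is just a softmax over dot products. A bare-bones sketch:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    """Scaled dot-product attention: the 'where to look' mechanism.

    q, k, v: (seq_len, d) tensors of queries, keys, and values.
    """
    scores = q @ k.T / k.shape[-1] ** 0.5  # how relevant is each position?
    weights = F.softmax(scores, dim=-1)    # normalize into a distribution
    return weights @ v                     # weighted mix of the values

q = k = v = torch.randn(5, 16)  # self-attention over a 5-token sequence
out = attention(q, k, v)        # shape (5, 16)
```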
currentscurrents t1_j3eo4uc wrote
Reply to comment by singularpanda in [D] Will NLP Researchers Lose Our Jobs after ChatGPT? by singularpanda
There's plenty of work to be done in researching language models that train more efficiently or run on smaller machines.
ChatGPT is great, but it needed 600GB of training data and megawatts of power to train. It must be possible to do better; the average human brain runs on about 12W and sees maybe a few hundred million words in a lifetime.
currentscurrents t1_j3emas4 wrote
Reply to comment by suflaj in [D] Will NLP Researchers Lose Our Jobs after ChatGPT? by singularpanda
>I hate to break your bubble, but the task is also achievable even with GPT2
Is it? I would love to know how. I can run GPT2 locally, and that would be a fantastic level of zero-shot learning to be able to play around with.
I have no doubt you can fine-tune GPT2 or T5 to achieve this, but in my experience they aren't nearly as promptable as GPT3/ChatGPT.
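For context, running it locally is only a few lines with HuggingFace transformers - loading the model is easy, it's the zero-shot instruction-following that falls apart (the prompt here is just an arbitrary example):

```python
from transformers import pipeline

# GPT-2 runs fine on a consumer GPU or even CPU; the problem is that
# zero-shot instructions like this mostly get ignored or continued
# as ordinary text rather than followed.
generator = pipeline("text-generation", model="gpt2")
prompt = "Rewrite the following sentence in formal English: gonna head out now.\n"
print(generator(prompt, max_new_tokens=40, do_sample=True)[0]["generated_text"])
```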
>Specifically the task you gave it is likely implicitly present in the dataset, in the sense that the dataset allowed the model to learn the connections between the words you gave it
I'm not sure what you're getting at here. It has learned the connections and meanings between words of course, that's what a language model does.
But it still followed my instructions, and it can follow a wide variety of other detailed instructions you give it. These tasks are too specific to have been in the training data; it is successfully generalizing zero-shot to new NLP tasks.
currentscurrents t1_j3eiw5w wrote
Reply to comment by suflaj in [D] Will NLP Researchers Lose Our Jobs after ChatGPT? by singularpanda
I think you're missing some of the depth of what it's capable of. You can "program" it to do new tasks just by explaining them in plain English, or by providing examples. For example, many people are using it to generate prompts for image generators:
>I want you to act as a prompt creator for an AI image generator.
>Prompts are descriptions of artistic images that include visual adjectives and art styles or artist names. The image generator can understand complex ideas, so use detailed language and describe emotions or feelings in detail. Use terse words separated by commas, and make short descriptions that are efficient in word use.
>With each image, include detailed descriptions of the art style, using the names of artists known for that style. I may provide a general style with the prompt, which you will expand into detail. For example if I ask for an "abstract style", you would include "style of Picasso, abstract brushstrokes, oil painting, cubism"
>Please create 5 prompts for a mob of grandmas with guns. Use a fantasy digital painting style.
This is a complex and poorly-defined task, and it certainly wasn't trained on this exact thing since its training data stops in 2021. But the resulting output is exactly what I wanted:
>An army of grandmas charging towards the viewer, their guns glowing with otherworldly energy. Style of Syd Mead, futuristic landscapes, sleek design, fantasy digital painting.
Once I copy-pasted it into an image generator it created a very nice image.
I think we're going to see a lot more use of language models for controlling computers to do complex tasks.
currentscurrents OP t1_j39hde8 wrote
Reply to comment by visarga in [D] Special-purpose "neuromorphic" chips for AI - current state of the art? by currentscurrents
Not bad for a milliwatt of power though - an Arduino idles at about 15 milliwatts.
I could see running pattern recognition in a battery-powered sensor or something.
currentscurrents OP t1_j34uma6 wrote
Reply to comment by IntelArtiGen in [D] Special-purpose "neuromorphic" chips for AI - current state of the art? by currentscurrents
>That alone I doubt it, even if it could theoretically reproduce how the brain works with the same power efficiency it doesn't mean you would have the algorithm to efficiently use this hardware.
I meant just in terms of compute efficiency, using the same kind of algorithms we use now. It's clear they won't magically give you AGI, but Innatera claims 10000x lower power usage with their chip.
This makes sense to me; instead of emulating a neural network using math, you're building a physical model of one on silicon. Plus, SNNs are very sparse and an analog one would only use power when firing.
>Usual ANNs are designed for current tasks and current tasks are often designed for usual ANNs. It's easier to use the same datasets but I don't think the point of SNNs is just to try to perform better on these datasets but rather to try more innovative approaches on some specific datasets.
I feel like a lot of SNN research is motivated by understanding the brain rather than building the best possible AI. It also seems harder to get traditional forms of data into and out of the network - you have to convert images into spike timings, for which there are several methods, each with its own tradeoffs.
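Rate coding is probably the simplest of those methods - something like this toy sketch (not any particular library's API):

```python
import torch

def rate_encode(image, timesteps=100):
    """Toy rate coding: brighter pixels fire more often.

    image: tensor of pixel intensities in [0, 1].
    Returns a (timesteps, *image.shape) binary spike train. Exact timing
    information is thrown away - one of the downsides of this scheme.
    """
    return (torch.rand(timesteps, *image.shape) < image).float()

spikes = rate_encode(torch.rand(28, 28))
print(spikes.shape, spikes.mean())  # (100, 28, 28), ~0.5 on random input
```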
currentscurrents t1_j338km2 wrote
Reply to comment by C_Hawk14 in [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
Good question.
Unfortunately, I have no clue what makes "good" art either. This is a pretty old problem that may not be solvable.
currentscurrents t1_j2zidye wrote
Reply to comment by C_Hawk14 in [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
I think that actually measures how good it is at getting popular on social media, which is not the same task as making good art.
There's also some backlash against AI art right now, so this might favor models that can't be distinguished from human art rather than models that are better than human art.
currentscurrents t1_j4jj1l6 wrote
Reply to comment by junetwentyfirst2020 in [D] What kinds of interesting models can I train with just an RTX 4080? by faker10101891
It's a little discouraging when every interesting paper has a cluster of 64 A100s in its methods section.