blueSGL t1_j4gkxvd wrote

Imagen does not allow for chaining together prompts; Phenaki (also Google) does.

If the question is about being able to create full movies, then Phenaki would be the tech to build on, because it allows direction of entire scenes via temporally consistent, context-aware prompt chaining rather than individual shots of single concepts.

2

blueSGL t1_j4ejzxn wrote

After seeing https://phenaki.video/

Hard to say; ask me again when we see the next generation that surpasses the above.

There's no way to tell how easy temporal consistency is going to be; bad temporal consistency will relegate it to YouTube and 'b-movie'/'low-budget TV'.

This is not like moving from the '80s into the '90s, when bad CG was the best they had and so it was used. This needs to beat out the best of the best CG, otherwise the movie will be panned for shitty effects.


It could first go through a period in compositing and effects: stuff like 'style matching' two clips, which sorts out the lighting and color grade, or adding muzzle flashes and squibs by painting a mask on a keyframe and letting the AI handle the rest, so adding to existing structure rather than making it all up wholesale.


I suppose soon we'll be seeing a 'Whisper'-like AI to do automatic Audio Description of scenes and acting direction, to caption existing films. (Using existing AD tracks could be a good contrastive learning technique.)
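
A rough sketch of what that could look like, assuming a CLIP-style objective pairing video-clip embeddings with their AD captions (encoders, dimensions, and names here are stand-ins, not any real system's):

```python
# Toy CLIP-style contrastive objective: matched (video clip, AD caption)
# pairs should score higher than mismatched ones. The random tensors below
# stand in for real video/text encoder outputs.
import torch
import torch.nn.functional as F

def info_nce_loss(video_emb: torch.Tensor, text_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE; matched pairs sit on the diagonal of the logits."""
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = video_emb @ text_emb.T / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))          # i-th clip <-> i-th caption
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# Stand-in batch: 8 clips and their AD captions, already encoded to 512-d.
loss = info_nce_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```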

1

blueSGL t1_j3z8z1g wrote

ChatGPT + voice synthesis will allow not only for the storylines to be written but for the voice work to be done.

I was really disappointed listening to Todd Howard describe how Bethesda felt about someone realizing partway through a playthrough that they really wanted to play a different character: they treated starting over as having failed the player. With that sort of attitude, no wonder the games have been getting blander and blander.

Now think of a different future where all the interlocking storylines are worked out ahead of time by an LLM, where everything is based on your current skills and previous actions 'behind the scenes'.

So instead of blending everything into a grey mush that 'disappoints' no one, you can have vibrantly different adventures through the same land each time you play.
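
As a toy illustration of that idea (everything here is hypothetical, including the `ask_llm` stand-in for whatever model API would actually be used):

```python
# Hypothetical sketch only: build the LLM prompt from the player's skills and
# prior choices, so each playthrough branches differently.
from dataclasses import dataclass, field

@dataclass
class PlayerState:
    skills: dict[str, int] = field(default_factory=dict)
    history: list[str] = field(default_factory=list)

def storyline_prompt(state: PlayerState) -> str:
    skills = ", ".join(f"{name} {level}" for name, level in state.skills.items())
    events = "; ".join(state.history) or "none yet"
    return (f"Player skills: {skills}. Past actions: {events}. "
            "Write the next quest hook that follows from this state.")

def next_quest(state: PlayerState, ask_llm) -> str:
    return ask_llm(storyline_prompt(state))

state = PlayerState(skills={"stealth": 70, "speech": 20},
                    history=["robbed the mill", "spared the guard"])
print(storyline_prompt(state))
```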

8

blueSGL t1_j3u0j9q wrote

4

blueSGL t1_j3pnv1l wrote

3D printers need to get way more plug-and-play for the average person to use them. I've got one, and when it's all calibrated and running with good filament, you can run it for weeks. Then comes the time to sort out the hot end, change out the tube, or fix something else, and it's back to needing someone who tinkers with things.

There is also the other side: creating things to print. If it's not on Thingiverse, be prepared to break out the calipers and Fusion 360. Photogrammetry and AI could really help here: take a video of the back of the remote with the missing cover, then let some software handle everything and generate the STL file.
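
A rough sketch of just the last step, assuming the earlier photogrammetry stage is handled by an existing tool; all file names here are hypothetical:

```python
# Sketch of the final step only, assuming a photogrammetry tool (e.g. COLMAP
# or Meshroom) has already turned the video into a point cloud. The convex
# hull is a crude stand-in for proper surface reconstruction.
import numpy as np
import trimesh

points = np.loadtxt("remote_cover_scan.xyz")   # (n, 3) scanned points
mesh = trimesh.convex.convex_hull(points)      # watertight mesh from the cloud
mesh.export("remote_cover.stl")                # ready for the slicer
```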

3

blueSGL t1_j3izpcc wrote

Again, what do you mean by that? People code new software every day.

You can ask for poems that don't exist, essays that don't exist.

All these things have had their structure extracted, understood, and then followed to create new items.

Asking for code is the same.

>Will ChatGPT be able to write better code than any human within the next year?

A good coder needs to eat and sleep, takes time to understand new technology, has a limited scope of known programming languages, has good days and bad days, has 'blocks', and is a single unit only able to process problems serially at human-level speed.

1

blueSGL t1_j3ir8t3 wrote

Read the paper: it's not that it merely performs better, it's that abilities that were as good as random suddenly hit a phase change and become measurably better.

You were initially saying:

> only answer prompts with solutions it’s already seen before.

Let's look at an example that makes things crystal clear.

Image generators, by combining concepts, can come up with brand-new images. Does a model have to have seen dogs before in order to place one in the image? Yes. Does it need to have seen one that looks identical to the final dog, i.e., could you crop the image, reverse image search it, and get a match? No.

The same is true of poems, summaries, code, etc.: it's finding patterns and creating outputs that match the requested pattern. So, to get back to the point about coding: it could very well output code it's never seen before, having ingested enough to understand the syntax.

It's seen dogs before; it outputs similar but unique dogs. It's seen code before; it outputs similar but unique code.

1

blueSGL t1_j3ijeyf wrote

>It can’t actually solve problems, only answer prompts with solutions it’s already seen before.

.

>people are grasping at straws to try to explain a mechanism they don’t understand.

You are making definitive statements about things you yourself say experts in the field 'don't understand'.

Either you are claiming you know more than they do, or you are professing your ignorance of the matter.

Which is it?

1

blueSGL t1_j3hgjlz wrote

I wonder how much Nvidia's decision to stop supporting NVLink, just when having it on consumer cards would have been really useful, played into the equation.

As in, consumer hardware now taps out at 24 GB of VRAM.

1

blueSGL t1_j3hf9vn wrote

> If we have text2video + continuity (so that you can make a prompt, then make another that merges your first one with the second one to give it some kind of continuity) would be amazing.

I take it you are referring to a video from StabilityAI and not Google, because Google has already shown off 'prompt sequence' video generation:

https://phenaki.video/

3

blueSGL t1_j3du306 wrote

You can bet dollars to doughnuts that ChatGPT is being run against real environments in training.

You know how it gets things wrong, and you need to keep prompting it until eventually it gets the thing correct?

That's happening at scale.

Everything is being recorded, and every test case where it finally generates working code is a new piece of training data.

With just the current dataset and the ability to feed known-good answers back in, this could bootstrap itself up in capability.

But of course it's not just using the data being ground out internally; it's also going to be training on all the conversations people are having with it right now.
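
Schematically, something like this (my own sketch of the idea, not OpenAI's actual pipeline; `generate` stands in for the model call):

```python
# Keep sampling until the generated code passes its tests, then log the
# (prompt, code) pair as a fresh piece of training data.
import json
import subprocess
import sys
import tempfile

def passes_tests(code: str, test_code: str) -> bool:
    """Run the candidate plus its tests in a throwaway subprocess."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test_code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, timeout=30)
    return result.returncode == 0

def collect_known_good(prompt: str, test_code: str, generate, attempts: int = 5):
    for _ in range(attempts):
        code = generate(prompt)
        if passes_tests(code, test_code):
            with open("known_good.jsonl", "a") as out:
                out.write(json.dumps({"prompt": prompt, "code": code}) + "\n")
            return code
    return None   # never passed; still potentially useful as negative data
```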

2

blueSGL t1_j30p4fu wrote

> any AI that only has 4000 characters of memory cannot be considered AGI or anything close to it.

From the comments of that article: https://www.cerebras.net/press-release/cerebras-systems-enables-gpu-impossible-long-sequence-lengths-improving-accuracy-in-natural-language-processing-models/

>The proliferation of NLP has been propelled by the exceptional performance of Transformer-style networks such as BERT and GPT. However, these models are extremely computationally intensive. Even when trained on massive clusters of graphics processing units (GPUs), today these models can only process sequences up to about 2,500 tokens in length. Tokens might be words in a document, amino acids in a protein, or base pairs on a chromosome. But an eight-page document could easily exceed 8,000 words, which means that an AI model attempting to summarize a long document would lack a full understanding of the subject matter. The unique Cerebras wafer-scale architecture overcomes this fundamental limitation and enables sequences up to a heretofore impossible 50,000 tokens in length.
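
A quick back-of-envelope check on those numbers (assuming the common ~4/3 tokens-per-word rule of thumb for English text, not an exact tokenizer count):

```python
words = 8_000                     # the eight-page document from the quote
tokens = round(words * 4 / 3)     # ~10,667 tokens
print(tokens, tokens <= 2_500, tokens <= 50_000)
# -> 10667 False True: far past a 2,500-token window, well inside 50,000
```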

Would that be enough?

0

blueSGL t1_j30ofaj wrote

And that is why models like this will change the world:

being able to summon information and clarification at the touch of a button, in an easy-to-digest package.

With amateurs becoming more expert in their chosen niche interests, I'm sure this will provide a few cross-domain 'ah-ha' moments.

All the autodidacts out there are going to have a field day.

20

blueSGL t1_j2z28ow wrote

> Wisdom of the Crowd

Something I recently saw mentioned by Ajeya Cotra is to query the LLM by re-entering the previous output and asking if it's correct, repeating this multiple times; taking an average of the answers provides a higher level of accuracy than just taking the first one. (Something that sounds weird to me.)

Well, OK: viewed from the vantage point that the models are very good at doing certain things and people have just not worked out how to correctly prompt/fine-tune them yet, it's not that weird. It's more that the base-level outputs are shockingly good, and then someone introduces more secret sauce and makes them even better. The problem with this is that there's no saying what the limit of the models that already exist is.
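
A minimal sketch of the technique (`ask` is a stand-in for the LLM call; majority voting over samples is one common way to realize the 'averaging'):

```python
# Sample the model several times and take the most common answer.
from collections import Counter

def self_consistent_answer(prompt: str, ask, samples: int = 10) -> str:
    answers = [ask(prompt) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]

# e.g. self_consistent_answer("What is 17 * 24?", ask=my_model)
```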

1

blueSGL OP t1_j2nnos9 wrote

>Brands in China are looking for alternative spokespeople after many celebrities recently ran into negative press about tax evasion or personal scandals, said Sirius Wang, chief product officer and head of marketplace Greater China at Kantar.

>At least 36% of consumers had watched a virtual influencer or digital celebrity perform in the last year, according to a survey published by Kantar this fall. Twenty-one percent had watched a virtual person host an event or broadcast the news, the report said.

Kinda hard to get my head around the fact that it's so mainstream that a third of people have already been exposed to it.

I'm aware that VTubers are a thing over here, but I always saw them as far more niche.

How long till a mainstream US or UK news program is presented by a virtual presenter?

11