MysteryInc152 t1_j6okowf wrote on January 31, 2023 at 8:13 PM

Reply to comment by Zetsu-Eiyu-O in [D] Generative Model FOr Facts Extraction by Zetsu-Eiyu-O

Not sure what you mean by penalize, but say you wanted an LLM that wasn't instruction fine-tuned to translate between two languages it knows.

Your input would be

Language x: "text of language x"

Language y: "translated language x text"

You'd do this for a few examples. 2 or 3 should be good, or even one depending on the task. Then finally:

Language x: "text you want translated"

Language y: The model would translate the text and output here

All generative transformer LLMs work this way given enough scale. GPT-2 (only 1.5B parameters) does not have the necessary scale.
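
A minimal sketch of how that prompt could be assembled in Python. The function name `build_few_shot_prompt`, the label arguments, and the French/English example pairs are all made up for illustration; the output is just a string you would paste into (or send to) whatever completion model you're using.

```python
def build_few_shot_prompt(examples, query, src_label="Language x", tgt_label="Language y"):
    """Format (source, target) example pairs, then an open-ended query for the model to complete."""
    blocks = []
    for source_text, target_text in examples:
        blocks.append(f'{src_label}: "{source_text}"\n\n{tgt_label}: "{target_text}"')
    # The last block leaves the target side empty so the model fills it in.
    blocks.append(f'{src_label}: "{query}"\n\n{tgt_label}:')
    return "\n\n".join(blocks)


# Hypothetical example pairs; 2 or 3 are usually enough.
examples = [("Bonjour", "Hello"), ("Merci beaucoup", "Thank you very much")]
prompt = build_few_shot_prompt(examples, "Où est la gare ?", "French", "English")
print(prompt)
```

The prompt ends on the bare target label, which is what nudges a plain text completer to continue with the translation.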

MysteryInc152 t1_j6o62z6 wrote on January 31, 2023 at 6:42 PM

Reply to [D] Generative Model FOr Facts Extraction by Zetsu-Eiyu-O

This is what in-context learning is for.

Giving the model a few examples of a text input and a corresponding fact extraction is all that's necessary.
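
A rough sketch of what that could look like for fact extraction. Everything here is an assumption for illustration: the `Text:` / `Facts:` layout, the `subject | relation | object` line format, and the helper names. The parser only handles completions that follow the format shown in the examples.

```python
# Hypothetical few-shot examples: input text plus the facts we want extracted from it.
FEW_SHOT = [
    ("Marie Curie was born in Warsaw in 1867.",
     ["Marie Curie | born in | Warsaw", "Marie Curie | born in year | 1867"]),
    ("The Eiffel Tower is in Paris.",
     ["Eiffel Tower | located in | Paris"]),
]


def build_extraction_prompt(examples, text):
    """Lay out the worked examples, then the new text with an open 'Facts:' list."""
    parts = []
    for example_text, facts in examples:
        fact_lines = "\n".join(f"- {fact}" for fact in facts)
        parts.append(f"Text: {example_text}\nFacts:\n{fact_lines}")
    # End on a dangling bullet so the model continues the list.
    parts.append(f"Text: {text}\nFacts:\n-")
    return "\n\n".join(parts)


def parse_facts(completion):
    """Turn completion lines into (subject, relation, object) tuples, skipping malformed lines."""
    triples = []
    for line in completion.splitlines():
        fields = [field.strip() for field in line.lstrip("- ").split("|")]
        if len(fields) == 3 and all(fields):
            triples.append(tuple(fields))
    return triples


prompt = build_extraction_prompt(FEW_SHOT, "Ada Lovelace was born in London.")
print(prompt)
```

You'd send `prompt` to the model and run `parse_facts` on whatever text comes back.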

MysteryInc152 t1_j6jkmus wrote on January 30, 2023 at 8:05 PM

Reply to comment by currentscurrents in [N] OpenAI has 1000s of contractors to fine-tune codex by yazriel0

The human brain has trillions of synapses (the closest biological equivalent to parameters), is multimodal, and has been fine-tuned by evolution.

MysteryInc152 t1_j60vz8p wrote on January 26, 2023 at 10:38 PM

Reply to comment by FallUpJV in Few questions about scalability of chatGPT [D] by besabestin

OpenAI's models are still undertrained as well.

MysteryInc152 t1_j5uvo3i wrote on January 25, 2023 at 7:00 PM

Reply to comment by Kamimashita in [D]Are there any known AI systems today that are significantly more advanced than chatGPT ? by Xeiristotle

Nothing that would beat OpenAI's stuff (i.e. Google's stuff) is open to the public for inference or fine-tuning.

I think the best open-source alternative is this:

https://github.com/THUDM/GLM-130B

https://huggingface.co/spaces/THUDM/GLM-130B

But it's not fine-tuned for instructions, so you have to prompt/approach it like a text completer. You'll also need a 4x 3090 setup to get it running locally.

The best open-source instruction fine-tuned models are the Flan-T5 models:

https://huggingface.co/google/flan-t5-xxl

If you're not necessarily looking for open source but still want actual alternatives that aren't just an API wrapper around GPT, you can try Cohere:

https://cohere.ai/pricing

Good thing is that it's completely free for non-commercial or non-production use.

Or Aleph Alpha:

https://app.aleph-alpha.com/

Not free, but the pricing is decent and they have a visual language model as well, something like Flamingo:

https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model

MysteryInc152 t1_j5tits4 wrote on January 25, 2023 at 1:44 PM

Reply to [D]Are there any known AI systems today that are significantly more advanced than chatGPT ? by Xeiristotle

Google has a few systems that would beat current public SOTA models. PaLM/Minerva/Med-PaLM are the best, but Flamingo and Chinchilla/Sparrow would also beat ChatGPT.

Dunno about anything from Meta. They have released open-source GPT models, but they're not as good as OpenAI's stuff.

MysteryInc152 t1_j5eyfnm wrote on January 22, 2023 at 3:04 PM

Reply to [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut

Any watermark that couldn't easily be bypassed (by paraphrasing, swapping out every nth word, etc.) would cripple the output of the model. In fact, even the simple watermarks could have weird effects on output.

MysteryInc152 t1_j50ym7g wrote on January 19, 2023 at 5:16 PM

Reply to comment by Daos-Lies in [D] Inner workings of the chatgpt memory by terserterseness

There's a repo you can try out that actually uses embeddings for long-term conversations.

https://github.com/Kav-K/GPT3Discord
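
For a feel of the general idea (not how that repo is implemented, which I haven't reproduced here), below is a toy sketch: store each past message with a vector, then pull back the most similar ones when a new message arrives. The `embed` function is a bag-of-words stand-in I made up so the example runs without any model; a real setup would call an actual embedding model instead. `ConversationMemory` and the sample messages are hypothetical too.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ConversationMemory:
    """Keeps every past message with its vector and recalls the closest ones."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        self.entries.append((text, embed(text)))

    def recall(self, query, k=2):
        query_vector = embed(query)
        ranked = sorted(self.entries, key=lambda entry: cosine(query_vector, entry[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = ConversationMemory()
memory.add("My dog is named Biscuit.")
memory.add("I work as a nurse in Leeds.")
memory.add("I prefer tea over coffee.")
top = memory.recall("What is my dog called?", k=1)
print(top)
```

The recalled messages would then be prepended to the prompt, so only the relevant bits of an arbitrarily long history take up room in the context window.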

MysteryInc152 t1_j50pw6e wrote on January 19, 2023 at 4:23 PM

Reply to comment by IntelArtiGen in [D] Inner workings of the chatgpt memory by terserterseness

With embeddings, it should theoretically not have a hard limit at all. But experiments here suggest a sliding context window of 8096

https://mobile.twitter.com/goodside/status/1598874674204618753?t=70_OKsoGYAx8MY38ydXMAA&s=19
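
A sliding window in this sense just means the oldest messages fall out once the budget is used up. A minimal sketch, assuming a whitespace word count as a crude stand-in for a real tokenizer (actual models count subword tokens, so the numbers will differ); `count_tokens` and `sliding_window` are names I made up, and the 8096 default is simply the figure quoted above.

```python
def count_tokens(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def sliding_window(messages, max_tokens=8096):
    """Keep the most recent messages whose combined token count fits the budget."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop at the first one that doesn't fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["one two three", "four five", "six seven eight nine"]
window = sliding_window(history, max_tokens=6)
print(window)
```

With a budget of 6, the oldest message is dropped and the two most recent ones are kept.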

MysteryInc152 t1_j50pkxw wrote on January 19, 2023 at 4:21 PM

Reply to comment by Daos-Lies in [D] Inner workings of the chatgpt memory by terserterseness

With embeddings, it should theoretically not have a hard limit at all. But experiments here suggest a sliding context window of 8096

https://mobile.twitter.com/goodside/status/1598874674204618753?t=70_OKsoGYAx8MY38ydXMAA&s=19

MysteryInc152 t1_j4lv0d5 wrote on January 16, 2023 at 5:14 PM

Reply to [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

Codex and ChatGPT can understand more than just functions. The issue with them is the limited token window.

MysteryInc152 t1_j4l8fwz wrote on January 16, 2023 at 2:44 PM

Reply to comment by [deleted] in [D] Can ChatGPT flag it's own writings? by MrSpotgold

Yeah well, that's not really how these models work. There's no pulling from a database and there's no external searching. The model was trained and frozen.

While it is possible to have the model access some external database in the future, yeah... that's not going to happen for previous chat entries you have no right or access to. That's a privacy can of worms no corporation with any sense will get into, and it would be prohibitively expensive for no real gain at all.

MysteryInc152 t1_j4l38t9 wrote on January 16, 2023 at 2:04 PM

Reply to [D] Can ChatGPT flag it's own writings? by MrSpotgold

No and no

MysteryInc152 t1_iyots1p wrote on December 3, 2022 at 12:14 AM

Reply to comment by Ribak145 in Have you updated your timelines following ChatGPT? by EntireContext

What it can do with code is pretty astounding?