In this work we do not fine-tune GPT-3 because our focus is on task-agnostic performance, but GPT-3 can be fine-tuned in principle and this is a promising direction for future work.
considering the OpenAI people haven't even done this, somehow i don't think tech demo people are.
yes yes sure it would be helpful for the author to have put the prompt structure on github. you asked a question, so i gave an answer. it's few-shot.
it's like you're a middle school report writer who opens the GPT-3 paper, flips to page 6, and sees a list of the various settings "that we will be evaluating GPT-3 on or could in principle evaluate GPT-3 on" (a set of four bullet points: "fine-tuning, few-shot, one-shot, zero-shot"), and from that makes a generic declaration
except you're missing the point, because the tech demo is few-shot, like basically all the tech demos. do note the "in principle" clause covering the fact that GPT-3 is not fine-tuned in practice.
If the guy doesn't say if it's zero, one, few, or many-shot then we don't know.
And what? GPT-3 tech demos almost always state how fine-tuning was done and with what datasets, or what prompts and responses were used.
You realize that two of the GPT-3 pretrained models are smaller than GPT-2 and can be fine-tuned with a moderate amount of cloud compute, right? Or can be completely trained from scratch with DGX-2s.
We don't fucking know if it's few-shot, that's the fucking point. It could be one-shot for all we know. That information almost always accompanies the demos.
okay very cool. i conjecture it is few-shot because you minimally need one example for "reject" and one example for "accept". maybe i am wrong, but i can't think of a different way to do it
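the conjectured prompt would look something like this. a minimal sketch, assuming a plain-text labeled-example format; the example submissions and the "Submission:"/"Verdict:" layout are invented, since the demo author never published the actual prompt:

```python
def build_few_shot_prompt(new_submission: str) -> str:
    # One "accept" and one "reject" exemplar -- the minimal few-shot setup
    # conjectured above. The texts are hypothetical placeholders.
    examples = [
        ("What is the nature of consciousness?", "accept"),
        ("lol nice meme", "reject"),
    ]
    lines = []
    for text, label in examples:
        lines.append(f"Submission: {text}\nVerdict: {label}\n")
    # The model is asked to complete the verdict for the new submission.
    lines.append(f"Submission: {new_submission}\nVerdict:")
    return "\n".join(lines)

print(build_few_shot_prompt("Does free will exist?"))
```

whatever text the model generates after the trailing "Verdict:" is then read off as the classification.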
indeed the author should have put it on github, but i will provide you an answer to your question. i think that is more productive than a useless criticism, considering neither you nor i are the tech demo author
Nobody's training GPT-3 "from scratch" by any traditional definition of from scratch.
The suggestion that a guy who made a gimmick philosophy bot for Twitter may have spent 5-10 million dollars to train a GPT-3 philosophy model, and that you can't wait to see the paper on his methods, is bad-faith pseudointellectual masturbation.
You wanted to say "few-shot" and "github" a couple of times so someone like epokkk or iaafr would come in here and say "wow, this guy really knows what he's talking about! He knows about github!"
we start this section by explicitly defining and contrasting the different settings that we will be evaluating GPT-3 on or could in principle evaluate GPT-3 on ... Fine Tuning (FT) ... Few-Shot (FS) ... One-Shot (1S) ... Zero-Shot (0S)
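in concrete terms, those settings differ only in how many demonstrations get packed into the prompt at inference time (fine-tuning is the odd one out: it updates weights instead of the prompt). a rough sketch using the paper's English-to-French example; the `=>` layout and the specific word pairs are illustrative, not a claim about any demo's actual prompt:

```python
def make_prompt(task_description, demonstrations, query, k):
    # k = 0 -> zero-shot, k = 1 -> one-shot, k > 1 -> few-shot.
    parts = [task_description]
    for source, target in demonstrations[:k]:
        parts.append(f"{source} => {target}")
    # The model completes the target for the final, unanswered query.
    parts.append(f"{query} =>")
    return "\n".join(parts)

demos = [("cheese", "fromage"), ("house", "maison")]
zero_shot = make_prompt("Translate English to French.", demos, "cat", k=0)
few_shot = make_prompt("Translate English to French.", demos, "cat", k=2)
```

the same model weights serve all three settings; only `k` changes.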
You're right, the API for fine-tuning hasn't been released: it comes out in a few weeks. I've been working with zero and one-shot, so I hadn't checked if it was out yet. It's still important to know if it's zero, one, or few-shot, and what prompt/response pairs were used.
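for reference, fine-tuning data is just a set of prompt/response pairs; a sketch of one plausible JSONL layout (the "prompt"/"completion" field names follow OpenAI's announced fine-tuning format, but the records themselves are invented):

```python
import json

def to_jsonl(pairs):
    # One JSON object per line: {"prompt": ..., "completion": ...}
    return "\n".join(
        json.dumps({"prompt": p, "completion": c}) for p, c in pairs
    )

# Hypothetical training pairs -- no such dataset was published for the demo.
records = to_jsonl([
    ("What is justice?", "Justice is rendering to each his due."),
    ("lol nice meme", "REJECT"),
])
```

publishing even a small file like this alongside a demo would settle the fine-tuned-vs-few-shot question immediately.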
Some papers before GPT-3 referred to few-shot as a fine-tuning mechanism. As I already said, it depends on the paper: some classify few-shot as fine-tuning outright, while others require the dataset to reach a certain scale before it qualifies as fine-tuning.
It is very likely that NVIDIA will, as they have with every other extremely large language model, present an optimized training mechanism specific to their hardware. Come the fuck on.
I need to reread the paper, but I believe it can learn to reject from zero-shot if it concludes that the prompt given isn't similar enough to the zero-shot prompt
Yes, none of that shit is contradictory. Saying that there would be vastly improved performance with from-scratch training on a philosophy corpus doesn't contradict shit when THE FUCKING SENTENCE BEFORE THAT says it's very unlikely any of that is true.