Philosopher AI

In this work we do not fine-tune GPT-3 because our focus is on task-agnostic performance, but GPT-3 can be fine-tuned in principle and this is a promising direction for future work.

Considering the OpenAI people haven't even done this, somehow I don't think tech demo people are.

Yes, yes, sure, it would have been helpful for the author to put the prompt structure on GitHub. You asked a question, so I gave an answer: it's few-shot.

It's like you're a middle school report writer who opens up the GPT-3 paper, flips to page 6, sees a list of "different settings that we will be evaluating GPT-3 on or could in principle evaluate GPT-3 on", which is a set of four bullet points (fine-tuning, few-shot, one-shot, zero-shot), and from that makes a generic declaration.

Except you're missing the point, because the tech demo is few-shot, like basically all the tech demos. Do note the "in principle" clause, which covers the fact that GPT-3 is not fine-tuned in practice.

If the guy doesn't say whether it's zero-, one-, few-, or many-shot, then we don't know.

And what? GPT-3 tech demos almost always say how fine-tuning was done and with what datasets, or what prompts and responses were used.

You realize that two of the GPT-3 pretrained models are smaller than GPT-2 and can be fine-tuned with a moderate amount of cloud compute, right? Or trained completely from scratch on DGX-2s.
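
To give a sense of what "moderate compute" fine-tuning means in practice: since the GPT-3 weights aren't public, here's a sketch using GPT-2 (a public model in the same size range as the smallest GPT-3 variants) as a stand-in, via Hugging Face Transformers. The corpus file name is made up; this is an illustration, not anyone's actual pipeline:

```python
# Sketch of "moderate compute" fine-tuning. GPT-3 weights aren't public, so
# this uses GPT-2 as a stand-in. "philosophy.txt" is a hypothetical corpus.
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    TextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Plain-text corpus, chunked into 512-token training blocks.
dataset = TextDataset(tokenizer=tokenizer, file_path="philosophy.txt", block_size=512)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-philosophy",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
```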

Buddy, if the weights aren't public, how are random people supposed to fine-tune it?

We don't fucking know whether it's few-shot; that's the fucking point. It could be one-shot for all we know. That information almost always accompanies the demos.

Okay, very cool. I conjecture it's few-shot because you minimally need one example for "reject" and one example for "accept". Maybe I'm wrong, but I can't think of a different way to do it.
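
To make the conjecture concrete, here's roughly the prompt shape I have in mind. The real prompt isn't on GitHub, so every topic, label, and response below is invented:

```python
# Conjectured few-shot prompt shape: at least one "accept" example and one
# "reject" example, then the user's topic. All wording here is invented;
# the actual Philosopher AI prompt has not been published.
FEW_SHOT_TEMPLATE = """\
Topic: the meaning of life
Verdict: accept
Response: Life carries no single meaning; each of us assembles one from what we value.

Topic: jkfdsjkfldsjfl
Verdict: reject
Response: Philosopher AI cannot say anything sensible about this topic.

Topic: {topic}
Verdict:"""

def build_prompt(topic: str) -> str:
    """Drop the user's topic into the conjectured few-shot template."""
    return FEW_SHOT_TEMPLATE.format(topic=topic)

print(build_prompt("free will"))
```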

Indeed, the author should have put it on GitHub, but I will give you an answer to your question. I think that is more productive than a useless criticism, considering neither of us is the tech demo author.

Nobody's training GPT-3 "from scratch" by any traditional definition of the term.

The suggestion that a guy who made a gimmick philosophy bot for Twitter may have spent $5-10 million to train a GPT-3 philosophy model, and that you can't wait to see the paper on his methods, is bad-faith pseudointellectual masturbation.

You wanted to say "few-shot" and "GitHub" a couple of times so someone like epokkk or iaafr would come in here and say, "Wow, this guy really knows what he's talking about! He knows about GitHub!"

:pensive: tfw ewiz is a GPT-3 model that recombobulated

we start this section by explicitly defining and contrasting the different settings that we will be evaluating GPT-3 on or could in principle evaluate GPT-3 on ... Fine-Tuning (FT) ... Few-Shot (FS) ... One-Shot (1S) ... Zero-Shot (0S)

...and you turned that into your generic declaration, notably misinterpreting what "fine-tuning" means.


You're right, the API for fine-tuning hasn't been released; it comes out in a few weeks. I've been working with zero- and one-shot, so I hadn't checked whether it was out yet. It's still important to know whether it's zero-, one-, or few-shot and what prompt/response pairs were used.
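
For anyone unclear on the distinction being argued about, here's a minimal sketch using the beta-era OpenAI completion API. The engine name is the public beta default, and the prompt wording is mine, not Philosopher AI's:

```python
import openai  # beta-era client; completions go through openai.Completion

openai.api_key = "sk-..."  # placeholder

# Zero-shot: a bare instruction, no worked examples.
zero_shot = "Write a short philosophical reflection on the topic: free will\nReflection:"

# One-shot: the same instruction plus exactly one worked example.
one_shot = (
    "Write a short philosophical reflection on the given topic.\n\n"
    "Topic: time\n"
    "Reflection: Time is less a river than a ledger of debts we owe the past.\n\n"
    "Topic: free will\n"
    "Reflection:"
)

for name, prompt in [("zero-shot", zero_shot), ("one-shot", one_shot)]:
    resp = openai.Completion.create(
        engine="davinci",   # base GPT-3 engine in the beta
        prompt=prompt,
        max_tokens=100,
        temperature=0.7,
        stop="\n\n",
    )
    print(name, "->", resp.choices[0].text.strip())
```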

How about you idiots just ask the AI about this?

Some papers before GPT-3 referred to few-shot learning as a fine-tuning mechanism. I already said it depends on the paper: some classify few-shot as fine-tuning, while others require the dataset to reach a certain scale before it counts as fine-tuning.

Look how easy this was:
https://philosopherai.com/philosopher/are-you-zero-shot-or-one-shot-6358a0


I fucking said this, moron. Fuck off.

This shit is literally my job, you fucking idiot.

It is very likely that NVIDIA will, as they have with every other extremely large language model, present an optimized training mechanism specific to their hardware. Come the fuck on.

I need to reread the paper, but I believe it can learn to reject zero-shot if it concludes that the given topic isn't similar enough to the zero-shot prompt.

You said you were working on it, right?

Can you give an example of what you mean by a "zero-shot prompt"?
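
By "zero-shot prompt" I mean an instruction with no worked examples at all, where any reject behavior has to come from the instruction itself. A sketch, with wording entirely my own invention rather than anything from the actual demo:

```python
# Zero-shot prompt: instruction only, no accept/reject demonstrations.
# The model must decide from pretraining alone whether the topic is
# answerable. All wording is invented for illustration.
ZERO_SHOT_TEMPLATE = (
    "You are a philosopher. Write a short reflection on the topic below. "
    "If the topic is gibberish or not something philosophy can address, "
    "reply only with: REJECT.\n\n"
    "Topic: {topic}\n"
    "Reflection:"
)

print(ZERO_SHOT_TEMPLATE.format(topic="the ethics of memory"))
print(ZERO_SHOT_TEMPLATE.format(topic="jkfdsjkfldsjfl"))
```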

Yes, none of that shit is contradictory. Saying that there would be vastly improved performance with from-scratch training on a philosophy corpus doesn't contradict shit when THE FUCKING SENTENCE BEFORE THAT says it's very unlikely any of that is true.