ChatGPT — an epochal event

Ajit Rajasekharan
12 min read · Dec 17, 2022
The training procedure used to create InstructGPT, a precursor to ChatGPT. Because the training-process illustrations in the InstructGPT paper and the ChatGPT blog post are so similar, ChatGPT’s training procedure is likely similar to this. A ChatGPT paper has not been released yet. Image by Author
Definition of “An epochal event” in ChatGPT’s own words — Image by Author using ChatGPT

A quiet shift has been in progress over the last few years in the NLP ecosystem, driven by large language models (LLMs). It culminated just a few days ago in an epochal event with implications far beyond the NLP ecosystem.

OpenAI released ChatGPT, a large language model trained so that its responses capture user intent almost all the time. A key factor in ChatGPT’s success is its training process, the culmination of years of research in reinforcement learning and language modeling. Language modeling, followed by supervised tuning and then reinforcement learning, yielded a model whose responses are hard to distinguish from a human’s. This holds even when it occasionally “hallucinates” (a euphemism coined by researchers for false statements), or when it stubbornly defends a wrong answer (e.g. asserting that 3599 is a prime number despite acknowledging that it is composite, 3599 = 59 * 61).
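The arithmetic in the 3599 example is easy to check independently. The snippet below is a minimal sketch using plain trial division; the helper name `smallest_factor` is chosen here for illustration and has nothing to do with how ChatGPT itself reasons about numbers.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    # Any composite n has a factor no larger than sqrt(n).
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


n = 3599
p = smallest_factor(n)
q = n // p
print(f"{n} = {p} * {q}")                   # 3599 = 59 * 61
print("prime" if p == n else "composite")   # composite
```

Since 3599 = 60² − 1 = (60 − 1)(60 + 1), the factorization can also be spotted by hand as a difference of squares.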

OpenAI published a blog post on ChatGPT. A paper is not out yet, so there has been a flurry of activity online to glean more details about the model. Some of the community’s guesses and observations are likely to be revised, or even proven wrong, when the paper is published. For this reason, any…
