This is all super weird and shows that we really have no idea what’s going on with these LLMs. Are we really ready for this stuff to become the backbone of internet search?

We’ll demonstrate a previously undocumented failure mode in the GPT-2 and GPT-3 language models that produces bizarre completions (in some cases explicitly contrary to the purpose of the model), and present the results of our investigation into this phenomenon.

