I for one would love LLMs to solve the physics of the singularity prior to the First Three Minutes of the universe, create a molecule that effectively treats schizophrenia, or specifically diagram how their own deep neural nets output a correct answer.
Creativity in literature or art is a horse of a different color/magisterial domain. Was Moby-Dick regarded as a work of genius in 1851? Nope ... and its sales were dismal until the 1920s. William Faulkner was a fairly obscure regional author until Malcolm Cowley resurrected him in 1946.
How can we expect LLMs to find "the correct answer" for the moving target of culture as "the best that has been thought and said in the world" (Matthew Arnold, 1869)?
In my opinion, the fundamental premise of this article, namely that RL fostered creativity in LLMs, is false. By any non-goal-post-moving definition of creativity, neural networks are inherently creative, in that they can interpolate and extrapolate in novel ways. Take a look at the GPT-2 announcement from OpenAI in 2019 and read the generated text about the discovery of unicorns (https://openai.com/index/better-language-models/). This is an entire fantastical creation that created new and unique details for a scenario that did not exist in the training data. When GPT-3 came out (before ChatGPT had even been thought of), I played with what it could do and saw plenty of creative text generation. Similarly, when image-generation models came out, people delighted in creating hybrid animals, mixing together different and purportedly incompatible art styles, and so forth.
RL, if anything, adds _constraints_ to the creativity of LLMs. It says “you must be logical” or “always check your sources” or “don't claim to have any kind of sense of self, that upsets the humans”.
We can, of course, adopt a definition of creativity that presupposes some ex nihilo belief about human creativity: that we're not merely remixing and reinterpreting what has gone before, but magically pulling new ideas out of the ether, unmoved by all that came before. But if that's the position, there's little point in having any kind of conversation, as we've decided the conclusion at the outset: only humans need apply for the creativity merit badge.
That said, I'm still delighted to read your post and see you thinking about these issues, and even if we disagree about what makes AI potentially creative, we both agree that it sometimes is, by our own definitions.
And with creativity in mind, inspired by your post, I asked Claude to come up with a creative parody of your post, where we ask whether planes can really fly, and make the claim that only jets come close to the flight freedom of birds. It's just a bit of fun, but I hope you enjoy it (https://claude.ai/public/artifacts/725072aa-b13e-41b9-a942-23322987bf55).
Thanks, Melissa! I always appreciate your comments. Perhaps this time, though, you missed that the post is specifically focused on creative problem solving? The "constraints" you mention as a critique of RL are absolutely necessary. A creative solution to a math problem can't involve creative rules of logic. Creative code must still compile, etc. GPT-3 was terrible at problem solving. It took RL to really crack that, and of course there's still plenty of room for improvement.
Thanks for the reply. Much appreciated!
Perhaps you can see how I misunderstood your key point: the title of your piece is “Can LLMs be creative?” with the subtitle “How far out of distribution can modern AI systems go?”, so it's not a huge surprise that that's what I took the thrust of the article to be.
Arguably, solving problems in restricted domains is something classical AI and other search techniques have a long history with. We don't think of most chess programs as creative, even as they thrash humans. AlphaGo has at least as much in common with chess programs as it does with LLMs (in fact, its successor, AlphaZero, can play either game). AlphaGo's main mode of “creative exploration” is Monte Carlo Tree Search. In the match with Lee Sedol (who, incidentally, was no longer the top-ranked player in the world at that time), move 37 did indeed show that AlphaGo's board evaluation (via its policy network and value network) had captured a nuance previously undiscovered by humans. But today's chess programs also make “inspired” moves that grandmasters can learn from, and again, people don't see chess programs as the epitome of creativity.
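For concreteness, that “creative exploration” is easy to sketch. Below is a toy, one-ply Monte Carlo search with UCB1 action selection on a count-to-21 game — my own illustrative example, not AlphaGo's actual algorithm; real MCTS grows a tree of positions, and AlphaGo additionally guides the search with its policy and value networks.

```python
import math, random

# Toy "count to 21" game: players alternately add 1, 2, or 3 to a running
# total; whoever is forced to reach 21 or more loses. From a total of 17
# the winning move is +3: it leaves the opponent at 20, forced to bust.
LIMIT = 21
ACTIONS = (1, 2, 3)

def rollout(total):
    # Random playout from `total`, current player to move.
    # Returns +1 if that player wins, -1 if they lose.
    player = 1
    while True:
        total += random.choice(ACTIONS)
        if total >= LIMIT:
            return -player  # the player who just moved busted and loses
        player = -player

def mcts_best_action(total, iters=4000):
    # One-ply Monte Carlo search with UCB1 selection: keep re-trying
    # promising actions (exploitation) while still sampling
    # under-explored ones (exploration).
    wins = {a: 0.0 for a in ACTIONS}
    visits = {a: 0 for a in ACTIONS}
    for t in range(1, iters + 1):
        def ucb(a):
            if visits[a] == 0:
                return float("inf")  # try every action at least once
            return wins[a] / visits[a] + math.sqrt(2 * math.log(t) / visits[a])
        a = max(ACTIONS, key=ucb)
        nxt = total + a
        # Result from our perspective: busting loses outright; otherwise
        # the rollout is from the opponent's perspective, so negate it.
        result = -1 if nxt >= LIMIT else -rollout(nxt)
        visits[a] += 1
        wins[a] += (result + 1) / 2  # map {-1, +1} results to {0, 1} wins
    # Recommend the most-visited action (the standard MCTS final choice).
    return max(ACTIONS, key=lambda a: visits[a])

random.seed(0)
best = mcts_best_action(17)
print(best)  # from 17 the winning move is 3
```

Nothing in there looks like inspiration from the inside; the “brilliant” move falls out of sampling plus a bias toward what has paid off so far.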
Likewise, SMT solvers like Z3, symbolic math systems like Mathematica, and interactive theorem provers like Rocq follow the rules of math and can churn out solutions to challenging problems and produce proofs, yet few would call them creative. Google DeepMind's first foray into the International Mathematical Olympiad used LLMs heavily augmented by these kinds of systems to achieve its silver-medal standard. So again, in my eyes, this prowess in solving math problems doesn't, of itself, show evidence of the kind of creativity that matters to people.
With GPT-3, one of the first things that helped with problem solving wasn't RL per se, it was providing instructions to be systematic, to “think step by step”. In part, this was compensating for a problem with the training data. We tend to publish final answers, not the scaffolding that got us there, and so in their writing, LLMs attempted to recreate what they'd seen and skip all the careful working. Whether by RL or by just changing the training data, a necessary step was to say “no, don't just leap to an answer, think it through, and pay attention to catch your own mistakes”.
For agentic work, RL does help massively. If you want LLMs to navigate web pages to book a vacation for you, or be sure to run the test suite before committing changes to the codebase, it's absolutely going to help. But again, coloring inside the lines is rarely seen as a sign of creativity. Although, sure, in some contexts, rules do help channel creative forces productively.
In any case, I wasn't critiquing RL any more than the gentle parody I commissioned that pondered whether planes could “really fly” was critiquing jets—much as you don't need a jet to perform aerobatics and you don't need a Harrier to land in a field, we can witness plenty of creativity from LLMs with no RL in sight. Of course, no matter what, we can always say it isn't “the right kind” of creativity much as we can say that our airplanes don't do “the right kind” of flying. RL won't stop people saying that “true creativity” belongs only to humans, as your final paragraph made clear.
As I said in the original post, I was really focusing more on the idea of what it means to be "out of distribution". It does NOT mean that an LLM produced something new and novel. It means the LLM produced something not in the statistical distribution defined by the training data. In your initial response to my post you said GPT-2 produced an "entire fantastical creation that created new and unique details for a scenario that did not exist in the training data." That does not mean what it produced was out of distribution. Almost by definition, you need some other technique besides supervised learning to do that, and RL is pretty much the only thing that anyone developing LLMs has come up with. You can argue that ChatGPT wasn't just supervised learning, even from its beginning, because of the RLHF phase of training, but that's also RL.
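To make the statistical sense of “out of distribution” concrete, here's a toy sketch of my own (nothing like real LLM training): a character-bigram model assigns far higher surprise (average negative log-likelihood) to text unlike its training data, and its sampling objective never rewards producing such text.

```python
from collections import Counter
import math

def bigram_surprise(corpus):
    # Fit a character-bigram model on `corpus` and return a scorer that
    # gives the average negative log-probability (surprise) of a string.
    pairs = Counter(zip(corpus, corpus[1:]))
    firsts = Counter(corpus[:-1])
    vocab_size = len(set(corpus))
    def surprise(text):
        total = 0.0
        bigrams = list(zip(text, text[1:]))
        for a, b in bigrams:
            # Add-one smoothing: unseen bigrams get a small nonzero probability.
            p = (pairs[(a, b)] + 1) / (firsts[a] + vocab_size + 1)
            total += -math.log(p)
        return total / max(len(bigrams), 1)
    return surprise

surprise = bigram_surprise("the cat sat on the mat. the dog sat on the log. " * 20)
in_dist = surprise("the dog sat on the mat.")   # recombines the training text
out_dist = surprise("zqxv jkwp qzzt vxkq")      # nothing like the training text
print(in_dist < out_dist)  # True: the OOD string is far more surprising
```

Note that the in-distribution sentence never appears verbatim in the corpus — it's a novel recombination, yet it still scores as probable. That's the distinction: novelty alone doesn't put an output outside the training distribution.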
With all that said, I basically agree with you ... supervised learning can certainly produce new things that APPEAR creative. When a human combines existing ideas in a new way, they are certainly labelled creative, and that's what supervised learning can do.
Nicely put. LLMs by these lights are assuredly creative. As stated before, I hope we get an LLM-devised molecule that cures glioblastomas or solves three-body problems in a twinkling, but in the cultural creativity arena we already have an overabundance of middlebrow (or worse) books/TV/movies/podcasts (hmm... that may be a tautology). Do we really deserve the LLM equivalent of the Victorian novelist William Ainsworth, who wrote 45 books, mostly novels? Name one! (Interest piqued? Read Zadie Smith's The Fraud.) Can LLMs generate sublime literature? It may be too soon to tell.
For what it's worth, human creativity IS different from LLM creativity, in which syntax excludes semantics. If I can persuade you that structure has a great deal to do with function, the human CNS is a horse of a different color compared to LLMs' ersatz neural architecture, which lacks the >1,000 synapses per neuron, astrocytes, microglia, axoplasm, glymphatics, neurotransmitters, intraneuronal hormones, and all the other neural wetware that evolved over the course of 4.5 billion years ... and which (unless you're a Dualist) appear to underlie human cognition ... and creativity. Human creativity is structurally and functionally sui generis and in the haute cultural arena will perforce be different from LLM cultural creativity. Is it better? Will ChatGPT #8 be the next Tolstoy/Joyce/Proust? The audience awaits.
From what I can tell, William Ainsworth was popular in his day. Likely no one will care about Dan Brown a few decades from now either.
Is it your position that only our best art deserves the label of “creative”? Most people on the planet have no published work, no exhibitions. Is everyone outside the top 0.1% a worthless dullard?
A few years ago, we would have been amazed at an AI system that could write a coherent story, make a drawing that matched a prompt, or compose a coherent tune. Now, of course, that's a given, and the question is whether the quality of the output is truly outstanding, outshining the best humans can do. Watching the goal posts move is my new spectator sport.
I'll leave the essentialist stuff you closed with alone, except to say that 4.5 billion years of evolution was optimizing for gene transmission, not creation of great art or literature.
As an undergraduate English major, I cordially invite you to explore the Victorian literary purgatory of W. H. Ainsworth. His Delphi digital oeuvre can be readily had for $2.99.
Of course middlebrow (or worse) art is creative, but the Techbros' aspirations appear to be aimed a little higher. Having seen AI defeat Kasparov at chess and Lee Sedol at Go, I'm pretty sure they would like to take on Tolstoy in a best-of-five series.
If I'm guilty of promoting greatness in culture, I fear AI is liable to the same accusation. I am concerned that the AI true believers may be punching above their weight in the realm of belles lettres. I don't believe stochastic parrots can write the next Swann's Way (imitations not accepted).
If you insist on invoking the Selfish Gene meme, help me grasp the selective/survival advantage of the genes that produced our frontal lobes, which created calculus and String Theory.
I assume that "essentialist stuff" is the AI way of saying, "I don't do neurons." More's the pity. If we discard the lessons from the 4.5-billion-year evolution of the ur-biological thinking machine in favor of those of Babbage & Turing, with their ~200-year evolution, we may be missing something.