AI and the three types of creativity
Revisiting Boden's classification with modern LLMs in mind
Audio version available here.
A few weeks ago I wrote a post titled "Can LLMs be creative?". In it, I argued that the original version of systems like ChatGPT lacked any real creative ability, but that now that techniques like RLVR (Reinforcement Learning with Verifiable Rewards) have been added to the training process, creativity is possible. In the comments section, my colleague Melissa O'Neill offered some fantastic pushback, and in the ensuing conversation I realized that I had not articulated well what I meant by the word "creative." After some further research I came across the work of Margaret Boden, who, in her book The Creative Mind, described three distinct types of creativity in her discussion of AI. This was long before ChatGPT: the original version of Boden's book was published in 1990, and the second edition in 2004. Boden's categories nicely resolve the debate Melissa and I had: I was talking about one of her categories, while Melissa was talking about the others. In this post I'll explain these terms and discuss how they relate to modern AI systems and my earlier post on AI creativity.
The subtitle of my original post was "How far out of distribution can modern AI systems go?" This is a technical question that turns out to be the crux of the issue. I'll explain. The goal of any machine learning algorithm, dating back long before the invention of neural networks, is to extrapolate from some observed data to new data. Humans often do this without thinking. For example, if I tell you that the price of 4 widgets is 50 cents, and the price of 10 widgets is 90 cents, you might reasonably guess that the price of 7 widgets (halfway between 4 and 10) is 70 cents (halfway between 50 and 90). That's a basic example of perhaps the first ML algorithm, linear regression: you have a bunch of data that you assume all lies on (or near) a line if you were to plot it. You can then guess about unknown data (like the price of 7 widgets) by assuming this linear pattern extends to the new situation. This works because, loosely speaking, 4 widgets, 7 widgets, and 10 widgets are all in the same "statistical distribution," so we say the new data point (7 widgets) is "in-distribution". The price of 1,200 widgets, a quantity far outside anything our data covers, would be "out-of-distribution".
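To make the widget example concrete, here is a minimal sketch of that first algorithm, least-squares linear regression, fit to the two hypothetical price points above. The numbers and function names are mine, purely for illustration:

```python
# Toy linear regression: fit a line to the widget prices, then
# interpolate (in-distribution) and extrapolate (out-of-distribution).
# Hypothetical data from the example: 4 widgets -> 50 cents, 10 -> 90 cents.

def fit_line(points):
    """Least-squares fit of y = m*x + b through a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in points)
         / sum((x - mean_x) ** 2 for x, _ in points))
    b = mean_y - m * mean_x
    return m, b

data = [(4, 50), (10, 90)]
m, b = fit_line(data)

print(m * 7 + b)     # interpolation within the observed range: about 70 cents
print(m * 1200 + b)  # the line still produces a number here, but the data
                     # says almost nothing about quantities this far out
```

The point of the last line is that the model never refuses to answer; it simply has no evidence that its linear pattern still holds that far from the training data.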
My earlier blog post started with an anecdote that was repeated often on the internet: the first AI image generators seemed to be incapable of creating an image of a glass of wine filled to its brim, presumably because no such image is in their training data. That doesn’t mean that they were incapable of generating an image of a glass of wine that is 30% full, even if no such photos were in their training data. If they were trained on glasses that were 10%, 20%, 40%, and 45% full, then 30% would have been “in-distribution,” so the ability to generate such an image wouldn’t have been all that surprising. However, a truly full glass might very well have been “out-of-distribution” if it was enough unlike any of the training images. Early AIs were only able to explore the statistical distributions on which they were trained, but as long as they were operating within those bounds, they could create something new. In Boden’s terminology, this would be called “exploratory creativity”.
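One crude way to make "in-distribution" concrete (a toy sketch, not how image generators actually work) is to summarize each training image by a single number, its fill level, and ask how many standard deviations a new fill level sits from the training sample. The fill levels below are the hypothetical ones from the paragraph above:

```python
# Toy in-distribution check: reduce each training image to one number
# (percent full) and measure how unusual a new value is via a z-score.
# The 10/20/40/45 levels are the hypothetical training set from the text.

import statistics

train_fill_levels = [10, 20, 40, 45]  # percent full

def z_score(x, sample):
    """Distance of x from the sample mean, in sample standard deviations."""
    return abs(x - statistics.mean(sample)) / statistics.stdev(sample)

for level in (30, 100):
    tag = "in-distribution" if z_score(level, train_fill_levels) < 2 else "out-of-distribution"
    print(f"{level}% full -> {tag}")
```

A 30%-full glass lands well within two standard deviations of the training levels, while a 100%-full glass is several standard deviations out, which is the one-dimensional caricature of why the brim-full wine glass stumped early generators.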
LLMs are trained on written data so they can generate writing. For all its limitations, the first publicly available version of ChatGPT was able to generate written stories that had never been seen before. They weren’t particularly inspiring pieces of creative writing. Many reported that they might have received a C- grade if turned in by a high school student. But as uninspired or generic as they may have been, those stories were definitely new. In that sense, even the earliest incarnation of ChatGPT was capable of creativity. Every few months since then the models have improved. You can always argue about how good they are at creative writing, but there is little doubt that they’re better than they were just after that first release.
Another reason why the models have exhibited some abilities that people have labelled “creative” is that they are trained on a huge variety of information. This means that, even when operating within the bounds of their training data, they may use connections between seemingly disparate types of information to come to novel conclusions. If a human combines existing ideas in new ways, most people would recognize that as creativity. In Boden’s terminology, the ability to combine existing ideas in new ways is called “combinatorial creativity”.
Many years ago I attended a talk by the mathematician Peter Hilton about what it was like to work with Alan Turing at Bletchley Park, where the German Enigma code was cracked during World War II (as portrayed in the movie The Imitation Game). Hilton described Turing as a true "genius," and then explained what that word meant to him. When you are playing chess against someone much smarter than you, they may make a move you didn't expect, but in retrospect you can see how they arrived at that decision. When playing against a true genius, they may make a move that completely baffles you, and yet it works; somehow their brain just seems to work differently. For the purposes of this conversation, Turing was able to produce solutions that were out-of-distribution. This kind of thinking is what Boden called "transformational creativity".
Recently I gave a talk about AI to an audience of mostly mathematicians. Several people in the audience were uncomfortable with using AI as a research tool. One of them asked, "Aren't LLMs just a way for Big Tech companies to steal the work of generations of humans?" There is some validity to this, but the real story is more complicated. The earliest version of ChatGPT was terrible at mathematics, even though it was likely trained on all of the papers posted to the arXiv preprint server. Pretraining on lots of previous mathematics just wasn't enough to imbue it with any mathematical prowess. However, within the past year LLMs have gotten a lot better, primarily because Reinforcement Learning with Verifiable Rewards has been added to their training procedure. With this technique, pre-trained LLMs improve their mathematical abilities by exploring problem spaces on their own and learning from experience. In other words, newer AI systems are gaining the ability to move beyond the distribution defined by their training data: they are now showing signs of transformational creativity.
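The post doesn't describe any training code, but the core idea of verifiable rewards can be caricatured in a few lines: sample a solution, check it against a verifier that knows the exact answer, and reinforce whatever produced the reward. In this toy sketch a two-option "policy" stands in for an LLM, and every name and number is illustrative, not a real RLVR implementation:

```python
# Toy caricature of RL with verifiable rewards. A real setup samples
# solutions from an LLM; here a tiny "policy" chooses between two candidate
# strategies, a verifier checks the answer exactly, and the strategy that
# earns reward gets sampled more often.

import random

random.seed(0)

# Multiplication problems with known answers, so rewards are verifiable.
problems = [(a, b, a * b) for a in range(2, 9) for b in range(2, 9)]

strategies = {
    "multiply": lambda a, b: a * b,  # correct strategy
    "add":      lambda a, b: a + b,  # buggy strategy (rarely right)
}
weights = {name: 1.0 for name in strategies}  # uniform policy to start

def verify(answer, truth):
    """The verifiable reward: 1 if exactly right, else 0."""
    return 1.0 if answer == truth else 0.0

for step in range(200):
    a, b, truth = random.choice(problems)
    names = list(weights)
    name = random.choices(names, weights=[weights[n] for n in names])[0]
    reward = verify(strategies[name](a, b), truth)
    weights[name] *= 1.1 if reward else 0.9  # reinforce or discourage

print(weights)  # "multiply" should end up carrying most of the weight
```

The key property this sketch shares with real RLVR is that no human-written example tells the policy which strategy is right; the feedback comes from an automatic check of the answer, which is exactly why mathematics (where answers can be verified) was where the technique paid off first.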
Going back to the title of my original post, “Can LLMs be creative?”, I think Melissa is right. LLMs have exhibited the ability to produce novel results since their inception. However, that ability is only evidence of some kinds of creativity. For LLMs to move beyond that, AI engineers have had to introduce new training techniques. It’s only been within the last year that those algorithmic improvements have been applied successfully, and we are only just beginning to see what that added ability can create.


