LLMs, Humor & Play

Essay • 950 Words • Humor, Artificial Intelligence, 2026 • 06/17/2026

Can an LLM write a joke? Well, yes and no. Many people debate whether an LLM can really perform speech acts like assertion, because there is no intentionality and/or Theory of Mind behind its output. So if a joke is something created in order to make someone laugh, the LLM would have to understand and intend that, which many critics would rightfully argue it is unable to do, at least right now. However, in a purely formal way, yes, it can output something that looks like a joke (though still lacking something like taste), and that output can even make you laugh (although I cannot find any randomized controlled trial in which human judges compare human- and LLM-generated jokes).

The next question that we then have to ask is “What makes something funny?” There are a lot of competing theories, but ultimately (to me) what is funny is whatever makes you laugh¹. So really, evaluating something (subconsciously, perhaps) and finding it funny is what makes it funny. This means that humor may be fundamentally goal-oriented: its goal is to make you laugh. But what makes you laugh? Can we measure that? This is where (formal) evals come in.

What are evals and why are they important?

“Evals are, at their core, systematic methods for measuring whether your AI system is doing what you want it to do. They answer questions like: Is the model’s output accurate? Is it consistent? Does it stay within the guardrails you set? Does it degrade when the input changes slightly? Is it getting better or worse as you iterate on your system?”

The Death of Prompt Engineering, And How Evals Are Rising in Its Place by Yahav M

You may already be able to see how they could be helpful, but they become incredibly important in autonomous agentic systems. Evals are necessary for loops: for a system to reach a goal, it must be able to experiment and measure the outcomes of the various things it tries in order to move closer to that goal. The creator of OpenClaw, a popular open-source autonomous agent, talking about loops which started discourse™:

The lack of humor evals is not for lack of trying. There is a branch of (computational) linguistics called computational humor that seeks to understand as well as generate humorous content. Maybe what makes something funny is timing, or semantic distance. Of course, in reality it’s probably a mixture of a ton of different things, and while I am fascinated by this line of questioning in this sub-field, I do think that it quickly starts to miss the point. To what end do we generate jokes? To say that we could? Humor is a puzzle, and jokes are little conceptual puzzles unto themselves, so I guess there will always be people who are fascinated by trying to crack them.

Let’s say for the sake of argument that a formal eval for humor is found and can be employed in an agentic loop. What would happen? There would be individual differences between jokes, because sampling temperature makes the outputs non-deterministic, but otherwise I feel like you would start to see certain themes arise. If we did not prompt the loop at the beginning with the skeleton or general premise of a joke, then, whether through frequency of appearance in training data or through RLHF, we would probably land on some kind of statistically average joke (maybe a knock-knock joke). Now wouldn’t that be funny?
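To make the thought experiment a little more concrete, here is a minimal sketch of such a loop in Python. Everything in it is an assumption made up for illustration: `JOKE_FORMATS` and its weights stand in for how often a kind of joke might show up in training data, `propose_joke` stands in for an LLM call, and `humor_eval` stands in for the formal humor eval that, as far as I know, does not exist yet.

```python
import random
from collections import Counter

# Hypothetical stand-in for an LLM's prior: joke formats weighted by how often
# they might appear in training data. These numbers are invented.
JOKE_FORMATS = {
    "knock-knock": 0.5,
    "pun": 0.3,
    "observational": 0.15,
    "absurdist": 0.05,
}

# Hypothetical eval scores in [0, 1]. No such humor metric exists; these are
# placeholders so the loop has something to measure.
EVAL_SCORES = {
    "knock-knock": 0.6,
    "pun": 0.55,
    "observational": 0.5,
    "absurdist": 0.4,
}


def propose_joke(rng: random.Random, temperature: float = 1.0) -> str:
    """Sample a joke format; a higher temperature flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    formats = list(JOKE_FORMATS)
    # Raising each weight to 1/temperature is equivalent to temperature-scaling
    # the underlying logits before normalizing.
    weights = [w ** (1.0 / temperature) for w in JOKE_FORMATS.values()]
    return rng.choices(formats, weights=weights, k=1)[0]


def humor_eval(joke_format: str) -> float:
    """Hypothetical eval: look up the made-up score for a joke format."""
    return EVAL_SCORES[joke_format]


def agentic_loop(rng: random.Random, steps: int = 20, temperature: float = 1.0):
    """Try candidates, measure each with the eval, and keep the best so far."""
    best, best_score = None, float("-inf")
    for _ in range(steps):
        candidate = propose_joke(rng, temperature)
        score = humor_eval(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    rng = random.Random(0)
    # Run the whole loop many times and count which format wins each run.
    winners = Counter(agentic_loop(rng)[0] for _ in range(100))
    print(winners.most_common())
```

Individual samples vary from step to step, but the loop keeps whatever the eval scores highest, so repeated runs collapse onto the same format. With these invented numbers, that format is the one that is both most frequent and best scored, which is the "statistically average joke" worry in miniature.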

At the center of all this is the question “What makes humor fun to consume and participate in?” In Ted Cohen’s book, Jokes: Philosophical Thoughts on Joking Matters, he talks about how jokes by nature require a shared understanding in order to work, so there is a certain kind of intimacy shared by the joke teller and the joke listener. In David Shoemaker’s book, Wisecracks: Humor and Morality in Everyday Life, he talks about the incredibly important role of interpersonal humor in how we relate to and play with each other. Think about riffing with your buddies or comedically sparring with them.

To participate in humor is perhaps to play with a quirk of human evolution: using language to exploit a quirk of predictive processing. But we do play with non-human agents, like playing fetch with your dog, so who’s to say that it always has to be human-to-human play? What if, beyond working with LLMs, we might play with them in the future?² While LLMs right now are so RLHF’d into being assistants that their humor is constrained by their own professionalism, there is no guarantee that it will be like that forever. What might it look like if you riffed or ran bits with/on Claude? Is this good for humanity? Bad? More advanced AI and chatbots open up novel and unexpected ways of interacting with LLMs, and it is our responsibility as people to think through the potential consequences.

Footnotes

¹ My true opinion is more nuanced than this, because laughter can “misfire” at things that aren’t funny or meant to be funny and are actually quite serious. However, for the sake of this argument, I think it is fine enough.

² I need to think more about this and whether it is good for humanity or not. I think that if we open up another entire paradigm of interacting with LLMs beyond work (I think that romantic chatbots are still in this work mode of emotional labor and sycophancy), AI will only become more entrenched in (certain people’s) lives. Whether this is a good thing or a bad thing is up to other thinkers, but also only time will be able to tell. We are not soothsayers.


