Assessing Humor on WhatsApp

My friend group uses WhatsApp to stay in touch. As our lives progress it’s a great way to stay close. But the thing that binds us together the most is probably humor. It’s how we bond and it’s fun!

A couple weeks ago I asked my friends a simple question: Who in our group is the funniest? I received 5 different answers.

Hmm.

I got to thinking: could I build an automated process to read the raw text of a WhatsApp chat and use MATH to determine who is the funniest? The answer, it turns out, is yes.*

I’ve written Python code that ingests a WhatsApp chat export, identifies laughers and jokers, calculates humor metrics for each member of the chat, and visualizes the results. I invite you to take it for a spin on your own chat – clone my GitHub repo.

How it works

How can we measure how funny someone is? By how much laughter they cause! That’s where we start: identifying laughter. Messages are flagged as containing laughter by performing keyword searches for known laughter terms (LOL, haha, 😂, etc.). That’s the easy part. Next we need to determine who caused the laughter.
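Before moving on, here is a minimal sketch of what the keyword flagging step could look like. The pattern below is an assumption for illustration: the term list, the name `LAUGH_PATTERN`, and the function `contains_laughter` are hypothetical, and the real code may search for a different or longer set of laughter terms.

```python
import re

# Hypothetical laughter terms: "lol", "lmao", "rofl", runs of "haha", and two emoji.
# The actual keyword list in the project may differ.
LAUGH_PATTERN = re.compile(
    r"\b(?:lol+|lmao|rofl|a?(?:ha){2,}h?)\b|[😂🤣]",
    re.IGNORECASE,
)


def contains_laughter(message: str) -> bool:
    """Return True if the message matches any of the laughter terms above."""
    return LAUGH_PATTERN.search(message) is not None


if __name__ == "__main__":
    for text in ["LOL that is great", "Hahaha", "so true 😂", "See you at 5"]:
        print(repr(text), contains_laughter(text))
```

The word boundaries keep the pattern from firing on ordinary words that merely contain "ha", which is one of the simpler ways a keyword search like this can go wrong.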

To link laughter to its originating “joke” in an automated way, we need to look at the preceding chats and consider some rules:

  1. Is the laugher responding to a quoted message? If so – you’re done! Easy one!
  2. Is the chatter == the laugher? If so, keep looking.
  3. Does the chat also contain laughter? If so, keep looking.

The first chat we come to that passes both rule 2 and rule 3 is credited as the ‘joker’. Here’s an example:

What successful laughter attribution looks like.
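The three rules can be sketched as a backwards walk through the chat. This is a simplified illustration, not the code from the repo: the `Chat` record, its `quoted_index` field, and the function `find_joker` are hypothetical names, and it assumes each message has already been flagged for laughter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chat:
    sender: str
    text: str
    has_laughter: bool
    # Hypothetical field: position of the quoted message, if the chat quotes one.
    quoted_index: Optional[int] = None


def find_joker(chats: List[Chat], laugh_index: int) -> Optional[int]:
    """Return the index of the chat credited with causing the laugh, or None."""
    laugh = chats[laugh_index]

    # Rule 1: a quoted message is credited directly.
    if laugh.quoted_index is not None:
        return laugh.quoted_index

    # Otherwise walk backwards from the chat just before the laugh.
    for i in range(laugh_index - 1, -1, -1):
        candidate = chats[i]
        if candidate.sender == laugh.sender:
            continue  # Rule 2: the laugher can't be their own joker.
        if candidate.has_laughter:
            continue  # Rule 3: another laugh is not the joke.
        return i
    return None


if __name__ == "__main__":
    chats = [
        Chat("Ann", "a joke", False),
        Chat("Bob", "haha", True),
        Chat("Cat", "LOL", True),
    ]
    joker_index = find_joker(chats, 2)
    print(chats[joker_index].sender)
```

In the example, Cat's laugh skips over Bob's "haha" (rule 3) and lands on Ann's message.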

*It’s not perfect

Any automated process will lack the nuance of a human. Mistakes will be made! The code will sometimes link laughter to the wrong joker, because interpreting human language is hard.

What UN-successful laughter attribution looks like.

But after observing chat behavior for some time, I estimate that this approach assigns laughter correctly about 90% of the time. And assuming the remaining 10% is assigned at random, we shouldn’t be advantaging anyone by too much – though more active chatters will tend to have an advantage on that account.

It also goes without saying that this code may not correctly assess humor in the real world; its perspective is limited to the WhatsApp chat!

The metrics

Now that we have a dataframe containing jokes + laughs, we can perform some analysis to develop features which describe how funny each chatter is. I’ve come up with 4 metrics. Why 4 and not just 1? Because something as complex as humor can’t be captured in just one number!

Weighted Laugh Score

Let’s start with the obvious one: Who caused the most laughter? But not all laughter is created equal: ‘Haha’ < ‘Hahaha’ < ‘Hahahaha’. We assign a “weight” value to known laughter keywords which describe the strength of the laugh. This metric is the sum of all laughter weights caused by the joker in the full chat.
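A rough sketch of how such a weighting might work is below. The specific weights are made up for illustration (the names `FIXED_WEIGHTS`, `laugh_weight`, and `weighted_laugh_scores` are hypothetical too); the only property taken from the description above is that longer "haha" runs count for more.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical fixed weights for non-"haha" laughter terms.
FIXED_WEIGHTS = {"lol": 1, "lmao": 2, "😂": 2}


def laugh_weight(text: str) -> int:
    """Weight a laugh: 'haha' -> 1, 'hahaha' -> 2, 'hahahaha' -> 3, and so on."""
    lowered = text.lower()
    runs = re.findall(r"(?:ha){2,}", lowered)
    # Each "ha" beyond the first adds one point; use the longest run found.
    ha_weight = max((len(run) // 2 - 1 for run in runs), default=0)
    # Simple substring check against the fixed keyword weights.
    keyword_weight = max(
        (weight for term, weight in FIXED_WEIGHTS.items() if term in lowered),
        default=0,
    )
    return max(ha_weight, keyword_weight)


def weighted_laugh_scores(laughs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Sum laugh weights per joker from (joker, laugh_text) pairs."""
    scores: Dict[str, int] = defaultdict(int)
    for joker, laugh_text in laughs:
        scores[joker] += laugh_weight(laugh_text)
    return dict(scores)


if __name__ == "__main__":
    print(weighted_laugh_scores([("Ann", "haha"), ("Ann", "Hahaha"), ("Bob", "lol")]))
```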

Joke Proportion Score

How frequently did each chatter tell a joke? We want to value low-activity chatters who are often funny, and penalize high-activity chatters that produce a lot of unfunny messages. This metric is all about the “humor value” you create as a proportion of the count of your chats.
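One simple way to express this, assuming "humor value" means the total laughter weight a chatter has earned, is a plain ratio. The function name `joke_proportion_scores` and both input dictionaries are hypothetical.

```python
from typing import Dict


def joke_proportion_scores(
    humor_value: Dict[str, float], message_counts: Dict[str, int]
) -> Dict[str, float]:
    """Divide each chatter's humor value by the number of chats they sent."""
    return {
        chatter: humor_value.get(chatter, 0) / count
        for chatter, count in message_counts.items()
        if count > 0  # skip chatters with no messages to avoid dividing by zero
    }


if __name__ == "__main__":
    humor_value = {"Ann": 30, "Bob": 30}
    message_counts = {"Ann": 100, "Bob": 300}
    print(joke_proportion_scores(humor_value, message_counts))
```

Here Ann and Bob earn the same amount of laughter, but Ann needs a third as many messages to do it, so she scores three times higher.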

Laughter Distribution Score

OK – maybe you are causing a lot of laughter and you have a high joke proportion in your chats. But do you have broad appeal among your audience, or are you getting all your laughter from a vocal minority? This metric reduces the score for jokers who get a disproportionate amount of their laughter from a subset of their audience; it boosts the score for jokers who have an even spread of laughter from all members of the chat.
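The exact formula isn't spelled out here, so the sketch below swaps in one common way to measure "evenness": normalized Shannon entropy. That choice is an assumption of this example, as are the names `laughter_evenness` and `audience_size`. It returns 1.0 when laughter is spread perfectly evenly across the audience and 0.0 when it all comes from one person.

```python
import math
from typing import Dict


def laughter_evenness(laughs_by_member: Dict[str, float], audience_size: int) -> float:
    """Normalized Shannon entropy of where a joker's laughter comes from.

    laughs_by_member maps each laugher to the laughter they gave this joker.
    audience_size is the number of other members in the chat.
    """
    total = sum(laughs_by_member.values())
    if total == 0 or audience_size < 2:
        return 0.0
    entropy = 0.0
    for amount in laughs_by_member.values():
        if amount > 0:
            share = amount / total
            entropy -= share * math.log(share)
    # Divide by the maximum possible entropy so the result lies in [0, 1].
    return entropy / math.log(audience_size)


if __name__ == "__main__":
    even = {"Bob": 5, "Cat": 5, "Dan": 5, "Eve": 5}
    skewed = {"Bob": 17, "Cat": 1, "Dan": 1, "Eve": 1}
    print(laughter_evenness(even, 4), laughter_evenness(skewed, 4))
```

A value like this could then be used to scale a joker's raw laughter up or down.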

Big Joke Score

Similar to Weighted Laugh Score, but now we’re only crediting “big” jokes – defined as any message that causes multiple people to laugh. I felt this was necessary to further disambiguate truly funny messages from messages that elicit only a casual ‘haha’.
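As a sketch, the "big joke" filter can be written as a count of distinct laughers per joke. The tuple layout `(joke_id, joker, laugher, weight)` and the name `big_joke_scores` are hypothetical; the threshold of two laughers follows the "multiple people" definition above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Laugh = Tuple[int, str, str, int]  # (joke_id, joker, laugher, weight)


def big_joke_scores(laughs: Iterable[Laugh], min_laughers: int = 2) -> Dict[str, int]:
    """Sum laugh weights per joker, counting only jokes with enough distinct laughers."""
    laughers: Dict[int, Set[str]] = defaultdict(set)
    weight_total: Dict[int, int] = defaultdict(int)
    joker_of: Dict[int, str] = {}

    for joke_id, joker, laugher, weight in laughs:
        laughers[joke_id].add(laugher)
        weight_total[joke_id] += weight
        joker_of[joke_id] = joker

    scores: Dict[str, int] = defaultdict(int)
    for joke_id, who_laughed in laughers.items():
        if len(who_laughed) >= min_laughers:
            scores[joker_of[joke_id]] += weight_total[joke_id]
    return dict(scores)


if __name__ == "__main__":
    laughs = [
        (1, "Ann", "Bob", 1),
        (1, "Ann", "Cat", 2),
        (2, "Bob", "Ann", 3),  # only one laugher, so not a "big" joke
    ]
    print(big_joke_scores(laughs))
```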

These metrics are weighted evenly and rolled up into a composite humor score.
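Since the four metrics live on different scales, an even weighting needs some kind of normalization first. The sketch below assumes one simple option, dividing each metric by its highest value in the chat and then averaging; the real roll-up may normalize differently. The name `composite_scores` is hypothetical.

```python
from typing import Dict


def composite_scores(metrics: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Scale each metric to [0, 1] by its top value, then average them evenly.

    metrics maps a metric name to a {chatter: value} dictionary.
    """
    chatters = set()
    for per_chatter in metrics.values():
        chatters.update(per_chatter)

    totals = {chatter: 0.0 for chatter in chatters}
    for per_chatter in metrics.values():
        top = max(per_chatter.values(), default=0)
        if top <= 0:
            continue  # nobody scored on this metric, so it contributes nothing
        for chatter in chatters:
            totals[chatter] += per_chatter.get(chatter, 0) / top

    return {chatter: total / len(metrics) for chatter, total in totals.items()}


if __name__ == "__main__":
    metrics = {
        "weighted_laugh": {"Ann": 10, "Bob": 5},
        "joke_proportion": {"Ann": 0.1, "Bob": 0.2},
    }
    print(composite_scores(metrics))
```

In this toy example Ann wins one metric and Bob wins the other by the same margin, so they end up tied.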

Putting it all together

With the data compiled and calculations complete, we can visualize the results! To illustrate I will share the summaries produced from one of my chats, “Beware of Trolls”, which presently includes 13 members and has accumulated over 17,000 messages through March 2020.

Behold!

The humor champions revealed

Same chat, each metric reported individually:

Results for each humor metric

Monthly analysis

With the code written for overall humor summaries, we can produce the same summaries by month, which opens up a number of cool things: showing trends over time, and finding someone’s funniest month!

Here is my own trend line showing my humor score over time:

My own humor score over time. Middle of the pack!

But what we really want to see is how we compare to other members of the chat. So we plot the top 9 jokers together!

It’s more fun to plot 9 lines side-by-side!

And with humor scores calculated for each month, we can determine every chatter’s funniest month! Wow.

Mine, it turns out, was November 2018. Because of this joke:

I turned “Wolf of Wall Street” into “World of Warcraft” – clearly I am the authority on humor.
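For the curious, picking out each chatter's funniest month is a small step once monthly scores exist. A minimal sketch, assuming the scores are keyed by a `(chatter, "YYYY-MM")` pair (that layout and the name `funniest_months` are hypothetical, and the numbers in the example are invented):

```python
from typing import Dict, Tuple


def funniest_months(monthly_scores: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """Return each chatter's highest-scoring month from {(chatter, month): score}."""
    best: Dict[str, Tuple[str, float]] = {}
    for (chatter, month), score in monthly_scores.items():
        # Keep the first month seen, then replace it only on a strictly higher score.
        if chatter not in best or score > best[chatter][1]:
            best[chatter] = (month, score)
    return {chatter: month for chatter, (month, _score) in best.items()}


if __name__ == "__main__":
    # Invented scores, for illustration only.
    monthly_scores = {
        ("Ann", "2018-10"): 0.4,
        ("Ann", "2018-11"): 0.9,
        ("Bob", "2018-11"): 0.2,
    }
    print(funniest_months(monthly_scores))
```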

That’s it folks. If your interest is piqued, feel free to try it out on your own chat. I welcome collaborators and suggestions!

Stay funny everyone. We need humor now more than ever.