
Bad Horse

Beneath the microscope, you contain galaxies.



The Great Shakespeare / My Little Pony Showdown: Part 2: RESULTS! · 6:34am Dec 12th, 2014

Grant Voth said, in his lecture series The History of World Literature,

Shakespeare created about 1000 characters in his 37 or 38 plays, and yet each character speaks with his own rhythms, his own accents, his own vocabulary, his own tricks of speech.... Shakespeare's characters are so individualized in terms of speech patterns that it has been said that in any given play, you could take the name tags off of all the speeches, drop them into a hat, shake them up, bring them out one by one, and if you had a good enough ear, you would be able to put all the speeches of a single character together, simply because there are no two characters who sound alike.

So I know the question you're all asking. Who wrote more-distinctive characters: Shakespeare, the Immortal Bard, "the man who invented humanity" [1], "the most influential person who ever lived" [2]--or the writers of My Little Pony: Friendship is Magic?

I can't answer that. What makes a character "distinctive" is a matter of opinion. But I can definitively answer this more-precise question: Which set of characters is more distinguished from each other by the frequency with which they used each individual word?

(This is a continuation of part 1, which discusses whether that precise question is meaningful.)

I found an ASCII file of the complete plays of Shakespeare, and downloaded the transcripts of every My Little Pony: Friendship is Magic episode (seasons 1-4) from mlp.wikia.com. I compiled all the lines for each character into a separate file [3]. Then I used the R library stylo, version, to compare them [4].

stylo finds which texts are most similar to each other in terms of how often they use different words. I split the files for each character into pieces, with similar numbers of words in each piece. Then, once for Shakespeare and once for My Little Pony, I threw all the pieces at stylo and asked it which files most resembled each other.

If a character's vocabulary is distinctive, stylo will say that the different pieces of that character's file are similar. The more that the files for each character cluster together, the more consistent and distinctive that character's vocabulary is. (I did this once with three of Chaucer's narrators from The Canterbury Tales, and the word frequencies matched up all the parts correctly.)
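For illustration, here is a minimal Python sketch of this kind of word-frequency comparison. It is not stylo's code. The post doesn't say which distance measure was used, so the sketch assumes a Burrows's-Delta-style measure: z-score each common word's relative frequency across the texts, then take the mean absolute difference. The function names and toy texts are made up for the example.

```python
from collections import Counter
from statistics import mean, pstdev

def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one text."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]

def delta_distances(texts, n_words=100):
    """texts: dict mapping a name to a list of lowercase tokens.
    Returns a dict mapping (name_a, name_b) to a Delta-style distance."""
    # Vocabulary: the most common words over all texts pooled together.
    pooled = Counter()
    for tokens in texts.values():
        pooled.update(tokens)
    vocab = [w for w, _ in pooled.most_common(n_words)]

    names = sorted(texts)
    freqs = {name: relative_freqs(texts[name], vocab) for name in names}

    # z-score each word's frequency across the texts.
    z = {name: [] for name in names}
    for i in range(len(vocab)):
        column = [freqs[name][i] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            z[name].append((freqs[name][i] - mu) / sigma if sigma > 0 else 0.0)

    # Distance between two texts: mean absolute difference of z-scores.
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist[(a, b)] = mean(abs(x - y) for x, y in zip(z[a], z[b]))
    return dist

# Toy example: A1 and A2 share most of their words, B1 shares none.
toy = {
    "A1": "the cat and the dog and the bird".split(),
    "A2": "the dog and the cat and the fish".split(),
    "B1": "of course of course i will of course".split(),
}
toy_dist = delta_distances(toy)
print(toy_dist[("A1", "A2")] < toy_dist[("A1", "B1")])
```

A clustering step would then repeatedly merge the closest pieces into a tree. That step is omitted here.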

But first, I had some decisions to make.

Which words to count

The more words whose occurrences you count in each file, the more data you have to work with. But the more words you count, the more stylo’s comparison will be comparing the setting or the topic of conversation rather than the way that the characters talk. The default is to count just the most-common 100 words. Including words that group by settings or topics rather than by character would favor Shakespeare, since we’re looking at 95 different MLP episodes, but only 11 Shakespeare plays.

One might argue that Shakespeare’s characters have a more varied vocabulary than ponies, and therefore they require going beyond the default 100 words in order to express their distinct characters. To test this, I counted the number of distinct words needed to account for half of all the words in each set. For Shakespeare, this was 78 words; for MLP, 79. Despite the many claims that Shakespeare had an unusually large vocabulary, the two sets of texts had similar word frequency distributions [8]. Therefore I kept the word count at the default 100.
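The "half of all the words" count can be sketched in a few lines of Python. This is an illustration, not the script used for the post. The function name is hypothetical and the tokenization (lowercased, whitespace-split words) is an assumption.

```python
from collections import Counter

def words_for_coverage(tokens, coverage=0.5):
    """How many distinct words, taken from most to least frequent,
    are needed to account for `coverage` of all tokens."""
    counts = Counter(tokens)
    target = coverage * len(tokens)
    running = 0
    for n, (_, count) in enumerate(counts.most_common(), start=1):
        running += count
        if running >= target:
            return n
    return len(counts)

# Toy example: "a" alone covers 3 of 8 tokens, "a" plus "b" covers 5 of 8.
print(words_for_coverage("a a a b b c d e".split()))
```

Run on a full corpus, a result near 78 or 79 would mean that fewer than 100 word types make up half of everything the characters say.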

I inspected the list of words stylo produced (one list for Shakespeare, one for MLP), and disallowed all proper nouns, catch-phrases that I recognized (“totally”, "awesome", "darling", "ain't", “ooh”, “wikey”, etc.), words that identify the relative ranks of Shakespeare’s characters ("master", "lord", “liege”, "sir", “grace”, “thee”, “thou”, “thine”, “you”, “ye”, “yours”, “we”, “our”), and anything else that seemed to give an unfair advantage in distinguishing characters by their surroundings or who they talked to rather than by who they were.

Deciding what counts as a catchphrase and what counts as character is a little arbitrary. I eliminated “apple” and “apples” but decided to allow "farm", "dragons", "book", "Wonderbolts", and "party" because part of what makes MLP characters distinct is that they all have different professions, hobbies, and interests. I could not identify any words denoting interests that distinguished Shakespeare's characters, who are mainly occupied with sex, drinking, and killing each other, so disallowing such words would unfairly handicap MLP. I eliminated “silly”, “ain’t”, and “shucks”, but allowed “ya” and “wanna” because more than one character uses them frequently.

Very few of any of these terms were in the first 100 words. I ended up eliminating 8 words from Shakespeare (you, thou, your, thy, our, we, thee, france) and 9 from MLP (twilight, rainbow, spike, apple, dash, pinkie, princess, rarity, applejack) before taking the most-common 100 remaining words.
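As a sketch of that filtering step: the excluded words below are the ones listed above, but the function and variable names are made up, and stylo has its own way of restricting the word list that this does not try to reproduce.

```python
from collections import Counter

# Words removed before taking the top 100 (lists taken from the text above).
SHAKESPEARE_EXCLUDED = {"you", "thou", "your", "thy", "our", "we", "thee", "france"}
MLP_EXCLUDED = {"twilight", "rainbow", "spike", "apple", "dash", "pinkie",
                "princess", "rarity", "applejack"}

def top_words(tokens, excluded, n=100):
    """The n most common words after dropping the excluded ones."""
    counts = Counter(t for t in tokens if t not in excluded)
    return [word for word, _ in counts.most_common(n)]

# Toy example: "thou" is the most frequent token but is filtered out.
print(top_words("thou art the the man thou thou".split(), SHAKESPEARE_EXCLUDED, n=2))
```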

Which characters to use

In my experience, stylo needs at least 1000 and preferably 2000 words per file to have much accuracy. Only 14 characters in Shakespeare and 9 in MLP have at least 5000 words each, allowing at least 2 files of 2500 words. (Twilight has the most words, at 31,000).

When I ran stylo, it clustered the Shakespeare characters together that were from the same plays: Iago with Othello, Antony with Brutus, etc. That gave Shakespeare an unfair advantage: stylo would only have a 50% chance of making a mistake with any of those characters.

So I chose only characters from Shakespeare's historical plays about English kings [5] with over 4000 words. Since having fewer characters in one tree would make it more likely for their parts to match up by chance, I used 10 Shakespeare characters and 10 MLP characters. I broke the Shakespeare characters up into files of about 3100 words each, and the MLP characters into files of about 5500 words each, deleting the last Twilight Sparkle file, so that each set had 25 files in all.

Sample size

For statistical reasons, stylo can’t compare small files of different lengths. For a fair comparison, you must use the same number of words from each file. The smallest file in the sample was one half of the Duke of York, with 2054 words. I set the sample size to 2000 words, chosen at random without replacement, from each file.
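A minimal Python sketch of drawing a fixed-size sample without replacement follows. It is illustrative only: stylo has its own sampling options, and the function name and the seed parameter are assumptions added for reproducibility.

```python
import random

def sample_words(tokens, k=2000, seed=None):
    """Draw k tokens at random without replacement from one file's tokens."""
    if len(tokens) < k:
        raise ValueError(f"need at least {k} tokens, got {len(tokens)}")
    rng = random.Random(seed)
    return rng.sample(tokens, k)

# Toy example: a 2054-token file (the size of the smallest file) cut to 2000.
fake_file = [f"w{i}" for i in range(2054)]
sample = sample_words(fake_file, k=2000, seed=1)
print(len(sample))
```

Since only word frequencies matter, the order of the sampled words is irrelevant. Every file contributes the same number of tokens, so the frequencies are comparable.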

I decided that I’d rather check for each character's consistency overall, rather than consistency between multiple plays. If Henry V came across very differently in Henry IV and Henry V, that could be due to character development. So for each sample, I shuffled the lines randomly. Shakespeare’s secondary characters (dukes and earls) match themselves very poorly across different plays [6], but match much better when the lines are shuffled.

Because I’m using random samples, the tree generated by clustering will be different every time. Therefore, I’ll take the average score over the first 3 trees produced for each set.

Whether to adjust for MLP having multiple authors

stylo was designed to identify the true authors of disputed texts. But MLP scripts have many different authors. So stylo should detect differences between Rarity as written by M.A. Larson and as written by Meghan McCarthy. How much of a disadvantage will this give to MLP?

To check this, I first did cluster analysis on the lines written by different writers, using from 1000 to 7000 words per file [7]:

The resulting tree shows the scripts by the same writer cluster perfectly with each other. So which is a stronger influence on vocabulary: writer or character? To test this, I created one separate file for each combination of writer and character, and did a cluster analysis (no sampling) on all resulting parts with over 1000 words:

There is some clustering by writer, but much more clustering by character. I chose not to adjust for MLP characters having different writers. It seemed more useful anyway to test the consistency of the characters overall rather than the separate skill of each writer.

How to score

The measure of success will be the sum, for each file of lines, of its adjacency score. Its adjacency score will be 1 divided by the log base 2 of the number of leaves in the smallest subtree that contains both that file, and another file from the same character. If a file is paired with another file from the same character, which might be written in list form as (A1 A2), its score is 1 / lg(2) = 1. If the tree is given by ((A1 B1) (A2 C2)), the score for A1 is 1 / lg(4) = ½. The score is designed to be inversely-proportional to the amount of information required to find a matching file in the cluster tree.
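The scoring rule can be written out as a short Python sketch. It assumes the tree is given as nested tuples of leaf labels such as "A1", where the character is the label minus its trailing digits. It also assumes that a file with no other file from the same character anywhere in the tree scores 0, a case the definition above doesn't cover.

```python
import math
from collections import Counter

def character_of(leaf):
    """'A1' -> 'A': strip the trailing piece number (labelling is an assumption)."""
    return leaf.rstrip("0123456789")

def adjacency_scores(tree):
    """Map each leaf to 1 / log2(number of leaves in the smallest subtree
    that also contains another leaf from the same character)."""
    scores = {}

    def visit(node):
        if isinstance(node, str):
            return [node]
        leaves = [leaf for child in node for leaf in visit(child)]
        per_character = Counter(character_of(leaf) for leaf in leaves)
        # Children are visited first, so the first subtree in which a leaf
        # finds a match is the smallest one.
        for leaf in leaves:
            if leaf not in scores and per_character[character_of(leaf)] >= 2:
                scores[leaf] = 1 / math.log2(len(leaves))
        return leaves

    for leaf in visit(tree):
        scores.setdefault(leaf, 0.0)  # assumption: no match anywhere scores 0
    return scores

def total_score(tree):
    return sum(adjacency_scores(tree).values())

print(adjacency_scores(("A1", "A2"))["A1"])                  # 1 / lg(2)
print(adjacency_scores((("A1", "B1"), ("A2", "C2")))["A1"])  # 1 / lg(4)
```

With 25 files, a tree in which every file sits next to a file from the same character scores 25.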


The pictures below show the first tree produced for each set:

The maximum possible score is 25. The average scores over three trials were:

Shakespeare: 13.9 (14.8, 14.2, 12.8)
MLP FiM: 19.5 (18.8, 20.5, 19.1)

The Winner:

My Little Pony:

Friendship is Magic

I should've guessed Twilight was a Shakespeare fan.

Better luck next time, Bill.

The MLP score wasn't just higher than Shakespeare's score; it was closer to perfect than it was to Shakespeare's score. That's what I'd call a sound thrashing. To visualize the difference, here's a plot of just the Mane 6, using multi-dimensional scaling to try to project them onto a 2-dimensional plane but keep the high-dimensional distances similar. This uses 2600 words per datapoint [9].

And here's the same plot of the 6 Shakespeare characters with the most lines in these plays:

[1] Harold Bloom, Shakespeare: The Invention of the Human

[2] Stephen Marche, How Shakespeare Changed Everything

[3] This wasn't easy for Shakespeare. He loved to use the same names, and similar names, repeatedly. He has five different Antonios, three Rosalines and a Rosalind, two Ferdinands, two Portias, a Hortensio and a Hortensius, a Cassio and a Cassius, two Juliets and a Julia, two Gloucesters, two Franciscos, two Franciscans, and a Francisca. He has two different Francises and two different Bardolphs in Henry IV alone. There are five parts in The Merchant of Venice with names so similar that scholars usually collapse them into two or three, supposing the ones used least often are typos. I had to separate all such parts out by hand.

[4] I explained stylo here.

[5] Plus The Merry Wives of Windsor, which features the same characters.

[6] Yes, I was careful to change the character name when a new character inherited the same earldom.

[7] This time I didn’t take random samples, but used all the words from each file. This makes the clustering stronger, but sampling would have made it impossible to include the authors who wrote only about 2000 words.

[8] Shakespeare's vocabulary was larger past the first 50% of all words used, using a total of 9040 unique words in 72,550 words, while MLP used 8440 unique words in 120,311 words. The MLP count was elevated some by bad HTML character translations and by words merged together. The Shakespeare count was elevated some by Shakespeare's many arbitrary contractions in the middle of words.

[9] I can't add more characters to the plot without horking up the projection; the number of constraints to satisfy is roughly 7*6/2 = 21 for 7 characters, but 10*9/2 = 45 for 10 characters.

Bad Horse · 1,550 views · #writing #Shakespeare
Comments ( 35 )

Hm. I wonder if the MLP writers have a vocabulary bible like the animators have a character bible for the show.

Wow, Larson's Fluttershy is highly anomalous. I need to look up which episodes he wrote.
Also, everyone writes Pinkie Pie differently. I wonder if that holds true among fanfic writers.

Respect! I mean it! For the sheer amount of work you put into taking talking magic horses seriously. I love it, and I love what it says about this fandom that it gets this kind of angle on it.

I missed the part where you removed catchphrases on the first read over. Only caught it with grep, actually.

It looks like there are a few somewhat arbitrary decisions that have a big influence on the end results. You at least tried to give Shakespeare a fighting chance, though, I guess.
There's just no way to make it a fair fight. He had several centuries less of accumulated culture to work with, focused less on character and more on concept, and on top of it all he had no way of knowing he'd eventually be compared to pied horse animations in a word-clustering competition.

I strongly suspect that part of the reason Pinkie Pie is anomalous is that one of her mannerisms is repeating lines, both her own and other people's.

The other part is that Pinkie Pie actually has a really broad vocabulary, and a lot of her character is in delivery rather than what she actually says. For instance, "hoof biting action overload" is a Pinkie Pie line. Overload sounds like a word that Twilight would use, while action is more of a Rainbow Dash word. And yet it is totally a Pinkie Pie line. She says and is involved with and does a wide variety of very random-seeming things, which probably means that her dialogue is very eclectic, and unlike Twilight, she doesn't have the sophisticated bits to her.

It is interesting that half of Spike falls in with Twilight while the other half falls in with Rainbow Dash; I wonder if this is a genuine phenomenon or not. On the one hand, Spike sometimes uses fairly sophisticated language and is Twilight's little brother; on the other hand, sometimes he... well, is a little bit Rainbow Dashesque in wanting attention and recognition.

As far as Larson's Fluttershy goes, hard to say. He did Swarm of the Century (where all the characters gushed over how cute the things were... except for Pinkie Pie), Sonic Rainboom (yay), Cutie Mark Chronicles (You'd never guess, but when I was little I was very shy. And a very weak flyer.), The Return of Harmony (Oh, but I am weak and helpless!), Luna Eclipsed (Oh, and Nightmare Moon), Secret of My Excess (We'll be ever so grateful if you'd be so kind as to possibly consider...), Super Speedy Cider Squeezy 6000 (Oh, where are we? What's the rush?), It's About Time (Who's the cute widdle three-headed dog?), Ponyville Confidential (she has no lines at all, but we do get to hear the salacious rumors about her tail extensions), Magic Duel (Don't be scared, little friends. Twilight is wonderful with magic. [to Twilight] Anything happens to them, Twilight, so help me...), and Magical Mystery Cure (But... I don't really know anything about animals...).

So... yeah. Actually a lot of pretty classic Fluttershy moments, really. The real question is what those other clowns are doing. :trixieshiftright:

I cannot even begin to imagine the amount of time and effort you had to put into this, but bravo!

This was highly fascinating and informative. :)

The two different MLP dendrograms you present do not seem consistent. For example, when split by authors, Rarity and Fluttershy seem to cluster apart from the others while the CMC are interspersed throughout the tree, whereas when authorship is not taken into account, the CMC strongly cluster together, but Rarity and Fluttershy no longer do. In your three trials, how consistent was the clustering of related characters? (For example, the second figure shows Applejack and Twilight clustering together. Are they nearest neighbors in all trials?)

That second tree is quite interesting (if the overall branching is reproducible). It splits the mane six into two branches, each containing one pegasus, one unicorn and one earth pony. Whereas one might have guessed the branching might be influenced by the characters' femininity, that doesn't seem to be the case, because Rarity and Fluttershy are in separate branches, as are RD and AJ. What characteristic causes Fluttershy, Twilight and AJ to fall in one group, and Rarity, Pinkie and Dash in the other? My best guess is that stylo is partitioning the three more extroverted characters from the three more introverted characters.

Man, this is cool stuff, and I hope Bradel and anyone else with relevant expertise stop by and comment on your results.

When you say Twilight has 31,000 words, is that her usage of words in the top 100 or the total number of words she delivers in the show's run?

2648577 The cluster analysis, at each step, looks at a dataset in a high-dimensional space and probably divides that data in two with a plane. Slight changes to where that plane is can cause two clusters that are close together in the space to be separated early on in the tree. It's impossible to tell from the tree produced when this has happened. Applejack & Twilight didn't cluster together in the other 2 trees.

Alternately, stylo can project points onto a plane, showing distances, but the distances are misleading projections from n-space. Like this:


That's a picture made using all the words instead of sampling. That's more accurate for the characters with lots of lines, but Scoots's points get spread out because she has the fewest lines, probably because some of the common words are absent.

Does anyone else do this sort of thing, Bad Horse? Because I have a distinct impression you just invented the field of Computational Literary Criticism. And that is the coolest thing there is.

I happen to like Shakespeare, but I'm fairly certain he's been the victim of altogether too much hype and analysis. I mean, if you think about it, he's in a similar position as MLP in a way. Old Bill just wanted to write some decent plays and earn a bit of money, and then some of the cleverest people in the world over a period of four hundred years invested each line, each character, each subtlety with portentous significance. Faust, blessed be her holy hooves, just wanted to write a decent girls' show and shift some little horse figurines, and then a whole bunch of incredibly clever people (and me, I guess) descended onto what she made and spun a thousand tales from the most minor of features and turned the relatively simple characters into tortured and complex beings, or damn-near-religious symbols of hope and self-sacrifice, or a hundred other things.

Clearly, Hamlet should have gone around calling everyone "sugarcube."

Well, if any of the show's writers have to compose a ransom note, now they're screwed.

/ x x

That's a dactyl, I think. Possibly stretchable to an amphimacer? Sounds odd. Anyway, poor fit for iambic pentameter, surely? Though you could get away with it if he opens the verse with it. Trochaic inversion and all that.


"The prosecution calls Bad Horse and... must you wear that mask?"


"Oooh boy. Here we go again. And for this I left private practice?"


"Look, we have to rehearse this or the defense will have us for lunch. They'll already be on your case for being a bromide—"


"—same difference."

Wow. Thank you for taking the time and effort to do this, especially the exhaustive normalization process.

(Also, I have to love how the common words taken out of Shakespeare's work are seven pronouns... and France.)

Interestingly, Rainbow Dash is nearly as widespread.

2648595 Total number of words in the show, not counting the movies and shorts, which I didn't use.

2648772 Now, now. I eliminated sugarcube, spikey, wikey, and all that. They weren't in the top 100 words anyway.


Does anyone else do this sort of thing, Bad Horse? Because I have a distinct impression you just invented the field of Computational Literary Criticism. And that is the coolest thing there is.

Other people use this procedure, but as far as I know, only to ascertain authorship.

Faust, blessed be her holy hooves, just wanted to write a decent girls' show and shift some little horse figurines, and then a whole bunch of incredibly clever people (and me, I guess) descended onto what she made and spun a thousand tales from the most minor of features and turned the relatively simple characters into tortured and complex beings, or damn-near-religious symbols of hope and self-sacrifice, or a hundred other things.

What was that? Sorry, I was busy dripping candle wax on Fluttershy.

Look, what you two get up to in private as consenting adults is none of my affair.

Sounds like you now have the perfect pool of data to re-write a Shakespeare play with MLP characters. Hop to it! I want to see a perfect blend of The Taming of the Shrew and pony right away. Chop chop.

Yeah, I'm a big believer in this viewpoint: http://xkcd.com/915/ Shakespeare has just had more time for people to invest in it.


Sounds like you now have the perfect pool of data to re-write a Shakespeare play with MLP characters. Hop to it! I want to see a perfect blend of The Taming of the Shrew and pony right away. Chop chop.

I don't know how the data will help, but I really do want to do this. But not a rewriting of any existing Shakespeare play. I first thought of The Glass Blower, but it hasn't got enough action for a Shakespearian drama; there's just 2 characters at odds with each other, and they don't even stab each other.

Problem is that fimfiction doesn't allow scripts.


Does anyone else do this sort of thing, Bad Horse? Because I have a distinct impression you just invented the field of Computational Literary Criticism. And that is the coolest thing there is.

I now recall there are a few people in Hollywood who analyze scripts computationally, to try to predict which ones will be hits.

One of these days, one of these days I'll tell someone to write something I want and they'll respond with a "Right away!" instead of the reasonable and logical responses I actually get. Grumble grumble.

I ended up eliminating 8 words from Shakespeare (you, thou, your, thy, our, we, thee, france)

One of these... is not like the others...

Most intriguing! Do you happen to still have the "top 100" word lists available? I wonder, in pointing to a fanfic and saying "Rarity sounds off, here"... certainly the phrasing used is important, but potentially this sort of stylo data could be used to point out things like "Stop making her say 'spiffy'. She never says that."

2649466 Imagine a website where you could enter your story, and it would guess at which character was speaking each quoted line. Would it be a good thing, or a bad thing?

And which would make me want to build it more? :pinkiehappy:

The website could openly mock poorly written character dialogue, like a less impotent version of Gordon Ramsay, if such a thing tickles your evil bone(s). :moustache:

Or you could capture the uploaded text and publish it to Fimfic a day before the original author. Precognitive theft, most foul!

This is all great, but now the thing that will stick in my brain is that Discord speaks most like Rarity. :raritydespair:

I've heard tell that Shakespeare, as was the standard of the day, wrote each character with the actor in mind. That is, instead of writing a generalized script for any performance, he wrote a script with specific characters for a specific group of actors. (This is also one of the reasons the theory of Shakespeare as a literary puppet for another writer is unlikely at best; whoever wrote the plays had to have worked closely with the actors for a long time).

So any analysis of Shakespeare's characters is, in part, an analysis of the actors he worked with.

2652158 That's relevant if true, but I think this must mean someone said Shakespeare did that, or theorized that he did, not that someone analyzed his scripts and concluded that. We don't have any of Shakespeare's original scripts, so I doubt we know who played what part for his plays.

I was going to say something supporting "of course Spike being part Dash-esque and part Twilight-esque is a thing". But I think I have a better idea.

I object strongly to this remark:

incredibly clever people (and me, I guess)

Are you seriously daring to try to imply that you are not incredibly clever? Sir I know better.

EDIT: added this to the archive.

Not implying. Stating. Affirming. Declaring. Exclaiming, even.

There's a war over here, between my I Know Better Than To Waste Time Arguing Over This With You and my No You Are Not Allowed To Have The Last Word. So you know.

Dangnabit. I should have realized that could have backfired.
