
Bad Horse




The Great Shakespeare / My Little Pony Showdown: Part 1 · 7:49pm Dec 4th, 2014

In the comments on my last post, I said,

Shakespeare is all about the language, and the language of every character, excepting fools and yeomen, is that of William Shakespeare. It's beautiful language, and distinctive (in being William Shakespeare), but it makes all his non-comic-relief characters sound the same. He cannot distinguish characters by their speech.

Is that true?

I've shown how to use stylometrics (literally "the quantification of style") to compare the distances in "style" between different texts. The R package stylo does this by counting the frequency of different common words. It can't say whether something is bad or good, only how similar it is to something else. This turns out to be terrible at identifying style, but very good at identifying a text's author.
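The word-counting idea can be sketched roughly as follows. This is not stylo's code: it is a simplified stand-in that takes the relative frequencies of the most common words and compares them with a plain Manhattan distance, and the function names and toy samples are made up for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(texts, n):
    """Return the n most frequent words across all the texts combined."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def profile(text, vocabulary):
    """Relative frequency of each vocabulary word in one text."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on an empty text
    return [counts[w] / total for w in vocabulary]


def manhattan(p, q):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Toy samples (hypothetical): a1 and a2 share an "author", b1 does not.
samples = {
    "a1": "the cat and the dog and the bird",
    "a2": "the dog and the cat and the fish",
    "b1": "of course I would, of course I could",
}
vocab = top_words(samples.values(), 5)
profiles = {name: profile(text, vocab) for name, text in samples.items()}
```

With real texts, the vocabulary size (5 here) plays the role of the "number of most frequent words" setting.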

A long time ago, I took 3 stories from Chaucer's Canterbury Tales, and took the first 1000 words and the last 1000 words as separate texts. Each story is narrated by a different character. If stylometrics could take the resulting 6 samples as a group, and match up the beginnings and ends of the stories, that would mean Chaucer varied his word usage significantly between different characters. I ran it on an Apple ][+ in BASIC. It worked; the computer matched up all three beginnings and ends.

I can do something similar with stylo. I've got the complete plays of Shakespeare in a text file, and I can download the complete transcripts from all four seasons of My Little Pony from the MLP wikia. For both Shakespeare and MLP, I can parse out the lines of N different characters, split each character's lines in half, and run stylo on the resulting 2N files. If the characters' voices are distinct, a dendrogram should match up each file with its other half.
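A minimal sketch of the "split each character's lines in half" step, assuming a transcript where every spoken line looks like `Name: text`. That format, the `_1`/`_2` naming, and the `min_words` cutoff are assumptions for illustration; real transcripts would need extra handling for stage directions that contain a colon.

```python
from collections import defaultdict


def lines_by_character(transcript):
    """Group spoken lines by speaker, assuming 'Name: text' on each line."""
    spoken = defaultdict(list)
    for raw in transcript.splitlines():
        name, sep, text = raw.partition(":")
        if sep and name.strip() and text.strip():
            spoken[name.strip()].append(text.strip())
    return dict(spoken)


def split_in_half(lines):
    """Cut the lines, in order, where the two halves are closest in word count."""
    counts = [len(line.split()) for line in lines]
    total = sum(counts)
    cut = min(range(len(lines) + 1),
              key=lambda c: abs(2 * sum(counts[:c]) - total))
    return lines[:cut], lines[cut:]


def make_samples(transcript, min_words=0):
    """Return {'Name_1': text, 'Name_2': text} for characters with enough words."""
    samples = {}
    for name, lines in lines_by_character(transcript).items():
        if sum(len(line.split()) for line in lines) < min_words:
            continue
        first, second = split_in_half(lines)
        samples[name + "_1"] = " ".join(first)
        samples[name + "_2"] = " ".join(second)
    return samples


# Toy transcript (hypothetical) to show the shape of the output.
transcript = (
    "Twilight: Spike, take a letter.\n"
    "Spike: Ready!\n"
    "Twilight: Dear Princess Celestia.\n"
    "[stage direction]\n"
    "Twilight: I learned something today.\n"
    "Spike: Got it."
)
samples = make_samples(transcript)
```

Each value in `samples` would then be written to its own file, giving the 2N files to feed to the clustering.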

Would that be a meaningful test to run? Here are some of the problems:

- Characters can be identified by which characters they talk to. Hamlet talks to Horatio; Twilight talks to Spike.

- Characters in Shakespeare can be identified by the honorifics they use. Certain characters say "you", "sir", and "master", while others say "thee", "thou", and "dog".

- Many characters have words or catch-phrases they use over and over: "Okey-dokey-lokey", "eeyup", "I reckon", "ain't", "darling", "awesome".

- Shakespeare's plays take place in different countries, at different times. Certain words occur only in certain plays. One set says "signior" where another says "sir"; some say "emperor" where others say "king"; some say "Venice" where others say "England" or "Scotland".

- In Shakespeare's time, it was unacceptable for men to talk like women, or for women to talk like men. In MLP, it might be frowned on for them to talk differently.

- My tests on short-story write-offs showed that stylo works pretty well on stories over 4000 words, but not at all on stories of 1000 words. So we can look only at characters who have thousands of words of dialogue.

- Choosing how many words to use the frequency of may bias the test to favor Shakespeare or MLP. The more words you use, the more setting-specific, class-specific, gender-specific words creep in that distinguish characters in uninteresting, mechanical ways that almost anybody would get right. The stylo default is 100, but that fails to capture the uncommon words and constructions that Shakespeare uses repeatedly across many different characters, like "peculiar", "wondrous", "prick", "in faith", "by my troth", "gentle friend", and "as I am a".

- The word-frequency approach completely fails to notice Shakespeare's grammatical mannerisms, like using verb root forms when speaking in the interrogative ("If it smell so strongly"), word-order inversions, omission of the words "that" and "you/thou", and weird tenses ("did withhold" instead of "withheld"), some of which may be Elizabethan speech, but some of which weren't.

- I'd rather measure sentence length and meter, positive and negative emotions, frequency of part-of-speech (adverbs, adjectives, nouns), and other things that I'd call stylistic, but stylo doesn't do that.

I can address many of these problems by removing proper names, honorifics, and catch-phrases from the list of words used for the test. (Though I don't know what catch-phrases Shakespeare's characters use.) But the larger question is whether the test would be meaningful. I want to ask whether Shakespeare wrote good characters, by asking whether they are distinguishable by their speech.
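Removing those words could look something like this sketch. The exclusion lists are hypothetical and only echo examples from the list above; a single-word filter like this cannot catch multi-word catch-phrases such as "I reckon", which would need separate handling.

```python
def filter_vocabulary(vocabulary, excluded):
    """Drop excluded words (case-insensitively) while keeping the original order."""
    banned = {word.lower() for word in excluded}
    return [word for word in vocabulary if word.lower() not in banned]


# Hypothetical exclusion lists, drawn from the examples in the post.
PROPER_NAMES = {"hamlet", "horatio", "twilight", "spike", "venice", "england", "scotland"}
HONORIFICS = {"sir", "master", "signior", "thee", "thou"}
CATCH_WORDS = {"eeyup", "ain't", "darling", "awesome"}

vocabulary = ["the", "and", "thou", "sir", "of", "twilight", "darling", "to"]
kept = filter_vocabulary(vocabulary, PROPER_NAMES | HONORIFICS | CATCH_WORDS)
```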

It isn't obvious that good characters must be distinguishable by their speech. Samurai Jack and Big Mac hardly speak at all. And it isn't obvious what are "good" and "bad" ways of being distinctive. Rainbow Dash is very distinctive, but not what I'd call complex. You could have one character speak BBC English and another like he came from the Baltimore projects, but that wouldn't mean they were well-developed characters. It might even tend to make them stereotypes.

I think we at least should dispense with the words "good" and "bad", and settle for "distinct" and "indistinct". Audiences seem to like bold, distinctive, stereotyped characters a lot in action/adventure movies. Nuance can be bad for some stories or some characters. Think War and Peace versus Blade Runner versus Star Wars. They're different things. War and Peace has such complex characters that I can't really root for any of them in a contest between two of them; you know all of their virtues and faults too well. Blade Runner is still pretty nuanced, but simplified enough, and the government impersonal enough, that I can care how things turn out. Star Wars has good and evil, and I can whoop with joy when evil gets blowed up.

Pretend I ran stylo twice, once on a set of Shakespeare character parts, each split in two; and once on a similar set of MLP parts.

- Suppose the MLP parts don't match up, and Shakespeare's do. Would that mean Shakespeare wrote great characters, or that MLP characters change and develop over the course of 4 seasons?

- Suppose the MLP parts matched up together, and Shakespeare's did not. Would that surprise you? Would it mean that Shakespeare's characters all sound the same, or that MLP characters are all one-dimensional stereotypes?

- How should I split the character parts in two? Doing it in order would mean that one half has mostly Antony's lines from Julius Caesar, while the second has mostly his lines from Antony and Cleopatra. That would test whether Shakespeare was consistent with Antony across the plays. There are many characters re-used across plays that way.

One could argue that a character's speech patterns might change with character development. I find that idea dubious, though I guess it might in rare cases like Discord, who used a lot of angry, threatening words in his first episodes but few in later episodes. I'm more inclined to split it without randomizing, because I want to cluster all Shakespeare's soldiers together, and all his fools, and see if they speak with one voice, or many. That would be a good control to say whether Shakespeare's other characters are distinctive. Some of them should match up, just because each character talks about particular things more than other characters do; but to call it good characterization, Cassius part 1 should be more similar to Cassius part 2 than Soldier 1 in Macbeth is to Second Guard in Hamlet.

I'm going to claim that, if Shakespeare's soldiers, fools, citizens, religious characters, etc., cluster together as tightly as do the separate parts of particular characters, or if the number of characters whose parts match up is less than random, that shows Shakespeare does not write characters that are distinct at all, beyond being generic members of some social class.

What background ponies / walk-ons / guest stars should I cluster together as a control for MLP?

With respect to a comparison between Shakespeare and MLP, the fairest test would be to draw all the distinct Shakespeare characters (not soldiers, etc.) from a small number of settings (say, the English historicals only), to prevent Shakespeare characters from clustering by play; and to split MLP characters into separate files by each episode's credited writer, and split each of those files in two or more parts randomly. That will factor out the different writers, and character development over time.

I will claim that MLP has achieved character consistency if, e.g., Twilight_Larson clusters closer to Twilight_McCarthy than to Rarity_Larson. I will claim some kind of overall victory over Shakespeare if e.g., Twilights as a whole, across all writers, cluster together better than Shakespeare's characters do. But comparisons of samples less than 2000 words are highly questionable. I will declare a severe victory over Shakespeare if the episodes cluster by author, and yet the MLP characters still cluster better than Shakespeare's characters.

However, I'll do the non-random split first, because I want to use character classes as a control, and because I don't think character development will affect word choice very much in either Shakespeare or MLP.

To be fair, I'll have to, for each set (MLP and Shakespeare), take pairs of characters that tested as distinctive, contrast them using stylo, and see what contrasting words it uses to distinguish them. That will check whether cleanly-distinguished characters are distinguished for uninteresting reasons.

The criterion for measuring consistency of characters will be the average information gain per branch of the dendrogram, in order to be fair if comparing dendrograms with different numbers of characters. I'm not entirely sure that is fair, so alternately I may use an equal number of characters from MLP and Shakespeare, and just count the number of characters whose first and last parts ended up as matched leaves on the tree.
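The simpler "count the matched leaves" score could be approximated like this. The sketch does not read a dendrogram at all: it assumes a table of pairwise distances between samples (the nested-dict shape and the `_1`/`_2` suffixes are hypothetical) and counts characters whose two halves are each other's nearest neighbour. Under the usual linkage rules such a pair should end up as sibling leaves, but that is an assumption about the tree, not something checked against stylo's output.

```python
def nearest(name, distances):
    """Return the sample closest to `name` in a symmetric distance table."""
    others = [other for other in distances[name] if other != name]
    return min(others, key=lambda other: distances[name][other])


def count_matched(distances, suffixes=("_1", "_2")):
    """Count characters whose two halves are each other's nearest neighbour."""
    first, second = suffixes
    matched = 0
    for name in distances:
        if not name.endswith(first):
            continue
        partner = name[: -len(first)] + second
        if (
            partner in distances
            and nearest(name, distances) == partner
            and nearest(partner, distances) == name
        ):
            matched += 1
    return matched


# Made-up distances: Twilight's halves pair up, Rarity's do not.
distances = {
    "Twilight_1": {"Twilight_2": 0.2, "Rarity_1": 0.9, "Rarity_2": 0.5},
    "Twilight_2": {"Twilight_1": 0.2, "Rarity_1": 0.8, "Rarity_2": 0.4},
    "Rarity_1": {"Twilight_1": 0.9, "Twilight_2": 0.8, "Rarity_2": 0.6},
    "Rarity_2": {"Twilight_1": 0.5, "Twilight_2": 0.4, "Rarity_1": 0.6},
}
matched = count_matched(distances)
```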

If you don't want to accept any resulting verdict, state now why you disagree or how you would interpret the results.

Comments ( 24 )

Additional potential concern: MLP has many writers. And yes sure, they have lead writers who seek to keep everyone consistent, episode to episode. But some writers aren't as strong with certain characters, like M.A. Larson has a reputation for being not-so-great with Fluttershy. Or some of the M6 get slightly flanderized when the episode is not about them, like Pinkie's screeching at Fluttershy in Filli Vanilli.

I like the spirit of the idea, but yeah like you're saying, it would require a fair bit of calibration and control.

I mentioned this in a blog post on dialogue, but one thing I found looking at MLP transcripts is that the characters' written dialogue isn't as distinctive as you would think. I suggested that the show can get away with this more than fanfic because on the show we can actually hear their voices and speech patterns, whereas fanfic requires cues to help us imagine the voices so the dialogue needs to be more distinct. Something similar could be at work in Shakespeare as well.

If you're worried about word frequency analysis being too simplistic to distinguish between authors, I've seen some people do writing analysis by modeling text as a Markov chain. Such a model captures the 1st order pairings of words and might catch some of the unusual grammatical constructions you mentioned. I've seen some sites that train Markov models on various texts and use them to generate silly phrases (e.g. http://kingjamesprogramming.tumblr.com/ ), but it should be possible to train Markov models on each character's dialogue, then feed the models some test dialogue to figure out which Markov model gives the highest likelihood. Not sure if there is any published code to do this, however.
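For what it's worth, the train-then-score idea fits in a few lines. This is only a sketch under stated assumptions: a first-order (word-pair) model, whitespace tokenization, add-one smoothing, and toy training text invented for the example; the class and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


class BigramModel:
    """First-order Markov model over words with add-one smoothing."""

    def __init__(self, text, vocabulary):
        self.vocab_size = len(vocabulary) + 1  # one extra slot for unseen words
        self.follow = defaultdict(Counter)  # follow[prev][word] = count
        self.totals = Counter()  # totals[prev] = times prev had a successor
        words = ["<s>"] + tokenize(text)
        for prev, word in zip(words, words[1:]):
            self.follow[prev][word] += 1
            self.totals[prev] += 1

    def log_likelihood(self, text):
        """Sum of log P(word | previous word) over the text."""
        words = ["<s>"] + tokenize(text)
        score = 0.0
        for prev, word in zip(words, words[1:]):
            count = self.follow[prev][word] if prev in self.follow else 0
            score += math.log((count + 1) / (self.totals[prev] + self.vocab_size))
        return score


def best_match(models, test_text):
    """Name of the model that gives the test text the highest likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(test_text))


# Toy training data (made up); real use needs thousands of words per character.
training = {
    "Applejack": "i reckon we ought to get back to the farm i reckon so",
    "Rarity": "darling that is simply divine darling simply divine",
}
vocabulary = {w for text in training.values() for w in tokenize(text)}
models = {name: BigramModel(text, vocabulary) for name, text in training.items()}
```

Sharing one vocabulary across all models keeps the smoothing comparable, so no character is favoured just for having a smaller word list.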

You'd be measuring different things by each different choice of how to split things, and other things yet by comparing different splits to each other. Suppose you shuffled pony lines so that each half had equal numbers of lines from each season. Then suppose you compared that stylo run with one where they are split evenly in the middle of the timeline. Those two runs together might say a lot.
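The two splits described here could be produced along these lines. A rough sketch only: it assumes each character's lines are already grouped by season in a dict (a hypothetical layout), and it splits by line count rather than word count.

```python
import random


def chronological_split(lines):
    """First half of the timeline versus second half, by line count."""
    middle = len(lines) // 2
    return lines[:middle], lines[middle:]


def season_balanced_split(lines_by_season, seed=0):
    """Shuffle within each season, then deal each season's lines evenly to both halves."""
    rng = random.Random(seed)
    first, second = [], []
    for season in sorted(lines_by_season):
        shuffled = list(lines_by_season[season])
        rng.shuffle(shuffled)
        middle = len(shuffled) // 2
        first.extend(shuffled[:middle])
        second.extend(shuffled[middle:])
    return first, second


# Toy data: letters stand in for lines of dialogue.
lines_by_season = {1: ["a", "b", "c", "d"], 2: ["e", "f"]}
balanced_first, balanced_second = season_balanced_split(lines_by_season)
early, late = chronological_split(["a", "b", "c", "d", "e", "f"])
```

Running the clustering once on each kind of split and comparing the results is the comparison suggested above.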

To answer the question about Shakespeare--whether his characters all sound the same--I think you're right, for the right reasons, that stylo is going to be a crude measure. That being said, I think the data would still be interesting, even if it isn't conclusive. With stylo as your only tool, the more different splits you can think of to run, the more you'll be able to say anything meaningful.

2633743

My instinct when writing Applejack is pretty unintelligible. I start trying to write down all of the accent. That's a lot of apostrophes.

2633766
I'm suddenly inspired to write an adventure fic, where AJ joins forces with a dwarf, a Scotsman, an urchin with a Cockney accent, and a Brian Jacques mole. :trollestia:

2633766
Yeah, don't do that. :applejackconfused:

2633778
...Okay, you can do that.

2633749 I'd love to do that, but I'm probably not gonna write the code. stylo does everything, including the graphs with labels, out of the box.

I want to ask whether Shakespeare wrote good characters, by asking whether they are distinguishable by their speech.

Well, that's a bit tricky to do with a script. I'm sure that the MLP writers don't write "ah" for "I" when doing Applejack's lines, but the accent the actor uses makes that distinction. Good scripts (as opposed to good books) typically leave a lot of "room" for actors to bring their own interpretation and style to the roles.

In Shakespeare's case, it's true his noble characters would all use "poetic" speech patterns, particularly in soliloquies, but he was not only writing with an eye toward giving space for the actors to work their own style into the characters, he was also writing for distinct individuals like Richard Burbage, who he knew personally and very well. So, he might have made characters distinct by writing them to the quirks of the actors he knew were going to play the parts. A word-frequency analysis might be illuminating after all.

You might want to do a search before you go to the trouble of doing it yourself, though. Shakespeare is probably the most studied and analyzed writer on the planet, so the data might already exist somewhere.

2633778

I'm suddenly inspired to write an adventure fic, where AJ joins forces with a dwarf, a Scotsman, an urchin with a Cockney accent, and a Brian Jacques mole. 

Sounds like the Belgian trippel version of this:

lh6.googleusercontent.com/-mPfIExh3_0k/VHVA7J56amI/AAAAAAAADFs/U0kX1-0olOE/w298-h530-no/20141125_205430.jpg lh3.googleusercontent.com/-4-2TmGrUTQs/VHVBCg6_kGI/AAAAAAAADF4/KGTXu6HFhAY/w298-h530-no/20141125_205441.jpg

That's amazing. You still have an Apple II+? (checks again) Oh, a long time ago. (snerk) Translation for those of you under 30: Back in the 80s there was a company named Apple that bears little resemblance to the company of today. It made computers that did not take a gig of storage in order to just add two numbers, but I'm digressing.
(actually finishes reading)
I'd say the problem with quantifying dialogue is harder than it looks even after a second look. Each character has different moods that they are expressing, from serious to humorous. Take AJ for example. Her 'serious' lines are all active voice, and humorous ones tend to passive voice as questions: "Will somepony tell me what the hay is goin' on around here?" She's used as a 'Straight pony' for many of the jokes, so you also have to look at the followup lines and situation.

2633778
I'd like to note that my upthumb here is merely an endorsement of the concept, and should not be taken as approval to actually do such a thing.

I wish I looked at the world like you did.

Actually, after recalling your body of work, maybe I don't.

2633743 2633882 2633933 If you point out problems, I'd like you to also say how you're going to interpret the results in light of those problems. If you're going to use something as an excuse if the X (X = MLP or Shakespeare) parts don't match up, will you also use it as an excuse if the Y parts don't match up? Precommit to using or not using it as an excuse, whatever the outcome.

I think none of the above considerations should be a factor at all in a comparison, because they should affect MLP and Shakespeare equally.

2633705 So, if I split Twilight into Twilight_Larson and Twilight_McCarthy (I added that to the post, BTW), then you won't complain about that if MLP loses?

2633967
I'm saying that my prediction is that neither Shakespeare nor MLP will show a significant difference between most characters, which doesn't actually have anything much to do with the characters in a script.

2633967 But it's so much more fun to throw rocks than to put glass back in the windows... (probably one reason why society has so many problems)

Fun Fact! Word-frequency analysis was actually a Big Thing in Shakespeare a few years ago. If you haven't already, look up Donald Foster (Professor at Vassar) and his SHAXICON project. SHAXICON is a database he created specifically to analyze the rare words and their usage in Shakespeare (and other contemporary authors).

Some of the conclusions he tried to find with this software: What order the plays were written in (the idea being rare words would show patterns of usage over time) -- this came out pretty close to the "traditional" order, with a few surprises; What roles Shakespeare himself might have played (the idea being that rare words used by characters he played would be better remembered, and thus used more often in later plays) -- again, this comes out pretty close to the traditional accepted roles, including roles in two Ben Jonson plays his company put on; Authorship -- which parts of co-written plays were Shakespeare, whether or not contested works were his, whether or not he wrote certain anonymous works -- this one caused some controversy due to a false positive. SHAXICON analysis identified Shakespeare as the author of "A Funeral Elegy by W.S.", when all other stylometry pointed away from Shakespeare. The reason ended up being too small a sample text for John Ford, the most likely candidate, and Don Foster had to "recant" his statement on the matter a few years later.

Still and all, word-frequency is accepted (if somewhat controversial) for analyzing Shakespeare. Don't know if anyone has done anything to try to distinguish characters like you want to, though.

2633967 2633982 Actually, since the comparison will be between scripts, instead of between scripts and narrative fictions, I don't think there will be a significant mis-match problem, other than the age-level of the target audience. What effect that will have on the outcome, I have no idea, and I have no intent to pre-judge.

I think an automated, reductionist process to see which is "best" at delimiting characters based solely on dialog is kind of silly, but we are talking about petite pastel ponies here, after all. Still, I'm looking forward with interest to whatever data comes out of the experiment.

If you want a prediction, I would say the program will come up with the MLP characters as more differentiated in general, particularly if you chose Shakespeare's noble characters as the sample from his work. (But if you compare Celestia and Canterlot background ponies to Dogberry and Feste, the outcome would probably be the opposite.)

I think the reason for this is that characters for children's entertainment tend to be simplistic and archetypal, and thus are more sharply drawn and often have particular idiosyncratic "tics." E.g., pick any line of dialog for Pinky from Pinky and the Brain and it will be immediately recognizable because it will invariably end with "narf!"
...
Come to think on it, Fluellen from Henry V says "look you" an awful lot... hmn.

Also... (sorry for the stream of consciousness here) ...I'd bet you'd be able to pick out the mentor figure in any animated feature of the last half century if you did a simple search for the phrase "believe in yourself." It makes them identifiable, but not necessarily well-made characters.

2634136

I think the reason for this is that characters for children's entertainment tend to be simplistic and archetypal, and thus are more sharply drawn and often have particular idiosyncratic "tics."

That is the basic problem. But if Shakespeare's characters come up as more differentiated, will you agree to conclude that Shakespeare's characters are more simplistic and archetypal?

How about checking MLP characters of the same archetype, like Rainbow Dash vs. Scootaloo? What would you expect from MLP characters that aren't well-defined, like (I'd say) Sweetie Belle? If Twilight is archetypal, what's her archetype?

I can try to get rid of the tics, as explained above. Archetypal is harder to get around. What characters of Shakespeare would you say are archetypal? Puck, Cordelia, Lear, Goneril, Malvolio? One of the many Antonios? Pick some, & we can look at them. Puck and Cordelia don't have enough lines, though.

(What is it with Shakespeare and the name Antonio? He has 5 of them.)

A search of all MLP scripts says:

MLP/Hurricane_Fluttershy.htm: Fluttershy: ...yes, I <i>did</i> tell you to never give up... and to believe in yourself. You're right, my friends. I shouldn't give up. I <i>will</i> get my confidence up and show everypony that I <i>am</i> a good flyer! A <i>great</i> flyer!

MLP/Hurricane_Fluttershy.htm: Fluttershy: Sometimes you can feel like what you have to offer is too little to make a difference, but today, I learned that everypony's contribution is important, no matter how small. If you just keep your head high, do your best, and believe in yourself, anything can happen.

:yay:

2634228

But if Shakespeare's characters come up as more differentiated, will you agree to conclude that Shakespeare's characters are more simplistic and archetypal?

Actually, in the case of some of the low-comedy characters like Dogberry, yeah, now that I'm going over them in my mind, none of them are exactly nuanced! It might be really interesting to divide the Shakespeare characters in to low, middle, and high classes. I bet the buffoons are most like cartoon characters, and the nobles are less distinguishable by dialog. The "middle" characters (fools and wise-servant types) I'm not so sure of.

Wasn't that first scene with Fluttershy just before she failed terribly? :rainbowlaugh:

I require sleep, but I'll answer this before reading any more of your posts. Hopefully your results will be after the pagebreak of the corresponding blog post.

2633749

The language implemented by our evaluator will be a computational object with time-varying state, let us model the situation of withdrawing money from a bank account. We will do this unto thee, prepare to meet thy God

I actually started doing this but stopped to work on something else, and I can't remember if I finished it. The algorithm isn't that complicated, but testing and debugging is a pain in the butt.
Messy code be ahead. http://pastebin.com/GhuW3mwD

2633749 stylo has an option to use n-grams, but it's meant to do this with characters. I set it to use word 2-grams, which is probably the same? as using a first-order Markov model. (I think 2-grams are called first-order Markov models in the lit; that always confuses me.)

But, it maintains a max of 5000 items in its count. 5000 is just barely enough word 2-grams to be useful with small samples (around 2000 words each), due to sparse data (most 2-grams don't appear in most texts); it ends up being weaker than 1-grams at 300 words used.
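To make the sparse-data point concrete, here is a small sketch of counting word 2-grams under a cap. It is not stylo's implementation: keeping the most frequent pairs is an assumption about how the 5000-item limit behaves, and the `coverage` helper is a hypothetical way to see how few of the kept pairs show up in any one short sample.

```python
from collections import Counter


def word_bigrams(text):
    """Return adjacent word pairs from a lowercased, whitespace-split text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def capped_bigram_features(texts, limit=5000):
    """Keep only the `limit` most frequent word 2-grams across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(word_bigrams(text))
    return [pair for pair, _ in counts.most_common(limit)]


def coverage(text, features):
    """Fraction of the kept 2-grams that appear at least once in this text."""
    if not features:
        return 0.0
    present = set(word_bigrams(text))
    return sum(pair in present for pair in features) / len(features)


# Toy texts; the limit is tiny here only to keep the example readable.
texts = ["to be or not to be", "not to be trifled with"]
features = capped_bigram_features(texts, limit=3)
share = coverage("to be", features)
```

On real samples of around 2000 words, a low `coverage` value for most files would be one sign of the sparseness described above.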

2635535
Yeah, analyzing 2-gram frequencies should give similar results to using a Markov model. It would be interesting to compare the dendrograms from 1- and 2-gram frequency analyses to see if they give similar results (my guess would be that 2-grams would catch grammatical quirks and may be less sensitive to character names, but it may be more sensitive to catch phrases).

I think pony will have more or less consistent characters (across all four seasons) and will have a greater separation between characters. I honestly haven't read much Shakespeare, but my memory and intuition tell me that he didn't think or care to use distinctive vocabulary across characters. I think MLP writers do take vocabulary into account when writing character speech. Shakespeare's characters do have distinctive personalities, but I think those personalities are too heavily dependent on role and setting, and you're subtracting those out.

I think that'll show for any test you run (so long as you don't intentionally sabotage it). I really do think pony writers took more time and did a better job with pony characters than Shakespeare did with his, but I don't think that implies that pony writers are better in any way than Shakespeare. As you possibly hinted here, Shakespeare writes with a goal in mind, and his characters reflect that goal. He's not trying to build a world, he's just trying to make people think about certain things, and unique, realistic, and diverse characters are just unnecessary baggage.

MLP is written around the characters. Sure there's a moral at the end of some episodes, but no single moral is the focal point of the series. I don't think the writers thought about "which personalities would be sufficient to explain friendship", and yet we're given six (plus or minus one) personalities that are fairly consistent across the series. It stands to reason that personalities are given higher priorities than the morals themselves, and that the personalities are partly the point of the series. This logic is partially circular.

And now we get to find out that Shakespeare's characters separate better than MLP's, and that I'm completely wrong.
