Statistics #5.3: Revised Lloyd’s Register of Shipping · 11:00am Mar 25th, 2019

Continuing from Lloyd’s Register of Shipping, now with fewer caveats!

Last time, I mentioned my suspicion that, because Derpibooru conflates human and pony versions of characters, certain images were being counted twice.

This suspicion was correct. Unfortunately, it is impossible to fix this problem by separating pony and human versions of a character without a lot of manual work: An image tagged human, pony, twilight sparkle might end up depicting an Anon riding pony Twilight just as easily as it can depict the Pedestrian Twilight or a Sci-Twi riding a pony or Sci-Twi riding Princess Twilight.

It is, however, possible to deduplicate the individual images going into each pairing, which is what I did this time. I also made sure to save the raw data received from Derpibooru – not included, because there’s something like 250 megabytes’ worth of metadata in that dump – so should any new ideas turn up, I should be able to implement at least some of them without wasting hours re-fetching the entire bundle. Code in the GitHub repository has been updated accordingly.
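A minimal sketch of what per-pairing deduplication can look like. This is not the code from the repository: the record layout (the keys "id" and "characters") is a hypothetical stand-in for whatever the Derpibooru dump actually contains.

```python
from collections import defaultdict
from itertools import combinations

def images_per_pairing(images):
    """Map each unordered character pair to the set of image ids showing both.

    `images` is an iterable of dicts with hypothetical keys "id" and
    "characters"; the real dump's layout may differ.
    """
    pairing_images = defaultdict(set)
    for image in images:
        # Sort and deduplicate names so (A, B) and (B, A) share one key.
        names = sorted(set(image["characters"]))
        for pair in combinations(names, 2):
            # A set ignores an id it has already seen, so an image returned by
            # both the human and the pony query is counted only once.
            pairing_images[pair].add(image["id"])
    return dict(pairing_images)

# Image 1 appears twice, as if returned by two overlapping queries.
sample = [
    {"id": 1, "characters": ["twilight sparkle", "sunset shimmer"]},
    {"id": 1, "characters": ["sunset shimmer", "twilight sparkle"]},
    {"id": 2, "characters": ["rarity", "spike"]},
]
pairs = images_per_pairing(sample)
print(len(pairs[("sunset shimmer", "twilight sparkle")]))
```

Keying on the image id rather than on the query that returned it is what makes the double counting disappear, whichever version of the character the image actually depicts.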

I won’t spam you with the entire list this time; you can still examine the full table in the repository. Download it and play with it! I’ll just point out the changes in the top ten:

387.9874  - Applejack / Rarity
375.5901  - Rarity / Spike
305.0550  - Sunset Shimmer / Twilight Sparkle
299.5550  - Rainbow Dash / Twilight Sparkle
275.9676  - Starlight Glimmer / Trixie
259.5838  - Applejack / Rainbow Dash
246.9860  - Fluttershy / Rainbow Dash
237.9829  - Princess Celestia / Princess Luna
226.2347  - Shining Armor / Twilight Sparkle
225.3981  - Princess Celestia / Twilight Sparkle

Compare to the previous top ten, which had duplicated images in cases of human/pony conflation:

454.3589  - Sunset Shimmer / Twilight Sparkle
388.1065  - Applejack / Rarity
375.6674  - Rarity / Spike
301.1101  - Rainbow Dash / Twilight Sparkle
276.0932  - Starlight Glimmer / Trixie
259.6075  - Applejack / Rainbow Dash
247.0561  - Fluttershy / Rainbow Dash
237.9406  - Princess Celestia / Princess Luna
226.1238  - Shining Armor / Twilight Sparkle
225.6740  - Octavia Melody / Vinyl Scratch

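As you can see, Sunset Shimmer / Twilight Sparkle was badly inflated: nearly a third of its old score (454.36 against a deduplicated 305.06) came from duplicates. Octavia Melody / Vinyl Scratch was, mysteriously, also affected, though only slightly: it went from 225.67 to 224.76, just enough to slip out of the top ten behind Princess Celestia / Twilight Sparkle. The general distribution, however, did not change all that much: Characters which are not typically thought of as their Pedestrian versions were not affected by this issue.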
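The size of the correction can be checked directly from the scores listed in this post. The sketch below only does arithmetic on those published figures; the two helper names are made up for illustration.

```python
def duplicate_share(old_score, new_score):
    """Fraction of the old score that disappeared after deduplication."""
    return (old_score - new_score) / old_score

def inflation(old_score, new_score):
    """How much larger the old score was, relative to the deduplicated one."""
    return (old_score - new_score) / new_score

# Scores copied from the lists in this post: (before, after).
scores = {
    "Sunset Shimmer / Twilight Sparkle": (454.3589, 305.0550),
    "Octavia Melody / Vinyl Scratch": (225.6740, 224.7584),
    "Applejack / Rarity": (388.1065, 387.9874),
}
for ship, (old, new) in scores.items():
    print(f"{ship}: {duplicate_share(old, new):.1%} of the old score, "
          f"{inflation(old, new):.1%} inflation")
```

The two ratios answer slightly different questions: the first says how much of the old number was duplicates, the second how far the old number overshot the corrected one.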

The Friendliest Fleet makeup, obviously, changed, but this time it’s barely noticeable:

Highest scoring fleet:

375.5901  - Rarity / Spike
305.0550  - Sunset Shimmer / Twilight Sparkle
275.9676  - Starlight Glimmer / Trixie
259.5838  - Applejack / Rainbow Dash
237.9829  - Princess Celestia / Princess Luna
224.7584  - Octavia Melody / Vinyl Scratch
197.6489  - Princess Cadance / Shining Armor
138.1671  - Gallus / Sandbar
126.1228  - Bright Mac / Pear Butter
108.8116  - Big Macintosh / Sugar Belle
108.1050  - Bow Hothoof / Windy Whistles
93.0148   - Discord / Fluttershy
85.2484   - Stellar Flare / Sunburst
56.0887   - Aria Blaze / Sonata Dusk
45.6220   - Maud Pie / Mudbriar
42.8643   - Cheese Sandwich / Pinkie Pie
38.3921   - Scootaloo / Sweetie Belle
37.9029   - Sky Stinger / Vapor Trail
34.2547   - Princess Ember / Thorax
32.5613   - Diamond Tiara / Silver Spoon
30.0474   - Aloe / Lotus Blossom
27.9535   - Soarin' / Spitfire
25.2469   - Night Light / Twilight Velvet
22.2437   - Apple Bloom / Tender Taps
21.9648   - Cloudchaser / Flitter
21.9571   - Nightmare Moon / Queen Chrysalis
21.5141   - Carrot Cake / Cup Cake
20.8235   - Glitter Drops / Tempest Shadow
14.4490   - Double Diamond / Party Favor
13.8021   - Carrot Top / Derpy Hooves
13.2852   - Ahuizotl / Daring Do
12.9575   - Cloudy Quartz / Igneous Rock
8.5284    - Button Mash / Rumble
8.0849    - Gilda / Greta
7.4591    - Pound Cake / Princess Flurry Heart
5.8062    - King Sombra / Radiant Hope
4.8795    - Indigo Zap / Lemon Zest
4.2114    - Fizzle / Garble
3.6199    - Babs Seed / Twist
3.2488    - Braeburn / Little Strongheart
2.8002    - Blossomforth / Thunderlane
2.7969    - Flash Sentry / Timber Spruce
2.4324    - Dinky Hooves / Pipsqueak
2.1503    - Beauty Brass / Fiddlesticks
1.8868    - Jet Set / Upper Crust

Total fleet score: 3127.8919601638518

This is almost identical to the previous list, apart from the individual scores, except that Gilda suddenly breaks into the list with Greta. While duplicated images mattered a great deal for deciding which ship comes out on top, the optimizing logic that assembles the friendliest fleet filtered most of the difference out.

Along the way, Scottbert suggested counting individual artists, rather than upvotes, as the primary ranking value. Here’s what happens to the top ten if I rank ships by artists per day:

0.8362    - Applejack / Rainbow Dash
0.7954    - Rarity / Spike
0.7274    - Starlight Glimmer / Trixie
0.7234    - Applejack / Rarity
0.7116    - Fluttershy / Rainbow Dash
0.6749    - Rainbow Dash / Soarin'
0.6387    - Tempest Shadow / Twilight Sparkle
0.6122    - Trixie / Twilight Sparkle
0.6093    - Rainbow Dash / Twilight Sparkle
0.5475    - Sunset Shimmer / Twilight Sparkle

While the ranking is still roughly similar, some notable differences turn up – Sunset / Twilight is suddenly a lot less popular (why?!), while Rainbow / Soarin' and Tempest / Twilight appear in the list out of the blue. It seems that the ranking of ships among content creators and among their audience could be significantly different, and I can’t, for the moment, imagine why that would be. Could it simply be a statistical artifact resulting from the fact that there are so many fewer artists than upvoters, so each single artist’s personal taste makes a bigger difference? Maybe.
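For reference, one way an artists-per-day figure could be computed, as a rough sketch. It assumes that artists are identified by tags starting with "artist:", that the count is divided by the number of days since the ship's first image, and that each image record has the hypothetical keys "tags" and "uploaded"; the actual code in the repository may do any of this differently.

```python
from datetime import date

def artists_per_day(images, today):
    """Distinct artists who drew a ship, divided by the ship's age in days.

    Each image is a dict with hypothetical keys "tags" (list of strings) and
    "uploaded" (datetime.date). Artist tags are assumed to start with "artist:".
    """
    artists = set()
    first_upload = None
    for image in images:
        artists.update(t for t in image["tags"] if t.startswith("artist:"))
        if first_upload is None or image["uploaded"] < first_upload:
            first_upload = image["uploaded"]
    if first_upload is None:
        return 0.0
    # Clamp to one day so a ship first uploaded today does not divide by zero.
    age_days = max((today - first_upload).days, 1)
    return len(artists) / age_days

# Made-up sample: two distinct artists over a ten-day-old ship.
ship_images = [
    {"tags": ["artist:a", "gallus", "sandbar"], "uploaded": date(2019, 3, 15)},
    {"tags": ["artist:b", "gallus", "sandbar"], "uploaded": date(2019, 3, 20)},
    {"tags": ["artist:a", "gallus", "sandbar"], "uploaded": date(2019, 3, 24)},
]
print(artists_per_day(ship_images, today=date(2019, 3, 25)))
```

Because an artist is counted once no matter how many images they upload, a single prolific fan of a ship moves this number far less than they move the upvote total, while the small overall pool of artists makes every individual inclusion weigh more.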

Some analysis

What remains most fascinating is that the friendliest fleet optimizer, no matter how flawed the initial data is, consistently selects in favor of canon-inspired and canon-declared pairings, even though just about every character involved in those pairings has higher-scoring pure fandom ships. I certainly didn’t code it to do so: this trend emerges from the data itself, even through numerous imperfections, and is only occasionally thrown off, and then only by the most popular fandom ships.

The underlying reason seems to be that the more popular a given character is, the more ships they get, and the advantage of the canon-inspired one, while minuscule, gives it an edge when the optimizer needs to balance it against other characters with similar properties. I suspect that in effect, it’s the same paradox that you see occasionally with people who are popularly beautiful: Nobody thinks they’re in their league, so nobody tries to pursue them romantically, and the most popularly beautiful people end up alone.
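A toy illustration of this effect, under the assumption that the fleet behaves like a maximum-weight matching in which every character appears in at most one ship. The brute-force search below is a sketch for a handful of characters, not the optimizer from the repository, and the characters A to D and their scores are invented.

```python
from functools import lru_cache

def best_fleet(scores):
    """Return (total, pairs) for the highest-scoring set of disjoint pairs.

    `scores` maps frozenset({a, b}) to a ship score. Exhaustive search:
    fine for a toy example, far too slow for hundreds of characters.
    """
    characters = tuple(sorted({name for pair in scores for name in pair}))

    @lru_cache(maxsize=None)
    def solve(remaining):
        if len(remaining) < 2:
            return 0.0, ()
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave `first` unpaired.
        best = solve(rest)
        # Option 2: pair `first` with each partner it has a score with.
        for i, partner in enumerate(rest):
            score = scores.get(frozenset((first, partner)))
            if score is None:
                continue
            sub_total, sub_pairs = solve(rest[:i] + rest[i + 1:])
            total = score + sub_total
            if total > best[0]:
                best = (total, ((first, partner),) + sub_pairs)
        return best

    return solve(characters)

# Invented scores: A and B are popular and share a strong fandom ship,
# but each also has a weaker canon ship with a less popular partner.
toy_scores = {
    frozenset(("A", "B")): 10.0,  # fandom ship
    frozenset(("A", "C")): 6.0,   # canon ship
    frozenset(("B", "D")): 6.0,   # canon ship
}
total, fleet = best_fleet(toy_scores)
print(total, fleet)
```

Here A / B is the best ship for both A and B, yet choosing it strands C and D, so the fleet as a whole scores higher with the two weaker pairings.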

It seems that the optimizer effectively models everyone trying to seek out their highest possible ranking pair simultaneously, and settling for lesser-scoring pairs, with enough iterations that everyone gets the highest possible relationship score that they can get – and that optimum is the one thing that most of the fans can agree on, because canon is the one thing they actually have in common.

Further avenues of research

It may still be feasible to somehow solidly distinguish between human-only and pony-only pairings, which would be insightful. The proliferation of anthro art, humanized-but-not-Equestria-Girls art, and other noise muddles the issue a lot. I could try splitting each pairing into separate sets of Pedestrian/Equestrian images based on the presence of the Equestria Girls tag, but how would those which involve both humans and Equestrians work, when I can’t programmatically tell which of the two partners the tag “human” refers to? The search continues; I’m open to suggestions.

As Scottbert pointed out, upvotes-per-day might not be a particularly fair metric of ships themselves: Pictures are routinely liked based on the art style, regardless of what particular ship they represent. Unfortunately, ranking how popular ships are among the artists is also a problem, because there are so many fewer artists than consumers of their artwork. There’s no doubt these are interdependent variables, but they’re still distinct. Can a more effective metric be derived from the data we have?

Oliver · 479 views · #mathematics #shipping #graph theory #statomancy #logistics

Comments ( 2 )

Catalysts Cradle #1 · Mar 25th, 2019

It seems like upvotes per day might favor relatively new ships over older ships rather than normalizing the popularity of ships over time. In your dataset, what does the relationship between posting date and upvotes look like? I'd suspect it's a peaked distribution reflecting the rise and fall of the popularity of the Fandom.

Oliver #2 · Mar 25th, 2019

In reply to Catalysts Cradle (#1):

“It seems like upvotes per day might favor relatively new ships over older ships rather than normalizing the popularity of ships over time.”

This does happen, and would be a concern if ships were spaced evenly across time, but they aren’t. Most ships are far from new. The newest detected ship had its first picture uploaded in April 2018, almost a year ago. The only ship that does ride up into the fleet on that is Gallus / Sandbar, which has existed since 2018-03-25 and has 268 images and 50431 upvotes. Considering that Gallus’ next major ship, Gallus / Silverstream, is only three months older, and yet has fewer images, I’d say this reflects reality.

“In your dataset, what does the relationship between posting date and upvotes look like?”

Which particular relationship do you mean? Considering that dates for individual upvotes are not available, I can’t find peak upvotes for every ship, and charting totals against earliest-image date shows the obvious “it’s older so it has more.”

There are individual images, which do have their individual upload dates and individual upvote/downvote numbers, but that’s not quite the same thing.
