
Glitter Hamburger



Some statistics on fimfiction story titles · 8:28pm Apr 26th, 2016

Ever since I read Bad Horse's blog posts on some fimfiction.net statistics I wanted to do some analysis myself. So when I noticed that I would reach five million read words here very soon, I thought it was the perfect occasion to finally do it and write a first blog post.

Let's look at some statistics about story titles today.

The data used here is from 12 January 2016. The highest story ID at the time was 310752. Of those IDs, 168,840 belonged to stories that existed and were accessible without a password, and 78,788 of these were published. The published stories are the only ones I'm looking at here.


Word frequency in titles
Without further ado, the 30 most frequent words in the titles of stories published on fimfiction.net:

When comparing with the list given by Wikipedia, we can see that "that" is completely missing, despite being the 8th most common word in English. "Be" and "have" are less frequent here because I did not merge different word forms (that is, I counted "am", "is", "are" etc. as different words, whereas the list on Wikipedia counts all of them together). "My", on the other hand, is very common over here, but Wikipedia ranks it only 34th, and other lists rank it even lower. At first I was quite surprised, until I remembered the 'M' of MLP. Heh.
Removing the expected frequent words from the list, we are left with "equestria", "little", "twilight", "pony", "love", "new", "night", "life", "rainbow", "day", "friendship", "magic", "story". Not that astonishing at first sight, but then I didn't really expect that many stories to contain "my little pony" or "new", "life", "night" and "day" in their titles.
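For illustration, here is a minimal sketch of how such a word count could be done. It is not the script used for the charts; the tokenization rule (lowercase, then runs of letters, digits and apostrophes) and the sample titles are assumptions. Like the counts above, it does not merge word forms, so "is" and "are" stay separate.

```python
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into words.

    Assumption: a word is a run of letters, digits or apostrophes.
    """
    return re.findall(r"[a-z0-9']+", title.lower())


def word_frequencies(titles):
    """Count every occurrence of every word form across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(tokenize(title))
    return counts


# Hypothetical sample titles, only for demonstration
sample_titles = ["My Little Dashie", "A New Life", "My New Life in Equestria"]
frequencies = word_frequencies(sample_titles)
print(frequencies.most_common(3))
```

`Counter.most_common(30)` would then give a top-30 list like the one discussed above.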




I also ranked words by the cumulative likes received by the titles containing them, but the differences from the ranking by frequency are small and it's hard to tell how significant they are. Looking at the disliked words, "human" in 30th place is interesting, because it's less than half as frequent as "story" and still got more dislikes. But it also got more likes (by an even greater ratio), so this is just a result of likes and dislikes both being highly correlated with views: more readers means more opportunities for likes and dislikes.
I don't think the lists are all that informative. They show that Twilight Sparkle is best favorite pony, followed by Rainbow Dash. Who knew.
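A ranking by cumulative likes or dislikes could be computed roughly like this. This is a sketch under assumptions: the record keys `title`, `likes` and `dislikes` are hypothetical, the sample stories are made up, and each word is credited once per title even if it occurs twice in it.

```python
import re
from collections import defaultdict


def totals_per_word(stories, field):
    """Sum a numeric field (e.g. "likes") over all titles containing each word."""
    totals = defaultdict(int)
    for story in stories:
        # set(): a word repeated inside one title is only credited once
        words = set(re.findall(r"[a-z0-9']+", story["title"].lower()))
        for word in words:
            totals[word] += story[field]
    return dict(totals)


# Hypothetical records, only for demonstration
sample_stories = [
    {"title": "Human in Equestria", "likes": 10, "dislikes": 4},
    {"title": "A Pony Story", "likes": 5, "dislikes": 1},
    {"title": "Story Story", "likes": 2, "dislikes": 0},
]
likes = totals_per_word(sample_stories, "likes")
dislikes = totals_per_word(sample_stories, "dislikes")
ranked_by_likes = sorted(likes.items(), key=lambda item: item[1], reverse=True)
```

Because both totals grow with the number of readers, comparing a word's likes and dislikes side by side says more than either ranking alone.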




The lists for word pairs are actually a bit more interesting. They show that there are lots of Fallout Equestria and Conversion Bureau fanfics. And they are ranked much higher in the frequency list than in the likes list. Does that mean readers don't like reading them as much as authors like writing them? Not necessarily. They have a similar position in the views list as in the frequency list, and many of them are quite old. Many of the old fics seem to have a much higher views-to-likes ratio than newer fics. Maybe this is because other sites such as EQD, which brought in readers without an account, were much more important in those days.


It's interesting how many variations of "the story of" are in the list of the most frequent groups of three words.
The numbers are getting low enough for single fics to appear in the list, e.g. Sunny Skies All Day Long contributes 8,306 of the 8,598 likes that "all day long" has. Therefore I won't comment on the lists for groups of 4 and 5 words here, but if you're interested, I uploaded all charts here: http://imgbox.com/g/XNxrJoZgHN
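Word pairs and groups of three are just n-grams over the title words. The sketch below shows one way to count them; the tokenization and the sample titles are assumptions. The last line works out how much of the "all day long" likes come from a single fic, using the two figures quoted above.

```python
import re
from collections import Counter


def ngrams(title, n):
    """Return all groups of n consecutive words in a title."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_frequencies(titles, n):
    """Count every n-word group across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(ngrams(title, n))
    return counts


# Hypothetical sample titles, only for demonstration
sample_titles = ["The Story of a Pony", "The Story of My Life", "Sunny Skies All Day Long"]
trigram_counts = ngram_frequencies(sample_titles, 3)

# Share of the "all day long" likes contributed by one fic (figures from the text)
single_fic_share = 8306 / 8598  # roughly 0.966, i.e. about 97%
```

With one story accounting for about 97% of a group's likes, the ranking of longer word groups mostly reflects individual popular fics rather than any pattern in titles.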


Most frequent titles
The 20 most frequent story titles on fimfiction.net ("viewsp~r" is views per chapter, "total_~s" is total views, i.e. the sum of the views of each individual chapter):
[code]
    +--------------------------------------------------------------+
    | count  title                views  viewsp~r  total_~s  likes |
    |--------------------------------------------------------------|
 1. |    25  Memories             26488  22910.49     62358    948 |
 2. |    24  Alone                27194  24840.64     69437   2024 |
 3. |    20  A Second Chance      30130  18024.33    434996   1754 |
 4. |    18  Broken               28431  25099.48     72203   2523 |
 5. |    17  Redemption           30568  17029.39    159691   1717 |
    |--------------------------------------------------------------|
 6. |    15  A New Life           21917  13803.09     90129    829 |
 7. |    14  A Night to Remember  28587  27786.71     35972    864 |
 8. |    14  Rain                 16985   15208.5     34779   1303 |
 9. |    13  Wings                21015  16005.87    100536   1059 |
10. |    12  Secrets              10790  8634.995     34708    522 |
    |--------------------------------------------------------------|
11. |    12  Remembrance          10121  9244.267     16835    669 |
12. |    12  Forgotten            14381  12150.93     33562   1276 |
13. |    12  Coming Home          11741  11115.25     16341   1124 |
14. |    12  Silence               8082  7518.133     11216    491 |
15. |    12  Loyalty              36249  35519.75     40577    905 |
    |--------------------------------------------------------------|
16. |    12  Monster              36592  32304.47    259465   2475 |
17. |    12  The Trotting Dead     9233  6247.296     47410    417 |
18. |    12  Destiny              11806  9937.067     16868    389 |
19. |    11  A Whole New World    21439  10530.55    333505   1354 |
20. |    11  Changes              35597  26598.77    145858   2735 |
    +--------------------------------------------------------------+
[/code]
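A table like the one above can be produced by grouping stories on their exact title. The following is only a sketch: the keys `title` and `likes` are hypothetical, the sample records are made up, and treating titles as identical only when they match exactly (after trimming whitespace) is an assumption.

```python
from collections import defaultdict


def most_frequent_titles(stories, top_n=20):
    """Group stories by exact title; return (title, stats) pairs, most frequent first."""
    groups = defaultdict(lambda: {"count": 0, "likes": 0})
    for story in stories:
        key = story["title"].strip()
        groups[key]["count"] += 1
        groups[key]["likes"] += story["likes"]
    ranked = sorted(groups.items(), key=lambda item: item[1]["count"], reverse=True)
    return ranked[:top_n]


# Hypothetical records, only for demonstration
sample_stories = [
    {"title": "Alone", "likes": 3},
    {"title": "Rain", "likes": 1},
    {"title": "Alone", "likes": 4},
]
top_titles = most_frequent_titles(sample_stories)
```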



General title statistics


Story titles consist of 3.8 words or 22.2 characters on average. The most frequent values are 3 words and 17 characters.
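Averages and most frequent values of this kind can be computed with Python's `statistics` module. This sketch assumes that words are separated by whitespace and that spaces count as characters; the post does not say which convention was used. The sample titles are taken from the table of frequent titles and serve only as a demonstration.

```python
from statistics import mean, mode


def title_length_stats(titles):
    """Return mean and most frequent value of title length in words and characters."""
    word_counts = [len(title.split()) for title in titles]
    char_counts = [len(title) for title in titles]  # assumption: spaces are counted
    return {
        "mean_words": mean(word_counts),
        "mode_words": mode(word_counts),
        "mean_chars": mean(char_counts),
        "mode_chars": mode(char_counts),
    }


sample_titles = ["A New Life", "Redemption", "A Second Chance", "Coming Home", "A Whole New World"]
stats = title_length_stats(sample_titles)
```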


Conclusion
Looking at these word lists and histograms is cool, but very limited and inexact in what it can tell us. Regression analysis would enable us to draw more exact and more extensive conclusions. Next time.



Appendix: Data acquisition
I downloaded the data with a Python script through the fimfiction API, which gave me 168,840 JSON story metadata records. I then converted everything to one big CSV file and wrote a Python script to count the word frequencies and generate the charts. I also did some stuff with Stata.
Python script: (I can upload it somewhere on request)
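Until the actual script is uploaded, here is a rough sketch of what the JSON-to-CSV step might look like. It is not the original script: the field names in `FIELDS` are hypothetical, and it assumes one JSON record per line of input. It says nothing about how the API itself is called.

```python
import csv
import io
import json

# Hypothetical column names; the real metadata records may use different keys
FIELDS = ["id", "title", "views", "likes", "dislikes"]


def records_to_csv(json_lines, out):
    """Write one CSV row per JSON record and return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for line in json_lines:
        record = json.loads(line)
        # Keep only the chosen columns; missing keys become empty cells
        writer.writerow({field: record.get(field, "") for field in FIELDS})
        written += 1
    return written


# Hypothetical record, only for demonstration
sample = ['{"id": 1, "title": "A New Life", "views": 100, "likes": 10, "dislikes": 1}']
buffer = io.StringIO()
records_to_csv(sample, buffer)
print(buffer.getvalue())
```

For a real run, `out` would be a file opened with `newline=""`, as the `csv` module documentation recommends.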
