
Taialin



A Statistical Summary of Fimfiction (Or The Scattered Remains of an Abandoned Study) · 5:25am Feb 27th, 2016

About one year ago, I had the exceptionally bright idea of taking a random sample of all stories published to Fimfiction at the time and attempting to find something significant to say about them, whether by suggesting what genres get the most views, what sells the most, or what stories this fandom was lacking. Come to think of it, I'm really not sure why I chose to begin such an endeavor in the first place. Probably because I thought it was fun (for a given definition of "fun").

Needless to say, such a study took way longer than I anticipated, and given that I was really struggling to conclude something profound from this huge mass of data I collected (probably because there wasn't anything significant in the first place), I dropped the effort. That was nine months ago. All the data is still sitting on my computer gathering dust.

I found a few trends and conducted some analyses, but they're the results of a study that never was. Now they're just curious graphs and figures that are interesting to look at but don't mean a whole lot out of context. They'll actually mean even less soon. Fimfiction is constantly changing, but the data set I gathered was a moment frozen in time (about April 2015). The longer I go without posting this stuff, the more obsolete it gets. Even if they're nothing more than senseless curiosities now, I feel the need to post them before they become pointless, too.

I have no plans to continue working with the data I gathered, if only because I don't see a point anymore. But if anyone out there does see a point or thinks they can do something new with it, by all means, PM me, and I'll send you the raw data.


How often the three story ratings (Everyone, Teen, Mature) are used on Fimfiction. This is pretty much the most basic analysis that could be done on this data, and its value has been pretty much completely deprecated by the statistics section of the Fimfiction engine itself. There's not much to say about this figure, except the fact that the majority of stories on Fimfiction are, in fact, not clop.


A slightly more interesting analysis that shows the frequency of the genres of stories posted to Fimfiction. I have to say, I wasn't expecting Adventure to be such a popular genre. There also happens to be a "Big Five" of sorts before the frequency drops off significantly. The very last entry, Anthro, is low because I took this survey at a time when Anthro had only been a genre for a few months. If I were to take this sample again right now, the distribution would probably look largely the same, with the addition of the even newer tags on the right side.


Not much to say about this one. Darker tags tend to be rated more poorly, as I think anyone with a passing presence on Fimfiction would be able to tell you. But I told you I was going to drop everything that I did, so here it is. Make what you will of it.


Welp. Once again, darker tags also get fewer views. With the added bonus of objectively showing that sex, indeed, sells. By a depressingly large margin. Followed by Romance. Then Human. This is probably my least favorite of the figures I generated through this study. It summarizes pretty much everything I don't like about Fimfiction.


Naturally, stories can be tagged with multiple genres, so I wanted to look at that, too. As you might expect, with respect to the "Sex" tag, Romance shows up alongside it a whole lot more often than you'd expect by chance. And apparently Fimfiction's authors have a thing against writing Adventure sex. Somebody make it happen.


This figure took way too long to make. Essentially, I ran the above analysis on all genres (figures not shown), and crossed them against each other to form a table of genre concomitances. Green cells are placed at the intersection of genres likely to be seen with each other, and amber cells, the opposite. The number inside the cells is the associated z-score (a measure of how far the observed overlap sits from what you'd expect by chance, in standard deviations). Cells that have italic text and are greyed out represent no significant association. So looking at the figure, it seems Fimfiction's authors do love their Dark/Gore mixes, but they hate pairing Slice of Life with Gore or Dark. Also, Romance goes with nothing except Sex, and Slice of Life goes with nothing, period.
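One common way to arrive at a z-score like the ones in this table can be sketched in Python. This is an assumption about the general technique, not necessarily the exact calculation behind the figure: it treats the number of stories carrying both tags as hypergeometric under independence and compares the observed overlap with the expected one. The function name `cooccurrence_z` and the counts in the usage lines are hypothetical and are not taken from the data set.

```python
from math import sqrt


def cooccurrence_z(n_total: int, n_a: int, n_b: int, n_both: int) -> float:
    """z-score for how often tags A and B appear together.

    Assumes (for illustration) that, if the tags were independent, the
    overlap would follow a hypergeometric distribution.
    """
    # Overlap expected by chance alone.
    expected = n_a * n_b / n_total
    # Hypergeometric variance of the overlap.
    variance = (
        n_a * n_b * (n_total - n_a) * (n_total - n_b)
        / (n_total ** 2 * (n_total - 1))
    )
    if variance == 0:
        # Degenerate case (a tag on no stories or on every story).
        return 0.0
    return (n_both - expected) / sqrt(variance)


# Hypothetical counts: 1000 stories, 100 with tag A, 200 with tag B.
print(cooccurrence_z(1000, 100, 200, 40) > 0)  # more overlap than chance
print(cooccurrence_z(1000, 100, 200, 5) < 0)   # less overlap than chance
```

A positive result would correspond to a green cell and a negative one to an amber cell; values close to zero would be the greyed-out "no significant association" cells, with the cutoff depending on the significance level chosen.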


This is the number one graph you should show people if you want to tell them that writing for fame on Fimfiction is a stupendously bad idea. The vast, vast majority of stories here are forgotten and never seen. Fame begets fame, you see, so stories that get more views also get more attention and in turn generate more views for themselves. The heat system also helps contribute to this. Those stories that don't have attention don't get over the "early adopter" hump, and don't get many views.

Transform the graph logarithmically, and it almost looks Gaussian; in other words, view counts are roughly log-normal. This is pretty much the case for a lot of things in society (income disparity being the prime example). Hey, at least it's not hyperexponent-based.


This graph is actually sort of difficult to analyze. Correlation is not causation, see. So you can't conclude that writing a better story with a better rating will garner more views. Equally, you can't say that more views will cause a story to be more favorably rated. The two are, however, positively correlated with each other (weakly). It doesn't help that these two variables aren't entirely independent, either. More on that in a second.


This figure is also one that should be taken with a grain of salt, particularly because of the confusing way that Fimfiction defines "rating." See, Fimfiction's rating system is based on calculating the lower bound of a Wilson interval, a statistical calculation. It's rather confusing to explain in full (see this link for a detailed explanation), but suffice it to say that a story's rating is partially dependent on how many times it's been rated (and by extension, viewed). For instance, 9 upvotes to 1 downvote does not get you a rating anywhere close to 90%; it works out to roughly 60%. To actually hit 90%, try 52:1 instead. Here's a super long Excel formula I developed in case you want to calculate it yourself:

=(1/(1+((1/(A1+A2)*(NORMSINV(1-(0.05/2))^2))))*(((A1/(A1+A2))+((1/(2*(A1+A2)))*((NORMSINV(1-(0.05/2))^2)))-(SQRT(((1/(A1+A2))*(A1/(A1+A2))*(1-(A1/(A1+A2))))+((1/(4*((A1+A2)^2)))*((NORMSINV(1-(0.05/2)))^2)))*(NORMSINV(1-(0.05/2)))))))

Where A1 and A2 are the upvote and downvote counts, respectively.
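For anyone not working in Excel, the same calculation can be sketched in Python. This is a minimal sketch, not Fimfiction's actual code: it assumes the 95% two-sided level that is hard-coded in the formula above (the 0.05), and the function name `wilson_lower_bound` and the choice to return 0 for a story with no votes are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower_bound(upvotes: int, downvotes: int, alpha: float = 0.05) -> float:
    """Lower bound of the Wilson score interval for the upvote proportion."""
    n = upvotes + downvotes
    if n == 0:
        # Assumption: an unrated story gets 0 (the Excel formula would
        # divide by zero here instead).
        return 0.0
    # Same quantile as NORMSINV(1-(0.05/2)) in the Excel formula.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = upvotes / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


print(f"{wilson_lower_bound(9, 1):.2f}")   # 0.60
print(f"{wilson_lower_bound(52, 1):.2f}")  # 0.90
```

The two printed values match the example above: a 9:1 story lands at about 60%, and it takes about 52:1 to reach 90%.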


Data. Look at all the data. There's like over a thousand entries here, each with some thirty points of data. All generated manually. I found it relaxing for some reason. Again, if you want to do something new with this, PM me.

Whew. I have no use for any of these analyses, but they're kind of fun to look at. There's a lot of data I collected that didn't make it into any of these graphs, and I don't really care to find a creative way to integrate them. If nothing else, hopefully you enjoyed looking at all the pretty colors and the sort-of-statistics behind all of them.

Comments ( 7 )

Ocalhoun might be interested. He did some vaguely pointless stats a year back or so, regarding the sorta best time to post a story for max views, which was also dependent on mods being awake. Not so much now.
Also, holy crap that rating equation is complicated. Whhyyy.

The views to rating correlation might be more interesting if you plot views against the raw upvote:downvote ratio. The fancy "rating" computation would lead stories with higher numbers of votes to have higher rating even if each individual vote was completely random with the same distribution for every story.

Glad to see that at least something ended up happening with all the data you collected. Even if it just became a few interesting little graphs, I'm glad I could assist with the data collection. It was actually kinda fun for me too :twilightsmile:

Interesting stuff! Singularity Dream may at last have competition in the Ponyfic Spreadsheet stakes. ;) Not that I'm a gore fan, but did you do an analysis of "Gore" against genres in the same way you did for "Sex"? I'd guess that Adventure would be up in the top few of a chart like that, though behind the likes of Dark.

One very minor point, though it may not apply to the period when you collected your data: there are actually quite a lot of Adventure+SoL and Comedy+Sad fics, since stories that had both tags before the combination was disallowed have been permitted to keep them.

3779739
Because statistics is complicated. Yay. Abstracting away rating calculations is nice so that laypeople don't have to worry about it, but it's kind of frustrating for nerds like me. Oh, and do you have a link to those statistics that ocalhoun did? I do love me some pointless statistics.

3780109
i63.tinypic.com/fnsaba.png
Nope, not really. Actually looks a little bit messier. Unless you have something notable to comment about with this graph?

3780422
Yaay, nerds! I wish we could have done something more notable with all this data, but I guess it wasn't to be. I just couldn't think of a creative way to tie up all these stats into a nice paper with a notable and significant conclusion. Instead I have, "look at all these cool graphs and stats!"

3780456
Sure do! You could extract it from that table of genre concomitances, but here's a nice graph for your delectation.
i66.tinypic.com/69er1h.png
Strangely enough, nope. Even Sex and Human are more positively correlated than Adventure. Don't really know what to make of this.

I surveyed all stories with number ranges 1 to 258000, and I did find some stories with those forbidden pairs. I only left them out of the table because Fimfiction's rule imposes an artificial barrier against these pairs, which confounds the result against what would actually be the case if there weren't such a rule. For the record, they are, as expected, very negatively correlated, with Z scores under -8, an effective probability of less than 0.0000000000001%.

3781070
Hmmm. I get an impression that the majority of stories are very close to the left side of the graph, but at least the correlation we see now is not skewed by taking the lower end of a confidence interval. Perhaps something clearer would show with a logarithmic x-axis (which would correspond to the logit of up/(up+down)). But perhaps not. Just thinking aloud here while procrastinating on my writing. :twilightblush:

Ocalhoun's statistics post is here: https://www.fimfiction.net/blog/494106/when-is-the-best-time-to-post-a-story-answered-with-science
