
Titanium Dragon


TD writes and reviews pony fanfiction, and has a serious RariJack addiction. Send help and/or ponies.


Does Story Rating Correlate with Story Quality? · 11:55pm Mar 10th, 2015

For those of you who don't know, I keep bookshelves of my highly recommended, recommended, and upvoted (aka Worth Reading) stories (note that not all of these have been reviewed). I also have hidden bookshelves for stories I did not vote on and for stories I downvoted. Note that the upvoted list includes the recommended and highly recommended stories, and the recommended list includes the highly recommended stories.

One of the reasons I did this was for future statistical analysis. I asked myself the question:

Does rating correlate with story quality?

The answer, as it turns out, is yes:

I got these numbers by sorting the stories in each category by rating and then looking at the Nth story: the 1st through 10th, followed by the story at the 10th, 25th, 50th, 75th, 90th, 95th, and 100th percentile (the 100th being the last-placed story). For each one, I recorded its overall rating rank on the site. Low numbers are thus better than high numbers - #11 is the 11th highest rated story on the site, for instance.
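The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact process used for the post: the function names are made up, each shelf is assumed to be a plain list of site-wide rating ranks, and the Nth-percentile story is assumed to be picked by rounding down (so the 10th percentile of 344 stories is the 34th story).

```python
def rank_at_percentile(site_ranks, pct):
    """Site-wide rank of the story sitting at the given percentile of a shelf."""
    ordered = sorted(site_ranks)  # best (lowest) rank first
    # Assumed rounding rule: 10% of 500 -> 50th story, 10% of 344 -> 34th story.
    position = max(1, len(ordered) * pct // 100)
    return ordered[position - 1]


def summarize(site_ranks, top_n=10, percentiles=(10, 25, 50, 75, 90, 95, 100)):
    """Build one column of the table: the top N stories, then the percentile rows."""
    ordered = sorted(site_ranks)
    rows = {f"#{i + 1}": rank for i, rank in enumerate(ordered[:top_n])}
    for pct in percentiles:
        rows[f"{pct}%"] = rank_at_percentile(ordered, pct)
    return rows


if __name__ == "__main__":
    # Hypothetical shelf of 20 stories with made-up ranks, purely for illustration.
    shelf = [5 * i for i in range(1, 21)]
    for label, rank in summarize(shelf, top_n=3).items():
        print(label, rank)
```

Running `summarize` once per bookshelf and lining the results up side by side would give one column per category, which is the shape of comparison described here.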

As you can see, there's a pretty clear drop-off - while the top 10 stories are fairly randomly scattered, once you get further down each list (particularly the 25th through 95th percentile), story rating does indeed appear to correlate with quality, albeit somewhat weakly - the median highly recommended story has a rating rank of 1727, the median recommended story 1901, the median upvoted story 3313, the median no vote story 5104, and the median downvoted story 9018.

This suggests that rating, while not a great proxy for quality, does indeed correlate positively with quality - the better a story is, the more highly rated it is likely to be. Unfortunately, the stories are not well-sorted by rating, and the sample sizes aren't enormous, but I think this is evidence that, at least according to my personal view of quality, there is some amount of real correlation. Interestingly, it also seems to indicate that there may not be a real difference between my recommended stories and highly recommended stories - I may be differentiating between the two more or less at random, as far as quality is concerned, but on average I do appear to pick out better stories to recommend than to merely upvote.

So, when someone says rating has nothing to do with the quality of a story, they're wrong - but they're not wrong to suggest that it is insufficient evidence of quality. A story with a higher rating really is more likely to be good than a story with a lower rating, but it is not definitive.

Comments ( 22 )

Do you think that if I divided stories the same way you do, there would be the same correlation?

You're really organized. :heart:

2866226
That would be my guess!

It would be interesting to see more numbers from more people, but I don't know who else is OCD enough to do stuff like this. I think a fair number of folks track which stories they upvoted (I see a number of upvoted shelves) but I don't know if a lot of people track downvoted stories.

Despite your explanation of how you got these results, I have no idea what's happening there. I've never been good at mathematical formulae. :fluttershysad:

2866269
It wasn't really even math, just going through and gathering data.

Basically, what I did was take each set of stories and sort them by rating.

I then looked at the Nth story's rating on the site (i.e. what rank number it had) and put it into the table.

The 10% indicates 10th percentile - that is to say, of the, say, 500 stories I upvoted, it is the 50th story. Of the 344 stories I downvoted, it is the 34th story. I then recorded their ratings in the table.

So basically, if you scan across a single row of the table, you can compare where each fraction of the stories in that column fell - so for instance, the top 10% of my highly recommended stories ranked 92nd or better on the site, the top 10% of the stories I upvoted ranked 265th or better, and the top 10% of the stories I downvoted ranked 1348th or better. Thus, the top 10% of the stories I highly recommended were rated much better than the top 10% of the stories I upvoted, and vastly better than the top 10% of the stories I downvoted.

2866240
What about looking at PresentPerfect's rankings? He doesn't go by up/down vote, but he does rank them similarly, and has a huge sample size to play with.

By itself, this is a "well of course" sort of thing. It would be bizarre if this were not the case.

To make it useful you need to find a way to measure effect size: how much more likely is a story to be dragongood if it has twice as many likes (or X ratio)? How much more likely is a story to have at least X more likes (or X ratio) than another, when the former is, in fact, dragongooder?

I've done smaller sample sizes on occasion. Mostly just checking out randomly picked stories from mostly green, equal green/red, and mostly red. My conclusion was that a high rating is more or less protection from badness rather than indication of goodness. If that makes any sense.

Now I'm tempted to go through and cross-reference all of PresentPerfect's reviews to see how they add up. Of course then I remember I've got enough of my own work to mess around with.

2866771

My conclusion was that a high rating is more or less protection from badness rather than indication of goodness. If that makes any sense.

Well, if you look at rating as "odds that you won't dislike this story", it is by its very nature a good predictor of such, as that is more or less what it is actually measuring.

2866871
For me it's more of an "odds this story will not make you want to use eyebleach". Maybe I just got real unlucky picking red bar stories.

2866878
To get a red bar, you need to be either awful, offensive, or controversial; I've got a couple of red-barred stories on my lists which are actually quite good, but it is hard to find them unless you're already aware of the author for other things or someone points them out to you.

Tracking upvoted-but-not-favourited stories is a good idea. I think I'll start doing that from now on.

I'd be curious to see how the ratings for your downvoted stories would change if you removed your downvote. The difference between you upvoting a story and leaving it untouched is small, but downvotes have an outsized effect on story rating, especially on lesser-read stories; I've seen a single downvote push stories down hundreds of places in the ratings.

2867926
They do change a little, but not enough to account for the differences - especially not further down the list.

Unfortunately, checking requires me to temporarily change the vote to an upvote, as you cannot simply unvote on stories (this also makes it impossible to check the impact of my upvotes on stories).

For there to be no correlation would be surprising. I'd rather see the probability distribution for you recommending a story (highly or not) as a function of its (upvotes + 1) / (upvotes + downvotes + 2). That is, is the increase in P(recommended) with rating monotonic?
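One rough way to check this is sketched below, assuming the expression is meant as the ratio (upvotes + 1) / (upvotes + downvotes + 2). The function names, the choice of five equal-width bins, and the input format (a list of `(upvotes, downvotes, recommended)` tuples) are all assumptions made for illustration; none of them come from the post.

```python
def smoothed_score(upvotes, downvotes):
    """Smoothed like-ratio: (upvotes + 1) / (upvotes + downvotes + 2)."""
    return (upvotes + 1) / (upvotes + downvotes + 2)


def recommend_rate_by_bin(stories, n_bins=5):
    """Estimate P(recommended) within equal-width bins of the smoothed score.

    stories: iterable of (upvotes, downvotes, recommended) tuples.
    Returns one rate per bin, or None for a bin that holds no stories.
    """
    totals = [0] * n_bins
    hits = [0] * n_bins
    for upvotes, downvotes, recommended in stories:
        score = smoothed_score(upvotes, downvotes)  # always strictly between 0 and 1
        index = min(int(score * n_bins), n_bins - 1)
        totals[index] += 1
        if recommended:
            hits[index] += 1
    return [h / t if t else None for h, t in zip(hits, totals)]


def is_monotonic_increasing(rates):
    """True if the non-empty bins never decrease as the score goes up."""
    seen = [r for r in rates if r is not None]
    return all(a <= b for a, b in zip(seen, seen[1:]))


if __name__ == "__main__":
    # Made-up vote counts, purely for illustration.
    sample = [(1, 9, False), (5, 5, False), (5, 5, True), (9, 1, True)]
    rates = recommend_rate_by_bin(sample)
    print(rates, is_monotonic_increasing(rates))
```

With real data the bins would need enough stories each for the rates to mean anything, so the bin count is something to tune rather than a fixed choice.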

2866306

To show how much:

I don't understand either the original post or the explanation of the original post, I'll put up my conclusion from looking at your data. Then maybe someone can tell me why my conclusion is wrong, and that'll make me try thinking about this all in a different way.

'Cause what I get from all this is: stories you like are largely stories that FimFiction in general has liked, and stories you dislike are largely stories that FimFiction in general didn't like. Therefore when folks complain about your review columns being excessively negative, they're way off base because you are in essence the average FimFiction reader. Which means that the average FimFiction reader can try 15 or 20 stories before finding one that he or she likes.

Or am I way off base here? And do people really add stories to their "favourites" without upthumbing them and vice versa?

Mike, Confused as Usual

2869220
The point is that rating correlates positively with quality; i.e. a better story tends to be on average somewhat more highly rated. It isn't automatically higher rated than other stories, but a 50th percentile upvoted story is around the same point, ratings-wise, as a 25th percentile downvoted story - i.e. half of the stories I upvote are more highly rated than 75% of the stories that I downvote. This necessarily means that some of the stories I downvote are more highly rated than half of the stories I upvote.

Or, to put it another way: if I was to pick out stories at random, the more highly rated they were, the more likely they would be to be upvote worthy, whereas the lower a story was rated, the more likely it would be to be downvote worthy. That doesn't mean all highly rated stories are good, or all poorly rated stories are bad, and indeed, it does not necessarily mean that I am a good judge of quality (though I'd like to think so :coolphoto: ) but rather that story rating and my perception of story quality are linked to at least some degree.
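The overlap between the two lists can be put into a single number. The sketch below is illustrative only: the function name and the sample ranks are made up, and it assumes each list is a plain list of site-wide rating ranks where lower is better.

```python
def share_outranked(upvoted_ranks, downvoted_ranks, pct=50):
    """Fraction of downvoted stories ranked worse than the upvoted story at pct."""
    ordered = sorted(upvoted_ranks)  # best (lowest) rank first
    cutoff = ordered[max(1, len(ordered) * pct // 100) - 1]
    worse = sum(1 for rank in downvoted_ranks if rank > cutoff)
    return worse / len(downvoted_ranks)


if __name__ == "__main__":
    # Made-up ranks, purely for illustration.
    upvoted = [10, 20, 30, 40]
    downvoted = [15, 35, 45, 55]
    print(share_outranked(upvoted, downvoted))
```

A result near 0.75 at `pct=50` would match the comparison above; a result near 0.5 would mean rating says almost nothing about which list a story lands on.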

I dislike a far higher percentage of FIMFiction stories than the average person: if I were the average FIMFiction reader, the overall upvote:downvote ratio on stories would be on average something like 10:7, when it seems to be closer to 10:1. I'm not well-representative of the population.

And do people really add stories to their "favourites" without upthumbing them and vice versa?

Yes, because a lot of people are lazy and just use their favorites shelf as a "read later" shelf. I dunno how frequent it is, but I see it often enough that it is noticeable.

Am I reading this right that some stories overlap in multiple bookshelves? Like I see the #1 on highly rated apparently on #3 or so on recommended

2869851
If I'm reading the list correctly, Highly Recommended is a subset of Recommended stories, which is a subset of Upvoted stories. So every HR story is also R, and every HR/R story is also Upvoted (but the reverse is not true).

2869244

OK:

I think I'm getting a better handle on this. So a more appropriate conclusion might be: if folks are looking for good Pony stories to read, this more long-term list of highly-rated stories is more likely to pay off than, say, the Featured Box?

Mike Again
