Bad Horse




Jul
3rd
2015

Story tag results simplified · 7:13pm Jul 3rd, 2015

I did a terrible job of explaining how to interpret the results I presented in my last blog post. Let me try again, without so much math. There's an easier way to interpret the results for the binary independent variables. (The tags. 'Binary independent variables' = 'tags'.)

But first, snu-snu math! (You can skip to the end if you hate math and understanding things.)

The most important difference between using the equation

Eq. 1: views = b1 * Adventure + b2 * Comedy + b3 * Romance

and

Eq. 2: ln(views) = b1 * Adventure + b2 * Comedy + b3 * Romance [1]

is that equation 1 says each independent variable (things like tags) adds a constant to the expected number of views, while equation 2 says it multiplies the prediction by a constant. Equation 1 doesn't work for fimfiction because if MarySue3235 puts a 'Romance' tag on her story, and Pen Stroke puts a 'Romance' tag on his story, the tag will draw more additional viewers to Pen Stroke's story, since he starts with a much larger audience.
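Here's a rough sketch of that difference in Python. The two baseline view counts are made up for illustration (they are not real numbers for either author), and "baseline" here just means the predicted views without the tag. The Romance coefficient, 0.306, is the one from the table further down.

```python
import math

ROMANCE_BETA = 0.306  # Romance coefficient from the table later in this post


def extra_views_multiplicative(baseline_views, beta):
    """Additional views from adding a tag under Eq. 2: baseline * (e^beta - 1)."""
    return baseline_views * (math.exp(beta) - 1.0)


# Hypothetical baselines, invented for this example only.
small_author_views = 100.0
big_author_views = 10_000.0

small_gain = extra_views_multiplicative(small_author_views, ROMANCE_BETA)
big_gain = extra_views_multiplicative(big_author_views, ROMANCE_BETA)

# Same multiplier for both authors, very different number of added viewers.
print(f"small author gains about {small_gain:.0f} views")
print(f"big author gains about {big_gain:.0f} views")
```

Under equation 1 both authors would gain the same fixed number of views from the tag, which is exactly what doesn't happen.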


[1] I used b1, b2,... instead of a1, a2, ... this time because these coefficients are usually called beta coefficients, and represented by a beta, which is the Greek letter 'b'. 'ln' means the same thing as 'log'; it's just more specifically saying "logarithm base e", where e is about 2.718.


That works like this:


ln(views) = b1 * Adventure + b2 * Comedy + b3 * Romance

e^ln(views) = e^[ b1 * Adventure + b2 * Comedy + b3 * Romance ]

"ln(views)" means "the number you have to raise e to the power of to get views", so e^ln(views) = views. e^[a + b + c] = (e^a)*(e^b)*(e^c). So

views = e^(b1 * Adventure) * e^(b2 * Comedy) * e^(b3 * Romance)

For tags, like Romance, the value is either 0 or 1. If it's 0, e^(b3 * 0) = e^0 = 1, so there's no effect on views. If it's 1, the predicted number of views gets multiplied by e^(b3 * 1) = e^b3.
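If you'd rather check that algebra numerically than take my word for it, here is a minimal sketch. It uses the Adventure, Comedy, and Romance coefficients from the table below; the function names are just ones I made up for this example.

```python
import math
from itertools import product

# Adventure, Comedy, Romance coefficients from the table in this post.
b1, b2, b3 = -0.260, 0.192, 0.306


def views_from_sum(adventure, comedy, romance):
    """e raised to the whole sum, as in the second line of the derivation."""
    return math.exp(b1 * adventure + b2 * comedy + b3 * romance)


def views_from_product(adventure, comedy, romance):
    """Product of one factor per tag, as in the last line of the derivation."""
    return (math.exp(b1 * adventure)
            * math.exp(b2 * comedy)
            * math.exp(b3 * romance))


# Try every combination of tags being off (0) or on (1).
for tags in product((0, 1), repeat=3):
    assert math.isclose(views_from_sum(*tags), views_from_product(*tags))

# With every tag off, each factor is e^0 = 1, so nothing changes.
print(views_from_product(0, 0, 0))
```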

So. Take that table of regression beta coefficients (the bi variables) from my last post, and change each of the binary ones to the multiplying factor e^bi :

TAG          BETA    e^BETA
_______________________________
Ad          -0.260   0.77
Co           0.192   1.21
Ro           0.306   1.36
Hu           0.465   1.59
Tr          -0.115   0.89
celestia     0.100   1.10
chrysalis    0.803   2.23   (this was shortly after her introduction)
cmc          0.125   1.13
daring_do    0.243   1.28   (not long after her introduction)
dinky        0.374   1.45
discord     -0.043   0.96
main_6       0.182   1.20
oc          -0.477   0.62
twilight     0.211   1.24
completed    0.215   1.24
oneshot     -0.085   0.92
[EDIT: Knighty fucked up the 'code' tag, removing whitespace within it. There is now no way to show data tables on fimfiction. Or code.]

Now we have a simple interpretation for each of these tags:

- The Comedy tag multiplies expected views by 1.21
- The OC tag multiplies expected views by 0.62

and so on.
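The conversion from beta to multiplier is one line of arithmetic per tag. A small sketch, assuming the betas are typed in by hand from the table (the dictionary name is mine). Because the table's betas are already rounded to three decimals, a recomputed multiplier can differ from the listed one in the last digit.

```python
import math

# Beta coefficients copied from the table in this post.
BETAS = {
    "Ad": -0.260, "Co": 0.192, "Ro": 0.306, "Hu": 0.465, "Tr": -0.115,
    "celestia": 0.100, "chrysalis": 0.803, "cmc": 0.125,
    "daring_do": 0.243, "dinky": 0.374, "discord": -0.043,
    "main_6": 0.182, "oc": -0.477, "twilight": 0.211,
    "completed": 0.215, "oneshot": -0.085,
}

# Each binary tag multiplies expected views by e^beta.
multipliers = {tag: math.exp(beta) for tag, beta in BETAS.items()}

for tag, factor in multipliers.items():
    print(f"{tag:<10} {BETAS[tag]:>7.3f} {factor:>6.2f}")
```

A multiplier above 1 means the tag goes with more expected views, and one below 1 means fewer.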

Comments ( 22 )

So my HiE romantic comedy featuring Chrysalis and Dinky is good to go, in other words.

Interesting that Adventure reduces expected views to such a degree. I wonder if that's because they're so often multi-part, and thus also incomplete?

All of my fanfics will now be completed humanized romantic comedies starring Chrysalis, Twilight, and Dinky.

Just kidding! I know stats well enough to detect a faulty independence assumption.

3203117
(See my tinytext below for a counterargument to exactly what we were both thinking.)

So logically, I should add as many positive tags as possible to my story, to attain the greatest multiplier.

3203118
That's the basic flaw here. There's an unspoken independence assumption, so it can't be boiled down to a single set of numbers.

So the moral of the story is, always ask Bad Horse to crunch the numbers for you and tell you which tags to pick. I'm certain he will tell you the truth and not lead you astray for reasons of hilarity.

3203139 The tags that are missing didn't give reliable predictions.

Y...you perform statistical analyses on a stupid pastime? For fun?

HA HA HA HA HA HA HA HA HA...oh. :twilightoops:

Hold on, are you telling me that the "sex" tag honestly doesn't cause a strong effect in either direction? I'm finding that hard to believe. But then again, I don't know which direction I'd expect it to lean (I'd hope down).

3203738 I didn't load the 'gore' or 'sex' tag. They're stored in a different place on the webpage, and I never got around to reading them in.

3203213 You must have some bad C code, 'coz I can't follow that reference.

3204854 Hah. English speakers don't think of 'shed' and 'shade' as related, & I didn't know they were until now. I assumed you were quoting some band because you said "couldn't shed no light" instead of "couldn't shed any light", and because I didn't notice the pun in "aftermath" and so "the dark aftermath" sounded like something Pink Floyd would say.

3205295 The etymology is unclear. Google

shade shed etymology

says

late 15th century: apparently a variant of the noun shade.

but online etymology has a wide variety of choices.

Everyone likes Pink Floyd. Well, everyone worth mentioning. :trixieshiftright:

This just might explain the explosive popularity of the two Twilight+human stories I wrote way back when… Cool
