
Bad Horse


Beneath the microscope, you contain galaxies.



The survey-giver is not your pal, or, How to piss people off with a survey · 3:40am Dec 19th, 2016

Right now, 1,638 people and/or ponies have answered the fimfiction survey. Thanks very much! I didn't expect to hit 1000 for a long time.

You should’ve gotten a link to the results after taking the survey, but some people were dazed by the time they finished it. The results are here.

The proximate cause of the survey was my application to Princeton, for which I needed a 25-page writing sample by midnight Dec. 15. I had something written about fan-fiction, but I asserted things about ficdom which I didn’t know or couldn’t prove. So I made a survey.

(I plan to break that writing sample up into articles & send it off to journals, so I'm not supposed to post it here first. I'll post some of the highlights.)

Fortunately, literary journals don't have ethics committees, so I shouldn't have to worry about the exact nature of your consent to use your data, which would not hold up if I wanted to use these results in a psychology paper.

Unfortunately, it turns out how people answer depends a lot on exactly how I phrase the question, what answers are possible, and the background of the person reading the question. So much so that I don’t trust survey results anymore. More on that in a later post.

I got a lot of comments about the survey. Some were constructive criticism, including valid points about ambiguous questions. Some were about things I had no control over, like ranging the linear scale from 1 to 5 instead of 1 to 10, or not giving more elaborate definitions of the scale.

But most of the angry complaints were about things that I did deliberately. (The things I screwed up the worst, you mostly couldn't see. I’ll explain those in another post.)

The problem is that people taking a survey want to express themselves as fully and accurately as possible, while the people giving a survey want to extract usable information from their answers that is clear, objective, and has simple statistical properties. These goals often conflict. I only need one to three bits of information from each question, so it's much more important for the meaning and statistical properties of answers to be clear than for their values to be precise or their information content to be large. As a survey giver, I would much rather get one-tenth of a bit of objective information about a binary choice from your answer than two bits of information about your position in a high-dimensional space with subjectively-defined co-ordinates.
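To put numbers on "bits": a two-choice answer carries at most one bit, and the farther the split is from 50/50, the less it carries. A minimal sketch (the probabilities below are invented for illustration, not survey numbers):

```python
import numpy as np

def h_bits(p):
    """Shannon entropy, in bits, of a binary answer with 'yes'-probability p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

print(h_bits(0.50))   # 1.00 bit: a maximally unpredictable binary answer
print(h_bits(0.73))   # ~0.84 bits
print(h_bits(0.987))  # ~0.10 bits: near-unanimous answers carry little
```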


Things I Did Deliberately

The most-common kind of complaint was about questions that restricted answers. They gave two choices and not ‘both’; they used single-choice instead of checkboxes; they didn’t include an ‘Other’ option. That was deliberate.

Consider some typical cases:

Are you in the fandom more for…
(a) the show?
(b) the fandom?

A lot of people wanted option (c), ‘both?’ But the point of the question is to provide information. Information is literally a distinction between alternatives. That’s why it says ‘more for,’ not ‘for.’ If you're in it 49% for the show and 51% for the fandom, I want you to enter "the fandom," and I will be able to tell roughly what the average importance of each is, and what the variance of that importance is, from the answers. A ‘both’ answer provides zero information, and prevents me from figuring out the average and variance, because I no longer have data on the entire distribution and don't know how much of it I'm missing.

Information is not measured by the number of answers. It is measured in bits, and bits are measured by the probability of giving one choice or the other on a two-choice question. If I’d provided a 'both', most people would have chosen it, even people in it 60% for the fandom and 40% for the show. The people who didn’t choose it, but gave a useful answer, would be questionably representative of fandom. Providing ‘both’ as an answer just bleeds the data away. By restricting it to 2 options, I got a useful and interesting answer: 73% more for the fandom, 27% more for the show.
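Here's a toy simulation of that bleed, with an invented population (a Beta-distributed leaning, nothing from the survey): the forced choice recovers the split, while a 'both' bucket of unknown width deletes an unknown slice of the middle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented population: each person's leaning toward "fandom" is a
# number in [0, 1]; 0.5 means perfectly torn between the two.
leaning = rng.beta(2.0, 1.5, size=2000)

# Forced binary question: everyone reports whichever side is ahead,
# however slightly. The population split is still recoverable.
print("fandom: %.0f%%" % (100 * (leaning > 0.5).mean()))

# With a 'both' option, everyone within some window of the midpoint
# takes it -- and the analyst never learns how wide the window was,
# so an unknown chunk vanishes from the middle of the distribution.
both = np.abs(leaning - 0.5) < 0.10
print("chose 'both': %.0f%%" % (100 * both.mean()))
```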

"But that doesn't really mean that 73% of people are in it for the fandom, because--" No. It doesn't. It means what it says, not what it doesn't say. We wouldn't know what the results meant if there were a "both" option, because that would mean "the importance of these things is within some fraction N of each other," where N is different for each person.

“But you’re not capturing how important it is that these things are really close for me, because--” no; I am capturing the fraction of a bit of information in your slight preference about the one distinction I am gathering information on. When I add all these fractions of bits up, I get lots of information. If I include the “both” answer, I don’t get that information. I instead get additional information about a distinction between one of the original choices and not making a distinction. That's too meta, folks. That's not the information I'm looking for.

If you “make a mistake,” a proportional number of people on the other side of the issue made the same mistake, and it will cancel out. If there is no real preference in the population, the answer will come out 50/50.

And what if I want to do a binomial test on the results? Oh, shit, that "both" answer just threw out an unknown chunk from the middle of the distribution.
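With two clean options, that test is one line (counts hypothetical, roughly matching the 73/27 split above):

```python
from scipy.stats import binomtest

# Hypothetical counts: ~1,600 forced-choice answers, 73% "fandom".
# Two clean options make this a textbook binomial test against 50/50.
print(binomtest(k=1168, n=1600, p=0.5).pvalue)

# With a 'both' option there is no clean n: some unknown slice of
# the distribution's middle never picked a side.
```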

Putting a 1 to 5 scale there instead of two choices would still leave the interpretation of how wide the middle bin should be up to the survey-taker. A 1 to 4 scale, maybe.

This one complaint makes up most of the complaints. Look, it's a complicated subject, and I could be wrong. But I know a lot about extracting information from weak data. It’s math. If I ask, "Which has higher entropy, a normal distribution or a flat distribution," and you hesitate before answering, you aren't in a position to tell me I'm doing it wrong.
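(For the record, the answer depends on what you hold fixed: at equal variance the normal distribution has the higher differential entropy, while on a fixed bounded interval the flat distribution is the maximizer. A quick check of the equal-variance case:)

```python
import numpy as np

# Differential entropy, in bits, comparing at equal variance sigma^2.
sigma = 1.0
h_normal  = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)  # ~2.05 bits
h_uniform = 0.5 * np.log2(12 * sigma**2)                # ~1.79 bits
print(h_normal, h_uniform)
```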

The intensity of feeling and conviction some people have for their mathematically wrong beliefs about this answer are, I think, indicative of a split between the sciences and the humanities, which I hope to discuss at length in a later post. Basically, many people in the humanities think in whole numbers, especially those in art and philosophy. This is because they're rationalists. The words "ratio" and "rationality" come from the same root, because the Greeks were also rationalists and thought in whole numbers. Rationalism and Platonism are closely intertwined; both hold as central beliefs that the integers "exist" in some transcendent reality. Ratios also exist; they are relations between integers. Fractions do not "exist." No one even invented a way of writing down fractions--I'm using "fraction" to mean a finite decimal expansion, though strictly it's a synonym for ratio; English still doesn't have a word for a finite approximation of a ratio--until about 1600. That's why people before then never developed science, but only logic. Most people in the humanities today, and all art theorists, are still under the illusion that logic and rationality are similar to science and reason, because they haven't checked with the scientists in the past 300 years, and that is why we have modernism today: artists and philosophers mistake the fanatical dogmatism and excluded middle of rationalists for science.

But I digress.

Americans only: In 2016, you voted for
o Trump
o Clinton
o Johnson
o Stein
o Castle
o did not vote
o Other

I got a lot of complaints for having this question at all. I didn't want to know people's political views. I wanted to do 2 things:

- I wanted to look for differences between people who read different authors, and between writers and readers. Turned out there were huge differences in voting patterns between people who read different things.

- I wanted to see who would be willing to share their answers on touchy subjects with others. I could've used "Do you masturbate while reading fanfic?" instead, but I might not have gotten enough "yes" answers. Again, there were differences, and not what I expected.

I made some big mistakes here: I didn't include "American but not eligible to vote," I should've made "did not vote" specify "American," and I included "Other."

Some people complained their candidates weren’t listed there, so I added ‘Other’, thinking people would fill in the names of other candidates. Instead, people gave explanations of why they voted for Trump or Clinton, or why they didn’t vote, or said that they weren't American, and 19% of answers came back as “Other.” This means I have to copy the data into another spreadsheet to remove the "Other" category, and that I don't know how many answers filed under "Other" belong in some other entry.

I have some really interesting results, like that 76% of people who liked the “literary” authors voted, while only 39% of people who liked fluff romance voted. Or do I? It could be that people who like “literary” authors are more likely to write long explanations of why they didn't vote. I won’t know unless I go through all the “Other” answers one-by-one and file them where they should go.

I have maybe 10,000 ‘Other’ answers in the spreadsheet. Except for the ones where "Other" is exploratory, like "How do you usually choose stories?", I’m not going to read them. Unless I'm trying to discover what people do, all that the ‘Other’ entry does is stop people from complaining, at the cost of throwing their answers away, making the pie charts that Google Forms produces useless, and screwing up the other results because I don't know what's misfiled under "Other."

Writers: What inspires you most to write fan-fiction? Pick up to 4.

This question was a disaster. I made some big mistakes here, including letting people pick more than one answer. Some people picked one; some picked 4; some (surprise!) picked all 11. The top answer was “adding details, background, or continuations to canon stories,” at a whopping 49%. But wait--if I just count people who chose that as one of 1 or 2 answers, it’s only 43 out of 836. If I count just people who chose “other things outside MLP” as one of 1 or 2 answers, which got 40% of the multiple-choice checkbox answers, I get 60--increasing its relative share of answers by a factor of 1.7. It turns out, looking at that and another question, that the people who were most-interested in the canon also checked lots of boxes.
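The salvage job is to condition on how many boxes each person checked. A sketch of that slice (the filename and column name here are placeholders, not the real export):

```python
import pandas as pd

# Hypothetical Google Forms export: one row per respondent, checkbox
# answers joined with ';' in a single cell.
df = pd.read_csv("survey_export.csv")
picks = df["inspiration"].dropna().str.split(";")

# Restrict to people who checked only 1 or 2 boxes, so heavy
# box-checkers can't swamp the tally.
restrained = picks[picks.str.len() <= 2]

target = "adding details, background, or continuations to canon stories"
print(restrained.apply(lambda xs: target in xs).sum())
```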

What if I want to include this question in a multiple regression? I can’t, because the answers aren’t numerically comparable to each other.

The most-disturbing thing about this question is how much it contradicted the results on a similar question on Afalstein’s 2014 survey. For example, “filling in plot holes” got a mere 0.3% there, versus “filling in gaps in the canon”, which got 35% here--but only 16 people chose it as one of up to 2 choices, and only 2 chose it as their only answer.



The moral of all these cases--and this is repeated throughout the results--is to avoid at all costs answers of ‘both’, fill-in-the-blank, or checkboxes that allow multiple choices, because they wreck the data and make analysis much more time-consuming.

Animal rights? (from 1 to 5)
1. It's okay to kill animals for fun.
5. Medical experiments on animals should be outlawed.

From reddit:

When it comes to animal rights, "it's OK to kill animals for fun" is not a neutral way to refer to game hunting. Nor is "medical experiments on animals should be outlawed" a fair measurement (or representation) of people who care about animal rights.

I’m not speaking for the NRA or for PETA. I’m representing extremist positions. I deliberately chose the words "it's OK to kill animals for fun" to be more extremist than “it’s okay to hunt for sport.”

I took a lot of flak on reddit for being “biased”, by which they meant they didn’t like the descriptions I used on the 1 to 5 questions. One guy wanted me to write out long, detailed descriptions of each end of the scale, which wouldn’t have fit and would have been more subjective and confusing. Another wanted me not to have a scale at all, but to have a wide variety of choices to represent the complexity of real life. Some, I suppose, wanted me to use the usual supposedly objective descriptions like “extremely conservative … extremely liberal”.

When I put choices on a scale of 1-5, that reduces the data to one dimension, but it gives you points along that one dimension. The principal component of any complex phenomenon usually captures at least 50% of the variance. So you’ve still got at least 50% of the data, and you're not throwing away the numeric component of it. You've probably saved more than you've thrown away, and it’s in a useful form.
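Concretely, the check I have in mind is a one-component PCA (the data file here is a stand-in):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data: one row per respondent, one column per 1-to-5 question.
X = np.loadtxt("opinion_scales.csv", delimiter=",")

pca = PCA(n_components=1).fit(X)
print(pca.explained_variance_ratio_[0])   # fraction of variance on PC1
scores = pca.transform(X)[:, 0]           # each respondent's position
                                          # along that one dimension
```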

(This is part of why some questions were America-centric. If I tried to incorporate political views from, say, Europe, their principal component would be different, and it might wreck what data I had.)

Supposing that instead of using a scale along one dimension, you came up with a list of 8 choices on economic policy that didn't ignore the many different ways people’s opinions vary in real life--what would you do with the results? How would you include them in your multiple regression? You’d have to expand the answer into 7 or 8 separate binary dummy variables. Spreading the same weak data across 8 coefficients instead of 1, with no ordering among them, would throw out more information than you saved by having all those answers. And you couldn't get a single linear-regression slope, which is usually much simpler to understand.
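A sketch of the contrast, with invented column names:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical frame: 'outcome' is whatever we're regressing on,
# 'econ_scale' a 1-5 answer, 'econ_choice' an 8-way categorical one.
df = pd.read_csv("survey.csv")

# One numeric scale: a single, readable slope.
m1 = sm.OLS(df["outcome"], sm.add_constant(df["econ_scale"])).fit()

# Eight unordered choices: seven dummy columns and seven coefficients,
# each estimated from a fraction of the data, with no ordering to use.
dummies = pd.get_dummies(df["econ_choice"], drop_first=True).astype(float)
m2 = sm.OLS(df["outcome"], sm.add_constant(dummies)).fit()
print(m1.params, m2.params, sep="\n")
```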

The generic “extremely conservative … extremely liberal” is worst of all. Who’s going to say on a survey that they’re extremists? I doubt even people in ISIS call themselves extremists.

I wanted to contrast answers between people in different social groups. But if there is a difference, that difference erases itself in that highly subjective scale. I live in a town that votes extremely conservative, but they don’t know that. They think they vote normal. Most of them would say someone who voted for Clinton was “extremely liberal,” while somebody in Portland would call that same person “moderate.” People here would consider “it’s okay to kill animals for sport” the moderate position and “it’s okay to kill animals for the meat” a liberal position.

Saying that you don't like how I described the endpoints is just partisan bickering. The important point is that they're clear and will be interpreted pretty much the same way in different social groups.



Next post: Things I Screwed Up

Comments (138)

One of the problems with the survey that I noticed was that, on the voting question at least, you couldn't uncheck a bubble after you'd checked it, if you then changed your mind and decided you didn't want to leave an answer.

I was mildly annoyed you grouped writers together as if they represented the same type of content, despite them having wildly different writing styles. Dawnfade is a good writer, as is Tchernobog. You will not get the same experience reading their work.

4345533 I emphatically agree with this assertion. Of the names I recognized, I could not think of even the slightest correlation between them.

4345526 That was a problem a few people commented on. I have no good solution for it. I could add "no answer" to every question, I guess... it would make the form longer and the results messier.

4345539 4345533 I explained how the computer chose the author names at length in the Author clusters question. One purpose of the survey was to validate the algorithm for forming the clusters, so I could propose that other people use it to study fan-fiction. This is to address the problem of academics thinking fan-fiction is itself a genre.

That results link just takes me back to the survey, not to the results.

curses, now I regret not complaining about questions :derpyderp1:

4345544 Can't help but notice your phrasing leaves it ambiguous whether the survey actually managed to validate the algorithm.

Or maybe your meaning is exceptionally clear, and I'm just an uncultured swine.

Well, that's what you get for expecting a bunch of strangers to do things correctly with little communication. Instead of doing one big quiz, should you have done separate quizzes? (I don't know if that would help or not; if anyone can answer that, I'd be happy.)

No complaints about the religion question? Huh, looks like people are more triggered by politics than by what their morals and beliefs are based on :rainbowlaugh:

4345551 Whoops! Copy-paste error. Fixed. Thanks.

4345562 Oh, there were complaints. Not as many, though.

One thing confused me: I got the same number of "Jewish" answers for religion and for ethnic group, even though maybe 1 out of 10 Jews I know are religious. How does that make sense?

It's a shame to hear the survey went poorly. There were certainly some questions that stood out to me as odd, but between this and reading your write up on the author groupings, there's only so much you could really do. I hope there was at least some usable information in there.

4345593

Idiots who don't know the difference between race and religion (I'm not saying Jews are stupid, I'm saying people are stupid)

4345594 I wouldn't say it went poorly. The complaints were... um, I'd have to read all the comments, which I haven't yet because the Google Forms UI for reading them is awkward, but I'd guess maybe 10% of people complained. And I think I got good answers to a lot of questions. The reddit thread Chinchillax created had mostly hostile replies, but a bunch of them answered the questions, so it's still good.

4345608
Hrm. I must have had the wrong impression based on the tone of the blog. I'm happy to hear that! :twilightsmile:

Well...I suppose in hindsight it was obvious that getting inside the heads of fanfiction writers was going to be a bit, ah, maddening. :twilightblush:

I understand your frustrations, and I appreciate the explanations. It probably isn't easy, but try to cut yourself some slack (and the surveyors too). We're all, like, virgins here. First time's gonna be awkward for everyone. ^.^ (...including that metaphor)

It may have turned into a giant pain in the ass, but it'll be worth it, even if it's in an unexpected way.

4345560
There wouldn't really be a good way of validating the algorithm with the survey data. The survey data might help interpret the results of the algorithm, but it won't really tell you whether it got the groupings correct. One way to validate the algorithm would be to collect a completely independent set of data and see whether it produces similar clusters.

Another important point to make is that the survey seemed unnecessarily large. Such a large survey gives you tons of data and many different ways to analyze the data. However, those reading the results of the survey should be cognizant of the green jelly bean principle—that is, the more comparisons you can test, the more false positives you will find. For example, the website FiveThirtyEight demonstrated this principle by doing a large survey of eating habits and other characteristics, and was able to find statistically significant correlations between things such as eating raw tomatoes and Judaism or eating egg rolls and owning a dog. With over 50 different questions in Bad Horse's survey, one could run in excess of 2500 pairwise comparisons on the data, which would generate over 125 false positive results (unless you properly correct for multiple comparisons). Therefore, given the exploratory nature of the data collection, the data are really only useful for generating hypotheses, which would need to be verified by independent means before any solid conclusions could be drawn. If the survey were meant to test specific hypotheses about Fimfiction readers, the set of questions would be much narrower, and ideally, a plan for analyzing the data, defining the metrics for accepting or rejecting the hypotheses, would have been prepared before data collection began (note: most natural science and social science studies don't actually do this).
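A rough sketch of that arithmetic (taking α = 0.05 per test; the counts are illustrative):

```python
# Back-of-the-envelope for the jelly-bean point: the expected number
# of false positives under the null is (number of tests) * alpha.
q, alpha = 50, 0.05
pairs = q * (q - 1) // 2       # 1,225 question pairs; counting the
                               # individual answer pairs within each
                               # question multiplies this further
print(pairs * alpha)           # ~61 expected false hits already
print(alpha / pairs)           # Bonferroni: per-test threshold ~4e-05
```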

I am sure Bad Horse is aware of such limitations and issues, but others who read about the results should take these thoughts into account.

Very interesting analysis.

Ah well, I liked it. Perhaps slightly flawed in execution, but still very fascinating nonetheless. I'd call it a success!

4345627
4345560

There wouldn't really be a good way of validating the algorithm with the survey data.

I can't validate that the writers write similar stories, but that's not what I was trying to do. I wanted to see whether the readers of the different sets of writers differed significantly in other ways as well, indicating the clusters make real distinctions. I looked at 4 questions, and the groups gave greatly different answers on 3 of them, statistical significance so high I haven't bothered running the numbers yet. The 4th I screwed up by biasing some of the survey takers. IMHO it's validated better than I'd hoped.

4345627

Another important point to make is that the survey seemed unnecessarily large. Such a large survey gives you tons of data and many different ways to analyze the data. However, those reading the results of the survey should be cognizant of the green jelly bean principle—that is, the more comparisons you can test, the more false positives you will find.

Yes. That is a very valid complaint, which no-one made.

If the survey were meant to test specific hypotheses about Fimfiction readers, the set of questions would be much narrower, and ideally, a plan for analyzing the data defining the metrics for accepting or rejecting the hypotheses would have been prepared before data collection began (note: most natural science and social science studies don't actually do this).

Yes, that would have been great. Unfortunately I couldn't do that and use the results on Dec. 15. In retrospect, I probably shouldn't have tried to.

I've worked in surveys before, so I sympathize with the difficulties you faced. The only time I had a problem with the survey was a question where multiple answers could be checked, but the question didn't say either "pick only one" or "pick multiple."

4345641

I can't validate that the writers write similar stories, but that's not what I was trying to do. I wanted to see whether the readers of the different sets of writers differed significantly in other ways as well, indicating the clusters make real distinctions. I looked at 4 questions, and the groups gave greatly different answers on 3 of them, statistical significance so high I haven't bothered running the numbers yet. The 4th I screwed up by biasing some of the survey takers. IMHO it's validated better than I'd hoped.

It's not too surprising that you can find differences between different groups of individuals. A much more interesting result would be to find that those differences are predictive of taste. Gather a group of individuals, get their answers to those three questions and have them rate writing samples representative of your clusters. If that experiment shows that the answers to those three questions predict how they'll rate the stories, that would be a result worth publishing in an academic journal.

4345618

Hrm. I must have had the wrong impression based on the tone of the blog.

That's probably because I'm kind of a jerk.

4345593 Because it's an ancient holdover from a time when there was a lot less fluidity or conceptual separation between different aspects of a person's identity. For many people, being Jewish as a race and culture is still inextricably tied to being Jewish as a religion, and they answer surveys accordingly... even if you're Jewish in only the most trivially nominal sense, don't bother trying to keep kosher, and would never in a million years even consider wasting your Friday / Saturday showing up at a service.

Man, it's really interesting to read about this kind of hardcore statistics work. I'd love to be able to learn some of it, but I suspect my education will lead me elsewhere. In any case, keep fighting the good fight for the rest of us, BH. Looking forward to more discussion of this survey.

This probably doesn't lend itself well to web scripting, but one interesting idea from the literature on Bayesian prior elicitation seems like it might be relevant to psychometric data collection.

Say, for select "choose-one-option" questions, you gave people a stack of balls to assign to different bins representing each answer. Basically, you'd be trying to represent your data with a Dirichlet distribution rather than a categorical (i.e. multinomial n=1) distribution. There's obviously a lot more information in a Dirichlet than in a categorical. Not saying it's necessarily the best answer for survey design, but it's an interesting thing to think about in my opinion.
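A toy simulation of the difference, with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented example: 4 answer bins; each respondent allocates 10 balls.
true_prefs = [0.5, 0.2, 0.2, 0.1]
balls = rng.multinomial(10, true_prefs, size=1000)

# The categorical version of the question keeps only each row's
# argmax, so everyone with graded preferences looks identical.
hard_choice = balls.argmax(axis=1)
print(np.bincount(hard_choice, minlength=4) / 1000.0)

# The ball allocations preserve the graded preferences themselves,
# which is what a Dirichlet model would be fit to.
print((balls / 10.0).mean(axis=0))   # ~[0.5, 0.2, 0.2, 0.1]
```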

Well at least it sounds like you collected some very interesting if unscientific data on how upset people can get over an online poll about pony fanfiction.

Very interesting, but if you do add a con/lib question, a better move would be to use the word "fundamentalist." People are, oddly enough, more honest with that one.
PS: never use yourself in a survey

I did my best on the survey to post my honest answers with as few choices as possible. The only one I can think of off the bat where I chose several was the question of "how do you choose the stories you read."

Did you add questions to the survey after the first day? Looking at the results, I find a lot of things there that I don't remember answering when I took the survey a couple hours after you posted the link. Or maybe I just took it while half-asleep?

Mike

Bad Horse, your question on Progressivism was basically impossible to answer.

You ask us to define our progressivism on not just subjective values, but the subjective values of other people. You ask us to use the definition as it's presented in the media. Which media? History textbooks? Wikipedia? Progressive forums? SJW forums? MSNBC? Fox News? Info Wars? Depending on which source I pick, all of which are valid under the vague definition you provided, I am rating myself on a 1 to 5 scale from pro-nice-guy to pro-fascist fifth column planning on destroying the values that made America great. Woodrow Wilson was a progressive and a massive fucking racist, and Planned Parenthood used to be a front for a eugenics conspiracy. On the other hand there are people like Bernie or Jimmy Carter who would probably call themselves progressives and whom others would use as examples of progressives. Who do I pick as an example, when I can find media justifying basically every interpretation of the term and every possible shade in between? Furthermore, your attempt to clarify the ends of the scale only mystifies things further. Now we are asked to have not just other people judge our position on the scale, but hypothetical people, and God knows if our model ideal Republican and Moderate are anything like your ideal Republican or Moderate.

We can't actually know how you would want us to answer that question if you knew our positions on the issue and you can't know if we answered them using the same scale as the one you are assuming to interpret the data. The question is quite literally useless from top to bottom and I only answered it as a private joke about people who have called me names when they really should have known better.

Bottom line, Bad Horse: if everyone is telling you that the data you have does not represent them, and the data is about them, and you are trying to find out true things about them, then the data is basically worthless by definition. I know it sucks, and I know it's hard, but you will have to put more than a couple days' effort into designing surveys if you want true, useful data about the people surveyed. There is a reason there are doctoral-level classes about just this subject in every field it's relevant to. Otherwise you just condemn yourself to writing papers about spherical cows while ivory-tower fogeys nod along, because you appealed to their biases and they know less about your subjects than you do, and nothing of certain value gets added to human knowledge.

The most-common kind of complaint was about questions that restricted answers. They gave two choices and not ‘both’; they used single-choice instead of checkboxes; they didn’t include an ‘Other’ option. That was deliberate.

Sure. This works when the respondents typically have a defined position they are ready to explain. If they don’t, which was invariably the case in your survey, they have to define it in response to your question. Which is what you want, only, sometimes you want too much, because defining your position on complicated issues is a significant cognitive load.

You used a single large controversial dichotomy everywhere you should have used two or three smaller, easier to digest ones instead. A longer survey one can answer without thinking is preferable to a short survey you have to meditate for ten minutes on each question to answer.

I have maybe 10,000 ‘Other’ answers in the spreadsheet. Except for the ones where “Other” is exploratory, like “How do you usually choose stories?”, I’m not going to read them. Unless I’m trying to discover what people do, all that the ‘Other’ entry does is stop people from complaining, at the cost of throwing their answers away, making the pie charts that Google Forms produces useless, and screwing up the other results because I don’t know what’s misfiled under “Other.”

That’s why there’s such a thing as a “pilot survey” you run on a smaller set.

The most-disturbing thing about this question is how much it contradicted the results on a similar question on Afalstein’s 2014 survey. For example, “filling in plot holes” got a mere 0.3% there, versus “filling in gaps in the canon”, which got 35% here–but only 16 people chose it as one of up to 2 choices, and only 2 chose it as their only answer.

Pony universe has far more gaps in canon than plot holes. Gaps in canon, i.e. “things canon did not say about the world but should have” are not identical to plot holes, which are “things story did not say about the events shown, but should have.”

You’re trying to compare apples to apple pies.

4345842 Instead of all that, tell me what question you would have asked instead.

I feel like I should apologise for putting "I'm not American" in the Other category for voting. I didn't read the question right.

4345844

You used a single large controversial dichotomy everywhere you should have used two or three smaller, easier to digest ones instead.

Can you give an example of what you mean?

That’s why there’s such a thing as a “pilot survey” you run on a smaller set.

Yes. I wish I had done that better. :unsuresweetie: But I only put those "Other _____" answers on questions because people in my very small pilot study said they wanted them.

Pony universe has far more gaps in canon than plot holes.

Good point.

4345742 I think that would also be a nice way to implement voting on political issues if we were to have a direct democracy.

4345803 I added and removed questions after the first day. The ones I removed are under "Retired questions" in the results.

4345800

PS: never use yourself in a survey

Yeah... good advice, but I had no other equally-stable cluster member to use.

The problem is that people taking a survey want to express themselves as fully and accurately as possible, while the people giving a survey want to extract usable information from their answers that is clear, objective, and has simple statistical properties. These goals often conflict.

It seems like there are a whole lot of unexamined assumptions in this paragraph, and also a lack of judgment with regard to which goal is correct, or at least more productive.

Why do people taking a survey want to express themselves as fully and accurately as possible? What goal do they think they're achieving by doing this?

I don't know that anyone has ever collected objective information on that (by making a survey about surveys!), but speaking for myself, it seems likely that they want to do that because they know nobody does a survey for no good reason. They figure somebody is collecting that information in order to analyze it or prove something. Many people have at least a dim understanding of "garbage in, garbage out"; if your data is shit, then your analysis is also going to be shit no matter how good your analytical tools are. They'd like to avoid wasting everyone's time, or contributing to a false statistic being bandied around as an authoritative measure. (Unless they're trying to deliberately ratfuck your survey, which happens a lot.)

This seems a worthy desire on people's part.

the people giving a survey want to extract usable information from their answers that is clear, objective, and has simple statistical properties.

Why?

No, seriously. Why?

The most obvious answer to "Why?" that I can think of is "because information of that sort is easiest to work with; it produces easy-to-analyze datasets and easy-to-understand answers."

And that seems like a weak answer, because presumably, any survey-giver's goal isn't just to gather information they can work with easily and then check out for the day; it is, or probably ought to be, to gather accurate information to use in their analyses, because, as said before, if your data is garbage, your analyses will also be garbage.

While I am no expert on the subject, it seems that often, getting accurate results from certain kinds of analytical questions is going to require data that is messy and very hard to collect and sort, because to render it down into a more simplistic state also eliminates a lot of signal.

Example: the voting question. If the question you're trying to answer is this:

I wanted to see who would be willing to share their answers on touchy subjects with others.

Then that is a great question! Because literally what answer they chose doesn't matter, merely that they answered it.

If the data you're trying to gather is this:

I wanted to look for differences between people who read different authors, and between writers and readers.

It is also a great question... except that now that you have that data, what are you going to do with it? Because that data is going to be valid and appropriate for some analyses, and highly inappropriate for others.

However.

While this was not your goal, if the question you were trying to answer was what I imagine what most of the people who encountered that question thought it was ("What are the political attitudes of the people taking this survey?") then that question is dogshit. People are going to answer it anyway; but a lot of them will think "shit, if I check Clinton they're just going to dump me in the liberal bucket, aren't they? That's bad information. I'm not actually liberal; I'm just voting for Clinton. I need to set the record straight or I'll provide a bad data point."

That's laudable. Useless, because they haven't discerned your actual goal (and indeed if they had, that would have poisoned your results, because that question depends on the survey-taker's ignorance as to its real goal), but laudable.

If you wanted to actually gather information on people's political beliefs, you'd need much more than that one question, and the questions would need to be structured a certain way. This is doable.

Unfortunately, it turns out how people answer depends a lot on exactly how I phrase the question, what answers are possible, and the background of the person reading the question. So much so that I don’t trust survey results anymore.

You shouldn't, but that's not because of a failure of methodology.

There are people there who are really good at gathering information and then analyzing it to produce an accurate answer. It turns out this is something that's been very important to many people from many walks of life for a very long time! They've been working on it really hard and they've gotten really good at compensating for "changing a single word dramatically changes how people respond to this, what the fuck." Not perfect. But pretty good.

They've gotten good enough that it is a real problem.

We've become very cynical as a society, with good cause, but for the most part people still trust science and reason. If someone with expertise says something that seems objectively verifiable ("Water molecules contain two hydrogen atoms and one oxygen atom; come here and I will show you") they're more or less willing to accept that, even if it is something they cannot understand. I've never yet read an explanation for how a causal universe and faster-than-light communication are mutually incompatible that I understood at all, that made sense to me, but enough people who have demonstrated deep knowledge of both math and physics have assured me that this is so that I accept it.

This presents a difficulty, tho. Most people think of themselves as at least somewhat empirical. You don't meet a ton of folks who just straight-up go "All the evidence suggests X just ain't so, but I'm going to believe X is so anyway!" What's far more common is that they'll deny that evidence or declare it invalid in some way, and then present their own evidence for why X is, in fact, so.

We like to think of ourselves as people who respond to facts and evidence. Nobody consciously admits "Well, this fact is clearly untrue, but I believe it anyway." At best we can achieve doublethink. Doublethink is hard. The mind rebels against it.

Anyway. My point is that since people will at least nominally respond to facts and figures, you can get them to do what you want by presenting them with the appropriate evidence. And if that evidence doesn't exist, you can get the next best thing by fabricating it!

So there are plenty of great surveys out there that are producing really accurate information. And there are a lot more that are producing precisely the information the survey-giver would like them to provide in order to support the conclusion they've already decided upon.

I'm sure I'm not telling anyone here anything they didn't already know. But... hrm.

I do tech support for a major polling and data-gathering outfit. No, not the one you're thinking of. Or that one. Don't think politics. There's more to the world than politics. Commercial polling outstrips political polling by an ungodly factor, you don't even know.

And this is a major cultural problem inside the company and in their relations with clients. The math nerds, the people with masters or doctorates in mathematics and statistics are the most committed to getting it right. The psychologists and social scientists (the guys who craft the actual questions in our polling) likewise. They get personally offended on a deep level by bad data and bad analyses. Not all of them, but most of them. If you just want a paycheck there are easier things to do than spend ten years learning everything there is to know about regression analyses.

The breakdown comes in among the corporate stooges. I've overheard multiple deeply contentious meetings that basically went "we're not providing the client with useful information and they're threatening to walk." "You keep saying that. THEY keep saying that. They won't say how, tho! We're very sure of our methodology! We're answering the questions they want asked!" "But you're not getting the answers they want." "That isn't our problem. It isn't even a problem at all. The answers are answers. They may indicate a problem but they are not, themselves, problematic." "It is a problem if we want to keep the account." "That is a problem. Your problem." "Do I have to remind you that I'm your boss?"

It gets even worse when you account for internal politics among the clients. Many of them really want accurate information and are willing to pay a shit-ton of money for it. Others just want their preconceived notions validated; the ones who are purely mercenary are easy to deal with, they either move on if we won't play ball or cut us a check if they can find a division of the company that does. The ones who think we're doing something wrong if the answers are not to their liking are harder, because they don't actually know they're asking us to cook the books.

The TLDR here is that you should be leery of surveys, but done right they're still valuable and accurate tools. There's just a lot of noise out there.

4345857

The most obvious answer to "Why?" that I can think of is "because information of that sort is easiest to work with; it produces easy-to-analyze datasets and easy-to-understand answers."

I think it has less to do with 'easy' and more to do with making it difficult to introduce subjectivity and bias into the analysis. The clearer and simpler the data points are, the less you can be accused of fudging what they mean.

Mercushio mentioned it but it bears repeating: Part of the issue of 'other' is also that between the political bits and author groupings you were asking questions on which your audience has nuanced positions that they feel are not accurately answered by a single check box.

4345850

Can you give an example of what you mean?

…you did add lots of questions, didn’t you? Because I’m sure I didn’t see at least 10% of those before. Also, more of these have a third option than I remember seeing, though I might be misremembering that…

Using the one you gave as the example, “Are you in the fandom more for…” – you could have replaced this entirely with two simpler questions like “how many show writer names do you remember” and “how many fanfiction writer names do you remember.” That’s just off the top of my head; there are probably better options for simpler questions that do not require the respondent to consciously define their position but reveal it indirectly instead. They would require some computing to analyze; you wouldn’t see the results at a glance in the pie chart. But it would be trivial computing, and you wouldn’t have to parse or categorize the “Other” column.

In general, every time you have to ask people “are you X or Y” in an environment where the entire alphabet exists, defining them as X or Y based on things they actually report doing and only asking the question outright to see if they misrepresent themselves is typically a much better idea, and makes for a survey both easier to analyze and easier to answer. In the immortal words of doctor House, “people lie.” That’s perfectly normal and should be expected. The trick to getting good data is to give them as little opportunity to do so as possible, and while you might be skilled at extracting data from entropy, (more so than me, at least) you will still get much better results if you can collect your data in such a way that it has less entropy to start with.

P.S. “The last time you read something on fimfiction, you…” question is so vague both in the question and the dichotomy presented that I can’t tell if I have a defined position on it at all.

4345544 I would have preferred to check off the names I personally liked, as the groupings made that difficult. Some groups had a name I liked and then some I'd never read anything of. You could have checked your data against that easily enough.

The problem is that people taking a survey want to express themselves as fully and accurately as possible

This is an oversimplification and to some extent misses the point. What people taking a survey want is to avoid being misrepresented by the results.

Whenever I answer a question on one of these surveys, what I naturally need to consider for each possible answer to a question is: If everybody chose this answer, could someone later use the results of the survey to tell me my actual opinion is invalid? For a binary choice of the kind you're speaking about here, the answer is yes and yes: Depending on the matter under discussion, I can imagine being told either of

Troposphere, you are in a fringe minority because I have a survey here that says 98% of the bronies think the show is more important.

or

Troposphere, you are in a fringe minority because I have a survey here that says 98% of the bronies think the fandom is more important.

Thus there is no answer to the question that is safe for me. Of course there are always people who are wrong on the internet, but the specific indignity of having my own answers count against me is an especially infuriating prospect. And that is, I think, what pisses people off -- particularly when, as here, you were specifically not disclosing which underlying agenda you were going to use the survey results to push.

You can keep professing that you're actually only interested in the slight difference between people who weight the factors 49-51 and people who weight them 51-49 -- but we all know that is not how survey results are being used as rhetorical weapons. In a sense the criticism is that it is wrong for you to be interested in that "slight difference" without also collecting any data at all that will allow you (and, importantly, readers of your conclusions) to distinguish it from extreme opinions.

4345842

Bottom line, Bad Horse: if everyone is telling you that the data you have does not represent them, and the data is about them, and you are trying to find out true things about them, then the data is basically worthless by definition.

Amen.

4345848

You ask us to define our progressivism on not just subjective values, but the subjective values of other people.

It's a much more comfortable question if it's only asking based on your values how you rate yourself, instead of how you think other people would judge you.

4345932

I think a lot of people are missing that at the beginning of the survey it was clearly and obviously stated that you could forgo answering any question you felt uncomfortable with. That kind of includes not answering questions that don't offer an answer you feel correctly represents you, right?

That is not a solution to the core problem, though. The fact that I cannot represent myself with the options given is only a symptom of the underlying problem that here is a survey that purports to represent a population I belong to but yet is structured such that its results cannot possibly reveal that people with my approach to the questions exist in that population. Then the results are going to misrepresent me whether or not I personally answer the question.

It would be easier to shrug this off if it was a question where I know from experience that my opinion is indeed an outlier. But on the contrary, my experience is that most people have nuanced approaches that will be misrepresented by either of the possible answers.

I'm sure this was considered and I'm not clever enough to do more than speculate, but: the survey was very long. I'd guess that at least some people started it and gave up part-way through, eg when the political questions appeared. Even though you explicitly said questions were optional, it's still less work simply to say "Screw it" and close the page there and then, rather than flick through the remaining pages without answering so that at least a partial set of responses gets submitted.

This also ties into another thought, which again I'm sure you considered: my guess is that people who mostly like to read literary fiction, which generally requires considerable effort from the reader, would be more likely to do a "long and difficult" survey like this than people who mostly like to read fluff. Maybe also less likely to give up halfway through, as mentioned above. Again, I'm not clever enough to know how to compensate for this; just something that came to mind.

Anyway, thanks very much for this post. Very interesting reading, and it did explain several things I wasn't clear about before.

Statistical analysis and poll composition are two very different skill sets. Knowing how to do one does not necessarily make you able to do the other.

It's like, knowing how to cook trout amandine does not make me a fisherman.

But you went ahead anyway and made mistakes and learned from them, and that puts you way ahead of all the guys who never got off the couch.

4345593 You did? I could have sworn I answered the religion question as overtly nonreligious, but the ethnicity question as Jewish.
