For the benefit of future classes, and of alumni who need reminders:
1. Random does not mean haphazard or accidental. Asked what ‘random’ means, students reply ‘no pattern,’ or something of that nature. Your sample ain’t random just because you say it’s random! Show how you’ve taken pains to ensure the sample meets the true definition of randomness, namely that each member of the studied population has the same probability of being selected for the sample. (A minimal code sketch of such a draw closes this item.)
o If the sample departs from randomness, show why the departure is small, and not material to answering the research question.
o Never speak of a ‘random sample of experts’: You choose experts for their special expertise, not for their typicality. Expert panels must be chosen expertly, not randomly. Their responses may be subject to descriptive statistics, but never to inferential statistics.
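o A minimal sketch, in Python, of what the definition demands. The frame file and sample size here are hypothetical; the point is that the draw leaves no room for anyone’s judgment about who gets in:

```python
# Sketch: a simple random sample drawn from an explicit sampling frame.
# 'frame.csv' (one population member per line) is a hypothetical file name.
import random

with open("frame.csv") as f:
    frame = f.read().splitlines()     # the complete list of population members

random.seed(2024)                     # record the seed so the draw can be audited
sample = random.sample(frame, k=300)  # each member's inclusion probability: 300/len(frame)
```

If you cannot produce such a frame, you cannot claim that every member had a known, equal chance of selection.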
2. Specify the population. Oh, how students forget that statistical inference is reasoning from a sample to a specified population. Called in as a last-minute replacement examiner, I listened to a candidate’s oral defense. He showed a sophisticated analysis and a big n, and claimed support for his hypothesis at an impressive significance level. I asked, What is the population? He replied, Huh?
o A sample must be drawn from a population. If you can’t say what the population is, your sample is crap. Then the dissertation is crap.
3. And specify the sampling frame and the sampling plan. If the population is ‘shoppers at Horton Plaza Mall in San Diego,’ the frame might be ‘people entering Horton Plaza through the south doors between 11 a.m. and 6 p.m. next Tuesday and next Saturday.’ The plan is ‘Interview every 25th person to enter, with at least 300 interviews to be completed.’ (A code sketch of such a plan closes this item.)
o You give your undergrad assistant your clipboarded questionnaire, along with the sampling plan. Face it: If the 25th person is a smelly guy with a beer gut, your assistant will give him a pass and instead interview #26, who is an attractive member of the opposite sex. Thus violating randomness.
o Solution: Send pairs of assistants. One will do the interviewing while the other enforces the sampling plan.
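o Here is a minimal sketch of the Horton Plaza plan as code the assistants must obey. The interval of 25 comes from the plan above; the random start is my own addition, hypothetical but standard, to keep the plan a probability sample:

```python
# Sketch: systematic sampling with a random start -- no substitutions allowed.
import random

INTERVAL = 25
start = random.randint(1, INTERVAL)   # drawn once, before fieldwork begins

def must_interview(entry_number: int) -> bool:
    """True for the start-th entrant and every 25th entrant thereafter."""
    return entry_number % INTERVAL == start % INTERVAL
```

The enforcing assistant’s job, in effect, is to make sure must_interview() is the only thing deciding who gets approached.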
4. Don’t confuse a census with a sample. If you measure every member of the population, you’ve done a census, not a sample. No statistical inference is needed, or appropriate. You’re not making inference from sample to population, because you’ve ‘sampled’ the entire population!
o One student measured every country in his well-specified population of interest. He then performed statistical tests. ‘Why?’ I asked. ‘Well,’ he said, ‘it wouldn’t look like much of a dissertation if I didn’t have statistical tests.’ BRRRP! Buzzer! Wrong answer!
5. Don’t forget nonresponse bias. You invited 1000 people (randomly selected from a specified population, natch) to your surveymonkey.com site. 700 of them completed the questionnaire. Congratulations, this is actually a very good response rate for management research. However… Why do you believe the responses of the 300 non-responders, had they responded, would have been similar to the responses of the 700? A passable dissertation must include an answer to this. There are techniques for estimating nonresponse bias; one common approach is sketched at the end of this item.
o One student got a response rate like this, questioning businesspeople in his country about their interest in interacting with foreign businesspeople. Doesn’t it stand to reason, his examiners asked, that people not interested in foreigners would not be interested in a questionnaire about their interest in foreigners? Is this not a red flag for nonresponse bias?
o The student got a ‘conditional pass’ and was required to come back later with written estimates of nonresponse bias and its impact on his results.
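o One common (and admittedly crude) technique is wave analysis: treat late responders, who answered only after a reminder, as proxies for the non-responders, and compare them with early responders. A sketch, assuming a hypothetical responses.csv with a wave column and an interest_score item:

```python
# Sketch: wave analysis as a rough check for nonresponse bias.
import pandas as pd
from scipy import stats

responses = pd.read_csv("responses.csv")  # hypothetical respondent-level data
early = responses.loc[responses["wave"] == 1, "interest_score"]
late = responses.loc[responses["wave"] == 2, "interest_score"]  # answered after reminder

t_stat, p_val = stats.ttest_ind(early, late, equal_var=False)   # Welch's t-test
print(f"early mean {early.mean():.2f}, late mean {late.mean():.2f}, p = {p_val:.3f}")
# A marked early-vs-late difference is a red flag that the 300 silent
# invitees would have answered differently from the 700 who responded.
```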
6. Too many hypotheses? Testing many hypotheses on one too-small sample will almost guarantee at least one false positive or false negative. As a rule of thumb, you need a perfectly random sample of n=30, with perfect controls on non-treatment effects, to get a good estimate of just one parameter. (The simulation at the end of this item shows how fast the false-positive risk grows.)
o If you must test multiple hypotheses on multiple quantities, scale up your n accordingly.
o There’s no hard and fast rule. But you will need a bigger sample than you think you need.
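o To see why, here is a small simulation (my own illustration): ten true-null hypotheses tested at α = 0.05 on one sample of n = 30. Each test alone errs only 5% of the time, yet the family of ten yields at least one false positive in roughly 40% of replications (1 − 0.95¹⁰ ≈ 0.40):

```python
# Sketch: familywise false-positive rate when testing many hypotheses at once.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_tests, n, reps = 0.05, 10, 30, 10_000

hits = 0
for _ in range(reps):
    data = rng.normal(size=(n_tests, n))  # pure noise: every null hypothesis is true
    pvals = [stats.ttest_1samp(row, 0.0).pvalue for row in data]
    hits += min(pvals) < alpha            # did any test come up falsely 'significant'?

print(f"at least one false positive in {hits / reps:.0%} of replications")  # ~40%
```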
7. Assuming normality? Take care. There are two issues here: normality of the quantity you’re measuring, and normality of measurement errors. There are formal tests for normality (Shapiro–Wilk, for example).
o The world financial crash of 2008 occurred (among other reasons) because experienced financial managers believed returns on assets would follow a Gaussian distribution. In fact, a longer-tailed distribution (e.g., a power law) was needed to capture the true probability of an extreme event of the kind that did, in fact, happen. For more on this, see Nassim Taleb’s The Black Swan.
o Your research model is y = e^x + ε. You’re more comfortable with linear regression, so you transform it to ln y = x + ε′. Inference in regression depends on Gaussian-distributed error terms. Is it ε or ε′ that’s normally distributed? Or neither? (It almost surely won’t be both.) Find out before you regress! A residual check is sketched below.
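o A sketch of that check, using simulated data so the answer is known in advance: here ε is Gaussian on the raw scale, so the raw residuals should pass a Shapiro–Wilk test while the log-scale residuals typically will not. (With real data, compute both sets of residuals from your fitted models.)

```python
# Sketch: which error term is plausibly Gaussian -- eps or eps-prime?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(1, 3, size=200)
y = np.exp(x) + rng.normal(scale=0.5, size=200)  # true model: y = e^x + eps

eps_raw = y - np.exp(x)   # residuals of the untransformed model
eps_log = np.log(y) - x   # residuals of the log-transformed model (eps-prime)

for name, resid in [("eps (raw scale)", eps_raw), ("eps' (log scale)", eps_log)]:
    stat, p = stats.shapiro(resid)
    print(f"{name}: Shapiro-Wilk p = {p:.3f}")   # small p => normality rejected
```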
8. Justify your use of a statistical test by understanding its mathematical foundation and its match to your research question. Students, and some published scholars, tend to wave away this step with a magic gesture, saying ‘So-and-so (1998) used this test in a similar situation.’ Citing a prior use of the test is not a categorical no-no. But it’s better to show you know why it was used.
o I guarantee you, Professor So-and-so’s research question was quite different from yours. Just because his referees saw the logic of using the test doesn’t mean yours will.
9. The matter of replicability. Is your 90% significance level ‘in principle’ only, or possibly relevant in practice? Compare your management research to that of a medical investigator in a mouse lab: That investigator knows mouse physiology will not suddenly change tomorrow. Tomorrow’s repetition of the mouse experiment will show a result much the same as today’s. The management researcher knows the business environment will change tomorrow. The prospect of replicating your management study on many independent samples under the same conditions is nil.
o If an answer to your research question is likely to be very ephemeral, don’t do the study.
§ A possible exception: If you’re studying an important one-of-a-kind event, for example a nation’s adoption of the Euro, and its effect on consumer prices or attitudes.
o If you think you’ve revealed a management principle of lasting value, say why you think so. Many extra points if you can convince the reader of the study’s practical replicability.
10. The dangers of SEM, multi-level models, etc. The worst reason in the world to do something is just because you can. These highly complex statistical procedures are possible only because of the power of today’s computers. That alone is no reason to use them; you’ll need a much better reason. SEM, factor analysis and the like require judgment on the researcher’s part. (They are not just plug-in formulas.) Can a novice researcher exercise that judgment?
o You write a thesis to show you can do a supervised research project – not to display your virtuosity.
o Using a technique without showing complete understanding of it (see #8 above) won’t persuade your examiners to award the degree you seek.
11. Non-sampling errors will almost always be much bigger than sampling errors. Professors teach statistical math because it’s easier than teaching research logic. Sampling error (the ‘reliability’ of your estimate) is what the significance level or p-value of your test quantifies. Errors you make in formulating an unambiguous research question and measurable hypotheses, framing your study, interpreting results, and so on, as well as the errors described in #1 through #4 above, are non-sampling errors.
o Avoiding them requires at least as much attention as doing the tests properly.
12. The p-value is not the probability of H₀ being true. Or false. The hypothesis statement is a matter of fact, not of probability. Either it is true or it isn’t, out there in the real world. Makes no sense to speak of the probability of it being true. Though they shouldn’t have, the statistical forefathers used ‘p’ sometimes to denote a probability, and sometimes to denote a quantile of an error distribution. They thought you could keep the two straight. Don’t prove them wrong!
o Ditto for the significance level α. α is not the probability of the hypothesis being true, or of anything else about the world; it is a decision threshold you chose before seeing the data.
Modern statistical inference is one of the top intellectual achievements of the 20th century, and one of history’s greatest advances in applied epistemology. However, deciding a hypothesis at the 95% level (which is almost impossible in management studies anyway) only means that if you repeated the experiment 100 times on 100 independent samples, you would make the same decision approximately 95 times.

In other words, you still don’t know whether the hypothesis is true or not. All you have done is quantify your confidence in its truth or falsity. Interpret your data with the appropriate modesty and do not use the word ‘proved.’
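To make that last point concrete, here is a small simulation (my own illustration): a hypothesis that happens to be true, tested at the 95% level on 100 independent samples. The ‘right’ decision comes up about 95 times, yet no single run tells you whether H₀ is true:

```python
# Sketch: what 'deciding at the 95% level' means across replications.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
keep_h0 = 0
for _ in range(100):                        # 100 independent samples
    sample = rng.normal(loc=0.0, size=30)   # H0 (true mean = 0) actually holds
    if stats.ttest_1samp(sample, 0.0).pvalue >= 0.05:
        keep_h0 += 1

print(f"retained H0 in {keep_h0} of 100 replications")  # approximately 95
```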