Tuesday, February 13, 2007

The day I was Bjorn

It's my birthday today, and as someone with an unusual name (at least in the States), my birthday often makes me think about how my name has had an impact on my life. I've always thought having an unusual name has made me easier for people to remember, and has encouraged me to strive to be different in other ways. For example, I doubt I would have gone to the trouble of etching my name in a cell with a laser had my name been John.

Today, Overcompensating did a little bit about names and had some interesting links which taught me a few things. First of all, I found out I don't exist:

(There's some stuff on that site to make a statistician cringe, but I think it's mostly terminology. From a cursory look at their FAQ and so on, they do seem to know what they are doing. The fact that quite a few names, such as mine, don't show up is because their list of names is based on a random sample, and the sample probably didn't include anyone with the name Bjorn.)

The next thing I found out was that names don't have any effect on your success in life. Okay, it's not fair for me to say that, because the linked article doesn't exactly make that claim. What it does say is that in some cases, "name is an indicator—but not a cause—of... life path." Specifically, they use the example of "Black-sounding names". (The article is an excerpt from Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven D. Levitt and Stephen J. Dubner.)

I've often heard it said that if you send two identical resumes, one with a "black-" or "immigrant-sounding" name, and one with a "white-sounding" name, the "white" applicant is more likely to get a call. The authors don't contest this, but they put the results aside by saying "Such studies are tantalizing but severely limited, since they offer no real-world follow-up or analysis beyond the résumé stunt." To them, that's not research, it's a stunt. They may be right -- such research may simulate situations that are uncommon in the real world, or that don't necessarily affect people's lives in as significant a way as the study might suggest -- but it's disappointing that they dismiss it altogether.

Instead, Levitt and Dubner look to a research paper called The Causes and Consequences of Distinctively Black Names for the answers. This study, an exhaustive research paper looking at a huge amount of data and taking into account a large number of variables, found "no negative relationship between having a distinctively Black name and later life outcomes after controlling for a child's circumstances at birth." In other words, distinctively Black names don't cause negative outcomes. While this may be true, it seems hard to believe that something that makes your resume more likely to land in the garbage does not also make a worse life situation more likely, but that's the claim of the article. I don't want to dispute the article, but there are always problems with this kind of research, which relies on large, complex linear regression models, and I thought I'd take the opportunity to mention a few of the ways they might have found these results even if "Black-sounding" names do have an impact on "life outcome":

  • The results were present but not statistically significant. It often happens in research that results are found, but the effect is too small or there is too much noise in the data for the results to be considered "statistically significant". Unfortunately, people almost always conclude that there is no correlation simply because there is no significant correlation, and that conclusion does not follow: failing to detect an effect is not the same as showing there is none. In the case of this study, which was conducted on a very large sample, it is probably not a terribly dangerous conclusion to draw: We can imagine that in real-life situations, people are so much more likely to encounter racism resulting from their appearance than racism resulting from their names, that perhaps the naming issue is too insignificant (above and beyond skin-color racism) to make a difference.

  • The assumptions necessary for linear regression are not met. Linear regression is used in virtually every social-science study, as well as a variety of medical and other research. Though a lot can be learned from regression, virtually no studies take the time to assess the validity of a linear model, and it's especially difficult to do so on large data sets with a large number of variables. In practice, such techniques often work, but any good statistician should hesitate to design a study using linear regression that tries to predict the extent to which various factors cause or predict certain outcomes, so it is disappointing that such techniques are so widespread.

  • There is an interaction between the variables which is not corrected for. It may be, for example, that a uniquely black name is an advantage for relatively rich blacks and a disadvantage for relatively poor blacks. However unlikely this may be, if it is the case, it is hard to use regression to uncover all such interactions, especially with such a large number of variables.
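
The interaction point can be made concrete with a small simulation. This is only a sketch on made-up data, not the method or data of the paper discussed above: the population, the two yes/no indicators, the effect sizes of +1 and -1, and the function names simulate and name_gap are all hypothetical. The name effect is deliberately set to be positive for "rich" people and negative for "poor" people, and the pooled difference in mean outcome (which is what a regression of outcome on the name indicator alone would estimate) is compared with the difference inside each group.

```python
import random
from statistics import mean


def simulate(n_people=40000, seed=0):
    """Build a hypothetical population where the name effect flips sign with wealth."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        distinctive = rng.random() < 0.5  # hypothetical "distinctive name" indicator
        rich = rng.random() < 0.5         # hypothetical "rich at birth" indicator
        effect = 0.0
        if distinctive:
            # Assumed for illustration: +1 for the rich, -1 for the poor.
            effect = 1.0 if rich else -1.0
        outcome = effect + rng.gauss(0.0, 1.0)  # effect plus unit-variance noise
        rows.append((distinctive, rich, outcome))
    return rows


def name_gap(rows):
    """Mean outcome with a distinctive name minus mean outcome without one."""
    with_name = [y for d, _, y in rows if d]
    without_name = [y for d, _, y in rows if not d]
    return mean(with_name) - mean(without_name)


rows = simulate()
pooled = name_gap(rows)                                # ignores wealth entirely
among_rich = name_gap([r for r in rows if r[1]])       # rich people only
among_poor = name_gap([r for r in rows if not r[1]])   # poor people only

print(f"pooled gap:     {pooled:+.2f}")
print(f"gap among rich: {among_rich:+.2f}")
print(f"gap among poor: {among_poor:+.2f}")
```

In this made-up population the two opposite effects cancel in the pooled comparison, so an analysis that never looks for the interaction would report that names make essentially no difference, even though they matter a great deal within each group.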

In the end, linear regression is often the only practical way to answer such questions, and the conclusion reached by the researchers is probably correct. But, even so, that doesn't mean that the conclusion generalizes: names may still have a large impact on the path your life takes.


