Editor’s note: The conversation below concerns the relative merits of ability-testing and self-report testing of emotional intelligence and related traits. The conversation was constructed from e-mails between Joseph Ciarrochi, a psychology professor at the University of Wollongong, Australia, and myself while we were editing the 2nd edition of the book, “Emotional Intelligence in Everyday Life.” (Joe Forgas is another editor of the book; to see the first edition click here).
The conversation helps clarify, I believe, a number of important points about the use of different kinds of data in a more lively way than an academic article otherwise might. Joseph and I have reviewed the conversation, and it is posted with the agreement of both of us.
Is Self-Report Data Worth Using in Psychological Science?
Joseph Ciarrochi: I think a number of the arguments you make about the superiority of ability measurement in emotional intelligence seem to taint well-validated and useful self-report measures. They are depicted as inaccurate or as measures of the extent to which people have "positive illusions" or whatever. I think this is too critical.
Anyway, I sometimes think that you disparage all the self-report work over the last 4 decades (this is probably not an accurate impression, but it is how it feels). I don't think this is good. We've come too far. The last four decades of research utilizing self-reports has helped us to understand why people suffer. We are getting to the point where maybe we can reduce human suffering. The evidence is coming in...
John Mayer: In regard to measuring emotional intelligence -- I am a great believer that criterion-report (that is, ability testing) is the only adequate method to employ. Intelligence is an ability, and is directly measured only by having people answer questions and evaluating the correctness of those answers.
I am, however, a great defender of self-judgment data for other purposes. For example, you mentioned that self-judgment has allowed us to get a handle on why and how people suffer, and I agree entirely with that position. I think self-judgment data represents the single best approach to measuring personal emotional states and their outcomes.
Should the Term Self-Report Data Be Replaced?
Mayer (Cont.): Like you, I feel strongly about the potential contributions of many kinds of data to our field. I do think, however, that the term, self-report data, has now been so misused that we are better off dropping it. For example, the term self-report data has been used as a contrast to ability or criterion-report data (e.g., self-report versus ability measures), and at other times, the term self-report is used to include ability data.
I have helped promote the use of different kinds of data -- including what is often referred to as self-report data -- by developing a new language and classification system for them in a 2004 article. Here is a part of the opening of that article, in which I discuss use of the term, self-report.
...the term, “self-report data” is widely used in two substantially different ways but without clear acknowledgement of the different definitions. Self-report data is sometimes defined as any report by the self, including answers to questions such as “Is nuclear power safe?”, “Did you visit the hospital last year?”, and responses to Rorschach inkblots (e.g., Bordens & Abbott, 2002, p. 135; Heiman, 2002, p. 284; Shaughnessy, Zechmeister, & Zechmeister, 2003, p. 150). Alternatively, self-report is defined more specifically as a report by the self on the self -- limited to answers to questions such as “Do you like parties?” and “Are you a nervous person?” (e.g., Kaplan & Saccuzzo, 2001, p. 406). (Quote from Mayer, 2004, p. 208)
I think the self-report term is overly vague. Such a vague term probably does more harm than good in a scientific context. Criticism of self-report often takes advantage of this vagueness. For example:
...some psychologists have criticized self-report data as involving "deliberate faking, lack of insight, and unconscious defensive reactions" (Mischel, 1968, p. 236). Surely, however, self-reports such as “I am 20 years old,” or “I am female” are trustworthy in many contexts. (Quote from Mayer, 2004, p. 208)
To help improve this situation, I have introduced a new model for organizing personality data -- which promotes many different kinds of self-report data (divided into conceptually clearer categories). I go on to write:
One of the major points of this classification system is that the different categories of data are different because they are produced differently -- and that has implications for what the data mean. For example, whether the data’s source is external or internal to personality makes a theoretical difference -- as does whether the person is simply endorsing an item, constructing a convergent response to meet a criterion, or constructing divergent and thematic responses. A person's endorsement of an extroversion item reflects that person's self-concept, and draws on relatively stable memories of the self. Such self-concept data can be expected to be meaningfully different from, say, a person’s freely-generated self-descriptions in which a person must create a narrative description of him or herself... Similarly, self-reports correlate only weakly with observer reports in a variety of areas (e.g., Funder, 1995; Paulhus, Lysy, & Yik, 1998). ...
Why Are Different Sources of Data on the Same Trait (e.g., Emotional Intelligence) Weakly Related?
Ciarrochi: OK, Jack, here is the crucial point where we differ. I don’t know how you define “weakly,” but again these arguments are suggesting that self-reports are just measuring “self-concepts” and are not connected to the “real” world. Let me disagree in the strongest terms.
Mayer: Well, wait a minute now, to say that self-judgment data correlates only moderately with, say, thematic (projective) data or criterion-report (ability) data doesn’t mean that it doesn’t predict important things. My whole point is that any or all of the kinds of data predict important things -- but that they are measuring different things, and as a consequence, are often predicting different things. Consider the last part of the passage from the article:
Each data source, from this perspective, is a strong indicator of a better-specified variable [i.e., outcome criterion -- ed]. This is a more powerful way to view data in personality, achieved through advances in understanding cognitive and emotional processes, and the behavior of the data itself. The challenge is that researchers must better keep in mind the data and what they specifically mean, and choose the right source(s) of data for a given research study. (Mayer, 2004, pp. 216-217)
Ciarrochi: Okay, but let’s use a specific example. In research I am working on, we find that self-reported social competence is a very good predictor of how socially competent people behave when interacting with a stranger (using -- as a criterion measure -- a behavioral measure and consensus scoring system). So the self-report measure is predicting between 16% and 25% of the variance in a 5-minute sample of behavior (the behavior is rated by independent raters using Funder’s Riverside Behavioral Q-sort)... This is a substantial effect. ...so this is an example of very strong evidence that self-reports can be accurate, not just “impressions.”
Mayer: Okay, you’ll notice my phrasing above concerning self-report and observer data -- that they “correlate only weakly in a variety of areas.” There, I cite David Funder’s really excellent work, and I acknowledge (here, as I might have done there) that his work also shows that in some areas -- particularly in areas of observable social behavior -- the relationship is substantially higher, reaching levels that are fairly characterized as moderate (e.g., in the r = .40 to .50 range). This still indicates, to me, that the different sorts of data are measuring different things. But it is true that, at least at the r = .50 level, one can start to entertain the idea that there is some meaningful overlap.
Ciarrochi: So, when you say self-report is weakly linked to observer ratings, I think our finding challenges the claim. David Funder also has other findings that consistently challenge the claim.
Mayer: My point was to say that the meaning of what is measured by observations and by self-judgments is different. Observable behavior expressed by personality is something different from an individual’s internal mental representation. Often, they are weakly related, and, as you and, more generally, Funder’s research rightly point out, sometimes the two are moderately related.
Ciarrochi: Now, I can agree with you that, yes, IQ tests and self-judgment IQ are, at best, only weakly related, but it does not follow that this is true across all domains at all times.
Mayer: In many respects, Funder’s approach to data and my own are similar. We both believe that different measures provide us with different kinds of information.
Ciarrochi: Well, okay, so we agree that part of the reason for the smallish links between self-judgment and observer reports is not because of problems with the self-judgment.
Actually, we have also found that the observer reports are somewhat unreliable and biased for various reasons (e.g., people don’t always see their friends in an honest light). When we use trained observers who do not know the participant, we obtain much stronger links between self-reports and behavior than when we use friends. In other words, our neutral observers are more accurate assessors of the participant than are the people who actually know the participant well.
The Need for a Theory of Data
Mayer: What is really exciting to me is the improvement in research quality possible from better understanding our sources of data. When I talk about different kinds of data drawing on different sorts of mental processes, I do mean, yes, that psychologists have been developing a theory of data over time and we should use what we know. As you point out, if we are talking about readily observable social behavior, then self-judgment and observer reports may converge at moderate levels.
If, on the other hand, we are talking about intelligence, different issues apply. For reasons I have written about extensively (the problem of unperceptive observers, the problem of the plasticity of intelligence, the issue that intelligence is a very internal, hidden process, and the issue of over- and under-confidence in oneself), it appears that neither self-judgment nor (untrained) observer judgments correlate well with real performance, when we are talking about intelligence. Well then, this is a hard-won empirical fact about intelligence data that needs to be acknowledged. So, my theory of data -- and, I also think, Funder’s work -- both advocate drawing on what we have understood about our data. In the case of direct measures of EI, that includes that ability-based IQ is a far better predictor of being able to solve a set of mental problems than is self-judgment.
Consider the following two hypothetical test instruments -- one self-judgment and one criterion-report (ability oriented).
For each test, a person answers a series of problems -- “How much does a specific face express happiness?” “What does ‘anger’ mean?” and the like. At the conclusion of the self-judgment test, the person is asked, “How well did you do?” and his or her answer is taken as the final measure of emotional intelligence ability. For the ability/criterion-report version of the test, however, the person’s responses to the same items are compared to a criterion of correctness, and scored as correct or incorrect. Which test would you trust more? The ability test adds tremendous value -- by checking responses for their veracity.
Ciarrochi: Again, you give an example where self-judgments are not likely to be useful. Above I have given an example of where they are very useful. So I think you need to evaluate this claim on an area-by-area basis. Sometimes self-judgments will be better predictors of the outcomes of interest. Sometimes they will link to behavior in theoretically expected ways. Sometimes they won’t.
Yes, I agree the ability measures add tremendous value for emotional intelligence. But this is not a zero sum game. When one starts looking at overt social behaviors, then self-judgments and observer-report data can be valuable in another way for measuring some of the core social behaviors that are in an individual’s repertory.
They also can both be accurate. One is not the truth (i.e., ability), whereas the other is just impressions (self-judgment). Sometimes, in some cases, that might be true. But you cannot generalize to every instrument, in every context.
Mayer: I agree they are (if well measured) both truths -- but not necessarily truths about emotional intelligence. Overt social behavior is overt social behavior. Is it important? Yes. Is it important for key social outcomes such as leadership or a person’s well-being? Absolutely. Such social behavior is, first and foremost, however, social behavior. To the degree that it relates to internal mental processes, it is (so far as I see it) most correlated with key socioemotional features of the social actor such as warmth, tact, extroversion, and many other qualities. Indeed, emotional intelligence no doubt contributes as well.
The empirical research that indicates that those measures are largely uncorrelated with the MSCEIT tells us, however, that general social intelligence, informed though it may be by emotional intelligence, contains much, much more than emotional intelligence. Much of that social behavior contains socio-emotional styles, such as extroversion, warmth, and non-verbal styles, but styles that are predominantly independent of emotional intelligence -- as I define it -- that is, as a mental ability -- in the aspects that can be judged by, say, peer raters.
Ciarrochi: Jack, I think your arguments are compelling, and I agree with them. I think it is crucial to find ways of talking about emotional intelligence that make things clearer, rather than confusing the issue. I therefore propose we distinguish between two related phenomena, "emotional intelligence" and "emotionally intelligent behavior." Emotional intelligence refers to people’s ability to process emotions and deal effectively with them. EI refers to people’s potential. In contrast, “emotionally intelligent behavior” refers to how effectively people actually behave in the presence of emotions and emotionally charged thoughts. Simply put, emotionally unintelligent behavior occurs when emotions impede effective (value congruent) action, and emotionally intelligent behavior occurs when emotions do not impede effective action, or when emotions facilitate effective action. Emotional intelligence (as an ability) is one set of processes hypothesized to promote emotionally intelligent behavior. There are other potential processes, many of which are discussed in my chapter (EI in everyday life, 2nd edition).
Mayer & Ciarrochi: So, here are some take-home messages we think worth considering:
- New classification systems for data argue that self-report, as a term, ought to be dropped (Funder, 2004 [this link now goes to a later edition of the book --ed]; Mayer, 2004).
- There are a number of discretely different kinds of data. Each kind of data reflects different mental (or, in the case of observers, observational) processes. Moreover, each kind of data can contribute differently to understanding a phenomenon.
- If you want to look at mental potential around emotional intelligence, then criterion-report measures (that is, ability measures) are best; self-judgment measures are weak criteria at best.
- Self-judgments measure just that -- a person’s self-efficacy in regard to emotional intelligence, as distinct from their actual emotional intelligence.
- If you want to predict and increase emotionally intelligent behavior, then you may need to focus on processes beyond EI (defined as an ability). For example, you may need to focus on situational factors (see the Zeidner and Forgas chapters in EI in Everyday Life, 2nd edition only) or other individual difference processes (see other chapters in EI in Everyday Life).
- If you want to understand how a person is perceived in a social context, that is, how warm, emotionally expressive or socially skilled they are, then observer-report is good, and self-judgment can provide a reasonable proxy.
- If you want to understand how a person is feeling inside, then self-judgment measures are best.