1115-Chapter11PersonalityAssessment
Personality Assessment: An Overview
In a 1950s rock ‘n’ roll classic song entitled “Personality,” singer Lloyd Price described the subject of that
song with the words walk, talk, smile, and charm. In so doing, Price used the term personality the way
most people tend to use it. For laypeople, personality refers to components of an individual’s makeup
that can elicit positive or negative reactions from others. Someone who consistently tends to elicit
positive reactions from others is thought to have a “good personality.” Someone who consistently tends
to elicit not-so-good reactions from others is thought to have a “bad personality” or, perhaps worse yet,
“no personality.” We also hear of people described in other ways, with adjectives such as aggressive,
warm, or cold. For professionals in the field of behavioral science, the terms tend to be better-defined, if
not more descriptive.
Personality and Personality Assessment
Personality
Dozens of different definitions of personality exist in the psychology literature. Some definitions appear
to be all-inclusive. For example, McClelland (1951, p. 69) defined personality as “the most adequate
conceptualization of a person’s behavior in all its detail.” Menninger (1953, p. 23) defined it as “the
individual as a whole, his height and weight and love and hates and blood pressure and reflexes; his
smiles and hopes and bowed legs and enlarged tonsils. It means all that anyone is and that he is trying to
become.” Some definitions focus narrowly on a particular aspect of the individual (Goldstein, 1963),
whereas others view the individual in the context of society (Sullivan, 1953). Some theorists avoid any
definition at all. For example, Byrne (1974, p. 26) characterized the entire area of personality psychology
as “psychology’s garbage bin in that any research which doesn’t fit other existing categories can be
labeled ‘personality.’ ”
In their widely read and authoritative textbook Theories of Personality, Hall and Lindzey (1970, p. 9)
wrote: “It is our conviction that no substantive definition of personality can be applied with any
generality” and “Personality is defined by the particular empirical concepts which are a part of the
theory of personality employed by the observer” [emphasis in the original]. Noting that there were
important theoretical differences in many theories of personality, Hall and Lindzey encouraged their
readers to select a definition of personality from the many presented and adopt it as their own.
JUST THINK . . .
Despite great effort, a definition of personality itself—much like a definition of intelligence—has been
somewhat elusive. Why do you think this is so?
For our purposes, we will define personality as an individual’s unique constellation of psychological traits
that is relatively stable over time. We view this definition as one that has the advantage of parsimony yet
still is flexible enough to incorporate a wide variety of variables. Included in this definition, then, are
variables on which individuals may differ, such as values, interests, attitudes, worldview, acculturation,
sense of humor, cognitive and behavioral styles, and personality states.
Personality Assessment
Personality assessment may be defined as the measurement and evaluation of psychological traits,
states, values, interests, attitudes, worldview, acculturation, sense of humor, cognitive and behavioral
styles, and/or related individual characteristics. In this chapter we overview the process of personality
assessment, including different approaches to the construction of personality tests. In Chapter 12, we
will focus on various methods of personality assessment, including objective, projective, and behavioral
methods. Before all that, however, some background is needed regarding the use of the terms trait,
type, and state.
Traits, Types, and States
Personality traits
Just as no consensus exists regarding the definition of personality, there is none regarding the definition
of trait. Theorists such as Gordon Allport (1937) have tended to view personality traits as real physical
entities that are “bona fide mental structures in each personality” (p. 289). For Allport, a trait is a
“generalized and focalized neuropsychic system (peculiar to the individual) with the capacity to render
many stimuli functionally equivalent, and to initiate and guide consistent (equivalent) forms of adaptive
and expressive behavior” (p. 295). Robert Holt (1971) wrote that there “are real structures inside people
that determine their behavior in lawful ways” (p. 6), and he went on to conceptualize these structures as
changes in brain chemistry that might occur as a result of learning: “Learning causes submicroscopic
structural changes in the brain, probably in the organization of its biochemical substance” (p. 7).
Raymond Cattell (1950) also conceptualized traits as mental structures, but for him structure did not
necessarily imply actual physical status.
Our own preference is to shy away from definitions that elevate trait to the status of physical existence.
We view psychological traits as attributions made in an effort to identify threads of consistency in
behavioral patterns. In this context, a definition of personality trait offered by Guilford (1959, p. 6) has
great appeal: “Any distinguishable, relatively enduring way in which one individual varies from another.”
JUST THINK . . .
What is another example of how the trait term selected by an observer is dependent both on the
behavior emitted as well as the context of that behavior?
This relatively simple definition has some aspects in common with the writings of other personality
theorists such as Allport (1937), Cattell (1950, 1965), and Eysenck (1961). The word distinguishable
indicates that behaviors labeled with different trait terms are actually different from one another. For
example, a behavior labeled “friendly” should be distinguishable from a behavior labeled “rude.” The
context, or the situation in which the behavior is displayed, is important in applying trait terms to
behaviors. A behavior present in one context may be labeled with one trait term, but the same behavior
exhibited in another context may be better described using another trait term. For example, if we
observe someone involved in a lengthy, apparently interesting conversation, we would observe the
context before drawing any conclusions about the person’s traits. A person talking with a friend over
lunch may be demonstrating friendliness, whereas that same person talking to that same friend during a
wedding ceremony may be considered rude. Thus, the trait term selected by an observer is dependent
both on the behavior itself and on the context in which it appears.
A measure of behavior in a particular context may be obtained using varied tools of psychological
assessment. For example, using naturalistic observation, an observer could watch the assessee interact
with co-workers during break time. Alternatively, the assessee could be administered a self-report
questionnaire that probes various aspects of the assessee’s interaction with co-workers during break
time.
In his definition of trait, Guilford did not assert that traits represent enduring ways in which individuals
vary from one another. Rather, he said relatively enduring. Relatively emphasizes that exactly how a
particular trait manifests itself is, at least to some extent, dependent on the situation. For example, a
“violent” parolee generally may be prone to behave in a rather subdued way with his parole officer and
much more violently in the presence of his family and friends. Allport (1937) addressed the issue of
cross-situational consistency of traits—or lack of it—as follows:
Perfect consistency will never be found and must not be expected. . . . People may be ascendant and
submissive, perhaps submissive only towards those individuals bearing traditional symbols of authority
and prestige; and towards everyone else aggressive and domineering. . . . The ever-changing
environment raises now one trait and now another to a state of active tension. (p. 330)
For years personality theorists and assessors have assumed that personality traits are relatively enduring
over the course of one’s life. Roberts and DelVecchio (2000) explored the endurance of traits by means
of a meta-analysis of 152 longitudinal studies. These researchers concluded that trait consistency
increases in a steplike pattern until one is 50 to 59 years old, at which time such consistency peaks. Their
findings may be interpreted as compelling testimony to the relatively enduring nature of personality
traits over the course of one’s life. Do you think the physically aggressive hockey player pictured in Figure
11–1 will still be as physically aggressive during his retirement years?
Figure 11–1
Trait aggressiveness and flare-ups on the ice.
Bushman and Wells (1998) administered a self-report measure of trait aggressiveness (the Physical
Aggression subscale of the Aggression Questionnaire) to 91 high-school team hockey players before the
start of the season. The players responded to items such as “Once in a while I cannot control my urge to
strike another person” presented in Likert scale format ranging from 1 to 5 (where 1 corresponded to
“extremely uncharacteristic of me” and 5 corresponded to “extremely characteristic of me”). At the end
of the season, trait aggressiveness scores were examined with respect to minutes served in the penalty
box for aggressive penalties such as fighting, slashing, and tripping. The preseason measure of trait
aggressiveness predicted aggressive penalty minutes served. The study is particularly noteworthy
because the test data were used to predict real-life aggression, not a laboratory analogue of aggression
such as the administration of electric shock. The authors recommended that possible applications of the
Aggression Questionnaire be explored in other settings where aggression is a problematic behavior.
Sven Nackstrand/AFP/Getty Images
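The logic of a study like the one described in Figure 11–1 can be sketched in a few lines of Python. The sketch below is only an illustration: the item responses and penalty minutes are invented, the helper names `subscale_score` and `pearson_r` are our own, and we assume the subscale is scored as a simple sum of the 1-to-5 responses, which may not match the published scoring key.

```python
from math import sqrt

def subscale_score(item_responses):
    """Sum a respondent's 1-5 Likert responses (assumed scoring rule)."""
    for response in item_responses:
        if response not in (1, 2, 3, 4, 5):
            raise ValueError(f"response out of range: {response}")
    return sum(item_responses)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical preseason responses for four players (four items each).
preseason_items = [
    [1, 2, 1, 2],
    [3, 3, 2, 4],
    [5, 4, 5, 4],
    [2, 2, 3, 1],
]
# Hypothetical aggressive-penalty minutes for the same players at season's end.
penalty_minutes = [0, 6, 14, 2]

scores = [subscale_score(items) for items in preseason_items]
r = pearson_r(scores, penalty_minutes)
print(scores, round(r, 2))
```

A strong positive correlation in data like these would be consistent with the claim that the preseason trait measure predicts later behavior. With only four invented cases, of course, the number itself means nothing.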
Returning to our elaboration of Guilford’s definition, note that trait is described as a way in which one
individual varies from another. Let’s emphasize here that the attribution of a trait term is always a
relative phenomenon. For instance, some behavior described as “patriotic” may differ greatly from other
behavior also described as “patriotic.” There are no absolute standards. In describing an individual as
patriotic, we are, in essence, making an unstated comparison with the degree of patriotic behavior that
could reasonably be expected to be exhibited under the same or similar circumstances.
Classic research on the subject of cross-situational consistency in traits has pointed to a lack of
consistency with regard to traits such as honesty (Hartshorne & May, 1928), punctuality (Dudycha,
1936), conformity (Hollander & Willis, 1967), attitude toward authority (Burwen & Campbell, 1957), and
introversion/extraversion (Newcomb, 1929). These are the types of studies cited by Mischel (1968, 1973,
1977, 1979) and others who have been critical of the predominance of the concept of traits in
personality theory. Such critics may also allude to the fact that some undetermined portion of behavior
exhibited in public may be governed more by societal expectations and cultural role restrictions than by
an individual’s personality traits (Barker, 1963; Goffman, 1963). Research designed to shed light on the
primacy of individual differences, as opposed to situational factors in behavior, is methodologically
complex (Golding, 1975), and a definitive verdict as to the primacy of the trait or the situation is simply
not in; however, the past several decades have seen growing consensus around the five-factor approach
to personality.
Personality types
Having defined personality as a unique constellation of traits, we might define a personality type as a
constellation of traits that is similar in pattern to one identified category of personality within a
taxonomy of personalities. Whereas traits are frequently discussed as if they were characteristics
possessed by an individual, types are more clearly descriptions of people. So, for example, describing an
individual as “depressed” is different from describing that individual as a “depressed type.” The latter
term has more far-reaching implications regarding characteristic aspects of the individual, such as the
person’s worldview, activity level, capacity to enjoy life, and level of social interest.
At least since Hippocrates’ classification of people into four types (melancholic, phlegmatic, choleric, and
sanguine), there has been no shortage of personality typologies through the ages. A typology devised by
Carl Jung (1923) became the basis for the Myers-Briggs Type Indicator (MBTI; Myers & Briggs,
1943/1962). An assumption guiding the development of this test was that people exhibit definite
preferences in the way that they perceive or become aware of—and judge or arrive at conclusions about
—people, events, situations, and ideas. According to Myers (1962, p. 1), these differences in perception
and judging result in “corresponding differences in their reactions, in their interests, values, needs, and
motivations, in what they do best, and in what they like to do.” The MBTI enjoys great popularity, but it is
not without its critics who have identified concerns about this measure’s validity and reliability (Boyle,
1995; Pittenger, 1993; Stein & Swan, 2019).
JUST THINK . . .
What are the possible benefits of classifying people into types? What possible problems may arise from
doing so?
John Holland (Figure 11–2) argued that most people can be categorized as one of the following six
personality types: Artistic, Enterprising, Investigative, Social, Realistic, or Conventional (Holland, 1973,
1985, 1997, 1999). His Self-Directed Search test (SDS; Holland et al., 1994) is a self-administered, self-
scored, and self-interpreted aid used to type people according to this system and to offer vocational
guidance. Another personality typology, this one having only two categories, was devised by
cardiologists Meyer Friedman and Ray Rosenman (1974; Rosenman et al., 1975). They conceived of a
Type A personality, characterized by competitiveness, haste, restlessness, impatience, feelings of being
time-pressured, and strong needs for achievement and dominance. A Type B personality has the
opposite traits and is described as mellow or laid-back. A 52-item self-report inventory called the Jenkins
Activity Survey (JAS; Jenkins et al., 1979) has been used to type respondents as Type A or Type B
personalities.
Figure 11–2
John L. Holland (1919–2008).
John Holland was well known for the employment-related personality typology he developed, as well as
the Self-Directed Search (SDS), a measure of one’s interests and perceived abilities. The test is based on
Holland’s theory of vocational personality. At the heart of this theory is the view that occupational choice
has a great deal to do with one’s personality and self-perception of abilities. Holland’s work was the
subject of controversy in the 1970s. Critics asserted that measured differences between the interests of
men and women were an artifact of sex bias. Holland argued that such differences reflected valid
variance. As the author of Holland’s obituary in American Psychologist recalled, “He did not bend willy-
nilly in the winds of political correctness” (Gottfredson, 2009, p. 561).
Johns Hopkins University
The personality typology that has attracted the most attention from researchers and practitioners alike is
associated with scores on a test called the Minnesota Multiphasic Personality Inventory (MMPI) (as well
as all of its successors—discussed later in this chapter). Data from the administration of these tests, as
with others, are frequently discussed in terms of the patterns of scores that emerge on the subtests. This
pattern is referred to as a profile. In general, a profile is a narrative description, graph, table, or other
representation of the extent to which a person has demonstrated certain targeted characteristics as a
result of the administration or application of tools of assessment. In the term personality profile, the
targeted characteristics are typically traits, states, or types. With specific reference to the MMPI,
different profiles of scores are associated with different patterns of behavior. So, for example, a
particular MMPI profile designated as “2–4–7” is associated with a type of individual who has a history
of alcohol abuse alternating with sobriety and self-recrimination (Dahlstrom, 1995).
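To make the idea of a profile designation concrete, here is a minimal sketch of how a label such as “2–4–7” could be read off a set of scale scores. It rests on loud assumptions: the scores are invented, the helper name `profile_code` is hypothetical, and the rule used (name the three most elevated numbered scales, highest first) is a simplification. The actual rules for deriving and interpreting MMPI profiles are set out in the test manuals and are not reproduced here.

```python
def profile_code(scale_scores, points=3):
    """Join the labels of the `points` most elevated scales, highest first.

    scale_scores maps a scale label (e.g. "2") to a standardized score.
    Ties are broken by scale label so the result is deterministic.
    """
    if points < 1 or points > len(scale_scores):
        raise ValueError("points must be between 1 and the number of scales")
    ranked = sorted(scale_scores, key=lambda label: (-scale_scores[label], label))
    return "-".join(ranked[:points])

# Hypothetical standardized scores on numbered scales (invented values).
example_scores = {
    "1": 55, "2": 82, "3": 60, "4": 78,
    "6": 58, "7": 74, "8": 63, "9": 50,
}

code = profile_code(example_scores)
print(code)  # 2-4-7
```

The point of the sketch is only that a profile summarizes a pattern across scales rather than a single score.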
Personality states
The word state has been used in at least two distinctly different ways in the personality assessment
literature. In one usage, a personality state is an inferred psychodynamic disposition designed to convey
the dynamic quality of id, ego, and superego in perpetual conflict. Assessment of these psychodynamic
dispositions may be made through the use of various psychoanalytic techniques such as free association,
word association, symbolic analysis of interview material, dream analysis, and analysis of slips of the
tongue, accidents, jokes, and forgetting.
Presently, a more popular usage of the term state—and the one we use in the discussion that follows—
refers to the transitory exhibition of some personality trait. Put another way, the use of the word trait
presupposes a relatively enduring behavioral predisposition, whereas the term state is indicative of a
relatively temporary predisposition (Chaplin et al., 1988). Thus, for example, your friend may be
accurately described as being “in an anxious state” before her midterms, though no one who knows her
well would describe her as “an anxious person.”
JUST THINK . . .
You experience “butterflies” in your stomach just before asking someone to whom you are attracted to
accompany you to the movies. Would this feeling better be characterized as a state or a trait?
Measuring personality states amounts, in essence, to a search for and an assessment of the strength of
traits that are relatively transitory or fairly situation specific. Relatively few personality tests seek to
distinguish traits from states. Charles D. Spielberger and his associates (Spielberger et al., 1980) led
pathbreaking work in this area. These researchers developed a number of personality inventories
designed to distinguish various states from traits. In the manual for the State-Trait Anxiety Inventory
(STAI), for example, we find that state anxiety refers to a transitory experience of tension because of a
particular situation. By contrast, trait anxiety or anxiety proneness refers to a relatively stable or
enduring personality characteristic. The STAI test items consist of short descriptive statements, and
subjects are instructed to indicate either (1) how they feel right now or at this moment (and to indicate
the intensity of the feeling), or (2) how they generally feel (and to record the frequency of the feeling).
The test-retest reliability coefficients reported in the manual are consistent with the theoretical premise
that trait anxiety is the more enduring characteristic, whereas state anxiety is transitory.
Personality Assessment: Some Basic Questions
For what type of employment is a person with this type of personality best suited?
Is this individual sufficiently well adjusted for military or police officer service?
What emotional and other adjustment-related factors may be responsible for this student’s level of
academic achievement?
What pattern of traits and states does this psychotherapy client evince, and to what extent may this
pattern be deemed pathological?
How has this patient’s personality been affected by neurological trauma?
These questions are a sampling of the kind that might lead to a referral for personality assessment.
Collectively, these types of referral questions provide insight into a more general question in a clinical
context: Why assess personality?
We might raise the same question in the context of basic research and find another wide world of
potential applications for personality assessment. For example, aspects of personality could be explored
in identifying determinants of knowledge about health (Beier & Ackerman, 2003), in categorizing
different types of commitment in intimate relationships (Frank & Brandstaetter, 2002), in determining
peer response to a team’s weakest link (Jackson & LePine, 2003), or even in the service of national
defense to identify those prone to terrorism. Personality assessment is a staple in developmental
research, be it tracking trait development over time (McCrae et al., 2002) or studying some uniquely
human characteristic such as moral judgment (Eisenberg et al., 2002). From a health psychology
perspective, there are a number of personality variables (such as perfectionism, self-criticism,
dependency, and neuroticism) that have been linked to physical and psychological disorders (Flett &
Hewitt, 2002; Klein et al., 2011; Kotov et al., 2010; Sturman, 2011; Zuroff et al., 2004). In the corporate
world, personality assessment is a key tool of the human resources department, relied on to aid in
hiring, firing, promoting, transferring, and related decisions. Perhaps as long as there have been tests to
measure people’s interests, there have been questions regarding how those interests relate to
personality (Larson et al., 2002). In military organizations around the world, leadership is a sought-after
trait, and personality tests help identify who has it (see, e.g., Bradley et al., 2002; Handler, 2001). In the
most general sense, basic research involving personality assessment helps to validate or invalidate
theories of behavior and to generate new hypotheses.
JUST THINK . . .
What differences in terms of accuracy and reliability of report would you expect when one is reporting
on one’s own personality as opposed to when another person is reporting about someone’s personality?
Tangentially, let’s note that a whole other perspective on the why of personality assessment emerges
with a consideration of cross-species research. For example, Gosling, Kwan, and John (2003) viewed their
research on the personality of dogs as paving the way for future research in previously uncharted areas
such as the exploration of environmental effects on personality. Weiss et al. (2002) viewed cross-species
research as presenting an opportunity to explore the heritability of personality. The fascinating research
program of Winnie Eckardt and her colleagues at the Dian Fossey Gorilla Fund International is the
subject of this chapter’s Close-Up.
CLOSE-UP
The Personality of Gorillas*
When he turned 17 years old, a mountain gorilla named Cantsbee (see Figure 1) took over the
leadership of what was to become the largest gorilla group ever observed (which included up to 65
members). At this writing, he has held this position for over 20 years, despite challenges from rivals
within his group, and from outside attackers. Cantsbee also earned the respect and admiration of the
field researchers and assistants who work with him. He leads his group in a sensible way and seems to
know when it’s time to be supportive, administer discipline, take a strong leadership role, or adopt a
laissez-faire approach.
Figure 1
Cantsbee
Cantsbee is the oldest silverback gorilla at the Dian Fossey Gorilla Fund International’s Karisoke Research
Center in Rwanda. Prior to his birth in 1978, the researchers at Karisoke all thought that his mother was
a male, not a female. Dian Fossey’s shocked reaction to the birth was encapsulated in her exclamation,
“This can’t be!” Taking their cue from Fossey, the Rwandan field assistants promptly christened the
newborn gorilla, “Cantsbee.”
The Dian Fossey Gorilla Fund International
So, what does it take for a gorilla to win such enviable status from gorilla peers and human observers?
Apart from morphological traits that quite likely play a role, such as body size, there are personality traits
to be considered as well. This and other questions motivated Eckardt et al. (2015) to initiate the first
study of mountain gorilla personality.
Perhaps the ideal species for studying personality in wild ape populations is the Virunga mountain
gorilla. This is so because over 70% of the remaining 480 gorillas of this species (Gray et al., 2013) are
habituated to human presence and known by rangers and researchers individually, most since birth. The
Karisoke Research Center in Rwanda is one of the longest-existing primate research field sites with
almost 50 years of mountain gorilla monitoring in the Virungas. Well-trained trackers, data technicians,
and researchers familiar with gorilla behavior follow about 40% of the population daily. Many years of
experience and in-depth knowledge of each gorilla in various contexts make trackers as suitable for
assessing the personalities of gorillas as parents are for assessing the personalities of their children.
Between 2007 and 2008, eight of the most experienced Karisoke field staff assessed the personalities of
gorillas that they knew well using a version of the Hominoid Personality Questionnaire (HPQ; Weiss et
al., 2009). This questionnaire was derived by sampling traits from the human “Big 5” and adapting them
so that they are suitable for assessing the personalities of nonhuman primates. Specifically, each of its 54
items is accompanied by a brief description to set the item in the context of gorilla behavior. For
example, dominant is defined as “Subject is able to displace, threaten, or take food from other gorillas”
or “subject may express high status by decisively intervening in social interactions.” Another example:
affectionate is defined as “subject seems to have a warm attachment or closeness with other gorillas.
This may entail frequently grooming, touching, embracing, or lying next to others.”
The HPQ was prepared in both English and French since both are official languages of Rwanda. The
Rwandan raters were instructed to score gorillas on each trait using a Likert scale ranging from (1) “either
total absence or negligible amounts” to (7) “extremely large amounts.” A prerating training session with
a professional Rwandan translator (who held a Bachelor’s degree in French and English) was conducted
to ensure that language barriers had a minimal influence on the understanding of the rating procedure
and the meaning of each trait. Inter-rater reliability was checked and found to be satisfactory.
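One common way to check agreement among raters on a scale like this is an intraclass correlation (ICC). The sketch below is not a description of the procedure actually used in the gorilla study, which is not detailed here. It assumes a complete table in which every rater scored every subject, uses invented 1-to-7 ratings, and computes the consistency forms usually labeled ICC(3,1) and ICC(3,k) from a two-way analysis of variance. The function name `icc_consistency` is our own.

```python
def icc_consistency(ratings):
    """Return (ICC(3,1), ICC(3,k)) for a complete subjects-by-raters table.

    ratings[i][j] is the rating given to subject i by rater j.
    """
    n = len(ratings)                      # number of subjects
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subjects, raters, and error.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    if ms_subjects == 0:
        raise ValueError("subjects do not differ; ICC is undefined")

    single = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    average = (ms_subjects - ms_error) / ms_subjects
    return single, average

# Hypothetical ratings of four gorillas on one trait by three raters (1-7 scale).
dominance_ratings = [
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
    [1, 2, 1],
]

single, average = icc_consistency(dominance_ratings)
print(round(single, 3), round(average, 3))
```

The single-rater value estimates how reliable one rater's score is, and the average value estimates the reliability of the mean across all raters. Values near 1 indicate that raters order the subjects in much the same way.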
Virunga mountain gorillas are folivorous, meaning that they eat mostly leaves, and that they live in what
could be described as a “huge salad bowl” (Fossey & Harcourt, 1977; Vedder, 1984; Watts, 1985). The
fact that food is plentiful and available all year round is believed to play a role in the lower level of
aggression in and between groups of gorillas (Robbins et al., 2005). Other great apes, such as
chimpanzees, depend on seasonally available, scattered fruit. As a result, competition for food and levels
of aggression can be high in these species (Harcourt & Stewart, 2007).
Gorilla society is hierarchically structured. They live in relatively stable, cohesive social groups with
male–female relationships forming the core of their society (Harcourt & Stewart, 2007). Emigration from
the natal group is common for both males and females (Robbins et al., 2007; Watts, 1990). Females
transfer between groups during intergroup encounters to avoid inbreeding, whereas males become
solitary after leaving their natal group to increase breeding opportunities by recruiting females from
existing groups.
Because gorillas live in stable and predictable environments with limited food competition, and less
vulnerability to the stressors present in the lives of other great apes, the researchers hypothesized that
the subjects would be rated as emotionally stable, with generally low levels on traits related to
neuroticism. Further, the researchers hypothesized that the subjects would be rated as low in aggression
and high in sociability.
As described in greater detail elsewhere (Eckardt et al., 2015), the researchers’ hypotheses were
confirmed through evaluation of correlations between HPQ scores on personality trait dimensions and
corresponding historical behavior of the subjects as noted in archival records. So, for example, in gorilla
society, the role of dominant males includes group protection duties as well as the mediation of within-
group social conflicts (Schaller, 1963; Watts, 1996). Thus, to ascend the gorilla social hierarchy in
dominance, traits such as being protective, helpful, and sensitive would seem to be a must. In fact,
Eckardt et al. (2015) reported that gorillas with a high social rank scored high on Dominance.
Additionally, rate of intervening to mediate social conflicts in the group was also associated with gorilla
Dominance. Another interesting finding was that gorillas high on Dominance stare less at other gorillas.
Also, with regard to grooming behavior, gorillas tend to approach and groom group members with higher
Dominance scores rather than vice versa.
So, how does Cantsbee’s personality compare to other gorillas? Not surprisingly, Cantsbee scored
second highest in Dominance. He also scored very high on the Sociability dimension, and his score on the
Openness dimension was below average. What is the significance of findings such as these?
Since Darwin (1872), personality research has included the study of personality in species other than our
own (Gosling & John, 1999; McGarrity et al., 2015). Darwin believed that behavioral and affective traits
evolve just like morphological traits. If that is the case, then we should be able to trace the origins of
human personality—and more specifically, personality dimensions such as Openness, Conscientiousness,
Extraversion, Agreeableness, and Neuroticism (otherwise known as the “Big 5” or five-factor model;
Digman, 1990; Goldberg, 1990). But how do we do that? While fossils can tell us a lot about the
evolution of physical features, they tell us nothing about the evolution of personality. Perhaps
evolutionary insights can be gleaned by comparing the personality of humans with those of our closest,
non-human relatives: the great apes. At the very least, the study of great apes holds the promise of
learning how assorted variables (such as differences in ecology, social systems, and life history) may act
to shape personality.
Used with permission of Winnie Eckardt.
* This Close-Up was guest-authored by Winnie Eckardt who has worked with wild mountain gorillas for
over 10 years at the Dian Fossey Gorilla Fund International Karisoke Research Center in Rwanda, and
Alexander Weiss of the University of Edinburgh and the Scottish Primate Research Group.
Beyond the why of personality assessment are several other questions that must be addressed in any
overview of the enterprise. Approaches to personality assessment differ in terms of who is being
assessed, what is being assessed, where the assessment is conducted, and how the assessment is
conducted. Let’s take a closer look at each of these related issues.
Who?
Who is being assessed, and who is doing the assessing? Some methods of personality assessment rely on
the assessee’s own self-report. Assessees may respond to interview questions; answer questionnaires in
writing; click responses on computers, tablets, or cell phones; blacken squares on computer answer
forms; or sort cards with various terms on them—all with the ultimate objective of providing the
assessor with a personality-related self-description. By contrast, other methods of personality
assessment rely on informants other than the person being assessed to provide personality-related
information. So, for example, parents or teachers may be asked to participate in the personality
assessment of a child by providing ratings, judgments, opinions, and impressions relevant to the child’s
personality.
The self as the primary referent
People typically undergo personality assessment so that they, as well as the assessor, can learn
something about who they are. In many instances, the assessment or some aspect of it requires self-
report, or a process wherein information about assessees is supplied by the assessees themselves. Self-
reported information may be obtained in the form of diaries kept by assessees or in the form of
responses to oral or written questions or test items. In some cases, the information sought by the
assessor is so private that only the individual assessees themselves are capable of providing it. For
example, when researchers investigated the psychometric soundness of the Sexual Sensation Seeking
Scale with a sample of college students, only the students themselves could provide the highly personal
information needed. The researchers viewed their reliance on self-report as a possible limitation of the
study, but noted that this methodology “has been the standard practice in this area of research because
no gold standard exists for verifying participants’ reports of sexual behaviors” (Gaither & Sellbom, 2003,
p. 165).
Self-report methods are commonly used to explore an assessee’s self-concept. Self-concept may be
defined as one’s attitudes, beliefs, opinions, and related thoughts about oneself. Inferences about an
assessee’s self-concept may be derived from many tools of assessment. However, the tool of choice is
typically a dedicated self-concept measure; that is, an instrument designed to yield information relevant
to how an individual sees him- or herself with regard to selected psychological variables. Data from such
an instrument are usually interpreted in the context of how others may see themselves on the same or
similar variables. In the Beck Self-Concept Test (BST; Beck & Stein, 1961), named after senior author,
psychiatrist Aaron T. Beck, respondents are asked to compare themselves to other people on variables
such as looks, knowledge, and the ability to tell jokes.
A number of self-concept measures for children have been developed. Some representative tests include
the Tennessee Self-Concept Scale and the Piers-Harris Self-Concept Scale. The latter test contains 80 self-
statements (such as “I don’t have any friends”) to which respondents from grades 3 to 12 respond either
yes or no as the statement applies to them. Factor analysis has suggested that the items cover six
general areas of self-concept: behavior, intellectual and school status, physical appearance and
attributes, anxiety, popularity, and happiness and satisfaction. The Beck Self-Concept Test was extended
down as one component of a series called the Beck Youth Inventories–Second Edition (BYI-II) developed
by senior author, psychologist Judith Beck (Aaron T. Beck’s daughter). In addition to a self-concept
measure, the BYI-II includes inventories to measure depression, anxiety, anger, and disruptive behavior
in children and adolescents aged 7 to 18 years.
JUST THINK . . .
Highly differentiated or not very differentiated in self-concept—which do you think is preferable? Why?
Some measures of self-concept are based on the notion that states and traits related to self-concept are
to a large degree context-dependent—that is, ever-changing as a result of the particular situation
(Callero, 1992). The term self-concept differentiation refers to the degree to which a person has different
self-concepts in different roles (Donahue et al., 1993). People characterized as highly differentiated are
likely to perceive themselves quite differently in various roles. For example, a highly differentiated
businessman in his 40s may perceive himself as motivated and hard-driving in his role at work,
conforming and people-pleasing in his role as son, and emotional and passionate in his role as husband.
By contrast, people whose concept of self is not very differentiated tend to perceive themselves similarly
across their social roles. According to Donahue et al. (1993), people with low levels of self-concept
differentiation tend to be healthier psychologically, perhaps because of their more unified and coherent
sense of self.
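To make the idea of differentiation concrete, here is a minimal Python sketch. It assumes a hypothetical procedure in which a person rates the same traits on a 1-to-7 scale in each of several roles. The function name, the trait list, the ratings, and the index itself (the average absolute difference in ratings between pairs of roles) are illustrative assumptions, not the scoring method used by Donahue et al. (1993).

```python
from itertools import combinations


def differentiation_index(role_ratings):
    """Average absolute rating difference across all pairs of roles.

    role_ratings maps a role name to a list of self-ratings, one per trait,
    with traits in the same order for every role. This index is a
    hypothetical illustration, not a published scoring procedure.
    """
    roles = list(role_ratings)
    if len(roles) < 2:
        raise ValueError("need self-ratings for at least two roles")
    n_traits = len(role_ratings[roles[0]])
    if n_traits == 0 or any(len(r) != n_traits for r in role_ratings.values()):
        raise ValueError("every role needs the same, non-empty list of traits")

    pair_means = []
    for role_a, role_b in combinations(roles, 2):
        total = sum(
            abs(x - y) for x, y in zip(role_ratings[role_a], role_ratings[role_b])
        )
        pair_means.append(total / n_traits)
    return sum(pair_means) / len(pair_means)


# Invented ratings for three traits: hard-driving, conforming, emotional.
highly_differentiated = {
    "worker": [7, 2, 2],
    "son": [2, 7, 3],
    "husband": [3, 2, 7],
}
not_very_differentiated = {
    "worker": [5, 4, 4],
    "son": [5, 4, 5],
    "husband": [4, 4, 4],
}

print(differentiation_index(highly_differentiated))
print(differentiation_index(not_very_differentiated))
```

Under these assumptions, the first profile yields the larger index because the self-ratings shift sharply from role to role, whereas the second profile stays nearly constant across roles.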
JUST THINK . . .
Has anyone you know engaged in “faking good” or “faking bad” behavior (in or out of the context of
assessment)? Why?
Assuming that assessees have reasonably accurate insight into their own thinking and behavior, and
assuming that they are motivated to respond to test items honestly, self-report measures can be
extremely valuable. An assessee’s candid and accurate self-report can illustrate what that individual is
thinking, feeling, and doing. Unfortunately, some assessees may intentionally or unintentionally paint
distorted pictures of themselves in self-report measures.
Consider what would happen if employers were to rely on job applicants’ representations concerning
their personality and their suitability for a particular job. Employers might be led to believe they have
found a slew of perfect applicants. Many job applicants—as well as people in contexts as diverse as high-
school reunions, singles bars, and child custody hearings—attempt to “fake good” in their presentation
of themselves to other people.
The other side of the “faking good” coin is “faking bad.” Litigants in civil actions who claim injury may
seek high awards as compensation for their alleged pain, suffering, and emotional distress—all of which
may be exaggerated and dramatized for the benefit of a judge and jury. The accused in a criminal action
may view time in a mental institution as preferable to time in prison (or capital punishment) and
strategically choose an insanity defense—with accompanying behavior and claims to make such a
defense as believable as possible. A homeless person who prefers the environs of a mental hospital to
that of the street may attempt to fake bad on tests and in interviews if failure to do so will result in
discharge. In the days of the military draft, it was not uncommon for draft resisters to fake bad on
psychiatric examinations in their efforts to be deferred.
Some testtakers truly may be impaired with regard to their ability to respond accurately to self-report
questions. They may lack insight, for example, because of certain medical or psychological conditions at
the time of assessment. By contrast, other testtakers seem blessed with an abundance of self-insight
that they can convey with ease and expertise on self-report measures. It is for this latter group of
individuals that self-report measures, according to Burisch (1984), will not reveal anything the testtaker
does not already know. Of course, Burisch may have overstated the case. Even people with an
abundance of self-insight can profit from taking the time to reflect about their own thoughts and
behaviors, especially if they are unaccustomed to doing so.
Another person as the referent
JUST THINK . . .
Do you believe meaningful insights are better derived through self-assessment or through assessment by
someone else? Why?
In some situations, the best available method for the assessment of personality, behavior, or both
involves reporting by a third party such as a parent, teacher, peer, supervisor, spouse, or trained
observer. Consider, for example, the assessment of a child for emotional difficulties. The child may be
unable or unwilling to complete any measure (self-report, performance, or otherwise) that will be of
value in making a valid determination concerning that child’s emotional status. Even case history data
may be of minimal value because the problems may be so subtle as to become evident only after careful
and sustained observation. In such cases, the use of a test in which the testtaker or respondent is an
informant—but not the subject of study—may be valuable. In basic personality research, this third-party
approach to assessment has been found useful, especially when the third-party reporter knows the
subject of the evaluation well. Proceeding under the assumption that spouses should be familiar enough
with each other to serve as good informants, one study examined self- versus spouse ratings on
personality-related variables (South et al., 2011). Self and spousal ratings were found to be significantly
correlated, and this relationship was stronger than that typically found between self- and peer ratings in
personality research.
The Personality Inventory for Children (PIC) and its revision, the PIC-2 (pronounced “pick two”), are
examples of a kind of standardized interview of a child’s parent. Although the child is the subject of the
test, the respondent is the parent (usually the mother), guardian, or other adult qualified to respond
with reference to the child’s characteristic behavior. The test consists of a series of true–false items
designed to be free of racial and gender bias. The items may be administered by computer or paper and
pencil. Test results yield scores that provide clinical information and shed light on the validity of the
testtaker’s response patterns. A number of studies attest to the validity of the PIC for its intended
purpose (Kline et al., 1992, 1993; Lachar & Wirt, 1981; Lachar et al., 1985; Wirt et al., 1984). However, as
with any test that relies on the observations and judgment of a rater, some concerns about this
instrument have also been expressed (Achenbach, 1981; Cornell, 1985).
In general, there are many cautions to consider when one person undertakes to evaluate another. These
cautions are by no means limited to the area of personality assessment. Rather, in any situation when
one individual undertakes to rate another individual, it is important to understand the dynamics of the
situation. Although a rater’s report can provide a wealth of information about an assessee, it may also be
instructive to look at the source of that information.
Raters may vary in the extent to which they are, or strive to be, scrupulously neutral, favorably generous,
or harshly severe in their ratings. Generalized biases to rate in a particular direction are referred to in
terms such as leniency error or generosity error and severity error. A general tendency to rate everyone
near the midpoint of a rating scale is termed an error of central tendency. In some situations, a particular
set of circumstances may create a certain bias. For example, a teacher might be disposed to judging one
pupil favorably because that pupil’s older sister was teacher’s pet in a prior class. This variety of
favorable response bias is sometimes referred to as a halo effect.
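The generalized rating tendencies described above can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the 1-to-7 scale, the function name, the invented ratings, and especially the two cutoffs, which are arbitrary and not drawn from any published standard.

```python
from statistics import mean, pstdev


def describe_rater(ratings, scale_min=1, scale_max=7, shift=1.0, narrow=0.75):
    """Label a rater's general tendency from ratings given to many ratees.

    shift  -- how far the rater's mean must sit from the scale midpoint
              to count as lenient or severe (arbitrary illustrative cutoff)
    narrow -- largest spread that still counts as "bunched" at the midpoint
              (arbitrary illustrative cutoff)
    """
    midpoint = (scale_min + scale_max) / 2
    average = mean(ratings)
    spread = pstdev(ratings)

    if spread <= narrow and abs(average - midpoint) < shift:
        return "central tendency"
    if average - midpoint >= shift:
        return "leniency (generosity)"
    if midpoint - average >= shift:
        return "severity"
    return "no marked tendency"


# Invented ratings from four raters, each rating six ratees.
raters = {
    "A": [6, 7, 6, 7, 6, 7],
    "B": [2, 1, 2, 1, 2, 2],
    "C": [4, 4, 4, 3, 4, 5],
    "D": [2, 6, 4, 7, 1, 4],
}
for name, ratings in raters.items():
    print(name, describe_rater(ratings))
```

A label produced this way is only a prompt for a closer look. A rater whose ratings average high may simply have rated a strong group, so a pattern like this suggests, but does not establish, a leniency error.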
Raters may make biased judgments, consciously or unconsciously, simply because it is in their own self-
interest to do so (see Figure 11–3). Therapists who passionately believe in the efficacy of a particular
therapeutic approach may be more disposed than others to see the benefits of that approach.
Proponents of alternative approaches may be more disposed to see the negative aspects of that same
treatment.
Figure 11–3
Ratings in one’s own self-interest.
“Monsters and screamers have always worked for me; I give it two thumbs up!”
©Ronald Jay Cohen. All rights reserved.
Numerous other factors may contribute to bias in a rater’s ratings. The rater may feel competitive with,
physically attracted to, or physically repelled by the subject of the ratings. The rater may not have the
proper background, experience, and trained eye needed for the particular task. Judgments may be
limited by the rater’s general level of conscientiousness and willingness to devote the time and effort
required to do the job properly. The rater may harbor biases concerning various stereotypes. Subjectivity
based on the rater’s own personal preferences and taste may also enter into judgments. Features that
rate a “perfect 10” in one person’s opinion may look more like a “mediocre 5” in the eyes of
another person. If such marked diversity of opinion occurs frequently with regard to a particular
instrument, we would expect it to be reflected in low inter-rater reliability coefficients. It would probably
be desirable to take another look at the criteria used to make ratings and how specific they are.
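One simple way to put a number on agreement between two raters is to correlate their ratings of the same set of ratees. The Python sketch below computes a Pearson correlation by hand. The ratings are invented, the function name is our own, and other indices of inter-rater reliability may suit a particular instrument better.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two raters' ratings of the same ratees."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists of at least two ratings")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a rater never varies")
    return cross / sqrt(ss_x * ss_y)


# Invented 1-to-10 ratings of the same six ratees by three raters.
rater_1 = [10, 8, 6, 9, 3, 5]
rater_2 = [9, 8, 5, 10, 2, 6]   # ranks the ratees much as rater_1 does
rater_3 = [5, 9, 3, 4, 8, 10]   # ranks the ratees very differently

print(round(pearson_r(rater_1, rater_2), 2))
print(round(pearson_r(rater_1, rater_3), 2))
```

A coefficient near 1 for the first pair and a much lower one for the second would point to the kind of disagreement that calls for a second look at how specifically the rating criteria are defined.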
When another person is the referent, an important factor to consider with regard to ratings is the
context of the evaluation. Different raters may have different perspectives on the individual they are
rating because of the context in which they typically view that person. A parent may indicate on a rating
scale that a child is hyperactive, whereas the same child’s teacher may indicate on the same rating scale
that the child’s activity level is within normal limits. Can they both be right?
The answer is yes, according to one meta-analysis of 119 articles in the scholarly literature (Achenbach
et al., 1987). Different informants may have different perspectives on the subjects being evaluated.
These different perspectives derive from observing and interacting with the subjects in different
contexts. The study also noted that raters tended to agree more about the difficulties of young children
(ages 6 to 11) than about those of older children and adolescents. Raters also tended to show more
agreement about children exhibiting self-control problems (such as hyperactivity and mistreating other
children) in contrast to “overcontrol” problems (such as anxiety or depression). The researchers urged
professionals to view the differences in evaluation that arise from different perspectives as something
more than error in the evaluation process. They urged professionals to employ context-specific
differences in treatment plans. Many of their ideas regarding context-dependent evaluation and
treatment were incorporated into Achenbach’s (1993) Multiaxial Empirically Based Assessment system,
the predecessor of the current Achenbach System of Empirically Based Assessment (Achenbach, 2009).
The system is an approach to the assessment of children and adolescents that incorporates cognitive and
physical assessments of the subject, self-report of the subject, and ratings by parents and teachers.
Additionally, performance measures of the child alone, with the family, or in the classroom may be
included.
JUST THINK . . .
Imagining that it was you who was being rated, how might you be rated differently on the same variable
in different contexts?
Regardless of whether the self or another person is the subject of study, one element of any evaluation that
must be kept in mind by the assessor is the cultural context.
The cultural background of assessees
Test developers and users have shown increased sensitivity to issues of cultural diversity. A number of
concerns have been raised regarding the use of personality tests and other tools of assessment with
members of culturally and linguistically diverse populations (Anderson, 1995; Campos, 1989; Greene,
1987; Hill et al., 2010; Irvine & Berry, 1983; López & Hernandez, 1987; Nye et al., 2008; Sundberg &
Gonzales, 1981; Widiger & Samuel, 2009). How fair or generalizable is a particular instrument or
measurement technique with a member of a particular cultural group? How a test was developed, how
it is administered, and how scores on it are interpreted are all questions to be raised when considering
the appropriateness of administering a particular personality test to members of culturally and
linguistically diverse populations. We continue to explore these and related questions later in this
chapter and throughout this book. In Chapter 13, for example, we consider in detail the meaning of the
term culturally informed psychological assessment.
What?
What is assessed when a personality assessment is conducted? For many personality tests, it is
meaningful to answer this question with reference to the primary content area sampled by the test and
to that portion of the test devoted to measuring aspects of the testtaker’s general response style.
Primary content area sampled
Personality measures are tools used to gain insight into a wide array of thoughts, feelings, and behaviors
associated with all aspects of the human experience. Some tests are designed to measure particular
traits (such as introversion) or states (such as test anxiety), whereas others focus on descriptions of
behavior, usually in particular contexts. For example, an observational checklist may concentrate on
classroom behaviors associated with movement in order to assess a child’s hyperactivity. Extended
discussion of behavioral measures is presented in Chapter 12.
Many contemporary personality tests, especially tests that can be scored and interpreted by computer,
are designed to measure not only some targeted trait or other personality variable but also some aspect
of the testtaker’s response style. For example, in addition to scales labeled Introversion and Extraversion,
a test of introversion/extraversion might contain other scales. Such additional scales could be designed
to shed light on how honestly testtakers responded to the test, how consistently they answered the
questions, and other matters related to the validity of the test findings. These measures of response
pattern are also known as measures of response set or response style. Let’s take a look at some different
testtaker response styles as well as the scales used to identify them.
Testtaker response styles
Response style refers to a tendency to respond to a test item or interview question in some
characteristic manner regardless of the content of the item or question. For example, an individual may
be more apt to respond yes or true than no or false on a short-answer test. This particular pattern of
responding is characterized as acquiescent. Table 11–1 shows a listing of other identified response styles.
Table 11–1
A Sampling of Test Response Styles
Response Style Name | Explanation: A Tendency to . . .
Socially desirable responding | present oneself in a favorable (socially acceptable or desirable) light
Acquiescence | agree with whatever is presented
Nonacquiescence | disagree with whatever is presented
Deviance | make unusual or uncommon responses
Extreme | make extreme, as opposed to middle, ratings on a rating scale
Gambling/cautiousness | guess (or not guess) when in doubt
Overly positive | claim extreme virtue through self-presentation in a superlative manner (Butcher & Han, 1995)
Impression management is a term used to describe the attempt to manipulate others’ impressions
through “the selective exposure of some information (it may be false information) . . . coupled with
suppression of [other] information” (Braginsky et al., 1969, p. 51). In the process of personality
assessment, assessees might employ any number of impression management strategies for any number
of reasons. Delroy Paulhus (1984, 1986, 1990) and his colleagues (Kurt & Paulhus, 2008; Paulhus &
Holden, 2010; Paulhus & Levitt, 1987) have explored impression management in test-taking as well as
the related phenomena of enhancement (the claiming of positive attributes), denial (the repudiation of
negative attributes), and self-deception—“the tendency to give favorably biased but honestly held self-
descriptions” (Paulhus & Reid, 1991, p. 307). Testtakers who engage in impression management are
exhibiting, in the broadest sense, a response style (Jackson & Messick, 1962).
JUST THINK . . .
On what occasion did you attempt to manage a particular impression for a friend, a family member, or
an acquaintance? Why did you feel the need to do so? Would you consider your effort successful?
Some personality tests contain items designed to detect different types of response styles. So, for
example, a true response to an item like “I summer in Baghdad” would raise a number of questions, such
as: Did the testtaker understand the instructions? Take the test seriously? Respond true to all items?
Respond randomly? Endorse other infrequently endorsed items? Analysis of the entire protocol will help
answer such questions.
Responding to a personality test in an inconsistent, contrary, or random way, or attempting to fake good
or bad, may affect the validity of the interpretations of the test data. Because a response style can affect
the validity of the outcome, one particular type of response style measure is referred to as a validity
scale. We may define a validity scale as a subscale of a test designed to assist in judgments regarding
how honestly the testtaker responded and whether observed responses were products of response
style, carelessness, deliberate efforts to deceive, or unintentional misunderstanding. Validity scales can
provide a kind of shorthand indication of how honestly, diligently, and carefully a testtaker responded to
test items. Some tests, such as the MMPI and its revision (to be discussed shortly), contain multiple
validity scales. Although there are those who question the utility of formally assessing response styles
(Costa & McCrae, 1997; Rorer, 1965), perhaps the more common view is that response styles are
themselves important for what they reveal about testtakers. As Nunnally (1978, p. 660) observed: “To
the extent that such stylistic variables can be measured independently of content relating to nonstylistic
variables or to the extent that they can somehow be separated from the variance of other traits, they
might prove useful as measures of personality traits.”
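To illustrate how indicators of response style might be tallied from a true–false protocol, consider the minimal Python sketch below. The item identifiers, the list of rarely endorsed items, the pairs of similar-content items, and the function name are all hypothetical. Validity scales on published tests are built and scored according to their own manuals, and nothing here reproduces the scoring of the MMPI or any other actual instrument.

```python
def validity_indicators(responses, infrequent_items, matched_pairs):
    """Tally three simple, hypothetical indicators of response style.

    responses        -- dict mapping item id to True or False
    infrequent_items -- ids of items that few testtakers answer True
    matched_pairs    -- pairs of ids with similar content, where a careful
                        testtaker would be expected to give the same answer
    """
    if not responses:
        raise ValueError("responses must not be empty")
    true_count = sum(1 for answer in responses.values() if answer)
    return {
        # Share of all items answered True (a rough acquiescence signal).
        "true_rate": true_count / len(responses),
        # Number of rarely endorsed items answered True.
        "infrequent_endorsed": sum(
            1 for item in infrequent_items if responses[item]
        ),
        # Number of similar-content pairs answered differently.
        "inconsistent_pairs": sum(
            1 for a, b in matched_pairs if responses[a] != responses[b]
        ),
    }


INFREQUENT = ["q4", "q8"]                 # hypothetical rarely endorsed items
PAIRS = [("q1", "q5"), ("q2", "q6")]      # hypothetical similar-content pairs

careful = {"q1": True, "q2": False, "q3": True, "q4": False,
           "q5": True, "q6": False, "q7": False, "q8": False}
yea_sayer = {item: True for item in careful}
erratic = {"q1": True, "q2": False, "q3": False, "q4": True,
           "q5": False, "q6": True, "q7": True, "q8": False}

for label, protocol in [("careful", careful), ("yea-sayer", yea_sayer),
                        ("erratic", erratic)]:
    print(label, validity_indicators(protocol, INFREQUENT, PAIRS))
```

Note that the all-True protocol produces no inconsistent pairs, because answering True to everything is perfectly consistent. That is one reason a sketch like this reports the overall rate of True responses separately, and why no single indicator should be read in isolation from the rest of the protocol.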
Where?
Where are personality assessments conducted? Traditional sites for personality assessment, as well as
other varieties of assessment, are schools, clinics, hospitals, academic research laboratories,
employment counseling and vocational selection centers, and the offices of psychologists and
counselors. In addition to such traditional venues, contemporary assessors may be found observing
behavior and making assessments in natural settings, ranging from the assessee’s own home (Marx,
1998; McElwain, 1998; Polizzi, 1998) to the incarcerated assessee’s prison cell (Glassbrenner, 1998).
How?
How are personality assessments structured and conducted? Let’s look at various facets of this
multidimensional question, beginning with issues of scope and theory. We then discuss procedures and
item formats that may be employed, the frame of reference of the assessment, and scoring and
interpretation.
Scope and theory
One dimension of the how of personality assessment concerns its scope. The scope of an evaluation may
be wide, seeking to take a kind of general inventory of an individual’s personality. The California
Psychological Inventory (CPI 434) is an example of an instrument with a relatively wide scope. This test
contains 434 true–false items—but then you knew that from its title—and is designed to yield
information on many personality-related variables such as responsibility, self-acceptance, and
dominance. It was originally conceived to measure enduring personality traits across cultural groups, and
predict the behavior of generally well-functioning people (Boer et al., 2008).
JUST THINK . . .
Suppose you would like to learn as much as you can about the personality of an assessee from one
personality test that is narrow in scope. On what single aspect of personality do you believe it would be
most important to focus?
In contrast to instruments and procedures designed to inventory personality as a whole are instruments
that are much narrower in terms of what they purport to measure. An instrument may be designed to
focus on as little as one particular aspect of personality. For example, consider tests designed to measure
a personality variable called locus of control (Rotter, 1966; Wallston et al., 1978). Locus (meaning “place”
or “site”) of control is a person’s perception about the source of things that happen to him or her. In
general, people who see themselves as largely responsible for what happens to them are said to have an
internal locus of control. People who are prone to attribute what happens to them to external factors
(such as fate or the actions of others) are said to have an external locus of control. A person who
believes in the value of seatbelts, for example, would be expected to score closer to the internal than to
the external end of the continuum of locus of control as opposed to a nonbuckling counterpart.
To what extent is a personality test theory-based or relatively atheoretical? Instruments used in
personality testing and assessment vary in the extent to which they are based on a theory of personality.
Some are based entirely on a theory, and some are relatively atheoretical. An example of a theory-based
instrument is the Blacky Pictures Test (Blum, 1950). This test consists of cartoonlike pictures of a dog
named Blacky in various situations, and each image is designed to elicit fantasies associated with various
psychoanalytic themes. For example, one card depicts Blacky with a knife hovering over his tail, a scene
(according to the test’s author) designed to elicit material related to the psychoanalytic concept of
castration anxiety. The respondent’s task is to make up stories in response to such cards, and the stories
are then analyzed according to the guidelines set forth by Blum (1950). The test is seldom used today;
we cite it here as a particularly dramatic and graphic illustration of how a personality theory (in this case,
psychoanalytic theory) can saturate a test.
The other side of the theory saturation coin is the personality test that is relatively atheoretical. The
single most popular personality test in use today is atheoretical: the Minnesota Multiphasic Personality
Inventory (MMPI), in both its original and revised forms. Streiner (2003) referred to this test as “the
epitome of an atheoretical, ‘dust bowl empiricism’ approach to the development of a tool to measure
personality traits” (p. 218). You will better appreciate this comment when we discuss the MMPI and its
subsequent revisions later in this chapter. For now, let’s simply point out one advantage of an
atheoretical tool of personality assessment: It allows test users, should they so desire, to impose their
own theoretical preferences on the interpretation of the findings.
Pursuing another aspect of the how of personality assessment, let’s turn to a nuts-and-bolts look at the
methods used.
Procedures and item formats
Personality may be assessed by many different methods, such as face-to-face interviews, computer-
administered tests, behavioral observation, paper-and-pencil tests, evaluation of case history data,
evaluation of portfolio data, and recording of physiological responses. The equipment required for
assessment varies greatly, depending upon the method employed. In one technique, for example, all
that may be required is a blank sheet of paper and a pencil. The assessee is asked to draw a person, and
the assessor makes inferences about the assessee’s personality from the drawing. Other approaches to
assessment, whether in the interest of basic research or for more applied purposes, may be far more
elaborate in terms of the equipment they require (Figure 11–4).
Figure 11–4
Learning about personality in the field, literally.
During World War II, the assessment staff of the Office of Strategic Services (OSS) selected American
secret agents using a variety of measures. One measure used to assess leadership ability and emotional
stability in the field was a simulation that involved rebuilding a blown bridge. Candidates were
deliberately supplied with insufficient materials for rebuilding the bridge. In some instances, “assistants”
who were actually confederates of the experimenter further frustrated the candidates’ efforts. In what
was called the “Wall Situation,” candidates were thrust into a scenario wherein the structure pictured
above was a wall obstructing their escape from enemy forces. The group’s task was to get everyone over
it. Typically, the first person to survey the situation and devise a plan for completing the task emerged as
the group leader.
Courtesy of the National Archives
Measures of personality vary in terms of the degree of structure built into them. For example,
personality may be assessed by means of an interview, but it may also be assessed by a structured
interview. In the latter method, the interviewer must typically follow an interview guide and has little
leeway in terms of posing questions not in that guide. The variable of structure is also applicable to the
tasks assessees are instructed to perform. In some approaches to personality assessment, the tasks are
straightforward, highly structured, and unambiguous. Here is one example of the instructions used for
such a task: Copy this sentence in your own handwriting. Such instructions might be used if the assessor
was attempting to learn something about the assessee by handwriting analysis, also referred to as
graphology (see Figure 11–5). Intuitively appealing as a method of deriving insights into personality,
graphology seems not to have lived up to its promise (Dazzi & Pedrabissi, 2009; Fox, 2011; Gawda, 2008;
Thiry, 2009).
Figure 11–5
Three faces (and three handwritings) of Eve.
Three Faces of Eve was a fact-based, 1957 film classic about three of the personalities—there were more
over the course of the woman’s lifetime—manifested by a patient known as “Eve White,” “Eve Black,”
and “Jane.” Prior to making that film, the 20th Century–Fox legal department insisted that the patient on
whom the screenplay was based sign three separate contracts, one for each of her personalities.
Accordingly, the patient was asked to elicit “Eve White,” “Eve Black,” and “Jane,” and then sign an
agreement while manifesting each of these respective personalities. According to Aubrey Solomon, co-
author of The Films of 20th Century–Fox (Thomas & Solomon, 1989) and commentator on the DVD
release of the film, the three signatures on the three separate contracts were all distinctly different—
presumably because they were a product of three distinctly different personalities.
John Springer Collection/Corbis Historical/Getty Images
In other approaches to personality, what is required of the assessee is not so straightforward, not very
structured, and intentionally ambiguous. One example of a highly unstructured task is as follows: Hand
the assessee one of a series of inkblots and ask, What might this be?
The same personality trait or construct may be measured with different instruments in different ways.
Consider the many possible ways of determining how aggressive a person is. Measurement of this trait
could be made in different ways: a paper-and-pencil test; a computerized test; an interview with the
assessee; an interview with family, friends, and associates of the assessee; analysis of official records and
other case history data; behavioral observation; and laboratory experimentation. Of course, criteria for
what constitutes the trait measured—in this case, aggression—would have to be rigorously defined in
advance. After all, psychological traits and constructs can and have been defined in many different ways,
and virtually all such definitions tend to be context-dependent. For example, aggressive may be defined
in ways ranging from hostile and assaultive (as in the “aggressive inmate”) to bold and enterprising (as in
the “aggressive salesperson”). This personality trait, like many others, may or may not be socially
desirable; it depends entirely on the context.
In personality assessment, as well as in assessment of other areas, information may be gathered and
questions answered in a variety of ways. For example, a researcher or practitioner interested in learning
about the degree to which respondents are field-dependent may construct an elaborate tilting
chair/tilting room device—the same one you may recall from Chapter 1 (Figure 1–5). In the interests of
time and expense, an equivalent process administered by paper and pencil or computer may be more
practical for everyday use. In this chapter’s Everyday Psychometrics, we illustrate some of the more
common item formats employed in the study of personality and related psychological variables. Keep in
mind that, although we are using these formats to illustrate different ways that personality has been
studied, some are employed in other areas of assessment as well.
EVERYDAY PSYCHOMETRICS
Some Common Item Formats
How may personality be assessed? Here are some of the more typical types of item formats.
ITEM 1
I enjoy being out and among other people. TRUE FALSE
This item illustrates the true–false format. Was your reaction something like “been there, done that”
when you saw this item?
ITEM 2
Working with fellow community members on organizing and staging a blood drive. LIKE DISLIKE
This two-choice item is designed to elicit information about the respondent’s likes and dislikes. It is a
common format in interest inventories, particularly those used in vocational counseling.
ITEM 3
How I feel when I am out and among other people
Warm __:__:__:__:__:__:__ Cold
Tense __:__:__:__:__:__:__ Relaxed
Weak __:__:__:__:__:__:__ Strong
Brooks Brothers suit __:__:__:__:__:__:__ Hawaiian shirt
This item format, called a semantic differential (Osgood et al., 1957), is characterized by bipolar
adjectives separated by a seven-point rating scale on which respondents select one point to indicate
their response. This type of item is useful for gauging the strength, degree, or magnitude of the direction
of a particular response and has applications ranging from self-concept descriptions to opinion surveys.
ITEM 4
I enjoy being out and among other people.
or
I have an interest in learning about art.
ITEM 5
I am depressed too much of the time.
or
I am anxious too much of the time.
These are two examples of items written in a forced-choice format, where ideally each of the two
choices (there may be more than two choices) is equal in social desirability. The Edwards Personal
Preference Schedule (Edwards, 1953) is a classic forced-choice test. Edwards (1957a, 1957b, 1966)
described in detail how he determined the items in this test to be equivalent in social desirability.
ITEM 6
naughty
needy
negativistic
New Age
nerdy
nimble
nonproductive
numb
This illustrates an item written in an adjective checklist format. Respondents check the traits that apply
to them.
ITEM 7
Complete this sentence.
I feel as if I _____________________________.
Respondents are typically instructed to finish the sentence with their “real feelings” in what is called a
sentence completion item. The Rotter Incomplete Sentence Blank (Rotter & Rafferty, 1950) is a
standardized test that employs such items, and the manual features normative data (Rotter et al., 1992).
Can you distinguish the figure labeled (b) in the figure labeled (a)? This type of item is found in
embedded-figures tests. Identifying hidden figures is a skill thought to tap the same field
dependence/independence variable tapped by more elaborate apparatuses such as the tilting
chair/tilting room illustrated in Figure 1–5.
This is an item reminiscent of one of the Rorschach inkblots. We will have much more to say about the
Rorschach in the following chapter.
Courtesy of Ronald Jay Cohen
Much like the Rorschach test, which uses inkblots as ambiguous stimuli, many other tests ask the
respondent to “project” onto an ambiguous stimulus. This item is reminiscent of one such projective
technique called the Hand Test. Respondents are asked to tell the examiner what they think the hands
might be doing.
Frame of reference
Another variable relevant to the how of personality measurement concerns the frame of reference of
the assessment. In the context of item format and assessment in general, frame of reference may be
defined as aspects of the focus of exploration such as the time frame (the past, the present, or the
future) as well as other contextual issues that involve people, places, and events. Perhaps for most
measures of personality, the frame of reference for the assessee may be described in phrases such as
what is or how I am right now. However, some techniques of measurement are easily adapted to tap
alternative frames of reference, such as what I could be ideally, how I am in the office, how others see
me, how I see others, and so forth. Obtaining self-reported information from different frames of
reference is, in itself, a way of developing information related to states and traits. For example, in
comparing self-perception in the present versus what is anticipated for the future, assessees who report
that they will become better people may be presumed to be more optimistic than assessees who report
a reverse trend.
Representative of methodologies that can be readily applied in the exploration of varied frames of
reference is the Q-sort technique. Originally developed by Stephenson (1953), the Q-sort is an
assessment technique in which the task is to sort a group of statements, usually in perceived rank order
ranging from most descriptive to least descriptive. The statements, traditionally presented on index
cards, may be sorted in ways designed to reflect various perceptions. They may, for example, reflect how
respondents see themselves or how they would like to see themselves. Illustrative statements are I am
confident, I try hard to please others, and I am uncomfortable in social situations.
One of the best-known applications of Q-sort methodology in clinical and counseling settings was
advocated by the personality theorist and psychotherapist Carl Rogers. Rogers (1959) used the Q-sort to
evaluate the discrepancy between the perceived actual self and the ideal self. At the beginning of
psychotherapy, clients might be asked to sort cards twice, once according to how they perceived
themselves to be and then according to how they would ultimately like to be. The larger the discrepancy
between the sortings, the more goals would have to be set in therapy. Presumably, retesting the client
who successfully completed a course of therapy would reveal much less discrepancy between the
present and idealized selves.
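The comparison of two sortings can be made concrete with a small sketch. The text does not say how the discrepancy is quantified, so this example assumes a simplified Q-sort in which every statement receives a unique rank, and it uses Spearman's rank correlation as one plausible index of agreement (values near 1 suggest little discrepancy, values near -1 a large one). The function names are hypothetical, and the three statements are the illustrative ones given above.

```python
def rank_positions(sort_order):
    """Map each statement to its 1-based rank (1 = most descriptive)."""
    return {statement: rank for rank, statement in enumerate(sort_order, start=1)}


def qsort_agreement(actual_sort, ideal_sort):
    """Spearman rank correlation between two complete sorts of the same cards.

    Assumes strict rank order with no ties, which is a simplification of
    Q-sort procedures that place cards into piles.
    """
    if len(set(actual_sort)) != len(actual_sort) or set(actual_sort) != set(ideal_sort):
        raise ValueError("both sorts must contain the same statements exactly once")
    n = len(actual_sort)
    if n < 2:
        raise ValueError("at least two statements are needed")
    actual = rank_positions(actual_sort)
    ideal = rank_positions(ideal_sort)
    squared_diffs = sum((actual[s] - ideal[s]) ** 2 for s in actual)
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Hypothetical client at the start of therapy: the two sorts are reversed.
actual_self = [
    "I am uncomfortable in social situations",
    "I try hard to please others",
    "I am confident",
]
ideal_self = [
    "I am confident",
    "I try hard to please others",
    "I am uncomfortable in social situations",
]

agreement_before = qsort_agreement(actual_self, ideal_self)
agreement_if_identical = qsort_agreement(ideal_self, ideal_self)
```

Under these assumptions, retesting after successful therapy would be expected to move the index toward 1.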
Beyond its application in initial assessment and reevaluation of a therapy client, the Q-sort technique has
also been used extensively in basic research in the area of personality and other areas. Some highly
specialized Q-sorts include the Leadership Q-Test (Cassel, 1958) and the Tyler Vocational Classification
System (Tyler, 1961). The former test was designed for use in military settings and contains cards with
statements that the assessee is instructed to sort in terms of their perceived importance to effective
leadership. The Tyler Q-sort contains cards on which occupations are listed; the cards are sorted in terms
of the perceived desirability of each occupation. One feature of Q-sort methodology is the ease with
which it can be adapted for use with a wide population range for varied clinical and research purposes.
Q-sort methodology has been used to measure a wide range of variables (e.g., Bradley & Miller, 2010;
Fowler & Westen, 2011; Huang & Shih, 2011). It has been used to measure attachment security with
children as young as preschoolers (DeMulder et al., 2000). An adaptation of Q-sort methodology has
even been used to measure attachment security in rhesus monkeys (Warfield et al., 2011).
Two other item presentation formats that are readily adaptable to different frames of reference are the
adjective checklist format and the sentence completion format. With the adjective checklist method,
respondents simply check off on a list of adjectives those that apply to themselves (or to people they are
rating). Using the same list of adjectives, the frame of reference can easily be changed by changing the
instructions. For example, to gauge various states, respondents can be asked to check off adjectives
indicating how they feel right now. Alternatively, to gauge various traits, they may be asked to check off
adjectives indicative of how they have felt for the last year or so. A test called, simply enough, the
Adjective Check List (Gough, 1960; Gough & Heilbrun, 1980) has been used in a wide range of research
studies to study assessees’ perceptions of themselves or others. For example, the instrument has been
used to study managers’ self-perceptions (Hills, 1985), parents’ perceptions of their children (Brown,
1972), and clients’ perceptions of their therapists (Reinehr, 1969). The sheer simplicity of the measure
makes it adaptable for use in a wide range of applications (e.g., Ledesma et al., 2011; Redshaw & Martin,
2009; Tsaousis & Georgiades, 2009).
JUST THINK . . .
Envision and describe an assessment scenario in which it would be important to obtain the assessee’s
perception of others.
As implied by the label ascribed to these types of tests, the testtaker’s task in responding to an item
written in a sentence completion format is to finish the rest of a sentence when provided with a
sentence stem. Items may tap how assessees feel about themselves, as in this sentence completion
item: I would describe my feeling about myself as _____. Items may tap how assessees feel about others,
as in My classmates are _____. Sentence completion methods are discussed further in the following
chapter; for now, let’s briefly overview how personality tests are scored and interpreted.
Scoring and interpretation
Personality measures differ with respect to the way conclusions are drawn from the data they provide.
For some paper-and-pencil measures, a simple tally of responses to targeted items is presumed to
provide a measure of the strength of a particular trait. For other measures, a computer programmed to
apply highly technical manipulations of the data is required for purposes of scoring and interpretation.
Yet other measures may require a highly trained clinician reviewing a verbatim transcript of what the
assessee said in response to certain stimuli such as inkblots or pictures.
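The simplest case described above, a tally of responses to targeted items, might look like the following sketch. The scoring key, item numbers, and responses are hypothetical and are not taken from any published test; real inventories define their own keys and often apply further transformations.

```python
def tally_score(responses, keyed_answers):
    """Count the responses that match the keyed direction for a trait.

    responses: dict mapping item number -> True/False answer given
    keyed_answers: dict mapping item number -> answer that counts toward the trait
    Items missing from `responses` simply do not add to the score.
    """
    return sum(
        1 for item, keyed in keyed_answers.items() if responses.get(item) == keyed
    )


# Hypothetical key: answering items 1, 3, 4 "true" and item 2 "false"
# is taken to indicate extraversion.
EXTRAVERSION_KEY = {1: True, 2: False, 3: True, 4: True}

responses = {1: True, 2: True, 3: True, 4: False}
score = tally_score(responses, EXTRAVERSION_KEY)
```

Here two of the four keyed items are endorsed in the keyed direction, so the raw score is 2.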
It is also meaningful to dichotomize measures with respect to the nomothetic versus idiographic
approach. The nomothetic approach to assessment is characterized by efforts to learn how a limited
number of personality traits can be applied to all people. According to a nomothetic view, certain
personality traits exist in all people to varying degrees. The assessor’s task is to determine what the
strength of each of these traits is in the assessee. An assessor who uses a test such as the 16PF, Fifth
Edition (Cattell et al., 1993), probably subscribes to the nomothetic view. This is so because the 16PF was
designed to measure the strength of 16 personality factors (which is what “PF” stands for) in the
testtaker. Similarly, tests purporting to measure the “Big 5” personality traits are very much in the
nomothetic tradition.
In contrast to a nomothetic view is the idiographic one. An idiographic approach to assessment is
characterized by efforts to learn about each individual’s unique constellation of personality traits, with
no attempt to characterize each person according to any particular set of traits. The idea here is not to
see where one falls on the continuum of a few traits deemed to be universal, but rather to understand
the specific traits unique to the makeup of the individual. The idiographic orientation is evident in
assessment procedures that are more flexible not only in terms of listing the observed traits but also of
naming new trait terms.2 The idiographic approach to personality assessment was described in detail by
Allport (1937; Allport & Odbert, 1936). Methods of assessment used by proponents of this view tend to
be more like tools such as the case study and personal records rather than tests. Of these two different
approaches, most contemporary psychologists seem to favor the nomothetic approach.
Another dimension related to how meaning is attached to test scores has to do with whether inter-
individual or intra-individual comparisons are made with respect to test scores. Most common in
personality assessment is the normative approach, whereby a testtaker’s responses and the presumed
strength of a measured trait are interpreted relative to the strength of that trait in a sample of a larger
population. However, you may recall that an alternative to the normative approach in test interpretation
is the ipsative approach. In the ipsative approach, a testtaker’s responses, as well as the presumed
strength of measured traits, are interpreted relative to the strength of measured traits for that same
individual. On a test that employs ipsative scoring procedures, two people with the same score for a
particular trait or personality characteristic may differ markedly with regard to the magnitude of that
trait or characteristic relative to members of a larger population.
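The contrast between the two approaches can be illustrated with a short sketch. It assumes one simple way of producing ipsative scores (expressing each trait score as a deviation from that person's own mean across traits) and one simple normative index (a z score against a reference sample). The trait names, raw scores, and norm sample are hypothetical.

```python
from statistics import mean, pstdev


def normative_z(raw_score, norm_scores):
    """Express a raw score relative to a reference sample (inter-individual)."""
    return (raw_score - mean(norm_scores)) / pstdev(norm_scores)


def ipsative_profile(raw_scores):
    """Express each trait relative to the person's own average (intra-individual)."""
    person_mean = mean(list(raw_scores.values()))
    return {trait: score - person_mean for trait, score in raw_scores.items()}


# Two hypothetical testtakers with very different raw score levels.
person_a = {"dominance": 12, "order": 8, "nurturance": 10}
person_b = {"dominance": 22, "order": 18, "nurturance": 20}

# Hypothetical norm sample for the dominance scale.
dominance_norms = [10, 12, 14, 16, 18]

ipsative_a = ipsative_profile(person_a)["dominance"]
ipsative_b = ipsative_profile(person_b)["dominance"]
z_a = normative_z(person_a["dominance"], dominance_norms)
z_b = normative_z(person_b["dominance"], dominance_norms)
```

Both testtakers receive the same ipsative dominance score, because dominance stands equally far above each person's own average, yet the normative comparison places one below and the other well above the reference sample mean.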
JUST THINK . . .
Place yourself in the role of a human resources executive for a large airline. As part of the evaluation
process, all new pilots will be given a personality test. You are asked whether the test should be ipsative
or normative in nature. Your response?
Concluding our overview of the how of personality assessment, and to prepare for discussing the ways in
which personality tests are developed, let’s review some issues in personality test development and use.
Issues in personality test development and use
Many of the issues inherent in the test development process mirror the basic questions just discussed
about personality assessment in general. What testtakers will this test be designed to be used with? Will
the test entail self-report? Or will it require the use of raters or judges? If raters or judges are needed,
what special training or other qualifications must they have? How will a reasonable level of inter-rater
reliability be ensured? What content area will be sampled by the test? How will issues of testtaker
response style be dealt with? What item format should be employed, and what is the optimal frame of
reference? How will the test be scored and interpreted?
As previously noted, personality assessment that relies exclusively on self-report is a double-edged
sword. On the one hand, the information is from “the source.” Respondents are in most instances
presumed to know themselves better than anyone else does and therefore should be able to supply
accurate responses about themselves. On the other hand, the consumer of such information has no way
of knowing with certainty which self-reported information is entirely true, partly true, not really true, or
an outright lie. Consider a response to a single item on a personality inventory written in a true–false
format. The item reads: I tend to enjoy meeting new people. A respondent indicates true. In reality, we
do not know whether the respondent (1) enjoys meeting new people; (2) honestly believes that he or
she enjoys meeting new people but really does not (in which case, the response is more the product of a
lack of insight than a report of reality); (3) does not enjoy meeting new people but would like people to
think that he or she does; or (4) did not even bother to read the item, is not taking the test seriously, and
is responding true or false randomly to each item.
Building validity scales into self-report tests is one way that test developers have attempted to deal with
the potential problems. In recent years, there has been some debate about whether validity scales
should be included in personality tests. In arguing the case for the inclusion of validity scales, it has been
asserted that “detection of an attempt to provide misleading information is a vital and absolutely
necessary component of the clinical interpretation of test results” and that using any instrument without
validity scales “runs counter to the basic tenets of clinical assessment” (Ben-Porath & Waller, 1992, p.
24). By contrast, the authors of the widely used Revised NEO Personality Inventory (NEO PI-R), Paul T.
Costa Jr. and Robert R. McCrae, perceived no need to include any validity scales in their instrument and
have been unenthusiastic about the use of such scales in other tests (McCrae & Costa, 1983; McCrae et
al., 1989; Piedmont & McCrae, 1996; Piedmont et al., 2000). Referring to validity scales as SD (social
desirability) scales, Costa and McCrae (1997) opined:
SD scales typically consist of items that have a clearly desirable response. We know that people who are
trying falsely to appear to have good qualities will endorse many such items, and the creators of SD
scales wish to infer from this that people who endorse many SD items are trying to create a good
impression. That argument is formally identical to asserting that presidential candidates shake hands,
and therefore people who shake hands are probably running for president. In fact, there are many more
common reasons for shaking hands, and there is also a more common reason than impression
management for endorsing SD items—namely, because the items are reasonably accurate self-
descriptions. (p. 89)
JUST THINK . . .
Having read about some of the pros and cons of using validity scales in personality assessment, where do
you stand on the issue? Feel free to revise your opinion as you learn more.
According to Costa and McCrae, assessors can affirm that self-reported information is reasonably
accurate by consulting external sources such as peer raters. Of course, the use of raters necessitates
certain other precautions to guard against rater error and bias. Education regarding the nature of various
types of rater error and bias has been a key weapon in the fight against intentional or unintentional
inaccuracies in ratings. Training sessions may be designed to accomplish several objectives, such as
clarifying terminology to increase the reliability of ratings. A term like satisfactory, for example, may have
different meanings to different raters. During training, new raters can observe and work with more
experienced raters to become acquainted with aspects of the task that may not be described in the
rater’s manual, to compare ratings with more experienced raters, and to discuss the thinking that went
into the ratings.
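One way to check whether such training is improving the reliability of ratings is to compute an agreement index for pairs of raters. The text does not name a particular statistic; the sketch below assumes Cohen's kappa, a widely used index that corrects the observed proportion of agreement for agreement expected by chance. The ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four cases by a new rater and an experienced rater.
new_rater = ["satisfactory", "satisfactory", "unsatisfactory", "unsatisfactory"]
experienced = ["satisfactory", "unsatisfactory", "unsatisfactory", "unsatisfactory"]

kappa = cohens_kappa(new_rater, experienced)
```

In this example the raters agree on three of four cases (0.75), chance agreement is 0.50, and kappa is therefore 0.5. A value of 1 would indicate perfect agreement.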
To include or not include a validity scale in a personality test is definitely an issue that must be dealt
with. What about the language in which the assessment is conducted? At first blush, this would appear
to be a non-issue. Well, yes and no. If an assessee is from a culture different from the culture in which
the test was developed, or if the assessee is fluent in one or more languages, then language may well
become an issue. Words tend to lose—or gain—something in translation, and some words and
expressions are not readily translatable into other languages. Consider the following true–false item
from a popular personality test: I am known for my prudence and common sense. If you are a bilingual
student, translate that statement from English as an exercise in test-item translation before reading on.
A French translation of this item is quite close, adding only an extra first-person possessive pronoun
(“par ma prudence et mon bon sens”); however, the Filipino translation of this item would read I can be
relied on to decide carefully and well on matters (McCrae et al., 1998, p. 176).
In addition to sometimes significant differences in the meaning of individual items, the traits measured
by personality tests sometimes have different meanings as well. Acknowledging this fact, McCrae et al.
(1998, p. 183) cautioned that “personality-trait relations reported in Western studies should be
considered promising hypotheses to be tested in new cultures.”
The broader issue relevant to the development and use of personality tests with members of a culture
different from the culture in which the test was normed concerns the applicability of the norms. For
example, a number of MMPI studies conducted with members of groups from diverse backgrounds yield
findings in which minority group members tend to present with more psychopathology than majority
group members (see, e.g., Montgomery & Orozco, 1985; Whitworth & Unterbrink, 1994). Such
differences have elicited questions regarding the appropriateness of the use of the test with members of
different populations (Dana, 1995; Dana & Whatley, 1991; Malgady et al., 1987).
A test may well be appropriate for use with members of culturally different populations. As López (1988,
p. 1096) observed, “To argue that the MMPI is culturally biased, one needs to go beyond reporting that
ethnic groups differ in their group profiles.” López noted that many of the studies showing differences
between the groups did not control for psychopathology. Accordingly, there may well have been actual
differences across the groups in psychopathology. The size of the sample used in the research and the
appropriateness of the statistical analysis are other extracultural factors to consider when evaluating
cross-cultural research. Of course, if culture and “learned meanings” (Rohner, 1984, pp. 119–120), as
opposed to psychopathology, are found to account for differences in measured psychopathology with
members of a particular cultural group, then the continued use of the measures with members of that
cultural group must be questioned.
In the wake of heightened security concerns as a result of highly publicized terrorist threats, stalking
incidents, and the like, new issues related to privacy have come to the fore. The number of assessments
administered in the interest of threat assessment seems ever on the increase, while professional
guidelines and legislative mandates have lagged. The result is that the public’s need to know who is a
legitimate threat to public safety has been pitted against the individual’s right to privacy (among other
rights). The topic is delved into by no less than a threat assessment expert in this chapter’s Meet an
Assessment Professional.
Armed with some background information regarding the nature of personality and its assessment, as
well as some of the issues that attend the process, let’s look at the process of developing instruments
designed to assess personality.
Developing Instruments to Assess Personality
Tools such as logic, theory, and data reduction methods (such as factor analysis) are frequently used in
the process of developing personality tests. Another tool in the test development process may be a
criterion group. As we will see, most personality tests employ two or more of these tools in the course of
their development.
Logic and Reason
Notwithstanding the grumblings of skeptics, there is a place for logic and reason in psychology, at least
when it comes to writing items for a personality test. Logic and reason may dictate what content is
covered by the items. Indeed, the use of logic and reason in the development of test items is sometimes
referred to as the content or content-oriented approach to test development. So, for example, if you
were developing a true–false test of extraversion, logic and reason might dictate that one of the items
might be something like I consider myself an outgoing person.
Efforts to develop such content-oriented, face-valid items can be traced at least as far back as an
instrument used to screen World War I recruits for personality and adjustment problems. The Personal
Data Sheet (Woodworth, 1917), later known as the Woodworth Psychoneurotic Inventory, contained
items designed to elicit self-report of fears, sleep disorders, and other problems deemed symptomatic of
a pathological condition referred to then as psychoneuroticism. The greater the number of problems
reported, the more psychoneurotic the respondent was presumed to be.
A great deal of clinically actionable information can be collected in relatively little time using such self-
report instruments—provided, of course, that the testtaker has the requisite insight and responds with
candor. A highly trained professional is not required for administration of the test. A plus in the digital
age is that a computerized report of the findings can be available in minutes. Moreover, such
instruments are particularly well suited to clinical settings in managed care environments, where drastic
cost cutting has led to reductions in orders for assessment, and insurers are reluctant to authorize
assessments. In such environments, the preferred use of psychological tests has traditionally been to
identify conditions of “medical necessity” (Glazer et al., 1991). Quick, relatively inexpensive tests
in which assessees report specific problems have won favor with insurers.
A typical companion to logic, reason, and intuition in item development is research. A review of the
literature on the aspect of personality that test items are designed to tap will frequently be very helpful
to test developers. In a similar vein, clinical experience can be helpful in item creation. So, for example,
clinicians with ample experience in treating people diagnosed with antisocial personality disorder could
be expected to have their own ideas about which items will work best on a test designed to identify
people with the disorder. A related aid in the test development process is correspondence with experts
on the subject matter of the test. Included here are experts who have researched and published on the
subject matter, as well as experts who are known to have amassed great clinical experience on the
subject matter. Yet another possible tool in test development—sometimes even the guiding force—is
psychological theory.
MEET AN ASSESSMENT PROFESSIONAL
Meet Dr. Rick Malone
I am Colonel Rick Malone, MD, an active duty military forensic psychiatrist, currently serving as a
behavioral science officer with the U.S. Army Criminal Investigation Command (still known by its
historical abbreviation, CID). In this capacity I consult with CID Special Agents on a variety of
investigations. My work assignments include behavioral analysis of crime scene evidence, the conduct of
psychological autopsies, and what I will discuss in more detail here: threat assessment.
As its name implies, threat assessment may be defined as a process of identifying or evaluating entities,
actions, or occurrences, whether natural or man-made, that have or indicate the potential to harm life,
information, operations and/or property (Department of Homeland Security, 2008). The practice of
threat assessment can take many forms depending upon the setting and the organization’s mission. In
our setting, the mission of threat assessment entails, among other things, the gathering of intelligence
designed to protect senior Department of Defense officials (referred to as “principals”). The tool of
assessment we tend to rely on most is what is called a structured professional judgment (SPJ). The
structured professional judgment is an approach that attempts to bridge the gap between actuarial and
unstructured clinical approaches to risk assessment. Unstructured clinical approaches are based on the
exercise of professional discretion and usually are justified according to the qualifications and experience
of the professional who makes them. Of course, given the variance that exists in terms of the
qualifications and experience of professionals making such judgments, SPJ as a tool of assessment is
vulnerable to criticism on various psychometric grounds such as questionable or unknown reliability and
validity. Also, given the wide range of actions that may be launched as a result of such professional
discretion, another issue relevant to SPJ is accountability.
In contrast to SPJ as the primary tool of assessment, an actuarial approach employs a fixed set of risk
factors that are combined to produce a score. In turn, this score is used to gauge an individual’s relative
risk compared to a normative group. One of the disadvantages of such strictly “objective” procedures is
that they typically prohibit the evaluator from considering unique, unusual, or context-specific variables
that might require intervention.
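To make the contrast concrete, the following is a minimal sketch of actuarial scoring as just described: a fixed set of risk factors is combined by a fixed rule, and the resulting score is located within a normative group. The factor names, weights, and normative scores are all hypothetical placeholders and are not drawn from any published instrument.

```python
from bisect import bisect_right

# Hypothetical fixed risk factors and weights (illustration only);
# a real actuarial instrument would publish its own items and rules.
WEIGHTS = {"factor_a": 2, "factor_b": 1, "factor_c": 3}


def actuarial_score(present):
    """Sum the fixed weights of the risk factors coded as present."""
    return sum(w for name, w in WEIGHTS.items() if present.get(name, False))


def percentile_rank(score, norm_scores):
    """Percentage of the normative group scoring at or below `score`."""
    ordered = sorted(norm_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


# Made-up normative scores, for illustration only.
norm_group = [0, 0, 1, 1, 2, 2, 3, 3, 4, 6]
person = {"factor_a": True, "factor_b": False, "factor_c": True}

score = actuarial_score(person)
rank = percentile_rank(score, norm_group)
print(score, rank)
```

Note that nothing in the scoring rule leaves room for a factor outside the fixed list, which is the limitation described above.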
RICKY D. MALONE, MD, MPH, MSSI COL, MC, SFS Forensic Psychiatry/Behavioral Science Consultant, U.S.
Army Criminal Investigation Command
The SPJ relies on evidence-based guidelines that are directly informed, guided, and structured by the
scientific and professional literature, but allows the evaluator discretion in their interpretation. The word
“structured” in this term refers to a minimum set of risk factors that should be considered and how to
measure them. However, “structured” in this context stops short of requiring that the identified risk
factors be combined according to a specific algorithm (Hart & Logan, 2011).
In our setting, we are often asked to assess the threat posed by a person who has demonstrated an
“inappropriate direction of interest” toward one of our designated principals. Such an individual will
typically come to our attention through attempts to communicate directly with one of these principals by
telephone, mail, or e-mail. Occasionally, and of even greater concern, the individual has directly
come into contact with or approached a designated principal. In recent years, our attention has been focused
on such persons of interest as a result of some posting on social media. Communications of concern may
contain anything from an outright threat to a complaint symptomatic of inappropriate or exaggerated
anger or blame. Another variety of communication that will get our attention is one that makes an
inappropriate plea for help with some personal issue that the writer perceives to be within the public
official’s sphere of influence. As one might imagine, senior military officials in the public eye can and do
receive such inappropriate communications from all over the world. So what is done in response?
In some cases, not very much is done. Given relatively limited resources, we need to pick and choose
which communications warrant a response (or a formal investigation) and what the level of that
response should be. So what we do for starters is a brief, indirect assessment to estimate the level of
concern that the person of interest warrants. If our level of concern is high, a formal law enforcement
investigation will be launched. If our level of concern is below the threshold of triggering a formal
investigation, we will simply continue to monitor their attempts to communicate and related activities.
Useful in this context is Meloy’s (2000) biopsychosocial (BPS) model, which identifies
individual/psychological factors, social/situational factors, and biological factors that have been shown to
be associated with higher rates of interpersonal violence. Rather than using numerical scores or
assigning ranges for threat levels, the model recommends that each factor be assessed and weighted
according to case-specific circumstances. While the BPS model was not designed specifically for targeted
violence towards public figures, it is useful in this context because it relies primarily on readily obtainable
information (as opposed to the level of information required for performing a formal investigation).
Perhaps the best sources of data for making inferences as to how dangerous persons of interest may be
are the communications created by those persons themselves. Notes, electronic postings, and other
communications frequently contain relevant personal details. These details can provide leads and clues
that yield informed insights into the individual’s mental state. Hypotheses about the person’s mental
state and the severity of disorder may be supported or rejected through the examination of other
sources such as the individual’s social media presence. Often, postings on social media can be quite
revealing in terms of things like an individual’s daily activities, interests, and political leanings. And
looking beyond the obvious, postings on social media may also be revealing in terms of personality and
the possible existence of delusional beliefs.
Complementing analysis of material readily found on social media websites is another potential gold
mine of relevant information: public records. A search of public records can yield valuable insights into
variables as diverse as financial status, residential stability, geographic mobility, and social support
systems. The information derived from such publicly available sources is then incorporated into the
biopsychosocial assessment and examined for evidence of the warning behaviors (Meloy et al., 2012).
Based on the amount and quality of information we have in hand, as well as the level of concern, the
threat management team decides whether to proceed with an investigation and/or take steps to
mitigate the threat. In both its investigative capacity, and its efforts to mitigate a threat, the team is
challenged to balance the protection of the principal’s safety with the need to preserve a citizen’s civil
rights (including one’s right to free speech and privacy, and the right not to be falsely imprisoned).
Investigative activities alone can have a significant negative impact on the individual’s life. During the
investigation, any questionable behavior on the part of a person of interest will be revealed to friends,
family, and business associates. One danger here is that the mere revelation of such behavior to third
parties will be damaging to the person of interest. From the perspective of the agency, conducting an
investigation has its own dangers as it may “tip off” the person of interest and give rise to an escalation
in that individual’s plans—all before an effective strategy for threat mitigation has been devised or put in
place. Alternatively, the “tip off” may serve to impress upon the person of interest the reality that it is now
time to abandon the suspect activity.
Threat assessment is both an art and a science; it requires the ability to know how to use evidence-
based risk factors and to integrate them with relevant insights from the individual narrative. Effective
assessment and mitigation of threat further requires the ability to work as part of a multidisciplinary
team with a diverse group of professionals such as law enforcement officers, prosecutors, mental health
professionals, and corporate security experts. Students who are drawn to this type of work will find
indispensable a firm foundation in forensic psychology coursework, and more specifically, coursework in
forensic psychological assessment. Beyond formal coursework, read the published works of expert threat
assessors such as J. Reid Meloy (e.g., Meloy, 2001; 2011; 2015; Meloy et al., 2008, 2015; Mohandie &
Meloy, 2013). Also, consider doing volunteer work, or an internship in a setting where threat
assessments are routinely conducted. There, an experienced forensic professional can serve as a model
and a mentor in the art and science of unraveling the workings of a mind based on information gathered
from a variety of sources.
Used with permission of Ricky D. Malone.
Theory
As we noted earlier, personality measures differ in the extent to which they rely on a particular theory of
personality in their development as well as their interpretation. If psychoanalytic theory were the guiding
force behind the development of a new test designed to measure antisocial personality disorder, for
example, the items might look quite different from the items developed solely on the basis of logic and
reason. One might find, for example, items designed to tap ego and superego defects that might result in
a lack of mutuality in interpersonal relationships. Given that dreams are thought to reveal unconscious
motivation, there might even be items probing the respondent’s dreams; interpretations of such
responses would be made from a psychoanalytic perspective. As with the development of tests using
logic and reason, research, clinical experience, and the opinions of experts might be used in the
development of a personality test that is theory-based.
Data Reduction Methods
Data reduction methods represent another class of widely used tools in contemporary test development.
Data reduction methods include several types of statistical techniques collectively known as factor
analysis or cluster analysis. One use of data reduction methods in the design of personality measures is
to aid in the identification of the minimum number of variables or factors that account for the
intercorrelations in observed phenomena.
Let’s illustrate the process of data reduction with a simple example related to painting your
apartment. You may not have a strong sense of the exact color that best complements your “student-of-
psychology” decor. Your investment in a subscription to Architectural Digest proved to be no help at all.
You go to the local paint store and obtain free card samples of every shade of paint known to humanity
—thousands of color samples. Next, you undertake an informal factor analysis of these thousands of
color samples. You attempt to identify the minimum number of variables or factors that account for the
intercorrelations among all of these colors. You discover that there are three factors (which might be
labeled “primary” factors) and four more factors (which might be labeled “secondary” or “second-order”
factors), the latter set of factors being combinations of the first set of factors. Because all colors can be
reduced to three primary colors and their combinations, the three primary factors would correspond to
the three primary colors, red, yellow, and blue (which you might christen factor R, factor Y, and factor B),
and the four secondary or second-order factors would correspond to all the possible combinations that
could be made from the primary factors (factors RY, RB, YB, and RYB).
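The count of four second-order factors in the paint illustration can be checked by enumerating every combination of two or more primary factors. The short sketch below does only that bookkeeping; it is not a factor analysis, and the variable names are introduced here purely for illustration.

```python
from itertools import combinations

primaries = ["R", "Y", "B"]  # red, yellow, blue

# Every combination of two or more primary factors.
secondary = [
    "".join(combo)
    for size in range(2, len(primaries) + 1)
    for combo in combinations(primaries, size)
]
print(secondary)
```

With three primaries there are three pairs and one triple, which matches the four combined factors named in the text.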
The paint sample illustration might be helpful to keep in mind as we review how factor analysis is used
in test construction and personality assessment. In a way analogous to the factoring of all those shades
of paint into three primary colors, think of all personality traits being factored into what one psychologist
referred to as “the most important individual differences in human transactions” (Goldberg, 1993, p. 26).
After all the factoring is over and the dust has settled, how many personality-related terms do you think
would remain? Stated another way, just how many primary factors of personality are there?
As the result of a pioneering research program in the 1940s, Raymond Bernard Cattell’s answer to the
question posed above was “16.” Cattell (1946, 1947, 1948a, 1948b) reviewed previous research by
Allport and Odbert (1936), which suggested that there were nearly 18,000 personality trait names
and terms in the English language. Of these, however, only about a quarter were “real traits of
personality” or words and terms that designated “generalized and personalized determining tendencies
—consistent and stable modes of an individual’s adjustment to his environment . . . not . . . merely
temporary and specific behavior” (Allport, 1937, p. 306).
Cattell added to the list some trait names and terms employed in the professional psychology and
psychiatric literature and then had judges rate “just distinguishable” differences between all the words
(Cattell, 1957). The result was a reduction in the size of the list to 171 trait names and terms. College
students were asked to rate their friends with respect to these trait names and terms, and the factor-
analyzed results of that rating further reduced the number of names and terms to 36, which Cattell
referred to as surface traits. Still more research indicated that 16 basic dimensions or source traits could
be distilled. In 1949, Cattell’s research culminated in the publication of a test called the Sixteen
Personality Factor (16 PF) Questionnaire. Revisions of the test were published in 1956, 1962, 1968, and
1993. In 2002, supplemental updated norms were published (Maraist & Russell, 2002).
Over the years, many questions have been raised regarding (1) whether the 16 factors identified by
Cattell do indeed merit the description as the “source traits” of personality, and (2) whether, in fact, the
16 PF measures 16 distinct factors. Although some research supports Cattell’s claims, give or take a
factor or two depending on the sample (Cattell & Krug, 1986; Lichtenstein et al., 1986), serious
reservations regarding these assertions have also been expressed (Eysenck, 1985, 1991; Goldberg, 1993).
Some have argued that the 16 PF may be measuring fewer than 16 factors, because several of the factors
are substantially intercorrelated.
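The intercorrelation argument can be illustrated with a small sketch. When scores on two scales rise and fall together across respondents, the two scales may be tapping a single underlying factor. The code below computes Pearson correlations among three hypothetical scales and flags pairs at or above an arbitrary cutoff. The scores, the scale names, and the cutoff of .70 are all made up for illustration, and inspecting pairwise correlations is only a first look at the data, not a substitute for factor analysis.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up scores for six respondents on three hypothetical scales.
scores = {
    "scale_1": [2, 4, 5, 7, 8, 10],
    "scale_2": [3, 4, 6, 7, 9, 10],
    "scale_3": [5, 6, 5, 6, 5, 6],
}

THRESHOLD = 0.7  # arbitrary cutoff for "substantially" correlated

# Pairs of scales whose scores overlap enough to raise the concern.
overlapping = [
    (a, b)
    for a, b in combinations(scores, 2)
    if abs(pearson_r(scores[a], scores[b])) >= THRESHOLD
]
print(overlapping)
```

In this toy data set only the first two scales are flagged, which is the kind of pattern that leads critics to ask whether two nominally distinct factors are really one.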
With colors in the paint store, we can be certain that there are three that are primary. But with regard to
personality factors, certainty doesn’t seem to be in the cards. Some theorists have argued that the
primary factors of personality can be narrowed down to three (Eysenck, 1991), or maybe four, five, or six
(Church & Burke, 1994). At least four different five-factor models exist (Johnson & Ostendorf, 1993; Costa
& McCrae, 1992), and Waller and Zavala (1993) made a case for a seven-factor model. Costa and
McCrae’s five-factor model (with factors that have come to be known as “the Big Five,” sometimes also
expressed as “the Big 5”) has gained the greatest following. Interestingly, using factor analysis in the
1960s, Raymond Cattell had also derived five factors from his “primary 16” (H. Cattell, 1996). A side-by-
side comparison of “Cattell’s five” with the Big Five shows strong similarity between the two sets of
derived factors (Table 11–2). Still, Cattell believed in the primacy of the 16 factors he originally identified.
Table 11–2
The Big Five Compared to Cattell’s Five
The Big Five          Cattell’s Five (circa 1960)
Extraversion          Introversion/Extraversion
Neuroticism           Low Anxiety/High Anxiety
Openness              Tough-Mindedness/Receptivity
Agreeableness         Independence/Accommodation
Conscientiousness     Low Self-Control/High Self-Control
Cattell expressed what he viewed as the source traits of personality in terms of bipolar dimensions. The
16 personality factors measured by the test today are: Warmth (Reserved vs. Warm), Reasoning
(Concrete vs. Abstract), Emotional Stability (Reactive vs. Emotionally Stable), Dominance (Deferential vs.
Dominant), Liveliness (Serious vs. Lively), Rule-Consciousness (Expedient vs. Rule-Conscious), Social
Boldness (Shy vs. Socially Bold), Sensitivity (Utilitarian vs. Sensitive), Vigilance (Trusting vs. Vigilant),
Abstractedness (Grounded vs. Abstracted), Privateness (Forthright vs. Private), Apprehension (Self-
Assured vs. Apprehensive), Openness to Change (Traditional vs. Open to Change), Self-Reliance (Group-
Oriented vs. Self-Reliant), Perfectionism (Tolerates Disorder vs. Perfectionistic), and Tension (Relaxed vs.
Tense).
The Big Five
The Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992) is widely used in both clinical
applications and a wide range of research that involves personality assessment. Based on a five-
dimension (or factor) model of personality, the NEO PI-R is a measure of five major dimensions (or
“domains”) of personality and a total of 30 elements or facets that define each domain.
The original version of the test was called the NEO Personality Inventory (NEO-PI; Costa & McCrae,
1985), where NEO was an acronym for the first three domains measured: Neuroticism, Extraversion, and
Openness. The NEO PI-R provides for the measurement of two additional domains: Agreeableness and
Conscientiousness. Stated briefly, the Neuroticism domain (now referred to as the Emotional Stability
factor) taps aspects of adjustment and emotional stability, including how people cope in times of
emotional turmoil. The Extraversion domain taps aspects of sociability, how proactive people are in
seeking out others, as well as assertiveness. Openness (also referred to as the Intellect factor) refers to
openness to experience as well as active imagination, aesthetic sensitivity, attentiveness to inner
feelings, preference for variety, intellectual curiosity, and independence of judgment. Agreeableness is
primarily a dimension of interpersonal tendencies that include altruism, sympathy toward others,
friendliness, and the belief that others are similarly inclined. Conscientiousness is a dimension of
personality that has to do with the active processes of planning, organizing, and following through. Each
of these major dimensions or domains of personality may be subdivided into individual traits or facets
measured by the NEO PI-R. Psychologists have found value in using these dimensions to describe a wide
range of behavior attributable to personality (Chang et al., 2011).
The NEO PI-R is designed for use with persons 17 years of age and older and is essentially self-
administered. Computerized scoring and interpretation are available. Validity and reliability data are
presented in the manual.
Perhaps due to the enthusiasm with which psychologists have embraced “the Big 5,” a number of tests
other than the NEO PI-R have been developed to measure it. One such instrument is The Big Five
Inventory (BFI; John et al., 1991). This test is made publicly available for noncommercial purposes to
researchers and students. It consists of only 44 items, which makes it relatively quick to administer.
Another instrument, the Ten Item Personality Inventory (TIPI; Gosling, Rentfrow, & Swann, 2003),
contains only two items for each of the Big 5 dimensions. Educated on matters of test construction and
test validity, you may now be asking yourself how a test with so few items could possibly be valid. And if
that is the case, you may want to read an article by Jonason et al. (2011), which actually has some
favorable things to say about the construct validity of the TIPI. Another major force in the Big Five
literature, Lewis Goldberg, is the author of adjective marker measures of the Big Five (Goldberg, 1992). He also
oversees the International Personality Item Pool, an online repository of more than 3000 items and 250
scales of free personality and individual difference measures (https://ipip.ori.org/). A nonverbal measure
of the Big 5 has also been developed. And once again, educated on matters of test construction as you
are, you may be asking yourself something like, “How in blazes did they do that?!” The Five-Factor
Nonverbal Personality Questionnaire (FF-NPQ) is administered by showing respondents illustrations of
behaviors indicative of the Big 5 dimensions. Respondents are then asked to gauge the likelihood of
personally engaging in those behaviors (Paunonen et al., 2004). One study compared the performance of
monozygotic (identical) twins on verbal and nonverbal measures of the Big 5. The researchers concluded
that the performance of the twins was similar on the measures and that the similarities were
attributable to shared genes rather than shared environments (Moore et al., 2010). Such studies fueled
speculation regarding the heritability of psychological traits.
We began our discussion of personality test development methods with a note that many personality
tests have used two or more of these strategies in their process of development. At this point you may
begin to appreciate how, as well as why, two or more tools might be used. A pool of items for an
objective personality measure could be created, for example, on the basis of logic or theory, or both
logic and theory. The items might then be arranged into scales on the basis of factor analysis. The draft
version of the test could be administered to a criterion group and to a control group to see if responses
to the items differ as a function of group membership. But here we are getting just a bit ahead of
ourselves. We need to define, discuss, and illustrate what is meant by criterion group in the context of
developing personality tests.
Criterion Groups
A criterion may be defined as a standard on which a judgment or decision can be made. With regard to
scale development, a criterion group is a reference group of testtakers who share specific characteristics
and whose responses to test items serve as a standard according to which items will be included in or
discarded from the final version of a scale. The process of using criterion groups to develop test items is
referred to as empirical criterion keying because the scoring or keying of items has been demonstrated
empirically to differentiate among groups of testtakers. The shared characteristic of the criterion group
to be researched—a psychiatric diagnosis, a unique skill or ability, a genetic aberration, or whatever—
will vary as a function of the nature and scope of the test. Development of a test by means of empirical
criterion keying may be summed up as follows:
1. Create a large, preliminary pool of test items from which the test items for the final form of the test
will be selected.
2. Administer the preliminary pool of items to at least two groups of people:
Group 1: A criterion group composed of people known to possess the trait being measured.
Group 2: A randomly selected group of people (who may or may not possess the trait being measured).
3. Conduct an item analysis to select items indicative of membership in the criterion group. Items in the
preliminary pool that discriminate between membership in the two groups in a statistically significant
fashion will be retained and incorporated in the final form of the test.
4. Obtain data on test performance from a standardization sample of testtakers who are representative of
the population from which future testtakers will come. The test performance data for Group 2 members
on items incorporated into the final form of the test may be used for this purpose if deemed
appropriate. The performance of Group 2 members on the test would then become the standard against
which future testtakers will be evaluated. After the mean performance of Group 2 members on the
individual items (or scales) of the test has been identified, future testtakers will be evaluated in terms of
the extent to which their scores deviate in either direction from the Group 2 mean.
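The steps above can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions, not the procedure used for any published test. The item labels, group sizes, and response counts are invented; the two-proportion z test is just one plausible choice of item-analysis statistic, since the text does not name one; and the deviation score is expressed in standard deviation units of a made-up Group 2 distribution.

```python
from math import sqrt
from statistics import mean, pstdev


def two_proportion_z(k1, n1, k2, n2):
    """z statistic for the difference between two endorsement rates."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Made-up counts of "True" responses per item in the preliminary pool:
# (Group 1 criterion group, Group 2 comparison group), 50 people each.
N_GROUP_1 = N_GROUP_2 = 50
true_counts = {
    "item_1": (40, 15),
    "item_2": (26, 24),
    "item_3": (10, 35),
}

Z_CRITICAL = 1.96  # two-tailed, alpha = .05

# Step 3: retain items that discriminate between the groups and key
# each one in the direction the criterion group tends to answer.
answer_key = {}
for item, (k1, k2) in true_counts.items():
    z = two_proportion_z(k1, N_GROUP_1, k2, N_GROUP_2)
    if abs(z) > Z_CRITICAL:
        answer_key[item] = z > 0  # True if Group 1 says "True" more often


def raw_score(responses):
    """Number of retained items answered in the keyed direction."""
    return sum(responses[item] == keyed for item, keyed in answer_key.items())


def deviation_from_norm(score, group_2_scores):
    """Step 4: distance from the Group 2 mean, in SD units."""
    return (score - mean(group_2_scores)) / pstdev(group_2_scores)


print(answer_key)
print(raw_score({"item_1": True, "item_2": True, "item_3": False}))
```

Notice that the code never looks at what an item says. An item is kept only because the two groups answer it differently, which is the defining feature of empirical criterion keying.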
At this point you may ask, “But what about that initial pool of items? How is it created?” The answer is
that the test developer may have found inspiration for each of the items from reviews of journals and
books, interviews with patients, or consultations with colleagues or known experts. The test developer
may have relied on logic or reason alone to write the items, or on other tests. Alternatively, the test
developer may have relied on none of these and simply let imagination loose and committed to paper
whatever emerged. An interesting aspect of test development by means of empirical criterion keying is
that the content of the test items does not have to relate in a logical, rational, direct, or face-valid way to
the measurement objective. Burisch (1984, p. 218) captured the essence of empirical criterion keying
when he stated flatly, “If shoe size as a predictor improves your ability to predict performance as an
airplane pilot, use it.”3
Now imagine that it is the 1930s. A team of researchers is keenly interested in devising a paper-and-
pencil test that will improve reliability in psychiatric diagnosis. Their idea is to use empirical criterion
keying to create the instrument. A preliminary version of the test will be administered (1) to several
criterion groups of adult inpatients, each group homogeneous with respect to psychiatric diagnosis, and
(2) to a group of randomly selected non-clinical adults without any diagnoses. Using item analysis, items
useful in differentiating members of the various clinical groups from members of the non-clinical group
will be retained to make up the final form of the test. The researchers envision that future users of the
published test will be able to derive diagnostic insights by comparing a testtaker’s response pattern to
that of testtakers in the non-clinical group.
And there you have the beginnings of a relatively simple idea that would, in time, win widespread
approval from clinicians around the world. It is an idea for a test that stimulated the publication of
thousands of research studies, and an idea that led to the development of a test that would serve as a
model for countless other instruments devised through the use of criterion group research. The test,
originally called the Medical and Psychiatric Inventory (Dahlstrom & Dahlstrom, 1980), is the MMPI.
Years after its tentative beginnings, the test’s senior author recalled that “it was difficult to persuade a
publisher to accept the MMPI” (Hathaway, cited in Dahlstrom & Welsh, 1960, p. vii). However, the
University of Minnesota Press was obviously persuaded, because in 1943 it published the test under a
new name, the Minnesota Multiphasic Personality Inventory (MMPI). The rest, as they say, is history.
In the next few pages, we describe the development of the original MMPI as well as its more
contemporary progeny, the MMPI-2, the MMPI-2 Restructured Form (the MMPI-2-RF), and the MMPI-A.
The MMPI
The MMPI was the product of a collaboration between psychologist Starke R. Hathaway and
psychiatrist/neurologist John Charnley McKinley (Hathaway & McKinley, 1940, 1942, 1943, 1951;
McKinley & Hathaway, 1940, 1944). It contained 566 true–false items and was designed as an aid to
psychiatric diagnosis with adolescents and adults 14 years of age and older. Research preceding the
selection of test items included review of textbooks, psychiatric reports, and previously published
personality test items. In this sense, the beginnings of the MMPI can be traced to an approach to test
development that was based on logic and reason.
A listing of the 10 clinical scales of the MMPI is presented in Table 11–3 along with a description of the
corresponding criterion group. The diagnostic categories listed for the 10 clinical scales were all
popular diagnostic categories in the 1930s. Members of the clinical criterion group for each scale were
presumed to have met the criteria for inclusion in the category named in the scale. MMPI clinical scale
items were derived empirically by administration to clinical criterion groups and normal control groups.
The items that successfully differentiated between the two groups were retained in the final version of
the test (Welsh & Dahlstrom, 1956). Well, it’s actually a bit more complicated than that, and you really
should know some of the details . . .
Table 11–3
The Clinical Criterion Groups for MMPI Scales
Scale                            Clinical Criterion Group
1. Hypochondriasis (Hs)          Patients who showed exaggerated concerns about their physical health
2. Depression (D)                Clinically depressed patients; unhappy and pessimistic about their future
3. Hysteria (Hy)                 Patients with conversion reactions
4. Psychopathic deviate (Pd)     Patients who had histories of delinquency and other antisocial behavior
5. Masculinity-femininity (Mf)   Minnesota draftees, airline stewardesses, and male homosexual college students from the University of Minnesota campus community
6. Paranoia (Pa)                 Patients who exhibited paranoid symptomatology such as ideas of reference, suspiciousness, delusions of persecution, and delusions of grandeur
7. Psychasthenia (Pt)            Anxious, obsessive-compulsive, guilt-ridden, and self-doubting patients
8. Schizophrenia (Sc)            Patients who were diagnosed as schizophrenic (various subtypes)
9. Hypomania (Ma)                Patients, most diagnosed as manic-depressive, who exhibited manic symptomatology such as elevated mood, excessive activity, and easy distractibility
10. Social introversion (Si)     College students who had scored at the extremes on a test of introversion/extraversion
Note that these same 10 clinical scales formed the core not only of the original MMPI, but of its 1989
revision, the MMPI-2. The clinical scales did undergo some modification for the MMPI-2, such as editing
and reordering, and nine items were eliminated. Still, the MMPI-2 retained the 10 original clinical scale
names, despite the fact that some of them (such as “Psychopathic Deviate”) are relics of a bygone era.
Perhaps that accounts for why convention has it that these scales be referred to by scale numbers only,
not their names.
To understand the meaning of normal control group in this context, think of an experiment. In
experimental research, an experimenter manipulates the situation so that the experimental group is
exposed to something (the independent variable) and the control group is not. In the development of
the MMPI, members of the criterion groups were drawn from a population of people presumed to be
members of a group with a shared diagnostic label. Analogizing an experiment to this test development
situation, it is as if the experimental treatment for the criterion group members was membership in the
category named. By contrast, members of the control group were normal (i.e., nondiagnosed) people
who ostensibly received no such experimental treatment.
JUST THINK . . .
Applying what you know about the standardization of tests, what are your thoughts regarding the
standardization of the original MMPI? What about the composition of the clinical criterion groups? The
normal control group?
The normal control group, also referred to as the standardization sample, consisted of approximately
1,500 subjects. Included were 724 people who happened to be visiting friends or relatives at University
of Minnesota hospitals, 265 high-school graduates seeking precollege guidance at the University of
Minnesota Testing Bureau, 265 skilled workers participating in a local Works Progress Administration
program, and 243 medical (nonpsychiatric) patients. The clinical criterion group for the MMPI was, for
the most part, made up of psychiatric inpatients at the University of Minnesota Hospital. We say “for the
most part” because Scale 5 (Masculinity-Femininity) and Scale 0 (Social Introversion) were not derived in
this way.
The number of people included in each diagnostic category was relatively low by contemporary
standards. For example, the criterion group for Scale 7 (Psychasthenia) contained only 20 people, all
diagnosed as psychasthenic.4 Two of the “clinical” scales (Scale 0 and Scale 5) did not even use members
of a clinical population in the criterion group. The members of the Scale 0 (Social Introversion) clinical
criterion group were college students who had earned extreme scores on a measure of introversion-
extraversion. Scale 5 (Masculinity-Femininity) was designed to measure neither masculinity nor
femininity; rather, it was originally developed to differentiate heterosexual from homosexual males. Due
to a dearth of items that effectively differentiated people on this variable, the test developers broadened
the definition of Scale 5 and added items that discriminated between normal males (soldiers) and
females (airline personnel) in the 1930s. Some of the items added to this scale were obtained from the
Attitude Interest Scale (Terman & Miles, 1936). Hathaway and McKinley had also attempted to develop a
scale to differentiate lesbians from female heterosexuals but were unable to do so.
JUST THINK . . .
Write one true–false item that you believe would successfully differentiate athlete from non-athlete
testtakers. Don’t forget to provide your suggested answer key.
By the 1930s, research on the Personal Data Sheet (Woodworth, 1917) as well as other face-valid, logic-
derived instruments had brought to light problems inherent in self-report methods. Hathaway and
McKinley (1943) evinced a keen awareness of such problems. They built into the MMPI three validity
scales: the L scale (the Lie scale), the F scale (the Frequency scale—or, perhaps more accurately, the
“Infrequency” scale), and the K (Correction) scale. Note that these scales were not designed to measure
validity in the technical, psychometric sense. There is, after all, something inherently self-serving, if not
suspect, about a test that purports to gauge its own validity! Rather, validity here was a reference to a
built-in indicator of the operation of testtaker response styles (such as carelessness, deliberate efforts to
deceive, or unintentional misunderstanding) that could affect the test results.
The L scale contains 15 items that, if endorsed, could reflect somewhat negatively on the testtaker. Two
examples: “I do not always tell the truth” and “I gossip a little at times” (Dahlstrom et al., 1972, p. 109).
The willingness of the examinee to reveal anything negative of a personal nature will be called into
question if the score on the L scale does not fall within certain limits.
JUST THINK . . .
Try your hand at writing a good L-scale item.
The 64 items on the F scale (1) are infrequently endorsed by members of nonpsychiatric populations and
(2) do not fit into any known pattern of deviance. A response of true to an item such as the following
would be scored on the F scale: “It would be better if almost all laws were thrown away” (Dahlstrom et
al., 1972, p. 115). An elevated F score may mean that the respondent did not take the test seriously and
was just responding to items randomly. Alternatively, the individual with a high F score may be an
eccentric individual or someone who was attempting to fake bad. Malingerers in the armed services,
people intent on committing fraud with respect to health insurance, and criminals attempting to cop a
psychiatric plea are some of the groups of people who might be expected to have elevated F scores on
their profiles.
Like the L score and the F score, the K score is a reflection of the frankness of the testtaker’s self-report.
An elevated K score is associated with defensiveness and the desire to present a favorable impression. A
low K score is associated with excessive self-criticism, desire to detail deviance, or desire to fake bad. A
true response to the item “I certainly feel useless at times” and a false response to “At times I am all full
of energy” (Dahlstrom et al., 1972, p. 125) would be scored on the K scale. The K scale is sometimes used
to correct scores on five of the clinical scales. The scores are statistically corrected for an individual’s
overwillingness or unwillingness to admit deviance.
Another scale that bears on the validity of a test administration is the Cannot Say scale, also referred to
simply as the ? (question mark) scale. This scale is a simple frequency count of the number of items to
which the examinee responded cannot say or failed to mark any response. Items may be omitted or
marked cannot say for many reasons, including respondent indecisiveness, defensiveness, carelessness,
and lack of experience relevant to the item. Traditionally, the validity of an answer sheet with a cannot
say count of 30 or higher is called into question and deemed uninterpretable (Dahlstrom et al., 1972).
Even for test protocols with a cannot say count of 10, caution has been urged in test interpretation. High
cannot say scores may be avoided by a proctor’s emphasis in the initial instructions to answer all items.
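The counting rule described above is simple enough to sketch in a few lines of Python. The function names, the response encoding (True, False, or None for an omitted or cannot say response), and the wording of the returned labels are illustrative assumptions; only the cutoffs of 30 and 10 come from the text.

```python
from typing import Optional, Sequence

# Cutoffs taken from the text (Dahlstrom et al., 1972); the names are hypothetical.
UNINTERPRETABLE_CUTOFF = 30
CAUTION_CUTOFF = 10


def cannot_say_count(responses: Sequence[Optional[bool]]) -> int:
    """Count items left blank or marked cannot say (encoded here as None)."""
    return sum(1 for r in responses if r is None)


def cannot_say_flag(responses: Sequence[Optional[bool]]) -> str:
    """Return a rough interpretive flag based on the cannot say count."""
    count = cannot_say_count(responses)
    if count >= UNINTERPRETABLE_CUTOFF:
        return "uninterpretable"
    if count >= CAUTION_CUTOFF:
        return "interpret with caution"
    return "acceptable"
```

A protocol of 566 responses with 30 or more None entries would be flagged as uninterpretable under this sketch, and one with 10 to 29 would be flagged for cautious interpretation.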
The MMPI contains 550 true–false items, 16 of which are repeated on some forms of the test (for a total
of 566 items administered). Scores on each MMPI scale are reported in the form of T scores which, you
may recall, have a mean set at 50 and a standard deviation set at 10. A score of 70 on any MMPI clinical
scale is 2 standard deviations above the average score of members of the standardization sample, and a
score of 30 is 2 standard deviations below their average score.
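As a worked illustration of the T-score scale just described, the sketch below applies the standard linear transformation T = 50 + 10(X - M)/SD. The raw-score mean and standard deviation in the example are made-up values, not MMPI norms, and the function name is hypothetical.

```python
def linear_t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a linear T score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the normative mean in SD units
    return 50 + 10 * z


# Hypothetical norms: raw-score mean of 12 and SD of 4 in the standardization sample.
print(linear_t_score(20, norm_mean=12, norm_sd=4))  # 2 SDs above the mean -> 70.0
print(linear_t_score(4, norm_mean=12, norm_sd=4))   # 2 SDs below the mean -> 30.0
```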
In addition to the clinical scales and the validity scales, there are MMPI content scales, supplementary
scales, and Harris-Lingoes subscales. As the name implies, the content scales, such as the Wiggins
Content Scales (after Wiggins, 1966), are composed of groups of test items of similar content. Examples
of content scales on the MMPI include the scales labeled Depression and Family Problems. In a sense,
content scales “bring order” and face validity to groups of items, derived from empirical criterion keying,
that ostensibly have no relation to one another.
JUST THINK . . .
If you were going to develop a supplementary MMPI scale, what would it be? Why would you want to
develop this scale?
Supplementary scales is a catch-all phrase for the hundreds of different MMPI scales that have been
developed since the test’s publication. These scales have been devised by different researchers using a
variety of methods and statistical procedures, most notably factor analysis. There are supplementary
scales that are fairly consistent with the original objectives of the MMPI, such as scales designed to shed
light on alcoholism and ego strength. And then there are dozens of other supplementary scales, ranging
from “Success in Baseball” to—well, you name it!5
The publisher of the MMPI makes available for computerized scoring only a limited selection of the many
hundreds of supplementary scales that have been developed and discussed in the professional
literature. One such set, the Harris-Lingoes subscales (often referred to simply as the Harris scales),
groups items into subscales (with labels such as Brooding and Social Alienation) that were designed
to be more internally consistent than the umbrella scale from which each subscale was derived.
Historically administered by paper and pencil, the MMPI is today administered by many methods: online,
offline on disk, or by index cards. An audio-augmented computerized version is available for semiliterate
testtakers. Testtakers respond to items by answering true or false. Items left unanswered are construed
as cannot say. In the version of the test administered using individual items printed on cards, testtakers
are instructed to sort the cards into three piles labeled true, false, and cannot say. At least a sixth-grade
reading level is required to understand all the items. There are no time limits, and the time required to
administer 566 items is typically between 60 and 90 minutes.
It is possible to score MMPI answer sheets by hand, but the process is labor intensive and rarely done.
Computer scoring of protocols is accomplished by software on personal computers, by computer
transmission to a scoring service via modem, online through the Q-global interface, or by physically
mailing the completed form to a computer scoring service. Computer output may range from a simple
numerical and graphic presentation of scores to a highly detailed narrative report complete with analysis
of scores on selected supplementary scales.
Soon after the MMPI was published, it became evident that the test could not be used to neatly
categorize testtakers into diagnostic categories. When testtakers had elevations in the pathological range
of two or more scales, diagnostic dilemmas arose. Hathaway and McKinley (1943) had urged users of
their test to opt for configural interpretation of scores—that is, interpretation based not on scores of
single scales but on the pattern, profile, or configuration of the scores. However, their proposed method
for profile interpretation was extremely complicated, as were many of the proposed adjunctive and
alternative procedures.
Paul Meehl (1951) proposed a 2-point code derived from the numbers of the clinical scales on which the
testtaker achieved the highest (most pathological) scores. If a testtaker achieved the highest score on
Scale 1 and the second-highest score on Scale 2, then that testtaker’s 2-point code type would be 12. The
2-point code type for a highest score on Scale 2 and a second-highest score on Scale 1 would be 21.
Because each digit in the code is interchangeable, a code of 12 would be interpreted in exactly the same
way as a code of 21. By the way, a code of 12 (or 21) is indicative of an individual in physical pain. An
assumption here is that each score in the 2-point code type exceeds an elevation of T = 70. If the scale
score does not exceed 70, this is indicated by the use of a prime (′) after the scale number. Meehl’s
system had great appeal for many MMPI users. Before long, a wealth of research mounted on the
interpretive meanings of the 40 code types that could be derived using 10 scales and two
interchangeable digits.6
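The 2-point coding rule described above can be sketched in Python as follows. This is a minimal illustration, not a clinical scoring routine: the function names, the dictionary input format, the sample T scores, and the way ties are broken are all assumptions made for the example. Only the T = 70 cutoff, the prime convention, and the interchangeability of the two digits come from the text.

```python
from typing import Dict, List

PRIME = "\u2032"  # the prime mark used when a score does not exceed T = 70


def two_point_code(t_scores: Dict[int, float], cutoff: float = 70) -> str:
    """Build a 2-point code from the two highest clinical-scale T scores.

    Keys are scale numbers (1-9 and 0). Ties are broken by input order,
    which is an arbitrary choice made for this sketch.
    """
    if len(t_scores) < 2:
        raise ValueError("need T scores for at least two scales")
    ranked = sorted(t_scores.items(), key=lambda kv: kv[1], reverse=True)
    parts = []
    for scale, score in ranked[:2]:
        # Per the text, a prime follows a scale number whose score does not exceed 70.
        suffix = "" if score > cutoff else PRIME
        parts.append(f"{scale}{suffix}")
    return "".join(parts)


def _digits(code: str) -> List[str]:
    """Return the scale digits of a code in sorted order, ignoring primes."""
    return sorted(code.replace(PRIME, ""))


def same_code_type(code_a: str, code_b: str) -> bool:
    """Treat the digits as interchangeable: 12 and 21 are the same code type."""
    return _digits(code_a) == _digits(code_b)


# Invented T scores for the ten clinical scales.
scores = {1: 82, 2: 76, 3: 64, 4: 55, 5: 50, 6: 58, 7: 61, 8: 49, 9: 45, 0: 52}
print(two_point_code(scores))      # 12
print(same_code_type("12", "21"))  # True
```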
Another popular approach to scoring and interpretation came in the form of Welsh codes—referred to
as such because they were created by Welsh (1948, 1956), not because they were written in Welsh
(although to the uninitiated, they may be equally incomprehensible). Here is an example of a Welsh
code:
6* 78′′′ 1-53∕4:2# 90 F′L-∕K
To the seasoned Welsh code user, this expression provides information about a testtaker’s scores on the
MMPI clinical and validity scales.
Students interested in learning more about the MMPI need not expend a great deal of effort in tracking
down sources. Chances are your university library is teeming with books and journal articles written on
or about this multiphasic (many-faceted) instrument. Of course, you may also want to go well beyond
this historical introduction by becoming better acquainted with this test’s more contemporary revisions,
the MMPI-2, the MMPI-2 Restructured Form, and the MMPI-A. A barebones overview of those
instruments follows.
The MMPI-2
Much of what has already been said about the MMPI in terms of its general structure, administration,
scoring, and interpretation is applicable to the MMPI-2. The most significant difference between the two
tests is the more representative standardization sample (normal control group) used in the norming of
the MMPI-2. Approximately 14% of the MMPI items were rewritten to correct grammatical errors and to
make the language more contemporary, nonsexist, and readable. Items thought to be objectionable to
some testtakers were eliminated. Added were items addressing topics such as drug abuse, suicide
potential, marital adjustment, attitudes toward work, and Type A behavior patterns.7 In all, the MMPI-2
contains a total of 567 true–false items, including 394 items that are identical to the original MMPI
items, 66 items that were modified or rewritten, and 107 new items. The suggested age range of
testtakers for the MMPI-2 is 18 years and older, as compared to 14 years and older for the MMPI. The
reading level required (sixth-grade) is the same as for the MMPI. The MMPI-2, like its predecessor, may
be administered online (with or without the audio augmentation) or offline by paper and pencil. It takes
about the same length of time to administer.
The 10 clinical scales of the MMPI are identical to those on the MMPI-2, as is the policy of referring to
them primarily by number. Content component scales were added to the MMPI-2 to provide more
focused indices of content. For example, Family Problems content was subdivided into Family Discord
and Familial Alienation content.
The three original validity scales of the MMPI were retained in the MMPI-2, and three new validity scales
were added: Back-Page Infrequency (Fb), True Response Inconsistency (TRIN), and Variable Response
Inconsistency (VRIN). The Back-Page Infrequency scale contains items seldom endorsed by testtakers
who are candid, deliberate, and diligent in their approach to the test. Of course, some testtakers’
diligence wanes as the test wears on and so, by the “back pages” of the test, a random or inconsistent
pattern of responses may become evident. The Fb scale is designed to detect such a pattern.
JUST THINK . . .
To maintain continuity with the original test, the MMPI-2 used the same names for the clinical scales.
Some of these scale names, such as Psychasthenia, are no longer used. If you were in charge of the
MMPI’s revision, what would your recommendation have been for dealing with this issue related to
MMPI-2 scale names?
The TRIN scale is designed to identify acquiescent and nonacquiescent response patterns. It contains 23
pairs of items worded in opposite forms. Consistency in responding dictates that, for example, a true
response to the first item in the pair is followed by a false response to the second item in the pair. The
VRIN scale is designed to identify indiscriminate response patterns. It, too, is made up of item pairs,
where each item in the pair is worded in either opposite or similar form.
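The logic of checking opposite-worded pairs can be sketched as follows. This is a deliberately simplified illustration and not the actual TRIN scoring procedure: the item numbers and pairs are invented, and the hypothetical function simply counts how many pairs were answered in the same direction.

```python
from typing import Dict, List, Tuple

# Invented item pairs; each pair is assumed to be worded in opposite forms,
# so a consistent testtaker answers the two items differently.
OPPOSITE_PAIRS: List[Tuple[int, int]] = [(3, 41), (12, 77), (25, 90)]


def same_direction_pairs(
    answers: Dict[int, bool], pairs: List[Tuple[int, int]]
) -> Tuple[int, int]:
    """Return (true_true, false_false) counts across opposite-worded pairs."""
    true_true = false_false = 0
    for first, second in pairs:
        a, b = answers.get(first), answers.get(second)
        if a is None or b is None:
            continue  # skip pairs with an unanswered item
        if a and b:
            true_true += 1    # both true: suggests an acquiescent pattern
        elif not a and not b:
            false_false += 1  # both false: suggests a nonacquiescent pattern
    return true_true, false_false


answers = {3: True, 41: True, 12: True, 77: False, 25: False, 90: False}
print(same_direction_pairs(answers, OPPOSITE_PAIRS))  # (1, 1)
```

Keeping the two counts separate, rather than summing them, preserves the distinction the text draws between acquiescent and nonacquiescent response patterns.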
The senior author of the MMPI-2, James Butcher (Figure 11–6),8 developed yet another validity scale
after the publication of that test. The S scale is a validity scale designed to detect self-presentation in a
superlative manner (Butcher & Han, 1995; Lanyon, 1993a, 1993b; Lim & Butcher, 1996).
Figure 11–6
James Butcher (1933– ) and friend.
That’s Jim, today better known as the senior author of the MMPI-2, to your right as an Army infantryman
at Outpost Yoke in South Korea in 1953. Returning to civilian life, Jim tried various occupations, including
salesman and private investigator. He later earned a Ph.D. at the University of North Carolina, where he
had occasion to work with W. Grant Dahlstrom and George Welsh (as in MMPI “Welsh code”). Butcher’s
first teaching job was at the University of Minnesota, where he looked forward to working with Starke
Hathaway and Paul Meehl. But he was disappointed to learn that “Hathaway had moved on to the
pursuit of psychotherapy research and typically disclaimed any expertise in the test. . . . Hathaway always
refused to become involved in teaching people about the test. Meehl had likewise moved on to other
venues” (Butcher, 2003, p. 233).
©James Butcher
JUST THINK . . .
Of all of the proposed validity scales for the MMPI-2, which do you think is the best indicator of whether
the test scores are truly indicative of the testtaker’s personality?
Another proposed validity scale, the FBS or Faking Bad Scale, was developed by Paul R. Lees-Haley and
his colleagues (1991) as a means to detect malingerers who submitted bogus personal injury
claims. In the years since its development, the FBS has found support from some, most notably
Ben-Porath et al. (2009). However, it also has its critics—among them, James Butcher and his colleagues.
Butcher et al. (2008) argued that factors other than malingering (such as genuine physical or
psychological problems) could contribute to endorsement of items that were keyed as indicative of
malingering. They cautioned that the “lack of empirical verification of the 43 items selected by Lees-
Haley, including examination of the items’ performance across broad categories of people, argues
against its widespread dissemination” (pp. 194–195).
A nagging criticism of the original MMPI was that its standardization sample was not representative of
the U.S. population. This criticism was addressed in the standardization of the MMPI-2. The 2,600
individuals (1,462 females, 1,138 males) from seven states who made up the MMPI-2 standardization
sample had been matched to 1980 U.S. Census data on the variables of age, gender, minority status,
social class, and education (Butcher, 1990). Whereas the original MMPI did not contain any non-whites
in the standardization sample, the MMPI-2 sample was 81% white and 19% non-white. Age of subjects in
the sample ranged from 18 years to 85 years. Formal education ranged from 3 years to 20+ years, with
more highly educated people and people working in the professions overrepresented in the sample.
Median annual family income for females in the sample was $25,000 to $30,000. Median annual family
income for males in the sample was $30,000 to $35,000.
As with the original MMPI, the standardization sample data provided the basis for transforming the raw
scores obtained by respondents into T scores for the MMPI-2. However, a technical adjustment was
deemed to be in order. The T scores used for standardizing the MMPI clinical scales and content scales
were linear T scores. For the MMPI-2, linear T scores were also used for standardization of the validity
scales, the supplementary scales, and Scales 5 and 0 of the clinical scales. However, a different T score
was used to standardize the remaining eight clinical scales as well as all of the content scales; these
scales were standardized with uniform T scores (UT scores). The UT scores were used in an effort to
make the T scores corresponding to percentile scores more comparable across the MMPI-2 scales
(Graham, 1990; Tellegen & Ben-Porath, 1992).
Efforts to address concerns about the MMPI did not end with the publication of the MMPI-2. Before long,
research was under way to revise the MMPI-2. These efforts were evident in the
publication of restructured clinical scales (Tellegen et al., 2003) and culminated more recently in the
publication of the MMPI-2 Restructured Form (MMPI-2-RF).
The MMPI-2-RF
The need to rework the clinical scales of the MMPI-2 was perceived by Tellegen et al. (2003) as arising, at
least in part, from two basic problems with the structure of the scales. One basic problem was
overlapping items. The method of test development initially used to create the MMPI, empirical criterion
keying, practically ensured there would be some item overlap. But just how much item overlap was
there? Per pair of clinical scales, it has been observed that there is an average of more than six
overlapping items in the MMPI-2 (Greene, 2000; Helmes & Reddon, 1993). Item overlap between the
scales can decrease the distinctiveness and discriminant validity of individual scales and can also
contribute to difficulties in determining the meaning of elevated scales.
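To make the idea of "average overlap per pair of scales" concrete, here is a small sketch that computes it for a set of scoring keys. The function name is hypothetical, and the toy scales and item numbers are invented for illustration; they are not actual MMPI-2 keys.

```python
from itertools import combinations
from typing import Dict, Set


def mean_pairwise_overlap(scales: Dict[str, Set[int]]) -> float:
    """Average number of shared items across every pair of scales."""
    pairs = list(combinations(scales.values(), 2))
    if not pairs:
        raise ValueError("need at least two scales")
    return sum(len(a & b) for a, b in pairs) / len(pairs)


# Toy scales keyed by invented item numbers (not actual MMPI-2 keys).
toy_scales = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6},
    "C": {4, 6, 7, 8},
}
print(mean_pairwise_overlap(toy_scales))  # 5 shared items spread over 3 pairs
```

With 10 clinical scales the same function would average over every possible pairing of scales, which is the sense in which the text reports more than six overlapping items per pair.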
A second problem with the basic structure of the test could also be characterized in terms of overlap—
one that is more conceptual in nature. Here, reference is made to the pervasive influence of a factor that
seemed to permeate all of the clinical scales. The factor has been described in different ways with
different terms such as anxiety, malaise, despair, and maladjustment. It is a factor that is thought to be
common to most forms of psychopathology yet unique to none. Exploring the issue of why entirely
different approaches to psychotherapy had comparable results, Jerome Frank (1974) focused on what he
viewed as this common factor in psychopathology, which he termed demoralization:
Only a small proportion of persons with psychopathology come to therapy; apparently something else
must be added that interacts with their symptoms. This state of mind, which may be termed
“demoralization,” results from persistent failure to cope with internally or externally induced stresses. . . .
Its characteristic features, not all of which need to be present in any one person, are feelings of
impotence, isolation, and despair. (p. 271)
Dohrenwend et al. (1980) perpetuated the use of Frank’s concept of demoralization in their discussion of
a nonspecific distress factor in psychopathology. Tellegen (1985) also made reference to demoralization
when he wrote of a factor that seemed to inflate correlations between measures within clinical
inventories. Many of the items on all of the MMPI and MMPI-2 clinical scales, despite their
heterogeneous content, seemed to be saturated with the demoralization factor. Concern about the
consequences of this overlapping has a relatively long history (Adams & Horn, 1965; Rosen, 1962; Welsh,
1952). In fact, the history of efforts to remedy the problem of insufficient discriminant validity and
discriminative efficiency of the MMPI clinical scales is almost as long as the long history of the test itself.
One goal of the restructuring was to make the clinical scales of the MMPI-2 more distinctive and
meaningful. As described in detail in a monograph supplement to the MMPI-2 administration and
scoring manual, Tellegen et al. (2003) attempted to (1) identify the “core components” of each clinical
scale, (2) create revised scales to measure these core components (referred to as “seed scales”), and (3)
derive a final set of Restructured Clinical (RC) scales using the MMPI-2 item pool. Another objective of the
restructuring was, in essence, to extract the demoralization factor from the existing MMPI-2 clinical
scales and create a new Demoralization scale. This new scale was described as one that “measures a
broad, emotionally colored variable that underlies much of the variance common to the MMPI-2 Clinical
Scales” (Tellegen et al., 2003, p. 11).
Employing the MMPI-2 normative sample as well as three additional clinical samples in their research,
Tellegen et al. (2003) made the case that their restructuring procedures were psychometrically sound
and had succeeded in improving both convergent and discriminant validity. According to their data, the
restructured clinical (RC) scales were less intercorrelated than the original clinical scales, and their
convergent and discriminant validity were greater than those of the original scales. Subsequent to the
development of the RC scales, additional scales were developed. For example, the test authors
developed scales to measure clinically significant factors that were not directly assessed by the RC scales,
such as suicidal ideation. They also saw a need to develop scales tapping higher-order dimensions to
provide a framework for organizing and interpreting findings. These higher-order scales were labeled
Emotional/Internalizing Dysfunction, Thought Dysfunction, and Behavioral/Externalizing Dysfunction.
The finished product was published in 2008 and called the MMPI-2 Restructured Form (MMPI-2-RF; Ben-
Porath & Tellegen, 2008). It contains a total of 338 items and 50 scales, some of which are summarized in
Table 11–4.
Table 11–4
Description of a Sampling of MMPI-2-RF Scales
Clinical Scales Group
There are a total of nine clinical scales. The RCd, RC1, RC2, and RC3 scales were introduced by Tellegen et
al. (2003). Gone from the original MMPI (and MMPI-2) clinical scales is the Masculinity-Femininity Scale.
Scale Name: Scale Description
Demoralization (RCd): General malaise, unhappiness, and dissatisfaction
Somatic Complaints (RC1): Diffuse complaints related to physical health
Low Positive Emotions (RC2): A “core” feeling of vulnerability in depression
Cynicism (RC3): Beliefs nonrelated to self that others are generally ill-intentioned and not to be trusted
Antisocial Behavior (RC4): Acting in violation of societal or social rules
Ideas of Persecution (RC6): Self-referential beliefs that one is in danger or threatened by others
Dysfunctional Negative Emotions (RC7): Disruptive anxiety, anger, and irritability
Aberrant Experiences (RC8): Psychotic or psychotic-like thoughts, perceptions, or experiences
Hypomanic Activation (RC9): Over-activation, grandiosity, impulsivity, or aggression
Validity Scales Group
There are a total of eight validity scales, which is one more validity scale than in the previous edition of
the test. The added validity scale is Infrequent Somatic Responses (Fs).
Scale Name: Scale Description
Variable Response Inconsistency-Revised (VRIN-r): Random responding
True Response Inconsistency-Revised (TRIN-r): Fixed responding
Infrequent Responses-Revised (F-r): Infrequent responses compared to the general population
Infrequent Psychopathology Responses-Revised (Fp-r): Infrequent responses characteristic of psychiatric populations
Infrequent Somatic Responses (Fs): Infrequent somatic complaints from patients with medical problems
Symptom Validity (aka Fake Bad Scale-Revised; FBS-r): Somatic or mental complaints with little or no credibility
Uncommon Virtues (aka Lie Scale-Revised; L-r): Willingness to reveal anything negative about oneself
Adjustment Validity (aka Defensiveness Scale-Revised; K-r): Degree to which the respondent is self-critical
Specific Problem (SP) Scales Group
There are a total of 20 scales that measure problems. These SP scales are grouped as relating to
Internalizing, Externalizing, or Interpersonal issues and are subgrouped according to the clinical scale on
which they shed light.
Scale Name: Scale Description
Suicidal/Death Ideation (SUI)a: Respondent reports self-related suicidal thoughts or actions
Helplessness/Hopelessness (HLP)a: Pervasive belief that problems are unsolvable and/or goals unattainable
Self-Doubt (SFD)a: Lack of self-confidence, feelings of uselessness
Inefficacy (NFC)a: Belief that one is indecisive or incapable of accomplishment
Cognitive Complaints (COG)a: Concentration and memory difficulties
Juvenile Conduct Problems (JCP)b: Difficulties at home or school, stealing
Substance Abuse (SUB)b: Current and past misuse of alcohol and drugs
Sensitivity/Vulnerability (SNV)c: Taking things too hard, being easily hurt by others
Stress/Worry (STW)c: Preoccupation with disappointments, difficulty with time pressure
Anxiety (AXY)c: Pervasive anxiety, frights, frequent nightmares
Anger Proneness (ANP)c: Being easily angered, impatient with others
Behavior-Restricting Fears (BRF)c: Fears that significantly inhibit normal behavior
Multiple Specific Fears (MSF)c: Various specific fears, such as a fear of blood or a fear of thunder
Aggression (AGG)d: Physically aggressive, violent behavior
Activation (ACT)d: Heightened excitation and energy level
Interest Scales Group
There are two scales that measure interests: the AES scale and the MEC scale.
Scale Name: Scale Description
Aesthetic-Literary Interests (AES): Interest in literature, music, and/or the theater
Mechanical-Physical Interests (MEC): Fixing things, building things, outdoor pursuits, sports
PSY-5 Scales Group
These five scales are revised versions of MMPI-2 measures.
Scale Name: Scale Description
Aggressiveness-Revised (AGGR-r): Goal-directed aggression
Psychoticism-Revised (PSYC-r): Disconnection from reality
Disconstraint-Revised (DISC-r): Undercontrolled behavior
Negative Emotionality/Neuroticism-Revised (NEGE-r): Anxiety, insecurity, worry, and fear
Introversion/Low Positive Emotionality-Revised (INTR-r): Social disengagement and absence of joy or happiness
Note: Overview based on Ben-Porath et al. (2007) and related materials; consult the MMPI-2-RF test
manual (and updates) for a complete list and description of all the test’s scales.
a Internalizing scale that measures facets of Demoralization (RCd).
b Externalizing scale that measures facets of Antisocial Behavior (RC4).
c Internalizing scale that measures facets of Dysfunctional Negative Emotions (RC7).
d Externalizing scale that measures facets of Hypomanic Activation (RC9).
JUST THINK . . .
What is a scale that you think should have been added to the latest version of the MMPI?
Since the publication of Tellegen et al.’s (2003) monograph, Tellegen, Ben-Porath, and their colleagues
have published a number of other articles that provide support for various aspects of the psychometric
adequacy of the RC scales and the MMPI-2-RF. Studies from independent researchers have also provided
support for some of the claims made regarding the RC scales’ reduced item intercorrelations and
increased convergent and discriminant validity (Simms et al., 2005; Wallace & Liljequist, 2005). Other
authors have obtained support for the Somatic Complaints RC scale, the Cynicism RC scale, and the
VRIN-r and TRIN-r validity scales (Handel et al., 2010; Ingram et al., 2011; Thomas & Locke, 2010). Osberg
et al. (2008) compared the MMPI-2 clinical scales with the RC scales in terms of psychometric properties
and diagnostic efficiency and reported mixed results.
The MMPI-2-RF technical manual provides empirical correlates of test scores based on various criteria in
various settings including clinical and nonclinical samples. The MMPI-2-RF can still be hand-scored and
hand-profiled, although computerized score reporting (with or without a computerized narrative report)
is available.
The MMPI-3
Released in fall 2020, the third edition of the Minnesota Multiphasic Personality Inventory (MMPI-3)
can be administered electronically, either online through Pearson’s Q-global or locally through Q-local,
or in a paper-and-pencil format for hand-scoring or a mail-in scoring service. Authored by Ben-Porath
and Tellegen, this latest version is shortened to a 25- to 50-minute administration and requires a 4.5
grade reading level. It is offered in three languages: English, Spanish, and Canadian French. Its
normative sample of 1,620 testtakers (810 men and 810 women), all aged 18 years or older, was matched
to the U.S. Census Bureau demographic projections for 2020. The Spanish sample included 550 U.S.
Spanish speakers (275 men and 275 women). The MMPI-3 includes 72 new items, 24 updated items, and
4 new scales.
The MMPI-A-RF
Although its developers had recommended the original MMPI for use with adolescents, test users had
evinced skepticism of this recommendation through the years. Early on it was noticed that adolescents
as a group tended to score somewhat higher on the clinical scales than adults, a finding that left
adolescents as a group in the unenviable position of appearing to suffer from more psychopathology
than adults. In part for this reason, separate MMPI norms for adolescents were developed. In the 1980s,
while the MMPI was being revised to become the MMPI-2, the test developers had a choice of simply
renorming the MMPI-2 for adolescents or creating a new instrument. They opted to develop a new test
that was in many key respects a downward extension of the MMPI-2.
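To see why the choice of norm group matters, consider a minimal sketch of a linear T-score conversion (mean of 50, standard deviation of 10). The norm means and standard deviations below are hypothetical values chosen only for illustration; they are not taken from any MMPI manual, and later MMPI editions rely on published norm tables and uniform T scores rather than this simple formula.

```python
def linear_t_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a linear T score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50 + 10 * (raw - norm_mean) / norm_sd


# Hypothetical norm parameters for a single clinical scale (illustrative only).
ADULT_NORMS = {"mean": 20.0, "sd": 5.0}
ADOLESCENT_NORMS = {"mean": 24.0, "sd": 5.0}

# The same raw score, interpreted against each norm group.
raw = 27
t_adult = linear_t_score(raw, ADULT_NORMS["mean"], ADULT_NORMS["sd"])
t_adolescent = linear_t_score(raw, ADOLESCENT_NORMS["mean"], ADOLESCENT_NORMS["sd"])

print(f"T score against adult norms: {t_adult:.0f}")
print(f"T score against adolescent norms: {t_adolescent:.0f}")
```

Under these assumed values, the same raw score yields a higher T score against the adult norms than against the adolescent norms. This is the pattern that made adolescents appear more disturbed when they were scored with adult norms.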
The Minnesota Multiphasic Personality Inventory–Adolescent (MMPI-A; Butcher et al., 1992) was a 478-
item, true–false test designed for use in clinical, counseling, and school settings for the purpose of
assessing psychopathology and identifying personal, social, and behavioral problems. The individual
items of the MMPI-A largely parallel the MMPI-2, although there are 88 fewer items. Some of the MMPI-
2 items were discarded, others were rewritten, and some completely new ones were added. Recently,
the MMPI-A was restructured to mirror the MMPI-2-RF. The MMPI-A-RF (Archer et al., 2016) uses the
same norms as the MMPI-A, but has reconfigured the scale items to reduce item overlap and sharpen
the theoretical meaning of the scales. The MMPI-A-RF contains 10 clinical scales (identical in name and
number to those of the MMPI-2-RF) and seven validity scales.
In addition to basic clinical and validity scales, the MMPI-A contains many supplementary scales for
evaluating aspects of internalizing, externalizing, and somatic symptoms of distress. It also provides a
succinct summary of psychopathology with the Personality Psychopathology Five scales: Aggressiveness,
Psychoticism, Disconstraint, Negative Emotionality, and Low Positive Emotionality.
The normative sample for the MMPI-A-RF consisted of 805 adolescent males and 815 adolescent
females drawn from schools in California, Minnesota, New York, North Carolina, Ohio, Pennsylvania,
Virginia, and Washington. The objective was to obtain a sample that was nationally representative in
terms of demographic variables such as ethnic background, geographic region of the United States, and
urban/rural residence. Concurrent with the norming of the MMPI-A-RF, a clinical sample of 713
adolescents was tested for the purpose of obtaining validity data. However, no effort was made to
ensure representativeness of the clinical sample. Subjects were all drawn from the Minneapolis area,
most from drug and alcohol treatment centers.
JUST THINK . . .
What are your comments on the norming of the MMPI-A?
In general, the MMPI-A and MMPI-A-RF have earned high marks from test reviewers and may well have
quickly become the most widely used measure of psychopathology in adolescents. More information
about this test can be obtained from an authoritative book entitled Assessing Adolescent
Psychopathology: MMPI-A/MMPI-A-RF, Fourth Edition (Archer, 2017).
The MMPI and its revisions and progeny in perspective
The MMPI burst onto the psychology scene in the 1940s and was greeted as an innovative, well-
researched, and highly appealing instrument by both clinical practitioners and academic researchers.
Today, we can look back at its development and be even more impressed, as it was developed without
the benefit of high-speed computers. The research studies that have been conducted on this test
number in the thousands, and few psychological tests are better known throughout the world. Through
the years, various weaknesses in the test have been discovered, and remedies have been proposed as a
consequence. The latest “restructuring” of the MMPI represents an effort not only to improve the test
and bring it into the twenty-first century but also to maintain continuity with the voluminous research
addressing its previous forms. There can be little doubt that the MMPI is very much a “work in progress”
that will be continually patched, restructured, and otherwise re-innovated to maintain that continuity.
JUST THINK . . .
What should the next version of the MMPI look like? In what ways should it be different from the
MMPI-2-RF?
Personality Assessment and Culture
Every day, assessment professionals across the United States are routinely called on to evaluate
personality and related variables of people from culturally and linguistically diverse populations. Yet
personality assessment is anything but routine with children, adolescents, and adults from Native
American, Latinx, Asian, Black/African American, and other cultures that may have been
underrepresented in the development, standardization, and interpretation protocols of the measures
used. Especially with members of culturally and linguistically diverse populations, a routine and business-
as-usual approach to psychological testing and assessment is inappropriate, if not irresponsible. What is
required is a professionally trained assessor capable of conducting a meaningful assessment, with
sensitivity to how culture relates to the behaviors and cognitions being measured (López, 2000).
Before any tool of personality assessment—an interview, a test, a protocol for behavioral observation, a
portfolio, or something else—can be employed, and before data derived from an attempt at
measurement can be imbued with meaning, the assessor will ideally consider some important issues
with regard to assessment of a particular assessee. Many of these issues relate to the level of
acculturation, values, identity, worldview, and language of the assessee. Professional exploration of
these areas is capable of yielding not only information necessary as a prerequisite for formal personality
assessment but a wealth of personality-related information in its own right.
Acculturation and Related Considerations
Acculturation is an ongoing process by which an individual’s thoughts, behaviors, values, worldview, and
identity develop in relation to the general thinking, behavior, customs, and values of a particular cultural
group. The process of acculturation begins at birth, a time at which the newborn infant’s family or
caretakers serve as agents of the culture. In the years to come, other family members, teachers, peers,
books, films, theater, newspapers, television and radio programs, and other media serve as agents of
acculturation. Through the process of acculturation, one develops culturally accepted ways of thinking,
feeling, and behaving.
A number of tests and questionnaires have been developed to yield insights regarding assessees’ level of
acculturation to their native culture or the dominant culture. A sampling of these measures is presented
in Table 11–5. As you survey this list, keep in mind that the amount of psychometric research conducted
on these instruments varies. Some of these instruments may be little more than content valid, if that. In
such cases, let the buyer beware. If you plan to use any of these measures, you may wish to look
up more information about them in a resource such as the Mental Measurements Yearbook. Perhaps the
most appropriate use of many of these tests would be to derive hypotheses for future testing by means
of other tools of assessment. Unless compelling evidence exists to attest to the use of a particular
instrument with members of a specific population, data derived from any of these tests and
questionnaires should not be used alone to make selection, treatment, placement, or other momentous
decisions.
Table 11–5
Some Published Measures of Acculturation
Target Population | Reference Sources
African-American | Baldwin (1984); Baldwin & Bell (1985); Klonoff & Landrine (2000); Obasi & Leong (2010); Snowden & Hines (1999)
Asian | Kim et al. (1999); Suinn et al. (1987)
Asian-American | Gim Chung et al. (2004); Wolfe et al. (2001)
Asian (East & South) | Barry (2001); Inman et al. (2001)
Asian Indian | Sodowsky & Carey (1988)
Central American | Wallen et al. (2002)
Chinese | Yao (1979)
Cuban | Garcia & Lega (1979)
Deaf culture | Maxwell-McCaw & Zea (2011)
Eskimo | Chance (1965)
Hawaiian | Bautista (2004); Hishinuma et al. (2000)
Iranian | Shahim (2007)
Japanese-American | Masuda et al. (1970); Padilla et al. (1985)
Khmer | Lim et al. (2002)
Latino/Latina | Murguia et al. (2000); Zea et al. (2003)
Mexican-American | Cuéllar et al. (1995); Franco (1983); Mendoza (1989); Ramirez (1984)
Muslim American | Bagasra (2010)
Native American | Garrett & Pichette (2000); Howe Chief (1940); Roy (1962)
Puerto Rican | Tropp et al. (1999); Cortes et al. (2003)
Vietnamese | Nguyen & von Eye (2002)
Population nonspecific measures | Sevig et al. (2000); Smither & Rodriguez-Giegling (1982); Stephenson (2000); Unger et al. (2002); Wong-Rieger & Quintana (1987)
A number of important questions regarding acculturation and related variables can be raised with regard
to assessees from culturally diverse populations. Many general types of interview questions may yield
rich insights regarding the overlapping areas of acculturation, values, worldview, and identity. A sampling
of such questions is presented in Table 11–6. As an exercise, you may wish to pose some or all of these
questions to someone you know who happens to be in the process of acculturation. Before doing so,
however, some caveats are in order. Keep in mind the critical importance of rapport when conducting an
interview. Be sensitive to cultural differences in readiness to engage in self-disclosure about family or
other matters that may be perceived as too personal to discuss (with a stranger or otherwise). Be ready
and able to change the wording of these questions should you need to facilitate the assessee’s
understanding of them or to change the order of these questions should an assessee answer more than
one question in the same response. Listen carefully and do not hesitate to probe for more information if
you perceive value in doing so. Finally, keep in mind that the relevance of each of these questions will
vary with the background and unique socialization experiences of each assessee.
Table 11–6
Some Sample Questions to Assess Acculturation
Describe yourself.
Describe your family. Who lives at home?
Describe roles in your family, such as the role of mother, the role of father, the role of grandmother, the
role of child, and so forth.
What traditions, rituals, or customs were passed down to you by family members?
What traditions, rituals, or customs do you think it is important to pass to the next generation?
With regard to your family situation, what obligations do you see yourself as having?
What obligations does your family have to you?
What role does your family play in everyday life?
How do the roles of males and females differ, from your own cultural perspective?
What kind of music do you like?
What kinds of foods do you eat most routinely?
What do you consider fun things to do? When do you do these things?
Describe yourself in the way that you think most other people would describe you. How would you say
your own self-description would differ from that description?
How might you respond to the question “Who are you?” with reference to your own sense of personal
identity?
With which cultural group or groups do you identify most? Why?
What aspect of the history of the group with which you most identify is most significant to you? Why?
Who are some of the people who have influenced you most?
What are some things that have happened to you in the past that have influenced you most?
What sources of satisfaction are associated with being you?
What sources of dissatisfaction or conflict are associated with being you?
What do you call yourself when asked about your ethnicity?
What are your feelings regarding your racial and ethnic identity?
Describe your most pleasant memory as a child.
Describe your least pleasant memory as a child.
Describe the ways in which you typically learn new things. In what ways might cultural factors have
influenced the ways you learn?
Describe the ways you typically resolve conflicts with other people. What influence might cultural factors
have on this way of resolving conflicts?
How would you describe your general view of the world?
How would you characterize human nature in general?
How much control do you believe you have over the things that happen to you? Why?
How much control do you believe you have over your health? Your mental health?
What are your thoughts regarding the role of work in daily life? Has your cultural identity influenced your
views about work in any way? If so, how?
How would you characterize the role of doctors in the world around you?
How would you characterize the role of lawyers in the world around you?
How would you characterize the role of politicians in the world around you?
How would you characterize the role of spirituality in your daily life?
What are your feelings about the use of illegal drugs?
What is the role of play in daily life?
How would you characterize the ideal relationship between human beings and nature?
What defines a person who has power?
What happens when one dies?
Do you tend to live your life more in the past, the present, or the future? What influences on you do you
think helped shape this way of living?
How would you characterize your attitudes and feelings about the older people in your family? About
older people in society in general?
Describe your thinking about the local police and the criminal justice system.
How do you see yourself 10 years from now?
Intimately entwined with acculturation is the learning of values. Values are that which an individual
prizes or the ideals an individual believes in. An early systematic treatment of the subject of values came
in a book entitled Types of Men (Spranger, 1928), which listed different types of people based on
whether they valued things like truth, practicality, and power. The book served as an inspiration for a yet
more systematic treatment of the subject (Allport et al., 1951). Before long, a number of different
systems for listing and categorizing values had been published.
Rokeach (1973) differentiated what he called instrumental from terminal values. Instrumental values are
guiding principles to help one attain some objective. Honesty, imagination, ambition, and cheerfulness
are examples of instrumental values. Terminal values are guiding principles and modes of behavior that
are endpoint objectives. A comfortable life, an exciting life, a sense of accomplishment, and self-respect
are some examples of terminal values. Other value-categorization systems focus on values in specific
contexts, such as employment settings. Values such as financial reward, job security, or prestige may
figure prominently in decisions regarding occupational choice and employment or feelings of job
satisfaction.
Writing from an anthropological/cultural perspective, Kluckhohn (1954, 1960; Kluckhohn & Strodtbeck,
1961) conceived of values as answers to key questions with which civilizations must grapple. So, for
example, from questions about how the individual should relate to the group, values emerge about
individual versus group priorities. In one culture, the answers to such questions might take the form of
norms and sanctions that encourage strict conformity and little competition among group members. In
another culture, norms and sanctions may encourage individuality and competition among group
members. In this context, one can begin to appreciate how members of different cultural groups can
grow up with vastly different values, ranging from views on various “isms” (such as individualism versus
collectivism) to views on what is trivial and what is worth dying for. The different values people from
various cultures bring to the assessment situation may translate into widely varying motivational and
incentive systems. Understanding an individual’s values is an integral part of understanding personality.
Also intimately tied to the concept of acculturation is the concept of personal identity. Identity in this
context may be defined as a set of cognitive and behavioral characteristics by which individuals define
themselves as members of a particular group. Stated simply, identity refers to one’s sense of self. Levine
and Padilla (1980) defined identification as a process by which an individual assumes a pattern of
behavior characteristic of other people, and referred to it as one of the “central issues that ethnic
minority groups must deal with” (p. 13). Echoing this sentiment, Zuniga (1988) suggested that a question
such as “What do you call yourself when asked about your ethnicity?” might be used as an icebreaker
when assessing identification. She went on:
How a minority client handles their response offers evidence of their comfortableness with their identity.
A Mexican-American client who responds by saying, “I am an American, and I am just like everyone else,”
displays a defensiveness that demands gentle probing. One client sheepishly declared that she always
called herself Spanish. She used this self-designation since she felt the term “Mexican” was dirty. (p. 291)
Another key culture-related personality variable concerns how an assessee tends to view the world. As
its name implies, worldview is the unique way people interpret and make sense of their perceptions as a
consequence of their learning experiences, cultural background, and related variables.
Our overview of personality began with a consideration of some superficial, lay perspectives on this
multifaceted subject. We made reference to the now-classic rock oldie Personality and its “definition” of
personality in terms of observable variables such as walk, talk, smile, and charm. Here, at the end of the
chapter, we have come a long way in considering more personal, nonobservable elements of personality
in the form of constructs such as worldview, identification, values, and acculturation. In the chapter that
follows, we continue to broaden our perspective regarding tools that may be used to better understand
and effectively assess personality.
Self-Assessment
Test your understanding of elements of this chapter by seeing if you can explain each of the following
terms, expressions, and abbreviations:
acculturation
acquiescent response style
Big Five
control group
criterion
criterion group
empirical criterion keying
error of central tendency
forced-choice format
frame of reference
generosity error
graphology
halo effect
identification
identity
idiographic approach
impression management
instrumental values
IPIP
leniency error
locus of control
MMPI
MMPI-2
MMPI-2-RF
MMPI-3
MMPI-A-RF
NEO PI-R
nomothetic approach
personality
personality assessment
personality profile
personality trait
personality type
profile
profile analysis
profiler
Q-sort technique
response style
self-concept
self-concept differentiation
self-concept measure
self-report
semantic differential
severity error
state
structured interview
terminal values
Type A personality
Type B personality
validity scale
values
Welsh code
worldview