An Icelandic version of the
Kiddie-SADS-PL: Translation,
cross-cultural adaptation and
inter-rater reliability


Lauth B, Magnússon P, Ferrari P, Pétursson H. An Icelandic version of the Kiddie-SADS-PL: Translation, cross-cultural adaptation and inter-rater reliability. Nord J Psychiatry 2008;62:379–385. Oslo. ISSN 0803-9488.

The development of structured diagnostic instruments has been an important step for research
in child and adolescent psychiatry, but the adequacy of a diagnostic instrument in a given culture
does not guarantee its reliability or validity in another population. The objective of the study
was to describe the process of cross-cultural adaptation into Icelandic of the Schedule for
Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime Version
(Kiddie-SADS-PL) and to test the inter-rater reliability of the adapted version. To attain cross-
cultural equivalency, five important dimensions were addressed: semantic, technical, content,
criterion and conceptual. The adapted Icelandic version was introduced into an inpatient clinical
setting, and inter-rater reliability was estimated at both the symptom and diagnosis levels, for the
most frequent diagnostic categories in both international diagnostic classification systems
(DSM-IV and ICD-10). The cross-cultural adaptation has provided an Icelandic version
allowing similar understanding among different raters and has achieved acceptable cross-
cultural equivalence. This initial study confirmed the quality of the translation and adaptation of
Kiddie-SADS-PL and constitutes the first step of a larger validation study of the Icelandic
version of the instrument.


Child psychiatric interview, Diagnostic instruments, Inter-rater reliability, K-SADS-PL,
Translation and cross-cultural adaptation.

Bertrand Lauth, University of Iceland, Landspitali University Hospital, Department of Child
and Adolescent Psychiatry, Dalbraut 12, 105 Reykjavík, Iceland, E-mail: bertrand@landspitali.is;
Accepted 20 September 2007.

The development of structured diagnostic instruments

has been an important step for research in child and

adolescent psychiatry. These standardized instruments

are based on the more descriptive taxonomy of recent

editions of international classification systems, DSM-IV

(1) and ICD-10 (2). They are essential for research and

are good aids for clinicians, since they provide a

structured and systematic assessment procedure that

has increased diagnostic reliability.

However, for researchers belonging to cultural back-

grounds and speaking languages other than English, the

adoption of such an instrument involves a process of

translation and adaptation of the instrument. The

adequacy of the diagnostic instrument in a given culture

does not guarantee its reliability or validity in another

population (3). The challenge for these researchers is then

to develop a translated and adapted version of the

assessment tool, which is equivalent to the original

version of the instrument. Only by achieving this equiva-

lence will it be possible to assure the degree of compar-

ability in studies carried out in different cultures (3).

The need for diagnostic instruments has led the

authors to start the development of a local version of

the Kiddie-SADS-PL for research in clinical populations.

Because of the lack of a European consensus statement

about standard and validated practices on that topic, we

started with a classic translation/back-translation process

to try to achieve semantic equivalence and then started to

introduce the structured interview into an adolescent

inpatient clinical setting.

The impact of the introduction of the diagnostic

instrument is described in another publication, as well

as additional studies on the psychometric properties of

the Icelandic version in an adolescent clinical population.

This article describes the process of translation and

adaptation of the instrument as well as inter-rater
reliability data.

The importance of structured psychiatric interviews
in child and adolescent psychiatry
Research does not support unstructured interviews as a reliable means of reaching standardized diagnoses (4). Even experienced clinical interviewers are not reliable diagnosticians when compared with each other or when compared with structured interviews (5–8). Agreement is usually low in studies involving outpatients or hospitalized children and adolescents (5, 6, 9–12). Agreement is usually higher for externalizing disorders than for internalizing disorders, and many authors suggest that diagnoses of anxiety and depression are often missed when an unstructured interview is used. Miller (7, 8) has suggested that, because the structured interview yields precise diagnostic data, appropriate treatments may be delivered earlier, leading to more rapid recovery and shorter hospital stays.

Research is still needed to establish the utility of

diagnostic interviews in clinical settings, but by encoura-

ging clinicians to follow standard diagnostic and inter-

viewing methods, they promote more consistent
diagnostic practices and help justify therapeutic inter-

ventions and outcomes. However, the instruments are

not a substitute for clinical judgement. As McClellan &

Werry (13) pointed out, psychiatric decision making

depends on the integration of information from diverse

sources and perspectives, including the patient and

family interviews, the mental status examination, collat-

eral informants (teachers) and other treatment providers.
The pre-eminent role of the clinician must be recognized

and preserved.

Choice of a “state of the art” psychiatric diagnostic
In this project, an interviewer-based diagnostic interview
that requires some clinical decision making on the part

of the interviewer was chosen. This kind of instrument is

often used in studies of clinical populations and is

preferred by clinicians to respondent-based interviews

(13), even though it usually shows poorer reliability.

The “Kiddie-Schedule for Affective Disorders and Schizophrenia for School-Age Children” (“K-SADS”) has changed since its original publication (14) and is currently available in different DSM-IV format versions: the Kiddie-SADS-P IVR (Present State), the Kiddie-SADS-L (Lifetime), the Kiddie-SADS-PL (Present and Lifetime Version) and the Kiddie-SADS-E (Epidemiological) (15, 16).

The “Present and Lifetime” version (K-SADS-PL)

has several strengths (16). It has strong content validity

because it was designed to tap pre-specified diagnostic
criteria. It has been designed to lead the clinician or the

therapist to make DSM-IV diagnoses during interviews

of the young patient and his parent, including detailed
probes useful in eliciting clinically meaningful informa-

tion. One strength of the Kiddie-SADS-PL is its high

degree of precision and detail in assessing child symp-

toms and their onset, severity, duration and associated

impairments. It is the only instrument that provides

global and diagnosis-specific impairment ratings to

facilitate the determination of “caseness” (17). The Kiddie-SADS-PL also provides a clinician-friendly front and screening examination, which may result in a more efficient, shorter interview. Skip-decisions seem reliable

and have been validated by comparison with other

symptom measures (18).

Compared with other child diagnostic instruments, the K-SADS-PL shows favourable test–retest reliability estimates (17), and adequate concurrent validity of the K-SADS-PL diagnoses was established against several standard self-report scales (18). Its strength remains in diagnoses of affective and anxiety disorders; it has also been demonstrated to have good reliability for attention-deficit/hyperactivity disorder.

Use of the Kiddie-SADS-PL requires extensive clinical experience and instrument-specific training. Diagnostic algorithms have not been computerized, so diagnoses are formulated by expert judgement.

Translation and adaptation of Kiddie-SADS-PL
We followed the cross-cultural adaptation model used to translate into Spanish and culturally adapt the Diagnostic Interview Schedule for Children (DISC) (19). Even if cultural differences between American and Icelandic children and adolescents may be to some extent less important than those between American and Spanish-speaking child and adolescent populations, we aimed at attaining cross-cultural equivalency by addressing five important dimensions: semantic, technical, content, criterion and conceptual.

This model provided the frame of reference employed in the development of an Icelandic version of the K-SADS-PL.

Semantic equivalence
This involves the choice of terms and sentence structures

that ensure that the meaning of the source language

items is preserved in the translation. Semantic equiva-
lence can be achieved with a combination of translation

and back-translation techniques (20).

In our case, two different translators independently translated and back-translated the instrument. Following Geisinger’s recommendations (21), they are fluent in both languages, knowledgeable about both cultures and have a good understanding of the clinical characteristics and content that the instrument is supposed to evaluate. Translation and back-translation drafts were examined by a bilingual committee composed of the two translators and four bilingual clinicians experienced in child and adolescent psychiatry. Feedback from interviewers after initial field testing was essential in further improving the translation.

A few examples of differences between the original and back-translated instruments follow, illustrating some of the semantic problems that had to be solved:

• Sadness/depressed mood;
• Separation from/absence of the mother;
• Racing/quick thoughts;
• Fits/episodes;
• Diurnal mood variations/daily mood fluctuations;
• Impulsivity/impetuousness.

One difficulty was that the English vocabulary is broader and more detailed than the Icelandic one, which in particular includes fewer words to describe emotions and affects.


Content equivalence
The purpose here is to examine whether the content of each item is meaningful to the population under study and, if necessary, to remove some items. In the Icelandic version of the Kiddie-SADS, we had, for example, to remove “Walking on the train tracks”, since transport by train does not exist in Iceland. Additionally, we had to remove some references to famous US teenage gangs, as well as several examples of social games common among American children, and replace them with equivalent Icelandic ones.

Another issue in the field of content equivalence is to

determine whether the operationalization of what con-

stitutes normal behaviour is relevant to the culture under

study, with special consideration to evaluation of im-

pairment and social adaptation (normative equivalence).

Since Kiddie-SADS items include systematic estimation

of impairment or social adaptation, those issues have

been carefully examined by our bilingual committee.


Technical equivalence
The issues addressed in this field are related to under- or over-reporting of certain problems by some cultural groups compared to others (substance abuse, sexual behaviour and thoughts, certain feelings). We considered here the openness with which particular topics are discussed by Icelandic teenagers, the manner in which ideas are expressed and the way in which strangers asking questions are treated. The most significant particularity we noted (but did not systematically assess) was the difficulty that, in our experience, many Icelandic boys have in expressing both positive and negative emotions, especially anxiety.


Conceptual equivalence
This refers to the fact that the same theoretical construct

should be evaluated by the two versions of the assessment

instrument in the different cultures involved. Does the

concept operationalized in the source instrument (in-

dependently of the words and phrases used to represent

it) exist in the same form in the thinking of members of

the target culture?

In our case, two important discrepancies revealed by

differences between original and back-translated ver-

sions illustrated conceptual problems:

• Conduct/behavioural disorder;
• Attachment figure/someone close.

The concept of attachment as it exists in English

semiology is almost impossible to translate into Icelandic.

Behling & Law (22) pointed out that translation/back-

translation was not an adequate test of the equivalence

of the target and source-language documents, but deals

only with semantic equivalence. The two versions may

correspond with one another for the wrong reasons and

the target-language version may still not convey the intended meaning to potential respondents.

Solving conceptual problems needs careful considera-

tion of: the constitutive definition of the concept of

interest, the theory that explains it and the nature of any

difference between the source and the target culture.

Usually this kind of equivalence cannot be directly

assessed except for rating scales with a known factorial

structure, which could be empirically tested with a

confirmatory factor analysis, trying to find the same

factor structure in the translated version.


Criterion equivalence
Solving problems in that field involves identifying

pertinent norms in each culture and assessing when the

trait or disorder evaluated is said to exist according to

these norms. In our case, the culture for which Kiddie-

SADS is to be adapted is not very different from the

culture for which the instrument was developed and we

were not able to identify significant symptoms of mental

distress that would appear different culturally from those

included in the original version of the instrument.

However, we were aware of cultural differences in how

parents and professionals view behavioural problems.

In this study, we were not able to validate the

instrument against best estimate clinical judgement

from various Icelandic child and adolescent psychia-

trists, so we designed a study of the concurrent validity

using 11 well-known checklists and rating scales,

which had already been translated and validated in

Iceland and that assess several important dimensions of

psychopathology. This validation study will be the

subject of a separate paper.

Clinical context
The adolescent unit of the Department of Child and Adolescent Psychiatry of the Landspítali University Hospital in Reykjavík is the only psychiatric ward for adolescents in Iceland, admitting each year between 70 and 80 patients from 11 to 18 years of age, from all parts of the country. The main reasons for admission are severe behavioural or emotional disturbances with severe functional impairment and often suicidality (61% of cases in the period 2001–2004), 53% being acute admissions. Mean length of stay was 43 days (2001–2004).

Adolescents presenting with alcohol and drug abuse as a

predominant problem are referred to other service

providers, such as social services. The population

admitted is culturally homogeneous and its geographic

distribution throughout the country is representative of

the general population for the same age categories.

Since the unit is the only facility in the country
providing psychiatric inpatient treatment for adoles-

cents, we assume that our population is representative

of the most severe range of psychiatric morbidity of the

adolescent clinical population in Iceland.

Fifteen subjects were included in the inter-rater reliability data pool; the mean age was 15.2 years (standard deviation, s = 1.0; range 14–17), 55% were female and all were Icelandic.

These subjects were chosen at random among participants in the validation study (n = 86; mean age 15.0 years; s = 1.34; range 11–18); females accounted for 55.7% and, again, all were Icelandic. They were

severely affected in most of the cases and some of them

could not complete the diagnostic interview.

As the main official diagnostic classification system in

European countries is ICD-10 for both clinical and

research purposes, the results of the Kiddie-SADS interview algorithms were translated into ICD-10. A few

additional questions were included in the interviews for

ICD-10 criteria not covered by the K-SADS-PL.

Both DSM-IV and ICD-10 diagnoses were generated,

and numbers of diagnostic criteria met were calculated
for the most frequent diagnostic categories in both

classification systems.

Two coders checked to verify accurate utilization of

DSM-IV and ICD-10 algorithms in the calculation of

numbers of diagnostic criteria met and assignment of

final diagnosis.

Inter-rater reliability was estimated with 15 interviews
being re-rated independently by other experienced and

trained clinicians. Three clinical psychologists and one

child and adolescent psychiatrist took part in the
project. Interviews were either videotaped or attended

at the same time by another rater.

Statistical analysis
The Statistical Package for Social Sciences was used for data analysis (23). For comparisons of categorical variables, chi-square tests were applied. Cohen’s Kappa (24) was used for reliability measures. Criteria proposed by Landis & Koch (24) were used to interpret the Kappa coefficients: excellent reliability (Kappa >0.75), good reliability (Kappa 0.59–0.75), fair reliability (Kappa 0.40–0.58) and poor reliability (Kappa <0.40).
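
As an illustration only, the Kappa statistic and the interpretation bands quoted above can be sketched in a few lines of Python. This is not the SPSS procedure used in the study: the function names, the two rating lists and the handling of values that fall between the printed bands (for example 0.585) are assumptions made for the example, and the lower bound for "excellent" is taken to be a value above 0.75.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # formally undefined here, and returning 1.0 is only a convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def interpret(kappa):
    """Map a kappa value onto the bands quoted in the text (assumed cut-offs)."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.59:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


# Hypothetical ratings (1 = diagnosis present, 0 = absent) for 12 subjects.
rater_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f} ({interpret(kappa)})")
```

With these invented ratings the raters disagree on one subject out of 12, which gives a Kappa of approximately 0.80. The example also shows why Kappa is preferred to raw percentage agreement: observed agreement is about 92%, but more than half of that is expected by chance given how rarely either rater assigns the diagnosis.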

Kappa values were calculated in both classification

systems and for most frequent diagnostic categories. In

each diagnostic area, inter-rater reliability was also

examined at the symptom level, with Kappa values

calculated for each item. Correlations between numbers

of diagnostic criteria met generated by different K-

SADS raters were calculated.
Inter-rater reliability was also examined for main

diagnostic areas surveyed in the screen interview, in order

to estimate agreement in utilization of skip-out criteria.

Since several severely affected patients could not

complete the diagnostic interview, the number of sub-

jects allowing calculation of inter-rater reliability varied

according to diagnostic categories.

Inter-rater reliability of skip-out criteria
The average agreement evaluated by calculation of

Kappa statistics across the nine diagnostic categories

studied here is fair to excellent (Table 1).

Inter-rater reliability of diagnosis assignment
Kappas for diagnoses ranged from 0.31 to 1.0 (Table 2).

In each diagnostic category, inter-rater reliability was

also examined at the symptom level with Kappas

calculated for each item. Table 3 summarizes the results

Table 1. Inter-rater reliability: Agreement in utilization of skip-out criteria.

Diagnostic area in screen interview       Kappa value   n
Depressive disorders                      1.00          15
Bipolar disorders                         0.86          14
Attention-deficit/hyperactivity disorder  1.00          11
Oppositional defiant disorder             1.00          12
Conduct disorder                          0.67          12
Post-traumatic stress disorder            1.00          8
Social phobia                             1.00          12
Separation anxiety disorder               0.81          12
Generalized anxiety disorder              0.57          12

Table 2. Inter-rater reliability: measure of agreement on

Diagnostic category                                       Kappa value   n
Major depressive disorder (DSM-IV)                        1.00          15
Melancholic depression (DSM-IV)                           0.55          15
Dysthymia (ICD-10)                                        0.63          15
Moderate depressive episode (ICD-10)                      0.32          15
Severe depressive episode (ICD-10)                        0.44          15
Somatic syndrome (ICD-10)                                 0.47          15
Mania (DSM-IV)                                            0.31          14
Hypomania (DSM-IV)                                        1.00          14
Bipolar disorder not otherwise specified (DSM-IV)         0.44          14
Mania (ICD-10)                                            0.44          14
Hypomania (ICD-10)                                        1.00          14
ADHD, predominantly inattentive type (DSM-IV)             0.81          11
ADHD, predominantly hyperactive-impulsive type (DSM-IV)   1.00          11
ADHD, combined type                                       1.00          11
  disorder (ICD-10)
Oppositional defiant disorder (DSM-IV)                    1.00          12
Oppositional defiant disorder (ICD-10)                    1.00          12
Conduct disorder (DSM-IV and ICD-10)                      1.00          12
Post-traumatic stress disorder (DSM-IV)                   0.67          6
Post-traumatic stress disorder, chronic type (DSM-IV)     0.57          6
Post-traumatic stress disorder (ICD-10)                   0.67          6
Social phobia (DSM-IV)                                    0.82          12
Social anxiety disorder of childhood (ICD-10)             1.00          12
Separation anxiety disorder (DSM-IV)                      1.00          12
Separation anxiety disorder of childhood (ICD-10)         1.00          12
Overanxious/generalized anxiety disorder (DSM-IV)         1.00          12
Overanxious/generalized anxiety disorder (ICD-10)         0.82          12

The inter-rater reliability data pool allowed calculation of correlations between the numbers of diagnostic criteria met, as generated by different K-SADS raters, in the main areas of psychopathology and for both diagnostic classification systems. Statistically significant correlations between different raters’ severity scores were found at the 0.01 level in all diagnostic categories (Table 4).

The translation and adaptation process of the Kiddie-

SADS-PL into the Icelandic language and culture aimed

at attaining cross-cultural equivalency by addressing five

important dimensions: semantic, technical, content,

criterion and conceptual.

The results of the study on inter-rater reliability must
Table 3. Inter-rater reliability: average values at the symptom level within each diagnostic category.

Diagnostic category                       Kappa value   n
Depressive disorders                      0.83          15
Bipolar disorders                         0.85          14
ADHD, inattention                         0.90          11
ADHD, hyperactivity-impulsivity           0.94          11
Oppositional defiant disorder             0.98          12
Conduct disorder                          0.96          12
Post-traumatic stress disorder, current   0.93          9
Post-traumatic stress disorder, msp       0.86          3
Social phobia                             0.48          12
Separation anxiety disorder               0.78          12
Generalized anxiety disorder              0.82          12

ADHD, Attention-Deficit/Hyperactivity Disorder; msp, most severe

procedures were used: in-session observation and video-

taped observation.

The data collected in this study were, however, in line with those reported by other investigators (4, 12, 18, 25–27). Agreement in utilization of skip-out criteria was excellent for most diagnostic categories but moderate for generalized anxiety disorder. Agreement in the assignment of the most frequent diagnostic categories was excellent, good or fair in most cases, but poor for moderate depressive episode in the ICD-10 diagnostic classification system and for mania in both systems. Agreement was excellent at the symptom level within each category except social phobia. Correlations between different raters’ severity scores were significant at the 0.01 level in all diagnostic categories.

These results suggest that the translation and adapta-

tion work has provided an Icelandic version allowing

similar understanding among different raters and has succeeded in achieving acceptable equivalence.

The American version of the interview had shown

exceptional reliability at the symptom level (16). The

strongest reliability data were related to behavioural,

disruptive and affective disorders. The instrument had

shown poorer reliability for diagnosing anxiety disorders. Our results point in the same direction, except for anxiety disorders, where inter-rater reliability tended to be better, and for bipolar spectrum disorders, where it was poorer. The difficulties concerning bipolar spectrum disorders could be related to the fact that our interviewers were less familiar with the assessment of this kind of symptomatology.

Our findings also suggest a much poorer inter-rater reliability for depressive disorders diagnosed in the ICD-10 diagnostic classification system. This could be related to the fact that diagnostic criteria are more numerous and difficult to meet in the ICD-10 system, but it could constitute a problem in Iceland, since the results of Kiddie-SADS interview algorithms necessarily have to be translated into ICD-10. We conclude that further studies are needed of differences between concurrent validity data according to both diagnostic classification systems, as well as of differences in diagnostic assignment procedures.

Table 4. Inter-rater reliability: correlation between numbers of diagnostic criteria met.

Number of diagnostic criteria met                                   Pearson’s value   n
DSM-IV Depressive symptomatology/9                                  0.94**            15
ICD-10 depressive symptomatology/10                                 0.95**            15
DSM-IV Melancholic features/8                                       0.81**            15
ICD-10 somatic syndrome/8                                           0.85**            15
DSM-IV Manic symptomatology/8                                       0.93**            14
ICD-10 manic symptomatology/10                                      0.89**            14
ADHD Inattention (both classification systems)/9                    0.94**            11
ADHD Hyperactivity (both classification systems)/5                  0.98**            11
ADHD Impulsivity (both classification systems)/4                    0.76**            11
DSM-IV oppositional defiant disorder/8                              1**               12
ICD-10 oppositional defiant disorder/23                             0.98**            12
DSM-IV conduct disorder/15                                          0.77**            12
Social anxiety symptomatology (both systems)/3                      0.98**            12
Separation anxiety symptomatology (both classification systems)/8   0.95**            12
DSM-IV generalized anxiety                                          0.98**            11
ICD-10 generalized anxiety                                          1**               12
DSM-IV post-traumatic stress                                        0.99**            6
ICD-10 post-traumatic stress                                        0.97**            6

This initial inter-rater reliability study confirmed the quality of the translation and adaptation of the Kiddie-SADS-PL and constitutes the first step of a larger validation study of the Icelandic version of the instrument.


Journal of Personality Assessment, 93(1), 26–32, 2011
Reliability and Validity of the Spanish Version of the Minnesota
Multiphasic Personality Inventory–Adolescent (MMPI–A)


1Special Education Department, Sakhnin College for Teacher’s Education, Sakhnin, Israel
2Facultad de Psicologı́a, Universidad de Granada, Granada, Spain

The aim of this study was to determine the test–retest reliability and internal consistency of the scales of the Spanish version of the Minnesota
Multiphasic Personality Inventory–Adolescent (MMPI–A; Butcher et al., 1992). Two samples of 939 and 109 Spanish adolescents ages 14 to 18
years were assessed with the MMPI–A in their school environment. The first sample responded to the inventory once, whereas the second sample
responded to it on 2 occasions with a 2-week interval between sessions. Results showed no significant differences in means or variances between the
first and the second test administration for most MMPI–A scales. Test–retest reliability ranged between .62 (Amorality, Ma1) and .92 (Immaturity,
IMM); most correlations exceeded .70. Internal consistency values for the MMPI–A scales in the pretest and posttest were very similar overall.
External validity of the MMPI–A was demonstrated through several significant correlations between its scales and YSR/11–18 syndromes and social
interaction measures. The highest correlations were established between the Anxious/Depressed YSR/11–18 scale and other MMPI–A scales such
as Schizophrenia (Sc), Welsh’s Anxiety (A), Adolescent-Anxiety (A-anx) and Adolescent-Alienation (A-aln), and between the Social Avoidance
and Distress Scale and the MMPI–A Adolescent-Social Discomfort (A-sod) scale.

The Minnesota Multiphasic Personality Inventory–Adolescent
(MMPI–A; Butcher et al., 1992) is made up of 478 items that
assess a number of aspects of personality—up to 70 variables—
using different groups of scales: validity, clinical, content, and
supplementary scales, as well as subscales. The MMPI–A is
most frequently used in psychological, psychiatric, medical,
alcohol and drug treatment, and correctional clinical contexts.
It can be applied individually or in groups to adolescents ages
14 to 18 years.

Traditional validity scales, largely carried over from the orig-
inal MMPI (Hathaway & McKinley, 1943; Lie, L; Infrequency,
F; Infrequency 1 subscale, F1; Infrequency 2 subscale, F2; and
Defensiveness, K), help to detect deviant test-taking attitudes
and responses of adolescents. The Variable Response Inconsis-
tency (VRIN) and True Response Inconsistency (TRIN) scales
are additional validity scales that inform about the consistency
of responses to the items. With regard to the clinical scales
(Hypochondriasis, Hs; Depression, D; Hysteria, Hy; Psychopathic
Deviate, Pd; Masculinity-Femininity, Mf; Paranoia, Pa;
Psychasthenia, Pt; Schizophrenia, Sc; Hypomania, Ma; and
Social Introversion, Si), the revision from the MMPI to the
MMPI–A basically maintained the same items of the original
instrument, with the exception of Mf and Si. Six supple-
mentary scales were also included (MacAndrew Alcoholism
Scale–Revised, MAC–R; Alcohol/Drug Problem Acknowl-
edgment, ACK; Alcohol/Drug Problem Proneness, PRO;
Immaturity, IMM; Welsh’s Anxiety, A; and Repression, R),

and 15 content scales were introduced (Adolescent-Anxiety,
A-anx; Adolescent-Obsessiveness, A-obs; Adolescent-
Depression, A-dep; Adolescent-Health Concerns, A-hea;
Adolescent-Alienation, A-aln; Adolescent-Bizarre Mentation,
A-biz; Adolescent-Anger, A-ang; Adolescent-Cynicism, A-
cyn; Adolescent-Conduct Problems, A-con; Adolescent-Low
Self-Esteem, A-lse; Adolescent-Low Aspirations, A-las;
Adolescent-Social Discomfort, A-sod; Adolescent-Family
Problems, A-fam; Adolescent-School Problems, A-sch; and
Adolescent-Negative Treatment Indicators, A-trt).

The item changes of the MMPI–A were made by the steer-
ing committee responsible for the creation of the revised test
booklet to improve the content and the relevance of some items
in the experiences of adolescents’ lives. Archer and Gordon
(1994) evaluated the impact of these changes. They examined
the psychometric stability of the modified items using test–retest
correlations in a sample of 265 adolescents ages 13 to 17 years,
and found that the modified items did not lead to any relevant
changes in response patterns compared to those on the MMPI.

Correlations between the basic scales of the original MMPI
have been studied in samples of adolescent inpatients by Archer,
Ball, and Hunter (1985), Archer and Gordon (1988), Archer,
Gordon, Anderson, and Giannetti (1989), Ball, Archer, Struve,
Hunter, and Gordon (1987), and Williams and Butcher (1989).
Butcher et al. (1992) reported the reliability of MMPI–A
clinical scale scores as ranging from .65 to .84 in a normative
sample of English-speaking adolescents (45 boys and 109 girls).
These values were similar to the test–retest correlation values
for adults presented in the MMPI–2 (Minnesota Multiphasic
Personality Inventory–2; Butcher, Dahlstrom, Graham, Telle-
gen, & Kaemmer, 1989) manual. Internal consistency values
(Cronbach’s alpha) of the MMPI–A validity and clinical scales




in the normative sample of English-speaking boys and girls
were high for many scales (for example, Hs and Sc had an inter-
nal consistency of .78–.79 and .88–.89, respectively). Yet, the
coefficients obtained for other scales such as Mf and Pa in the
same English-speaking sample were relatively low or moderate
(.40–.43 and .57–.59, respectively; Butcher et al., 1992).
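
For readers less familiar with the statistic, Cronbach's alpha for a scale of k items is alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below computes it in plain Python; the function name and the small matrix of dichotomous (true/false coded as 1/0) responses are invented for illustration and are not data from Butcher et al. (1992).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of k item scores per respondent."""
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores are constant; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a four-item true/false scale.
hypothetical_responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]

print(f"alpha = {cronbach_alpha(hypothetical_responses):.2f}")
```

Alpha rises when items vary together, so that the variance of the total score is large relative to the sum of the item variances; short or heterogeneous scales therefore tend to give lower values.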

The construction of scales focused on item content, such as
the MMPI–A content scales, has received increasing acceptance
over the last 30 years (Burisch, 1984; Jackson, 1971). Such
scales have proven to be as good at describing and predicting
personality variables as those created using other methods (Hase
& Goldberg, 1967). Moreover, their homogeneity makes them
easy to interpret (Burisch, 1984). The internal consistency of
the MMPI–A content scales was acceptable, both in the norma-
tive sample (α = .55–.83) and the clinical sample (α = .63–.83),
and test–retest correlations ranged between .62 and .82 (Butcher
et al., 1992). McGrath, Pogge, and Stokes (2002) studied the in-
cremental validity of the content scale scores when added to
the clinical scales of the MMPI–A as predictors of various be-
havioral disorders in a sample of adolescents of both sexes.
They found that the content scales offered incremental validity
over the clinical scales and supported the use of the content
scales as an adjunct to the traditional clinical scales. Forbey and
Ben-Porath (2003) reported that several of the MMPI–A content
scales show significant incremental validity in predicting behavior
and personality characteristics of adolescents. The clinical
scales also demonstrated incremental validity in reference to
the content scales, indicating that the two sets of scales provide
complementary information.

Because personality traits are expected to remain stable over
time, scores from instruments aimed at measuring such traits
should also remain very stable. Stability is usually assessed by
measuring the correlation between scores obtained in the same
test at two different points in time or scores obtained from parallel
forms of the same test. To estimate the reliability coefficient
with the test–retest procedure, it is necessary to calculate the
Pearson product–moment correlation coefficient between the
two sets of scores of the same individuals on two occasions.
However, this procedure has some drawbacks: (a) Repeating
the same test twice might cause the first test to influence the
results of the second one; and (b) a short time interval between
two test sessions might increase memory effects, whereas a
long time interval might lead to changes in the participants’
level of information. For all these reasons, estimates made with
the test–retest method are more appropriate for tests that assess
traits that are not likely to be affected by the effects of practice
and remain stable over the time interval in question. Informa-
tion about the degree of stability of scores is essential in many
applied situations.
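
The test–retest computation described above can be sketched in a few lines. This is a minimal illustration, not the procedure used by any cited study: the function name pearson_r and the six pairs of raw scores are hypothetical, and the formula is the standard Pearson product–moment correlation between the scores of the same individuals on two occasions.

```python
import math

def pearson_r(first, second):
    """Pearson product-moment correlation between two paired score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired lists with at least two scores each")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both occasions")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical raw scores of six adolescents on one scale, two weeks apart
session_1 = [12, 18, 9, 22, 15, 11]
session_2 = [13, 17, 10, 21, 16, 10]
retest_r = pearson_r(session_1, session_2)
print(f"test-retest r = {retest_r:.2f}")
```

A coefficient close to 1 indicates that participants kept their relative positions across the two sessions. It does not by itself show that mean scores were unchanged.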

Beyond the test–retest correlation itself, the stability of a given
measure might be low because the amount of score change varies
widely or because internal consistency is low across scales. These
two sources of instability can be examined qualitatively by comparing
the internal consistency reliability coefficient (Cronbach’s alpha)
with the test–retest reliability coefficient. So far, no available
statistic serves as an indicator of the type of relation between the
two coefficients. If the same test is applied twice with an interval
of several weeks between sessions, several scenarios could occur:
Both the internal consistency (alpha) and the test–retest reliability
coefficient might be high, or both might be low, with alpha confirmed
in the second administration; or the former might be high and the
latter low. In this last case, the test seems to be reliable, but the
trait measured in participants has changed in that time interval.
Tests are meant to have a high alpha coefficient, showing adequate
internal consistency. They are also meant to show a high correlation
between parallel forms applied within an interval of weeks, showing
adequate repeatability and stability of measures. So far, no studies
have analyzed the relation between the test–retest reliability and
internal consistency of the MMPI–A scales in the Spanish adolescent
population.
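
For readers unfamiliar with the internal consistency coefficient discussed above, the following sketch computes Cronbach’s alpha from an item-by-respondent matrix. It is an illustration under stated assumptions: the function name cronbach_alpha and the five response patterns are hypothetical, and the formula is the usual one, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores).

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; one row per respondent, one column per item."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents
    item_var_sum = sum(pvariance(column) for column in zip(*item_scores))
    # Variance of the total (summed) score across respondents
    total_var = pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores must vary across respondents")
    return n_items / (n_items - 1) * (1 - item_var_sum / total_var)

# Hypothetical true/false (1/0) answers of five respondents to four items
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Population variances are used for both the items and the total score; using sample variances throughout would give the same ratio and therefore the same alpha.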

The Spanish adaptation of the MMPI–A was carried out
by Jiménez-Gómez and Ávila-Espada of the University of
Salamanca, Spain, between 1994 and 2002 and published in
2003 (Jiménez-Gómez & Ávila-Espada, 2003); however, the authors
did not provide any information about the reliability of the
scales. No reliability coefficients have been provided by other
studies that have used Spanish versions of the MMPI–A (Scott,
Butcher, Young, & Gomez, 2002; Scott & Mamani-Pampa,
2008). Given the scarcity of studies on the use of the MMPI–A
with Spanish adolescents, the aim of this study was to examine
the test–retest reliability and internal consistency of MMPI–A
scale scores among Spanish adolescents (see Carretero-Dios &
Pérez, 2007; Montero & León, 2007, for instrumental study
guidelines). External validity of MMPI–A scale scores was
assessed through their correlations with behavioral syndrome and
social interaction measures.



All data were obtained from Spanish students ages 14 to 18
years in various secondary schools of the province of Granada,
Spain. No nonstudent samples were used in this study, and par-
ticipants were assessed in their school environment. Of the 26
randomly selected secondary education schools whose partici-
pation was requested, only 13 agreed to participate in the study.
Thus, the sample was more a convenience sample than a random
sample. Participants were briefed in general terms about the pur-
poses of the research and told that privacy of the data collected
was guaranteed. All participants expressed their consent to par-
ticipate once the conditions of the study had been explained. The
assessment was carried out by a single examiner using standard
instructions and guidelines to answer the questionnaires.

Two nonoverlapping samples were selected: The first one included
939 adolescents (539 girls, 400 boys) with a mean age
of 15.69 years (SD = 1.27, range = 14–18); the second sample
was made up of 109 adolescents (53 girls, 56 boys) with a mean
age of 15.26 years (SD = 1.12, range = 14–17). About two
thirds (n = 630) of participants in the first sample were in their
second, third, or fourth year of compulsory secondary education
(Educación Secundaria Obligatoria); the rest were students in
the first or second year of noncompulsory secondary education
(Bachillerato; n = 174) and students following various vocational
training courses (Ciclos Formativos; n = 135) such as
cooking, hairdressing, and so on.


MMPI–A (Butcher et al., 1992). MMPI–A valid-
ity, clinical, content, and supplementary scales, and
subscales of the clinical scales, were examined in this study.


Youth Self-Report for Ages 11–18 (YSR; Achenbach &
Rescorla, 2000, 2001). The YSR assesses adolescents’ psy-
chosocial skills and problem behaviors. Verhulst, van der Ende,
and Koot (1997) provided evidence of reliability and validity
data. These authors reported Cronbach alpha values of .61 for
boys and .67 for girls in a sample of normal adolescents (15–
18 years). Higher values were found in samples of adolescent
patients: boys (.73) and girls (.70). In Spain, Lemos, Fidalgo,
Calvo, and Menéndez (1992) reported that girls scored higher
than boys on internalized behaviors, whereas boys scored higher
on externalized behaviors. Abad, Forns, Amador, and Martorell
(2000) revealed that internal consistency is higher for internal-
ized and externalized syndrome scales (range = .81–.84) than
narrowband ones (range = .56–.74).

Liebowitz Social Anxiety Scale (LSAS; Liebowitz, 1987).
This scale includes 24 items that assess performance in so-
cial situations by evaluating the degree of fear experienced and
the degree of avoidance reported by participants. Cox, Ross,
Swinson, and Direnfeld (1998) reported high internal consistency
coefficients for the social fear and social avoidance subscales
(α = .90). In the Spanish validation, Bobes et al. (1999)
obtained internal consistency coefficient values above .73 for all
the LSAS scales; the intraclass correlation coefficients obtained
in the 2-week test–retest studies featured values above .82 for
all subscales.

Social Interaction Anxiety Scale (SIAS; Mattick & Clarke,
1998). The SIAS includes 20 items that are meant to be answered
using a 5-point Likert scale. It has a high internal consistency
(α = .93) and a 1-month test–retest correlation coefficient
above .90. Ries et al. (1998) reported that the SIAS discriminates
between generalized and specific subtypes of social phobia. In
Spain, Olivares, García-López, and Hidalgo (2001) found an
internal consistency coefficient of .89 and obtained two factors
that explain 40.11% of the variance. Nevertheless, the confirmatory
factor analysis supported the single-factor model and
clustered all the items into a single factor called interaction
social anxiety.

Social Avoidance and Distress Scale (SAD; Watson &
Friend, 1969). The SAD includes 28 items, half of which
refer to subjective discomfort in social situations, and the other
half of which reflect active avoidance of such situations. This
scale has shown an internal consistency of .94 and a 1-month
test–retest reliability of .68. Hoffmann, DiBartolo, Holaway, and
Heimberg (2004) reported a Cronbach’s alpha of .93. In Spain,
the reliability of the avoidance subscale was .87, whereas that
of social anxiety was .85 (Comeche, Díaz, & Vallejo, 1995).
García-López, Olivares, Hidalgo, Beidel, and Turner (2001)
found a 10-day test–retest reliability of .85 in an adolescent
sample.

Fear of Negative Evaluation Scale (FNE; Watson &
Friend, 1969). The FNE assesses the degree of intensity with
which individuals experience fear of being negatively evaluated
by others. Watson and Friend (1969) reported an internal consistency
coefficient of .94 and a 1-month test–retest reliability of
.78. In Spain, internal consistencies of .94 and .90 were obtained
for the original and the short versions of the scale, respectively.
García-López et al. (2001) reported a 10-day test–retest reliability
of .84.


The 939 adolescents were assessed collectively in their class-
rooms in two 75-min sessions by a single examiner. The re-
maining 109 adolescents were assessed in two different ses-
sions separated by a 2-week interval. Data collection started
once consent had been obtained from the parents and teach-
ers of the adolescents and the adolescents themselves. They
were all assured of confidentiality. Each session lasted for about
60 min and the MMPI–A was administered in group testing by
the same examiner. The assessment of the sample of 939 ado-
lescents with all measures occurred over a 4-month period, and
that of the sample of 109 adolescents twice with the MMPI–A
took 2 weeks. All participants were offered the opportunity to
receive individual information about their results on the tests as
well as their psychological interpretation.


The assumptions of classical test theory were tested by verifying
whether the means and variances of the variables differed
significantly between the first and the second administration of
the MMPI–A in the sample of 109 adolescents.

Hotelling’s T² test for equality of means across the 67 variables,
with VRIN and TRIN excluded, was not significant,
F(67, 32) = 1.36, p = .171. Equality of variances was
tested by the Pitman–Morgan test with the Bonferroni correction
for the same 67 variables. No significant differences were
obtained between the variances of the two administrations.
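
The Pitman–Morgan comparison of two correlated variances can be sketched as follows. This is an illustration only, not the authors’ analysis script: the function names and the eight pairs of scores are hypothetical, the family-wise level of .05 divided across 67 tests is an assumption about how the Bonferroni correction was applied, and the p value is obtained by numerically integrating Student’s t density (a simplification swapped in so that the example needs no statistical library). The test rests on the fact that the covariance between the sums and the differences of paired scores equals the difference between the two variances.

```python
import math

def pitman_morgan_t(first, second):
    """t statistic and df (n - 2) for equality of two correlated variances."""
    n = len(first)
    sums = [x + y for x, y in zip(first, second)]
    diffs = [x - y for x, y in zip(first, second)]
    mean_s = sum(sums) / n
    mean_d = sum(diffs) / n
    # Correlation between sums and differences is zero when variances are equal
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(sums, diffs))
    var_s = sum((s - mean_s) ** 2 for s in sums)
    var_d = sum((d - mean_d) ** 2 for d in diffs)
    r = cov / math.sqrt(var_s * var_d)
    return r * math.sqrt((n - 2) / (1 - r ** 2)), n - 2

def two_sided_p(t_value, df, steps=2000):
    """Two-sided p value from Student's t density via Simpson's rule."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * area * h / 3  # probability mass in [-|t|, |t|] by symmetry
    return max(0.0, 1.0 - central)

# Hypothetical scores of eight adolescents on one scale, test and retest
first = [10, 14, 9, 20, 16, 12, 18, 11]
second = [11, 13, 10, 19, 17, 12, 17, 12]
t_stat, df = pitman_morgan_t(first, second)
p_value = two_sided_p(t_stat, df)

# Assumed Bonferroni correction: family-wise .05 split across 67 variables
bonferroni_alpha = 0.05 / 67
equal_variances_rejected = p_value < bonferroni_alpha
print(f"t({df}) = {t_stat:.2f}, p = {p_value:.3f}, "
      f"rejected = {equal_variances_rejected}")
```

With 67 comparisons, each individual test must reach p < .05/67 (about .00075) before equality of variances is rejected, which is why isolated small differences between administrations would not count as significant.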

MMPI–A test–retest reliability coefficients in the sample of
109 adolescents ranged from .62 (for the Ma1 scale) to .92 (for
the IMM scale); most correlations exceeded .70. Alpha internal
consistency values were similar in both administrations of the
test. Alpha and test–retest correlation values were similar in
most cases; low internal consistency values and high test–retest
correlations were only obtained for 18 scales (see Table 1).
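
The qualitative comparison of the two coefficients can be illustrated with a few rows of Table 1. The sketch below is hypothetical in two respects: the cutoffs of .70 for the test–retest correlation and .60 for alpha are chosen for illustration and are not the criteria used in this study, and the labels returned by classify are invented for the example. Only the coefficient values are taken from Table 1.

```python
# (scale, test-retest r, pretest alpha) copied from a few rows of Table 1
TABLE_1_ROWS = [
    ("L", 0.81, 0.45),
    ("Mf-females", 0.80, 0.24),
    ("Pt", 0.92, 0.85),
    ("Ma1", 0.62, 0.14),
    ("IMM", 0.92, 0.83),
]

# Hypothetical cutoffs for illustration only; not the authors' criteria
RETEST_CUTOFF = 0.70
ALPHA_CUTOFF = 0.60

def classify(retest_r, alpha):
    """Label one scale by combining its stability and internal consistency."""
    stable = retest_r >= RETEST_CUTOFF
    consistent = alpha >= ALPHA_CUTOFF
    if stable and consistent:
        return "stable and consistent"
    if stable:
        return "stable but low internal consistency"
    if consistent:
        return "consistent but unstable"
    return "neither stable nor consistent"

labels = {name: classify(r, a) for name, r, a in TABLE_1_ROWS}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under these assumed cutoffs, scales such as L fall into the pattern of high test–retest correlation with low alpha, whereas Pt and IMM are high on both coefficients.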

The internal consistency of both administrations of the
MMPI–A in the sample of 109 adolescents was also calculated.
As shown in Table 1, pretest and posttest alpha values in this
sample were very similar to those obtained in the initial sample
of 939 adolescents.

Analysis of external validity data for the MMPI–A involved
calculating the correlations between the basic and content scales
and the common seven syndromes of the YSR/11–18 found in
both boys and girls in the factorial study of Zubeidat, Fernández-
Parra, Salinas, and Sierra (in press) and other social anxiety mea-
sures (LSAS, SIAS, SAD, and FNE). Results are shown in Ta-
ble 2. Overall, correlations were moderate in size. The strongest
relationships were found between MMPI–A scales assessing
social introversion/discomfort and social anxiety measures,
and between MMPI–A scales measuring anxiety and depres-
sion and the Anxious/Depressed scale of the YSR/11–18 (see
Table 2).


As discussed by Archer (2005), the MMPI and the MMPI–
A have been used in the assessment of adolescents for over
60 years, leading to more than 200 studies dealing with


TABLE 1.—Test–retest reliability and internal consistency in both administrations of the Minnesota Multiphasic Personality Inventory–Adolescent.

Scale  No. of Items  Test–Retest r  Pretest Alpha  Posttest Alpha  Alpha of Initial Sample (N = 939)

Variable Response Inconsistency (VRIN) 50 .80 .43 .74 .60
True Response Inconsistency (TRIN) 24 .81 — — —
Infrequency Subscale (F1) 34 .75 .80 .84 .81
Infrequency Subscale (F2) 33 .89 .84 .85 .83
Infrequency (F) 67 .92 .89 .91 .89
Lie (L) 14 .81 .45 .46 .58
Defensiveness (K) 30 .77 .70 .73 .65
Hypochondriasis (Hs) 31 .81 .74 .75 .71
Depression (D) 57 .74 .56 .46 .57
Hysteria (Hy) 60 .71 .53 .57 .60
Psychopathic Deviate (Pd) 48 .85 .64 .58 .54
Masculinity/Femininity-Males (Mf) 44 .74 .36 .32 .27
Masculinity/Femininity-Females (Mf) 44 .80 .24 .09 .20
Masculinity-Femininity (Mf) 44 .86 — — —
Paranoia (Pa) 40 .77 .54 .55 .62
Psychasthenia (Pt) 48 .92 .85 .85 .84
Schizophrenia (Sc) 77 .90 .89 .89 .87
Hypomania (Ma) 46 .82 .69 .64 .61
Social Introversion (Si) 62 .88 .80 .77 .70
Subjective Depression (D1) 29 .76 .67 .60 .64
Psychomotor Retardation (D2) 14 .66 .21 .18 .18
Physical Malfunctioning (D3) 11 .68 .11 .34 .34
Mental Dullness (D4) 15 .79 .63 .55 .55
Brooding (D5) 10 .71 .57 .48 .62
Denial of Social Anxiety (Hy1) 6 .75 .57 .58 .56
Need for Affection (Hy2) 11 .78 .54 .63 .35
Lassitude-Malaise (Hy3) 15 .83 .61 .59 .60
Somatic Complaints (Hy4) 17 .78 .69 .67 .58
Inhibition of Aggression (Hy5) 7 .68 .15 .20 .36
Familial Discord (Pd1) 9 .80 .65 .60 .45
Authority Problems (Pd2) 8 .65 .20 .34 .22
Social Imperturbability (Pd3) 6 .83 .58 .56 .42
Social Alienation (Pd4) 12 .78 .45 .39 .50
Self Alienation (Pd5) 12 .80 .62 .61 .54
Persecutory Ideas (Pa1) 17 .78 .74 .75 .69
Poignancy (Pa2) 9 .77 .34 .36 .44
Naivete (Pa3) 9 .79 .40 .53 .49
Social Alienation (Sc1) 21 .80 .65 .72 .68
Emotional Alienation (Sc2) 11 .83 .52 .43 .56
Lack of Ego Mastery-Cognitive (Sc3) 10 .84 .68 .60 .60
Lack of Ego Mastery-Conative (Sc4) 14 .80 .54 .58 .56
Lack of Ego Mastery-Defective Inhibition (Sc5) 11 .81 .61 .63 .60
Bizarre Sensory Experiences (Sc6) 20 .82 .77 .75 .70
Amorality (Ma1) 6 .62 .14 .16 .18
Psychomotor Acceleration (Ma2) 11 .79 .49 .49 .40
Imperturbability (Ma3) 8 .76 .40 .52 .12
Ego Inflation (Ma4) 9 .74 .49 .45 .43
Shyness/Self-Consciousness (Si1) 14 .86 .75 .74 .66
Social Avoidance (Si2) 8 .82 .61 .61 .55
Alienation-Self and Others (Si3) 17 .82 .74 .69 .70
MacAndrew Alcoholism Scale-Revised (MAC-R) 49 .78 .38 .54 .46
Alcohol/Drug Problem Acknowledgment (ACK) 13 .88 .65 .66 .62
Alcohol/Drug Problem Proneness (PRO) 36 .85 .50 .45 .48
Immaturity (IMM) 43 .92 .83 .81 .75
Welsh’s Anxiety (A) 35 .90 .84 .81 .84
Repression (R) 33 .81 .51 .53 .61
Adolescent-Anxiety (A-anx) 21 .90 .75 .74 .67
Adolescent-Obsessiveness (A-obs) 15 .82 .63 .61 .67
Adolescent-Depression (A-dep) 26 .86 .78 .76 .78
Adolescent-Health Concerns (A-hea) 37 .86 .84 .83 .77
Adolescent-Alienation (A-aln) 20 .88 .71 .75 .70
Adolescent-Bizarre Mentation (A-biz) 19 .87 .81 .82 .74
Adolescent-Anger (A-ang) 17 .85 .71 .71 .66
Adolescent-Cynicism (A-cyn) 22 .86 .74 .78 .69
Adolescent Conduct Problems (A-con) 23 .89 .68 .74 .70
Adolescent-Low Self-Esteem (A-lse) 18 .86 .66 .63 .65
Adolescent-Low Aspirations (A-las) 16 .85 .65 .59 .44
Adolescent-Social Discomfort (A-sod) 24 .91 .77 .76 .71
Adolescent-Family Problems (A-fam) 34 .90 .86 .84 .78
Adolescent-School Problems (A-sch) 20 .83 .66 .66 .63
Adolescent-Negative Treatment Indicators (A-trt) 26 .86 .75 .77 .70


Infrequency Subscale (F1) .14** .29** .08* −.02 .23** −.10** .16** .01 .09** .16** .02
Infrequency Subscale (F2) .18** .22** .10** .04 .25** −.02 .15** .13** .16** .23** .08*
Infrequency (F) .17** .28** .10** .01 .27** −.06 .17** .08* .14** .22** .06
Lie (L) −.28** −.27** −.32** −.18** −.27** −.21** −.27** −.20** −.12** −.06 −.10**
Defensiveness (K) −.43** −.32** −.41** −.33** −.38** −.29** −.33** −.32** −.28** −.26** −.25**
Hypochondriasis (Hs) .26** .21** .17** .28** .23** −.01 .12** .14** .20** .29** .20**
Depression (D) .29** .01 .01 .05 .06 .06 −.08* .17** .32** .34** .31**
Hysteria (Hy) .11** .06 .01 .10** .04 −.16** −.05 −.04 .01 .08* .06
Psychopathic Deviate (Pd) .38** .38** .28** .14** .33** −.00 .24** .11** .13** .20** .15**
Masculinity-Femininity (Mf) .20** −.05 .12** .13** .00 .03 −.09* .06 .11** .01 .21**
Paranoia (Pa) .28** .23** .15** .11** .24** −.05 .13** .08* .17** .23** .21**
Psychasthenia (Pt) .54** .36** .36** .35** .43** .28** .31** .38** .39** .41** .36**
Schizophrenia (Sc) .37** .37** .27** .21** .41** .09** .30** .21** .25** .33** .19**
Hypomania (Ma) .17** .37** .26** .16** .33** −.03 .31** .04 −.01 −.01 −.02
Social Introversion (Si) .38** .15** .14** .15** .17** .35** .13** .43** .54** .60** .41**
Welsh’s Anxiety (A) .55** .33** .36** .37** .41** .31** .31** .42** .42** .40** .39**
Repression (R) −.19** −.30** −.28** −.21** −.31** −.13** −.32** −.14** .01 .03 −.01
Adolescent-Anxiety (A-anx) .49** .35** .36** .37** .41** .21** .29** .32** .30** .34** .33**
Adolescent-Obsessiveness (A-obs) .41** .31** .36** .33** .36** .28** .32** .33** .32** .30** .28**
Adolescent-Depression (A-dep) .55** .31** .29** .24** .36** .19** .22** .28** .33** .34** .31**
Adolescent-Health Concerns (A-hea) .19** .18** .13** .23** .23** −.06 .12** .10** .15** .26** .18**
Adolescent-Alienation (A-aln) .39** .29** .19** .11** .31** .12** .19** .21** .32** .36** .19**
Adolescent-Bizarre Mentation (A-biz) .26** .34** .22** .19** .38** .02 .29** .08* .11** .19** .06
Adolescent-Anger (A-ang) .34** .44** .49** .27** .36** .18** .36** .22** .16** .20** .16**
Adolescent-Cynicism (A-cyn) .31** .28** .27** .21** .32** .15** .28** .22** .12** .14** .12**
Adolescent Conduct Problems (A-con) .11** .43** .22** .10** .27** .01 .32** .06 .03 .12** −.05
Adolescent-Low Self-Esteem (A-lse) .44** .24** .20** .19** .26** .22** .20** .35** .40** .40** .33**
Adolescent-Low Aspirations (A-las) .13** .17** .12** .03 .10** .03 .10** .07* .15** .19** .07*
Adolescent-Social Discomfort (A-sod) .19** .09** −.01 .00 .06 .23** .04 .30** .46** .53** .23**
Adolescent-Family Problems (A-fam) .28** .36** .27** .09** .34** .02 .26** .09** .13** .17** .08*
Adolescent-School Problems (A-sch) .20** .44** .23** .12** .27** .06 .22** .11** .09** .14** .04
Adolescent-Negative Treatment Indicators (A-trt) .38** .30** .24** .17** .32** .23** .25** .34** .34** .37** .21**

adolescent samples. Such studies have made important con-
tributions not only to the study of the psychometric charac-
teristics of these instruments, but also to our understanding of
the development and psychopathology of adolescents. However,
the characteristics of these tools have not been sufficiently
studied in certain adolescent populations. For example,
Perfect (2005) pointed out that there are still limited empirical
data to support the clinical use of the MMPI–A in
samples of abused adolescents. Similarly, no studies so far
have explored the test–retest reliability and the internal consistency
of the scales of the MMPI–A in the Spanish adolescent
population.

The design of studies with the objectives just mentioned often
involves working with two samples: a large reference sample whose
participants are assessed in one session, and a second, smaller
sample that is assessed twice with the same test, in this study
with a 2-week interval between administrations. To
estimate the reliability coefficient with the test–retest method,
the test should measure traits that are not likely to be affected by
the effects of practice and remain stable over the time interval
in question. Thus, it is necessary to verify the stability of the
score distribution beforehand. Results of this study showed no
overall differences between the means of the first and second
administration of the MMPI–A scales; results of the multivariate
difference of means test were clearly not significant. Likewise,
no significant differences were found in the variances of both
administrations of the MMPI–A.

With regard to test–retest reliability, correlations between
both administrations of the MMPI–A scales were high and were
considered satisfactory: they ranged between .62 and .92, and
most exceeded .70. These results are
similar to those found by Pérez y Farías, Durán, and Gómez-Maqueo
(2003) in a sample of 1,056 Mexican adolescents,
where test–retest correlations were statistically significant with
values ranging between .36 and .90. They also agree with those
obtained by Butcher et al. (1992), which ranged between .47
and .84 in a sample of American adolescents. Pérez y Farías
et al. (2003) concluded that the clinical, validity, content, and
supplementary scales of the MMPI–A were stable in their sample
of adolescents, as did other studies (Aharoni, 1999; Ampudia,
Duran, & Lucio, 1995; Gomez, Johnson, Davis, & Velazquez,
2000; Hammel, 2001; Mendoza-Newman, 2000; Sirigati, 2000)
and this one. Moreover, alpha values were very similar
across the pretest and posttest. Along these lines, Carlson
and Hofstra (2001) carried out a study in which 80 adolescents
between the ages of 14 and 18 completed a computerized
and a written version of the MMPI–A (with a 1-week interval
between both counterbalanced sessions); results showed that
the test–retest correlation coefficients of the clinical, content,
and supplementary scales were statistically significant and
compared favorably with the reliability data provided in the
MMPI–A manual. Results from Stein, McClinton, and Graham’s
(1998) study examining the long-term stability of the
clinical, content, and supplementary scales and the MMPI–A
personality psychopathology scales in a sample of 61 adolescents
were also consistent with the findings of MMPI–A score
reliability reported earlier. Moreover, improvements made during
the development of the MMPI–A have led to moderate increases
in the stability of the clinical scales in adolescents.

Findings on internal consistency values and test–retest correlation
showed that both coefficients were similar, and could
therefore be viewed as jointly providing evidence of sufficient
consistency as well as stability in responses to the items. In
fact, Vinet and Alarcón (2003) studied a sample of 705 Chilean
adolescents and reported that the MMPI–A is a stable and con-
sistent measure, with similar reliability levels (stability and in-
ternal consistency) to those obtained in studies carried out in
other countries. Low internal consistency and high test–retest
correlations were only obtained in a few scales of this study,
demonstrating the overall stability of responses to the items.
However, the inadequate internal consistency of these scales
shows that they need to be reviewed. Problems related to lack of
internal consistency cannot be attributed to sample size, because
values in the smaller sample of 109 adolescents were very sim-
ilar to those obtained in the sample of 939 adolescents. We also
studied the external validity of the clinical and content MMPI–A
scales based on their correlation with other variables related to
adolescents’ behavioral problems and social discomfort. Overall,
most of these correlations were significant and moderate
in size, and in the expected direction. In general, MMPI–A
scales that assess internalized problems showed higher corre-
lations with YSR/11–18 scales that assess such problems than
with those assessing externalized problems; conversely, scales
that assess externalized problems showed higher correlations with
YSR/11–18 scales related to externalized problems rather than
internalized ones. These results are similar to those found in
other MMPI–A studies with clinical samples. Indeed, Butcher
et al. (1992) reported similar correlations between the clinical
scales and the Child Behavior Checklist in the MMPI–A manual.
More recently, Veltri et al. (2009) obtained similar results with
psychiatric and forensic samples by correlating the MMPI–A
scales with criterion variables using a standardized Record Re-
view Form; the study by Stokes, Pogge, Sarnicola, and McGrath
(2009) in an adolescent inpatient psychiatric sample showed a
similar trend.

Nevertheless, the highest correlations were only found be-
tween MMPI–A social interaction scales and social anxi-
ety variables and between MMPI–A anxiety and depression
scales and the anxious/depressed syndrome. Likewise, Mennin,
Heimberg, and Jack (2000) reported that individuals with patho-
logical levels of social anxiety show high scores of social anxiety
and avoidance, general anxiety, cognitive symptoms of anxiety,
and depressive mood. Also, Heimberg et al. (1999) found that
total anxiety and avoidance scores highly correlated with total
fear and total avoidance (.90 for both); these authors reported
that the correlations found between total anxiety and avoidance
and other measures of anxiety tend to be higher than those found
between the former and measures of depression, as happened
in this study. Along these lines, Zubeidat, Salinas, and Sierra
(2008) reported that total anxiety and avoidance scores showed
higher correlations with measures of social anxiety (e.g., SIAS,
FNE, and SAD) than with variables related to MMPI–A depression
scales (e.g., Depression-D and Depression-DEP). Overall,
the strongest support in this study is found for MMPI–A scales
related to depression, anxiety, and social discomfort.


This study does not include a sample of adolescents not at-
tending school, which limits the possibilities of generalizing the
results beyond the population of adolescent students. In Spain,
education is compulsory until the age of 16, and there are very
few adolescents not attending school or marginalized in the
age range between 16 and 18 years. Also, participants in this
study do not represent a strictly random sample because they
were selected by convenience sampling. The results therefore
cannot be considered representative of the Spanish adolescent
population, and further study in clinical adolescent samples is
certainly needed. A final drawback is the fact that the stability of
the measures of the MMPI–A was only studied with a test–retest
analysis of two points in time.


Abad, J., Forns, M., Amador, J. A., & Martorell, B. (2000). Fiabilidad y validez
del Youth-Self Report en una muestra de adolescentes [Youth Self-Report
reliability and validity in an adolescent sample]. Psicothema, 12, 49–54.

Achenbach, T. M., & Rescorla, L. A. (2000). Mental health practitioners’ guide for
the Achenbach System of Empirically Based Assessment (ASEBA). Burling-
ton: University of Vermont Department of Psychiatry.

Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA School-Age
Forms & Profiles. Burlington: University of Vermont, Research Center
for Children, Youth & Families.

Aharoni, D. (1999). The effectiveness of the MMPI–A in the assessment of
adolescent substance abuse (Minnesota Multiphasic Personality Inventory).
Dissertation Abstracts International: Section B: The Sciences and Engineer-
ing, 60(6B), 2932.

Ampudia, A., Duran, C., & Lucio, E. (1995). Confiabilidad de las Escalas
Suplementarias del MMPI–2 en población mejicana [Reliability of the MMPI–2
Supplementary Scales in Mexican population]. Revista Iberoamericana de
Diagnóstico y Evaluación Psicológica, 2, 25–49.

Archer, R. P. (2005). Implications of MMPI/MMPI–A findings for understanding
adolescent development and psychopathology. Journal of Personality
Assessment, 85, 257–270.

Archer, R. P., Ball, J. D., & Hunter, J. A. (1985). MMPI characteristics of
borderline psychopathology in adolescent inpatients. Journal of Personality
Assessment, 49, 47–55.

Archer, R. P., & Gordon, R. A. (1988). MMPI and Rorschach indices of
schizophrenic and depressive diagnoses among adolescent inpatients. Journal
of Personality Assessment, 52, 276–287.

Archer, R. P., & Gordon, R. A. (1994). Psychometric stability of MMPI–A item
modifications. Journal of Personality Assessment, 62, 416–426.

Archer, R. P., Gordon, R. A., Anderson, G. L., & Giannetti, R. (1989). MMPI
special scale clinical correlates for adolescent inpatients. Journal of Person-
ality Assessment, 53, 654–664.

Ball, J. D., Archer, R. P., Struve, F. A., Hunter, J. A., & Gordon, R. A. (1987).
MMPI correlates of a controversial EEG pattern among adolescent psychiatric
patients. Journal of Clinical Psychology, 43, 708–714.

Bobes, J., Badía, X., Luque, A., García, M., González, M. P., & Dal-Ré, R.
(1999). Validación de las versiones en español de los cuestionarios Liebowitz
Social Anxiety Scale, Social Anxiety and Distress Scale y Sheenan Disability
Inventory para la evaluación de la fobia social [Validation of the Spanish
versions of the questionnaires Liebowitz Social Anxiety Scale, Social Anxiety
and Distress Scale, and Sheenan Disability Inventory for assessment of social
phobia]. Medicina Clinica, 112, 530–538.

Burisch, M. (1984). Approaches of personality inventory construction. Ameri-
can Psychologist, 39, 214–227.


Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B.
(1989). Minnesota Multiphasic Personality Inventory–2 (MMPI–2): Manual
for administration and scoring. Minneapolis: University of Minnesota Press.

Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A.,
Ben-Porath, Y. S., & Kaemmer, B. (1992). MMPI–A, Minnesota Multiphasic
Personality Inventory-Adolescent. Minneapolis: University of Minnesota
Press.

Carlson, D. A., & Hofstra, U. (2001). Computerized vs. written administration
of the MMPI–A in clinical and non-clinical settings. Dissertation Abstracts
International: Section B: The Sciences and Engineering, 62(2-B), 1130.

Carretero-Dios, H., & Pérez, C. (2007). Standards for the development and
review of instrumental studies: Considerations about test selection in psycho-
logical research. International Journal of Clinical and Health Psychology, 7,

Comeche, M. I., Díaz, M. I., & Vallejo, M. A. (1995). Cuestionarios, inventarios
y escalas [Questionnaires, inventories and scales]. Madrid, Spain: Fundación

Cox, B. J., Ross, L., Swinson, R. P., & Direnfeld, D. M. (1998). A comparison
of social phobia outcome measures in cognitive-behavioral group therapy.
Behavior Modification, 22, 285–297.

Forbey, J. D., & Ben-Porath, Y. S. (2003). Incremental validity of the MMPI–A
content scales in a residential treatment facility. Assessment, 10, 191–202.

García-López, L. J., Olivares, J., Hidalgo, M. D., Beidel, D. C., & Turner,
S. M. (2001). Psychometric properties of the Social Phobia and Anxiety
Inventory, the Social Anxiety Scale for Adolescents, the Fear of Negative
Evaluation Scale and the Social Avoidance Distress Scale in an adolescent
Spanish-speaking population. Journal of Psychopathology and Behavioral
Assessment, 23, 51–59.

Gomez, F., Johnson, R., Davis, Q., & Velazquez, R. (2000). MMPI–A performance
of African American adolescent first-time offenders. Psychological
Reports, 87, 309–314.

Hammel, S. (2001). An investigation of the validity and clinical usefulness
of the MMPI–A with female juvenile delinquents. Dissertation Abstracts
International: Section B: The Sciences and Engineering, 61(11B).

Hase, H. D., & Goldberg, L. R. (1967). Comparative validity of different strate-
gies of constructing personality inventory scales. Psychological Bulletin, 67,

Hathaway, S. R., & McKinley, J. C. (1943). The Minnesota Multiphasic Per-
sonality Inventory (rev. ed.). Minneapolis: University of Minnesota Press.

Heimberg, R. G., Horner, K. J., Juster, H. R., Safren, S. A., Brown, E.
J., Schneier, F. R., & Liebowitz, M. R. (1999). Psychometric properties
of the Liebowitz Social Anxiety Scale. Psychological Medicine, 29, 199–

Hoffmann, S. G., DiBartolo, P. M., Holaway, R. M., & Heimberg, R. G. (2004).
Scoring error of Social Avoidance and Distress Scale and its psychometric
implications. Depression and Anxiety, 19, 197–198.

Jackson, D. N. (1971). The dynamics of structured personality tests: 1971.
Psychological Review, 78, 229–248.

Jiménez-Gómez, F., & Ávila-Espada, A. (2003). MMPI–A, Inventario
Multifásico de Personalidad de Minnesota para Adolescentes [MMPI–A,
Minnesota Multiphasic Personality Inventory–Adolescent]. Madrid, Spain:
Ediciones TEA.

Lemos, S. G., Fidalgo, A. M., Calvo, P., & Menéndez, P. (1992). Estructura factorial
de la prueba YSR y su utilidad en psicopatología infanto-juvenil [Factor
structure of the YSR and its use in child and adolescent psychopathology].
Análisis y Modificación de Conducta, 18, 883–905.

Liebowitz, M. R. (1987). Social phobia. Modern Problems in Pharmacopsychi-
atry, 22, 141–173.

Mattick, R. P., & Clarke, J. C. (1998). Development and validation of measures of
social phobia scrutiny fear and social interaction anxiety. Behaviour Research
and Therapy, 36, 455–470.

McGrath, R. E., Pogge, D. L., & Stokes, J. M. (2002). Incremental validity
of selected MMPI–A content scales in an inpatient setting. Psychological
Assessment, 14, 401–409.

Mendoza-Newman, M. (2000). Level of acculturation, sociodemographic sta-
tus, and the MMPI–A performance of a non-clinical Hispanic adolescent
sample. Dissertation Abstracts International: Section B: The Sciences and
Engineering, 60(9B), 4897.

Mennin, D. S., Heimberg, R. G., & Jack, M. S. (2000). Comorbid generalized
anxiety disorder in primary social phobia: Symptom severity, functional im-
pairment, and treatment response. Journal of Anxiety Disorders, 14, 325–343.

Montero, I., & León, O. G. (2007). A guide for naming research studies in
psychology. International Journal of Clinical and Health Psychology, 7,

Olivares, J., García-López, L. J., & Hidalgo, M. D. (2001). The Social Phobia
Scale and the Social Interaction Anxiety Scale: Factor structure and reliability
in a Spanish-speaking population. Journal of Psychoeducational Assessment,
19, 69–80.

Pérez y Farías, J. M., Durán, C., & Gómez-Maqueo, E. L. (2003). Un estudio
sobre la estabilidad temporal del MMPI–A con un diseño test–retest en estu-
diantes Mejicanos [A study about the temporal stability of the MMPI–A by
a test-retest design on Mexican students]. Salud Mental, 26, 59–66.

Perfect, M. M. (2005). Incremental validity of the Minnesota Multiphasic Per-
sonality Inventory (MMPI–A) and Rorschach Inkblot Test in predicting the
number and severity of adolescents’ maltreatment histories. Dissertation Ab-
stracts International Section A: Humanities and Social Sciences, 65(8-A),

Ries, B. J., McNeil, D. W., Boone, M. L., Turk, C. L., Carter, L. E., &
Heimberg, R. G. (1998). Assessment of contemporary social phobia verbal
report instruments. Behaviour Research and Therapy, 36, 983–994.

Scott, R. L., Butcher, J. N., Young, T. L., & Gomez, N. (2002). The Hispanic
MMPI–A across five countries. Journal of Clinical Psychology, 58, 407–417.

Scott, R. L., & Mamani-Pampa, W. (2008). MMPI–A for Peru: Adaptation and
normalization. International Journal of Clinical and Health Psychology, 8,

Sirigati, S. (2000). Verso un adattamento italiano del Minnesota Multiphasic
Personality Inventory–Adolescent [Toward an Italian adaptation of the Minnesota
Multiphasic Personality Inventory–Adolescent]. Bolletino Psicología
Applicata, 230, 67–72.

Stein, L. A. R., McClinton, B. K., & Graham, J. R. (1998). Long-term stability
of MMPI–A scales. Journal of Personality Assessment, 70, 103–108.

Stokes, J., Pogge, D., Sarnicola, J., & McGrath, R. (2009). Correlates of the
MMPI–A psychopathology five (PSY-5) facet scales in an adolescent inpatient
sample. Journal of Personality Assessment, 91, 48–57.

Veltri, C. O. C., Graham, J. R., Sellbom, M., Ben-Porath, Y. S., Forbey, J.
D., O’Connell, C., . . . White, R. S. (2009). Correlates of MMPI–A scales in
acute psychiatric and forensic samples. Journal of Personality Assessment,
91, 288–300.

Verhulst, F. C., van der Ende, J. Y., & Koot, H. M. (1997). Handleiding voor
de Youth Self-Report (YSR) [Manual for the Youth Self-Report]. Rotterdam,
The Netherlands: Erasmus University/Sophia Children’s Hospital.

Vinet, E. V., & Alarcón, B. P. (2003). Evaluación Psicométrica del Inventario
Multifásico de Personalidad de Minnesota para Adolescentes (MMPI–A) en
muestras chilenas [Psychometric assessment of the Minnesota Multiphasic
Personality Inventory–Adolescent (MMPI–A) in Chilean samples]. Terapia
Psicológica, 21, 87–103.

Watson, D., & Friend, R. (1969). Measurement of social evaluative anxiety.
Journal of Consulting and Clinical Psychology, 33, 448–457.

Williams, C. L., & Butcher, J. N. (1989). An MMPI study of adolescents:
II. Verification and limitations of codetype classifications. Psychological
Assessment: A Journal of Consulting and Clinical Psychology, 1, 251–

Zubeidat, I., Fernández-Parra, A., Salinas, J. M., & Sierra, J. C. (in press).
Factorial analysis of the Youth Self-Report for Ages 11–18 (YSR) in Spanish
adolescent sample. Al-Nibras, Israel.

Zubeidat, I., Salinas, J. M., & Sierra, J. C. (2008). Exploration of the psycho-
metric characteristics of the Liebowitz Social Anxiety Scale in a Spanish
adolescent sample. Depression and Anxiety, 25, 977–987.

