Charles Ungerleider,
Professor Emeritus, The University of British Columbia
[permission to
reproduce granted if authorship is acknowledged]
There is little doubt that results from the Programme for International Student Assessment (PISA) get attention. If you enter the search term “PISA results” under the news heading in your browser, as I just did, it will return more than 67,000 references in about a quarter of a second.
a second. Every three years the results are celebrated by politicians in
jurisdictions that are ‘winners,’ like Canada, and loathed by those presiding
over education in jurisdictions that are ‘losers.’
PISA is the name given to the assessments administered to 15-year-olds in reading, mathematics, and science. The assessments are administered in more than 35 countries and in more than a dozen partners, including countries such as Brazil and economic regions such as Shanghai, Hong Kong, and Macau. It is convenient for PISA to use the term ‘economies’ to refer to all participating entities.
Each administration of PISA assesses
all three domains (reading, mathematics, and science), but gives prominence to
one of the three domains in each cycle. In 2000, 2009 and 2018, the principal
assessment domain was reading. In 2003 and 2012, it was mathematics. In 2006
and 2015, science was the focal domain. In 2021, the focus will be on mathematics again, with an additional test in creative thinking. And in 2024, PISA will
measure “Learning in the Digital World,” the ability of students to use
self-regulated learning while they employ digital tools.
PISA derives its support from the countries and economies that participate in the assessments. To sustain itself, PISA must retain the support of previous participants, but it also tries to encourage new participation. PISA combines the assessment of the three domains with the assessment of abilities in other areas: financial literacy, creativity, and digital learning, for example.
It is doubtful whether PISA would
earn repeat business if there were significant differences from round to round in
the major domains. “If you want to measure change, do not change the measures”
– at least not too much. Thus, while the assessments do change from one round to the next, the folks who analyze the results perform a variety of statistical operations to assure participating jurisdictions of the equivalence of the assessments. They also place the results on a common scale, centered at 500 with a standard deviation of 100 score points.
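To make the arithmetic concrete, here is a minimal sketch of that kind of rescaling, assuming a simple linear z-score transformation applied to hypothetical raw proficiency estimates. The actual PISA scale was fixed in a base year and rests on item response theory models and equating procedures, so this shows only the basic idea, not PISA’s method.

```python
import statistics

def rescale(scores, target_mean=500.0, target_sd=100.0):
    """Linearly map scores onto a reporting scale with the given
    mean and standard deviation (a z-score transformation)."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [target_mean + target_sd * (x - mean) / sd for x in scores]

raw = [12.0, 15.5, 9.0, 21.0, 18.5]      # hypothetical raw proficiency estimates
print([round(s) for s in rescale(raw)])  # the same scores, now centered on 500
```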
Large-scale assessments are helpful
in determining if school systems are producing better student outcomes and
reducing educational inequalities among groups of students over time. The
jurisdictions that have such mechanisms are advantaged. Ontario, for example,
has an Education Quality and Accountability Office (EQAO). A body that has some
independence from the provincial Ministry of Education, EQAO conducts province-wide, census-type, large-scale assessments in reading, writing, and mathematics at the primary and junior divisions; applied and academic mathematics at Grade 9; and the Ontario Secondary School Literacy Test (OSSLT), administered at Grade 10. But education systems that do not have their own
census-type large-scale assessments are at a disadvantage and thus must rely on
external benchmarks against which they can measure their progress over time.
Because of my interest in improving
student outcomes and reducing educational inequalities, I have been an observer
of PISA since it began at the turn of this century. I am primarily interested
in how jurisdictions interpret the results and what use, if any, they make of
them to improve outcomes.
Earlier in this blog I used the
terms ‘winners’ and ‘losers.’ I did that because the leadership in most jurisdictions
treats PISA like a horse race. “Who won?” “Who lost?” “How well or badly did we
do?” There are some significant challenges to making use of the results.
Those who are responsible for PISA
want PISA to attract attention and earn support. But they caution that PISA
results are not a measure of the impact of schooling per se, but a
cumulative measure of the prior experiences that the 15-year-olds have had and the many factors that influence those experiences, such as poverty and parental education. They also caution that PISA is not aligned with the curricula of the many countries and economies that participate.
Notwithstanding these significant
limitations, I have spent quite a bit of time reading and consulting with
colleagues about the decline in PISA scores over time. The decline has occurred
in all three major domains (reading, mathematics, and science) on an
international level and within Canada. I have represented that decline in the
chart below devoted to mathematics. I chose mathematics because it is an area
about which there has been much hand-wringing. It includes the data for all the Canadian provinces, Canada, and the OECD average (excluding the partner economies because they are not countries).
I am baffled, as are my colleagues, many of whom have international reputations in measurement, statistics, and education. We are not certain what the decline should be attributed to or how significant it is. There is no shortage of hypotheses.
With repeated measurements of the same phenomenon, there is a tendency for scores at the extreme ends (high or low) of the distribution to be followed by ones that are closer to the mean, a phenomenon known as regression toward the mean. The trend in the PISA data may reflect such a tendency (the simulation sketched after this paragraph illustrates the effect). Another hypothesis is that over the nearly 20 years of PISA assessments, students have come to spend more time on computers and less time reading print material, and the kind of reading they do has changed, if not in kind then in degree. According to this hypothesis, students
devote less mental effort to reading and the loss of mental effort is reflected
in all subjects. Still another hypothesis is that the effort to retain students who would have dropped out of school has paid off, but that the students retained are less able and thus ‘dilute’ performance over time. Yet another
hypothesis is that successive generations of students have become desensitized
to large-scale assessments, attributing less importance to them, and expending
less effort on them than in the past. I could go on, but I won’t.
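Here is a toy simulation of the first hypothesis, regression toward the mean. All of the numbers are invented for illustration; the point is only that when measurements are noisy, a group selected for extreme scores in one round will, on average, score closer to the mean in the next round even if nothing real has changed.

```python
import random

random.seed(1)

# Each student has a stable 'true' ability; each round adds measurement noise.
true_ability = [random.gauss(500, 80) for _ in range(100_000)]
round_1 = [t + random.gauss(0, 60) for t in true_ability]
round_2 = [t + random.gauss(0, 60) for t in true_ability]

# Select the top decile on round 1 and see how the same students do on round 2.
cutoff = sorted(round_1)[int(0.9 * len(round_1))]
selected = [(r1, r2) for r1, r2 in zip(round_1, round_2) if r1 >= cutoff]

mean_1 = sum(r1 for r1, _ in selected) / len(selected)
mean_2 = sum(r2 for _, r2 in selected) / len(selected)
print(f"top decile, round 1: {mean_1:.1f}")  # far above 500
print(f"same group, round 2: {mean_2:.1f}")  # pulled back toward 500
```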
Do these tentative explanations deserve examination? Each would make a worthy dissertation topic.
But, if a jurisdiction has a robust system of
large-scale assessments upon which it can rely for examining change over time, it
would be more productive to focus on the data produced by those systems than to
depend upon PISA. This is particularly true if the large-scale assessments are
closely linked with the jurisdiction’s educational goals and curricula; allow
for assessment at regular intervals throughout students’ educational careers; and
are amenable to close analysis of the relationship between factors over which
schools have control and the outcomes measured.
Jurisdictions that do not have robust
systems for large-scale assessment and do not have the resources to develop
them will be dependent upon assessments such as PISA. For them, understanding why
PISA scores are declining is a necessary prelude to understanding the results
their students obtain. Jurisdictions that
depend on PISA alone for an external measure of system performance would also be wise to invest in some oversampling so they can track the performance of subpopulations; the sketch below shows why.
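A back-of-the-envelope calculation shows the motivation for oversampling. Every figure here is hypothetical: a national sample of 5,000 students, a subpopulation making up 4 per cent of students, and a score standard deviation of 100 points, in line with a PISA-like scale.

```python
import math

def se_of_mean(sd: float, n: int) -> float:
    """Standard error of a subgroup mean under simple random sampling."""
    return sd / math.sqrt(n)

sd = 100.0               # score standard deviation on a PISA-like scale
national_sample = 5000   # hypothetical national sample size
share = 0.04             # hypothetical subpopulation share of students

default_n = int(share * national_sample)  # ~200 cases without oversampling
for n in (default_n, 1000):               # 1,000 = a deliberately boosted sample
    print(f"n = {n:4d}  ->  standard error of subgroup mean = {se_of_mean(sd, n):4.1f} points")
```

Without oversampling, the subgroup yields roughly 200 cases and a standard error of about seven score points, so a ten-point decline between cycles could easily be noise; boosting the subgroup to 1,000 cases cuts the standard error to about three points and makes such changes detectable.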