Bob Shepherd, a frequent contributor to the blog, is an education polymath. He has authored textbooks, written assessments, developed curriculum, and was most recently a classroom teacher in Florida. He has a long history in the education industry.
He explains here why standardized testing today is neither valid nor reliable.
He begins:
The dirty secret of the standardized testing industry is the breathtakingly low quality of the tests themselves. I worked in the educational publishing industry at very high levels for more than twenty years. I have produced materials for all the major textbook publishers and most of the standardized test publishers, and I know from experience that quality control processes in the standardized testing industry have dropped to such low levels that the tests, these days, are typically extraordinarily sloppy and neither reliable nor valid. They typically have not been subjected to anything like the validation and standardization procedures used, in the past, with intelligence tests, the Iowa Test of Basic Skills, and so on. The mathematics tests are marginally better than are the tests in ELA, US History, and Science, but they are not great. The tests in English Language Arts are truly appalling…
The Common Core tests, he says, are especially useless.
They are almost entirely content free. They don’t assess what students ought to know. Instead, they test, supposedly, a lot of abstract “skills”–the stuff on the Gates/Coleman Common [sic] Core [sic] bullet list, but as we shall see below, they don’t even do that.
Open the link and read on. This is a very important exposé by an expert.
Thank you, Bob. Many of the AP US History test questions were, um, questionable, if memory serves. Did you know, for example, that one of the causes of the Vietnam War was NOT US economic interests?
I appreciate all your research.
“This process took over five years in order to produce a so-called valid test.”
So-called is the right descriptor as standardized tests have been shown, proven beyond a doubt, to be invalid.
Noel Wilson, and others before him, thoroughly destroyed the onto-epistemological foundations (the conceptual foundations) upon which the standards and testing malpractice regime is based. To understand why that regime is invalid, read and comprehend his seminal 1997 dissertation “Educational Standards and the Problem of Error”, found at: https://epaa.asu.edu/ojs/article/viewFile/577/700
That’s a reply to retired teacher.
Note: After you download the Wilson PDF,
click Edit, select: Advanced Search,
enter: standardized, click: search.
Wilson reveals the main
“Dirty Secret”, UNCHANGED
by adding STANDARDIZED to
the testing lingo.
Tests have been shown,
proven beyond a doubt,
to be invalid.
yes: a lot of work done for this view. Thanks
I once worked with a professor who created a standardized ESL test. The process of validation is an expensive, grueling ordeal. It involves giving the test to multiple groups of differing populations, reviewing and analyzing the results, tossing out invalid items and going through the process again and again. Then, the final product is sent to the psychometricians and statisticians who create the technical manual. This process took over five years in order to produce a so-called valid test.
State tests do not go through this process. There have been numerous reports of errors on state-administered tests. The most outrageous part of the state tests is the pass/fail criteria. The cut scores are subjective, and they can be manipulated in many ways based on politics, bias and any other subjective criteria. Proficiency is the term applied to those who make the cut, but proficiency has no meaning in the statistical world of facts and figures. Proficiency is a whimsical term decided by those creating the test. A validated standardized test will produce percentiles in which a student’s performance is compared to the rest of the cohort. A student at the 88th percentile scored better than roughly 87% of the students taking the test and worse than roughly 12% of the students who took the same test. State tests provide no such information.
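The percentile comparison in the preceding paragraph can be made concrete with a small sketch. This is only an illustration: the function name `percentile_breakdown` and the sample cohort are hypothetical, and the strict-comparison convention used here is one common choice, not any particular test publisher's method.

```python
def percentile_breakdown(score, cohort_scores):
    """Return the percent of the cohort scoring below, tied with, and above `score`."""
    n = len(cohort_scores)
    if n == 0:
        raise ValueError("cohort_scores must not be empty")
    # Count strictly lower and strictly higher scores; the rest are ties.
    below = sum(1 for s in cohort_scores if s < score)
    above = sum(1 for s in cohort_scores if s > score)
    tied = n - below - above
    return {
        "below": 100.0 * below / n,
        "tied": 100.0 * tied / n,
        "above": 100.0 * above / n,
    }

# Hypothetical cohort of 100 distinct raw scores: 1, 2, ..., 100.
cohort = list(range(1, 101))
result = percentile_breakdown(88, cohort)
```

In this made-up cohort, the student with a raw score of 88 is above 87% of test takers, tied with 1% (the student's own score), and below 12%. The point is that a percentile is a statement about position within a cohort, which is a different kind of information from a pass/fail cut score.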
Reblogged this on Crazy Normal – the Classroom Exposé and commented:
The dirty secrets of the very profitable private-sector testing industry: SHOCKINGLY low-quality tests that do not test what OUR children in OUR public schools learn from their teachers. And Private Charter Schools taking money away from Public schools “don’t have to follow the same regulations from states, municipalities and school districts as traditional public schools.”
From Bob Shepherd’s writing: “All students taking these tests and all teachers administering them have to sign forms stating that they will not reveal anything about the test items, and the items are no longer released, later, for public scrutiny, and so there is no check whatsoever on the test makers.” It seems this would be a concern that parents would want school board members to explore. Can testing companies really require students to sign a form stating that they can’t discuss test items even after the test???
Parents should also be concerned about the collection and privacy of all this data especially after FERPA was weakened.
EPIC recently gave testimony on “Big Data: Privacy Risks and Needed Reforms in the Public and Private Sectors.” In their testimony they discussed: (1) the problems we face today due to the failure of policymakers in the United States to create adequate data protection standards; (2) the current state of privacy law in the U.S.; and (3) what a comprehensive approach to data protection and privacy in the United States should look like. EPIC recommends that Congress enact a privacy law that: (1) limits the collection and use of personal data; (2) prohibits discriminatory uses of data; (3) requires algorithmic fairness and accountability; (4) bans manipulative design and unfair marketing practices; (5) limits government access to personal data; (6) provides for a private right of action; (7) preserves states’ rights to enact stronger provisions; and (8) establishes a federal data protection agency to enforce these new rules. https://epic.org/documents/hearing-on-big-data-privacy-risks-and-needed-reforms-in-the-public-and-private-sectors/
Parents can use their voices to advocate for student data and privacy to be better protected.
Now that many states are turning to embedded testing as part of online instruction, there are renewed concerns about student privacy. Students’ data become a commodity over which they have no control. This should be a major issue for parents and their children’s privacy.
AlwaysLearning… Don’t you remember when PARCC hit the scene and Pearson had personnel scouring the internet/social media to find students who were posting about the test or discussing the test questions? Pearson actually tracked down those students and reported them to their schools or districts. It was all over the news that year. FERPA has definitely been eroded and needs to be rewritten to allow for all the computer data being generated by online programs. No way that Facebook/Meta should have access to student data through its funding of Panorama Education, e.g.
Thank you, Always Learning. I tried to hit the main reasons for getting out from under the standardized-testing-based Deformer occupation of our schools, but this one of student data privacy is also really important. NB that Gates tried, with InBloom, to create a single, national database of educational outcomes–grades, test scores, etc.–that would operate throughout K-12, college, and people’s careers. It would have been, in effect, the national gradebook, and any producer of educational or training materials would have had to be accepted as a partner by InBloom. This would have in effect made Gates the gatekeeper of all instruction in the U.S.
Talk about Orwellian!!!!
Here, a warning about that:
The New York State Education Dept. does release a portion of the test items from both the math and ELA (Common Core aka NextGen – Grades 3 to 8) tests. Here is a link to the 2019 exams for your perusal:
https://www.engageny.org/resource/released-2019-3-8-ela-and-mathematics-state-test-questions
NYSED has full transparency with the federally mandated Science Exams; here is a link to ALL 19 years of test items from the cumulative (Grades 5 to 8), Intermediate Level Science exams:
https://nysedregents.org/Grade8/Science/home.html
Note that NYS is scheduled to shift to the new NGSS standards (CC spawn from Achieve) in yet another push away from objective content knowledge in favor of soft, scientific thinking skills.
Here is the format of a Score Report (Grade 8) that a parent receives:
Click to access elascorereport-19eng.pdf
Here is the disclaimer found on the score report:
Dear Parent/Guardian of Jane,
This report summarizes Jane’s performance on the New York State Testing Program English Language Arts Assessment, administered in the spring of 2019. The test score provides one way to understand student performance; however, this score does not tell the whole story about what Jane knows and can do. The results from the Grade 3-8 ELA and Mathematics Tests are being provided for diagnostic purposes and will not be included in Jane’s official transcript or permanent record.
” . . . diagnostic purposes”? Ha!
Teachers can ‘diagnose’ without this ‘help’. That’s why small classes are important.
If you live in NYS, ask any math or ELA teacher (Grades 4 to 9) exactly when they receive these scores and exactly how they are used to diagnose specific academic deficiencies for the purposes of support and improvement. At best, Level 2 students are placed into a general AIS class in lieu of a special area elective.
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
Wilson identifies four “frames of reference”, each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think college professor who “knows” the students’ capabilities and grades them accordingly), the General Frame (think standardized testing that claims to have a “scientific” basis), the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen), and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words all the logical errors involved in the process render any conclusions invalid.
The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error”, any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words it attempts to measure “‘something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self-evident consequences. And so the circle is complete.”
In other words students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
I know, I know, we have to do everything we can to oppose Republicans who want to do away with all government except law and border enforcement, but let’s face it. President Biden is a worthless puppet of unhinged billionaires. He said he would end standardized testing. He might as well have said he was going to learn how to weld. There was no intention to tell the truth. Just dishonesty. All BS, billionaire sh-t. What has he done for public schools? Nothing. President Fail.
The ESSA must be reauthorized with no testing mandate. The stupidity, the insanity, the BS must end.
(A certain commenter is likely to respond with an unreadably long reply defensively lashing out at the suggestion a Democrat is less than perfectly magnanimous, with a rehashing of some primary election that took place long ago. Don’t care. President Joe Fail has me trying to keep my school alive with invalid test scores over which I have no control. President Fail.)
The teachers’ unions could end the federal standardized testing mandate by taking this to the streets. Until they do that, they are complicit in child abuse. I do not mean that as metaphor. I mean it quite literally.
Yes, it is child abuse.
Why?
“To the extent that these categorisations are accurate or valid at an individual level, these decisions may be both ethically acceptable to the decision makers, and rationally and emotionally acceptable to the test takers and their advocates. They accept the judgments of their society regarding their mental or emotional capabilities. But to the extent that such categorisations are invalid, they must be deemed unacceptable to all concerned.
Further, to the extent that this invalidity is hidden or denied, they are all involved in a culture of symbolic violence. This is violence related to the meaning of the categorisation event where, firstly, the real source of violation, the state or educational institution that controls the meanings of the categorisations, are disguised, and the authority appears to come from another source, in this case from professional opinion backed by scientific research. If you do not believe this, then consider that no matter how high the status of an educator, his voice is unheard unless he belongs to the relevant institution.
And finally a symbolically violent event is one in which what is manifestly unjust is asserted to be fair and just. In the case of testing, where massive errors and thus miscategorisations are suppressed, scores and categorisations are given with no hint of their large invalidity components. It is significant that in the chapter on Rights and responsibilities of test users, considerable attention is given to the responsibility of the test taker not to cheat. Fair enough. But where is the balancing responsibility of the test user not to cheat, not to pretend that a test event has accuracy vastly exceeding technical or social reality? Indeed where is the indication to the test taker of any inaccuracy at all, except possibly arithmetic additions?”
A Little Less than Valid: An Essay Review
http://edrev.asu.edu/index.php/ER/article/view/1372/43
YES!
LeftCoastTeacher,
AOC won’t even come out against the SHSAT, a standardized admission test for NYC public high schools.
Is she also a worthless puppet of unhinged billionaires?
Many Democrats and progressives are less than perfect on education. Are they all being controlled by billionaires if they have a difference of opinion?
I don’t agree with many of the “progressive” democrats who support “public charter schools” and won’t come out strongly against charters. In fact, I often defended Bill de Blasio because while Bernie Sanders was still praising “good public charters”, de Blasio was telling the truth about them.
But even when I disagreed with the position that progressive democrats have on issues of standardized tests or charters or a myriad of other issues, I am not unhinged enough to ignore all of their other positions on other issues and make sweeping attacks about how those progressives are worthless puppets of billionaires.
I would hope that public school teachers could understand the difference.
Which progressives have signed on and are loudly amplifying the need to reauthorize ESSA with no testing mandate? And I don’t mean some phrase in some written platform no one reads. Which progressives are speaking out loudly about this, LCT? Because I would like to support them.
Yep.
When you are around tests, you see the tests and the ways their questions look for student responses. I have been there. Take my word. There is rarely a question written that is not subject to some interpretation the writer of the question never anticipated. Questions are rarely even a window into student understanding. The tests students take today are worse than intellectually bankrupt; they are criminal.
Always great to read something by Bob.
Puts a bit of wind in my sails on a cold, Sunday night
Very kind of you, John. I feel the same way about your work. Warm regards to you and yours.
Aaaand the GOOD news is that “The Too Young to Test” bill just passed in the Illinois Senate! (Now, on to the House!) The bad news is, of course, that some damn (no, not Some DAM, a fine gentleman & a scholar, as my dad {o.b.m.} would say) fools in Illinois actually WANT to test kids in K-2. (oh, WE KNOW who THEY are…getting some $$$$$ under the table from Pear$$$on much?). For all of you who live in Illinois, call your State Reps NOW & tell them to vote YES on HB 5285, & to sign on as a sponsor.
Thanks to everyone who Witness Slipped, worked on this bill & special thanks to IL Families for Public Schools (Cassie Cresswell, Exec. Director, is an Education Hero!).
excellent
Congratulations!
As the Huntsville City Schools began to ramp up the use of standardized tests around 2012, student results in the state tests dropped precipitously over a period of 6 years. This is not the exception. Districts across the country have pursued this educational malpractice of colluding testing instruments that not only fail to match up with state tests but confuse curricular focus. Then this happened: When NCLB passed states fell over themselves adopting instruments and normative practices that introduced tests that gave little true indication of student academic progress. Once the media caught on, companies such as Pearson then developed tests that placed “on grade level” well above the grade level being tested. Voila, more “failing schools” creating grounds for defunding public education. They called it “rigor”. When the media reports on the struggles of the public schools from district to district or state to state, they blindly, lazily, or both, refer to test results that do not relate to one another. In the long run, standardized testing has told us nothing about student academic progress and has amplified the voices of those who don’t want public schools.
Overall good comment Paul. One exception, though, at least from my experience here in the Show Me State.
“Then this happened: When NCLB passed states fell over themselves adopting instruments and normative practices that introduced tests that gave little true indication of student academic progress.”
Missouri and many others took the position of let’s wait and see and delayed as much as possible to implement “those federal mandates”, especially Common Core, mainly because “we ain’t gonna have the feds telling us what to do.” But time and a new Dept of Ed Secretary, one Arne Duncan, raised the monetary stakes with Race to the Top which then forced the states to comply if they wanted the federal monies.
I think I know what you mean by “normative practices” but am not sure. If you would, please explain what you mean by that phrase, as I can think of a number of different definitions for it.