March 26, 2013 – RI Future

March 26, 2013 (updated March 27, 2013)

Supreme Court Considers Marriage Equality Debate

What a day for the LGBT community!

The Supreme Court heard challenges to California’s Proposition 8 today, and tomorrow it will hear arguments against the Defense of Marriage Act (DOMA). (The audio and transcript are available, if you have time to check them out!)

Let’s hope that Chief Justice John Roberts kept his gay cousin in mind during the Prop 8 argument, as she was in attendance with her partner, in seats reserved for guests of the justices.

Four Democratic senators reversed their stance on DOMA in the past several days, but unfortunately, not everyone sees the significance of this civil rights issue or has had a change of heart. There are many who refuse to even call it a “civil rights” issue. What happened at the RI State House last week played out once more near the Supreme Court today, at an anti-gay marriage rally, when African-American pastor Rev. Bill Owens said, “I marched in this same location years ago. They are trying to say they are suffering the same thing we suffered. They are not. … Not even close.”

Engaging in “Oppression Olympics” serves absolutely no one, and I am grateful to see public displays of solidarity in all communities across the country, and a million examples of Love for every hateful word spoken.

March 26, 2013 (updated March 27, 2013)

2 Takes On Disability: Ken Block, ‘This American Life’

When the House Committee on Small Business takes up temporary disability insurance today, it will hear from gubernatorial candidate-turned-foodstamp-fraud investigator Ken Block, who will likely say something like, “Ultimately, our TDI program should resemble most other state’s plans in terms of cost and utilization,” as he tweeted last night.

But a must-listen “This American Life” episode this weekend makes a case for why Rhode Island’s disability insurance program should be unique. (Ed. note: this is the kind of journalism RI Future would like to do when we have more resources to play with!)

“It’s confusing, I have back pain,” says Planet Money reporter, Chana Joffe-Walt. “My editor has a herniated disk and works harder than anyone I know.”

Her piece is a very revealing look at how those who use their brains to earn a living cannot conceive of the difficulties of relying on your brawn to stay employed long term. I know this well. I worked as a farm hand in my 20s and am tremendously grateful that my East Greenwich education allowed me to transition into the white-collar economy.

But her story also says that disability is being used as a “quiet de facto welfare program.” (Something tells me Doreen Costa and Fox “News” are outraged!)

But economic realities and right wing outrage porn are so often mutually exclusive of one another.

“…if your alternative is a minimum wage job that will pay you $16,000 a year … that probably won’t be full time and very likely won’t include health insurance, disability may be a better option,” says Joffe-Walt.

As in the rural areas of the South she reports on, many formerly middle-class Rhode Islanders are on disability because a life of manual labor has left their bodies battered, and a globalized economy has taken away their job security.

Just like with Block’s efforts to root out food stamp fraud, his effort to reform disability might be well-intentioned. But well-intentioned isn’t the same as economically prudent. If it were, government would be easy!

Before we make our TDI program look like other states’, as Block suggests, we should investigate whether our situation is unique. After all, our economy is different from other places’, so adopting ideas simply because other states are using them will likely be a bad idea for our real-life economy AND the made-up CNBC rankings.

Editor’s note: It should have been made clearer that Block is testifying about temporary disability insurance and the This American Life story is about long-term disability. The headline has been corrected.

March 26, 2013

Psychometrics R Us

A few days ago, I wrote about the NECAP test, and the statistical goals of its designers. Since then, I’ve been called “not a psychometrician” on the radio, among other things. I hear that Monday I was insulted on John DePetro’s show, too. So I thought I’d provide accounts of what a couple of psychometricians have had to say about what I wrote.

First we’ll hear from Charles DePascale. He works in New Hampshire, for the Center for Assessment (nciea.org), and is apparently the consultant to the Rhode Island Department of Education (RIDE) on all matters NECAP.

He wrote up a critique, and RIDE has been sharing it with reporters. They wouldn’t share it with me, though the department spokesguy, Elliot Krieger, told me they’d “consider” any open records request I made for public documents. But fortunately, reporters seem to be more interested in the free flow of information, and you can see the document here. (Elisabeth Harrison of RI Public Radio writes about it here.) It is unsigned in the document body, presumably since DePascale doesn’t speak for the department, according to Krieger, who does speak for them.

The document, whoever wrote it, makes three main points:

1. The NECAP is not a norm-referenced test, so the number of kids who flunk is a function of their abilities and instruction, not a function of the test design.
2. The statistical significance of the results means that you can be confident that a student will not be mistakenly flunked.
3. Performance on the 11th-grade reading test is what you’d expect for a graduation test, therefore the math test, designed the same way, is also fine.

Also, I said that only 9 out of 22 questions (40%) on the 11th grade math test were answered correctly by more than half the students, but in a direct blow to the central premise of my argument, DePascale says I have it all wrong: it was actually 19 out of 46 (41%). I dragged myself to the ropes, a beaten man, devastated by the force of his argument… well, never mind all that.

To the first point, he is exactly right. And here, we will descend into some jargon, but please follow me, because it’s important. The NECAP is, indeed, what test designers call a “criterion-referenced” test. A student’s score on the test is referenced to a standard, not to the other test-takers. The SAT, for example, is a “norm-referenced” test, where a student is graded on their performance relative to other students. On a norm-referenced test, a fixed percentage of test-takers will flunk, almost by definition.
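For readers who like to see the distinction concretely, here is a minimal sketch in Python. Everything in it is made up for illustration: the two classes of scores, the cutoff of 40, and the 30 percent failure rate are hypothetical and do not come from the NECAP or any real test.

```python
# Illustrative only: scores, cutoff and failure rate are invented,
# not taken from the NECAP or any other real test.

def criterion_failures(scores, cutoff):
    """Criterion-referenced: fail everyone below a fixed standard."""
    return sum(1 for s in scores if s < cutoff)


def norm_failures(scores, fail_percent):
    """Norm-referenced: fail the bottom fail_percent of test takers."""
    return len(scores) * fail_percent // 100


weak_class = [20, 25, 30, 35, 38, 41, 45, 50, 55, 60]
strong_class = [s + 40 for s in weak_class]  # same spread, everyone 40 points higher

for name, scores in (("weak", weak_class), ("strong", strong_class)):
    print(name,
          "criterion:", criterion_failures(scores, cutoff=40),
          "norm:", norm_failures(scores, fail_percent=30))
```

Against the fixed standard, failures drop from five to zero when every student improves by 40 points. Under the ranking rule, three students fail in both classes, because the rule looks only at position in the pack.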

The NECAP is not that, and I never meant to imply that it was. I’m afraid I did use the word “certain” to describe the number of students who flunk the NECAP in one summary sentence, and that was a poor choice of words that I tried to clarify here. It is still perfectly sound advice that if you want to rank performance on a test, you do what you can to spread out the performers. This is not a point of advanced psychometrics, this is a point of basic statistical analysis, even common sense. The NECAP test designers put their test together to maximize the spread between students, for all the statistical reasons I wrote about. They do so in the questions they choose, not in the grading, as a norm-referenced test would do, and the care with which they analyze the per-question results demonstrates how careful they are.
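One way to see why question choice alone can spread students out: for a question scored right or wrong, the variance of the score is p(1 - p), where p is the share of students who answer correctly. That is a textbook property of a yes/no variable, and it peaks when about half the students get the question right. The sketch below only illustrates that property. It is not a description of how the NECAP designers actually pick questions, and the values of p are arbitrary.

```python
# Textbook variance of a right/wrong question; the p values are arbitrary
# examples, not NECAP item statistics.

def item_variance(p_correct):
    """Variance of a 0/1 score when a share p_correct of students answer correctly."""
    return p_correct * (1 - p_correct)


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"share correct {p:.1f} -> variance {item_variance(p):.2f}")
```

Questions that nearly everyone gets right, or nearly everyone gets wrong, contribute little to telling students apart. Questions near the 50 percent mark contribute the most.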

Obviously, if you’re grading against an absolute standard, it is conceivable for all test takers to ace it, and DePascale makes that point. But the NECAP test designers have done what they can to make that highly unlikely, for perfectly valid statistical reasons, and that makes it a bad graduation test. That’s what I meant, and I stand by it, largely because I still haven’t seen anyone convincingly state otherwise.

With regard to the second point, DePascale includes a substantial discussion of whether the margin of error on the NECAP means that a student could be flunked accidentally, and claims that the chance is less than 1%, for a student barely above the threshold, after repeated testing. It’s not perfectly clear to me what point I made that this is supposed to contradict. On the contrary, this actually strengthens my contention that the test was designed to make sure that the scores were statistically sound, that a student who scores in the 40th percentile belongs there.
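To make the shape of that kind of claim concrete, here is a rough sketch of how repeated testing shrinks the chance of an accidental failure. It is not DePascale’s calculation. It assumes measurement error is normally distributed and that each attempt is independent, and the true score, cutoff, standard error, and number of attempts are hypothetical numbers chosen only for illustration.

```python
import math

# Hypothetical sketch: normal measurement error, independent attempts.
# None of the numbers below are actual NECAP figures.

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_fail_once(true_score, cutoff, sem):
    """Chance one observed score lands below the cutoff, given the true score."""
    return normal_cdf((cutoff - true_score) / sem)


def prob_fail_every_time(true_score, cutoff, sem, attempts):
    """Chance of landing below the cutoff on every one of several attempts."""
    return prob_fail_once(true_score, cutoff, sem) ** attempts


# A student whose true score sits one point above a cutoff of 40,
# with an assumed standard error of 3 points.
for attempts in (1, 2, 3, 6):
    p = prob_fail_every_time(true_score=41, cutoff=40, sem=3, attempts=attempts)
    print(f"{attempts} attempt(s): chance of failing every time = {p:.3f}")
```

Under these assumptions the chance of failing every attempt falls quickly as attempts are added, which is the general mechanism behind a claim like “less than 1% after repeated testing.” The actual figure depends on the real standard error and retake rules.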

To make his third point, DePascale shows the distribution of test scores for the 11th grade reading and math tests, shown below.

His main goal in showing these graphs seems to be to claim that, since the 11th grade reading test looks reasonably close to the curve you’d expect for a good graduation test, the 11th grade math test is fair. He makes the same point in other parts of the document.

There are a few things to say about this curve. It does show a lump of students above the passing grade, and the distribution does appear similar to the results of a test one might design to be a graduation test. However, the fat tail of the reading test distribution is not just a detail when it comes to judging a test’s suitability as a graduation test. It might not be anything important, but you can’t just assume that. Leave that aside, though; let’s just note that it’s a funny kind of defense of one test to say that another one is just fine. I might accuse you of being a criminal. To have you reply that you have law-abiding friends isn’t much of a defense, is it?

So what is the distribution of scores for the math test? Here it is:

This is a highly skewed result. It’s certainly easy to rank the successful students in this test, since they are spread over the map. But this is a very peculiar distribution for test results that have weight in students’ lives. It’s not at all the distribution you’d expect to see of students, from the big bump at the left to the nearly linear descent as you go to the right.

What’s even more remarkable than the distribution itself is to think that some testing professional — some psychometrician — once looked at that distribution and thought, “Wow, kids really don’t know their math, do they,” and not “Wow, are we sure this test is doing what we think it’s doing?” But if there was ever any such self-doubt, there is no record of it.

And that brings me to the question of validity — how do you know a test is a good one? — and the other psychometrician I met over the weekend. More on that meeting in my next post.

p.s. While you’re waiting for that post, consider throwing a buck to the Providence Student Union. They are the ones who catapulted the issue of the NECAP graduation test onto the state’s front burner with their “take the test” event. Please help me support their great work, click here for details.