Given the interest among readers of my blog in how to evaluate the new required 9th grade science course, and what seems to be general support for conducting such an evaluation, I'm posting a memo that Steve Rivkin (School Committee Candidate) and I submitted last fall to the Superintendent, copied to the Chairs of both the Amherst and Regional School Committees. When this memo, which describes how to conduct such an evaluation, was submitted, we offered to meet with the superintendent, high school principal, and/or high school science teachers. I also personally offered to have an Amherst College thesis student conduct the evaluation under my supervision. We have yet to receive a response from anyone regarding such an evaluation.
To: Alton Sprague
From: Steve Rivkin, Ph.D., and Catherine Sanderson, Ph.D.
Cc: Michael Hussin, Andy Churchill
Re: Review of 9th Grade Science
Date: October 15, 2008
This is a brief note on issues relevant to the review of 9th grade science, which we hope is helpful. We both conduct program evaluations in our work, and our CVs are attached. We would both be glad to answer questions or offer suggestions about conducting such an evaluation at any time, so please be in touch if and when that would be helpful.
Review of 9th grade Science Course Change
A comprehensive and informative evaluation of the change in 9th grade science requires the development of a valid empirical model, which includes consideration of both short-term and long-term outcomes and the collection of the requisite data. In particular, it is important to produce separate estimates of the effect of the change on students who would have been in honors biology, on students who would have been in honors earth science, and on students who would have been in college prep earth science. In addition, it is important to obtain estimates for specific sub-groups, such as students of color, lower income students, and females. Such an evaluation will help the School Committee, Superintendents, and high school administrators and teachers understand the impact of the new required 9th grade science course on all students (and it is certainly possible that the new course will have a different impact on different students).
We briefly describe an ideal but not feasible evaluation framework as a kind of Holy Grail for evaluation in which all changes in student outcomes can be attributed directly to the change in 9th grade science. We then turn to a feasible approach given the available information and discuss some of the problems that must be addressed given that different kids experienced the two 9th grade science courses at different times. Finally, we describe the data needed for a successful evaluation.
Ideal evaluations: In the ideal evaluation, students would attend high school and go on to post-secondary activities first under the old 9th grade science curriculum and then under the new 9th grade science curriculum. Everything else would be identical, so that any differences in student outcomes (potentially including satisfaction with 9th grade science, MCAS performance, number of science courses taken, number of AP science courses taken in various subjects, colleges attended, college majors, performance in science, and future occupation and earnings) could be directly attributed to the change in 9th grade science.
Of course, people only go through high school once, so this “ideal” is not feasible. An alternative, and more feasible, approach that is commonly used in research to make such comparisons would be to randomly assign 9th graders to take either the old or the new science curriculum. Differences in outcomes between the two groups would provide a valid estimate of the effect of changing the 9th grade science curriculum if the randomization is done well. However, in this case such randomization is clearly not appropriate, as the high school would have had to offer two sets of science courses, and parental and student efforts to get into the course of their choosing would have compromised the experiment. Most important, this was not done.
Feasible evaluation: The feasible alternative is to compare the cohort of students who attended 9th grade just prior to the change in science curriculum to a cohort of students who attended 9th grade just following the change. Although this approach is similar to the ideal framework, there are a number of potential impediments to a successful evaluation. These include cohort differences in student characteristics (such as interest in science, math background, etc.) as well as other changes (such as in teachers, curriculum, school policies, etc.). One should also acknowledge that the final year under the old curriculum could be less interesting and less well organized than the previous years, as teachers understood it would be the final year. Moreover, the first year under the new program might have some glitches, though the enthusiasm for the new program might be unusually strong in its first year.
In terms of the implementation of this evaluation, some of these potential impediments cannot be directly addressed, while others can be with appropriate steps. On the one hand, changes in school policies, personnel, general interest in science, and other factors can be noted and considered but not directly incorporated in the analysis. For example, it would be important to know whether the different science courses influence the number of students who leave the school either as dropouts or school transfers (e.g., do some students who would have taken honors biology opt out of the high school for private school once it is no longer offered?). On the other hand, differences in student preparation and characteristics can be addressed with information on middle school performance, particularly in eighth grade mathematics, and student demographics.
There are a number of different methods that can be used to estimate the effects of the new curriculum on different groups of students.
Option #1: The state of the art is to use the method of propensity score matching to essentially “match” each student in the pre-change cohort to a student in the post-change cohort on the basis of middle school academic performance, family income, gender, race, ethnicity, and other relevant factors. This method provides a way to correct for cohort differences along a number of dimensions. (This is the approach that Catherine is now using to examine the effectiveness of the Pipeline Program.)
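The memo does not specify an implementation, but for readers curious about the mechanics, here is a rough sketch of the matching idea. Everything in it is illustrative: the toy data, the function names, the single covariate (an 8th grade math score), the deliberately minimal logistic regression, and the one-to-one nearest-neighbor matching are all simplifications of what a real propensity score analysis would use.

```python
import math

def _standardize(X):
    """Center and scale each covariate column so gradient descent is stable."""
    d = len(X[0])
    means = [sum(row[j] for row in X) / len(X) for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / len(X)
        stds.append(math.sqrt(var) or 1.0)  # guard against a constant column
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]

def _sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid math.exp overflow
    return 1.0 / (1.0 + math.exp(-z))

def _fit_logistic(X, y, lr=0.5, steps=2000):
    """Tiny logistic regression by gradient descent; w[0] is the intercept."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = _sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def matched_effect(pre_cohort, post_cohort):
    """pre_cohort/post_cohort: lists of (covariates, outcome) pairs.
    Fits a propensity model (probability of being in the post-change cohort),
    matches each post-change student to the pre-change student with the
    nearest propensity score, and returns the mean matched-pair difference."""
    X = [c for c, _ in pre_cohort] + [c for c, _ in post_cohort]
    y = [0.0] * len(pre_cohort) + [1.0] * len(post_cohort)
    Xs = _standardize(X)
    w = _fit_logistic(Xs, y)
    scores = [_sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
              for xi in Xs]
    pre_scored = list(zip(scores[:len(pre_cohort)],
                          [out for _, out in pre_cohort]))
    diffs = []
    for s, (_, out) in zip(scores[len(pre_cohort):], post_cohort):
        _, matched_out = min(pre_scored, key=lambda p: abs(p[0] - s))
        diffs.append(out - matched_out)
    return sum(diffs) / len(diffs)

# Toy example: outcomes track the math score, plus a shift for the post cohort.
pre = [([60.0], 60.0), ([70.0], 70.0), ([80.0], 80.0), ([90.0], 90.0)]
post = [([62.0], 67.0), ([71.0], 76.0), ([79.0], 84.0), ([91.0], 96.0)]
print(matched_effect(pre, post))  # recovers an effect in the rough vicinity of +5
```

In practice one would match on all of the covariates the memo lists (middle school performance, family income, gender, race, ethnicity) and use an established statistics package rather than hand-rolled code; this sketch is only meant to show why matching corrects for cohort differences along observed dimensions.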
Option #2: An alternative and less technically demanding approach is to classify all students in the post-change cohort by which class they would have taken under the old system. Then students who actually took college prep earth science can be compared with those who would have taken the course, students who actually took honors earth science can be compared with those who would have been in the course, and students who actually took honors biology can be compared with students who would have been in the course. Although some students will be wrongly assigned to a particular group (because we don’t actually know what course they would have taken), one can make a pretty good guess about which course they would have taken based on 8th grade math preparation (given that only students with 8th grade algebra were eligible for the honors biology class).
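As a toy illustration of this classification approach: the algebra eligibility rule below comes from the memo, but the 3.0 grade cutoff separating the two earth science tracks is a made-up placeholder (the district's actual placement rule would be substituted), and all names and data are invented for the sketch.

```python
def likely_old_course(took_algebra_8, math_grade_8):
    """Guess which old-system course a post-change student would have taken.
    Only students with 8th grade algebra were eligible for honors biology
    (per the memo); the 3.0 cutoff below is a placeholder, not a real rule."""
    if took_algebra_8:
        return "honors biology"
    if math_grade_8 >= 3.0:  # placeholder threshold for the honors track
        return "honors earth science"
    return "college prep earth science"

def group_means(students):
    """students: list of (took_algebra_8, math_grade_8, outcome) tuples.
    Returns the mean outcome for each would-have-taken group, so each group
    can be compared with the students who actually took that course."""
    groups = {}
    for took_alg, grade, outcome in students:
        groups.setdefault(likely_old_course(took_alg, grade), []).append(outcome)
    return {course: sum(v) / len(v) for course, v in groups.items()}

# Toy post-change cohort: (8th grade algebra?, 8th grade math grade, outcome)
students = [(True, 3.8, 90.0), (True, 3.5, 88.0),
            (False, 3.2, 80.0), (False, 2.0, 70.0)]
print(group_means(students))
```

The resulting group means for the post-change cohort would then be set against the corresponding outcomes of the pre-change students who actually took each course, which is exactly the comparison the paragraph above describes.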
Either of these methods clearly requires information on middle school academic performance and student demographics in order to mitigate the effects of cohort differences and to allocate students in the post-change group into the various courses they would have taken under the old system. It is our understanding that this type of information on individuals was not collected in the initial round of data collection (with the exception of gender and race). This complicates the evaluation and does rule out certain comparisons that ideally one could have made. However, it is certainly possible to collect information on middle school transcripts (including math class taken in 8th grade), high school transcripts (including 9th grade science course taken) and demographic characteristics for the pre-cohort students that can be used to both create the post-change groups and collect longer-term outcome data for the pre-change cohort.
Data needed: The initial short-term outcomes were measured based on a survey of 9th grade science students (which includes interest in science and future intentions to take science). We agree that these are important questions, and that students should be surveyed on such measures again this year to evaluate the qualitative outcomes of the different science courses. It is also important to collect data on longer-term, and quantitative, outcomes of such courses, and to build such plans into the overall evaluation model. These outcomes should include scores on 10th grade Science MCAS (biology, chemistry), number (and type) of science courses taken, and achievement on standardized tests (e.g., SAT IIs, APs). Of course ideally longer-term outcomes (such as college attended, proficiency in college science courses, and occupation) would be measured, but such outcomes are probably not feasible, nor can such data be collected in a timely enough way for decisions about the current science program to be made.
My Goal in Blogging
I started this blog in May of 2008, shortly after my election to the School Committee, because I believed it was very important both to provide the community with an opportunity to share their thoughts with me about our schools and to provide me with an opportunity to ask questions and share my thoughts and reasoning. I have found the conversation generated on my blog to be extremely helpful to me in learning community views on many issues. I appreciate the many people who have taken the time to share their views. I believe it is critical to the quality of our public schools to have a public discussion of our community priorities, concerns and aspirations.