By the Company It Keeps: Smarter Balanced

Our second installment of the testing-consortia series is a conversation with Smarter Balanced. Formed and federally funded in 2010, the consortium boasts an expert staff and set of advisory committees. Its members include the nation’s largest state, one of the first Race to the Top winners, and a number of states attempting to advance nation-leading reforms.

After my ominous prediction about the consortia’s fates, Smarter Balanced quickly responded in private. Their counter was both courteous and forceful. I was impressed by the initial case they made, and I’m very glad that they swiftly agreed to participate in this public Q&A.

Could you please briefly describe the process (including the challenges) of creating “next-generation” assessments aligned with new standards via a multi-state consortium?

The process Smarter Balanced is using is very similar to the processes that states have been using for over a decade to create assessments for NCLB accountability. Using a widely regarded conceptual approach called Evidence-Centered Design, and working in partnership with an array of private sector companies, work groups comprising assessment leadership from Smarter Balanced states have developed the various components necessary for a next-generation assessment system. Among the major elements are:

IT architecture and open-source software to deliver, score and report on assessments
Content and item specifications and a test blueprint to govern the content and format of the assessment
Accommodations and accessibility features and policies
Achievement-level descriptors, college content-readiness policy and plans for standard-setting
Reporting system design
Validity testing and psychometric research

This work has benefited immensely from the pooled expertise of state assessment professionals; K12 teachers, higher education faculty and other academic content experts; and staff from a diverse array of private sector firms. While the process that is being used to develop the Smarter Balanced assessment system would be familiar to anyone who has ever built a test, what is unique about Smarter Balanced is the bringing together of a large and diverse array of talent committed to making each element of the system “best in breed.”

What is Smarter Balanced most proud of?

We are proud that we have simultaneously been able to meet three ambitious goals:

1. We have hit all the major project milestones for delivering all three aspects of the assessment system (summative, interim and formative) on time and on budget for the 2014-15 academic year.

2. We have done this through a state-led process featuring consensus-based decision-making and the hard work of dedicated K-12 and higher education educators and administrators.

3. We have met our milestones without sacrificing our very high quality standards, and we continue to push the envelope of innovation in test design, including ground-breaking open-source software, innovative items and performance tasks, and new approaches to key processes such as developing achievement level descriptors and standard-setting.

What elements of this project proved more difficult than you expected?

Building an assessment system as large, multi-faceted, and sophisticated as the Smarter Balanced system is challenging, but the test-development process Smarter Balanced is using follows a sequence of steps that is familiar to all experienced assessment professionals. Given the extensive expertise of the Smarter Balanced state leadership and staff, the difficulties we have encountered in test development have not been unforeseen or unmanageable. The greater challenge is responding to the intense—and legitimate—interest of so many diverse parties in this work. State assessment directors expect scrutiny by policy makers, parents, interest groups, and others, but the number and diversity of the interested parties is much greater when working on the scale of a multi-state consortium. Keeping this diverse array of interested parties informed about the complex and often highly technical work of building an assessment system has been more challenging than we originally imagined.

Do states have the devices and bandwidth needed to deliver your online, adaptive assessments?

Some states are ready and are currently assessing students online. Other states are in the process of preparing for the assessments and will be ready when the assessments become operational in 2014-15. Recognizing the need for a transition period, for three years Smarter Balanced will support the use of a paper-and-pencil option for those schools not fully ready right away. Further, the minimum technology specifications that Smarter Balanced released last fall allow for very old operating systems and require only the minimum processors and memory required to run the operating system itself (for example, the summative assessment can be delivered using computers with 233 MHz processors and 128 MB RAM that run Windows XP). Likewise, the file size for individual assessment items will be very small to minimize the network bandwidth necessary to deliver the assessment online. Right now, Smarter Balanced provides a bandwidth checker to allow schools to determine the number of students that can simultaneously take tests and is hosting, in collaboration with PARCC, a “technology readiness tool” that allows schools and districts to assess and track their progress toward readiness.
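To make the idea of a bandwidth check concrete, here is a minimal sketch of the arithmetic such a tool might perform. It is not the Smarter Balanced bandwidth checker; the function name, the per-student bandwidth figure, and the headroom factor are all assumptions chosen for illustration.

```python
def max_simultaneous_testers(available_kbps: float,
                             per_student_kbps: float,
                             headroom: float = 0.8) -> int:
    """Estimate how many students can test at once on a shared connection.

    All parameters are hypothetical inputs, not published Smarter Balanced
    figures. `headroom` is the fraction of the measured bandwidth we are
    willing to commit to testing, leaving the rest for other school traffic.
    """
    if available_kbps < 0:
        raise ValueError("available_kbps must be non-negative")
    if per_student_kbps <= 0:
        raise ValueError("per_student_kbps must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in the range (0, 1]")

    usable_kbps = available_kbps * headroom
    # Round down: a fraction of a student cannot take a test.
    return int(usable_kbps // per_student_kbps)


if __name__ == "__main__":
    # Illustrative numbers only: a 10 Mbps link, 20 kbps assumed per student.
    print(max_simultaneous_testers(10_000, 20, headroom=0.5))
```

A real readiness tool would measure throughput rather than take it as an input, but the capacity estimate itself reduces to a division like this one.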

Smarter Balanced has deployed small-scale trials and a pilot test to incrementally improve its technology system; these efforts have also helped districts better understand the technology and human resource requirements necessary to deploy the online assessments. The Smarter Balanced practice test to be released at the end of May and the Smarter Balanced field tests to be deployed in Spring 2014 will provide additional opportunities for schools to gain experience in deploying the assessments.

In recent months, concerns about cheating have skyrocketed. How can you guarantee test security given that numerous states will be giving the same exams during different test windows?

Actually, under the Smarter Balanced summative assessment design, states will be giving different tests during the same 12-week window at the end of each academic year. In a computer-adaptive assessment, each student’s test is customized based on his/her performance throughout the test. There will be no way for students to copy each other’s answers since each will be looking at a unique question. Further, since the results are captured electronically, it will not be possible for adults to tamper with results once the test administration is completed.
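For readers unfamiliar with computer-adaptive testing, the following is a deliberately simplified sketch of how a test can be customized to each student's performance. It uses a basic "staircase" rule (move the ability estimate up after a correct answer, down after an incorrect one, then pick the unused item closest in difficulty). This is an assumption made for illustration and is not the Smarter Balanced item-selection algorithm; the `Item` class, the difficulty scale, and every function name here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Item:
    item_id: str
    difficulty: float  # hypothetical scale: higher means harder


def next_item(bank: List[Item], used_ids: Set[str],
              ability: float) -> Optional[Item]:
    """Pick the unused item whose difficulty is closest to the estimate."""
    candidates = [it for it in bank if it.item_id not in used_ids]
    if not candidates:
        return None
    return min(candidates, key=lambda it: abs(it.difficulty - ability))


def run_adaptive_test(bank: List[Item],
                      answers_correctly: Callable[[Item], bool],
                      length: int,
                      start_ability: float = 0.0,
                      step: float = 1.0) -> Tuple[List[str], float]:
    """Administer up to `length` items, adapting after every response."""
    ability = start_ability
    used: Set[str] = set()
    administered: List[str] = []
    for _ in range(length):
        item = next_item(bank, used, ability)
        if item is None:
            break  # item bank exhausted
        used.add(item.item_id)
        administered.append(item.item_id)
        # Staircase update: raise the estimate on success, lower on failure.
        ability += step if answers_correctly(item) else -step
        # Shrink the step so the estimate settles as evidence accumulates.
        step = max(step / 2, 0.125)
    return administered, ability


if __name__ == "__main__":
    bank = [Item("a", -2.0), Item("b", -1.0), Item("c", 0.0),
            Item("d", 1.0), Item("e", 2.0)]
    # Two simulated students see different items after the first question.
    print(run_adaptive_test(bank, lambda it: True, 3))
    print(run_adaptive_test(bank, lambda it: False, 3))
```

Even in this toy version, two students who start on the same question diverge as soon as their responses differ, which is the property the answer above relies on.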

For schools using the paper-and-pencil option, the particular form students receive will depend on their responses to a short “locater test.” Since scoring standards will differ for the various forms, there will be no incentive for teachers to steer students to the less challenging forms.

Beyond elements of the test design that will militate against the risk of cheating, the Smarter Balanced test administration policies will call on states to conform to best practices with regard to independent monitoring and proctoring.

A defining moment on the horizon is when Smarter Balanced attempts to set cut scores and have all member states sign on. Can you help us understand the process you’ll use to reach consensus? How important is it that the cut scores are high and that members agree?

It is essential that the scores accurately reflect student mastery of the Common Core State Standards and that they have a common meaning across Smarter Balanced states. Our member chief state school officers, who will ultimately vote on the cut scores, view rigorous common performance standards as an essential element of realizing the promise of the Common Core State Standards. From its beginning, Smarter Balanced has relied upon a consensus-based process for all its policy decisions. Our experience has been that our states have little difficulty in reaching consensus when we are deliberative and remain open and transparent as policy decisions develop. We don’t expect the standard setting process to be any different.

That said, we acknowledge the challenge in setting performance standards at the scale of a multi-state consortium. Relying entirely on the traditional workshop format typically used for standard setting would make it difficult for each state to feel adequately represented in the process. To address this challenge, we are planning an innovative approach to standard-setting that will take advantage of our online testing platform to allow as many interested constituents as possible to review exemplar test items and weigh in on where they think the “cut scores” should be set. This crowd-sourced data, parsed by state and by respondent role (teachers, higher education faculty, parents, etc.), can then inform the comparatively small number of individuals participating in the standard-setting workshop.
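One way to picture "parsed by state and by respondent role" is a simple grouped summary. The sketch below is an assumption about how such data could be tabulated, not a description of the consortium's actual method: the record layout, the use of the median, the placeholder state names, and the score values are all invented for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (state, respondent role, recommended cut score).
Response = Tuple[str, str, float]


def summarize_recommendations(
        responses: Iterable[Response]) -> Dict[Tuple[str, str], float]:
    """Return the median recommended cut score for each (state, role) group."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for state, role, score in responses:
        grouped[(state, role)].append(score)
    # Sort the keys so the summary reads in a stable order.
    return {key: median(scores) for key, scores in sorted(grouped.items())}


if __name__ == "__main__":
    # Invented sample data on an arbitrary 0-100 scale.
    sample = [
        ("State A", "teacher", 60),
        ("State A", "teacher", 70),
        ("State A", "parent", 65),
        ("State B", "teacher", 75),
    ]
    for (state, role), cut in summarize_recommendations(sample).items():
        print(state, role, cut)
```

The median is used here only because it resists a handful of extreme responses; a workshop panel could just as easily be shown full distributions for each group.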

How important are the two testing consortia to Common Core? Is the fate of the standards tied to the fate of the consortia?

The Common Core will succeed or fail in the classroom. Effective instruction is at the heart of meeting the high expectations set by the standards. Smarter Balanced is grounded in the notion that putting good information about student performance in the hands of teachers can have a profound impact on instruction and—as a result—on student learning. By accurately assessing the deeper learning required by the standards, and by helping teachers sharpen their own skills in formative, classroom-based assessment through the Digital Library of Formative Tools and Practices, Smarter Balanced can make a positive contribution to the ultimate success of the Common Core.

Are you confident that Smarter Balanced will be able to deliver online, adaptive assessments on time, on budget, and in all promised grades/subjects to all member states in 2014–15?

We are on track to deliver each aspect of our assessment system on time and on budget in 2014-15. To date, our work has been supported through contracts with every one of the country’s large testing companies. We have successfully sought bids and procured multiple contracts consistent with our overall project plan, and we continue to be on schedule. This spring we are pilot testing the first 5,000 items and tasks we have developed with about a million students, engaging more than 5,200 schools drawn from all 21 of our governing states. The pilot test also serves as a beta test for our test delivery software. In addition to testing out our items, performance tasks, and software, the pilot test also gives us an opportunity to evaluate a variety of accessibility features for students with disabilities and English language learners.

How concerned are you by Alabama’s and Utah’s decisions to abandon the testing consortia and Florida chief Tony Bennett’s public statement that he’s looking for a “Plan B”? Are we about to see a mass exodus from the consortia?

We regret that Alabama and Utah chose to leave the consortium; each state did so for particular reasons unrelated to the progress of the Consortium or to design decisions that the member states had reached. While those states have chosen to leave the Consortium, Alaska and the U.S. Virgin Islands have recently joined Smarter Balanced, and many states that initially joined as advisory states have transitioned to governing status, reflecting their commitment to the Consortium.

About how many states do you expect to administer Smarter Balanced assessments in all covered grades and subjects in 2014–15? How many states will be participating in Smarter Balanced and PARCC combined?

We have no reason to expect changes among our 21 governing states. Recently, our governing states completed a survey asking them to identify the Smarter Balanced assessments they are most likely to ultimately use. All but one state indicated plans to use the full suite of formative, interim, and summative assessments. The one remaining state plans to implement only the summative assessment.

We also currently have four advisory states; some of those states may choose to select different assessments while others may transition to governing status.

If a state chief called you tomorrow and said, “A trusted vendor is guaranteeing me high-quality, secure assessments below Smarter Balanced costs and without all of the hassles that come along with a 20-state consortium,” what would you tell him/her?

A primary benefit of the Smarter Balanced assessment system is that it is built by states, for states. States in the Consortium have a level of direct decision-making control that they could never hope to achieve with an “off-the-shelf” product. They also are assured of a level of multi-state comparability, both within the Consortium and across PARCC and Smarter Balanced, which is unlikely to be reached with a commercial test. Finally, the transparency of the Smarter Balanced system is antithetical to the competitive nature of commercial test publishing. Examples of that transparency include:

Our open-source test delivery software
Our open bank of interim items that are built to the same specifications as the item bank for the secure summative assessment
The extent to which the documents that guide our work invite public review and are published online
The full and open disclosure of our plans for research and validity and the results of those studies.

Additionally, Smarter Balanced is developing a distributed, multi-actor system of test delivery, with the Consortium maintaining responsibility only for those aspects that are essential for ensuring continued comparability of results, quality improvements, and state-led governance. Under this system, many of the services that states need to administer the tests and deliver results will come from the vendor community. This system allows states to maintain control over content and quality while outsourcing to the private sector those elements of the test delivery system that these companies have mastered.

Finally, the estimated total cost for the Smarter Balanced assessments ($22.50 per student for the summative assessment in both English language arts and math, or $27.30 per student for the full system of formative, interim and summative assessments) is less than the amount that about two-thirds of our member states currently pay for their state assessments. These costs encompass both the services provided by Smarter Balanced in common to all member states and the services that states will either provide directly or procure from vendors in the private sector.

One element dominates the cost: approximately 70 percent of the vendor cost for summative assessments is tied to hand-scoring. Measuring the deeper learning required by the Common Core requires that students write extensively and much of that writing cannot yet be scored by technology. Paying teachers, faculty, and other content experts to score student responses is costly, but it is currently the only effective way to measure important elements of the Common Core. Until automated scoring of writing improves, reducing the cost would require reducing the amount of writing—a step that cannot be taken without compromising fidelity to the standards. Smarter Balanced can include extensive writing and maintain a reasonable cost because our size allows us to take advantage of economies of scale. Off-the-shelf tests that cost substantially less than Smarter Balanced assessments almost certainly will not include as many items and tasks that require students to produce a response rather than simply find a correct answer. We believe this is a significant quality benefit of Smarter Balanced.

Could you please describe Smarter Balanced’s relationship with the U.S. Department of Education? For example, how often do you meet with them, what kinds of technical assistance do they provide, how much do they direct your work, etc.?

Smarter Balanced is funded under a cooperative agreement with the U.S. Department of Education that extends through September 2014. The fiscal agent for the federal grant is the state of Washington. The Department of Education has responsibility for fiduciary and programmatic oversight of the grant. In essence, they need to track that we are doing the things we promised to do and spending funds in accordance with our approved budget. Like any federal grantee, Smarter Balanced must operate within the requirements of its federal grant; however, there is a great deal of latitude built into the grant for state decision-making. For example, the grant stipulates that Smarter Balanced must build an assessment of the Common Core State Standards, but the test blueprint specifying the proportion of test material on various topics is something the states in the Consortium decide.

We meet with program officers at the Department monthly and provide them with quarterly financial and programmatic reports. In addition, once a year we undergo a thorough program review, not unlike the program review that states have always gone through for their Title I grants.

Since the inception of No Child Left Behind, the Department of Education has used a process called “Peer Review” to ensure that all state testing programs adhere to the AERA/NCME/APA Joint Standards for Educational and Psychological Testing. The consortia assessments will be no different with regard to this Peer Review, and we have already been preparing and submitting materials to the Department of Education for that purpose. This level of review is neither greater nor less than the technical scrutiny the Department of Education requires of all state tests designed to meet the requirements of federal accountability.

At the conclusion of the federal grant, Smarter Balanced will transition to being an operational assessment system supported by its member states. The consortium does not plan to seek additional funds from the U.S. Department of Education.

Why do you think 70 percent of “education insiders” say Smarter Balanced is on the wrong track?

Smarter Balanced was created by assessment professionals in state education agencies who determined that by pooling their experience and expertise—and by taking advantage of the federal funds offered by the Department of Education and working in partnership with private sector firms—they could build more sophisticated and accurate assessments of student learning than any individual state could offer on its own. For the last two and a half years, this group of technical experts has been busily doing its work, and the result is that Smarter Balanced is on track and on budget.

Assessment experts around the country have expressed nothing but admiration for the work that Smarter Balanced has done (for example, see http://www.cse.ucla.edu/products/reports/R823.pdf). The education insiders who responded to the survey referenced in this question likely aren’t experts in assessment and—because Smarter Balanced is not a Washington, DC-based organization and its leaders are not well known “inside the beltway”—they are not as familiar with the work the Consortium has done. Smarter Balanced is committed to doing more outreach and communication work to better inform all our stakeholders about the progress we have made and the challenges ahead.

Would you please explain your plans for Smarter Balanced’s future governance, leadership, and funding?

In March 2013, the governing states of Smarter Balanced endorsed a sustainability plan that included instructions for the Smarter Balanced executive director to enter into negotiations with the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) at UCLA to serve as a partner and host for the Smarter Balanced Consortium after the completion of the federal grant in September 2014. These negotiations are moving toward an agreement by UCLA to recognize the shared state ownership of the assessment system content and an independent governance structure much like the one that the consortium currently employs. Smarter Balanced will continue to be governed by its member states, with K-12 and higher education representatives and a small executive committee providing day-to-day oversight. Operations will be managed by a small staff under the leadership of an executive director and two deputies. Funding will come primarily from fees paid by states for packages of assessment services, with UCLA/CRESST providing office space and key administrative support in areas such as finance, human resources, legal advice, etc.

What else would you like people to know about Smarter Balanced?

As part of our commitment to transparency—and in order to help teachers, teacher educators, and other interested parties learn about and prepare for the assessments—Smarter Balanced will be releasing a complete set of practice tests for each subject and grade level at the end of May. These practice tests will be freely available on the Smarter Balanced web site (www.SmarterBalanced.org); they will utilize the same software system that is being used for the operational test and will feature many of the tools, accommodations and accessibility features that will be included in the final software package. Everyone interested in seeing first-hand what the assessments will look like is encouraged to visit our web site and challenge themselves by answering the questions in these practice tests.

-Andy Smarick

This blog entry first appeared on the Fordham Institute’s Common Core Watch blog.