Methodological Appendix for the Live Theater Experimental Study



By 10/15/2014


Learning from Live Theater
Education Next, Winter 2015

Empirical Strategy

Because the randomized controlled trial approach has the important feature of generating comparable treatment and control groups, we can use a straightforward set of analytic techniques, designed for use in social experiments, to estimate the impact of a school field trip to see live theater on student outcomes. In its simplest form, this technique can estimate mean differences using the following equation for outcome Y of student i in matched pair m:

(1)        $Y_{im} = \alpha + \beta_1 Treat_i + \beta_2 Match_{im} + \varepsilon_{im}$

The binary variable $Treat_i$ is equal to 1 if the student is in the treatment group that was randomly assigned to receive free tickets for a field trip to see live performances of A Christmas Carol or Hamlet, and is equal to 0 otherwise. Because the groups were created using a stratified randomization procedure within matched applicant group pairs, $Match_{im}$ is also included in the model as a vector of dummy variables, which has the statistical effect of estimating within, as opposed to across, matched groupings. Finally, $\varepsilon_{im}$ is a stochastic error term clustered at the applicant group level to take into account the correlation among students nested within the same applicant group.

Proper randomization generates experimental groups that are comparable but not necessarily identical. The basic regression model can, therefore, be improved by adding controls for observable characteristics to increase the reliability of the estimated impact by accounting for minor differences and improving the precision of the overall statistical model. This yields the following equation to be estimated:

(2)        $Y_{im} = \alpha + \beta_1 Treat_i + \beta_2 Match_{im} + \beta_3 Gender_i + \beta_4 Grade_i + \beta_5 Minority_i + \varepsilon_{im}$

where $Gender_i$ is a dummy variable equal to 1 if the student is female and 0 otherwise, $Grade_i$ is a vector of dummy variables indicating the grade level of student i, and $Minority_i$ is a dummy variable equal to 1 if the student does not identify as white and 0 otherwise. In this model, $\beta_1$ is the parameter of interest and represents the effect of the field trip to see live theater for students in the treatment group. Equation (2) is our preferred model for estimating overall impacts.
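As a rough illustration of how the treatment coefficient in equation (1) can be computed, the sketch below relies on the fact that including matched-pair dummies is equivalent to demeaning the outcome and the treatment indicator within each matched pair. It is not the authors' code: the function name, the toy data, and the choice of a cluster-robust variance without a small-sample correction are all assumptions, and the covariates of equation (2) are omitted.

```python
from collections import defaultdict
from math import sqrt

def within_pair_effect(rows):
    """rows: list of (y, treat, pair_id, group_id) tuples (hypothetical layout).

    Returns (beta1, clustered_se) for y = a + b1*treat + pair dummies + e,
    computed by demeaning y and treat within each matched pair.
    Requires at least one pair that contains both treated and control students.
    """
    by_pair = defaultdict(list)
    for y, t, pair, _ in rows:
        by_pair[pair].append((y, t))
    means = {p: (sum(y for y, _ in v) / len(v), sum(t for _, t in v) / len(v))
             for p, v in by_pair.items()}

    # Demeaned outcome and treatment. Pairs with no variation in treatment
    # have demeaned treatment equal to zero and add nothing to the sums below.
    demeaned = [(y - means[p][0], t - means[p][1], g) for y, t, p, g in rows]
    sxx = sum(x * x for _, x, _ in demeaned)
    beta = sum(x * y for y, x, _ in demeaned) / sxx

    # Cluster-robust (sandwich) variance: sum the scores within each
    # applicant group before squaring. No small-sample correction (assumption).
    score = defaultdict(float)
    for y, x, g in demeaned:
        score[g] += x * (y - beta * x)
    var = sum(s * s for s in score.values()) / (sxx * sxx)
    return beta, sqrt(var)

# Toy data: two matched pairs (A, B), each with one treated and one control group.
rows = [
    (1.0, 1, "A", 1), (2.0, 1, "A", 1), (0.0, 0, "A", 2), (1.0, 0, "A", 2),
    (3.0, 1, "B", 3), (3.0, 1, "B", 3), (2.0, 0, "B", 4), (3.0, 0, "B", 4),
]
beta1, se = within_pair_effect(rows)
print(beta1, se)
```

In this simplified setting, a matched pair in which no student is treated leaves the point estimate unchanged, because its demeaned treatment indicator is zero for every student.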

Comparability of Treatment and Control Groups

Even within randomized controlled trials, treatment and control groups may differ significantly from each other by chance. To explore whether that occurred in our experiment, we compared the observed characteristics of treatment and control group students. We found only one significant difference among the 19 observed characteristics we examined. (See Appendix Table 1.) The treatment group was 5.3 percentage points more likely to be from a minority racial or ethnic group: 27.7% for the treatment group versus 22.4% for the control group. With so many comparisons, it is possible that this difference was produced by chance, so we are not concerned that the lottery failed to give us generally comparable treatment and control groups. In addition, we controlled for minority status in our model, which should further alleviate concerns that the groups differed at baseline.
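The multiple-comparisons point can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the 19 comparisons are independent tests at the .05 level; the baseline characteristics are in fact correlated with one another, so the figures are only a rough guide. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance that at least one of n independent tests rejects at level alpha
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests

n_characteristics = 19   # baseline characteristics compared in the text
alpha = 0.05             # significance level marked ** in Appendix Table 1

p_any = prob_at_least_one_false_positive(n_characteristics, alpha)
expected_false_positives = n_characteristics * alpha

print(f"P(at least one false positive) = {p_any:.2f}")                      # 0.62
print(f"Expected number of false positives = {expected_false_positives:.2f}")  # 0.95
```

Under that independence assumption, seeing about one significant difference among 19 comparisons is what chance alone would be expected to produce.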

Appendix Table 1: Treatment/Control Balance

                                            Treatment   Control   Difference

Individual
Average Grade                               9.3         9.3       0
Percentage Female                           61          62.1      -1.1
Percentage Minority                         27.7        22.4      5.3**
Percentage Agree “I am a good student.”     95.2        97.6      -2.4
Percentage Agree “School is boring.”        48.5        46.2      2.3

School
Average Enrollment                          850.8       829.9     20.9
Percentage Homeless                         3.1         1.6       1.5
Percentage FRL                              40.4        44        -3.6
Average School Poverty Index                75.2        76.8      -1.6
Percentage White                            71.7        71.5      0.2
Percentage Hispanic                         12.8        15.1      -2.3
Percentage Black                            4.9         4.2       0.7
Percentage Other Race                       11.1        9.3       1.8
Percentage Minority                         28.4        28.5      -0.1
Percentage GT                               8.6         8.6       0
Percentage SPED                             8.1         7.7       0.4
Percentage LEP                              9.5         11.4      -1.9
Average Miles from Theater                  24.4        30.5      -6.1
Average Minutes from Theater                29.9        36.2      -6.3

** p < .05, two-tailed.

                                            Treatment   Control

Number of Students
All                                         330         340
Christmas Carol                             101         246
Hamlet                                      229         94

Applicant Groups
All                                         22          27
Christmas Carol                             6           18
Hamlet                                      16          9

 

Outcome Scales

We examined five outcomes for students: knowledge of play plots and vocabulary, tolerance, Reading the Mind in the Eyes Test (RMET), desire to participate in theater, and interest in viewing theater. Each of these outcomes consisted of a scale constructed from multiple items on the survey.

For the knowledge scale, students who had applied to see A Christmas Carol were asked,

1)    Who is Jacob Marley?

2)    What lesson does Ebenezer Scrooge learn in A Christmas Carol?

3)    What does the Ghost of Christmas Yet to Come show Scrooge?

4)    In A Christmas Carol, who says “God bless us, every one!”?

5)    Who is Belle?

6)    Why does Scrooge become a mean-spirited miser?

They were also asked to pick the word that best fit the following definitions (with the answers in parentheses):

1)    Wanting things owned by others (covetous)

2)    Very poor (destitute)

3)    A ghost or ghostlike image or appearance (apparition)

4)    Nonsense, or a trick (humbug)

5)    Offensive, or strongly disliked (odious)

Students whose groups applied to see Hamlet were asked,

1)    Who are Rosencrantz and Guildenstern?

2)    When Hamlet wonders whether “to be, or not to be,” what is he considering?

3)    What does the Ghost ask Hamlet to do?

4)    Why is Hamlet troubled?

5)    What happens to Ophelia?

6)    Which of these does Hamlet say? (“What a piece of work is a man!”)

They were also asked to pick the word that best fit the following definitions (with the answers in parentheses):

1)    Happiness and laughter (mirth)

2)    The appearance of a person’s face (countenance)

3)    Wherefore (why)

4)    Not working, active, or being used (idle)

5)    A roguish or mischievous act (knavery)

The tolerance scale consisted of asking students the extent to which they agreed or disagreed with the following statements, with the negatively framed items reverse-coded:

1)    People who disagree with my point of view bother me.

2)    Plays critical of America should not be allowed to be performed in our community.

3)    I am not interested in learning about people different than me.

4)    I think people can have different opinions about the same thing.

5)    Women are equally able to do the same jobs that men can do.

6)    I like to hear views different from my own.

7)    There are multiple ways to interpret the same work of drama.

The version of RMET used in our study was the one developed for adolescent subjects, as described and validated in Baron-Cohen, S., Wheelwright, S., Scahill, V., Lawson, J., and Spong, A. (2001), “Are intuitive physics and intuitive psychology independent? A test with children with Asperger Syndrome,” Journal of Developmental and Learning Disorders 5:47-78. Because this test is widely used and has been validated by others, we did not alter it in any way, despite the fact that some Britishisms, such as using the word “cross” to mean angry, may have confused our students.

The scale for student interest in participating in theater consisted of the following items:

1)    How interested are you in being in a theater performance?

2)    If your school were having auditions for a new play, how interested would you be in trying to get a role in that play?

3)    How interested are you in taking a drama class?

4)    I would be interested in joining a drama club if my school had one.

The scale for student interest in viewing theater consisted of the following items, with the negatively framed items reverse-coded:

1)    How interested are you in seeing live performances in a theater?

2)    If your friends or family wanted to go to a play, how interested would you be in going?

3)    Imagine that a friend of yours is going to go on a field trip. Do you think your friend would enjoy these field trips? A theater performance

4)    Would you like more live theater performances in your town?

5)    I would tell my friends that they should see a live theater performance.

6)    I plan to see live theater performances when I am an adult.

7)    Live theater is interesting to me.

8)    Trips to see live theater are fun.

9)    I feel uneasy in theaters.

10) I feel comfortable talking about theater performances.

All of our scales were built by standardizing and averaging the components of the scales. The effect sizes of results were all computed by using the standard deviation of the control group.
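A minimal sketch of this construction is shown below. It is an illustration rather than the study's code: the function names, the 1-to-5 response range used for reverse-coding, the choice to standardize over the full sample, and the use of the sample standard deviation for the control group are all assumptions, and missing responses are not handled.

```python
from statistics import mean, pstdev, stdev

def zscores(values):
    """Standardize one item across all students (full-sample mean and SD)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def build_scale(items, reverse=(), min_score=1, max_score=5):
    """items: dict mapping item name -> list of responses, one per student.

    Items named in `reverse` are reverse-coded first (assumed 1-5 range),
    then every item is standardized and the items are averaged per student.
    """
    standardized = []
    for name, responses in items.items():
        if name in reverse:
            responses = [max_score + min_score - r for r in responses]
        standardized.append(zscores(responses))
    return [mean(student) for student in zip(*standardized)]

def effect_size(scale, treated):
    """Treatment-control difference in means, in control-group SD units."""
    t = [s for s, d in zip(scale, treated) if d]
    c = [s for s, d in zip(scale, treated) if not d]
    return (mean(t) - mean(c)) / stdev(c)

# Hypothetical responses from four students to two tolerance-style items.
items = {
    "likes_different_views": [5, 4, 2, 1],
    "bothered_by_disagreement": [1, 2, 4, 5],   # negatively framed
}
scale = build_scale(items, reverse={"bothered_by_disagreement"})
print(scale)
print(effect_size(scale, treated=[1, 1, 0, 0]))
```

Dividing by the control group's standard deviation expresses the estimated effect relative to the spread of outcomes among students who did not receive the treatment.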

Cronbach’s Alpha tests show that the items reliably measure knowledge, tolerance, interest in participating in theater, and interest in viewing theater. (See Appendix Table 2.) The Cronbach’s Alpha for the RMET scale, however, falls short of conventional standards for reliably measuring the same underlying construct. Because this scale has been validated by other researchers, we feel comfortable using it in this analysis. We suspect that some Britishisms, and the fact that we incorporated RMET into a larger survey, may have produced a lower alpha than other researchers have found. The fact that we still observe significant effects despite a noisy scale also increases our confidence in using it. None of our scales could be improved substantially by omitting any one item, so we built all scales with all available items that are theoretically connected with the underlying constructs.

Appendix Table 2: Cronbach’s Alpha for Outcome Scales

Scale                                   Number of Items   Cronbach’s Alpha

Knowledge – Both plays combined         11                0.59
Knowledge – Christmas Carol             11                0.62
Knowledge – Hamlet                      11                0.55
Tolerance                               7                 0.59
Reading Others’ Emotions                28                0.42
Interest in Participating in Theater    4                 0.94
Interest in Viewing Theater             10                0.92
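For readers unfamiliar with the statistic reported in Appendix Table 2, the sketch below computes Cronbach's alpha from its standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the summed score), where k is the number of items. The function name and toy responses are illustrative assumptions; this is not the code used to produce the table.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of item columns, each a list of scores (one per respondent)."""
    k = len(items)
    sum_item_variances = sum(variance(col) for col in items)
    # Total score per respondent, summing across items.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))

# Two items that move together perfectly give alpha = 1;
# two items with zero covariance give alpha = 0.
print(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(cronbach_alpha([[1, 2, 3, 4], [2, 4, 1, 3]]))
```

Alpha rises with both the number of items and how strongly they covary.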

Analysis without Assuming Weather Events Are Exogenous

Adverse weather prevented several school groups from seeing performances of A Christmas Carol. We have treated those events as exogenous and assigned those groups to the control group. Doing so cannot bias the estimate of the treatment effect, because it leaves no treatment students within the affected matched groupings. Those observations do not contribute directly to the estimate of the treatment effect because there is no variance on the treatment variable within their matched grouping. Leaving them in the analysis, however, does improve the precision of the estimates for the other covariates, which results in a more precise estimate of the treatment effect.

In this section, we show that we generally get similar results even if we relax that assumption and use other approaches to handling the fact that some groups had to cancel their field trips. First, we present below in Appendix Table 3 the results of our preferred approach with the inclusion of standard errors.

Appendix Table 3 – Results Treating Snow Days as Exogenous

                                                        Knowledge         Tolerance         Reading Others’ Emotions
Treatment                                                0.63 (0.13)***    0.26 (0.12)**     0.23 (0.11)**
Treatment – Controlling for Reading and Movie-Watching   0.58 (0.13)***    0.31 (0.11)**     0.21 (0.11)*
Read Play or Book for School                             0.01 (0.15)      -0.13 (0.12)      -0.04 (0.10)
Watched Movie for School                                 0.30 (0.12)**    -0.22 (0.11)*      0.11 (0.11)
Treatment – Controlling for Interest in Theater          0.61 (0.13)***    0.22 (0.09)**     0.22 (0.11)*
Interest in Theater                                      0.24 (0.04)***    0.37 (0.05)***    0.09 (0.04)**
* p < .10, ** p < .05, *** p < .01, two-tailed. Standard error in parentheses.

 

We could instead use an intention-to-treat approach to estimate our results. That has the advantage of ensuring that there is no bias in our estimate of treatment effects because all groups retain the treatment status they were awarded by the lottery regardless of whether their performance was cancelled for snow. But an intention-to-treat approach has the significant disadvantage of understating the effect of actually being treated, particularly for the large number of school groups whose field trip was cancelled for weather.

The results for the intention-to-treat analyses with standard errors are reported below in Appendix Table 4. As one would expect, the point estimates are lower, but the substantive findings are generally the same. The only important difference is that the effect for the main Tolerance analysis falls short of being statistically significant.

Appendix Table 4 – Results Using Intention-to-Treat Approach

                                                        Knowledge         Tolerance         Reading Others’ Emotions
Treatment                                                0.44 (0.14)***    0.17 (0.12)       0.17 (0.09)*
Treatment – Controlling for Reading and Movie-Watching   0.39 (0.14)**     0.22 (0.11)*      0.16 (0.09)*
Read Play or Book for School                            -0.01 (0.16)      -0.15 (0.13)      -0.05 (0.10)
Watched Movie for School                                 0.33 (0.12)***   -0.20 (0.11)*      0.12 (0.11)
Treatment – Controlling for Interest in Theater          0.43 (0.14)***    0.15 (0.09)       0.16 (0.08)*
Interest in Theater                                      0.24 (0.04)***    0.38 (0.05)***    0.09 (0.04)**
* p < .10, ** p < .05, *** p < .01, two-tailed. Standard error in parentheses.

 

We could also estimate the impact on the treated using a two-stage model in which the intention to treat is used as an instrument for whether groups actually received the treatment. The advantage of this approach is that it yields an estimate of the impact on the treated. The disadvantage is that a two-stage model inflates the standard errors, which is particularly costly given that there was no non-compliance with the intention-to-treat assignment among the Hamlet groups. To adjust for non-compliance for one play, we inflate the standard errors for both.
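The logic of the two-stage approach can be sketched in its simplest form. With a single binary instrument and no covariates, two-stage least squares reduces to the Wald estimator: the intention-to-treat difference in outcomes divided by the difference in the share actually treated. The code below is an illustration under those simplifying assumptions, with a hypothetical function name and toy data; it omits the matched-pair dummies, covariates, and clustered standard errors used in the analyses reported here.

```python
from statistics import mean

def wald_estimate(y, assigned, received):
    """Single binary instrument, no covariates.

    y:        outcomes
    assigned: 1 if the lottery assigned the student to treatment, else 0
    received: 1 if the student actually saw the performance, else 0
    Returns (itt_effect, first_stage, effect_on_treated).
    """
    def diff_by_assignment(values):
        z1 = [v for v, z in zip(values, assigned) if z]
        z0 = [v for v, z in zip(values, assigned) if not z]
        return mean(z1) - mean(z0)

    itt = diff_by_assignment(y)                 # effect of being assigned
    first_stage = diff_by_assignment(received)  # effect of assignment on receipt
    return itt, first_stage, itt / first_stage

# Toy data: four students win the lottery, but one trip is cancelled.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 1, 0, 0, 0, 0, 0]
y = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

itt, first_stage, tot = wald_estimate(y, assigned, received)
print(itt, first_stage, tot)   # 0.75 0.75 1.0
```

Because the intention-to-treat difference is divided by a compliance rate below one, the estimated impact on the treated is larger than the intention-to-treat estimate, and its sampling error is scaled up accordingly.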

The results for the instrumental variable analyses are reported below in Appendix Table 5. The point estimates are almost identical to the main approach where we treat weather as exogenous, but the standard errors get larger so that the main Tolerance result falls short of statistical significance.

Appendix Table 5 – Results Using Instrumental Variable Approach

                                                        Knowledge         Tolerance         Reading Others’ Emotions
Treatment                                                0.62 (0.16)***    0.24 (0.16)       0.24 (0.11)**
Treatment – Controlling for Reading and Movie-Watching   0.55 (0.16)***    0.31 (0.15)**     0.22 (0.11)**
Read Play or Book for School                             0.01 (0.14)      -0.13 (0.12)      -0.04 (0.09)
Watched Movie for School                                 0.31 (0.12)***   -0.22 (0.11)**     0.11 (0.11)
Treatment – Controlling for Interest in Theater          0.60 (0.16)***    0.22 (0.12)*      0.23 (0.11)**
Interest in Theater                                      0.24 (0.04)***    0.37 (0.05)***    0.09 (0.03)**
* p < .10, ** p < .05, *** p < .01, two-tailed. Standard error in parentheses.

 

If we use intention to treat to determine baseline equivalence, the results are basically the same as when we treated weather as exogenous and reassigned groups that had to cancel to the control group. The intention-to-treat baseline equivalence comparisons can be found in Appendix Table 6. Of the 19 baseline characteristics on which we compare the students, those assigned by the lottery to the intention-to-treat condition differ significantly from the control group in only one instance: the intention-to-treat students are still more likely to be from a minority racial or ethnic group, by 8.5 percentage points. Again, this is a difference that could have been produced by chance, and it is controlled for in the regression models.

Appendix Table 6: Intention to Treat/Control Balance

                                            Intent to Treat   Control   Difference

Individual
Average Grade                               9.3               9.5       -0.2
Percentage Female                           59.5              61.7      -2.2
Percentage Minority                         30.4              21.9      8.5***
Percentage Agree “I am a good student.”     96.0              98.3      -2.3*
Percentage Agree “School is boring.”        47.8              45.5      2.3

School
Average Enrollment                          969.2             753.4     215.8
Percentage Homeless                         3.1               2.0       1.1
Percentage FRL                              46.4              49.7      -3.3
Average School Poverty Index                84.7              87.8      -3.1
Percentage White                            69.9              69.7      0.2
Percentage Hispanic                         13.9              16.3      -2.4
Percentage Black                            30.9              30.3      0.6
Percentage Other Race                       10.9              9.2       1.7
Percentage Minority                         30.1              30.3      -0.2
Percentage GT                               8.6               9.3       -0.7
Percentage SPED                             8.9               8.6       0.3
Percentage LEP                              10.7              13.5      -2.8
Average Miles from Theater                  28.7              34.1      -5.4
Average Minutes from Theater                33.4              39.4      -6.0

* p < .10, *** p < .01, two-tailed.

* p < .10, *** p < .01, two-tailed.

                                            Treatment   Control

Number of Students Per Group
All                                         428         242
Christmas Carol                             199         148
Hamlet                                      229         94

Applicant Groups
All                                         30          19
Christmas Carol                             14          10
Hamlet                                      16          9

 



