Maximum Simulated Likelihood (R)

Introduction

For an existing project, my coauthors and I use a number of statistical tools in conjunction with a structural model in order to recover preference parameters from experimental data. As described in extensive detail in the draft, we have designed a two-stage experiment in the classic endowment effect framework in an attempt to test the comparative statics of the KR model; our primary contribution is a theoretical and empirical demonstration that accounting for heterogeneity in individual gain-loss attitudes is crucial for generating/recovering predictions in this paradigm. In order to convincingly demonstrate this, we use our first stage experimental data to estimate gain-loss attitudes, from which we generate sharp, testable predictions that form our second stage hypothesis.

As the measurement of gain-loss attitudes is fundamental to our hypothesis, we experiment with a number of methods. Originally, we opted for a standard MLE procedure relying on random utility methods and our structural model. However, these estimates did not directly allow us to speak about the core heterogeneity in which we were interested. Because of this, we adapted our estimation procedure to a similar methodology more suited to measuring distributions: mixed logit. The key difference in this framework is that we assume our central parameter is normally distributed, with unobservable, individual-level noise. This problem has no analytical solution, so we adopt Monte Carlo simulation methods — sampling from our assumed noise distribution to generate a Maximum Simulated Likelihood function which we ultimately maximize.

Once we have estimated the distribution of gain-loss attitudes, we assign individual level parameters by computed the expected value of gain-loss attitude that would lead to the observed decision (given the choice context). With this in hand, we can run our regression of interest relying on the estimated value of gain-loss attitude rather than a coarser classification as in the paper.

Code

The following code implements the MSL estimation procedure, as well as the individual parameter assignment and interaction regression of interest. The code is implemented in R, although our most recent effort in this direction has a slightly different flavor and is implemented in Stata.

library(foreign)
library("haven")
library(dplyr)
#gather the data from the wd, currently formatted as a dta from Stata.
orig_data <- read_dta("original_dataset.dta")

#set the number of random draws
num_draws=1000

#Define the Random Parameter Mixed Simulated Likelihood Function.
mslf <- function(param){
  #Set up the major variables that will be used to created a likelihood
  choice<-Data$preference_liking
  endowment<-Data$InitialGood_Stage1
  
  #set of parameters we are hoping to find the MSL estimates of.
  lambda_m<-param[1]
  u1<-param[2]
  u2<-param[3]
  u3<-param[4]
  u4<-param[5]
  delt<-param[6]
  sd<-exp(param[7])
  

  sim_avg_f=0
  
  set.seed(10101)
  
  #create the for loop over which we generate the simulated likelihood function
  for(i in 1:num_draws){
    #first, generate a set of random normal variables for each individual
    #This will represent the underlying (unobserved) heterogeneity in our random parameter model.
    unobserved_noise<-rnorm(nrow(Data), 0, 1)
    
    #Draw lambda value for an individual, sampling from the mean value (lambda_temp) with noise e*sd.
    lambda<-lambda_m+unobserved_noise*sd
    
    #Given individual context, generate the KR structural utilities.
    #Good a represents the endowment, so we compute U(a|a).
    kr_utils_good_a=u1*(endowment==1)+u2*(endowment==2)+u3*(endowment==3)+u4*(endowment==4)
    #Good b represents the alternative good, so we compute U(b|a)
    kr_utils_good_b=(2*u2-lambda*u1)*(endowment==1)+(2*u1-lambda*u2)*(endowment==2)+
                    (2*u4-lambda*u3)*(endowment==3)+(2*u3-lambda*u4)*(endowment==4)
    
    #Construct the likelihood at the given draw
    sim_f=(exp(kr_utils_good_a)/(exp(kr_utils_good_a)+exp(kr_utils_good_b+delt)))*(choice==1)+
          (exp(kr_utils_good_b)/(exp(kr_utils_good_b)+exp(kr_utils_good_a+delt)))*(choice==-1)+
        (1- (exp(kr_utils_good_b)/(exp(kr_utils_good_b)+exp(kr_utils_good_a+delt)))-
           (exp(kr_utils_good_a)/(exp(kr_utils_good_a)+exp(kr_utils_good_b+delt))) )*(choice==0)
        
    sim_avg_f = sim_avg_f + sim_f/num_draws

  }
  log(sim_avg_f)
} 

#Select the relevant attributes to feed into the MSL function.
Data<-select(orig_data, c(InitialGood_Stage1, Treatment, preference_liking))
#MSL of Lambda for "Prefer Endowment"
msl_results <- maxBFGSR(mslf, start=c(1.5,1, 1, 1, 1, 0.75, 0.75), print.level=2, activePar=c(T,F,T,T, F, T,T), tol=1e-5)
summary(msl_results)


#Present the MSL coefficient estimates and their associated Standard Errors
coeffs <- msl_results$estimate
covmat<-solve(-(hessian(msl_results)[activePar(msl_results), activePar(msl_results)]))
stderr <- sqrt(diag(covmat)) 

for(i in 1:length(which(activePar(msl_results)==FALSE))){
  stderr<- append(stderr, NA, after=((which(activePar(msl_results)==FALSE)[i])-1))
}
zscore <- coeffs/stderr
pvalue <- 2*(1 - pnorm(abs(zscore)))
results_bundle1_ind <- cbind(coeffs,stderr,zscore,pvalue)
colnames(results_bundle1_ind) <- c("Coeff.", "Std. Err.", "z", "p value")
print(results_bundle1_ind)

#With the MSL estimates in hand, we can run the second stage regressions of interest.
#In particular, we have estimates of the distribution of lambda, as well as the relative 
#utilities and indifference thresholds. From here, we can relate the pattern of choices made
#to an expected value of lambda for that particular choice. For instance, someone endowed good
#1 and stating a preference for good 1 would have specifc expected lambda related to the 
#utility of good 1 vs good 2, which we compute in this section.

#These variables represent our estimated quantities
l_est <- coeffs[[1]]
u1 <- coeffs[[2]]
u2 <- coeffs[[3]]  
u3 <- coeffs[[4]]  
u4 <- coeffs[[5]]
d <- coeffs[[6]] 
sd_est <- exp(coeffs[[7]])

#we draw a large number of lambdas from the distribution we estimated with the MSL.
lambdas <- rnorm(num_draws, mean = l_est, sd= sd_est)


######################################
#Endowed 1: Expected Lambda Given Choice
######################################

#First, compute the logit probability of Preferring 1,2 or indifference from our lambda distribution.

# Probability of Preferring 1 given Endowed 1
p_11 <- exp(u1)/(exp(u1) + exp(2*u2 - lambdas*u1 + d))
##Probability of Preferring 2 given Endowed 1
p_21 <- exp(2*u2 - lambdas*u1)/(exp(u1+d) + exp(2*u2 - lambdas*u1)  )
##Probability of Preferring Neither 
p_no1 <- 1 -p_11 - p_21


# Following Train (2002) (Discrete Choice Models with Simulation, Chapter 6), we compute the
#expected value by integrating over the mixed logit probabilities (p_11, etc) for each lambda,
#weighted by the distribution estimated.

# Expected Lambda for Prefer 1 given Endowed 1
l_11 <- sum((p_11/sum(p_11))*lambdas)
# Expected Lambda for Preferring 2 given Endowed 1
l_21 <- sum((p_21/sum(p_21))*lambdas)
# Expected Lambda for Preferring Neither given Endowed 1
l_no1 <- sum((p_no1/sum(p_no1))*lambdas) #Although not used for the analysis, our distributional estimates allow us to quantify #the likelihood Probability that an individual is loss averse (lambda>1) given their options. 

#Probability Loss Averse for Preferring 1 given Endowed 1
pla_11 <- sum((p_11/sum(p_11))*ifelse(lambdas>1,1,0))
#Probability Loss Aversefor Preferring 2 given Endowed 1
pla_21 <- sum((p_21/sum(p_21))*ifelse(lambdas>1,1,0))
#Probability Loss Averse for Preferring Neither given Endowed 1
pla_no1 <- sum((p_no1/sum(p_no1))*ifelse(lambdas>1,1,0))

#We now repeat these computations for each of the endowments

######################################
#Endowed 2: Expected Lambda Given Choice
######################################

##Probability of Preferring 2 given Endowed 2
p_22 <- exp(u2)/(exp(u2) + exp(2*u1 - lambdas*u2 + d))
##Probability of Preferring 1 given Endowed 2
p_12 <- exp(2*u1 - lambdas*u2)/(exp(u2+d) + exp(2*u1 - lambdas*u2)  )
##Probability of Preferring Neither 
p_no2 <- 1 -p_22 - p_12


#Expected Lambda for Preferring 2 given Endowed 2
l_22 <- sum((p_22/sum(p_22))*lambdas)
#Expected Lambda for Preferring 1 given Endowed 2
l_12 <- sum((p_12/sum(p_12))*lambdas)
#Expected Lambda for Preferring Neither given Endowed 2
l_no2 <- sum((p_no2/sum(p_no2))*lambdas)


#Probability Loss Averse for Preferring 2 given Endowed 2
pla_22 <- sum((p_22/sum(p_22))*ifelse(lambdas>1,1,0))
#Probability Loss Aversefor Preferring 1 given Endowed 2
pla_12 <- sum((p_12/sum(p_12))*ifelse(lambdas>1,1,0))
#Probability Loss Averse for Preferring Neither given Endowed 2
pla_no2 <- sum((p_no2/sum(p_no2))*ifelse(lambdas>1,1,0))


######################################
#Endowed 3: Expected Lambda Given Choice
######################################

##Probability of Preferring 3 given Endowed 3
p_33 <- exp(u3)/(exp(u3) + exp(2*u4 - lambdas*u3 + d))
##Probability of Preferring 4 given Endowed 3
p_43 <- exp(2*u4 - lambdas*u3)/(exp(u3+d) + exp(2*u4 - lambdas*u3)  )
##Probability of Preferring Neither 
p_no3 <- 1 -p_33 - p_43

#Expected Lambda for Preferring 3 given Endowed 3
l_33 <- sum((p_33/sum(p_33))*lambdas)
#Expected Lambda for Preferring 4 given Endowed 3
l_43 <- sum((p_43/sum(p_43))*lambdas)
#Expected Lambda for Preferring Neither given Endowed 3
l_no3 <- sum((p_no3/sum(p_no3))*lambdas)

#Probability Loss Averse for Preferring 3 given Endowed 3
pla_33 <- sum((p_33/sum(p_33))*ifelse(lambdas>1,1,0))
#Probability Loss Aversefor Preferring 4 given Endowed 3
pla_43 <- sum((p_43/sum(p_43))*ifelse(lambdas>1,1,0))
#Probability Loss Averse for Preferring Neither given Endowed 3
pla_no3<-sum((p_no3/sum(p_no3))*ifelse(lambdas>1,1,0))


######################################
#Endowed 4
######################################
##Probability of Preferring 4 given Endowed 4
p_44 <- exp(u4)/(exp(u4) + exp(2*u3 - lambdas*u4 + d))
##Probability of Preferring 3 given Endowed 4
p_34 <- exp(2*u3 - lambdas*u4)/(exp(u4+d) + exp(2*u3 - lambdas*u4)  )
##Probability of Preferring Neither 
p_no4 <- 1 -p_44 - p_34

#Expected Lambda for Preferring 4 given Endowed 4
l_44 <- sum((p_44/sum(p_44))*lambdas)
#Expected Lambda for Preferring 3 given Endowed 4
l_34 <- sum((p_34/sum(p_34))*lambdas)
#Expected Lambda for Preferring Neither given Endowed 4
l_no4 <- sum((p_no4/sum(p_no4))*lambdas)

#Probability Loss Averse for Preferring 4 given Endowed 4
pla_44 <- sum((p_44/sum(p_44))*ifelse(lambdas>1,1,0))
#Probability Loss Aversefor Preferring 3 given Endowed 4
pla_34 <- sum((p_34/sum(p_34))*ifelse(lambdas>1,1,0))
#Probability Loss Averse for Preferring Neither given Endowed 4
pla_no4<-sum((p_no4/sum(p_no4))*ifelse(lambdas>1,1,0))


#Having computed the expected lambda given the possible combination of rating preference
#and endowment, we can now assign these values to the individuals in the lab, who actually
#made these preference statements. This will yield 12 values of lambda in the data set.
#With these lambdas assigned, as well as the treatment indicator, we can analyze the second 
#stage behavior using the interaction specification of interest. Specifically, we regress 
#Voluntary_Exchange on the estimated lambda, treatment, and the interaction of the two.

orig_data<-orig_data %>% mutate(Measured_Lambda=case_when(
  (InitialGood_Stage1==1 & preference_liking==1) ~ l_11,
  (InitialGood_Stage1==1 & preference_liking==-1) ~ l_21,
  (InitialGood_Stage1==1 & preference_liking==0) ~ l_no1,
  (InitialGood_Stage1==2 & preference_liking==1) ~ l_22,
  (InitialGood_Stage1==2 & preference_liking==-1) ~ l_12,
  (InitialGood_Stage1==2 & preference_liking==0) ~ l_no2,
  (InitialGood_Stage1==3 & preference_liking==1) ~ l_33,
  (InitialGood_Stage1==3 & preference_liking==-1) ~ l_43,
  (InitialGood_Stage1==3 & preference_liking==0) ~ l_no3,
  (InitialGood_Stage1==4 & preference_liking==1) ~ l_44,
  (InitialGood_Stage1==4 & preference_liking==-1) ~ l_34,
  (InitialGood_Stage1==4 & preference_liking==0) ~ l_no4,
  
))


#First, plot a kernel smoothed density of the Lambda.
library(ggplot2)
density_plot<-ggplot(orig_data, aes(Measured_Lambda)) + geom_density()+
  labs(x="Measured Lambda", y="Density", title = "Smoothed Density of Gain-Loss Attitude")


#Finally, run the regression of interest.
interaction_reg=lm(VoluntaryExchange~Treatment+Measured_Lambda+(Treatment*Measured_Lambda), data = orig_data)
summary(interaction_reg)
library("stargazer")
stargazer(interaction_reg, title="MSL Interaction Regression",
          align=TRUE, dep.var.labels=c("Exchange (=1)"),
          covariate.labels=c("Treatment","$\\hat{\\lambda}_i$",
                             "$\\hat{\\lambda}_i \\times$ Treatment"),
          omit.stat=c("LL","ser", "aic", "bic"), no.space=TRUE)

Lambda_Dist_Blog

 

MSL_Table

Sources:
Goette, Lorenz, Thomas Graeber, Alexandre Kellogg, and Charles Sprenger (2018). “Heterogeneity of Gain-Loss Attitudes and Expectations-Based Reference Points”.
Kőszegi, Botond and Matthew Rabin (2006). “A model of reference-dependent preferences”. In: The Quarterly Journal of Economics, pp. 1133–1165.
Kőszegi, Botond and Matthew Rabin (2007). “Reference-Dependent Risk Attitudes.” In: American Economic Review (4): 1047–73.

 

Power Analysis Code (Python)

Introduction

A fundamental part of experimental economic research is writing pre-analysis plans, which serve as a commitment device for the hypotheses we wish to test and the number of participants we aim to recruit. In outlining the specific questions we are seeking to answer with a specific sample size, we can reassure future readers that the findings we describe are not a coincidence which we stumbled upon as we analyzed our data, lending increased credibility to the experimental results. The following sections describe one particular (preliminary) exercise conducted in the pre-analysis plan for a working paper of mine, Campos et al (2019).

Our project is a natural follow-up to a submitted project of mine, Goette et al (2019), in which we derive and test comparative static predictions of the KR model in the endowment effect context with heterogeneous gain-loss types. Specifically, we consider two types of agents: loss averse and gain loving. Loss averse individuals, roughly speaking, are defined as those who would accept a 50-50 gamble of +$10+x, -$10 for x>0; intuitively, these agents dislike losses around a reference point (assumed 0 here) more than equal sized gains, and would thus need to be compensated with a bigger payoff to accept the possibility of a $10 loss. The larger the x before they accept, the more loss averse. Gain lovers, in the same rough terms, would actually accept gambles where x<0 because, intuitively, they enjoy the surprise of the lottery, and enjoy gains above the reference point more than commensurate losses.

In Goette et al (2019), we show that lab participants previously measured to be gain loving vs loss averse respond quite differently to a treatment that is commonly used to test the KR model. Importantly, prior analysis of these types has tended to ignore the heterogeneity and test the treatment assuming that people are loss averse on average. Although this is empirically true, the 15-30% of gain lovers have an outsized role in aggregate treatment effects, which we uncover in our paper. The pre-analysis exercise described below applies this same experimental framework to a different domain, to explore whether these gain-loss classifications predict behavior in the real effort setting.

Power Analysis Overview and Code

Before we run our experiments in this new domain, we want to be sure that the theoretical predictions yield interesting, testable implications that can be measured with reasonable sample sizes. To get a feel for this, we run simulations on bootstrapped data, allowing us to recover expected treatment effect sizes under heterogeneous populations. From the data in Goette et al (2019), we obtain a distribution of gain-loss attitudes measured in a lab population — which we assume to be representative despite the domain change. From Augenblick and Rabin (2018), we obtain MLE estimates of the cost of effort function under a particular functional form assumption, using the same task as we will in this experiment (see Table 1 for parameter estimates).  With these distributions of parameters in hand, we have all the requisite information to generate simulated behavior under the KR model, specifically the CPE assumption.

For a range of sample sizes, we bootstrap from these distributions and generate simulated behavior, which we subsequently feed into our regression of interest. For each of the sample sizes we consider, we store the estimated regression coefficient as well as the minimum detectable effect size (approximated by 2.8*SE(coef) as in Page 16 of these JPAL slides), which we ultimately plot against the bootstrapped sample. This plot helps inform us of what types of effect sizes we can reject at different sample sizes; by examining our simulated effect sizes, we are able to map the results and determine the number of participants we require.

Note that there are a number of simplifications in this code, and the final sample size will be determined using a slightly different procedure. Specifically, the analysis herein assumes we know the gain-loss value (lambda), whereas in our study, we will estimate it from experimental data. Because this introduces additional noise, we expect attenuation bias in our parameter estimates. This, and other details that were skipped over, will be discussed at length in the pre analysis plan; as soon as it is posted, I will link to it.

This Python code implements the aforementioned procedure, generating a preliminary MDE curve.

'''
Code Overview: Using data from Goette et al (2019) and Augenblick and Rabin (2019),
we bootstrap a distribution of gain-loss and cost of effort function parameters to
conduct a power analysis on our experimental hypothesis. Ultimately, we hope to
determine the requisite sample size to test our coefficient of interest with 80%
power at the 5% two-sided level.

Author: Alex Kellogg
'''

#Import required modules for the ensuing analysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as sm

'''
We assume the distribution of gain-loss attitudes follows that in Goette et al (2019)
for  this analysis. This data was estimated via MLE in a prior project, and is
generally representative of the experimentally measured distributions throughout the
literature.
'''

#Read in the relevant columns from full Goette et al (2019) lab data
ggks_dataset=pd.read_csv("dir/pooled_data.csv", usecols=['stage1_lambda','structla'])

'''
Because our task is adopted from Augenblick and Rabin (2019), who have prior
MLE estimates of the cost of effort function given their assumed functional
form, we opt to model effort costs in the form below. This allows us to introduce
heterogeneity in a rigorous manner to both cost of effort and gain-loss attitudes.
'''

def cost_of_effort(e, c_slope, c_curv):
    '''
    :param e: Effort, or the number of tasks (b/w 0 and 100).
    :param c_slope: Slope parameter, normalizing tasks to dollar costs.
    :param c_curv: Curvature parameter determining convexity of cost function.
    :return: (Negative of) Utility from completing e tasks.
    '''
    return (1/(c_slope*c_curv))*((e+10)**c_curv)

'''
We assume individual utility follows KR06 CPE, so that there is no gains/losses
in the effort domain as the number of tasks are preselected, but uncertainty
in wages yields gain-loss comparisons against reference points. Specifically,
each feasible ex-ante outcome is compared to the others, weighted by their a-priori
likelihood.

'''
def cpe_utils(e, c_slope, c_curv, wage, fixed, lam):
    '''

    :param e: The number of tasks considered.
    :param c_slope: Slope parameter, normalizing tasks to dollar costs.
    :param c_curv: Curvature parameter determining convexity of cost function.
    :param wage: Piece-rate (per task) rate of payment.
    :param fixed: Outside option, which is earned regardless of effort with 50%.
    :param lam: Gain-loss Attitude parameter, lam>1 implies loss aversion.
    :return: KR06 CPE utility of working e tasks given the preference parameters and wage rates.
    '''

    return 0.5*e*wage+0.5*fixed-0.5*(lam-1)*abs(fixed-wage*e)-cost_of_effort(e,c_slope,c_curv)

'''
Unfortunately, there is no closed form solution to the problem of optimal effort with
fixed vs piece-rate wages. Thus, to determine the optimal utility given the parameters,
we conduct a grid search over the possible values of effort, which can range between 0
and 100 tasks. The alternative to a grid search is a series of elseif conditions to
determine how the Marginal Benefit and Marginal Cost curves relate to each other, but
the number of checks is extensive and thus the computational cost of the grid search
outweighs the speed but increased error rate of the checklist approach.
'''

#Loop over the possible task choices for the agent given the parameters, and store the optimal
def optimal_effort_solver(c_slope, c_curv, wage, fixed, lam):
    '''
    :param c_slope: Slope parameter, normalizing tasks to dollar costs.
    :param c_curv: Curvature parameter determining convexity of cost function.
    :param wage: Piece-rate (per task) rate of payment.
    :param fixed: Outside option, which is earned regardless of effort with 50%.
    :param lam: Gain-loss Attitude parameter, lam>1 implies loss aversion.
    :return: The number of tasks that yields maximum CPE Utils.
    '''
    effort_space = np.arange(0, 100.1, 0.1)
    utils_vec=[]
    for a in effort_space:
        tempU= cpe_utils(a, c_slope, c_curv, wage, fixed, lam)
        utils_vec.append(tempU)
    utils_vec = np.asarray(utils_vec)
    max_ind = np.argmax(utils_vec)
    return effort_space[max_ind]

#Define a function the computes the between subjects interaction regression.
def treatment_effect_bootstrapped_between_lambda(data):
    '''
    :param data: Dataframe containing the relevant variables for the regression specification.
    :return: Regression Result.
    '''

    #generate a new effort variable that captures only the relevant effort given treatment
    data['Effort']=data['Treatment']*data['Effort_Choice_Hi_Fixed']+(1-data['Treatment'])*data['Effort_Choice_Low_Fixed']

    data['Interaction']=data['Treatment']*data['Lambda']

    result = sm.ols(formula="Effort ~Treatment+Lambda+Interaction", data=data).fit()
    return result

#Create a function to plot the Minimum Detectable Effect size
def mde_plotter(n_list, mde_list):
    '''

    :param n_list: np.array of the sample sizes we considered in the MDE analysis.
    :param mde_list: List of the mde's generated in the analysis.
    :return: Plot the MDE for each sample size.
    '''
    mde_list=np.asarray(mde_list)
    plt.plot(n_list,mde_list)
    plt.xlabel('Bootstrapped N')
    plt.ylabel('MDE')
    plt.show()

#Set up the parameters for our MDE analysis loop.
bootstrap_N_list=np.arange(50,1000,50)
bootstrap_MDE=[]
bootstrap_te=[]

'''
Iterate over each of the sample sizes and compute the MDE at that sample size.
To do so, we will sample bootstrap_N gain-loss, cost slope (constant here),
and cost curvature parameters from their assumed distributions, based on prior work.
We take this sample to be our experimental subject pool, and simulate the decisions
we would observe from these subjects if they decided according to KR CPE. We then run
our interaction specification on our full sample and record the parameter estimate for
coefficient of interest, as well as the mde, which is approximated by 2.8*SE(beta).
'''
for bootstrap_N in bootstrap_N_list:
    #sample lambdas with replacement from the empirical distribution to create our own distribution
    id = np.random.choice(np.arange(len(ggks_dataset.stage1_lambda)), bootstrap_N, replace=True)
    sampled_lambda = np.asarray(ggks_dataset.stage1_lambda[id])
    sampled_structla = np.asarray(ggks_dataset.structla[id])

    #for the cost function, we borrow numbers from Table 1 (pg 29) in AR19 assuming the large sample properties of MLE.
    #that is, we take the estimated MLE and associated sd to be distributed normally, and draw from them.
    # cost slope represents phi in the AR cost function.
    cost_slope = [724] * bootstrap_N
    # cost curvature represents gamma.
    cost_curvature = np.random.normal(2.138, 0.692, size=bootstrap_N)

    #To cut the computation in half, we solve for the optimal effort in the condition for which this simulant will ultimately wind up.
#This is a between subjects regression, so it is unaffected.
    treatment_assignment=np.random.randint(2, size=bootstrap_N)

    effort_choices_l=[]
    effort_choices_h=[]
    for i in np.arange(0,bootstrap_N):
        if treatment_assignment[i]==0:
            effort_choices_l.append(optimal_effort_solver(cost_slope[i], cost_curvature[i], 0.25, 5, sampled_lambda[i]))
            effort_choices_h.append(-1)
        else:
            effort_choices_l.append(-1)
            effort_choices_h.append(optimal_effort_solver(cost_slope[i], cost_curvature[i],0.25,20,sampled_lambda[i]))

    #Define the dataframe to feed into the regression function.
    df_cols={
        'Effort_Choice_Low_Fixed':list(map(float, np.asarray(effort_choices_l))),
        'Effort_Choice_Hi_Fixed':list(map(float, np.asarray(effort_choices_h))),
        'Lambda': sampled_lambda,
        'Structla': sampled_structla,
        'Cost_Slope': cost_slope,
        'Cost_Curvature': cost_curvature,
        'Treatment': treatment_assignment
    }
    bootstapped_data = pd.DataFrame(df_cols)

    reg_results=treatment_effect_bootstrapped_between_lambda(bootstapped_data)
    bootstrap_MDE.append(reg_results.bse[3]*2.8)
    bootstrap_te.append(reg_results.params[3])

#plot the MDE curve
mde_plotter(bootstrap_N_list,bootstrap_MDE)

The resulting plot is displayed below. Our median effect size is roughly 15 tasks in this particular simulation, which would suggest a sample of about 400 to be sufficient.

mde_plot

Sources:
Augenblick, Ned and Matthew Rabin (2018). “An Experiment on Time Preference and Misprediction in Unpleasant Tasks”.
Goette, Lorenz, Thomas Graeber, Alexandre Kellogg, and Charles Sprenger (2018). “Het- erogeneity of Gain-Loss Attitudes and Expectations-Based Reference Points”.
Kőszegi, Botond and Matthew Rabin (2006). “A model of reference-dependent preferences”. In: The Quarterly Journal of Economics, pp. 1133–1165.
Kőszegi, Botond and Matthew Rabin (2007). “Reference-Dependent Risk Attitudes.” In: American Economic Review (4): 1047–73.