# Introduction to Quantitative Methods

## 8. Panel Data, Time-Series Cross-Section Models

### 8.2 Solutions

If you're using a UCL computer, please make sure you're running R version 3.2.0. Some of the seminar tasks and exercises will not work with older versions of R.

#### Exercise 1

Download the Comparative Political Data Set. Refer to the Codebook for a complete list of variables.

#### Solution

NOTE: It is highly recommended that you restart RStudio to avoid problems with packages loading in incorrect order.

Let's load all the necessary packages first.

library(foreign)
library(plm)
library(lmtest)
library(texreg)

Warning: package 'texreg' was built under R version 3.2.3

library(dplyr)


We've saved the CPDS dataset to the course website so we can just load it directly from there.

cpds <- read.dta("http://uclspp.github.io/PUBLG100/data/CPDS_1960-2013_stata.dta")


The dataset has 291 variables, so let's just keep the ones we care about and drop the rest.

cpds <- select(cpds,
country,
year,
rae_ele,
vturn,
judrev,
ud,
unemp,
unemp_pmp,
realgdpgr,
debt,
socexp_t_pmp)


#### Exercise 2

Estimate a model for the electoral fractionalization of the party system as coded by the rae_ele variable in the dataset using all the variables listed at the end of the exercises section.

• Estimate a fixed effect model and test for country and time fixed effects.
• Run the necessary tests to check whether country and time fixed effects are present.

| # | Variable | Definition |
|---|---|---|
| 1 | vturn | Voter turnout in election. |
| 2 | judrev | Judicial review (existence of an independent body which decides whether laws conform to the constitution). Coded 0 = no, 1 = yes. |
| 3 | ud | Net union membership as a proportion of wage and salary earners in employment (union density). |
| 4 | unemp | Unemployment rate, percentage of civilian labour force. |
| 5 | unemp_pmp | Cash expenditure for unemployment benefits as a percentage of GDP (public and mandatory private). |
| 6 | realgdpgr | Growth of real GDP, percent change from previous year. |
| 7 | debt | Gross general government debt (financial liabilities) as a percentage of GDP. |
| 8 | socexp_t_pmp | Total public and mandatory private social expenditure as a percentage of GDP. |

#### Solution

Let's first look at the distribution of the dependent variable rae_ele.

hist(cpds$rae_ele, xlab = "Electoral Fractionalization Index (rae_ele)", main = "Histogram of Electoral Fractionalization")

According to the codebook, the rae_ele variable is an index of electoral fractionalization. The index can take values between 1 (maximal fractionalization) and 0 (minimal fractionalization). Let's look at the range and summary of rae_ele so we can better understand the effects of explanatory variables in relative terms.

Let's also look at the range of our independent variables. With the exception of judrev, all independent variables are percentages. The simplest way to do that is to use either the summary() or the range() function on each variable, omitting NAs where necessary. For example, to get the range of the voter turnout variable vturn, we could do the following:

range(cpds$vturn, na.rm = TRUE)

[1] 35.0 97.2


The ranges for all our independent variables look like this:

    vturn   ud unemp unemp_pmp realgdpgr   debt socexp_t_pmp
min  35.0  7.1   0.0       0.1    -21.26   4.64          9.9
max  97.2 99.1  27.5       5.3     13.20 224.24         36.0
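
The table above could be produced in one call rather than variable by variable. The following is a minimal sketch on a small made-up data frame (`toy` and its values are illustrative assumptions, not CPDS data); with the real data one would pass the numeric columns of cpds instead.

```r
# Toy stand-in for a few cpds columns (values are made up for illustration)
toy <- data.frame(
  vturn = c(35.0, 80.2, NA, 97.2),
  ud    = c(7.1, 99.1, 40.0, NA)
)

# range() returns c(min, max); sapply() applies it to every column
ranges <- sapply(toy, range, na.rm = TRUE)
rownames(ranges) <- c("min", "max")
ranges
```

Because judrev is a factor, it would have to be left out of such a call, which matches its absence from the table.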


Now let's look at the range for the dependent variable as well:

range(cpds$rae_ele, na.rm = TRUE)

[1] 0.491510 0.928253

summary(cpds$rae_ele)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
0.4915  0.6751  0.7436  0.7326  0.8076  0.9283       7


Although rae_ele is coded as an index between 0 and 1, the values in our dataset only range from 0.49 to 0.93, with three quarters of the observations between 0.68 and 0.93.

Let's run our first model of country fixed effects.

country_effects <- plm(rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr + debt + socexp_t_pmp,
data = cpds,
index = c("country", "year"),
model = "within",
effect = "individual")

summary(country_effects)

Oneway (individual) effect Within Model

Call:
plm(formula = rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp +
realgdpgr + debt + socexp_t_pmp, data = cpds, effect = "individual",
model = "within", index = c("country", "year"))

Unbalanced Panel: n=35, T=4-32, N=768

Residuals :
Min.  1st Qu.   Median  3rd Qu.     Max.
-0.12200 -0.01710 -0.00169  0.01650  0.12900

Coefficients :
Estimate  Std. Error t-value  Pr(>|t|)
vturn                 -2.0891e-03  2.8088e-04 -7.4376 2.904e-13 ***
judrevJudicial review -3.9645e-02  1.2933e-02 -3.0655  0.002253 **
ud                    -7.5212e-04  2.8101e-04 -2.6765  0.007608 **
unemp                  3.8075e-03  7.1698e-04  5.3105 1.456e-07 ***
unemp_pmp             -5.8664e-03  3.3172e-03 -1.7685  0.077402 .
realgdpgr              4.3781e-05  4.6664e-04  0.0938  0.925277
debt                  -8.4289e-05  8.9307e-05 -0.9438  0.345579
socexp_t_pmp          -2.0604e-03  7.5389e-04 -2.7330  0.006430 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    0.83227
Residual Sum of Squares: 0.7102
R-Squared      :  0.14668
F-statistic: 15.5778 on 8 and 725 DF, p-value: < 2.22e-16
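
To make the within transformation less of a black box, here is a small sketch on simulated data (all names and numbers are assumptions for illustration, not CPDS values). It shows that demeaning each variable by country and running OLS gives the same slope as including a dummy for every country, which is what a one-way fixed effects model with effect = "individual" is understood to estimate.

```r
set.seed(123)

# Simulated panel: 5 hypothetical countries, 20 periods each
country <- factor(rep(paste0("C", 1:5), each = 20))
alpha   <- rep(rnorm(5, sd = 2), each = 20)   # country-specific intercepts
x       <- rnorm(100) + alpha                 # regressor correlated with the intercepts
y       <- alpha + 0.5 * x + rnorm(100, sd = 0.1)

# (a) Least squares dummy variables: one dummy per country
b_lsdv <- coef(lm(y ~ x + country))["x"]

# (b) Within estimator: subtract country means, then OLS without intercept
x_w <- x - ave(x, country)
y_w <- y - ave(y, country)
b_within <- coef(lm(y_w ~ x_w - 1))["x_w"]

c(lsdv = unname(b_lsdv), within = unname(b_within))
```

The two slopes agree up to floating point error, so the country dummies never need to be estimated explicitly.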


When controlling for country fixed effects, our model tells us that voter turnout, judicial independence, labor union membership, unemployment, and total social expenditure are all statistically significant factors affecting electoral fractionalization. But before we go any further, let's just verify the presence of country fixed effects using plmtest().

plmtest(country_effects, effect="individual")

    Lagrange Multiplier Test - (Honda)

data:  rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr +  ...
normal = 189.02, p-value < 2.2e-16
alternative hypothesis: significant effects


Since the p-value is less than 0.05, we reject the null hypothesis of no country effects and can continue with our analysis. Let's look at our statistically significant variables and find out what their coefficients mean in relative terms. We'll start with judicial independence, or judrev, which is a factor variable coded as 1 for the existence of an independent body and 0 otherwise. We can calculate the relative effect of its coefficient -0.0396455 as follows:

-3.9645e-02 / diff(range(cpds$rae_ele, na.rm = TRUE))

[1] -0.09077421

The calculation above tells us that the electoral fractionalization index is 0.0396 lower in countries with an independent judicial body, which amounts to 9.08 % of the observed range of the index. We would like to know the relative effect of the other coefficients as well. Instead of calculating them one by one, we can calculate them all at once:

coef(country_effects) / diff(range(cpds$rae_ele, na.rm = TRUE))

                vturn judrevJudicial review                    ud
-0.0047832999         -0.0907752600         -0.0017221059
unemp             unemp_pmp             realgdpgr
0.0087180116         -0.0134320501          0.0001002442
debt          socexp_t_pmp
-0.0001929955         -0.0047175664
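
The scaling used above can be checked from the printed numbers alone, without loading the data. This is a minimal sketch; `relative_effect` is a hypothetical helper name, and the range values are copied from the rounded range() output, so the result may differ from the printed one in the last digits.

```r
# Observed range of rae_ele, copied from the range() output above
rae_range <- c(0.491510, 0.928253)

# Hypothetical helper: express a coefficient as a share of the observed range
relative_effect <- function(beta, y_range) beta / diff(y_range)

# Judicial review coefficient from the country fixed effects model
rel_judrev <- relative_effect(-3.9645e-02, rae_range)
round(100 * rel_judrev, 2)   # percent of the observed range
```

Dividing by the observed range rather than the theoretical 0 to 1 range makes the effects look larger, which is a deliberate choice here because no country in the data comes close to 0.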


Now we can easily interpret the results for every statistically significant factor. Looking at the absolute coefficient estimates and the relative percentages, we can see that none of the other statistically significant factors has a large effect on our dependent variable.

For example, a one percentage point increase in voter turnout reduces electoral fractionalization by merely 0.002, as does a one percentage point increase in social spending. This amounts to a relative decline of less than half a percent (0.48 %). Unemployment is the only factor with a statistically significant and positive correlation with our dependent variable. A one percentage point increase in unemployment increases electoral fractionalization by 0.004, a relative increase of under one percent (0.87 %).

None of the remaining factors (unemployment benefits, government debt and GDP growth) appears to have a statistically significant effect on electoral fractionalization.

Now, let's run a time fixed effect model to estimate any effects that vary across time but are constant across countries.

time_effects <- plm(rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr + debt + socexp_t_pmp,
data = cpds,
index = c("country", "year"),
model = "within",
effect = "time")

plmtest(time_effects, effect="time")

    Lagrange Multiplier Test - time effects (Honda)

data:  rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr +  ...
normal = 3.3287, p-value = 0.0008727
alternative hypothesis: significant effects


Since plmtest() tells us that time fixed effects are statistically significant, we can proceed with our analysis.

summary(time_effects)

Oneway (time) effect Within Model

Call:
plm(formula = rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp +
realgdpgr + debt + socexp_t_pmp, data = cpds, effect = "time",
model = "within", index = c("country", "year"))

Unbalanced Panel: n=35, T=4-32, N=768

Residuals :
Min.  1st Qu.   Median  3rd Qu.     Max.
-0.21600 -0.04400 -0.00053  0.05480  0.19500

Coefficients :
Estimate  Std. Error t-value  Pr(>|t|)
vturn                 -0.00035960  0.00025356 -1.4182 0.1565535
judrevJudicial review -0.02911525  0.00774309 -3.7602 0.0001834 ***
ud                     0.00043130  0.00017305  2.4923 0.0129116 *
unemp                  0.00110320  0.00093798  1.1761 0.2399207
unemp_pmp              0.01588536  0.00405292  3.9195 9.711e-05 ***
realgdpgr              0.00411686  0.00143653  2.8658 0.0042792 **
debt                  -0.00020220  0.00010232 -1.9761 0.0485241 *
socexp_t_pmp           0.00552104  0.00078057  7.0731 3.562e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    5.9337
Residual Sum of Squares: 4.5925
R-Squared      :  0.22603
F-statistic: 26.5754 on 8 and 728 DF, p-value: < 2.22e-16


Before trying to interpret the results, let's put the effects of statistically significant coefficients in context.

coef(time_effects) / diff(range(cpds$rae_ele, na.rm = TRUE))

                vturn judrevJudicial review                    ud
-0.0008233762         -0.0666644914          0.0009875313
unemp             unemp_pmp             realgdpgr
0.0025259692          0.0363723244          0.0094262695
debt          socexp_t_pmp
-0.0004629617          0.0126413824


With time fixed effects, judicial independence is again statistically significant; however, its effect is smaller than it was in the country fixed effects model. In the time fixed effects model, countries with an independent judicial body experience a decline of 0.029 on the electoral fractionalization index, which amounts to 6.67 %. The only other factor that is negatively correlated with our dependent variable is government debt, although its marginal effect is negligible. An increase of 1 point in government debt as a percentage of GDP has an estimated effect of lowering electoral fractionalization by 0.0002 or 0.05 %. On the other hand, for each increase in unemployment benefits by one percent of GDP, electoral fractionalization rises by 0.016, an increase of 3.64 %.

#### Exercise 3

Estimate a twoway model and compare to the previous country and time fixed effect models.

#### Solution

Let's run a twoway fixed effect model that controls for both country and time fixed effects.

twoway_effects <- plm(rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr + debt + socexp_t_pmp,
data = cpds,
index = c("country", "year"),
model = "within",
effect = "twoway")

summary(twoway_effects)

Twoways effects Within Model

Call:
plm(formula = rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp +
realgdpgr + debt + socexp_t_pmp, data = cpds, effect = "twoway",
model = "within", index = c("country", "year"))

Unbalanced Panel: n=35, T=4-32, N=768

Residuals :
Min.  1st Qu.   Median  3rd Qu.     Max.
-0.09810 -0.01770 -0.00108  0.01540  0.11200

Coefficients :
Estimate  Std. Error t-value  Pr(>|t|)
vturn                 -1.4588e-03  2.7207e-04 -5.3619 1.122e-07 ***
judrevJudicial review -5.6636e-02  1.2445e-02 -4.5509 6.307e-06 ***
ud                    -6.8797e-04  2.8491e-04 -2.4147  0.016007 *
unemp                  3.5554e-03  6.8669e-04  5.1776 2.947e-07 ***
unemp_pmp              8.9455e-04  3.4704e-03  0.2578  0.796662
realgdpgr             -1.4839e-03  5.7279e-04 -2.5906  0.009781 **
debt                  -2.4835e-04  8.4592e-05 -2.9358  0.003437 **
socexp_t_pmp          -5.3631e-03  8.4541e-04 -6.3438 4.044e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    0.7043
Residual Sum of Squares: 0.5786
R-Squared      :  0.17847
Adj. R-Squared :  0.16128
F-statistic: 18.8459 on 8 and 694 DF, p-value: < 2.22e-16


coef(twoway_effects) / diff(range(cpds$rae_ele, na.rm = TRUE))

                vturn judrevJudicial review                    ud
-0.003340257          -0.129678436          -0.001575240
unemp             unemp_pmp             realgdpgr
0.008140687           0.002048229          -0.003397621
debt          socexp_t_pmp
-0.000568639          -0.012279748


Judicial independence once again is not only statistically significant but also has a rather substantial effect on electoral fractionalization. Countries with an independent judicial body experience a decline of 0.057 on the electoral fractionalization index. This effect amounts to 12.97 % of the total range of values in the electoral fractionalization index. Voter turnout has a statistically significant negative effect, with a one percentage point increase estimated to decrease electoral fractionalization by 0.001 or 0.33 %. Similarly, a rise in real GDP growth by 1 percentage point is estimated to decrease electoral fractionalization by 0.001 (0.34 %). When interpreting the effects of voter turnout and GDP growth it is important to keep in mind that a year-over-year increase of 1 percentage point in GDP growth is rather substantial, whereas it is not uncommon to see voter turnout fluctuating by several percentage points from one election to another.

A one percentage point increase in union density has an estimated effect of decreasing our dependent variable by 0.001 (0.16 %), while a one percentage point increase in unemployment increases it by 0.004 (0.81 %). Social expenditure also has a statistically significant effect, with a 1 percentage point increase driving electoral fractionalization down by 0.005 or 1.23 %, which is still relatively small.

We can now compare the three models we've estimated side-by-side.

screenreg(list(country_effects, time_effects, twoway_effects),
custom.model.names = c("Country Fixed Effects", "Time Fixed Effects", "Twoway Fixed Effects"))

======================================================================================
Country Fixed Effects  Time Fixed Effects  Twoway Fixed Effects
--------------------------------------------------------------------------------------
vturn                   -0.00 ***              -0.00               -0.00 ***
(0.00)                 (0.00)              (0.00)
judrevJudicial review   -0.04 **               -0.03 ***           -0.06 ***
(0.01)                 (0.01)              (0.01)
ud                      -0.00 **                0.00 *             -0.00 *
(0.00)                 (0.00)              (0.00)
unemp                    0.00 ***               0.00                0.00 ***
(0.00)                 (0.00)              (0.00)
unemp_pmp               -0.01                   0.02 ***            0.00
(0.00)                 (0.00)              (0.00)
realgdpgr                0.00                   0.00 **            -0.00 **
(0.00)                 (0.00)              (0.00)
debt                    -0.00                  -0.00 *             -0.00 **
(0.00)                 (0.00)              (0.00)
socexp_t_pmp            -0.00 **                0.01 ***           -0.01 ***
(0.00)                 (0.00)              (0.00)
--------------------------------------------------------------------------------------
R^2                      0.15                   0.23                0.18
Num. obs.              768                    768                 768
======================================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05


By comparing the models we can see that judicial independence is statistically significant in every model and is negatively correlated with electoral fractionalization. Real GDP growth and social spending are positively correlated with it in the time effects model but negatively correlated in the twoway fixed effects model. A one point increase in GDP growth is estimated to increase electoral fractionalization by 0.004 (0.94 %) in the time fixed effects model and decrease it by 0.001 (0.34 %) in the twoway fixed effects model. Similarly, a one percentage point increase in social spending is estimated to increase electoral fractionalization by 0.006 (1.26 %) in the time fixed effects model and decrease it by 0.005 (1.23 %) in the twoway fixed effects model.

The effect of voter turnout in the country fixed effects model is roughly 1.4 times its effect in the twoway fixed effects model, and it is not statistically significant in the time fixed effects model. A one percentage point increase in voter turnout is estimated to decrease electoral fractionalization by 0.0021 with country fixed effects, compared to 0.0015 in the twoway fixed effects model.

Unemployment effects in the country fixed effects model are similar to those in the twoway fixed effects model, and they are not statistically significant in the model where we only control for time fixed effects. When unemployment increases by 1 percentage point, it has an estimated effect of an increase of 0.004 in our dependent variable in both the country fixed effects and twoway effects models, which amounts to a relative increase of just under 1 percent.

#### Exercise 4

Test for serial correlation and cross sectional dependence in the twoway model.

#### Solution

We can test serial correlation with the Breusch-Godfrey test to see if there is any temporal dependence in our model.

pbgtest(twoway_effects)

    Breusch-Godfrey/Wooldridge test for serial correlation in panel
models

data:  rae_ele ~ vturn + judrev + ud + unemp + unemp_pmp + realgdpgr +     debt + socexp_t_pmp
chisq = 311.87, df = 4, p-value < 2.2e-16
alternative hypothesis: serial correlation in idiosyncratic errors


The Breusch-Godfrey test tells us that our model does indeed suffer from serially correlated errors.
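
The idea behind the test can be sketched for a single time series: regress the residuals on the regressors and on their own lag, and check whether the lag helps explain them. The code below is a simplified, order-1 illustration on simulated data and is not the panel procedure that pbgtest() implements; all names and parameter values are assumptions.

```r
set.seed(2024)
n <- 200

# Simulate a regression whose errors follow an AR(1) process with rho = 0.8
x <- rnorm(n)
u <- as.numeric(stats::filter(rnorm(n), filter = 0.8, method = "recursive"))
y <- 1 + 0.5 * x + u

# Step 1: residuals from the original regression
e <- resid(lm(y ~ x))

# Step 2: auxiliary regression of e_t on x_t and e_{t-1}
aux <- lm(e[-1] ~ x[-1] + e[-n])

# Step 3: LM statistic = (number of obs) * R^2, compared to chi-squared(1)
lm_stat <- length(e[-1]) * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 1, lower.tail = FALSE)
c(statistic = lm_stat, p.value = p_value)
```

stats::filter() is called with its namespace because dplyr, loaded earlier, masks filter().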

Next, we run the Pesaran cross-sectional dependence test to check for spatial dependence in our model.

pcdtest(twoway_effects)

    Pesaran CD test for cross-sectional dependence in panels

data:  formula
z = -3.8849, p-value = 0.0001024
alternative hypothesis: cross-sectional dependence


The Pesaran CD test confirms the presence of cross-sectional dependence in our model.
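
For a balanced panel, Pesaran's CD statistic is built from the pairwise correlations of the residuals across units. The following is a minimal sketch on simulated residuals (hypothetical data and names). The CPDS panel is unbalanced, so pcdtest() has to handle pairs of countries with differing overlap, which this sketch does not attempt.

```r
set.seed(7)
n_units   <- 10   # hypothetical number of countries
n_periods <- 50   # hypothetical number of years

# Residuals sharing a common shock each period, so units move together
common <- rnorm(n_periods)
res <- sapply(seq_len(n_units), function(i) common + rnorm(n_periods))

# Pairwise correlations between units (columns), upper triangle only
rho <- cor(res)
rho_pairs <- rho[upper.tri(rho)]

# CD = sqrt(2T / (N(N-1))) * sum of pairwise correlations; approx N(0,1) under H0
cd <- sqrt(2 * n_periods / (n_units * (n_units - 1))) * sum(rho_pairs)
p_value <- 2 * pnorm(-abs(cd))
c(CD = cd, p.value = p_value)
```

With a shared shock the correlations are all positive and the statistic is far from zero; with independent units it would hover around zero.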

#### Exercise 5

If either serial correlation or cross-sectional dependence is present, use the methods discussed in the seminar to obtain heteroskedasticity and autocorrelation consistent standard errors.

#### Solution

From the Breusch-Godfrey test we know that we have serial correlation in the error term of our model, so we need to adjust the standard errors. We can use the "arellano" method to obtain heteroskedasticity and autocorrelation consistent (HAC) standard errors. We first obtain a HAC robust covariance matrix from vcovHC() by specifying method = "arellano" and then pass it along to coeftest() to adjust the standard errors.

twoway_effects_hac <- coeftest(twoway_effects,
vcov = vcovHC(twoway_effects, method = "arellano", type = "HC3"))

twoway_effects_hac

t test of coefficients:

Estimate  Std. Error t value  Pr(>|t|)
vturn                 -0.00145883  0.00067683 -2.1554 0.0314745 *
judrevJudicial review -0.05663615  0.04175377 -1.3564 0.1754028
ud                    -0.00068797  0.00078204 -0.8797 0.3793147
unemp                  0.00355539  0.00114190  3.1136 0.0019244 **
unemp_pmp              0.00089455  0.00753789  0.1187 0.9055681
realgdpgr             -0.00148389  0.00053781 -2.7591 0.0059485 **
debt                  -0.00024835  0.00014049 -1.7678 0.0775399 .
socexp_t_pmp          -0.00536309  0.00150857 -3.5551 0.0004035 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
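
To see what such a correction does, the sketch below computes a cluster-robust sandwich covariance by hand for a one-regressor fixed effects model on simulated data. It clusters by country in the spirit of the Arellano estimator but applies no small-sample adjustment (roughly "HC0" rather than the "HC3" used above), so it illustrates the idea and is not a reimplementation of vcovHC(). All names and numbers are assumptions.

```r
set.seed(99)
n_groups <- 30; n_periods <- 10
group <- rep(seq_len(n_groups), each = n_periods)

# Regressor and AR(1) errors within each hypothetical country
x <- rnorm(n_groups * n_periods)
u <- unlist(lapply(seq_len(n_groups), function(g)
  as.numeric(stats::filter(rnorm(n_periods), filter = 0.7, method = "recursive"))))
y <- 0.5 * x + u

# Within transformation (country fixed effects)
X <- cbind(x = x - ave(x, group))
y_w <- y - ave(y, group)
beta <- solve(crossprod(X), crossprod(X, y_w))
e <- as.numeric(y_w - X %*% beta)

# Sandwich: bread = (X'X)^-1, meat = sum over countries of X_i' e_i e_i' X_i
bread <- solve(crossprod(X))
meat <- Reduce(`+`, lapply(split(seq_along(e), group), function(idx) {
  tcrossprod(crossprod(X[idx, , drop = FALSE], e[idx]))
}))
vcov_cluster <- bread %*% meat %*% bread

# Compare with the classical standard error that assumes i.i.d. errors
df_resid <- length(e) - n_groups - ncol(X)
se_classical <- sqrt(diag(bread) * sum(e^2) / df_resid)
se_cluster <- sqrt(diag(vcov_cluster))
c(classical = unname(se_classical), clustered = unname(se_cluster))
```

The coefficient estimate is identical under both approaches; only the estimated uncertainty changes, which is why the Estimate column above matches the original twoway model.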


Since the Pesaran CD test told us that we have cross-sectional dependence in our model, we need to adjust for spatial dependence as well. Fortunately, Driscoll and Kraay's (1998) SCC method gives us heteroskedasticity and autocorrelation consistent errors that are also robust to cross-sectional dependence.

twoway_effects_scc <- coeftest(twoway_effects,
vcov = vcovSCC(twoway_effects, type="HC3", cluster = "group"))

twoway_effects_scc

t test of coefficients:

Estimate  Std. Error t value  Pr(>|t|)
vturn                 -1.4588e-03  3.4686e-04 -4.2058 2.942e-05 ***
judrevJudicial review -5.6636e-02  2.9588e-02 -1.9142 0.0560119 .
ud                    -6.8797e-04  3.9845e-04 -1.7266 0.0846817 .
unemp                  3.5554e-03  8.1618e-04  4.3562 1.524e-05 ***
unemp_pmp              8.9455e-04  2.9604e-03  0.3022 0.7626130
realgdpgr             -1.4839e-03  3.7905e-04 -3.9148 9.941e-05 ***
debt                  -2.4835e-04  7.4754e-05 -3.3222 0.0009398 ***
socexp_t_pmp          -5.3631e-03  1.2916e-03 -4.1522 3.703e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##### Dynamic Models

Another way to address the serial correlation is by estimating a dynamic model that uses a lagged dependent variable (LDV). The advantage of a dynamic model is that it not only allows us to overcome the problem of serial correlation but also helps us answer the question of short term and long term effects of explanatory variables on the dependent variable.

To estimate a dynamic model, we add lag(rae_ele) to our list of explanatory variables leaving everything exactly the same as we had when we estimated a twoway fixed effect model.

dynamic_model <-
plm(rae_ele ~ lag(rae_ele) + vturn + judrev + ud + unemp + unemp_pmp + realgdpgr + debt + socexp_t_pmp,
data = cpds,
index = c("country", "year"),
model = "within",
effect = "twoways")
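
Within plm(), lag() is understood to shift values within each country rather than across the whole column, so the first year of one country never borrows the last year of another. The sketch below reproduces that behaviour in base R on a made-up two-country panel (`toy` and its values are assumptions for illustration).

```r
toy <- data.frame(
  country = rep(c("A", "B"), each = 3),
  year    = rep(2001:2003, times = 2),
  rae_ele = c(0.70, 0.72, 0.75, 0.60, 0.61, 0.65)
)

# Sort by country and year so that "previous row" means "previous year"
toy <- toy[order(toy$country, toy$year), ]

# Shift within each country; the first year of every country has no lag
toy$rae_ele_lag <- ave(toy$rae_ele, toy$country,
                       FUN = function(v) c(NA, head(v, -1)))
toy
```

The NA in each country's first year is why a dynamic model is estimated on fewer observations than the static one.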


We need to make sure that there is no residual serial correlation left in our model. We can run the Breusch-Godfrey test again to verify that.

pbgtest(dynamic_model)

    Breusch-Godfrey/Wooldridge test for serial correlation in panel
models

data:  rae_ele ~ lag(rae_ele) + vturn + judrev + ud + unemp + unemp_pmp +     realgdpgr + debt + socexp_t_pmp
chisq = 5.2407, df = 4, p-value = 0.2635
alternative hypothesis: serial correlation in idiosyncratic errors


The p-value of 0.2635 tells us that we cannot reject the null hypothesis of no serial correlation, so there is no evidence of remaining serial correlation in our model. If, however, serial correlation were still present, we wouldn't be able to use this model.

We also need to make sure there is no cross-sectional correlation in our model. Recall that we can test for cross-sectional dependence with the Pesaran CD test.

pcdtest(dynamic_model)

    Pesaran CD test for cross-sectional dependence in panels

data:  formula
z = -3.226, p-value = 0.001255
alternative hypothesis: cross-sectional dependence


The Pesaran CD test does show cross-sectional dependence, but we can correct for it with the Driscoll and Kraay SCC method we used earlier with the twoway fixed effects model.

dynamic_model_scc <- coeftest(dynamic_model,
vcov = vcovSCC(dynamic_model, type="HC3", cluster = "group"))


Let's now look at the results from our dynamic model.

dynamic_model_scc

t test of coefficients:

Estimate  Std. Error t value  Pr(>|t|)
lag(rae_ele)           7.3765e-01  5.3786e-02 13.7147 < 2.2e-16 ***
vturn                 -4.5016e-04  2.0145e-04 -2.2346   0.02576 *
judrevJudicial review -2.1053e-02  1.2313e-02 -1.7099   0.08773 .
ud                    -9.4584e-05  2.0606e-04 -0.4590   0.64636
unemp                  1.9049e-03  4.6751e-04  4.0745 5.145e-05 ***
unemp_pmp             -3.1041e-03  2.0591e-03 -1.5075   0.13214
realgdpgr             -4.8139e-04  2.7066e-04 -1.7786   0.07575 .
debt                  -7.4099e-05  5.9899e-05 -1.2371   0.21648
socexp_t_pmp          -1.1731e-03  4.7969e-04 -2.4455   0.01471 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


The serial correlation is now captured in the model and allows us to estimate the short term and long term effects of each explanatory variable.

Recall from the lecture and the readings that the dynamic model is specified with the following equation:

$$Y_{it} = \alpha_i + \phi Y_{i,t-1} + \beta_1 X_{it} + u_{it}$$

Our dynamic model has estimated the coefficient for the lagged dependent variable to be 0.738.

We can now calculate the short term effect and long term effects as follows:

##### Short Term Effects

For short term effects, we can interpret the coefficient estimates of statistically significant variables exactly the same way as we do for any other linear model. For example, the immediate effect of an independent judiciary in a country is a decrease of 0.021 or 4.82 %, although this estimate is only significant at the 10 % level. Social spending, unemployment and voter turnout are statistically significant at the 5 % level and we'll discuss them in greater detail in the final section.

##### Long Term Effects

We estimate the long term effects as follows:

$$\frac{\beta_1}{1 - \phi}$$

Since the coefficient for the lagged dependent variable is 0.738, we can substitute it and calculate the long term effects as:

$$\frac{\beta_1}{1 - 0.738}$$

For voter turnout, the long term effect can therefore be calculated as:

$$\frac{-0.0005}{1 - 0.738}$$

We will do the same for other variables that are statistically significant in our model in the final section.
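
The long term formula can be motivated as follows, assuming $|\phi| < 1$ and a permanent one-unit increase in $X$ (a standard argument, sketched here rather than taken from the seminar). The increase changes $Y$ by $\beta_1$ immediately. In the next period the lagged dependent variable carries forward a share $\phi$ of the previous change on top of the new direct effect, and so on, so the total change accumulates as a geometric series:

$$\beta_1 + \phi \beta_1 + \phi^2 \beta_1 + \dots = \beta_1 \sum_{k=0}^{\infty} \phi^k = \frac{\beta_1}{1 - \phi}$$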

#### Exercise 6

Compare the HAC and spatially robust standard errors with the twoway model estimated earlier.

#### Solution

The twoway fixed effects model, the same model with heteroskedasticity and autocorrelation consistent (HAC) standard errors, and the same model with SCC corrected standard errors are presented in a side-by-side comparison below.

screenreg(list(twoway_effects, twoway_effects_hac, twoway_effects_scc),
custom.model.names = c("Twoway", "Twoway (HAC)", "Twoway (SCC)"))

=============================================================
Twoway      Twoway (HAC)  Twoway (SCC)
-------------------------------------------------------------
vturn                   -0.00 ***  -0.00 *       -0.00 ***
(0.00)     (0.00)        (0.00)
judrevJudicial review   -0.06 ***  -0.06         -0.06
(0.01)     (0.04)        (0.03)
ud                      -0.00 *    -0.00         -0.00
(0.00)     (0.00)        (0.00)
unemp                    0.00 ***   0.00 **       0.00 ***
(0.00)     (0.00)        (0.00)
unemp_pmp                0.00       0.00          0.00
(0.00)     (0.01)        (0.00)
realgdpgr               -0.00 **   -0.00 **      -0.00 ***
(0.00)     (0.00)        (0.00)
debt                    -0.00 **   -0.00         -0.00 ***
(0.00)     (0.00)        (0.00)
socexp_t_pmp            -0.01 ***  -0.01 ***     -0.01 ***
(0.00)     (0.00)        (0.00)
-------------------------------------------------------------
R^2                      0.18
Num. obs.              768
=============================================================
*** p < 0.001, ** p < 0.01, * p < 0.05


The standard errors of the "Twoway" model ignore heteroskedasticity, autocorrelation and cross-sectional dependence, while those of the "Twoway (HAC)" model still ignore cross-sectional dependence. The "Twoway (SCC)" model uses heteroskedasticity and autocorrelation consistent (HAC) standard errors that are also robust to cross-sectional dependence, and is the appropriate model out of the three compared above. We'll discuss the comparison of the "Twoway (SCC)" model with a dynamic model that uses a lagged dependent variable in the next section.

#### Exercise 7

Display the results in publication-ready tables and discuss the substantively significant findings.

#### Solution

We'll present all the models of interest in a publication-ready table using screenreg().

screenreg(list(country_effects, twoway_effects, twoway_effects_hac, twoway_effects_scc, dynamic_model_scc),
custom.model.names = c("Country Effects", "Twoway Effects", "Twoway (HAC)", "Twoway (SCC)", "Dynamic"))

=============================================================================================
Country Effects  Twoway Effects  Twoway (HAC)  Twoway (SCC)  Dynamic
---------------------------------------------------------------------------------------------
vturn                   -0.00 ***        -0.00 ***      -0.00 *       -0.00 ***     -0.00 *
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
judrevJudicial review   -0.04 **         -0.06 ***      -0.06         -0.06         -0.02
(0.01)           (0.01)         (0.04)        (0.03)        (0.01)
ud                      -0.00 **         -0.00 *        -0.00         -0.00         -0.00
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
unemp                    0.00 ***         0.00 ***       0.00 **       0.00 ***      0.00 ***
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
unemp_pmp               -0.01             0.00           0.00          0.00         -0.00
(0.00)           (0.00)         (0.01)        (0.00)        (0.00)
realgdpgr                0.00            -0.00 **       -0.00 **      -0.00 ***     -0.00
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
debt                    -0.00            -0.00 **       -0.00         -0.00 ***     -0.00
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
socexp_t_pmp            -0.00 **         -0.01 ***      -0.01 ***     -0.01 ***     -0.00 *
(0.00)           (0.00)         (0.00)        (0.00)        (0.00)
lag(rae_ele)                                                                         0.74 ***
(0.05)
---------------------------------------------------------------------------------------------
R^2                      0.15             0.18
Num. obs.              768              768
=============================================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05


NOTE: In the screenreg() output, the coefficients are rounded to 2 decimal places by default, which makes it seem as if many of them were zero. You can change the number of decimal places displayed, but it is more important to understand that small coefficients don't necessarily mean that the effects are small, and they certainly do not mean that the variable is statistically insignificant. When interpreting your results, always keep in mind the scale and range of your independent and dependent variables.

The country effects model was the first model we estimated. However, since our data has time fixed effects as well, we estimated a twoway fixed effects model that controls for both country and time fixed effects. After correcting for heteroskedasticity and temporal and spatial correlation, our preferred static specification was "Twoway (SCC)". Since we're also interested in estimating short and long term effects of our independent variables on electoral fractionalization, we used a dynamic model with a lagged dependent variable (LDV) and ensured that there was no residual serial correlation in our model. We corrected for cross-sectional dependence in the dynamic model with the Driscoll and Kraay SCC method, and the resulting dynamic model is our preferred model.

In our dynamic model voter turnout and social expenditure are negatively correlated with our dependent variable while unemployment has a positive correlation. We'll discuss each of these statistically significant variables in turn in this section. None of the other variables were statistically significant in our dynamic model.

We saw earlier that for short term effects, the coefficient estimates of statistically significant variables in a dynamic model can be interpreted exactly the same way as we do for any other linear model. In our dynamic model, the immediate effect of a one percentage point increase in voter turnout is estimated to decrease electoral fractionalization by 0.0005 or 0.1 %. We could think of increased voter participation as a sign of a healthy democracy with trust in institutions and established parties causing a decrease in electoral fractionalization.

A one percentage point increase in social expenditure in our dynamic model is estimated to decrease electoral fractionalization by 0.001 (0.27 %). Social expenditure could be viewed as a sign of prosperity, with voters satisfied with their share of the economic pie and not interested in unnecessary electoral competition. It could also be viewed as having a pacifying effect on the population and keeping the masses quiet for the sake of securing electoral stability.

The unemployment rate has a positive correlation with our dependent variable and is estimated to increase electoral fractionalization by 0.002 or 0.44 % for each percentage point increase in the jobless rate. Again, we could think of an increase in unemployment as fuelling general dissatisfaction with the institutions, leading to a rise in newer parties taking votes away from established parties and increasing electoral fractionalization.

Finally, the reason we prefer the dynamic model is that it allows us to model both short term and long term effects. We presented the calculation of long term effects in the previous section but will recap it and present the calculation for all statistically significant variables here.

For voter turnout, the long term effect can be calculated as follows:

$$\frac{-0.0005}{1 - 0.738}$$

We can do the same to calculate the long term effect of social expenditure:

$$\frac{-0.001}{1 - 0.738}$$

Similarly the long term effect of unemployment is calculated as:

$$\frac{0.002}{1 - 0.738}$$

Notice that the denominator in our equation for estimating long term effect gets smaller as the coefficient for the lagged dependent variable increases. This means that the larger the coefficient of the lagged dependent variable, the greater the long term effects of our independent variables.
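
These fractions can be evaluated in a few lines of R. The sketch below uses the unrounded estimates copied from the dynamic model output, so the results differ slightly from what the rounded numerators shown here would give; the variable names are illustrative assumptions.

```r
# Estimates copied from the dynamic model (SCC) output above
phi  <- 7.3765e-01
beta <- c(vturn = -4.5016e-04, unemp = 1.9049e-03, socexp_t_pmp = -1.1731e-03)

# Long term effect: beta / (1 - phi)
long_run <- beta / (1 - phi)
signif(long_run, 3)

# The multiplier 1 / (1 - phi) grows as phi approaches 1
phi_grid   <- c(0.25, 0.50, 0.75, 0.90)
multiplier <- 1 / (1 - phi_grid)
data.frame(phi = phi_grid, multiplier = multiplier)
```

With the estimated coefficient of about 0.74 on the lagged dependent variable, each long term effect is a little under four times its short term counterpart.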