23_model-posteriors.Rmd

## Model Results {#sec:model-results}

<!-- in chapter: model -->

```{r stopper, eval = FALSE, cache = FALSE, include = FALSE}
knitr::knit_exit()
```

```{r knitr-02-3_model-results, include = FALSE, cache = FALSE}
source(here::here("assets-bookdown", "knitr-helpers.R"))
```

```{r r-02-3_model-results, cache = FALSE}
library("here")
library("magrittr")
library("tidyverse")
library("boxr"); box_auth()

library("scales")
# library("english")
library("latex2exp")
library("patchwork")
library("ggforce")

# library("rstan")
library("tidybayes")
library("broom")
```

```{r mcmc-data}
# grab most recent filename
most_recent_date <- 
  list.files(here("data", "mcmc", "2-dgirt", "run", "samples")) %>%
  str_split(pattern = "-mcmc", simplify = TRUE) %>%
  as_tibble(.name_repair = "universal") %>%
  transmute(date = lubridate::ymd(`...1`)) %>%
  filter(date == max(date)) %>%
  pull(date) %>%
  print()

# uniquely identify file (length should be 1)
stopifnot(length(most_recent_date) == 1)

# make filename
mcmc_path <- file.path("data", "mcmc", "2-dgirt", "run", "samples")
clean_path <- file.path("data", "mcmc", "2-dgirt", "run", "clean")
input_path <- file.path("data", "mcmc", "2-dgirt", "run", "input")

most_recent_static <- as.character(most_recent_date) %>%
  str_glue("-mcmc-homsk-2010s.RDS") %>%
  as.character() %>%
  print()

# import MCMC
mcmc <- 
  here(mcmc_path, most_recent_static) %>%
  # here(mcmc_path, "2020-01-10-mcmc-homsk-2010s.RDS") %>% # small?
  readRDS()
```

```{r contextual-data}
# tidy pre-stan data
master_data <- 
  readRDS(here(input_path, "master-model-data.RDS")) %>%
  print()

# stan data
stan_data <- readRDS(here(input_path, "stan-data-list.RDS"))
lapply(stan_data, head)

index_crosswalk <- 
  tibble(
    # factors come from master-data
    group_f = master_data$group,
    item_f = master_data$item,
    state_f = master_data$state,
    region_f = master_data$region,
    district_f = master_data$district,
    party_f = master_data$party,
    # integers from stan (but could be coerced from master?)
    group = stan_data$group, 
    item = stan_data$item,
    state = stan_data$state,
    region = stan_data$region,
    district = stan_data$district,
    party = stan_data$party
  ) %>%
  # the "ordering" of groups as per my mistake in stan prep
  mutate(stan_group_item = row_number()) %>%
  print()
```

```{r mcmc-params}
# model params
load(here(input_path, "mcmc-params.Rdata"))
ls()
```

```{r}
total_samples <- (n_iterations - n_warmup)*(1 / n_thin)*n_chains
```

I estimate the ideal point model using Stan's "No-U-Turn" sampling algorithm, an adaptive variant of Hamiltonian Monte Carlo. 
I draw posterior samples using `r smart_number(n_chains)` Markov chains that were each run for `r comma(n_iterations)` iterations, discarding the first `r comma(n_warmup)` iterations that are used for an adaptive warmup period.^[
  Given the size of the data, a chain of this length can run on a 2014 Macbook Pro in approximately three hours.
  To generate more chains, however, I use an external computing cluster affiliated with the Social Science Computing Cooperative at the University of Wisconsin–Madison. 
]
I initialize the algorithm with an `adapt_delta` parameter of `r adapt_delta` and a maximum proposal tree-depth of `r max_treedepth`.
Following the advice of @link-eaton:2011:thinning-mcmc, I store every post-warmup sample with no thinning of chains, resulting in a total of `r comma(total_samples)` samples per parameter across all chains.^[
  The chains mix well and exhibit little autocorrelation, which is a credit to the model parameterization and the fact that the No-U-Turn algorithm is efficient at proposing transitions that explore the parameter space.
]
Just `r sum(rstan::get_divergent_iterations(mcmc))` out of `r comma(total_samples)` iterations (`r mean(rstan::get_divergent_iterations(mcmc)) %>% percent(accuracy = .01)`) encountered a divergent transition, which indicates no systematic issues with model parameterization,
and `r rstan::get_num_max_treedepth(mcmc)` iterations exceeded the maximum proposal tree-depth. 
The energy metrics that monitor the model's Hamiltonian mechanics also detect no problematic model behavior.

```{r}
stopifnot(rstan::get_low_bfmi_chains(mcmc) %>% length() == 0)
``` 


```{r neat-thetas}
theta_tidy <- readRDS(here(clean_path, "theta-tidy.RDS")) %>% ungroup()
theta_draws <- readRDS(here(clean_path, "theta-draws.RDS")) %>% ungroup()
```

```{r tidy-all}
tidymc <- tidy(mcmc, conf.int = TRUE)
```

```{r plot-thetas}
plot_theta_rank <- 
  ggplot(theta_tidy) +
  aes(x = rank(estimate),  y = estimate, color = as.factor(party)) +
  geom_hline(yintercept = 0, color = "gray") +
  geom_linerange(
    aes(ymin = conf.low_0.9, ymax = conf.high_0.9),
    alpha = 0.1
  ) +
  geom_linerange(
    aes(ymin = conf.low_0.5, ymax = conf.high_0.5),
    alpha = 0.7
  ) +
  geom_point(size = 0.25, color = "black") +
  scale_color_manual(values = party_factor_colors) +
  scale_fill_manual(values = party_factor_colors) +
  labs(
    x = "Rank Order", 
    y = "Policy Conservatism",
    title = "District-Party Public Ideal Points",
    subtitle = "MCMC means, 50% and 90% intervals"
  ) + 
  coord_flip(ylim = c(-2, 1)) +
  theme(legend.position = "none") +
  # annotate(
  #   geom = "text", 
  #   y = c(-1.5, 0.5), x = c(200, 500), 
  #   label = c("Democrats", "Republicans"),
  #   vjust = c(-2, 3)
  # ) +
  geom_mark_ellipse(
    aes(label = names(party_colors)[party], fill = as.factor(party)), 
    color = NA, 
    alpha = 0,
    label.family = font_fam,
    label.fill = "gray95", 
    label.fontface = "plain",
    con.type = "straight", 
    con.cap = 0
  ) + 
  NULL
```


```{r plot-theta-rank}
plot_theta_rank
```

I present posterior summaries of the ideal point estimates in Figure \@ref(fig:plot-theta-rank).
The figure features data from `r 435 * 2` district-party groups, including a posterior mean (black dot), a 50% compatibility interval (dark band) and a 90% compatibility interval (light band).
The horizontal axis shows these estimates along the restricted ideological spectrum, which is oriented so that larger values indicate greater ideological conservatism.
The vertical axis ranks the ideal points in ascending order, so the lowest value (the most progressive district-party group) gets the rank of $1$.


```{r plot-theta-rank, include = TRUE, fig.scap = "District-party ideal points and rank-ordering.", fig.cap = "District-party ideal points and rank-ordering. Points are posterior means. Error bars are 50 and 90 percent compatibility intervals.", out.width = "80%", fig.height = 5, fig.width = 7}
```

Unlike ideal point estimates for individual citizens, which may feature a great deal of overlap between the distribution of Republican and Democratic partisans, district-party ideal point estimates have no overlap whatsoever.
This is because district-party ideal points are estimates of the _mean_ ideology for Republican and Democratic groups, so they average the ideological heterogeneity among individuals in each party.

It is also important to note that the ideological space appears "asymmetrical"—the clusters of Republican and Democratic ideal points are not equidistant from zero.
Instead, the Republican cluster is located much closer to an ideal point value of zero, with some Republican groups estimates having negative ideal points.
The appearance of asymmetry results from the way I restrict the item parameters to identify the latent ideal point space.
The item midpoint parameters are restricted to sum to be mean zero, which means that the average item presents policy alternatives that are equidistant from zero.
The fact that Republican ideal points are clustered closer to zero suggests that Republicans are more likely to offer progressive item responses than Democrats are to offer conservative item responses, conditional on the items included for scaling.
Supposing that these items are representative of policy conflict in the 2010s, this finding is consistent with earlier research showing that US citizens tend to hold progressive views on policy even if their symbolic worldview is more conservative [@ellis-stimson:2012:symbolic-ideology].


```{r cces-data}
cces_raw <- read_rds(here("data", "cces", "cumulative_2006_2018.Rds"))
```

```{r cces-estimates}
cces_raw %>% count(year)

cces_ideo <- cces_raw %>%
  mutate_all(labelled::remove_labels) %>%
  mutate(
    ideo5_ch = as.character(ideo5),
    ideo5 = as.numeric(ideo5),
    party_name = case_when(
      pid3 == 1 ~ "Democrats",
      pid3 == 2 ~ "Republicans",
      pid3 == 3 ~ "Independents",
      TRUE ~ "Other/DK/NA"
    ),
    party = case_when(
      party_name == "Democrats" ~ 1,
      party_name == "Republicans" ~ 2
    ),
    district = abs(parse_number(cd))
  ) %>%
  select(
    cycle = year, state_abb = st, st_cd = cd, district_num = dist, 
    weight, party, party_name, ideo5, ideo5_ch
  ) %>%
  filter(ideo5 %in% 1:5) %>%
  filter(cycle >= 2012) %>%
  filter(party %in% c(1, 2)) %>%
  group_by(state_abb, st_cd, district_num, party, party_name) %>%
  summarize(
    ideo_mean = weighted.mean(x = ideo5, w = weight)
  ) %>%
  ungroup() %>%
  print()

count(cces_ideo, district_num)
count(theta_tidy, district_num)
```

```{r cces-v-draws}

joined_draws <- 
  master_data %>%
  transmute(state_abb, district_num, group = as.numeric(group), party = as.numeric(party)) %>%
  distinct() %>%
  right_join(theta_draws) %>%
  inner_join(
    x = ., y = cces_ideo, 
    by = c("state_abb" = "state_abb", "party", "district_num")
  ) %>%
  group_by(party_name, .draw) %>%
  mutate(
    r = cor(ideo_mean, theta)
  ) %>%
  ungroup() %>% 
  print()

ggplot(joined_draws) +
  aes(x = r) +
  facet_wrap(~ party_name) +
  geom_histogram()
```

```{r cces-v-means}
joined_ideal <- 
  inner_join(
    x = theta_tidy, y = cces_ideo, 
    by = c("state" = "state_abb", "party", "district_num")
  ) %>%
  group_by(party_name) %>%
  mutate(
    r = cor(ideo_mean, estimate)
  ) %>%
  ungroup() %>% 
  print()
```

```{r plot-vs-cces}
ggplot(joined_ideal) +
  aes(x = ideo_mean, y = estimate, color = as.factor(party)) +
  geom_point() +
  geom_mark_ellipse(
    aes(label = str_glue("{party_name} (r = {round(r, 2)})")),
    label.family = font_fam,
    label.fill = "gray95", 
    label.fontface = "plain",
    con.type = "straight", 
    con.cap = 0
  ) +
  scale_color_manual(values = party_factor_colors) +
  theme(legend.position = "nonw") +
  coord_cartesian(ylim = c(-2, 1)) +
  labs(
    title = "Ideal Points vs. Ideological Self-Placement",
    subtitle = "Self-placement from CCES 5-category item",
    x = "CCES self-placement, district-party average (2012–2018)",
    y = "District-party ideology from IRT model\n(posterior mean)"
  )
```

We can feel confident that these ideal point estimates capture real ideological variation by comparing them to other survey-based measures of ideology.
Figure \@ref(fig:plot-vs-cces) compares the IRT model's district-party ideology estimates (posterior means) to a survey-based measure of ideological self-placement.
Self-placement data are drawn from the Cooperative Congressional Election Study (CCES) for years 2012 through 2018, coding the 5-category item numerically and averaging partisan responses within each district-party. 
The figure shows a high degree of correlation between the IRT scores and the self-placement scores. 
The overall correlation between the ideal points and self-placement scores is `r joined_ideal %$% cor(estimate, ideo_mean) %>% round(2)`, and the within-party correlations are `r joined_ideal %>% filter(party == 1) %>%  sample_n(1) %>% pull(r) %>% round(2)` among Democrats and `r joined_ideal %>% filter(party == 2) %>%  sample_n(1) %>% pull(r) %>% round(2)` among Republicans.
These within-party correlations are as strong as the within-party correlations between CF scores and DW-NOMINATE scores for incumbent House members [@bonica:2013:ideology-interests;@bonica:2014:mapping-ideology].


```{r plot-vs-cces, include = TRUE, fig.scap = "Comparison of district-party ideology to ideological self-placement.", fig.cap = "Comparison of district-party ideology and ideological self-placement. Average ideological self-placement in each district-party is measured using the CCES 5-category item, combining survey waves 2012 through 2018.", fig.height = 6, fig.width = 7}
```


<!-- NOT EVALUATED -->
```{r, eval = FALSE}
set.seed(5435)
master_data %>%
  transmute(group = as.numeric(group), party) %>%
  distinct() %>%
  group_by(party) %>%
  sample_n(5) %>%
  left_join(theta_draws) %>%
  ggplot() +
  aes(x = theta, fill = party) +
  geom_density(
    aes(group = group),
    position = "identity",
    alpha = 0.5
  ) +
  scale_fill_manual(values = party_factor_colors)
```


### Ideological variation across districts


```{r}
plot_hist_means <- ggplot(theta_tidy) +
  aes(fill = as.factor(party), x = estimate) +
  geom_histogram(
    position = "identity",
    color = NA,
    binwidth = .025,
    alpha = 0.8
  ) +
  scale_fill_manual(values = party_factor_colors) +
  theme(
    legend.position = "none",
    plot.title.position = "plot"
  ) +
  labs(
    title = "Histogram of Ideal Points",
    subtitle = "MCMC Posterior Means",
    x = "Policy Conservatism",
    y = NULL
  ) +
  coord_cartesian(xlim = c(-2, 1))

plot_hist_means
```

```{r plot-hist-means}
plot_hist_means
```

Recent scholarship on party coalitions has highlighted the ideological cohesion of the Republican Party compared to the Democratic Party, owing to the fact that the Democratic Party is a big-tent assemblage of social groups with policy priorities that may conflict [@grossman-hopkins:2016:asymmetric-politics; @feldman-zaller:1992:political-ambivalence; @lelkes-sniderman:2016:ideological-asymmetry]. 
Party differences in ideological cohesion may appear across districts as well, especially if the demographic composition of Democratic constituents is more heterogeneous from one district to the next than the demographic composition of Republican constituents.
Figure \@ref(fig:plot-hist-means) provides some aggregate evidence that Republican constituencies are more ideologically similar to one another than Democratic constituencies are.
The histograms in the figure show that the distribution of Democratic district-party ideal points is approximately symmetric around their mean, while Republican district-party groups are more tightly distributed around a modal ideology that is on the conservative edge of the ideal point scores.


```{r plot-hist-means, include = TRUE, fig.cap = "Histogram of ideal point means in both parties.", fig.scap = "Histogram of ideal point means in both parties.", fig.width = 5, out.width = "60%"}
```

```{r SDs}
sd_draws <- master_data %>%
  transmute(group = as.numeric(group), party) %>%
  distinct() %>%
  right_join(theta_draws) %>%
  group_by(.draw, party) %>%
  summarize(
    sd = sd(theta)
  ) %>%
  ungroup() %>%
  pivot_wider(
    names_from = "party",
    values_from = "sd",
    names_prefix = "sd_"
  ) %>%
  mutate(
    sd_diff = sd_1 - sd_2
  ) %>%
  pivot_longer(
    cols = starts_with("sd_"), 
    names_to = "sd_group",
    values_to = "sd_value"
  ) %>%
  mutate(
    party = suppressWarnings(parse_number(sd_group)) %>%
      ifelse(is.na(.), "diff", .)
  ) %>%
  print()
```


```{r}
sd_labels <- tribble(
  ~ x, ~ y, ~ party, ~ label,
  .15, 200, "2", "Republicans",
  .20, 200, "1", "Democrats"
)

plot_sds <- 
  ggplot(sd_draws) +
  aes(x = sd_value, fill = party) +
  geom_histogram(
    position = "identity",
    alpha = 0.7,
    binwidth = .0025
  ) +
  facet_wrap(
    ~ party == "diff",
    scales = "free",
    labeller = c(
      "FALSE" = "Ideal point standard deviation",
      "TRUE" = "Difference in std devs:\nDemocrats minus Republicans"
    ) %>%
      as_labeller(),
    strip.position = "bottom"
  ) +
  geom_vline(
    data = sd_draws %>%
      filter(party == "diff") %>%
      sample_n(1),
    aes(xintercept = 0)
  ) +
  scale_fill_manual(
    values = c("1" = dblue, "2" = rred, "diff" = secondary)
  ) +
  geom_text(
    data = sd_labels,
    aes(x = x, y = y, label = label),
    hjust = c(1.2, -0.2),
    vjust = -0.2
  ) +
  geom_text(
    data = sd_draws %>%
      filter(sd_group == "sd_diff") %>%
      group_by(sd_group, party) %>% 
      summarize(
        p = mean(sd_value > 0) %>% round(2),
        lab = "More variation among\nDemocrats"
      ),
    aes(
      x = 0.005, y = 400,
      label = str_glue("{lab} (pr = {p})")
    ),
    # fill = "white",
    # alpha = 0.7,
    size = 3,
    hjust = 0,
    vjust = -0.1
  ) +
  coord_cartesian(ylim = c(0, 475)) +
  labs(
    x = NULL,
    y = NULL,
    title = "Geographic Variation in Ideology",
    subtitle = "Histograms of MCMC samples"
  ) +
  theme(
    legend.position = "none",
    strip.placement = "outside"
  )
```

```{r plot-sds}
plot_sds
```

Figure \@ref(fig:plot-sds) verifies this intuition by comparing the standard deviation of district-party ideal points in each party.
The first panel shows the standard deviation of ideal points Democratic and Republican groups.
The histogram represents the distribution of estimates across all MCMC sample iterations.
Democratic groups tend to be more variable across districts, with a distribution of standard deviation estimates centered on a mean of `r sd_draws %>% filter(party ==1) %$% mean(sd_value) %>% round(2)`.
Republican groups are less variable across districts, with a distribution of standard deviations centered on a mean of `r sd_draws %>% filter(party ==2) %$% mean(sd_value) %>% round(2)`.
The second panel of Figure \@ref(fig:plot-sds) plots the difference between the Democratic and Republican standard deviation estimates, again with a distribution representing all MCMC sample iterations.
The histogram shows that almost all MCMC iterations (`r sd_draws %>% filter(sd_group == "sd_diff") %$% mean(sd_value > 0) %>% percent(acc = 1)`) contain ideal point estimates that are higher variance for Democrats than for Republicans.

```{r plot-sds, include = TRUE, fig.scap = "Geographic heterogeneity in ideal points.", fig.cap = "Geographic heterogeneity in ideal points. Left panel: standard deviation of Republican and Democratic district-party ideologies. Right panel: difference in standard deviations (Democrats minus Republicans). Distributions reflect MCMC draws.", fig.width = 8, fig.height = 4, out.width = "100%"}
```


```{r correlations}
cor_draws <- master_data %>%
  transmute(district, group = as.numeric(group), party = as.numeric(party)) %>%
  distinct() %>%
  right_join(theta_draws) %>%
  select(-group, -sigma_in_g) %>%
  pivot_wider(
    names_from = "party",
    values_from = "theta",
    names_prefix = "theta_"
  ) %>%
  group_by(.draw) %>%
  summarize(r = cor(theta_1, theta_2, method = "pearson")) %>%
  print()

plot_cor_draws <- ggplot(cor_draws) +
  aes(x = r) +
  geom_histogram(
    boundary = 0,
    fill = secondary, 
    alpha = 0.7
  ) +
  geom_label(
    data = cor_draws %>%
      summarize(
        pos = mean(r > 0) %>% round(2)
      ),
    aes(x = 0, y = 700, 
        label = str_glue("pr(positive correlation) = {pos}")),
    hjust = -0.1,
    vjust = 1
  ) +
  geom_vline(xintercept = 0) +
  labs(
    title = "Correlation Between Republicans and Democrats",
    subtitle = "Histogram of correlation draws",
    x = "Pearson's r",
    y = NULL
  ) +
  ggeasy::easy_remove_y_axis()

plot_cor_draws
```

```{r ideal-scatter}
theta_match_district <- master_data %>%
  transmute(group = as.numeric(group), party = as.numeric(party)) %>%
  distinct() %>%
  right_join(theta_tidy) %>%
  select(district, party, estimate, ends_with("_0.9")) %>%
  pivot_wider(
    names_from = "party",
    values_from = c("estimate", "conf.low_0.9", "conf.high_0.9")
  ) %>%
  print()

plot_match_scatter <- ggplot(theta_match_district) +
  aes(x = estimate_1, y = estimate_2) +
  geom_point(color = "black", size = 3, alpha = 0.8, shape = 16) +
  labs(
    x = "Democratic Ideal Point", 
    y = "Republican Ideal Point",
    title = "Republican and Democratic Ideal Points",
    subtitle = "Within-district comparison"
  )

plot_match_scatter
```

```{r plot-cors}
plot_match_scatter + plot_cor_draws
```

Given the geographic variation in ideology, it is natural to wonder if Republican and Democratic ideal points are related.
Do the districts containing more conservative Republicans also contain more conservative Democrats? 
Perhaps the pattern is reversed, where local conditions that reinforce the conservatism of Republicans actually reinforce progressivism among Democrats?
Figure \@ref(fig:plot-cors) explores this possibility.
The left panel plots the Republican group ideal point in a district (vertical axis) against the Democratic group ideal point in the same district, with points representing posterior means. 
The posterior means do not exhibit much correlation to one another.
The right panel plots a histogram of correlation estimates (Pearson's $r$) from all MCMC draws. 
The distribution of correlations suggests that although there is a slight correlation between Republican and Democratic ideal points ($r > 0$ in `r cor_draws %$% mean(r > 0) %>% percent(acc = 1)` of MCMC draws), the correlation is quite small, with a posterior mean of `r cor_draws %$% mean(r) %>% round(2)` and a standard deviation of `r cor_draws %$% sd(r) %>% round(2)`.
The predominant takeaway is that most of the variation in Republican ideal points is unrelated to Democrat ideal points.
Although this project does not explore the correlates or possible causes of local ideological convergence or divergence with much detail, this project enables this research agenda by measuring local partisan ideology.

```{r plot-cors, include = TRUE, fig.height = 6, fig.width = 11, out.width = '100%', fig.scap = "Correlation between Republican and Democratic ideal points in the same district.", fig.cap = "Correlation between Republican and Democratic ideal points in the same district. Left: scatterplot of Republican versus  Democratic ideal points (posterior means). Left: posterior distribution of the correlation (Pearson's $r$) between Democratic and Republican ideal points."}
```


### Aggregate correlates of district-party ideology {#sec:ideal-covariates}


```{r}
grp_feature_names <- c("Pct. White", "Pct. College", "Med. Income", "Med. Age", "Gini", "Pct. Foreign Born", "Pct. Unemployed")

state_feature_names <- c("Pct. Evangelical", "Pct. Nonwhite", "Income per cap")

tidy_coefs <- tidymc %>%
  filter(
    str_detect(term, "coef_grp_mean") |
    str_detect(term, "coef_st_mean")
  ) %>%
  mutate(
    party = 
      str_split(term, pattern = ",", simplify = TRUE)[,2] %>%
      parse_number(),
    party_name = names(party_colors)[party],
    predictor = str_split(term, pattern = ",", simplify = TRUE)[,1] %>%
      parse_number(),
    level = str_split(term, pattern = "\\[",simplify = TRUE)[,1],
    coef_lab = case_when(
      level == "coef_grp_mean" ~ grp_feature_names[predictor],
      level == "coef_st_mean" ~ state_feature_names[predictor]
    )
  ) %>%
  print()

plot_coef_pts <- ggplot(tidy_coefs) +
  aes(x = coef_lab, y = estimate, 
      color = party_name,
      shape = party_name
  ) +
  geom_hline(yintercept = 0) +
  facet_wrap(
    ~ level, 
    ncol = 1, scales = "free_y",
    labeller = c("coef_grp_mean" = "District-Level Terms", 
                 "coef_st_mean" = "State-Level Terms") %>% 
               as_labeller()
  ) +
  geom_pointrange(
    aes(ymin = conf.low, ymax = conf.high),
    position = position_dodge(width = -0.5),
    fill = "white"
  ) +
  scale_color_manual(values = party_colors) +
  scale_shape_manual(values = c("Democrats" = 16, "Republicans" = 15)) +
  labs(
    y = "Coefficient on Policy Conservatism",
    x = NULL,
    color = NULL, 
    shape = NULL,
    title = "Hierarchical Coefficients",
    subtitle = "Separate parameters by party"
  ) +
  theme(
    plot.title.position = "plot",
    legend.position = "none"
  ) +
  coord_flip()
```

```{r eval = FALSE}

coef_samples <- mcmc %>%
  gather_draws(coef_grp_mean[j, party], coef_st_mean[j, party]) %>%
  ungroup() %>%
  pivot_wider(
    names_from = "party",
    values_from = ".value",
    names_prefix = "coef_"
  ) %>%
  mutate(
    coef_diff = coef_1 - coef_2,
    coef_lab = case_when(
      .variable == "coef_grp_mean" ~ grp_feature_names[j],
      .variable == "coef_st_mean" ~ state_feature_names[j]
    )
  ) %>%
  print()


ggplot(coef_samples) +
  aes(coef_1, coef_2) +
  facet_wrap(~ .variable + j) +
  geom_hline(yintercept = 0, color = "gray") +
  geom_vline(xintercept = 0, color = "gray") +
  geom_point(color = secondary, alpha = 0.2, shape = 16) +
  geom_abline() 

plot_coef_hist <- ggplot(coef_samples) +
  aes(x = coef_diff, y = coef_lab) +
  ggridges::geom_ridgeline(
    stat = "binline", 
    aes(height = ..density..),
    fill = secondary,
    draw_baseline = FALSE,
    scale = 0.1,
    binwidth = .005, 
    boundary = 0,
    position = "identity", 
    alpha = 0.5
  ) +
  geom_vline(xintercept = 0) +
  facet_wrap(
    ~ .variable,
    scales = "free_y",
    ncol = 1,
    labeller = c("coef_grp_mean" = "District-Level Terms", 
                 "coef_st_mean" = "State-Level Terms") %>% 
               as_labeller()
  ) +
  theme(plot.title.position = "plot") +
  labs(
    title = "Difference in Coefficients",
    subtitle = "Democrats minus Republicans",
    x = "Difference in Coefficients",
    y = NULL
  )
```

```{r}
plot_marginal_scatter <- theta_tidy %>%
  pivot_longer(
    cols = c(prcntWhite, prcntUnemp, prcntForeignBorn, prcntBA, medianIncome, gini), 
    names_to = "predictor",
    values_to = "value"
  ) %>%
  mutate(
    party_lab = names(party_colors)[party],
    coef_lab = case_when(
      predictor == "prcntWhite" ~ "Pct. White",
      predictor == "prcntUnemp" ~ "Pct.\nUnemployed",
      predictor == "prcntForeignBorn" ~ "Pct. Foreign\nBorn",
      predictor == "prcntBA" ~ "Pct. College",
      predictor == "medianIncome" ~ "Med. Income\n(Log)",
      predictor == "gini" ~ "Gini Coef."
    ),
    value = case_when(
      coef_lab == "Med. Income\n(Log)" ~ log(value),
      TRUE ~ value
    )
  ) %>%
  ggplot() +
  aes(x = value, y = estimate, 
      color = party_lab, shape = party_lab) +
  facet_wrap(~ coef_lab, scales = "free_x") +
  # geom_pointrange(
  #   aes(ymin = conf.low_0.9, ymax = conf.high_0.9),
  #   position = position_dodge(width = -0.25)
  # ) +
  geom_point(fill = "white", alpha = 0.5) +
  theme(legend.position = "bottom") +
  scale_shape_manual(values = c("Democrats" = 16, "Republicans" = 15)) +
  scale_color_manual(values = party_colors) +
  guides(color = guide_legend(override.aes = list(alpha = 1))) +
  labs(
    x = NULL,
    y = "District-Party Ideal Point",
    color = NULL,
    shape = NULL,
    title = "Ideal Points and Covariates",
    subtitle = "Bivariate (unadjusted) relationships"
  )
```

```{r plot-party-covs}
# coef_layout <- "
# ##AAA##
# #BBBBB#
# "

coef_layout <- "
AABBBBB
"

wrap_plots(
  A = plot_coef_pts, B = plot_marginal_scatter,
  design = coef_layout
) 

# plot_coef_pts + plot_marginal_scatter

# ggsave("~/desktop/test.png", height = 7, width = 12)
```

If Republican and Democratic ideal points are largely unrelated, it may be because district and state characteristics are related to partisan ideology in diverging ways. 
One final piece of descriptive analysis is to explore the relationship between district-party ideal points and the hierarchical covariates used to smooth the ideal point estimates in the model.
Figure \@ref(fig:plot-party-covs) visualizes the relationship between ideal points and aggregate covariates in two ways.
The left side of the figure displays coefficient estimates from the IRT model for district- and state-level covariates included in the hierarchical regression.
These coefficients capture the linear relationship between the covariates and ideal points, holding other covariates fixed.
The right side of the figure plots the _bivariate_ relationship between ideal points (posterior means) and a selection of district-level covariates, with no additional statistical adjustments.
The bivariate relationships convey information about the "types" of districts that contain more conservative or progressive partisans, even if these relationships may be statistical artifacts of other confounding relationships.

```{r plot-party-covs, include = TRUE, out.width = "100%", fig.height = 7, fig.width = 12, fig.scap = "How ideal points relate to hierarchical covariates.", fig.cap = "How ideal points relate to hierarchical covariates. Left: coefficients from hierarchical regression on district covariates (top left) and state covariates (bottom left), with 90 percent compatibility intervals. Right: Bivariate relationships between ideal points and a selection of district-level covariates with no statistical controls."}
```

Larger values of the ideal point space are associated with greater conservatism, so increasing covariate values are related to increasing conversatism if the coefficient is positive and progressivism if the coefficient is negative.
Some covariates have similar "effects" in both parties: districts with higher median income levels and median ages are estimated to be more conservative on average, and districts with greater college attainment are estimated to be more progressive.
Other covariates have diverging effects for each party.
Whiter districts are associated with greater conservatism among Republicans and (with less statistical confidence) greater progressivism among Democrats.
Higher unemployment rates are related to decreased conservatism among Republicans and increased conservatism among Democrats.
Districts with greater numbers of foreign born residents contain more progressive Republicans, which could be related to the cosmopolitanism of the district culture, and districts with greater income inequality (measured by the Gini coefficient) contain more conservative Republicans. 
The uncertainty of the coefficients at the state level indicate weaker relationships to state-level covariates, but it is worth noting that Republicans and Democrats in wealthier _states_ are more progressive than in less wealthy states, which is a pattern that differs from the district-level relationship of increasing conservatism greater wealth and is similar to the findings of @gelman-et-al:2007:red-state-qjps regarding state-level wealth.
Counter-intuitively, I also find that larger evangelical populations appear to be related to greater progressivism among both Republicans and Democrats, and larger White populations are weakly related to greater progressivism among Democrats.
Because these aggregate demographic features are likely to be confounded, it is important not to give these coefficients a causal interpretation.
Furthermore, the correlations themselves may not be most straightforward if the regression specification introduces collider bias among the aggregate predictors.

The bivariate relationships between ideal points and district covariates convey what types of districts are more progressive or conservative, setting aside statistical adjustments.
These relationships sometimes contrast with the model coefficients in interesting ways.
For instance, although the coefficient for median income is positive in both parties, the scatterplot shows that higher-income districts contain more progressive Democrats.
This suggests that other factors dominate the effect of income when determining the progressivism of Democratic groups in wealthy districts.
This factor could be college education, which is strongly related to progressivism among Democrats.
The scatterplot also captures relationships that appear in the coefficients: the strongest relationships to Republican ideology are the White share of the population (positively related to conservatism) and the foreign-born share of the population (negatively related to conservatism).
Among Democrats, greater unemployment is related to conservatism, and ideology is weakly related to racial composition, income inequality, and the foreign-born share of the population.


## Improving the ideal point model

The results from this ideal point model are promising and informative, but there are some modeling approaches that could improve this model and others like it.

The model's predictive ability could be improved by creating more flexible hierarchical regressions.
Because the hierarchical regression is included to smooth the ideal point estimates and not for causal interpretation, the linear specifications have no particular benefit over more flexible modeling approaches.
The use of nonlinear models and "machine learning" approaches is new to hierarchical measurement modeling in political science.
Examples include the use of regression trees multilevel regression and poststratification [@bisbee:2019:barp] and Gaussian process priors for IRT models [@duckmayr-et-al:2020:gpirt].
Work outside of political science has also explored the use of spline regression for IRT models [@woods-thissen:2006:spline-irt], which are not as flexible as Gaussian processes but are computationally less intensive.

The substance of the model could also be extended in several ways aside from its predictive capacity.
As mentioned above, @caughey-warshaw:2015:DGIRT lay out a dynamic linear model approach to bridge the ideal point space across multiple time periods.
The parameterization of the model also exposes a parameter for the variance of individual ideal points within a district, which could itself be modeled as a function of covariates [@lauderdale:2010:heteroskedastic-irt].
Lastly, because survey items vary in their response design, researchers have explored the use of ordinal response models to scale survey items that ask respondents to choose from more than two policy alternatives [e.g. @hill-tasanovitch:2015:polarization-disconnect].