4 Sample Descriptive Statistics

Set up the data and environment for analysis (see note 1).

# rm(list=ls())
load('../data/dcassvy.Rdata')
texcmds[['Rversion']] <- paste(R.version$major, R.version$minor, sep='.')

## Helper functions for analyzing and reporting results
source('R/broom_mi.R')
source('R/report_descriptives.R')
source('R/report_models.R')
source('R/make_predictions.R')
source('R/marginal_effects.R')

Include only rows in which the respondent belongs to one of four mutually exclusive racial groups.

fourraces <- c('white', 'asian', 'black', 'latino')
dcas16svy <- subset(dcas16svy, anarace==TRUE) %>%
    update(raceeth = factor(raceeth, levels=fourraces))
dcas18svy <- subset(dcas18svy, anarace==TRUE) %>%
    update(raceeth = factor(raceeth, levels=fourraces))
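
Re-declaring raceeth with levels=fourraces fixes the order of the groups and turns any value outside the list into NA. The base-R sketch below illustrates this on made-up values. The vector toy_race is hypothetical and not part of the DCAS data, and the sketch assumes nothing about how anarace was constructed.

```r
# The four analysis groups, in the order used for the factor levels
fourraces <- c('white', 'asian', 'black', 'latino')

# Hypothetical responses, including one value outside the four groups
toy_race <- c('black', 'white', 'other', 'latino', 'asian', 'white')

# Same call pattern as in the update() step above
toy_fac <- factor(toy_race, levels = fourraces)

levels(toy_fac)       # level order follows fourraces
sum(is.na(toy_fac))   # values not listed in fourraces become NA
table(toy_fac)        # counts for the four retained groups only
```

If the anarace filter already removes every respondent outside the four groups, no NA values should appear after the update step; checking for them is an inexpensive sanity test.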

Create tables with descriptive results of both DCAS samples, overall and by race. The tables are constructed using two helper functions saved in report_descriptives.R:

desc: Returns the summary of the multiply-imputed survey mean.

make.desc: Returns the table of descriptive statistics.

Results are saved in tables/descriptives_dcas16.tex and tables/descriptives_dcas18.tex.
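
The helper functions themselves are not reproduced in this chapter. As a rough illustration of what summarizing a multiply-imputed mean involves, the sketch below pools per-imputation estimates with Rubin's rules in base R. The function pool_rubin and the numbers passed to it are hypothetical; this is not the code in report_descriptives.R, which may work differently.

```r
# Hypothetical helper: pool m per-imputation estimates with Rubin's rules
pool_rubin <- function(est, se) {
  m     <- length(est)
  qbar  <- mean(est)               # pooled point estimate
  ubar  <- mean(se^2)              # average within-imputation variance
  b     <- var(est)                # between-imputation variance
  total <- ubar + (1 + 1 / m) * b  # total variance of the pooled estimate
  c(estimate = qbar, se = sqrt(total))
}

# Made-up estimates and standard errors from five imputed data sets
pooled <- pool_rubin(
  est = c(0.40, 0.42, 0.41, 0.39, 0.43),
  se  = c(0.020, 0.021, 0.020, 0.019, 0.022)
)
pooled
```

The pooled standard error exceeds the average per-imputation standard error because the between-imputation term adds the uncertainty that comes from the imputation itself.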

Make table of overall descriptive statistics and by race for DCAS 2016 and DCAS 2018.

Count the missing values of education and income in the 2016 and 2018 samples.
mi16 <- apply(sapply(dcas16[, c("educ", "dem.income.cat4")], is.na), 2, sum)
mi18 <- apply(sapply(dcas18[, c("educ", "income")], is.na), 2, sum)

tibble(
    year = c("2016", "2018"),
    educ_mi = c(mi16[1], mi18[1]),
    educ_pct = round(c(mi16[1]/nrow(dcas16), mi18[1]/nrow(dcas18))*100, 1),
    inc_mi = c(mi16[2], mi18[2]),
    inc_pct = round(c(mi16[2]/nrow(dcas16), mi18[2]/nrow(dcas18))*100, 1)
)

Table 4.1: Missingness of education and income in 2016 and 2018
year  educ_mi  educ_pct  inc_mi  inc_pct
2016       25         2     109      8.9
2018       25       2.4      79      7.4

## Number of respondents by race

Table 4.2 shows the number of respondents by race in 2016 and 2018.

racecnt18 <- dcas18svy$designs[[1]]$variables %>%
  group_by(raceeth) %>%
  count() %>%
  rename(`2018` = n)

dcas16svy$designs[[1]]$variables %>%
  group_by(raceeth) %>%
  count() %>%
  rename(`2016` = n) %>%
  full_join(racecnt18, by = "raceeth") %>%
  as_huxtable() %>%
  set_caption(
    "Number of respondents by race in DCAS 2016 and DCAS 2018 surveys"
  ) %>%
  set_label("tab:racecount")

Table 4.2: Number of respondents by race in DCAS 2016 and DCAS 2018 surveys
raceeth  2016  2018
white     266   513
asian     186    93
black     105   308
latino     84    75

4.1 Assign labels to be used throughout analyses

## Race labels and variable levels
racelabs <- c("Asian", "Black", "Latino", "White")
racelevs <- c("all", "asian", "black", "latino", "white")

## Labels for regression tables
regression_labels <- c('Asian','Black','Latinx',
                       'Age','Foreign Born','Male',
                       'Children Present', 'Married',
                       levels(dcas16$dem.educ.attain)[c(1,3:5)],
                       # levels(dcas$dem.income.cat4)[c(1,3:4)],
                       'Home owner',
                       'Years in neighborhood',
                       '10-50 blocks', '>50 blocks')

  1. The data loaded here were created in the file analysis/01-variable-data-construction.Rmd. Because multiple imputation is stochastic, regenerating the data by sourcing that file will produce minor differences in the final output.