4 Sample Descriptive Statistics
Set up the data and environment for analysis.[2]
# rm(list=ls())
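## Packages used in this chapter (assumed not already attached elsewhere in the book)
library(survey)    # subset() and update() on survey design objects
library(dplyr)     # %>%, group_by(), count(), rename(), full_join()
library(tibble)    # tibble()
library(huxtable)  # as_huxtable(), set_caption(), set_label()
load('../data/dcassvy.Rdata')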
texcmds[['Rversion']] <- paste(R.version$major, R.version$minor, sep='.')
## Helper functions for analyzing and reporting results
source('R/broom_mi.R')
source('R/report_descriptives.R')
source('R/report_models.R')
source('R/make_predictions.R')
source('R/marginal_effects.R')

Include only rows in which the respondent belongs to one of four mutually exclusive racial groups.
fourraces <- c('white', 'asian', 'black', 'latino')
dcas16svy <- subset(dcas16svy, anarace==TRUE) %>%
  update(raceeth = factor(raceeth, levels=fourraces))
dcas18svy <- subset(dcas18svy, anarace==TRUE) %>%
  update(raceeth = factor(raceeth, levels=fourraces))

Create tables with descriptive results of both DCAS samples, overall and by race. The tables are constructed using two helper functions saved in report_descriptives.R:
desc-
Returns the summary of the multiply-imputed survey mean
make.desc-
Returns the table of descriptive statistics
Results are saved in tables/descriptives_dcas16.tex and tables/descriptives_dcas18.tex.
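The helper functions themselves are not shown in this section. As a rough illustration of what summarizing a multiply-imputed mean involves, the sketch below pools per-imputation estimates with Rubin's rules in base R. The function name `pool_rubin` and the input numbers are hypothetical; this is not the code in report_descriptives.R.

```r
# Hypothetical helper: pool one estimate across m imputed data sets
# using Rubin's rules. Not the implementation in report_descriptives.R.
pool_rubin <- function(estimates, variances) {
  m     <- length(estimates)
  qbar  <- mean(estimates)          # pooled point estimate
  ubar  <- mean(variances)          # average within-imputation variance
  b     <- var(estimates)           # between-imputation variance
  total <- ubar + (1 + 1 / m) * b   # total variance of the pooled estimate
  c(estimate = qbar, se = sqrt(total))
}

# Made-up per-imputation means and their sampling variances
est <- c(0.52, 0.50, 0.54)
v   <- c(0.0004, 0.0004, 0.0004)
pooled <- pool_rubin(est, v)
print(pooled)
```

The pooled standard error is larger than any single-imputation standard error because it also carries the variation between imputations.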
Make tables of descriptive statistics, overall and by race, for DCAS 2016 and DCAS 2018.
- Missingness of education and income in 2016 and 2018.
mi16 <- apply(sapply(dcas16[, c("educ", "dem.income.cat4")], is.na), 2, sum)
mi18 <- apply(sapply(dcas18[, c("educ", "income")], is.na), 2, sum)
tibble(
  year = c("2016", "2018"),
  educ_mi = c(mi16[1], mi18[1]),
  educ_pct = round(c(mi16[1]/nrow(dcas16), mi18[1]/nrow(dcas18))*100, 1),
  inc_mi = c(mi16[2], mi18[2]),
  inc_pct = round(c(mi16[2]/nrow(dcas16), mi18[2]/nrow(dcas18))*100, 1)
)
## Number of respondents by race

Table 4.1:
| year | educ_mi | educ_pct | inc_mi | inc_pct |
|---|---|---|---|---|
| 2016 | 25 | 2 | 109 | 8.9 |
| 2018 | 25 | 2.4 | 79 | 7.4 |
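The missingness counts can also be computed with `colSums(is.na(...))`, which is a common base R idiom for the same calculation. The sketch below uses a small made-up data frame (`toy` and its columns are hypothetical) rather than the DCAS data.

```r
# Made-up data standing in for the education and income columns
toy <- data.frame(
  educ   = c("hs", NA, "ba", "ba", NA),
  income = c(1, 2, NA, 4, 5)
)

# Count missing values per column, then express them as a percentage of rows
n_missing   <- colSums(is.na(toy))
pct_missing <- round(100 * n_missing / nrow(toy), 1)

print(data.frame(n_missing, pct_missing))
```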
Table 4.2 shows the number of respondents by race in 2016 and 2018.
racecnt18 <- dcas18svy$designs[[1]]$variables %>%
  group_by(raceeth) %>%
  count() %>%
  rename(`2018` = n)
dcas16svy$designs[[1]]$variables %>%
  group_by(raceeth) %>%
  count() %>%
  rename(`2016` = n) %>%
  full_join(racecnt18, by = "raceeth") %>%
  as_huxtable() %>%
  set_caption(
    "Number of respondents by race in DCAS 2016 and DCAS 2018 surveys"
  ) %>%
  set_label("tab:racecount")

| raceeth | 2016 | 2018 |
|---|---|---|
| white | 266 | 513 |
| asian | 186 | 93 |
| black | 105 | 308 |
| latino | 84 | 75 |
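The count table is built by counting respondents per group in each survey and joining the two sets of counts on `raceeth`. A full join keeps a group even if it appears in only one year. The base R sketch below shows the same pattern with `table()` and `merge(..., all = TRUE)` on made-up data; `count_by_race`, `svy16`, and `svy18` are hypothetical names, not objects from the analysis.

```r
# Made-up respondent-level data for two survey years
svy16 <- data.frame(raceeth = c("white", "white", "asian", "black"),
                    stringsAsFactors = FALSE)
svy18 <- data.frame(raceeth = c("white", "latino", "latino"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: one row per group with its count in a named column
count_by_race <- function(df, colname) {
  tab <- table(df$raceeth)
  out <- data.frame(raceeth = names(tab), n = as.integer(tab),
                    stringsAsFactors = FALSE)
  names(out)[2] <- colname
  out
}

# all = TRUE makes merge() behave like a full join:
# groups missing from one year get NA rather than being dropped
both <- merge(count_by_race(svy16, "n_2016"),
              count_by_race(svy18, "n_2018"),
              by = "raceeth", all = TRUE)
print(both)
```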
4.1 Assign labels to be used throughout analyses
## Race labels and variable levels
racelabs <- c("Asian", "Black", "Latino", "White")
racelevs <- c("all", "asian", "black", "latino", "white")
## Labels for regression tables
regression_labels <- c('Asian','Black','Latinx',
                       'Age','Foreign Born','Male',
                       'Children Present', 'Married',
                       levels(dcas16$dem.educ.attain)[c(1,3:5)],
                       # levels(dcas$dem.income.cat4)[c(1,3:4)],
                       'Home owner',
                       'Years in neighborhood',
                       '10-50 blocks', '>50 blocks')

[2] The data loaded here were created in the file analysis/01-variable-data-construction.Rmd. Because multiple imputation is stochastic, regenerating the data by re-running that file will produce minor differences in the final output.