2 Description of DC-Area & Multiracial Neighborhood Populations
The following provides information about the DC Area cited in the manuscript. It includes code that creates Table 1 that compares demographic characteristics of multracial neighborhoods in the DC Area to all neighborhoods in the DC Area. Data come from the IPUMS National Historical Geographic Information System (NHGIS).1
Set up environment for analyzing data from the Census. Note the difference between DATADIR created here and dataDIR created in Data Construction. The data files are separated by category and are stored in separate directories, whose names are stored in vartypes.
DATADIR <- '../data/dcarea/tracts/'
vartypes <- c(
'children-present',
'educ-attainment',
'foreign-born',
'marital-status',
'median-age',
'race-ethnicity'
)
## HELPER FUNCTIONS
## Load DC Area data for category of variable
dcarea.data <- function(vartype) {
f <- paste0(DATADIR, '2010/tabular/', vartype, '/dataset/tracts-2010TIGER-',
vartype, '.csv')
return(read.csv(f))
}
## Return 2-item vector containing mean and standard deviation of variable
meansd <- function(var) {
return(c(mean=mean(var, na.rm=TRUE), sd=sd(var, na.rm=TRUE)))
}Load data from the ACS 2011-2015 and the 1980 Census to analyze.
## Load and join data from different categories of variables from 2015 ACS
dcarea <- lapply(vartypes, dcarea.data) %>%
reduce(left_join, by='GISJOIN') %>%
select('GISJOIN', ends_with('15'))
## Load data from 1980
fname <- paste(DATADIR, '1980/tabular/', 'race-ethnicity/dataset',
'tracts-1980TIGER-race-ethnicity.csv', sep='/')
dcarea80 <- read.csv(fname)2.1 DC-area characteristics
- Population of the DC area.
-
Calculate the total population of the DC area in 1980 and 2015.
totpop80 <- sum(dcarea80$totpop)
totpop15 <- sum(dcarea$totpop15)| Year | Total Population |
|---|---|
| 1980 | 2,764,079 |
| 2015 | 4,096,851 |
- Racial composition in 1980.
-
Calculate proportion of DC-area population made up each racial group in 1980.
dcarea80 <- sapply(dcarea80[,c('totpop', 'nhw','nhb','hsp','api')], sum)
prace80 <- round(sapply(dcarea80[2:5], function(x) x/dcarea80[1]),3)*100
kable(prace80)| x | |
|---|---|
| nhw.totpop | 63.6 |
| nhb.totpop | 29.4 |
| hsp.totpop | 3.2 |
| api.totpop | 2.7 |
- Racial composition in 2015.
-
Calculate proportion of DC-area population made up of each racial group in 2015.
n.hsp <- sum(dcarea$hsp15)
n.api <- sum(dcarea$api15)
n.nhw <- sum(dcarea$nhw15)
n.nhb <- sum(dcarea$nhb15)
n.totpop <- sum(dcarea$totpop15)
p.hsp <- n.hsp/n.totpop
p.api <- n.api/n.totpop
p.nhw <- n.nhw/n.totpop
p.nhb <- n.nhb/n.totpop
dcarea.race <- data.frame(
race=c('Latinx','Asian','Non-Hispanic white', 'Non-Hispanic black'),
vals=round(c(p.hsp, p.api, p.nhw, p.nhb),3)*100)
kable(dcarea.race)| race | vals |
|---|---|
| Latinx | 15.7 |
| Asian | 11.1 |
| Non-Hispanic white | 40.7 |
| Non-Hispanic black | 29.2 |
- Foreign Born
-
Calculate percentage of residents that are foreign-born in DC area in 2015 (referenced on page 8 of the manuscript).
sum(dcarea$fbpop15)/sum(dcarea$totpop15)## [1] 0.2567643
- Countries of orgin.
-
Report largest three countries of origin in the DC area. Load data and keep only tracts in the DC area.
## Analyzes foreign-born population to find top three nations of origin
## among foreign-born population in the DC area
## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')
## Load file of 2015 ACS data on tract-level foreign-born population
fname <- 'dcarea/AC52015_foreign_born.csv'
fborn <- read.csv(fname)
## Select only DC-area tracts
dc.fborn <- select.dcarea(fborn)Create list of variable names that define regions, not countries and keep variables not in the list of regions (which define countries).
region.vars <- c(
'ADU4E001', 'ADU4E002', 'ADU4E003', 'ADU4E004', 'ADU4E005',
'ADU4E013', 'ADU4E021', 'ADU4E028',
'ADU4E047', 'ADU4E048', 'ADU4E049', 'ADU4E056', 'ADU4E067', 'ADU4E078',
'ADU4E091', 'ADU4E092', 'ADU4E098', 'ADU4E101', 'ADU4E106', 'ADU4E109',
'ADU4E117', 'ADU4E118',
'ADU4E123', 'ADU4E124', 'ADU4E125', 'ADU4E138', 'ADU4E148', 'ADU4E160'
)
## Keep only non-region variables
nonregion.vars <- names(dc.fborn)[!(names(dc.fborn) %in% region.vars)]
dc.fborn <- dc.fborn[, nonregion.vars]Sum total population across DC-area tracts by each country of origin, then rank and report the top three countries. The codes (variable names) for the countries can be found in the file analysis/dcarea/nhgis0011_ds216_20155_2015_tract_E_codebook.txt. Report the top three DC-area countries of origin (reported by variable name).
country.sums <- sapply(dc.fborn[, grep('^ADU4', names(dc.fborn), value=TRUE)],
function(x) sum(as.numeric(x)))
# Rank countries of origin by population size
country.rank <- sort(country.sums, decreasing=TRUE)
# Report countries of origin by variable name
kable(head(country.rank, 3))| x | |
|---|---|
| ADU4E142 | 139774 |
| ADU4E059 | 67203 |
| ADU4E054 | 50986 |
- Educational Attainment among Foreign-Born Residents
-
Report the percentage of U.S. and DC-area residents with BA or graduate degrees by foreign-born status.
## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')
## Load file of 2015 ACS data on tract-level foreign-born population
fname <- 'dcarea/AC52015_educ_by_foreign_born.csv'
fbeduc <- read.csv(fname)
us.fbba <- sum(fbeduc$fbba, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)
us.fbgr <- sum(fbeduc$fbgr, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)
dc.fbeduc <- select.dcarea(fbeduc)
dc.fbba <- (
sum(dc.fbeduc$fbba, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))
dc.fbgr <- (
sum(dc.fbeduc$fbgr, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))
fbeduc.comp <- data.frame(degree=c('BA', 'MA+'),
DC=round(c(dc.fbba, dc.fbgr),3)*100,
US=round(c(us.fbba, us.fbgr),3)*100)
kable(fbeduc.comp)| degree | DC | US |
|---|---|---|
| BA | 21.1 | 16.5 |
| MA+ | 22.2 | 12.0 |
2.2 Multiracial neighborhood characteristics
- Population living in multiracial neighborhoods
-
Report the number of people living in multiracial neighborhoods in 2015.
(quadpop <- sum(dcarea[dcarea$quad15==TRUE, 'totpop15'], na.rm=TRUE))## [1] 584493
- Comparison of multiracial to all DC-area neighborhoods.
-
Construct table comparing characteristics of the population living in DC-area multiracial neighborhoods to the characteristics of all DC-area neighborhoods. Code creates Table 1 in the manuscript. Save the comparison table to
tables/nhood_descriptives.tex.
## Construct variables for analytic table, and keep only those variables
dcarea <- dcarea %>% mutate(
## Children present in HH
pchildpres = pchpr15 * 100,
## Educational attainment
peduc.lhs = plh15 * 100,
peduc.hs = phs15 * 100,
peduc.somecoll = (psc15 + paa15) * 100,
peduc.ba = pba15 * 100,
peduc.ma = pgr15 * 100,
## Foreign-born
pfborn = fbpop15 / totpop15 * 100,
## Currently married
pmarried = pmar15 * 100,
## Median age (no new variable necessary)
## Race-ethnicity
race.papi = papi15 * 100,
race.phsp = phsp15 * 100,
race.pnhb = pnhb15 * 100,
race.pnhw = pnhw15 * 100
) %>%
select(GISJOIN, totpop15, pchildpres, pfborn, pmarried, mdage15,
starts_with('peduc'), starts_with('race'), quad15, nhw15)
## CONSTRUCT TABLE
## Define list of variables in desired display order
tablevars <- c(grep('^race', names(dcarea), value=TRUE),
grep('^peduc', names(dcarea), value=TRUE),
'pfborn', 'pchildpres', 'pmarried')
## Create table of values for multiethnic and all DC Area tracts
quads <- t(sapply(dcarea[dcarea$quad15==TRUE, tablevars], meansd))
tracts <- t(sapply(dcarea[, tablevars], meansd))
tracttbl <- data.frame(quads, tracts)
## List of variable names to be published in the table
tblnames <- c('\\emph{Racial composition}&&&\\\\Percent Asian',
'Percent Hispanic',
'Percent non-Hispanic black',
'Percent non-Hispanic white',
'\\emph{Educational attainment}&&&\\\\Percent less than high school',
'Percent high school',
'Percent some college', 'Percent bachelor\'s degree',
'Percent professional degree',
'\\emph{Other demographic characteristics}&&&\\\\Percent foreign-born',
'Percent of households with children present',
'Percent married (not separated)'
)
## Create table with formatting
tbl <- data.frame(tblnames, quads, blank=rep(' ', nrow(quads)), tracts)
tbl[,c(2,3,5,6)] <- lapply(tbl[,c(2,3,5,6)], function(x) sprintf('%3.1f', x))
# tbl[,c(3,6)] <- lapply(tbl[,c(3,6)], function(x) paste0('(',x,')'))
colnames(tbl) <- c('Variable', 'Mean', 'S.D.','', 'Mean', 'S.D.')
cap <- paste0('Means and standard deviations of tract-level variables',
'in multiethnic and quadrivial neighborhoods in the DC Area')
kable(tbl,
caption=cap, row.names=FALSE)| Variable | Mean | S.D. | Mean | S.D. | |
|---|---|---|---|---|---|
| &&&\Percent Asian | 18.3 | 7.5 | 10.2 | 9.4 | |
| Percent Hispanic | 24.3 | 9.8 | 14.6 | 13.6 | |
| Percent non-Hispanic black | 22.2 | 9.5 | 30.9 | 31.3 | |
| Percent non-Hispanic white | 31.5 | 9.7 | 41.1 | 27.2 | |
| &&&\Percent less than high school | 12.8 | 6.5 | 9.9 | 9.3 | |
| Percent high school | 18.0 | 5.6 | 17.2 | 11.1 | |
| Percent some college | 22.6 | 5.3 | 20.6 | 8.6 | |
| Percent bachelor’s degree | 25.4 | 5.8 | 25.5 | 9.8 | |
| Percent professional degree | 21.2 | 6.6 | 26.8 | 15.9 | |
| &&&\Percent foreign-born | 39.7 | 9.1 | 23.9 | 14.7 | |
| Percent of households with children present | 37.5 | 9.9 | 32.7 | 12.2 | |
| Percent married (not separated) | 48.4 | 8.0 | 44.9 | 15.8 |
- Location of multiracial neighborhoods in the DC area
-
Rank counties by the number of multiracial neighborhoods included in their jurisdictions. Multiracial neighborhoods were those in which Asians, blacks, Latinxs, and whites all made up at least 10% of the neighborhood populaiton and no group represents a majority.
Construct variables measuring multiracial neighborhoods, both those included using the criteria and those excluded for not meeting the criterion that no group be a majority.
racevars <- paste0('race.',c('papi','pnhb','phsp','pnhw'))
dcarea$othmulti <- apply(sapply(dcarea[, racevars],
function(x) !is.na(x) & x>=10), 1, all)
dcarea$exclmulti <- dcarea$quad15 != dcarea$othmulti
dcarea$county <- factor(substr(as.character(dcarea$GISJOIN), 2, 7))
levels(dcarea$county) <- c(
'D.C.'
, 'Montgomery county'
, 'Prince George\'s county'
, 'Arlington county'
, 'Fairfax county'
, 'Fairfax city'
, 'Falls Church city'
, 'Alexandria city'
)
## Breakdown of 'excluded' multiethnic neighborhoods by county
excl_juris <- table(dcarea[, c('county', 'exclmulti')])
N_excluded <- sum(excl_juris[,2])
## Breakdown of multiethnic neighborhoods by county
multi_tbl <-table(dcarea[, c('county', 'quad15')])
kable(sort(multi_tbl[,2], decreasing=TRUE))| x | |
|---|---|
| Montgomery county | 53 |
| Fairfax county | 41 |
| Prince George’s county | 10 |
| Fairfax city | 4 |
| Arlington county | 2 |
| D.C. | 1 |
| Falls Church city | 0 |
| Alexandria city | 0 |
2.3 Diverse neighborhoods excluded for having a majority race
There were 18 neighborhoods that did not meet the criterion that no racial group be a majority.
Excluded for having a white majority
dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhw>50,
c('county', racevars, 'GISJOIN')]| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|---|---|---|---|---|---|
| D.C. | 12.1 | 17.7 | 10.9 | 56 | G1100010005900 |
| D.C. | 14.1 | 12.3 | 12.3 | 57.7 | G1100010010100 |
| Montgomery county | 21 | 10.8 | 11.3 | 53.2 | G2400310700616 |
| Montgomery county | 13.5 | 11.3 | 10.2 | 61.5 | G2400310701004 |
| Montgomery county | 13.9 | 12.7 | 10.2 | 59.9 | G2400310701313 |
| Montgomery county | 10.3 | 18 | 18.1 | 50.1 | G2400310701314 |
| Montgomery county | 12.7 | 20.5 | 14.8 | 50.3 | G2400310703216 |
| Arlington county | 20.5 | 11.5 | 13.8 | 50.3 | G5100130103200 |
| Arlington county | 14.2 | 16.3 | 10.5 | 57 | G5100130103502 |
| Fairfax county | 13.4 | 13.4 | 15.4 | 55.4 | G5100590421102 |
| Fairfax county | 11.4 | 16.6 | 11.7 | 53.8 | G5100590422301 |
| Fairfax county | 18.5 | 11.4 | 12.4 | 55.4 | G5100590432000 |
| Fairfax county | 11.1 | 14.9 | 13.1 | 50.5 | G5100590432800 |
| Fairfax county | 21.7 | 12.1 | 10.7 | 51.5 | G5100590440202 |
| Fairfax city | 10.1 | 21.8 | 12.4 | 52.4 | G5105100200107 |
Excluded for having a black majority
dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhb>50,
c('county', racevars, 'GISJOIN')]| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|---|---|---|---|---|---|
| Montgomery county | 13.4 | 58.3 | 11.6 | 10.6 | G2400310701423 |
| Prince George's county | 12.9 | 50.1 | 20.6 | 10.4 | G2400330801411 |
Excluded for having a Latino majority:
dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.phsp>50,
c('county', racevars, 'GISJOIN')]| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|---|---|---|---|---|---|
| Montgomery county | 12.8 | 23.3 | 50.8 | 10.3 | G2400310700724 |
Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 13.0 [Database]. Minneapolis: University of Minnesota. 2018. http://doi.org/10.18128/D050.V13.0↩︎