2 Description of DC-Area & Multiracial Neighborhood Populations

The following provides information about the DC Area cited in the manuscript. It includes the code that creates Table 1, which compares demographic characteristics of multiracial neighborhoods in the DC Area to those of all neighborhoods in the DC Area. Data come from the IPUMS National Historical Geographic Information System (NHGIS).¹

Set up the environment for analyzing Census data. Note the difference between DATADIR, created here, and dataDIR, created in Data Construction. The data files are separated by category and stored in separate directories, whose names are listed in vartypes.

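## Packages used below: dplyr (pipe, joins, select/mutate), purrr (reduce),
## and knitr (kable)
library(dplyr)
library(purrr)
library(knitr)

DATADIR <- '../data/dcarea/tracts/'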

vartypes <- c(
    'children-present',
    'educ-attainment',
    'foreign-born',
    'marital-status',
    'median-age',
    'race-ethnicity'
)

## HELPER FUNCTIONS
## Load DC Area data for category of variable
dcarea.data <- function(vartype) {
    f <- paste0(DATADIR, '2010/tabular/', vartype, '/dataset/tracts-2010TIGER-', 
                vartype, '.csv')
    return(read.csv(f))
}

## Return 2-item vector containing mean and standard deviation of variable
meansd <- function(var) {
    return(c(mean=mean(var, na.rm=TRUE), sd=sd(var, na.rm=TRUE)))
}
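
As a quick illustration of how meansd is used to build a table later on, the sketch below applies it to a small data frame. The data frame toy and its values are made up for illustration; the helper is repeated only so the example runs on its own.

```r
## Same helper as defined above, repeated so the example is self-contained
meansd <- function(var) {
    c(mean = mean(var, na.rm = TRUE), sd = sd(var, na.rm = TRUE))
}

## Hypothetical tract-level percentages (made-up values, one missing)
toy <- data.frame(pct_a = c(10, 20, NA, 30),
                  pct_b = c(5, 5, 5, 5))

## sapply() returns a matrix with rows 'mean' and 'sd' and one column per
## variable; t() flips it so that each variable becomes a row
toy_tbl <- t(sapply(toy, meansd))
print(toy_tbl)
```

Missing values are dropped before computing each statistic, so pct_a is summarized over its three observed values.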

Load the data to analyze from the 2011-2015 ACS and the 1980 Census.

## Load and join data from different categories of variables from 2015 ACS
dcarea <- lapply(vartypes, dcarea.data) %>%
    reduce(left_join, by='GISJOIN') %>%
    select('GISJOIN', ends_with('15'))

## Load data from 1980
## DATADIR already ends in '/', so build the path with paste0
fname <- paste0(DATADIR, '1980/tabular/race-ethnicity/dataset/',
                'tracts-1980TIGER-race-ethnicity.csv')
dcarea80 <- read.csv(fname)
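
The join step above folds a list of per-category tables into one tract-level table keyed on GISJOIN and then keeps only the 2015 columns. A minimal sketch of the same idea follows, using base R (Reduce and merge) in place of purrr and dplyr so that it runs without extra packages. The three small data frames and the helper left_merge are hypothetical.

```r
## Hypothetical per-category tables keyed by GISJOIN (made-up values)
race  <- data.frame(GISJOIN = c('G1', 'G2', 'G3'),
                    totpop15 = c(100, 200, 300))
fborn <- data.frame(GISJOIN = c('G1', 'G2'),
                    fbpop15 = c(10, 50))
age   <- data.frame(GISJOIN = c('G1', 'G2', 'G3'),
                    mdage15 = c(31, 40, 36),
                    mdage10 = c(30, 39, 35))

## Left join: keep every row of x, fill unmatched rows of y with NA
left_merge <- function(x, y) merge(x, y, by = 'GISJOIN', all.x = TRUE)

## Fold the list of tables into one, joining one table at a time
joined <- Reduce(left_merge, list(race, fborn, age))

## Keep the key plus the columns whose names end in '15'
keep <- names(joined) == 'GISJOIN' | grepl('15$', names(joined))
joined <- joined[, keep]
print(joined)
```

Tracts missing from a later table (G3 in fborn here) stay in the result with NA values, which is why the sums further down may need na.rm=TRUE if any category file is incomplete.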

2.1 DC-area characteristics

Population of the DC area: Calculate the total population of the DC area in 1980 and 2015.

totpop80 <- sum(dcarea80$totpop)
totpop15 <- sum(dcarea$totpop15)

Year	Total Population
1980	2,764,079
2015	4,096,851

Racial composition in 1980: Calculate the proportion of the DC-area population made up of each racial group in 1980.

dcarea80 <- sapply(dcarea80[,c('totpop', 'nhw','nhb','hsp','api')], sum)
prace80 <- round(sapply(dcarea80[2:5], function(x) x/dcarea80[1]),3)*100
kable(prace80)

	x
nhw.totpop	63.6
nhb.totpop	29.4
hsp.totpop	3.2
api.totpop	2.7

Racial composition in 2015: Calculate the proportion of the DC-area population made up of each racial group in 2015.

n.hsp <- sum(dcarea$hsp15)
n.api <- sum(dcarea$api15)
n.nhw <- sum(dcarea$nhw15)
n.nhb <- sum(dcarea$nhb15)
n.totpop <- sum(dcarea$totpop15)

p.hsp <- n.hsp/n.totpop
p.api <- n.api/n.totpop
p.nhw <- n.nhw/n.totpop
p.nhb <- n.nhb/n.totpop

dcarea.race <- data.frame(
    race=c('Latinx','Asian','Non-Hispanic white', 'Non-Hispanic black'),
    vals=round(c(p.hsp, p.api, p.nhw, p.nhb),3)*100)
kable(dcarea.race)

race	vals
Latinx	15.7
Asian	11.1
Non-Hispanic white	40.7
Non-Hispanic black	29.2

Foreign Born: Calculate the proportion of residents who are foreign-born in the DC area in 2015 (referenced on page 8 of the manuscript).

sum(dcarea$fbpop15)/sum(dcarea$totpop15)

## [1] 0.2567643

Countries of origin: Report the three largest countries of origin in the DC area. Load the data and keep only tracts in the DC area.

## Analyzes foreign-born population to find top three nations of origin
## among foreign-born population in the DC area

## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')

## Load file of 2015 ACS data on tract-level foreign-born population
fname <- 'dcarea/AC52015_foreign_born.csv'
fborn <- read.csv(fname)

## Select only DC-area tracts
dc.fborn <- select.dcarea(fborn)

Create a list of the variable names that define regions rather than countries, then keep only the variables that are not in that list (the ones that define countries).

region.vars <- c(
    'ADU4E001', 'ADU4E002', 'ADU4E003', 'ADU4E004', 'ADU4E005',
    'ADU4E013', 'ADU4E021', 'ADU4E028',
    'ADU4E047', 'ADU4E048', 'ADU4E049', 'ADU4E056', 'ADU4E067', 'ADU4E078',
    'ADU4E091', 'ADU4E092', 'ADU4E098', 'ADU4E101', 'ADU4E106', 'ADU4E109',
    'ADU4E117', 'ADU4E118',
    'ADU4E123', 'ADU4E124', 'ADU4E125', 'ADU4E138', 'ADU4E148', 'ADU4E160'
)

## Keep only non-region variables
nonregion.vars <- names(dc.fborn)[!(names(dc.fborn) %in% region.vars)]
dc.fborn <- dc.fborn[, nonregion.vars]

Sum the population across DC-area tracts for each country of origin, then rank the countries and report the top three by variable name. The codes (variable names) for the countries can be found in the file analysis/dcarea/nhgis0011_ds216_20155_2015_tract_E_codebook.txt.

country.sums <- sapply(dc.fborn[, grep('^ADU4', names(dc.fborn), value=TRUE)],
                       function(x) sum(as.numeric(x)))

# Rank countries of origin by population size
country.rank <- sort(country.sums, decreasing=TRUE)

# Report countries of origin by variable name
kable(head(country.rank, 3))

	x
ADU4E142	139774
ADU4E059	67203
ADU4E054	50986

Educational Attainment among Foreign-Born Residents: Report the percentage of U.S. and DC-area residents with BA or graduate degrees by foreign-born status.

## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')

## Load file of 2015 ACS data on tract-level educational attainment by
## foreign-born status
fname <- 'dcarea/AC52015_educ_by_foreign_born.csv'

fbeduc <- read.csv(fname)
us.fbba <- sum(fbeduc$fbba, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)
us.fbgr <- sum(fbeduc$fbgr, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)

dc.fbeduc <- select.dcarea(fbeduc)
dc.fbba <- (
    sum(dc.fbeduc$fbba, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))
dc.fbgr <- (
    sum(dc.fbeduc$fbgr, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))

fbeduc.comp <- data.frame(degree=c('BA', 'MA+'),
                          DC=round(c(dc.fbba, dc.fbgr),3)*100,
                          US=round(c(us.fbba, us.fbgr),3)*100)

kable(fbeduc.comp)

degree	DC	US
BA	21.1	16.5
MA+	22.2	12.0

2.2 Multiracial neighborhood characteristics

Population living in multiracial neighborhoods: Report the number of people living in multiracial neighborhoods in 2015.

(quadpop <- sum(dcarea[dcarea$quad15==TRUE, 'totpop15'], na.rm=TRUE))

## [1] 584493

Comparison of multiracial to all DC-area neighborhoods: Construct a table comparing the characteristics of the population living in DC-area multiracial neighborhoods to the characteristics of all DC-area neighborhoods. The code creates Table 1 in the manuscript. Save the comparison table to tables/nhood_descriptives.tex.

## Construct variables for analytic table, and keep only those variables
dcarea <- dcarea %>% mutate(
    ## Children present in HH
    pchildpres = pchpr15 * 100,

    ## Educational attainment
    peduc.lhs = plh15 * 100,
    peduc.hs = phs15 * 100,
    peduc.somecoll = (psc15 + paa15) * 100,
    peduc.ba = pba15 * 100,
    peduc.ma = pgr15 * 100,

    ## Foreign-born
    pfborn = fbpop15 / totpop15 * 100,

    ## Currently married
    pmarried = pmar15 * 100,

    ## Median age (no new variable necessary)

    ## Race-ethnicity
    race.papi = papi15 * 100,
    race.phsp = phsp15 * 100,
    race.pnhb = pnhb15 * 100,
    race.pnhw = pnhw15 * 100
) %>%
    select(GISJOIN, totpop15, pchildpres, pfborn, pmarried, mdage15,
           starts_with('peduc'), starts_with('race'), quad15, nhw15)

## CONSTRUCT TABLE
## Define list of variables in desired display order
tablevars <- c(grep('^race', names(dcarea), value=TRUE),
               grep('^peduc', names(dcarea), value=TRUE),
               'pfborn', 'pchildpres', 'pmarried')

## Create table of values for multiethnic and all DC Area tracts
quads <- t(sapply(dcarea[dcarea$quad15==TRUE, tablevars], meansd))
tracts <- t(sapply(dcarea[, tablevars], meansd))
tracttbl <- data.frame(quads, tracts)

## List of variable names to be published in the table
tblnames <- c('\\emph{Racial composition}&&&\\\\Percent Asian',
              'Percent Hispanic',
              'Percent non-Hispanic black',
              'Percent non-Hispanic white',
              '\\emph{Educational attainment}&&&\\\\Percent less than high school',
              'Percent high school',
              'Percent some college', 'Percent bachelor\'s degree',
              'Percent professional degree',
              '\\emph{Other demographic characteristics}&&&\\\\Percent foreign-born',
              'Percent of households with children present',
              'Percent married (not separated)'
)

## Create table with formatting
tbl <- data.frame(tblnames, quads, blank=rep(' ', nrow(quads)), tracts)
tbl[,c(2,3,5,6)] <- lapply(tbl[,c(2,3,5,6)], function(x) sprintf('%3.1f', x))
# tbl[,c(3,6)] <- lapply(tbl[,c(3,6)], function(x) paste0('(',x,')'))
colnames(tbl) <- c('Variable', 'Mean', 'S.D.','', 'Mean', 'S.D.')

## Note the trailing space: paste0() does not insert a separator
cap <- paste0('Means and standard deviations of tract-level variables ',
              'in multiethnic and quadrivial neighborhoods in the DC Area')
kable(tbl,
      caption=cap, row.names=FALSE)

Table 2.1: Means and standard deviations of tract-level variables in multiethnic and quadrivial neighborhoods in the DC Area
Variable	Mean (multiracial tracts)	S.D. (multiracial tracts)	Mean (all tracts)	S.D. (all tracts)
Racial composition
Percent Asian	18.3	7.5	10.2	9.4
Percent Hispanic	24.3	9.8	14.6	13.6
Percent non-Hispanic black	22.2	9.5	30.9	31.3
Percent non-Hispanic white	31.5	9.7	41.1	27.2
Educational attainment
Percent less than high school	12.8	6.5	9.9	9.3
Percent high school	18.0	5.6	17.2	11.1
Percent some college	22.6	5.3	20.6	8.6
Percent bachelor’s degree	25.4	5.8	25.5	9.8
Percent professional degree	21.2	6.6	26.8	15.9
Other demographic characteristics
Percent foreign-born	39.7	9.1	23.9	14.7
Percent of households with children present	37.5	9.9	32.7	12.2
Percent married (not separated)	48.4	8.0	44.9	15.8

Location of multiracial neighborhoods in the DC area

Rank counties by the number of multiracial neighborhoods within their jurisdictions. Multiracial neighborhoods are those in which Asians, blacks, Latinxs, and whites each make up at least 10% of the neighborhood population and no group represents a majority.
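
The two-part definition can be sketched on a few made-up tracts. This is an illustration only: the data frame toy is hypothetical, a majority is assumed to mean a share above 50%, and the quad15 indicator used in the analysis is constructed elsewhere and is not reproduced here.

```r
## Hypothetical tract shares in percent (made-up values)
toy <- data.frame(
    tract = c('A', 'B', 'C', 'D'),
    papi  = c(18, 12, 12, 5),
    pnhb  = c(22, 18, 58, 30),
    phsp  = c(24, 11, 12, 20),
    pnhw  = c(32, 56, 11, 40))
groups <- c('papi', 'pnhb', 'phsp', 'pnhw')
shares <- as.matrix(toy[, groups])

## Criterion 1: every group makes up at least 10% of the tract
all_present <- apply(shares >= 10, 1, all)

## Criterion 2 (assumed threshold): no group exceeds 50% of the tract
no_majority <- apply(shares <= 50, 1, all)

## Multiracial tracts meet both; 'excluded' tracts fail only criterion 2
toy$multiracial <- unname(all_present & no_majority)
toy$excluded    <- unname(all_present & !no_majority)
print(toy[, c('tract', 'multiracial', 'excluded')])
```

Tract A meets both criteria, tracts B and C have all four groups present but also a majority group, and tract D fails the 10% threshold for one group.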

Construct variables measuring multiracial neighborhoods, both those included using the criteria and those excluded for not meeting the criterion that no group be a majority.

racevars <- paste0('race.',c('papi','pnhb','phsp','pnhw'))
dcarea$othmulti <- apply(sapply(dcarea[, racevars],
                                function(x) !is.na(x) & x>=10), 1, all)
dcarea$exclmulti <- dcarea$quad15 != dcarea$othmulti
dcarea$county <- factor(substr(as.character(dcarea$GISJOIN), 2, 7))
## Labels are assigned by position, so they must follow the sorted codes:
## 110001, 240031, 240033, 510013, 510059, 510510, 510600, 510610
levels(dcarea$county) <- c(
    'D.C.'
    , 'Montgomery county'
    , 'Prince George\'s county'
    , 'Arlington county'
    , 'Fairfax county'
    , 'Alexandria city'
    , 'Fairfax city'
    , 'Falls Church city'
)

## Breakdown of 'excluded' multiethnic neighborhoods by county
excl_juris <- table(dcarea[, c('county', 'exclmulti')])
N_excluded <- sum(excl_juris[,2])

## Breakdown of multiethnic neighborhoods by county
multi_tbl <-table(dcarea[, c('county', 'quad15')])
kable(sort(multi_tbl[,2], decreasing=TRUE))

	x
Montgomery county	53
Fairfax county	41
Prince George’s county	10
Alexandria city	4
Arlington county	2
D.C.	1
Fairfax city	0
Falls Church city	0
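
Characters 2 through 7 of a GISJOIN hold the state FIPS code, a padding zero, and the county FIPS code, so G510510 is Alexandria city (FIPS 51510), G510600 is Fairfax city (51600), and G510610 is Falls Church city (51610). Because positional labels depend on the sort order of these codes, a lookup keyed on the code itself is less error-prone. The sketch below is one way to write it; county_lookup is a hypothetical helper, and the three identifiers are taken from the tables in this section.

```r
## GISJOIN identifiers that appear in the tables of this section
gisjoin <- c('G1100010005900', 'G2400310700616', 'G5105100200107')

## Characters 2-7: state FIPS (2 digits), '0', county FIPS (3 digits)
code <- substr(gisjoin, 2, 7)

## Hypothetical lookup keyed on the code, so label order cannot drift
county_lookup <- c(
    '110001' = 'D.C.',
    '240031' = 'Montgomery county',
    '240033' = "Prince George's county",
    '510013' = 'Arlington county',
    '510059' = 'Fairfax county',
    '510510' = 'Alexandria city',
    '510600' = 'Fairfax city',
    '510610' = 'Falls Church city')

county <- unname(county_lookup[code])
print(data.frame(gisjoin, code, county))
```

A code that is missing from the lookup comes back as NA instead of silently taking a neighboring label.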

2.3 Diverse neighborhoods excluded for having a majority race

There were 18 neighborhoods that did not meet the criterion that no racial group be a majority.

Excluded for having a white majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhw>50,
       c('county', racevars, 'GISJOIN')]

Table 2.2:
county	race.papi	race.pnhb	race.phsp	race.pnhw	GISJOIN
D.C.	12.1	17.7	10.9	56	G1100010005900
D.C.	14.1	12.3	12.3	57.7	G1100010010100
Montgomery county	21	10.8	11.3	53.2	G2400310700616
Montgomery county	13.5	11.3	10.2	61.5	G2400310701004
Montgomery county	13.9	12.7	10.2	59.9	G2400310701313
Montgomery county	10.3	18	18.1	50.1	G2400310701314
Montgomery county	12.7	20.5	14.8	50.3	G2400310703216
Arlington county	20.5	11.5	13.8	50.3	G5100130103200
Arlington county	14.2	16.3	10.5	57	G5100130103502
Fairfax county	13.4	13.4	15.4	55.4	G5100590421102
Fairfax county	11.4	16.6	11.7	53.8	G5100590422301
Fairfax county	18.5	11.4	12.4	55.4	G5100590432000
Fairfax county	11.1	14.9	13.1	50.5	G5100590432800
Fairfax county	21.7	12.1	10.7	51.5	G5100590440202
Alexandria city	10.1	21.8	12.4	52.4	G5105100200107

Excluded for having a black majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhb>50,
       c('county', racevars, 'GISJOIN')]

Table 2.3:
county	race.papi	race.pnhb	race.phsp	race.pnhw	GISJOIN
Montgomery county	13.4	58.3	11.6	10.6	G2400310701423
Prince George's county	12.9	50.1	20.6	10.4	G2400330801411

Excluded for having a Latino majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.phsp>50,
       c('county', racevars, 'GISJOIN')]

Table 2.4:
county	race.papi	race.pnhb	race.phsp	race.pnhw	GISJOIN
Montgomery county	12.8	23.3	50.8	10.3	G2400310700724

¹ Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 13.0 [Database]. Minneapolis: University of Minnesota. 2018. http://doi.org/10.18128/D050.V13.0