2 Description of DC-Area & Multiracial Neighborhood Populations

The following provides information about the DC Area cited in the manuscript. It includes the code that creates Table 1, which compares demographic characteristics of multiracial neighborhoods in the DC Area to those of all neighborhoods in the DC Area. Data come from the IPUMS National Historical Geographic Information System (NHGIS).1

Set up environment for analyzing data from the Census. Note the difference between DATADIR created here and dataDIR created in Data Construction. The data files are separated by category and are stored in separate directories, whose names are stored in vartypes.

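## Packages used by the code below (reloading is harmless if already attached)
library(dplyr)   # %>%, left_join, select, mutate
library(purrr)   # reduce
library(knitr)   # kable

DATADIR <- '../data/dcarea/tracts/'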

vartypes <- c(
    'children-present',
    'educ-attainment',
    'foreign-born',
    'marital-status',
    'median-age',
    'race-ethnicity'
)

## HELPER FUNCTIONS
## Load DC Area data for category of variable
dcarea.data <- function(vartype) {
    f <- paste0(DATADIR, '2010/tabular/', vartype, '/dataset/tracts-2010TIGER-', 
                vartype, '.csv')
    return(read.csv(f))
}

## Return 2-item vector containing mean and standard deviation of variable
meansd <- function(var) {
    return(c(mean=mean(var, na.rm=TRUE), sd=sd(var, na.rm=TRUE)))
}
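
As a quick check of the helper above, the following sketch applies `meansd` to a toy vector with a missing value. The vector `x` is made up for illustration; only `meansd` comes from the code above.

```r
## Same helper as defined above, repeated so the sketch runs on its own
meansd <- function(var) {
    return(c(mean=mean(var, na.rm=TRUE), sd=sd(var, na.rm=TRUE)))
}

## Toy values (hypothetical); the NA is dropped before both statistics
x <- c(10, 20, NA, 30)
res <- meansd(x)
print(res)
```

Because both statistics use `na.rm=TRUE`, tracts with missing values are silently excluded rather than propagating `NA` into the table.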

Load data from the ACS 2011-2015 and the 1980 Census to analyze.

## Load and join data from different categories of variables from 2015 ACS
dcarea <- lapply(vartypes, dcarea.data) %>%
    reduce(left_join, by='GISJOIN') %>%
    select('GISJOIN', ends_with('15'))

## Load data from 1980
fname <- paste0(DATADIR, '1980/tabular/race-ethnicity/dataset/',
                'tracts-1980TIGER-race-ethnicity.csv')
dcarea80 <- read.csv(fname)
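
The load-and-join step can be illustrated on toy data. This is a minimal sketch that swaps in base R's `Reduce` and `merge(..., all.x=TRUE)` for `purrr::reduce` and `dplyr::left_join` so that it runs without extra packages; the three data frames and their values are hypothetical stand-ins for the per-category files.

```r
## Hypothetical per-category tables keyed by GISJOIN
pop  <- data.frame(GISJOIN=c('G1', 'G2', 'G3'), totpop15=c(100, 200, 300))
fb   <- data.frame(GISJOIN=c('G1', 'G2'), fbpop15=c(10, 40))
age  <- data.frame(GISJOIN=c('G1', 'G2', 'G3'),
                   mdage15=c(30, 35, 40), mdage10=c(29, 34, 39))

## Successive left joins on GISJOIN (all.x=TRUE keeps every row of the left table)
joined <- Reduce(function(x, y) merge(x, y, by='GISJOIN', all.x=TRUE),
                 list(pop, fb, age))

## Keep the key plus the variables whose names end in '15'
keep <- c('GISJOIN', grep('15$', names(joined), value=TRUE))
joined <- joined[, keep]
print(joined)
```

A left join keeps every tract in the first table, so a tract missing from a later category file shows up with `NA` rather than being dropped.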

2.1 DC-area characteristics

Population of the DC area.

Calculate the total population of the DC area in 1980 and 2015.

totpop80 <- sum(dcarea80$totpop)
totpop15 <- sum(dcarea$totpop15)

| Year | Total Population |
|------|------------------|
| 1980 | 2,764,079        |
| 2015 | 4,096,851        |

Racial composition in 1980.

Calculate the proportion of the DC-area population made up of each racial group in 1980.

dcarea80 <- sapply(dcarea80[,c('totpop', 'nhw','nhb','hsp','api')], sum)
prace80 <- round(sapply(dcarea80[2:5], function(x) x/dcarea80[1]),3)*100
kable(prace80)

|            | x    |
|------------|------|
| nhw.totpop | 63.6 |
| nhb.totpop | 29.4 |
| hsp.totpop | 3.2  |
| api.totpop | 2.7  |

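
The share calculation can be sketched on made-up counts. The numbers below are hypothetical and are not the 1980 totals; the sketch also divides by a scalar extracted with `[[`, which keeps the group names clean instead of producing compound labels such as `nhw.totpop`.

```r
## Hypothetical area-wide counts: total population followed by four groups
counts <- c(totpop=1000, nhw=600, nhb=300, hsp=60, api=40)

## Each group's share of the total, as a percentage rounded to one decimal
shares <- round(counts[-1] / counts[['totpop']], 3) * 100
print(shares)
```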
Racial composition in 2015.

Calculate proportion of DC-area population made up of each racial group in 2015.

n.hsp <- sum(dcarea$hsp15)
n.api <- sum(dcarea$api15)
n.nhw <- sum(dcarea$nhw15)
n.nhb <- sum(dcarea$nhb15)
n.totpop <- sum(dcarea$totpop15)

p.hsp <- n.hsp/n.totpop
p.api <- n.api/n.totpop
p.nhw <- n.nhw/n.totpop
p.nhb <- n.nhb/n.totpop

dcarea.race <- data.frame(
    race=c('Latinx','Asian','Non-Hispanic white', 'Non-Hispanic black'),
    vals=round(c(p.hsp, p.api, p.nhw, p.nhb),3)*100)
kable(dcarea.race)

| race               | vals |
|--------------------|------|
| Latinx             | 15.7 |
| Asian              | 11.1 |
| Non-Hispanic white | 40.7 |
| Non-Hispanic black | 29.2 |

Foreign Born

Calculate percentage of residents that are foreign-born in DC area in 2015 (referenced on page 8 of the manuscript).

sum(dcarea$fbpop15)/sum(dcarea$totpop15)
## [1] 0.2567643

Countries of origin.

Report largest three countries of origin in the DC area. Load data and keep only tracts in the DC area.

## Analyzes foreign-born population to find top three nations of origin
## among foreign-born population in the DC area

## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')

## Load file of 2015 ACS data on tract-level foreign-born population
fname <- 'dcarea/AC52015_foreign_born.csv'
fborn <- read.csv(fname)

## Select only DC-area tracts
dc.fborn <- select.dcarea(fborn)

Create a list of the variable names that define regions rather than countries, then keep only the variables that are not in that list (those that define countries).

region.vars <- c(
    'ADU4E001', 'ADU4E002', 'ADU4E003', 'ADU4E004', 'ADU4E005',
    'ADU4E013', 'ADU4E021', 'ADU4E028',
    'ADU4E047', 'ADU4E048', 'ADU4E049', 'ADU4E056', 'ADU4E067', 'ADU4E078',
    'ADU4E091', 'ADU4E092', 'ADU4E098', 'ADU4E101', 'ADU4E106', 'ADU4E109',
    'ADU4E117', 'ADU4E118',
    'ADU4E123', 'ADU4E124', 'ADU4E125', 'ADU4E138', 'ADU4E148', 'ADU4E160'
)

## Keep only non-region variables
nonregion.vars <- names(dc.fborn)[!(names(dc.fborn) %in% region.vars)]
dc.fborn <- dc.fborn[, nonregion.vars]

Sum the population across DC-area tracts for each country of origin, then rank the countries and report the top three by variable name. The codes (variable names) for the countries can be found in the file analysis/dcarea/nhgis0011_ds216_20155_2015_tract_E_codebook.txt.

country.sums <- sapply(dc.fborn[, grep('^ADU4', names(dc.fborn), value=TRUE)],
                       function(x) sum(as.numeric(x)))

# Rank countries of origin by population size
country.rank <- sort(country.sums, decreasing=TRUE)

# Report countries of origin by variable name
kable(head(country.rank, 3))

|          | x      |
|----------|--------|
| ADU4E142 | 139774 |
| ADU4E059 | 67203  |
| ADU4E054 | 50986  |

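
The filter-then-rank logic can be sketched on a toy table. The column codes and counts below are hypothetical (they are not taken from the NHGIS codebook), and `colSums` is used as a compact stand-in for the `sapply(..., sum)` call above.

```r
## Hypothetical tract-level counts: one region total and two country columns
fb <- data.frame(GISJOIN=c('G1', 'G2'),
                 ADU4E001=c(30, 50),   # treated here as a region total
                 ADU4E010=c(10, 5),    # treated here as a country
                 ADU4E011=c(20, 45))   # treated here as a country
region.vars <- c('ADU4E001')

## Keep only the country columns, then sum each across tracts
country.cols <- setdiff(grep('^ADU4', names(fb), value=TRUE), region.vars)
country.sums <- colSums(fb[, country.cols])

## Largest country of origin, reported by variable name
top <- head(sort(country.sums, decreasing=TRUE), 1)
print(top)
```

Dropping the region columns first matters because regional totals would otherwise outrank every individual country.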
Educational Attainment among Foreign-Born Residents

Report the percentage of foreign-born residents with BA or graduate degrees in the U.S. and in the DC area.

## Source file with function to construct variables for DC area
source('dcarea/dcarea_functions.R')

## Load file of 2015 ACS data on tract-level educational attainment of the
## foreign-born population
fname <- 'dcarea/AC52015_educ_by_foreign_born.csv'

fbeduc <- read.csv(fname)
us.fbba <- sum(fbeduc$fbba, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)
us.fbgr <- sum(fbeduc$fbgr, na.rm=TRUE) / sum(fbeduc$fbtot25o, na.rm=TRUE)

dc.fbeduc <- select.dcarea(fbeduc)
dc.fbba <- (
    sum(dc.fbeduc$fbba, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))
dc.fbgr <- (
    sum(dc.fbeduc$fbgr, na.rm=TRUE) / sum(dc.fbeduc$fbtot25o, na.rm=TRUE))

fbeduc.comp <- data.frame(degree=c('BA', 'MA+'),
                          DC=round(c(dc.fbba, dc.fbgr),3)*100,
                          US=round(c(us.fbba, us.fbgr),3)*100)

kable(fbeduc.comp)

| degree | DC   | US   |
|--------|------|------|
| BA     | 21.1 | 16.5 |
| MA+    | 22.2 | 12.0 |

2.2 Multiracial neighborhood characteristics

Population living in multiracial neighborhoods

Report the number of people living in multiracial neighborhoods in 2015.

(quadpop <- sum(dcarea[dcarea$quad15==TRUE, 'totpop15'], na.rm=TRUE))
## [1] 584493
Comparison of multiracial to all DC-area neighborhoods.

Construct table comparing characteristics of the population living in DC-area multiracial neighborhoods to the characteristics of all DC-area neighborhoods. Code creates Table 1 in the manuscript. Save the comparison table to tables/nhood_descriptives.tex.

## Construct variables for analytic table, and keep only those variables
dcarea <- dcarea %>% mutate(
    ## Children present in HH
    pchildpres = pchpr15 * 100,

    ## Educational attainment
    peduc.lhs = plh15 * 100,
    peduc.hs = phs15 * 100,
    peduc.somecoll = (psc15 + paa15) * 100,
    peduc.ba = pba15 * 100,
    peduc.ma = pgr15 * 100,

    ## Foreign-born
    pfborn = fbpop15 / totpop15 * 100,

    ## Currently married
    pmarried = pmar15 * 100,

    ## Median age (no new variable necessary)

    ## Race-ethnicity
    race.papi = papi15 * 100,
    race.phsp = phsp15 * 100,
    race.pnhb = pnhb15 * 100,
    race.pnhw = pnhw15 * 100
) %>%
    select(GISJOIN, totpop15, pchildpres, pfborn, pmarried, mdage15,
           starts_with('peduc'), starts_with('race'), quad15, nhw15)

## CONSTRUCT TABLE
## Define list of variables in desired display order
tablevars <- c(grep('^race', names(dcarea), value=TRUE),
               grep('^peduc', names(dcarea), value=TRUE),
               'pfborn', 'pchildpres', 'pmarried')

## Create table of values for multiethnic and all DC Area tracts
quads <- t(sapply(dcarea[dcarea$quad15==TRUE, tablevars], meansd))
tracts <- t(sapply(dcarea[, tablevars], meansd))
tracttbl <- data.frame(quads, tracts)

## List of variable names to be published in the table
tblnames <- c('\\emph{Racial composition}&&&\\\\Percent Asian',
              'Percent Hispanic',
              'Percent non-Hispanic black',
              'Percent non-Hispanic white',
              '\\emph{Educational attainment}&&&\\\\Percent less than high school',
              'Percent high school',
              'Percent some college', 'Percent bachelor\'s degree',
              'Percent professional degree',
              '\\emph{Other demographic characteristics}&&&\\\\Percent foreign-born',
              'Percent of households with children present',
              'Percent married (not separated)'
)

## Create table with formatting
tbl <- data.frame(tblnames, quads, blank=rep(' ', nrow(quads)), tracts)
tbl[,c(2,3,5,6)] <- lapply(tbl[,c(2,3,5,6)], function(x) sprintf('%3.1f', x))
# tbl[,c(3,6)] <- lapply(tbl[,c(3,6)], function(x) paste0('(',x,')'))
colnames(tbl) <- c('Variable', 'Mean', 'S.D.','', 'Mean', 'S.D.')

cap <- paste0('Means and standard deviations of tract-level variables ',
              'in multiethnic and quadrivial neighborhoods in the DC Area')
kable(tbl, 
      caption=cap, row.names=FALSE)

Table 2.1: Means and standard deviations of tract-level variables in multiethnic and quadrivial neighborhoods in the DC Area

| Variable | Multiracial tracts: Mean | Multiracial tracts: S.D. | All tracts: Mean | All tracts: S.D. |
|----------|------|------|------|------|
| Racial composition | | | | |
| Percent Asian | 18.3 | 7.5 | 10.2 | 9.4 |
| Percent Hispanic | 24.3 | 9.8 | 14.6 | 13.6 |
| Percent non-Hispanic black | 22.2 | 9.5 | 30.9 | 31.3 |
| Percent non-Hispanic white | 31.5 | 9.7 | 41.1 | 27.2 |
| Educational attainment | | | | |
| Percent less than high school | 12.8 | 6.5 | 9.9 | 9.3 |
| Percent high school | 18.0 | 5.6 | 17.2 | 11.1 |
| Percent some college | 22.6 | 5.3 | 20.6 | 8.6 |
| Percent bachelor’s degree | 25.4 | 5.8 | 25.5 | 9.8 |
| Percent professional degree | 21.2 | 6.6 | 26.8 | 15.9 |
| Other demographic characteristics | | | | |
| Percent foreign-born | 39.7 | 9.1 | 23.9 | 14.7 |
| Percent of households with children present | 37.5 | 9.9 | 32.7 | 12.2 |
| Percent married (not separated) | 48.4 | 8.0 | 44.9 | 15.8 |

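
The two halves of the table come from applying the same summary to a subset and to the full set of tracts. A minimal sketch on made-up data follows; the data frame `d` and its values are hypothetical, and `meansd` is repeated from the helper functions so the sketch runs on its own.

```r
meansd <- function(var) {
    return(c(mean=mean(var, na.rm=TRUE), sd=sd(var, na.rm=TRUE)))
}

## Hypothetical tracts: two flagged as multiracial, two not
d <- data.frame(quad15=c(TRUE, TRUE, FALSE, FALSE),
                pfborn=c(40, 30, 10, 20),
                pmarried=c(50, 60, NA, 40))
vars <- c('pfborn', 'pmarried')

## One row per variable, with columns 'mean' and 'sd'
quads  <- t(sapply(d[d$quad15, vars], meansd))  # multiracial tracts only
tracts <- t(sapply(d[, vars], meansd))          # all tracts
print(data.frame(quads, tracts))
```

Note that the "all tracts" columns include the multiracial tracts, so the comparison is subset versus whole rather than two disjoint groups.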
Location of multiracial neighborhoods in the DC area

Rank counties by the number of multiracial neighborhoods included in their jurisdictions. Multiracial neighborhoods were those in which Asians, blacks, Latinxs, and whites each made up at least 10% of the neighborhood population and no group represented a majority.

Construct variables measuring multiracial neighborhoods, both those included using the criteria and those excluded for not meeting the criterion that no group be a majority.

racevars <- paste0('race.',c('papi','pnhb','phsp','pnhw'))
dcarea$othmulti <- apply(sapply(dcarea[, racevars],
                                function(x) !is.na(x) & x>=10), 1, all)
dcarea$exclmulti <- dcarea$quad15 != dcarea$othmulti
dcarea$county <- factor(substr(as.character(dcarea$GISJOIN), 2, 7))
## Labels are assigned positionally, so they must follow the sorted order of
## the state+county codes extracted from GISJOIN (shown at right)
levels(dcarea$county) <- c(
    'D.C.'                        # 110001
    , 'Montgomery county'         # 240031
    , 'Prince George\'s county'   # 240033
    , 'Arlington county'          # 510013
    , 'Fairfax county'            # 510059
    , 'Alexandria city'           # 510510
    , 'Fairfax city'              # 510600
    , 'Falls Church city'         # 510610
)

## Breakdown of 'excluded' multiethnic neighborhoods by county
excl_juris <- table(dcarea[, c('county', 'exclmulti')])
N_excluded <- sum(excl_juris[,2])

## Breakdown of multiethnic neighborhoods by county
multi_tbl <- table(dcarea[, c('county', 'quad15')])
kable(sort(multi_tbl[,2], decreasing=TRUE))

|                        | x  |
|------------------------|----|
| Montgomery county      | 53 |
| Fairfax county         | 41 |
| Prince George’s county | 10 |
| Alexandria city        | 4  |
| Arlington county       | 2  |
| D.C.                   | 1  |
| Fairfax city           | 0  |
| Falls Church city      | 0  |
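
The classification rule described above can be sketched on three made-up tracts. The construction of `quad15` is not shown in this section, so the "no majority" test below (no group above 50 percent, mirroring the `>50` filters used for the excluded neighborhoods) is an assumption, and all values are hypothetical.

```r
## Hypothetical tract-level percentages for the four groups
race <- data.frame(race.papi=c(18, 12, 5),
                   race.pnhb=c(22, 13, 40),
                   race.phsp=c(25, 15, 30),
                   race.pnhw=c(32, 58, 25))

## Criterion 1: every group makes up at least 10% of the tract
atleast10 <- apply(sapply(race, function(x) !is.na(x) & x >= 10), 1, all)

## Criterion 2 (assumed threshold): no group exceeds 50%
nomajority <- apply(race, 1, function(r) all(r <= 50, na.rm=TRUE))

multiracial <- atleast10 & nomajority    # meets both criteria
excluded    <- atleast10 & !nomajority   # diverse, but one group is a majority
print(data.frame(race, multiracial, excluded))
```

The second tract illustrates the excluded category: all four groups clear the 10% floor, but one group holds a majority.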

2.3 Diverse neighborhoods excluded for having a majority race

There were 18 neighborhoods that did not meet the criterion that no racial group be a majority.

Excluded for having a white majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhw>50,
       c('county', racevars, 'GISJOIN')]
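
Table 2.2:

| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|--------|-----------|-----------|-----------|-----------|---------|
| D.C. | 12.1 | 17.7 | 10.9 | 56 | G1100010005900 |
| D.C. | 14.1 | 12.3 | 12.3 | 57.7 | G1100010010100 |
| Montgomery county | 21 | 10.8 | 11.3 | 53.2 | G2400310700616 |
| Montgomery county | 13.5 | 11.3 | 10.2 | 61.5 | G2400310701004 |
| Montgomery county | 13.9 | 12.7 | 10.2 | 59.9 | G2400310701313 |
| Montgomery county | 10.3 | 18 | 18.1 | 50.1 | G2400310701314 |
| Montgomery county | 12.7 | 20.5 | 14.8 | 50.3 | G2400310703216 |
| Arlington county | 20.5 | 11.5 | 13.8 | 50.3 | G5100130103200 |
| Arlington county | 14.2 | 16.3 | 10.5 | 57 | G5100130103502 |
| Fairfax county | 13.4 | 13.4 | 15.4 | 55.4 | G5100590421102 |
| Fairfax county | 11.4 | 16.6 | 11.7 | 53.8 | G5100590422301 |
| Fairfax county | 18.5 | 11.4 | 12.4 | 55.4 | G5100590432000 |
| Fairfax county | 11.1 | 14.9 | 13.1 | 50.5 | G5100590432800 |
| Fairfax county | 21.7 | 12.1 | 10.7 | 51.5 | G5100590440202 |
| Alexandria city | 10.1 | 21.8 | 12.4 | 52.4 | G5105100200107 |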

Excluded for having a black majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.pnhb>50,
       c('county', racevars, 'GISJOIN')]
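
Table 2.3:

| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|--------|-----------|-----------|-----------|-----------|---------|
| Montgomery county | 13.4 | 58.3 | 11.6 | 10.6 | G2400310701423 |
| Prince George's county | 12.9 | 50.1 | 20.6 | 10.4 | G2400330801411 |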

Excluded for having a Latinx majority

dcarea[!is.na(dcarea$exclmulti) & dcarea$exclmulti==TRUE & dcarea$race.phsp>50,
       c('county', racevars, 'GISJOIN')]
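
Table 2.4:

| county | race.papi | race.pnhb | race.phsp | race.pnhw | GISJOIN |
|--------|-----------|-----------|-----------|-----------|---------|
| Montgomery county | 12.8 | 23.3 | 50.8 | 10.3 | G2400310700724 |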

  1. Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 13.0 [Database]. Minneapolis: University of Minnesota. 2018. http://doi.org/10.18128/D050.V13.0