This notebook presents a national overview of arrest data from U.S. Immigration and Customs Enforcement (ICE) Enforcement and Removal Operations (ERO), Law Enforcement Systems and Analysis Division (LESA), drawn from ICE’s Integrated Decision Support (IIDS) database. The data cover nationwide arrests from October 1, 2011, through January 29, 2023 (including full U.S. government fiscal years 2012 through 2022), and were obtained by the University of Washington Center for Human Rights (UWCHR) pursuant to FOIA request 2022-ICFO-09023.
For data and code used to generate this notebook, see: https://github.com/UWCHR/ice-enforce
options(scipen = 1000000)
library(pacman)
p_load(here, tidyverse, zoo, lubridate, ggplot2, plotly, gghighlight, viridis)
arr <- read_delim(here('write', 'input', 'ice_arrests_fy12-23ytd.csv.gz'), delim='|',
col_types = cols(aor = col_factor(),
arrest_date = col_date(format="%m/%d/%Y"),
departed_date = col_date(format="%m/%d/%Y"),
apprehension_landmark = col_factor(),
arrest_method = col_factor(),
operation = col_factor(),
processing_disposition = col_factor(),
citizenship_country = col_factor(),
gender = col_factor(),
case_closed_date = col_date(format="%m/%d/%Y"),
id = col_integer(),
hashid = col_character()
))
redacted <- c('removal_threat_level', 'apprehension_threat_level', 'alien_file_number')
redacted_text <- paste0('`', paste(unlist(redacted), collapse = '`, `'), '`')
arr <- arr %>%
dplyr::select(-all_of(redacted))
cy_months <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
fy_months <- c("Oct", "Nov", "Dec", "Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep")
arr <- arr %>%
mutate(aor = factor(aor, levels = sort(levels(arr$aor))),
arrest_date = as_date(arrest_date, format="%m/%d/%Y"),
year = year(arrest_date),
month = factor(month(arrest_date, label=TRUE, abbr=TRUE), levels = fy_months),
year_mth = zoo::as.yearmon(arrest_date),
fy = as.factor(substr(quarter(arrest_date, fiscal_start=10, type="year.quarter"), 1,4)),
gender = toupper(gender),
operation = toupper(operation),
processing_disposition = toupper(processing_disposition),
citizenship_country = factor(toupper(citizenship_country)),
apprehension_landmark = toupper(str_squish(apprehension_landmark)))
An administrative arrest (“arrest”) occurs when an individual is taken into custody by ICE and removal proceedings are initiated against them.[1]
The arrests dataset (`arr`) includes 1741174 observations of 17 variables; 3 fully redacted fields (`removal_threat_level`, `apprehension_threat_level`, `alien_file_number`) are dropped from analysis.
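One way to confirm that a field is fully redacted before dropping it is to check that it holds no non-missing values. The following is a minimal sketch on a toy data frame (`toy` and its values are hypothetical), assuming redacted fields are read in as entirely `NA`, which may not match how the released file encodes redaction:

```r
library(dplyr)

# Hypothetical toy data: two columns are entirely missing,
# standing in for redacted fields (an assumption, not the real file)
toy <- tibble(
  id = 1:3,
  gender = c("MALE", "FEMALE", "MALE"),
  removal_threat_level = c(NA, NA, NA),
  alien_file_number = c(NA_character_, NA_character_, NA_character_)
)

# Names of columns with no non-missing values
all_missing <- names(toy)[colSums(!is.na(toy)) == 0]

# Drop them, in the same way the notebook drops `redacted`
toy_clean <- toy %>%
  select(-all_of(all_missing))

names(toy_clean)
```

Comparing `all_missing` against a hand-written list such as `redacted` would flag any field that is listed as redacted but still carries values.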
The following provides a summary of dataset characteristics via `skimr::skim(arr)`:
skimr::skim(arr)
Name | arr |
Number of rows | 1741174 |
Number of columns | 17 |
_______________________ | |
Column type frequency: | |
character | 7 |
Date | 3 |
factor | 5 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
apprehension_landmark | 55999 | 0.97 | 1 | 80 | 0 | 10536 | 0 |
operation | 1227173 | 0.30 | 1 | 79 | 0 | 512 | 0 |
processing_disposition | 3526 | 1.00 | 5 | 44 | 0 | 50 | 0 |
gender | 0 | 1.00 | 4 | 7 | 0 | 3 | 0 |
hashid | 0 | 1.00 | 40 | 40 | 0 | 1741174 | 0 |
area_of_responsibility | 16414 | 0.99 | 25 | 37 | 0 | 26 | 0 |
year_mth | 0 | 1.00 | 4 | 16 | 0 | 136 | 0 |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
---|---|---|---|---|---|---|
arrest_date | 0 | 1.00 | 2011-10-01 | 2023-01-29 | 2016-05-18 | 4139 |
departed_date | 879760 | 0.49 | 1982-07-15 | 2023-01-27 | 2016-01-08 | 4174 |
case_closed_date | 1297394 | 0.25 | 1989-05-11 | 2023-01-28 | 2018-09-21 | 2712 |
Variable type: factor
skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
---|---|---|---|---|---|
aor | 16414 | 0.99 | FALSE | 25 | DAL: 156735, SNA: 153590, HOU: 140274, ATL: 134318 |
arrest_method | 0 | 1.00 | FALSE | 28 | CAP: 715306, Non: 262757, CAP: 254005, Loc: 175016 |
citizenship_country | 20 | 1.00 | FALSE | 221 | MEX: 990668, GUA: 154021, HON: 143928, EL : 101398 |
month | 0 | 1.00 | TRUE | 12 | Oct: 170676, Nov: 156697, Jan: 151794, Mar: 147540 |
fy | 0 | 1.00 | FALSE | 12 | 201: 265573, 201: 232287, 201: 183703, 201: 158581 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
id | 0 | 1 | 870586.50 | 502633.78 | 0 | 435293.2 | 870586.5 | 1305880 | 1741173 | ▇▇▇▇▇ |
year | 0 | 1 | 2016.15 | 3.45 | 2011 | 2013.0 | 2016.0 | 2019 | 2023 | ▇▅▆▃▃ |
Datasets were released without any data dictionary or field descriptions; where a field's meaning is not self-explanatory, we have attempted to cite relevant sources that provide context.
- `aor`: ICE Area of Responsibility associated with arrest
- `arrest_date`: Date of arrest
- `departed_date`: Date of departure
- `case_closed_date`: Date of closure of case
- `arrest_method`: ICE ERO division or category associated with arrest
- `apprehension_landmark`: Landmark or entity associated with arrest
- `operation`: Operation associated with arrest
- `processing_disposition`: Status of removal proceedings associated with event
- `citizenship_country`: Country of citizenship of arrested individual
- `gender`: Gender of arrested individual
- `apprehension_threat_level`: Fully redacted in original dataset
- `removal_threat_level`: Fully redacted in original dataset
- `alien_file_number`: Unique individual identifier for arrested individual, fully redacted in original dataset
- `id`: Sequential record identifier (not individual identifier)
- `hashid`: Unique record hash (not individual identifier)
- `year`: Calendar year derived from `arrest_date`
- `month`: Abbreviated month derived from `arrest_date`
- `year_mth`: Calendar year and month derived from `arrest_date`
- `fy`: U.S. government fiscal year (Oct.-Sept.) derived from `arrest_date`
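The `fy` field can be cross-checked against a simple rule: a date in October through December belongs to the fiscal year named for the following calendar year. The sketch below illustrates that rule with a hypothetical helper, `fiscal_year()`; it is not the notebook's own derivation, which uses `lubridate::quarter()` with `fiscal_start = 10`.

```r
library(lubridate)

# Hypothetical helper: fiscal year is the calendar year,
# plus one for dates falling in October through December
fiscal_year <- function(d) {
  year(d) + as.integer(month(d) >= 10)
}

# Boundary dates from the period covered by the dataset
dates <- as.Date(c("2011-10-01", "2012-09-30", "2022-10-01", "2023-01-29"))

data.frame(date = dates, fy = fiscal_year(dates))
```

Under this rule the first two dates fall in FY 2012 and the last two in FY 2023, which is why the dataset's final months form a partial fiscal year.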
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy) %>%
summarize(n = n()) %>%
ggplot(aes(x = as.factor(fy), y=n)) +
geom_col() +
labs(title = "Total ICE arrests per FY") +
theme_minimal()
p1
p2 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(year_mth) %>%
summarize(n = n()) %>%
ggplot(aes(x = year_mth, y = n)) +
geom_line(aes(group=1)) +
ylim(0, NA) +
labs(title = "Total nationwide ICE arrests per month") +
theme_minimal()
p2
p3 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy, month) %>%
summarize(n = n()) %>%
ggplot(aes(x = month, y = n, color = fy, group = fy)) +
geom_line() +
ylim(0, NA) +
scale_color_viridis_d() +
labs(title = "Total nationwide ICE arrests per month") +
theme_minimal()
p3
The proportion of females among those arrested has increased since FY 2021:
# arr %>%
# mutate(gender = toupper(gender)) %>%
# group_by(gender) %>%
# summarize(n = n())
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
count(fy, gender) %>%
ggplot(aes(x=fy, y=n, fill=gender)) +
geom_col(position='fill') +
scale_y_continuous(labels = scales::percent) +
labs(title="Total ICE arrests, % by gender") +
theme_minimal()
p1
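The chart above relies on `position='fill'` to scale each bar to 100%. To read the underlying shares as numbers, the same proportions can be computed directly. This is a minimal sketch on hypothetical toy data (`toy` and `gender_share` are illustrative names, not objects from the notebook):

```r
library(dplyr)

# Hypothetical toy records, one row per arrest
toy <- tibble(
  fy = c("2021", "2021", "2021", "2021", "2022", "2022"),
  gender = c("MALE", "MALE", "MALE", "FEMALE", "MALE", "FEMALE")
)

# Count per FY and gender, then divide by the FY total
gender_share <- toy %>%
  count(fy, gender) %>%
  group_by(fy) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

gender_share
```

In this toy example the female share rises from 25% in FY 2021 to 50% in FY 2022; applied to `arr`, the same steps would give the values behind each bar segment.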
Changing composition of arrests by nationality: arrests of citizens of Mexico, Guatemala, and El Salvador decrease, while arrests of citizens of Venezuela, Colombia, and Nicaragua increase.
cit <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
mutate(citizenship_country = toupper(citizenship_country)) %>%
group_by(citizenship_country) %>%
summarize(n = n()) %>%
arrange(desc(n))
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
mutate(citizenship_country =
case_when(citizenship_country %in%
head(cit$citizenship_country, 15) ~
# Convert the factor to character so both branches share one type
as.character(citizenship_country),
TRUE ~
"ALL OTHERS"
)) %>%
count(fy, citizenship_country) %>%
ggplot(aes(x=fy, y=n, fill=citizenship_country, color=citizenship_country)) +
geom_col() +
labs(title = "Total ICE arrests by country of citizenship (top 15)") +
theme_minimal()
ggplotly(p1)
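The "top 15 plus all others" grouping used for this chart can be illustrated on a small scale. The sketch below keeps the top 2 values of a hypothetical toy column and lumps the rest into "ALL OTHERS"; the data and the names `toy`, `top_values`, and `lumped` are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy records, one row per arrest
toy <- tibble(
  citizenship_country = c("MEXICO", "MEXICO", "MEXICO",
                          "GUATEMALA", "GUATEMALA",
                          "HONDURAS", "CUBA")
)

# The two most frequent values
top_values <- toy %>%
  count(citizenship_country, sort = TRUE) %>%
  head(2) %>%
  pull(citizenship_country)

# Keep top values as-is and relabel everything else
lumped <- toy %>%
  mutate(citizenship_country = if_else(citizenship_country %in% top_values,
                                       citizenship_country,
                                       "ALL OTHERS")) %>%
  count(citizenship_country, sort = TRUE)

lumped
```

Because the top values are chosen from totals across the whole period, a country that only recently became prominent may still fall into "ALL OTHERS".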
cit_rank <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
count(fy, citizenship_country) %>%
arrange(fy, desc(n), citizenship_country) %>%
group_by(fy) %>%
mutate(ranking = row_number())
p1 <- cit_rank %>%
filter(ranking <= 10) %>%
ggplot(aes(x = fy, y = ranking, group = citizenship_country)) +
geom_line(aes(color = citizenship_country), linewidth = 1) +
geom_point(aes(color = citizenship_country), size = 2) +
scale_y_reverse(breaks = seq(1,10)) +
labs(title = "Ranked country of citizenship for ICE arrests") +
theme_minimal()
ggplotly(p1)
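The per-FY ranking behind this chart can be traced on a small example. The sketch below applies the same sort-then-`row_number()` steps to hypothetical counts (`toy` and `toy_rank` are illustrative names and the values are made up):

```r
library(dplyr)

# Hypothetical counts per FY and country
toy <- tibble(
  fy = c("2021", "2021", "2021", "2022", "2022", "2022"),
  citizenship_country = c("MEXICO", "GUATEMALA", "VENEZUELA",
                          "MEXICO", "GUATEMALA", "VENEZUELA"),
  n = c(50, 30, 10, 40, 15, 25)
)

# Sort within each FY by descending count, then number the rows
toy_rank <- toy %>%
  arrange(fy, desc(n), citizenship_country) %>%
  group_by(fy) %>%
  mutate(ranking = row_number()) %>%
  ungroup()

toy_rank
```

Note that `row_number()` never assigns the same rank twice: countries with equal counts are ordered alphabetically by the `arrange()` call, whereas `min_rank()` would give them a shared rank.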
Below is an interactive chart of total ICE arrests per FY by AOR:
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy, aor) %>%
summarize(n = n()) %>%
ggplot(aes(x = as.factor(fy), y=n, color=aor, group=aor)) +
geom_line() +
labs(title = "Total ICE arrests per FY by AOR") +
theme_minimal()
ggplotly(p1)
Percent change in arrests per FY nationally and by AOR.
natl_pct_chg <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy) %>%
summarize(n = n()) %>%
mutate(pct_change = (n/lag(n) - 1))
p1 <- natl_pct_chg %>%
ggplot(aes(x = fy, y = pct_change)) +
geom_col() +
scale_y_continuous(labels = scales::percent) +
labs(title="FY % change in total ICE arrests") +
theme_minimal()
p1
aor_pct_chg <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30",
!is.na(aor)) %>%
group_by(fy, aor) %>%
summarize(n = n()) %>%
group_by(aor) %>%
arrange(fy, .by_group=TRUE) %>%
mutate(pct_change = (n/lag(n) - 1))
p2 <- aor_pct_chg %>%
ggplot(aes(x = fy, y = pct_change)) +
geom_col() +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(breaks=seq(2012, 2022, 4)) +
facet_wrap(~aor) +
labs(title="FY % change in total ICE arrests per AOR") +
# Apply the complete theme first so the axis text rotation below is not overridden
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))
p2
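The percent-change calculation `n/lag(n) - 1` compares each FY with the previous row in the same group. A minimal sketch on hypothetical counts shows the resulting values (the AOR codes and numbers in `toy` are made up for illustration):

```r
library(dplyr)

# Hypothetical counts per AOR and FY
toy <- tibble(
  aor = c("SEA", "SEA", "SEA", "ATL", "ATL", "ATL"),
  fy = c("2012", "2013", "2014", "2012", "2013", "2014"),
  n = c(100, 150, 120, 200, 100, 100)
)

# Within each AOR, order by FY and compare to the previous FY
toy_pct <- toy %>%
  group_by(aor) %>%
  arrange(fy, .by_group = TRUE) %>%
  mutate(pct_change = n / lag(n) - 1) %>%
  ungroup()

toy_pct
```

The first FY in each group has no predecessor, so its `pct_change` is `NA`. If a group has no records in some FY, that row is absent from the counts and `lag()` compares against the last FY that is present, so gaps are worth checking before reading the chart.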
arrest_method
methods <- arr %>%
count(arrest_method) %>%
arrange(desc(n))
top_methods <- methods %>%
filter(n > 10000)
arr <- arr %>%
mutate(arrest_method_short =
case_when(arrest_method %in%
unlist(top_methods$arrest_method) ~
as.character(arrest_method),
TRUE ~
"All others"))
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy, arrest_method_short) %>%
ggplot(aes(x = fy, fill=arrest_method_short)) +
geom_bar(stat='count', position='stack') +
theme_minimal()
ggplotly(p1)
method_pct_chg <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30",
!is.na(aor)) %>%
group_by(fy, arrest_method_short) %>%
summarize(n = n()) %>%
group_by(arrest_method_short) %>%
arrange(fy, .by_group=TRUE) %>%
mutate(pct_change = (n/lag(n) - 1))
p1 <- method_pct_chg %>%
ggplot(aes(x = fy, y = pct_change)) +
geom_col() +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(breaks=seq(2012, 2022, 4)) +
facet_wrap(~arrest_method_short, scales='free_y', labeller = label_wrap_gen(width=20)) +
labs(title="FY % change in total ICE arrests by arrest method") +
theme_minimal()
p1
p2 <- arr %>%
mutate(fy = substr(as.character(fy), 3, 4)) %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30",
!is.na(aor),
aor != "HQ") %>%
group_by(aor, fy, arrest_method_short) %>%
ggplot(aes(x = fy, fill=arrest_method_short)) +
geom_bar(stat='count') +
scale_x_discrete(breaks=seq(12, 22, 4)) +
facet_wrap(~aor) +
labs(title="Total ICE arrests by arrest method per AOR") +
theme_minimal()
ggplotly(p2)
# p3 <- arr %>%
# filter(arrest_date >= "2011-10-01",
# arrest_date <= "2022-09-30",
# !is.na(aor),
# aor != "HQ",
# arrest_method_short == "ERO Reprocessed Arrest") %>%
# group_by(aor, fy, arrest_method_short) %>%
# ggplot(aes(x = fy, fill=arrest_method_short)) +
# geom_bar(stat='count') +
# scale_x_discrete(breaks=seq(2012, 2022, 2)) +
# facet_wrap(~aor) +
# labs(title="Total ICE arrests by arrest method per AOR") +
# theme(axis.text.x = element_text(angle = 90, vjust = 0, hjust=1))
#
# p3
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
count(fy, gender, arrest_method_short) %>%
ggplot(aes(x=fy, y=n, fill=gender)) +
geom_col(position='fill') +
facet_wrap(~arrest_method_short, labeller = label_wrap_gen(width=20)) +
scale_x_discrete(breaks=seq(2012, 2022, 4)) +
scale_y_continuous(labels = scales::percent) +
labs(title="Total ICE arrests, % by gender") +
theme_minimal()
p1
disps <- arr %>%
count(processing_disposition) %>%
arrange(desc(n))
top_disps <- disps %>%
filter(n > 10000)
arr <- arr %>%
mutate(disp_short =
case_when(processing_disposition %in%
unlist(top_disps$processing_disposition) ~
as.character(processing_disposition),
TRUE ~
"ALL OTHERS"))
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
group_by(fy, disp_short) %>%
summarize(n = n()) %>%
ggplot(aes(x = fy, y=n, fill=disp_short)) +
geom_col() +
labs(title = "Total ICE arrests per FY by processing disposition") +
theme_minimal()
ggplotly(p1)
An overview of the most common `apprehension_landmark` values gives a sense of the diversity of this category, which includes 10537 unique values; note the inclusion of general values that likely denote the ICE sub-office or division responsible for the arrest rather than a precise location. Closer inspection is recommended at the AOR level.
landmarks_per_aor <- arr %>%
group_by(aor) %>%
summarize(n = n_distinct(apprehension_landmark))
landmarks <- arr %>%
count(apprehension_landmark) %>%
arrange(desc(n))
p1 <- arr %>%
filter(arrest_date >= "2011-10-01",
arrest_date <= "2022-09-30") %>%
mutate(apprehension_landmark =
case_when(apprehension_landmark %in%
head(landmarks$apprehension_landmark, 15) ~
as.character(apprehension_landmark),
TRUE ~
"ALL OTHERS")) %>%
group_by(fy, apprehension_landmark) %>%
summarize(n = n()) %>%
ggplot(aes(x = fy, y=n, fill=apprehension_landmark)) +
geom_col() +
labs(title = "Total ICE arrests per FY by `apprehension_landmark` (top 15)") +
theme_minimal()
ggplotly(p1)
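For the AOR-level inspection recommended above, one possible starting point is to list the most frequent landmark within each AOR. This is a sketch on hypothetical toy data; the AOR codes and landmark labels are placeholders, not values from the dataset.

```r
library(dplyr)

# Hypothetical toy records, one row per arrest
toy <- tibble(
  aor = c("SEA", "SEA", "SEA", "ATL", "ATL", "ATL", "ATL"),
  apprehension_landmark = c("LANDMARK A", "LANDMARK A", "LANDMARK B",
                            "LANDMARK C", "LANDMARK D", "LANDMARK D",
                            "LANDMARK D")
)

# Count landmarks within each AOR and keep the most frequent one
top_by_aor <- toy %>%
  count(aor, apprehension_landmark) %>%
  group_by(aor) %>%
  slice_max(order_by = n, n = 1, with_ties = FALSE) %>%
  ungroup()

top_by_aor
```

Raising the `n = 1` argument of `slice_max()` returns more landmarks per AOR, which may help separate office-level labels from specific locations.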
[1] For discussion of ICE’s definition of “arrests”, see American Immigration Council, “Changing Patterns of Interior Immigration Enforcement in the United States, 2016 - 2018”, July 2019: https://www.americanimmigrationcouncil.org/research/interior-immigration-enforcement-united-states-2016-2018