Data overview: Arrests

This notebook presents a national overview of U.S. Immigration and Customs Enforcement (ICE) Enforcement and Removal Operations (ERO) Law Enforcement Systems and Analysis Division (LESA) data on nationwide arrests, drawn from ICE’s Integrated Decision Support (IIDS) database. The data cover October 1, 2011, through January 29, 2023 (full U.S. Government Fiscal Years 2012 through 2022, plus part of Fiscal Year 2023), and were obtained by the University of Washington Center for Human Rights (UWCHR) pursuant to FOIA request 2022-ICFO-09023.

For data and code used to generate this notebook, see: https://github.com/UWCHR/ice-enforce

options(scipen = 1000000)

library(pacman)

p_load(here, tidyverse, zoo, lubridate, ggplot2, plotly, gghighlight, viridis)

arr <- read_delim(here('write', 'input', 'ice_arrests_fy12-23ytd.csv.gz'), delim='|',
                  col_types = cols(aor = col_factor(),
                                  arrest_date = col_date(format="%m/%d/%Y"),
                                  departed_date = col_date(format="%m/%d/%Y"),
                                  apprehension_landmark = col_factor(),
                                  arrest_method = col_factor(),
                                  operation = col_factor(),
                                  processing_disposition = col_factor(),
                                  citizenship_country = col_factor(),
                                  gender = col_factor(),
                                  case_closed_date = col_date(format="%m/%d/%Y"),
                                  id = col_integer(),
                                  hashid = col_character()
                                  )) 

redacted <- c('removal_threat_level', 'apprehension_threat_level', 'alien_file_number')
redacted_text <- paste0('`', paste(unlist(redacted), collapse = '`, `'), '`')

arr <- arr %>% 
  dplyr::select(-all_of(redacted))

cy_months <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
fy_months <- c("Oct", "Nov", "Dec", "Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep")

arr <- arr %>% 
  mutate(aor = factor(aor, levels = sort(levels(arr$aor))),
         arrest_date = as_date(arrest_date, format="%m/%d/%Y"),
         year = year(arrest_date),
         month = factor(month(arrest_date, label=TRUE, abbr=TRUE), levels = fy_months),
         year_mth = zoo::as.yearmon(arrest_date),
         fy = as.factor(substr(quarter(arrest_date, fiscal_start=10, type="year.quarter"), 1,4)),
         gender = toupper(gender),
         operation = toupper(operation),
         processing_disposition = toupper(processing_disposition),
         citizenship_country = factor(toupper(citizenship_country)),
         apprehension_landmark = toupper(str_squish(apprehension_landmark)))

An administrative arrest (“arrest”) occurs when an individual is taken into custody by ICE and removal proceedings are initiated against them.¹

The arrests dataset (arr) includes 1741174 observations of 17 variables; 3 fully redacted fields (removal_threat_level, apprehension_threat_level, alien_file_number) are dropped from analysis.

The following provides a summary of dataset characteristics via skimr::skim(arr):

skimr::skim(arr)

Data summary
Name	arr
Number of rows	1741174
Number of columns	17
_______________________
Column type frequency:
character	7
Date	3
factor	5
numeric	2
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	n_unique
apprehension_landmark	55999	0.97	1	80	10536
operation	1227173	0.30	1	79	512
processing_disposition	3526	1.00	5	44	50
gender	0	1.00	4	7	3
hashid	0	1.00	40	40	1741174
area_of_responsibility	16414	0.99	25	37	26
year_mth	0	1.00	4	16	136

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
arrest_date	0	1.00	2011-10-01	2023-01-29	2016-05-18	4139
departed_date	879760	0.49	1982-07-15	2023-01-27	2016-01-08	4174
case_closed_date	1297394	0.25	1989-05-11	2023-01-28	2018-09-21	2712

Variable type: factor

skim_variable	n_missing	complete_rate	ordered	n_unique	top_counts
aor	16414	0.99	FALSE	25	DAL: 156735, SNA: 153590, HOU: 140274, ATL: 134318
arrest_method	0	1.00	FALSE	28	CAP: 715306, Non: 262757, CAP: 254005, Loc: 175016
citizenship_country	20	1.00	FALSE	221	MEX: 990668, GUA: 154021, HON: 143928, EL : 101398
month	0	1.00	TRUE	12	Oct: 170676, Nov: 156697, Jan: 151794, Mar: 147540
fy	0	1.00	FALSE	12	201: 265573, 201: 232287, 201: 183703, 201: 158581

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
id	0	1	870586.50	502633.78	0	435293.2	870586.5	1305880	1741173	▇▇▇▇▇
year	0	1	2016.15	3.45	2011	2013.0	2016.0	2019	2023	▇▅▆▃▃

Field definitions

Datasets were released without a data dictionary or field descriptions; where a field is not self-explanatory, we have attempted to cite relevant sources for context.

Original dataset fields

aor: ICE Area of Responsibility associated with arrest
arrest_date: Date of arrest
departed_date: Date of departure
case_closed_date: Date of closure of case
arrest_method: ICE ERO division or category associated with arrest
apprehension_landmark: Landmark or entity associated with arrest
operation: Operation associated with arrest
processing_disposition: Status of removal proceedings associated with event
citizenship_country: Country of citizenship of arrested individual
gender: Gender of arrested individual
apprehension_threat_level: Fully redacted in original dataset
removal_threat_level: Fully redacted in original dataset
alien_file_number: Unique individual identifier for arrested individual, fully redacted in original dataset

Additional fields created by UWCHR

id: Sequential record identifier (not individual identifier)
hashid: Unique record hash (not individual identifier)
year: Calendar year derived from arrest_date
month: Abbreviated month derived from arrest_date
year_mth: Calendar year and month derived from arrest_date
fy: U.S. government fiscal year (Oct.-Sept.) derived from arrest_date
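
The fiscal-year rule behind the fy field can be sketched in base R. This is a minimal illustration rather than the notebook's own code: fiscal_year() is a hypothetical helper, and it assumes the convention described above, in which a fiscal year runs October through September and is labeled by the calendar year in which it ends (so October 1, 2011, falls in FY 2012).

```r
# Hypothetical helper (not part of the notebook): U.S. government fiscal year
# for a Date, labeled by the calendar year in which the fiscal year ends.
fiscal_year <- function(d) {
  yr <- as.integer(format(d, "%Y"))
  mo <- as.integer(format(d, "%m"))
  # October through December belong to the following fiscal year
  ifelse(mo >= 10L, yr + 1L, yr)
}

# Example dates chosen to sit on either side of a fiscal-year boundary
dates <- as.Date(c("2011-10-01", "2012-09-30", "2012-10-01", "2023-01-29"))
fy_example <- fiscal_year(dates)
print(data.frame(date = dates, fy = fy_example))
```

Under this rule, the last date in the dataset (January 29, 2023) belongs to FY 2023, which is why plots restricted to full fiscal years filter on arrest_date through September 30, 2022.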

Total arrests

p1 <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(fy) %>% 
  summarize(n = n()) %>% 
  ggplot(aes(x = as.factor(fy), y=n)) +
  geom_col() +
  labs(title = "Total ICE arrests per FY") +
  theme_minimal()

p1

p2 <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(year_mth) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = year_mth, y = n)) +
  geom_line(aes(group=1)) +
  ylim(0, NA) +
  labs(title = "Total nationwide ICE arrests per month") +
  theme_minimal()

p2


p3 <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(fy, month) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = month, y = n, color = fy, group = fy)) +
  geom_line() +
  ylim(0, NA) +
  scale_color_viridis_d() +
  labs(title = "Total nationwide ICE arrests per month") +
  theme_minimal()

p3

Basic demographics

Gender

The proportion of females among those arrested increases from FY 2021 onward:

# arr %>%
#   mutate(gender = toupper(gender)) %>% 
#   group_by(gender) %>% 
#   summarize(n = n())

p1 <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  count(fy, gender) %>% 
  ggplot(aes(x=fy, y=n, fill=gender)) +
  geom_col(position='fill') +
  scale_y_continuous(labels = scales::percent) +
  labs(title="Total ICE arrests, % by gender") +
  theme_minimal()

p1

Country of citizenship

The composition of arrests by nationality changes over time: arrests of citizens of Mexico, Guatemala, and El Salvador decrease, while arrests of citizens of Venezuela, Colombia, and Nicaragua increase.

cit <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  mutate(citizenship_country = toupper(citizenship_country)) %>% 
  group_by(citizenship_country) %>% 
  summarize(n = n()) %>% 
  arrange(desc(n))

p1 <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
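  mutate(citizenship_country =
           # Convert the factor to character so both case_when() branches
           # return the same type
           case_when(citizenship_country %in%
                       head(cit$citizenship_country, 15) ~
                       as.character(citizenship_country),
                     TRUE ~
                       "ALL OTHERS"
  )) %>% 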
  count(fy, citizenship_country) %>% 
  ggplot(aes(x=fy, y=n, fill=citizenship_country, color=citizenship_country)) +
  geom_col() +
  labs(title = "Total ICE arrests by country of citizenship (top 15)") +
  theme_minimal()

ggplotly(p1)

cit_rank <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  count(fy, citizenship_country) %>% 
  arrange(fy, desc(n), citizenship_country) %>% 
  group_by(fy) %>% 
  mutate(ranking = row_number())

p1 <- cit_rank %>%
  filter(ranking <= 10) %>% 
  ggplot(aes(x = fy, y = ranking, group = citizenship_country)) +
  # `linewidth` replaces `size` for lines as of ggplot2 3.4.0
  geom_line(aes(color = citizenship_country), linewidth = 1) +
  geom_point(aes(color = citizenship_country), size = 2) +
  scale_y_reverse(breaks = seq(1,10)) +
  labs(title = "Ranked country of citizenship for ICE arrests") +
  theme_minimal()

ggplotly(p1)

Total arrests by AOR

Below is an interactive chart of total ICE arrests per FY by AOR:

p1 <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>%   
  group_by(fy, aor) %>% 
  summarize(n = n()) %>% 
  ggplot(aes(x = as.factor(fy), y=n, color=aor, group=aor)) +
  geom_line() +
  labs(title = "Total ICE arrests per FY by AOR") +
  theme_minimal()

ggplotly(p1)

Percent change

Percent change in arrests per FY nationally and by AOR.
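
As a worked example of the calculation, percent change here is the current period's count divided by the previous period's count, minus one. The sketch below uses base R and toy counts invented for illustration (they are not taken from the dataset); shifting the series by one position plays the role of dplyr's lag().

```r
# Toy counts, invented for illustration only (not taken from the dataset)
n <- c(FY1 = 200, FY2 = 150, FY3 = 180)

# Previous period's count: shift the series by one position, padding with NA
prev <- c(NA, head(n, -1))

# Percent change relative to the previous period
pct_change <- n / prev - 1
print(pct_change)
```

The first period has no predecessor, so its percent change is NA. The same applies to the first fiscal year within each group when the calculation is done per AOR or per arrest method.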

natl_pct_chg <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(fy) %>%
  summarize(n = n()) %>% 
  mutate(pct_change = (n/lag(n) - 1))

p1 <- natl_pct_chg %>% 
  ggplot(aes(x = fy, y = pct_change)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title="FY % change in total ICE arrests") +
  theme_minimal()

p1

aor_pct_chg <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30",
         !is.na(aor)) %>% 
  group_by(fy, aor) %>%
  summarize(n = n()) %>% 
  group_by(aor) %>% 
  arrange(fy, .by_group=TRUE) %>% 
  mutate(pct_change = (n/lag(n) - 1))

p2 <- aor_pct_chg %>% 
  ggplot(aes(x = fy, y = pct_change)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks=seq(2012, 2022, 4)) +
  facet_wrap(~aor)  +
  labs(title="FY % change in total ICE arrests per AOR") +
  # Apply the complete theme first; otherwise it resets the axis text rotation
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))

p2

Arrests by `arrest_method`

methods <- arr %>% 
  count(arrest_method) %>% 
  arrange(desc(n))

top_methods <- methods %>% 
  filter(n > 10000)

arr <- arr %>% 
  mutate(arrest_method_short =
           case_when(arrest_method %in%
                       unlist(top_methods$arrest_method) ~
                       as.character(arrest_method), 
                     TRUE ~ 
                       "All others"))

p1 <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(fy, arrest_method_short) %>%
  ggplot(aes(x = fy, fill=arrest_method_short)) +
  geom_bar(stat='count', position='stack') +
  theme_minimal()

ggplotly(p1)

method_pct_chg <- arr %>%
  # Note: records with a missing `aor` are excluded here, as in `aor_pct_chg`
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30",
         !is.na(aor)) %>% 
  group_by(fy, arrest_method_short) %>%
  summarize(n = n()) %>% 
  group_by(arrest_method_short) %>% 
  arrange(fy, .by_group=TRUE) %>% 
  mutate(pct_change = (n/lag(n) - 1))

p1 <- method_pct_chg %>% 
  ggplot(aes(x = fy, y = pct_change)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks=seq(2012, 2022, 4)) +
  facet_wrap(~arrest_method_short, scales='free_y', labeller = label_wrap_gen(width=20))  +
  labs(title="FY % change in total ICE arrests by arrest method") +
  theme_minimal()

p1

p2 <- arr %>% 
  mutate(fy = substr(as.character(fy), 3, 4)) %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30",
         !is.na(aor),
         aor != "HQ") %>% 
  group_by(aor, fy, arrest_method_short) %>%
  ggplot(aes(x = fy, fill=arrest_method_short)) +
  geom_bar(stat='count') +
  scale_x_discrete(breaks=seq(12, 22, 4)) +
  facet_wrap(~aor) +
  labs(title="Total ICE arrests by arrest method per AOR") +
  theme_minimal()

ggplotly(p2)

# p3 <- arr %>%
#     filter(arrest_date >= "2011-10-01",
#          arrest_date <= "2022-09-30",
#          !is.na(aor),
#          aor != "HQ",
#          arrest_method_short == "ERO Reprocessed Arrest") %>% 
#   group_by(aor, fy, arrest_method_short) %>%
#   ggplot(aes(x = fy, fill=arrest_method_short)) +
#   geom_bar(stat='count') +
#   scale_x_discrete(breaks=seq(2012, 2022, 2)) +
#   facet_wrap(~aor) +
#   labs(title="Total ICE arrests by arrest method per AOR") +
#   theme(axis.text.x = element_text(angle = 90, vjust = 0, hjust=1))
# 
# p3

p1 <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  count(fy, gender, arrest_method_short) %>% 
  ggplot(aes(x=fy, y=n, fill=gender)) +
  geom_col(position='fill') +
  facet_wrap(~arrest_method_short, labeller = label_wrap_gen(width=20)) +
  scale_x_discrete(breaks=seq(2012, 2022, 4)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title="Total ICE arrests by arrest method, % by gender") +
  theme_minimal()

p1

Processing disposition

disps <- arr %>% 
  count(processing_disposition) %>% 
  arrange(desc(n))

top_disps <- disps %>% 
  filter(n > 10000)

arr <- arr %>% 
  mutate(disp_short = 
           case_when(processing_disposition %in%
                       unlist(top_disps$processing_disposition) ~
                       as.character(processing_disposition), 
                     TRUE ~
                       "ALL OTHERS"))

p1 <- arr %>%
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  group_by(fy, disp_short) %>% 
  summarize(n = n()) %>% 
  ggplot(aes(x = fy, y=n, fill=disp_short)) +
  geom_col() +
  labs(title = "Total ICE arrests per FY by processing disposition") +
  theme_minimal()

ggplotly(p1)

Apprehension Landmark

An overview of the most common apprehension_landmark values gives a sense of the diversity of this category, which includes 10537 unique values. Note the inclusion of general values that likely denote the ICE sub-office or division responsible for the arrest rather than a precise location. Closer inspection is recommended at the AOR level.

For more on apprehension_landmark values, see the Landmarks notebook.
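
With this many distinct values, plots in this notebook keep only the most frequent categories and group the remainder under a single label. A minimal base R sketch of that idea follows, using toy values invented for illustration; the cutoff of 2 and the choice to keep missing values separate are assumptions of the sketch, not the notebook's settings.

```r
# Toy values, invented for illustration only (not taken from the dataset)
x <- c("A", "B", "A", "C", "A", "B", "D", NA)

# Rank values by frequency; table() drops missing values by default
counts <- sort(table(x), decreasing = TRUE)
top <- names(counts)[seq_len(min(2, length(counts)))]

# Keep the top values, lump everything else, and leave NA as NA
lumped <- ifelse(is.na(x), NA_character_,
                 ifelse(x %in% top, x, "ALL OTHERS"))
print(table(lumped, useNA = "ifany"))
```

For factor columns, forcats::fct_lump_n() offers a similar operation, with the other_level argument setting the label for lumped values.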

landmarks_per_aor <- arr %>%
  group_by(aor) %>%
  summarize(n = n_distinct(apprehension_landmark))

landmarks <- arr %>% 
  count(apprehension_landmark) %>% 
  arrange(desc(n))

p1 <- arr %>% 
  filter(arrest_date >= "2011-10-01",
         arrest_date <= "2022-09-30") %>% 
  mutate(apprehension_landmark =
           case_when(apprehension_landmark %in%
                     head(landmarks$apprehension_landmark, 15) ~
                     as.character(apprehension_landmark), 
                   TRUE ~
                     "ALL OTHERS")) %>% 
  group_by(fy, apprehension_landmark) %>% 
  summarize(n = n()) %>% 
  ggplot(aes(x = fy, y=n, fill=apprehension_landmark)) +
  geom_col() +
  labs(title = "Total ICE arrests per FY by `apprehension_landmark` (top 15)") +
  theme_minimal()

ggplotly(p1)

¹ For discussion of ICE’s definition of “arrests”, see American Immigration Council, “Changing Patterns of Interior Immigration Enforcement in the United States, 2016 - 2018”, July 2019: https://www.americanimmigrationcouncil.org/research/interior-immigration-enforcement-united-states-2016-2018

ICE ERO-LESA nationwide arrests data, FY12-22

UWCHR

2023-06-01
