This notebook presents a national overview of removals by U.S. Immigration and Customs Enforcement (ICE), using Enforcement and Removal Operations (ERO) Law Enforcement Systems and Analysis Division (LESA) data from ICE’s Integrated Decision Support (IIDS) database. The data cover the period from October 1, 2011, through January 29, 2023 (including full U.S. Government Fiscal Years 2012 through 2022), and were obtained by the University of Washington Center for Human Rights pursuant to FOIA request 2022-ICFO-09023.
For data and code used to generate this notebook, see:
options(scipen = 1000000)
pacman::p_load(here, tidyverse, zoo, lubridate, ggplot2, plotly, gghighlight)
pd_dict <- read_delim(here('share', 'hand', 'processing_disp.csv'), delim='|')
rem <- read_delim(here('write', 'input', 'ice_removals_fy12-23ytd.csv.gz'), delim='|',
                  col_types = cols(aor = col_factor(),
                                   arrest_date = col_date(format="%m/%d/%Y"),
                                   departed_date = col_date(format="%m/%d/%Y"),
                                   case_close_date = col_date(format="%m/%d/%Y"),
                                   removal_date = col_date(format="%m/%d/%Y"),
                                   apprehension_method_code = col_character(),
                                   processing_disposition_code = col_factor(),
                                   citizenship_country = col_factor(),
                                   gender = col_factor(),
                                   final_charge_section = col_factor(),
                                   id = col_integer(),
                                   hashid = col_character()))
redacted <- c('removal_threat_level', 'alien_file_number')
redacted_text <- paste0('`', paste(unlist(redacted), collapse = '`, `'), '`')
# Drop fully redacted fields; all_of() selects columns named in an external vector
rem <- rem %>%
  dplyr::select(-all_of(redacted), -case_closed_date)
cy_months <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
fy_months <- c("Oct", "Nov", "Dec", "Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep")
rem <- rem %>%
  mutate(aor = factor(aor, levels = sort(levels(rem$aor))),
         year = year(departed_date),
         month = factor(month(departed_date, label=TRUE, abbr=TRUE), levels = fy_months),
         year_mth = zoo::as.yearmon(departed_date),
         processing_disp = toupper(coalesce(processing_disposition_code, processing_disposition)),
         fy = substr(quarter(departed_date, fiscal_start=10, type="year.quarter"), 1, 4),
         gender = toupper(gender),
         processing_disposition = toupper(processing_disposition),
         citizenship_country = factor(toupper(citizenship_country)))
rem <- left_join(rem, pd_dict, by=c('processing_disp' = 'processing_disposition_raw'))
A removal occurs when an individual is issued a final order of removal and departs the United States via deportation or voluntary return.1
The removals dataset (`rem`) includes 2665505 observations of 20 variables; 2 fully redacted fields (`removal_threat_level`, `alien_file_number`) are dropped from analysis.
The following provides a summary of dataset characteristics via `skimr`:
Name | rem |
Number of rows | 2665505 |
Number of columns | 20 |
_______________________ | |
Column type frequency: | |
character | 9 |
Date | 4 |
factor | 5 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
apprehension_method_code | 2503421 | 0.06 | 1 | 5 | 0 | 29 | 0 |
gender | 0 | 1.00 | 4 | 7 | 0 | 3 | 0 |
processing_disposition | 2297154 | 0.14 | 5 | 44 | 0 | 35 | 0 |
hashid | 0 | 1.00 | 40 | 40 | 0 | 2665505 | 0 |
area_of_responsibility | 0 | 1.00 | 25 | 37 | 0 | 26 | 0 |
year_mth | 0 | 1.00 | 4 | 16 | 0 | 148 | 0 |
processing_disp | 12290 | 1.00 | 1 | 44 | 0 | 89 | 0 |
fy | 0 | 1.00 | 4 | 4 | 0 | 13 | 0 |
processing_disposition_clean | 170886 | 0.94 | 5 | 44 | 0 | 53 | 0 |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
departed_date | 0 | 1.00 | 2010-10-01 | 2023-01-27 | 2015-09-28 | 4500 |
case_close_date | 29356 | 0.99 | 2011-10-01 | 2022-10-04 | 2015-09-15 | 4018 |
arrest_date | 439348 | 0.84 | 1968-03-09 | 2023-01-27 | 2015-12-30 | 11667 |
removal_date | 778493 | 0.71 | 2013-10-01 | 2023-01-27 | 2017-05-18 | 3406 |
Variable type: factor
skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
aor | 0 | 1.00 | FALSE | 25 | SNA: 707834, ELP: 313355, PHO: 265235, SND: 238497 |
processing_disposition_code | 380641 | 0.86 | FALSE | 54 | REI: 918042, ER: 451871, WA/: 327493, T: 172526 |
citizenship_country | 7 | 1.00 | FALSE | 210 | MEX: 1581516, GUA: 395299, HON: 280187, EL : 188837 |
final_charge_section | 11956 | 1.00 | FALSE | 141 | 212: 703614, 212: 677353, 212: 571377, 212: 303340 |
month | 0 | 1.00 | TRUE | 12 | Oct: 248369, May: 238244, Mar: 237583, Nov: 230357 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
id | 0 | 1 | 1332752.00 | 769465.16 | 0 | 666376 | 1332752 | 1999128 | 2665504 | ▇▇▇▇▇ |
year | 0 | 1 | 2015.57 | 2.96 | 2010 | 2013 | 2015 | 2018 | 2023 | ▅▇▅▆▁ |
Datasets were released without any data dictionary or field descriptions; where a field is not self-explanatory, we have attempted to cite relevant sources that provide context.
`aor`: ICE Area of Responsibility associated with removal
`arrest_date`: Date of arrest
`departed_date`: Date of departure
`removal_date`: Date of order of removal
`case_closed_date`: Date of closure of case
`apprehension_method_code`: Abbreviated code for apprehension method associated with removal
`processing_disposition_code`: Abbreviated code for processing disposition associated with removal
`final_charge_section`: Federal code under which individual ordered removed
`citizenship_country`: Country of citizenship of removed individual
`gender`: Gender of removed individual
`apprehension_threat_level`: Fully redacted in original dataset
`removal_threat_level`: Fully redacted in original dataset
`alien_file_number`: Unique individual identifier for arrested individual, fully redacted in original dataset
`id`: Sequential record identifier (not individual identifier)
`hashid`: Unique record hash (not individual identifier)
`processing_disposition_clean`: Inferred full text value of `processing_disposition_code`
`year`: Calendar year derived from `departed_date`
`month`: Abbreviated month derived from `departed_date`
`year_mth`: Calendar year and month derived from `departed_date`
`fy`: U.S. government fiscal year (Oct.-Sept.) derived from `departed_date`
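The fiscal year field follows the U.S. government convention in which October through December count toward the following fiscal year. The notebook derives it with `lubridate::quarter()` and `fiscal_start = 10`; the base R sketch below is intended to give the same year, using a hypothetical helper named `fiscal_year()` that is not part of the notebook.

```r
# Derive the U.S. government fiscal year (Oct.-Sept.) from a Date vector.
# `fiscal_year()` is a hypothetical helper introduced for illustration.
fiscal_year <- function(d) {
  d <- as.Date(d)
  yr <- as.integer(format(d, "%Y"))
  mth <- as.integer(format(d, "%m"))
  # October through December belong to the following fiscal year
  yr + as.integer(mth >= 10L)
}

# Dates on either side of the fiscal year boundary
dates <- as.Date(c("2011-09-30", "2011-10-01", "2022-09-30", "2022-10-01"))
fy_example <- fiscal_year(dates)
print(data.frame(date = dates, fy = fy_example))
```

Checking the boundary dates this way is a quick guard against off-by-one errors when filtering to full fiscal years.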
The data show a major decrease in removals by ICE; note, however, that CBP Title 42 expulsions at the southern border since 2020 are not counted here.
# Total removals per full fiscal year
p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = as.factor(fy), y=n)) +
  geom_col() +
  labs(title = "Total removals per FY")

# Monthly totals as a single time series
p2 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(year_mth) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = year_mth, y = n)) +
  geom_line(aes(group=1)) +
  ylim(0, NA) +
  labs(title = "Total nationwide ICE removals per month")

# Monthly totals, one line per fiscal year
p3 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy, month) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = month, y = n, color = fy, group = fy)) +
  geom_line() +
  ylim(0, NA) +
  scale_color_viridis_d() +
  labs(title = "Total nationwide ICE removals per month")
# rem %>%
# mutate(gender = tolower(gender)) %>%
# group_by(gender) %>%
# summarize(n = n())
p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  count(fy, gender) %>%
  ggplot(aes(x=fy, y=n, fill=gender)) +
  geom_col(position='fill') +
  scale_y_continuous(labels = scales::percent) +
  labs(title="Total ICE removals, % by gender")
Note that `citizenship_country` may not correspond with an individual’s deportation destination; deportation destination is not represented in this dataset.
# Rank countries of citizenship by total removals, most common first
cit <- rem %>%
  mutate(citizenship_country = toupper(citizenship_country)) %>%
  group_by(citizenship_country) %>%
  summarize(n = n()) %>%
  arrange(desc(n))

p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  mutate(citizenship_country = case_when(
    citizenship_country %in% head(cit$citizenship_country, 15) ~ citizenship_country
    # the fallback branch is missing from this source; countries outside the top 15 become NA
  )) %>%
  count(fy, citizenship_country) %>%
  ggplot(aes(x=fy, y=n, fill=citizenship_country, color=citizenship_country)) +
  geom_col() +
  labs(title = "Total ICE removals by country of citizenship (top 15)")
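The top-15 grouping above keeps the most frequent categories and collapses the rest. A minimal base R sketch of that pattern on toy data follows; the country codes, the cutoff of 2, and the `"OTHER"` label are all hypothetical and chosen only for illustration.

```r
# Toy data; country codes and counts are made up for illustration.
toy <- data.frame(
  citizenship_country = c("A", "A", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Rank categories by frequency, most common first
counts <- sort(table(toy$citizenship_country), decreasing = TRUE)
top_n <- names(counts)[seq_len(2)]

# Keep the top categories; lump the rest under a hypothetical "OTHER" label
toy$country_short <- ifelse(toy$citizenship_country %in% top_n,
                            toy$citizenship_country, "OTHER")
lumped <- table(toy$country_short)
print(lumped)
```

Giving the remainder an explicit label keeps those records visible in a stacked chart instead of leaving them as `NA`.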
# % change in removal by group?
p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy, aor) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = as.factor(fy), y=n, color=aor, group=aor)) +
  geom_line() +
  labs(title = "Total removals per FY by AOR")
# Year-over-year change in national totals: n / previous n - 1
natl_pct_chg <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy) %>%
  summarize(n = n()) %>%
  mutate(pct_change = (n/lag(n) - 1))

p1 <- natl_pct_chg %>%
  ggplot(aes(x = fy, y = pct_change)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title="FY % change in total removals")
# Year-over-year change within each AOR; lag() is applied per group
aor_pct_chg <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30",
         aor != "HQ",
         !is.na(aor)) %>%  # assumed condition; the original expression is garbled in this source
  group_by(fy, aor) %>%
  summarize(n = n()) %>%
  group_by(aor) %>%
  arrange(fy, .by_group=TRUE) %>%
  mutate(pct_change = (n/lag(n) - 1))

p2 <- aor_pct_chg %>%
  ggplot(aes(x = fy, y = pct_change)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks=seq(2012, 2022, 2)) +
  facet_wrap(~aor) +
  labs(title="FY % change in total removals per AOR") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))
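The percent change used above, `n/lag(n) - 1`, compares each fiscal year's total with the previous year's. A small base R sketch on invented totals shows the arithmetic; the values are placeholders, not figures from the dataset.

```r
# Toy fiscal-year totals; values are invented for illustration.
toy <- data.frame(fy = c("2019", "2020", "2021"), n = c(200, 150, 75))

# Base R equivalent of dplyr's n / lag(n) - 1:
# shift the series by one position and compare each value with the previous one
prev_n <- c(NA, head(toy$n, -1))
toy$pct_change <- toy$n / prev_n - 1
print(toy)
```

The first year has no predecessor, so its change is `NA`. In the per-AOR version the shift has to happen within each AOR, which is why the notebook groups by `aor` and sorts by `fy` before calling `lag()`.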
Unlike datasets for encounters and arrests, removals data represents case processing disposition as an abbreviated `processing_disposition_code`. Where possible, we have inferred correspondence between full text `processing_disposition` values in encounters and arrests datasets and `processing_disposition_code` values in this dataset; cleaned values are represented in the `processing_disposition_clean` field.
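Mapping abbreviated codes to full text values amounts to a lookup against a hand-built dictionary. The sketch below illustrates the idea in base R with `match()`; the column names follow the notebook's join, but the codes and labels are placeholders and do not reflect the actual contents of `processing_disp.csv`.

```r
# Hypothetical lookup table; codes and labels are placeholders, not ICE values.
pd_lookup <- data.frame(
  processing_disposition_raw = c("X1", "X2"),
  processing_disposition_clean = c("LABEL ONE", "LABEL TWO"),
  stringsAsFactors = FALSE
)

records <- data.frame(processing_disp = c("X2", "X1", "ZZ", NA),
                      stringsAsFactors = FALSE)

# match() returns the lookup row for each code, or NA when a code is unmapped
idx <- match(records$processing_disp, pd_lookup$processing_disposition_raw)
records$processing_disposition_clean <- pd_lookup$processing_disposition_clean[idx]

# Share of records left without a cleaned value
unmapped_share <- mean(is.na(records$processing_disposition_clean))
print(records)
print(unmapped_share)
```

Tracking the unmapped share is a simple way to see how much of the dataset the inferred correspondence actually covers.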
disps <- rem %>%
  filter(departed_date >= "2012-10-01",
         departed_date <= "2022-09-30") %>%
  count(processing_disposition_clean)

# Dispositions with more than 50,000 removals are shown individually
top_disp <- disps %>%
  filter(n > 50000)

rem <- rem %>%
  mutate(disp_short = case_when(
    processing_disposition_clean %in% unlist(top_disp$processing_disposition_clean) ~ as.character(processing_disposition_clean)
    # the fallback branch is missing from this source; other dispositions become NA
  ))
p1 <- rem %>%
  filter(departed_date >= "2012-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy, disp_short) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = as.factor(fy), y=n, fill=disp_short)) +
  geom_col() +
  labs(title = "Total removals per FY by processing disposition")
The `apprehension_method_code` field is largely missing data prior to FY 2022. Codes are alphanumeric abbreviations; the most common codes are analogous to full text values in the `apprehension_method` field of the arrests dataset, but the significance of some codes
is unclear; for example, the 15 top values for this field and inferred
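# Normalize "287.0" to "287"; fixed() matches the literal string rather than treating "." as a regex wildcard
rem <- rem %>%
  mutate(apprehension_method_code = str_replace_all(apprehension_method_code, fixed("287.0"), "287"))

apprehension_method_code_rank <- rem %>%
  count(apprehension_method_code) %>%
  arrange(desc(n))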
p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  count(year_mth, apprehension_method_code) %>%
  ggplot(aes(x = year_mth, y = n, fill = apprehension_method_code)) +
  geom_col() +
  labs(title = "Removals by `apprehension_method_code`, FY 2022")
Removals data includes four separate date fields: `arrest_date`, `departed_date`, `case_close_date`, and `removal_date`. Of these, `departed_date` is most complete, with no missing values; therefore we use this date as the primary field for date values in this analysis.
rem %>%
  dplyr::select(contains('date')) %>%
  skimr::skim()
Name | Piped data |
Number of rows | 2665505 |
Number of columns | 4 |
_______________________ | |
Column type frequency: | |
Date | 4 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
departed_date | 0 | 1.00 | 2010-10-01 | 2023-01-27 | 2015-09-28 | 4500 |
case_close_date | 29356 | 0.99 | 2011-10-01 | 2022-10-04 | 2015-09-15 | 4018 |
arrest_date | 439348 | 0.84 | 1968-03-09 | 2023-01-27 | 2015-12-30 | 11667 |
removal_date | 778493 | 0.71 | 2013-10-01 | 2023-01-27 | 2017-05-18 | 3406 |
hist(rem$departed_date, breaks='years', col='pink')
hist(rem$removal_date, breaks='years', col='lightblue')
hist(rem$arrest_date, breaks='years', col='lightyellow')
hist(rem$case_close_date, breaks='years', col='lightgreen')
The earliest dataset analyzed here, for FY 2012, includes only the `departed_date` and `case_close_date` fields; the FY 2013 dataset introduces an additional `arrest_date` alongside these; and the FY 2014 and subsequent datasets include a fourth value for `removal_date`.
The fields `departed_date` and `removal_date` are complete for all records in datasets where these date fields appear. Only the most recent records for FY 2022 are missing `case_close_date` values, which suggests that these cases remained open at the time of production of this dataset. A small proportion of records are missing `arrest_date` during all years since FY 2013; it is not clear what this indicates about the cases in question.
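# Count missing values in each date field per fiscal year
rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  group_by(fy) %>%
  summarize(missing_dep_date = sum(is.na(departed_date)),
            missing_rem_date = sum(is.na(removal_date)),
            missing_arr_date = sum(is.na(arrest_date)),
            missing_cc_date = sum(is.na(case_close_date)))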
## # A tibble: 11 × 5
## fy missing_dep_date missing_rem_date missing_arr_date missing_cc_date
## <chr> <int> <int> <int> <int>
## 1 2012 0 408419 402919 0
## 2 2013 0 363144 144 0
## 3 2014 0 0 1772 0
## 4 2015 0 0 1836 0
## 5 2016 0 0 2679 0
## 6 2017 0 0 3362 0
## 7 2018 0 0 4095 0
## 8 2019 0 0 4723 0
## 9 2020 0 0 4117 0
## 10 2021 0 0 2756 0
## 11 2022 0 0 2922 1924
We can calculate the lag between date fields, for example between arrest date and departure date. This reveals an increase in the average time between arrest and departure since FY 2014, and significant differences between cases by processing disposition.
# Days elapsed between arrest and departure (NA where arrest_date is missing)
rem$dep_diff_arr <- difftime(rem$departed_date, rem$arrest_date, units='days')

p1 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  filter(fy >= 2013) %>%
  group_by(fy) %>%
  summarize(mean_dep_diff_arr = mean(dep_diff_arr, na.rm = TRUE)) %>%
  ggplot(aes(x = fy, y = mean_dep_diff_arr)) +
  geom_line(group=1) +
  ylim(0, NA)

p2 <- rem %>%
  filter(departed_date >= "2011-10-01",
         departed_date <= "2022-09-30") %>%
  mutate(disp_short = case_when(
    processing_disposition_clean %in%
      unlist(top_disp$processing_disposition_clean) ~
      as.character(processing_disposition_clean)
    # the fallback branch is missing from this source; other dispositions become NA
  )) %>%
  filter(fy >= 2014) %>%
  group_by(fy, disp_short) %>%
  summarize(mean_dep_diff_arr = mean(dep_diff_arr, na.rm = TRUE),
            med_dep_diff_arr = median(dep_diff_arr, na.rm = TRUE)) %>%
  ggplot(aes(y = disp_short, x = mean_dep_diff_arr, color = disp_short, group=disp_short)) +
  geom_boxplot() +
  scale_y_discrete(labels=function(x) abbreviate(x, minlength=10))
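Because `arrest_date` values in this dataset reach back as far as 1968, a few very old arrests can pull the mean lag well above a typical case. The toy sketch below, with invented dates, shows how the mean and median of the arrest-to-departure lag can diverge; it illustrates the calculation only and says nothing about the actual distribution.

```r
# Toy records; dates are invented for illustration.
toy <- data.frame(
  arrest_date   = as.Date(c("2020-01-01", "2020-01-10", "2015-01-01", NA)),
  departed_date = as.Date(c("2020-01-03", "2020-01-15", "2020-01-01", "2020-02-01"))
)

# Lag in days between arrest and departure; NA where arrest_date is missing
toy$dep_diff_arr <- as.numeric(difftime(toy$departed_date, toy$arrest_date, units = "days"))

# One multi-year case dominates the mean but not the median
mean_lag <- mean(toy$dep_diff_arr, na.rm = TRUE)
median_lag <- median(toy$dep_diff_arr, na.rm = TRUE)
print(c(mean = mean_lag, median = median_lag))
```

Reporting the median alongside the mean, as the `med_dep_diff_arr` column above allows, gives a more robust picture when lags are heavily skewed.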
1. For discussion of ICE’s definition of “removals”, see American Immigration Council, “Changing Patterns of Interior Immigration Enforcement in the United States, 2016 - 2018”, July 2019: