pacman::p_load(tidyverse, sf, rnaturalearth, rnaturalearthdata, maps, tools, plotly)
cca <- read_csv("../input/applied.csv",
col_names=TRUE,
col_select=c("file_hash", "is_contemp_comm_arrest", "model_label"),
show_col_types=FALSE) %>%
rename(hand_label=is_contemp_comm_arrest) %>%
mutate(is_hand_labeled=!is.na(hand_label)) %>%
mutate(cca_label=ifelse(is_hand_labeled, hand_label, model_label))
inputfile <- here::here("./import/input/uw-chr-i213-public.csv.gz")
i213 <- read_delim(
inputfile,
delim = "|",
col_types = cols(
source = col_character(),
sex = col_character(),
cmplxn = col_character(),
country_of_citizenship = col_factor(),
year = col_double(),
month = col_double(),
day = col_double(),
hour = col_double(),
minute = col_double(),
fy = col_double(),
age = col_double(),
accompanied_juvenile_flag = col_double(),
unaccompanied_juvenile_flag = col_double(),
custody_redetermination_flag = col_double(),
lawsuit = col_factor()
))
i213 <- inner_join(i213, cca, by="file_hash")
hand_labeled <- i213 %>%
filter(is_hand_labeled==TRUE)
model_labeled <- i213 %>%
filter(is_hand_labeled==FALSE)
subtitle <- "For All Arrest Records with Method 'NCA' or 'O'"
As part of its “Human Rights At Home” and “Immigrant Rights Observatory” research initiatives, the University of Washington Center for Human Rights (UWCHR) obtained a collection of I-213 “Record of Deportable/Inadmissible Alien” forms via Freedom of Information Act (FOIA) requests to the US Department of Homeland Security (DHS). The UWCHR first obtained a collection of 3887 I-213 records from Immigration and Customs Enforcement (ICE) and Customs and Border Protection (CBP) through a lawsuit against DHS in 2020, and later obtained a collection of 2895 additional records through a lawsuit against ICE in 2021.
While analyzing trends in I-213 records obtained across both lawsuits, the UWCHR noticed a significant increase in arrest records categorized as Non-Custodial Arrests (NCA). Traditionally, agencies had used the NCA categorization to refer to arrests that were carried out as a result of targeted surveillance efforts by agents. An emergent concern was therefore that the increase in NCA arrests over time might reflect an increase in community surveillance efforts, raising privacy and human rights concerns. Further inspection, however, revealed that a significant number of NCA arrests from lawsuit 2 reflected incidents where immigrants were first encountered by CBP along the US border before being instructed to report to local ICE offices in the Pacific Northwest. While such encounters doubtless raise important human rights concerns as well, it was important for UWCHR to distinguish between traditional “contemporary community arrests” and the new class of NCA arrests to better understand evolving trends in immigration enforcement. Given the large number of I-213 records in UWCHR’s possession, it became necessary to use machine learning techniques to process them efficiently. In this document, we outline the methodology, accuracy, and results of this machine learning research effort.
We develop and use the following taxonomy of I-213 records to classify records as contemporary community arrests (TRUE) or non-contemporary community arrests (FALSE) by narrative:
After further analyzing the I-213 records, we realized that the Other (O) category of I-213 arrest records also contained a significant proportion of arrests that would be characterized as contemporary community arrests under the taxonomy above. Therefore, here we include both NCA and O records in our datasets. Out of 1574 total NCA and O records with text narratives, 1034 are categorized as NCA and 540 are categorized as O; 693 are from lawsuit 1 and 881 are from lawsuit 2.
We hand-code a subset of these to train a machine learning (ML) model. Out of a total 639 hand-coded records, 579 records are categorized as NCA and 60 are categorized as O; 80 are from lawsuit 1 and 559 are from lawsuit 2.
After training, we apply the model to the remaining records to generate machine-coded labels. Out of a total of 935 machine-coded records, 455 are categorized as NCA and 480 are categorized as O; 613 are from lawsuit 1 and 322 are from lawsuit 2.
We split the hand-coded data into train, validation, and test sets, with a 60-20-20 split respectively. We fine-tune a RoBERTa-base model on the train data for 30 epochs, with a batch size of 16 and an AdamW optimizer with learning rate 1e-5, and save the model with the highest validation accuracy. We then measure the model's performance against the test data and apply it to the remaining unseen I-213 records categorized as NCA or O.
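The 60-20-20 split described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the procedure actually used for training: the seed, the rounding rule, and the names `n_train`, `n_val`, and `split_label` are hypothetical.

```r
# Minimal sketch of a 60-20-20 train/validation/test split.
# The seed and the rounding rule are assumptions made for illustration.
set.seed(42)

n <- 639                      # number of hand-coded records reported above
n_train <- round(0.6 * n)     # size of the train set
n_val <- round(0.2 * n)       # size of the validation set

# Shuffle the row indices, then assign consecutive blocks to each set.
shuffled <- sample(seq_len(n))
split_label <- rep("test", n)
split_label[shuffled[seq_len(n_train)]] <- "train"
split_label[shuffled[n_train + seq_len(n_val)]] <- "val"

# Number of records assigned to each set.
table(split_label)
```

Shuffling once and slicing the permuted indices keeps the three sets disjoint, so no record can appear in both the training data and the evaluation data.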
log <- read_log("../input/training.log",
col_names=c("col", "value"),
col_types=cols('c', 'c')) %>%
mutate(col=str_sub(col, 15, -2))
train_data <- log %>%
filter(col %in% c("train_acc", "val_acc", "saved")) %>%
mutate(epoch=(row_number() + 2) %/% 3) %>%
pivot_wider(names_from=col, values_from=value) %>%
mutate(train_acc=as.numeric(train_acc)) %>%
mutate(val_acc=as.numeric(val_acc))
The following chart shows the train and validation accuracy from training over 30 epochs. The model achieves its highest validation accuracy of 0.9296875 at epochs 7 and 9:
ggplot(train_data, aes(x=epoch, group=1)) +
geom_line(aes(y=train_acc, color="Train Accuracy")) +
geom_line(aes(y=val_acc, color="Validation Accuracy")) +
labs(title="Model Training: Train and Validation Accuracy",
x="Epoch",
y="Accuracy",
color="")
test_results <- log %>%
filter(col %in% c("test_acc", "pos_precision", "neg_precision",
"pos_recall", "neg_recall", "pos_f1", "neg_f1")) %>%
mutate(value=as.numeric(value)) %>%
pivot_wider(names_from=col, values_from=value)
The model then achieves the following performance on the test data:
The relatively low precision for positive (contemporary community arrest) records indicates a comparatively high rate of false positives. We therefore believe that our results are likely a slight overcount of the true number of contemporary community arrests.
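To make the link between precision and false positives concrete, the sketch below computes precision, recall, and F1 for the positive class from confusion-matrix counts. The function name `classification_metrics` and the counts passed to it are hypothetical, chosen only to illustrate the arithmetic; they are not the model's actual test results.

```r
# Precision, recall, and F1 for the positive class, computed from
# true positives (tp), false positives (fp), and false negatives (fn).
classification_metrics <- function(tp, fp, fn) {
  precision <- tp / (tp + fp)   # share of predicted positives that are correct
  recall <- tp / (tp + fn)      # share of actual positives that are found
  f1 <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
}

# Hypothetical counts, for illustration only.
m <- classification_metrics(tp = 30, fp = 10, fn = 5)
round(m, 3)
```

Because false positives appear only in the denominator of precision, every additional record wrongly labeled as a contemporary community arrest lowers precision while leaving recall unchanged.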
We apply the trained model to the remaining NCA and O records that were not hand-coded. The combination of hand-coded and machine-coded results reveals that the proportion of non-contemporary community arrests performed by ICE out of all NCA and O arrests has increased dramatically from lawsuit 1 to lawsuit 2:
cca_counts <- i213 %>% count(lawsuit, method_short, cca_label, name="count")
cca_counts_all <- i213 %>% count(lawsuit, cca_label, name="count") %>%
mutate(method_short="All")
cca_counts <- bind_rows(cca_counts, cca_counts_all) %>%
arrange(lawsuit, method_short, cca_label)
cca_counts
## # A tibble: 12 × 4
## lawsuit method_short cca_label count
## <fct> <chr> <lgl> <int>
## 1 1 All FALSE 156
## 2 1 All TRUE 537
## 3 1 NCA FALSE 107
## 4 1 NCA TRUE 398
## 5 1 O FALSE 49
## 6 1 O TRUE 139
## 7 2 All FALSE 767
## 8 2 All TRUE 114
## 9 2 NCA FALSE 420
## 10 2 NCA TRUE 109
## 11 2 O FALSE 347
## 12 2 O TRUE 5
ggplot(cca_counts, aes(cca_label, count, fill=lawsuit)) +
geom_col(position="dodge") +
facet_wrap(~method_short) +
labs(title="Categorization of Records: Lawsuit 1 vs. Lawsuit 2",
subtitle=subtitle,
x="Contemporary Community Arrest",
y="Count",
fill="Lawsuit")
Based on these results, we do not believe that the increase in NCA arrest records reflects an increase in contemporary community arrests performed by ICE, but rather a shift in how ICE is categorizing arrest records.
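The size of this shift can be read off the "All" rows of the counts printed above. The sketch below copies those four counts by hand and computes, for each lawsuit, the share of records labeled as non-contemporary community arrests. The vector names are illustrative and not part of the original analysis.

```r
# "All" rows copied from the cca_counts output above, keyed by lawsuit.
non_cca <- c("1" = 156, "2" = 767)   # cca_label == FALSE
cca <- c("1" = 537, "2" = 114)       # cca_label == TRUE

# Share of each lawsuit's NCA and O records labeled as
# non-contemporary community arrests.
non_cca_share <- non_cca / (non_cca + cca)
round(non_cca_share, 3)
```

On these counts the share rises from roughly 23% in lawsuit 1 to roughly 87% in lawsuit 2. Both figures combine hand-coded and machine-coded labels, so they inherit the model error discussed in this document.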
Finally, we note several limitations of our approach for I-213 classification. One challenge has to do with the potential ambiguity presented by text narratives with respect to the “contemporary community arrest” categorization. During hand-labeling, for instance, we found several records that did not neatly fall into or outside this category. Where applicable, we recorded our decisions in such cases by updating our taxonomy, but given that we did not hand-label all records, the ambiguity of community arrests likely presents a source of potential error in the model results. This error is especially likely when the model is applied to records from the “O” (“Other”) category, because the category encompasses a wide range of potential text narratives that may or may not have been seen by the model during training.
An additional limitation of our methodology is the use of optical character recognition (OCR) tools to digitize the text narratives from the I-213 records. We find that our OCR tools have difficulty interpreting redacted information, often adding meaningless text in the middle of a sentence. It is therefore unclear to what extent the classification model is able to adequately capture the meaning within the text narrative, limiting the accuracy of our classifications. Future research might also train an OCR model to better handle redacted information when digitizing text narratives.