Statistical Analysis of criminal cases in the United States District Court of Puerto Rico

Submission Number: 187
Submission ID: 4251
Submission UUID: 802b435b-48c3-4f1e-8c57-67713360aeba
Submission URI: /form/project

Created: Fri, 12/08/2023 - 15:21
Completed: Fri, 12/08/2023 - 15:21
Changed: Tue, 07/02/2024 - 10:51

Remote IP address: 131.128.76.34
Submitted by: Gaurav Khanna
Language: English

Is draft: No
Webform: Project
CAREERS
Complete

Project Leader

Michael Chou

Project Personnel

Michael Chou
Emily Gelchie

Project Information

To support an amicus brief to the US Supreme Court, the Puerto Rico Association of Criminal Defense Lawyers (PRACDL) compiled several indictments and docket sheets from the PACER system. Data extracted from these documents were analyzed alongside sociodemographic data from the US Census. The wealth of data contained in these documents is not easily accessible for statistical study. The goal of this project is twofold. First, we will write scripts to mine these documents for information including, but not limited to: the length of time that a case is "open", the percentage of persons represented by a court-appointed attorney, the average length of sentences, the number of persons granted bail, and the number of persons with bail violations and the reasons for those violations. Second, data science techniques will be used to produce visualizations and detect correlations among these categories. An understanding of these data will facilitate future social justice projects in this jurisdiction, and the methods should apply to other indictments and docket sheets from the PACER system at large.
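
As an illustration of the first goal, the following is a minimal sketch of how two of the listed variables (time a case is "open" and representation by a court-appointed attorney) might be pulled from docket text. It is not the project's actual script. The line layout captured by ENTRY_RE, the keyword list, the function names, and the sample docket are all assumptions made up for this example; real PACER docket sheets vary and would need more robust parsing.

```python
import re
from datetime import datetime

# Assumed layout (hypothetical): one docket entry per line,
# "MM/DD/YYYY <entry number> <entry text>".
ENTRY_RE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s+(\d+)\s+(.*)$")

# Assumed keywords that signal a court-appointed attorney.
APPOINTED_KEYWORDS = ("cja", "appointed", "federal public defender")


def parse_entries(docket_text):
    """Return (date, entry_number, text) tuples for lines matching ENTRY_RE."""
    entries = []
    for line in docket_text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            filed = datetime.strptime(match.group(1), "%m/%d/%Y").date()
            entries.append((filed, int(match.group(2)), match.group(3)))
    return entries


def days_open(entries):
    """Days between the first and last entry, a rough proxy for time 'open'."""
    dates = [filed for filed, _, _ in entries]
    return (max(dates) - min(dates)).days


def has_appointed_counsel(entries):
    """True if any entry text contains one of the assumed keywords."""
    return any(
        keyword in text.lower()
        for _, _, text in entries
        for keyword in APPOINTED_KEYWORDS
    )


# Made-up sample docket, for illustration only.
SAMPLE = """\
01/05/2020 1 INDICTMENT as to Defendant (1) count(s) 1
01/07/2020 4 ORDER appointing CJA counsel for Defendant (1)
03/10/2021 57 JUDGMENT as to Defendant (1)
"""

if __name__ == "__main__":
    sample_entries = parse_entries(SAMPLE)
    print(days_open(sample_entries), has_appointed_counsel(sample_entries))
```

Using the first and last docket entries as the open and close dates is a simplification; a fuller version would look for specific events such as the indictment and the judgment.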

Project Information Subsection

Providence College
CR-University of Rhode Island
Yes
Already behind
3
Start date is flexible
6
  • Milestone Title: Milestone #1
    Milestone Description: Identify correlations to study, begin presentation preparation, set up project on GitHub.
    Completion Date Goal: 2024-01-01
  • Milestone Title: Milestone #2
    Milestone Description: Initialize data set with case numbers, names, docket entries. Clean data set.
    Completion Date Goal: 2024-02-01
  • Milestone Title: Milestone #3
    Milestone Description: Harvest first set of variables to study. Check accuracy on small subset.
    Completion Date Goal: 2024-03-01
  • Milestone Title: Milestone #4
    Milestone Description: Harvest second set of variables to study. Check accuracy on small subset.
    Completion Date Goal: 2024-05-01
  • Milestone Title: Milestone #5
    Milestone Description: Run regressions, create visualizations, finish presentation.
    Completion Date Goal: 2024-06-30
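
The "check accuracy on small subset" step in Milestones #3 and #4 could be sketched as a comparison between script-extracted values and values coded by hand for the same cases. This is only an assumed approach; the function name, the field names, and the case identifiers below are hypothetical and do not come from the project.

```python
def field_accuracy(extracted, hand_coded, fields):
    """Per-field share of hand-checked cases where the extracted value matches.

    Both inputs map a case identifier to a dict of field values. Only cases
    present in both inputs are compared.
    """
    shared_cases = sorted(set(extracted) & set(hand_coded))
    if not shared_cases:
        raise ValueError("no cases in common to compare")
    accuracy = {}
    for field in fields:
        matches = sum(
            extracted[case].get(field) == hand_coded[case].get(field)
            for case in shared_cases
        )
        accuracy[field] = matches / len(shared_cases)
    return accuracy


# Made-up values, for illustration only.
extracted = {
    "case-A": {"days_open": 430, "appointed_counsel": True},
    "case-B": {"days_open": 120, "appointed_counsel": False},
    "case-C": {"days_open": 75, "appointed_counsel": True},
}
hand_coded = {
    "case-A": {"days_open": 430, "appointed_counsel": True},
    "case-B": {"days_open": 118, "appointed_counsel": False},
}

if __name__ == "__main__":
    print(field_accuracy(extracted, hand_coded, ["days_open", "appointed_counsel"]))
```

Exact equality is a strict criterion; for date-derived fields such as days open, a tolerance of a few days might be more appropriate.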

Final Report

This project is at the forefront of large-scale data processing in the legal field. Our work allows for the analysis of court dockets, revealing macro trends that enhance transparency and shed light on biases in the legal system.
The approach could extend to many other disciplines. Although we focused on the interaction between law, data science, and social science for this project, court cases touch on many disciplines. Now that we have created the technology for large-scale analysis, we can expand this project to any discipline for which court cases exist.
Once we can identify more macro trends in court cases, we can present these findings to government officials to reform the US justice system and mitigate legal biases.
Throughout this project, I learned a lot about the biases present in the legal system, particularly around charge 846. I also learned how to work on a team of developers, conduct research, and use GitHub, and I strengthened my data science skills while learning about machine learning techniques.
Overall, we began to find trends in the court dockets we analyzed, suggesting a possible association between wealth and preferential legal treatment. Looking forward, we will dig deeper into these initial findings to build stronger evidence for them.