Course Description
STEM 1, which runs from August to February, consists of an independent research project overseen by Dr. C. The coursework includes maintaining an up-to-date logbook, understanding and analyzing previous research, writing detailed article notes, collecting and analyzing data, completing a Grant Proposal, and writing a STEM thesis. This work is in preparation for Feb Fair.
Using Math Modeling to Identify and Correct Gerrymandering
Overview
This research project sought to develop an objective measure for identifying gerrymandering and an algorithm that outputs corrected, non-gerrymandered maps. Together, these provide evidence to prosecute gerrymandering and protect the essential right to vote for everyone, including minorities.
Abstract
Several laws aim to prevent gerrymandering by maintaining district compactness and prohibiting discriminatory redistricting plans, as outlined in Section 2 of the Voting Rights Act (Section 2 of the Voting Rights Act, 2015). However, these laws are ultimately deemed ineffective because gerrymandering cannot be objectively measured or solved. Current models aimed at outputting optimized non-gerrymandered maps fail to account for all factors. These existing models are accurate but consider only geographical size, compactness, and the preservation of old district cores. They leave out one crucial factor, which is also one of the main reasons the archaic electoral college system remains in use: just representation of minorities. Thus, the project goal is to develop a mathematical model to objectively and accurately identify and correct gerrymandering, specifically gerrymandering that targets minorities. This research could significantly change the way we redistrict and provide objective legal evidence to prosecute gerrymandering, making every vote count the same and ensuring political justice.
Problem Statement
Existing mathematical redistricting models focus on compactness or geographical factors and do not account for fair representation of minorities, which is the purpose of the electoral college system.
Objective
The goal of this project is to develop objective measures to identify and correct minority-targeted gerrymandering.
Background
Gerrymandering is the manipulation of electoral borders to work in favor of, or against, a specific party. It is deeply undemocratic, but without objective evidence, action cannot be taken against it (Kirschenbaum & Li, 2021). There are two types of gerrymandering: packing and cracking. Packing describes grouping many voters of the same party into one district so that the party wins that district by a tremendous margin while the surrounding districts become less competitive (Jones, 2018). Cracking also makes districts less competitive, but it does this by splitting a party’s voters across several districts, making them a minority in each one (Jones, 2018). There is legislation aimed at preventing gerrymandering. One such law is Section 2 of the Voting Rights Act, which states that redistricting to intentionally pack or crack minorities is strictly prohibited (Section 2 of the Voting Rights Act, 2015).
Procedure
The procedure of this research project followed the typical math modeling process: defining the problem, making assumptions, defining variables, solving, and analyzing the solution. Correcting gerrymandering is a combinatorial optimization problem, so the Correction Model had to be a rapid randomized algorithm. The Correction Model, programmed as a Java class, calls the MapGenerator class to randomly generate a map in which a varying number of horizontal and vertical lines create random districts. Each random map’s gerrymandering score is evaluated using the Identification Model, and the best score after several runs is output with its corresponding redistricting plan. The Identification Model and Correction Model were each refined several times. The major revisions can be found in the Research Proposal subpage.
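The generate-score-keep-best loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the class and method names (CorrectionSearchSketch, GridPlan, randomPlan, bestOf) are hypothetical, a plan is assumed to be a set of cut positions on a unit square, and the scorer in main is a placeholder standing in for the Identification Model.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Sketch of a best-of-N randomized search; every name here is hypothetical. */
final class CorrectionSearchSketch {

    /** Candidate plan: sorted cut positions in [0, 1) along each axis of a unit square. */
    static final class GridPlan {
        final double[] horizontalCuts;
        final double[] verticalCuts;

        GridPlan(double[] horizontalCuts, double[] verticalCuts) {
            this.horizontalCuts = horizontalCuts;
            this.verticalCuts = verticalCuts;
        }

        /** h horizontal and v vertical lines split the square into (h + 1) * (v + 1) cells. */
        int districtCount() {
            return (horizontalCuts.length + 1) * (verticalCuts.length + 1);
        }
    }

    /** A plan together with the score the scorer assigned to it. */
    static final class Result {
        final GridPlan plan;
        final double score;

        Result(GridPlan plan, double score) {
            this.plan = plan;
            this.score = score;
        }
    }

    /** Stand-in for the map generation step: draws 0..maxLines random cuts per axis. */
    static GridPlan randomPlan(Random rng, int maxLines) {
        double[] h = rng.doubles(rng.nextInt(maxLines + 1)).sorted().toArray();
        double[] v = rng.doubles(rng.nextInt(maxLines + 1)).sorted().toArray();
        return new GridPlan(h, v);
    }

    /** Generates `runs` random plans (runs >= 1) and keeps the one with the lowest score. */
    static Result bestOf(int runs, int maxLines, Random rng, ToDoubleFunction<GridPlan> scorer) {
        Result best = null;
        for (int i = 0; i < runs; i++) {
            GridPlan candidate = randomPlan(rng, maxLines);
            double score = scorer.applyAsDouble(candidate);
            if (best == null || score < best.score) {
                best = new Result(candidate, score);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder scorer (NOT the Identification Model): distance from 5 districts.
        ToDoubleFunction<GridPlan> toyScorer = p -> Math.abs(p.districtCount() - 5);
        Result best = bestOf(20, 4, new Random(42), toyScorer);
        System.out.println("best score = " + best.score
                + ", districts = " + best.plan.districtCount());
    }
}
```

Because each candidate is drawn independently, the best score can only stay the same or improve as the number of runs grows, which is why a search of this kind is fast but offers no guarantee of finding the optimal plan.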
Figures
Figure 1: This figure depicts the Identification Model and the specifics of its calculation. This model has three variables: the efficiency gap (EG), the popular vote deviation for Party A (PVDA), and the popular vote deviation for Party B (PVDB). A weight was assigned to each variable: 0.50 for EG, 0.25 for PVDA, and 0.25 for PVDB. The figure shows the inputs and factors that were accounted for during model development.
Figure 2: This figure shows Connecticut’s electoral district map during the 2024 election (ABC News, 2024). Data from the website’s interactive map was used as input for the Identification Model, which yielded a gerrymandering score of 0.1811. The Correction Model was able to reduce this score to 0.139 with its recommended map.
Figure 3: This figure shows a histogram of the results of 15 trials. For each trial, the inputs were randomized and the Correction Model was run twenty times. The best gerrymandering score after the twentieth run was recorded. The histogram demonstrates that even after only twenty runs, the Correction Model’s scores are concentrated on the left side of the histogram, meaning the model has a bias toward less gerrymandering. This supports the model’s validity.
Figure 4: This figure compares the gerrymandering scores output by the Identification Model for three datasets with increasing levels of gerrymandering. Dataset 1 yielded the lowest gerrymandering score of around 0, Dataset 2 had a gerrymandering score of 0.2833, and Dataset 3 had a gerrymandering score of 0.3.
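The weighting described in Figure 1 can be illustrated with a short sketch. Only the three weights come from the project. The rest is assumed for illustration: that the variables are combined as a weighted sum of absolute values, and that the efficiency gap follows the common "wasted votes" definition with a threshold of half the district total. The class and method names are hypothetical, and the popular vote deviations are taken as ready-made inputs because their formula is not given here.

```java
/** Sketch of a weighted gerrymandering score; the formulas are assumptions, not the project's code. */
final class IdentificationScoreSketch {
    // Weights stated in Figure 1.
    static final double W_EG = 0.50;
    static final double W_PVDA = 0.25;
    static final double W_PVDB = 0.25;

    /**
     * Efficiency gap under the "wasted votes" definition (assumed here):
     * all of a loser's votes are wasted, and a winner's votes beyond half
     * the district total are wasted. Returns (wastedA - wastedB) / totalVotes.
     * Assumes both arrays have the same length and at least one vote was cast.
     */
    static double efficiencyGap(int[] votesA, int[] votesB) {
        double wastedA = 0;
        double wastedB = 0;
        double total = 0;
        for (int d = 0; d < votesA.length; d++) {
            double a = votesA[d];
            double b = votesB[d];
            double half = (a + b) / 2.0;
            if (a > b) {
                wastedA += a - half;
                wastedB += b;
            } else if (b > a) {
                wastedB += b - half;
                wastedA += a;
            }
            // Exact ties are skipped: they would add equally to both sides.
            total += a + b;
        }
        return (wastedA - wastedB) / total;
    }

    /** Weighted sum of the magnitudes of the three variables (assumed combination rule). */
    static double score(double eg, double pvdA, double pvdB) {
        return W_EG * Math.abs(eg) + W_PVDA * Math.abs(pvdA) + W_PVDB * Math.abs(pvdB);
    }

    public static void main(String[] args) {
        // Hypothetical three-district election, not data from the project.
        int[] votesA = {70, 40, 40};
        int[] votesB = {30, 60, 60};
        double eg = efficiencyGap(votesA, votesB);
        // Hypothetical deviation values, supplied directly.
        System.out.println("EG = " + eg + ", score = " + score(eg, 0.1, 0.1));
    }
}
```

Under these assumptions a score of 0 means no measured imbalance, and the efficiency gap carries as much influence as the two popular vote deviations combined.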
Analysis
Validating the statistical significance of a math model can be challenging, but this was completed using a one-sample t-test for the Correction Model and a Spearman rank correlation test for the Identification Model. The latter was conducted with ordinal data, so Dataset 1 was assigned the value 1, Dataset 2 the value 2, and so on. The Spearman correlation was 1 with a p-value of 0. This indicates a positive monotonic relationship, but because of the small sample size, a perfectly linear relationship can’t be guaranteed without further testing. A histogram of the Correction Model’s results was developed, which shows a left bias, meaning the model favors redistricting plans with lower gerrymandering scores. A one-sample t-test was completed to test this hypothesis. The hypothesized population mean was 0.5, yielding a t-statistic of -6.88 and a p-value of 0.456. Thus, there is not significant evidence to reject the null hypothesis. This lack of statistical significance does not necessarily mean that the model itself is inaccurate. The t-test simply provides a result based on the outputted gerrymandering scores; it doesn’t account for the redistricting maps or the model algorithm. The Identification Model was shown to be generally statistically accurate.
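For readers who want to see how the two statistics are formed, here is a minimal sketch. It is not the tool the project used for its analysis. The Spearman example uses the three Figure 4 scores, taking "around 0" as exactly 0.0 (an assumption). The t-test sample is entirely hypothetical, since the 15 trial scores are not listed here. The class and method names are also hypothetical, and no p-values are computed.

```java
/** Sketch of the two test statistics; names and sample values are illustrative assumptions. */
final class ValidationStatsSketch {

    /** Ranks with 1 = smallest value; assumes there are no tied values. */
    static double[] ranks(double[] values) {
        int n = values.length;
        double[] r = new double[n];
        for (int i = 0; i < n; i++) {
            int rank = 1;
            for (int j = 0; j < n; j++) {
                if (values[j] < values[i]) {
                    rank++;
                }
            }
            r[i] = rank;
        }
        return r;
    }

    /** Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid when there are no ties. */
    static double spearman(double[] x, double[] y) {
        double[] rx = ranks(x);
        double[] ry = ranks(y);
        int n = x.length;
        double sumD2 = 0;
        for (int i = 0; i < n; i++) {
            double d = rx[i] - ry[i];
            sumD2 += d * d;
        }
        return 1.0 - 6.0 * sumD2 / (n * ((double) n * n - 1));
    }

    /** One-sample t-statistic: (mean - mu0) / (s / sqrt(n)), with s the sample standard deviation. */
    static double oneSampleT(double[] sample, double mu0) {
        int n = sample.length;
        double sum = 0;
        for (double v : sample) {
            sum += v;
        }
        double mean = sum / n;
        double squaredDev = 0;
        for (double v : sample) {
            squaredDev += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(squaredDev / (n - 1));
        return (mean - mu0) / (sd / Math.sqrt(n));
    }

    public static void main(String[] args) {
        double[] datasetOrder = {1, 2, 3};
        double[] scores = {0.0, 0.2833, 0.3}; // Figure 4 values; 0.0 stands in for "around 0"
        System.out.println("Spearman rho = " + spearman(datasetOrder, scores));

        double[] hypotheticalTrials = {0.1, 0.2, 0.3}; // NOT the project's trial data
        System.out.println("t = " + oneSampleT(hypotheticalTrials, 0.5));
    }
}
```

With only three strictly increasing scores, the rank differences are all zero, so rho is 1 by construction. This is why the correlation alone says little about linearity at this sample size.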
Discussion/Conclusion
Next Steps:
Significance:
References
Feb Fair Poster