Precinct and Demographic-Level Election Simulation
Now that the campaign has a tool for training canvassers, where should it deploy them? Identifying an achievable path to victory for the campaign was one of our earliest and most important challenges as a team. We initially used some standard “path to victory” spreadsheets, but these proved better for validating an existing strategy than for iterating on possible strategies. They also require a significant up-front investment of time before the user can understand them and fill them out with a plan.
To address this, we built tooling in Python that takes as input crosstabs of voter counts grouped along any set of dimensions, together with the average turnout and support of the voters in each group. These inputs are easy to produce and can be exported directly from VoteBuilder or similar platforms. We then run simulations on these buckets of voters; one million simulations run quickly. Each simulation randomly draws a probable number of Democratic votes from each bucket of people (for example, if the voters are grouped by age and race, we would draw a number of Democratic votes for “white voters age 18-35”). That number is drawn using the size of the group, its average turnout, and its average partisanship, plus a configurable amount of random variance.
We then tally up the votes from the various groups, which lets us determine whether the Democratic candidate would win. At that point, we have produced one million election outcomes, each paired with the simulated turnout and support of every group. In Ashwin’s district, as a baseline, our simulations have a Democratic candidate winning about one third of the time. That seems plausible, considering it is a tough district for Democrats.
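The draw-and-tally loop described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the team's actual tooling: the bucket names and figures in `BUCKETS`, the helper names, the Gaussian noise model with its `noise_sd` parameter, and the two-party "more than half of ballots" win rule are all assumptions made for the example.

```python
import random

# Hypothetical crosstab: one entry per bucket of voters. All figures are made up.
BUCKETS = {
    "white 18-35": {"voters": 12_000, "turnout": 0.45, "support": 0.48},
    "white 65+": {"voters": 15_000, "turnout": 0.75, "support": 0.38},
    "asian 30-50": {"voters": 6_000, "turnout": 0.55, "support": 0.65},
}


def clamp(rate):
    """Keep a noisy rate inside the valid range [0, 1]."""
    return min(1.0, max(0.0, rate))


def simulate_once(buckets, rng, noise_sd=0.05):
    """Draw one election: noisy turnout and support rates for every bucket."""
    result = {}
    for name, b in buckets.items():
        # Assumed noise model: independent Gaussian noise around each average.
        turnout = clamp(rng.gauss(b["turnout"], noise_sd))
        support = clamp(rng.gauss(b["support"], noise_sd))
        ballots = b["voters"] * turnout
        result[name] = {
            "turnout": turnout,
            "support": support,
            "ballots": ballots,
            "dem_votes": ballots * support,
        }
    return result


def dem_wins(result):
    """Tally across buckets; a win means more than half of all ballots cast."""
    ballots = sum(r["ballots"] for r in result.values())
    dem = sum(r["dem_votes"] for r in result.values())
    return dem > ballots / 2


def win_rate(buckets, n_sims=10_000, seed=0):
    """Fraction of simulated elections that the Democratic candidate wins."""
    rng = random.Random(seed)
    wins = sum(dem_wins(simulate_once(buckets, rng)) for _ in range(n_sims))
    return wins / n_sims


if __name__ == "__main__":
    print(f"Simulated Democratic win rate: {win_rate(BUCKETS):.1%}")
```

Keeping the per-bucket turnout and support from `simulate_once` alongside each outcome is what makes the later analysis possible, since every simulated election then carries the group-level numbers that produced it.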
We can then run a regression of those outcomes (our dependent variable) on the simulated turnout and support of each group (our independent variables). This gives a list of coefficients for turnout and support in each group; each coefficient indicates how strongly that feature is associated with the outcome. For example, here is a map of the coefficients for turnout in precincts in GA-48. We can see some precincts (in green) where having more folks turn out helps us, as well as red precincts where, on average across all one million simulations, an increase in turnout was associated with a Dem loss.
And here are the coefficients of age/race/gender buckets that are most correlated with a loss:
Unsurprisingly, high turnout among older white voters is correlated with a loss.
But the regression also shows how we can improve the outcomes. Here are the features with large, positive coefficients:
These numbers begin to suggest a path to victory along race/age/gender demographics. Although they are not a huge demographic, Asian women aged 30-50 have stronger turnout in the simulations where Dems ultimately win. This jibes well with the demographic strategy pursued by our candidate: Ashwin’s team planned significant outreach to Southeast Asian voters in the district.
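To make the regression step concrete, here is a minimal sketch that fits a linear probability model (ordinary least squares on a 0/1 win indicator) to simulated turnout. The post does not say which kind of regression the team used, so the model choice is an assumption, as are the two-bucket district, its numbers, and the helper names `solve`, `ols`, and `simulate_turnouts`.

```python
import random

# Hypothetical two-bucket district; every number here is invented.
SIZES = {"bucket_a": 10_000, "bucket_b": 10_000}
SUPPORT = {"bucket_a": 0.65, "bucket_b": 0.35}  # Democratic share of ballots


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + features (normal equations)."""
    x = [[1.0] + list(row) for row in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate_turnouts(n_sims, rng):
    """Return per-simulation turnout features and a 0/1 Democratic win indicator."""
    rows, wins = [], []
    for _ in range(n_sims):
        turnout = {k: min(1.0, max(0.0, rng.gauss(0.5, 0.1))) for k in SIZES}
        ballots = sum(SIZES[k] * turnout[k] for k in SIZES)
        dem = sum(SIZES[k] * turnout[k] * SUPPORT[k] for k in SIZES)
        rows.append([turnout[k] for k in SIZES])
        wins.append(1.0 if dem > ballots / 2 else 0.0)
    return rows, wins


if __name__ == "__main__":
    rows, wins = simulate_turnouts(5_000, random.Random(0))
    intercept, coef_a, coef_b = ols(rows, wins)
    print(f"turnout coefficient, bucket_a: {coef_a:+.2f}")
    print(f"turnout coefficient, bucket_b: {coef_b:+.2f}")
```

In this toy district the Democratic-leaning bucket should come out with a positive turnout coefficient and the other bucket with a negative one, which is the same green/red pattern the precinct map conveys. At the scale of one million simulations, a numerical library would be a more practical choice than this hand-rolled solver.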
Finally, we created a pre-built dashboard that takes these simulation outcomes and makes the support and turnout levels interactive, so that a user can see how their likelihood of victory changes if they cause, for example, a 5% increase in expected turnout in a given bucket of voters. You can do this for whichever dimensions you selected earlier in the process: for example, if you had produced precinct-level crosstabs, the sliders would change the turnout and support levels in different precincts.
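The what-if calculation behind a slider like that might look roughly like the sketch below, which reruns a small simulation after shifting one bucket's turnout. It is not the dashboard's code: the precinct names and figures, the `win_rate` and `with_turnout_shift` helpers, the Gaussian noise model, and the reading of "5%" as five percentage points are all assumptions.

```python
import random

# Hypothetical baseline crosstab: (voters, turnout rate, Democratic support rate).
BASELINE = {
    "precinct_1": (8_000, 0.50, 0.62),
    "precinct_2": (9_000, 0.60, 0.40),
}


def clamp(rate):
    """Keep a rate inside the valid range [0, 1]."""
    return min(1.0, max(0.0, rate))


def win_rate(buckets, n_sims=5_000, noise_sd=0.05, seed=0):
    """Share of simulated elections where Democrats take over half the ballots."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        ballots = dem = 0.0
        for voters, turnout, support in buckets.values():
            cast = voters * clamp(rng.gauss(turnout, noise_sd))
            ballots += cast
            dem += cast * clamp(rng.gauss(support, noise_sd))
        wins += dem > ballots / 2
    return wins / n_sims


def with_turnout_shift(buckets, name, points):
    """Return a copy of the crosstab with one bucket's turnout moved by `points`."""
    adjusted = dict(buckets)
    voters, turnout, support = buckets[name]
    adjusted[name] = (voters, clamp(turnout + points), support)
    return adjusted


if __name__ == "__main__":
    base = win_rate(BASELINE)
    boosted = win_rate(with_turnout_shift(BASELINE, "precinct_1", 0.05))
    print(f"baseline win rate: {base:.1%}")
    print(f"with +5 points turnout in precinct_1: {boosted:.1%}")
```

Reusing the same seed for both scenarios means they share the same random draws, so the difference between the two win rates reflects the turnout change rather than simulation noise.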
—————————————————————————————————————————
About the Author
Cody Braun was the project lead for the Ashwin Ramaswami fellow team in May 2024. He gives credit to Henry Randall and Aldo Polanco for doing most of the real work. Cody is a Vegas-based data engineer, data scientist, and sometimes tech co-founder. Let’s build something together.