2023

Posted on May 23, 2023 by Bradley Wieneke

A Sample Post Election Analysis Report

Haystaq prepares post election analysis reports on behalf of clients. Above is a sample from the Georgia 2020 December Runoff Election.

Georgia 2022 General Election Analysis 5/23/2023

Posted on May 23, 2023 by Bradley Wieneke

A Sample Post Election Analysis Report

Download an Excel file of this Report

Haystaq prepares post election analysis reports on behalf of clients. Above is a sample from the Georgia 2020 General Election.

Common Redistricting Criteria 1/10/2023

Posted on January 10, 2023 (updated February 2, 2023) by Haystaq DNA

Common Redistricting Criteria

Download a PDF of this Report

After each decennial Census, governments at all levels (e.g. congressional, state legislative, municipal, county, special districts) adjust the districts of their respective legislative bodies to reflect the changing demographics of their region. The individual or group drawing the map must be given criteria on what they can and cannot do when creating a map. Redistricting processes and rules vary across states, so map-drawers must be aware of the laws of the state and jurisdiction to avoid having their maps rejected or deemed unconstitutional. Some states will strictly define what redistricting criteria must be followed while others will not. Some municipalities or jurisdictions will also have previously defined their own criteria. Other jurisdictions will be able to adopt their own criteria.

This document is not intended to be a legal guide, but simply a discussion of the most common redistricting criteria.

There are some criteria that are part of almost all redistricting guidelines:

Equal Population — The population of each new district should be ‘equal’. When drawing congressional maps, court rulings have largely defined ‘equal’ as “within 1 person” based on the Census Bureau’s latest PL 94-171 file. Unless state statute dictates otherwise, counties, municipalities and special districts are not required to adhere to such a strict definition, and district populations can differ from one another by a small percentage. Some states define the allowed deviation; otherwise, the jurisdiction sets it itself. Some jurisdictions allow a deviation as low as 2%, others as high as 10%. When a rule is stated as a percentage, it must also specify whether the limit applies to the cumulative difference across districts or to each individual district’s difference from the ideal population. Consider a jurisdiction with a 5% deviation rule. A proposed map has one district with a -3% deviation (underpopulated) and another with a +4% deviation (overpopulated). Is the map acceptable because each district is under the 5% cutoff, or does the cumulative difference of 7% violate the rule?
Contiguity — The district must be one cohesive unit. For example, you can’t draw one part of a district in the southeast corner of a county and add an unattached region in the northwest corner of the county; those two sections need to be physically linked. Definitions of contiguity vary between states: some accept corner or point contiguity, particularly in areas where geographic limitations (e.g. bodies of water) are present. Sometimes parts of the jurisdiction itself are not contiguous (for example, physical islands or annexed pieces of land that do not touch the original jurisdiction), and these cannot be drawn to be contiguous. For more information, visit: https://redistricting.lls.edu/redistricting-101/where-are-the-lines-drawn/#contiguity
Comply with the Federal Voting Rights Act — Section Two of the Federal Voting Rights Act prohibits discrimination on the basis of race, color, or membership in a language minority group. Generally speaking, it means existing majority-minority districts should be preserved and new majority-minority districts should be created where possible. Jurisdictions often use CVAP (Citizen Voting Age Population) data to determine minority percentages, and the allowed source data should be specified (CVAP vs VAP vs PL file demographics). At the federal level, a majority-minority district must contain 50% plus one additional person of the same protected class. If a new majority-minority district can be drawn in a “reasonably compact” way, it must be drawn. Section Five of the VRA required certain states and jurisdictions with a history of racial gerrymandering to submit their redistricting plans to the Department of Justice for preclearance, but in 2013 the U.S. Supreme Court (Shelby County v. Holder) struck down the coverage formula that determined which jurisdictions were subject to it, leaving the preclearance requirement effectively inoperative.
Comply with State or Jurisdiction Voting Rights Laws – Federal VRA law is complicated, but largely dictates that a majority-minority district consists of a single protected minority comprising more than 50% of the population. State or local jurisdiction law can go further than federal law. State law might favor the creation of ‘coalition districts’ where multiple minorities together can form a minority district. Additionally, states or jurisdictions can write Voting Rights Laws that do not use a hard 50% cutoff, but rather an “ability to elect” cutoff. This means that if a district is 40% minority, but that 40% along with allied voters is large enough to see their ‘candidate of choice’ win, it can still be considered a minority district. Federal law, however, supersedes state and municipal law.
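
The 5% question raised under Equal Population can be made concrete with a short sketch. The function name `check_deviation_rule` and its two readings of the rule are illustrative assumptions, not language from any statute; the inputs are the -3% and +4% districts from the example above.

```python
def check_deviation_rule(deviations_pct, limit_pct):
    """Evaluate a deviation limit under two possible readings of the rule.

    Hypothetical helper for illustration only; a real jurisdiction's rule
    should be read from its own statute or adopted criteria.
    """
    # Reading 1: every district must individually stay within the limit.
    per_district_ok = all(abs(d) <= limit_pct for d in deviations_pct)
    # Reading 2: the spread between the most underpopulated and the most
    # overpopulated district must stay within the limit.
    overall_range = max(deviations_pct) - min(deviations_pct)
    range_ok = overall_range <= limit_pct
    return per_district_ok, overall_range, range_ok


per_district_ok, overall_range, range_ok = check_deviation_rule([-3.0, 4.0], 5.0)
print(per_district_ok)  # True: each district is within 5% of ideal
print(overall_range)    # 7.0
print(range_ok)         # False: the 7-point spread exceeds 5%
```

The same map passes under one reading and fails under the other, which is why the criteria should state which reading applies.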

There are other criteria that are widely considered to be best practices:

Use Practical Geographic Boundaries — This criterion means that, where possible, practical geographic boundaries should be used to create the lines of a district. This can include major roadways, municipal boundaries, waterways, park boundaries, school district boundaries, etc.
Where Possible, Make Districts Compact — In terms of the overall shape, compactness refers to how closely packed together the district is. A perfectly compact district would be a circle, but it is impossible to fill a map with only circular districts. That said, districts should avoid sprouting ‘arms’ and the distance to the center of a district should be similar all along the boundaries (exceptions to compactness are often made for Section Two districts or to keep communities intact). For more information on multiple standard measures of compactness (e.g. Polsby-Popper, Reock, Convex Hull etc.), visit: https://redistricting.lls.edu/redistricting-101/where-are-the-lines-drawn/#compactness
Keep as Many Communities of Interest (COIs) Intact as Possible — A community of interest is often defined as a group of people who have common policy concerns and would benefit from being maintained in a single district. Examples of this might be a neighborhood in a city, an HOA, a school district, or a Native American reservation. Redistricting often allows for public input so that citizens can define their own communities of interest.
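
As a rough illustration of one of the compactness measures named above, the Polsby-Popper score compares a district's area to the area of a circle with the same perimeter: 4π × area / perimeter². The sketch below applies it to three idealized shapes with made-up dimensions; real districts would need areas and perimeters measured from map geometry.

```python
import math


def polsby_popper(area, perimeter):
    """Polsby-Popper score: 4 * pi * area / perimeter**2.

    A circle scores 1.0; long, thin, or irregular shapes score closer to 0.
    """
    return 4 * math.pi * area / perimeter ** 2


# Idealized shapes (hypothetical dimensions, any consistent unit).
circle = polsby_popper(math.pi * 1.0 ** 2, 2 * math.pi * 1.0)  # radius 1
square = polsby_popper(1.0 * 1.0, 4 * 1.0)                     # 1 x 1 square
strip = polsby_popper(10.0 * 0.1, 2 * (10.0 + 0.1))            # 10 x 0.1 'arm'

print(round(circle, 3), round(square, 3), round(strip, 3))
```

The thin strip scores far lower than the square, which is the numerical counterpart of the advice that districts should avoid sprouting ‘arms’.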

Other criteria that are sometimes used to diminish the effects of gerrymandering or are adopted for other considerations:

Document All Iterative Changes to a Map and Summarize the Reasons Why — This criterion is sometimes imposed on map drawers to show that they are adhering to the defined criteria and so that the public can see the versions and iterations the map went through. This is done to increase the transparency of the process.
Draw Maps Blind to Incumbent Addresses — To make sure that no one is creating maps to advantage or disadvantage certain incumbents, the map drawers can be kept blind to the addresses of the incumbent representatives. This hopefully results in map drawers creating the best maps they can without taking political considerations into account.
Draw Maps to Purposefully Avoid Grouping Incumbents into a New District — This criterion is the opposite of the above. It is sometimes used when one political party has control over the map drawing process and could use it to try to squeeze out members of the opposing party. There have been cases where partisan map drawers have purposefully grouped incumbents of the same party into one district so that only one can win, or grouped incumbents of one party with a stronger incumbent of the other party to ensure they are not re-elected. Making sure incumbents are not grouped together can eliminate this, but this criterion has also been used to protect incumbents from political challenge and to keep the detrimental effects of previous maps intact.
Multi-Member Districts – While federal law prohibits multi-member districts for Congress, many state legislatures elect several representatives from a single district.
Nesting Requirement – In states where districts are “nested,” each lower house (e.g. state house) district is a subdivision of a larger upper house (e.g. state senate) district.
Floterial District – Sometimes, individual districts may have a surplus population that is not significant enough to gain additional representation but still causes issues with deviation. A floterial district combines and overlaps several of these districts to meet the population requirement and add an extra representative for the combined population.
Draw Maps without Access to Political Data — While VRA evaluation requires political data, redistricting criteria often require map drawers to draw the remaining districts without having election results or voter registration loaded as a layer. This prevents map drawers from trying to create partisan-favoring districts.
Draw to Create as Many Politically Competitive Districts as Possible — This is almost the opposite of the criterion above. In this version, the map drawer is directed to draw as many districts as possible where results of historic statewide elections in the newly created districts are as close to 50/50 as possible.
Draw without Considering Past Districts (aside from VRA districts) — Oftentimes, old districts were drawn for political purposes and map drawers are directed to start from scratch.
Draw to Keep the Core of Past Districts Intact — This is the opposite of the criterion above. One of the easiest and perhaps quickest ways to draw a map is to simply update old districts to reflect population changes. However, adopting this criterion risks locking in any detrimental effects of old plans.
Preserve as Many Voting Precincts As Possible — Oftentimes, precinct maps will be redrawn on different timelines. It is inconvenient for precincts to be split by one or more districts and have to offer different types of ballots depending which part of the precinct voters live in. For this reason redistricting consultants often try to keep voting precincts intact within a district.
Redistributing Prisoners to their Home Addresses — If a jurisdiction contains one or more large prisons, they can throw off the average number of voters in a district (in most states, prisoners cannot vote) or distort VRA calculations. Some states or jurisdictions will attempt to assign prisoners back to their last known address for redistricting purposes. This is a good practice, but it is a relatively heavy data lift and one better adopted at the state level than done individually by a county or city. If a state does prisoner allocation, it generally releases a modified PL 94-171 file with the prisoner reallocation already done. Map drawers can load this file into their drawing software to account for this provision. For more information on states that reallocate prisoners to their home addresses, visit: https://redistrictingdatahub.org/data/ongoing-data-projects/states-that-adjust-the-census-data-for-redistricting/.
Redistributing College Students to their Home Addresses — Similar to the above. Again, this one is generally better if it is done at the state level.
Draw During Public or Recorded Meetings — It is possible to draw the maps during public meetings so that the public can see the process and, in some cases, provide community feedback. This is done to increase transparency and build trust in the system but requires multiple and/or longer public meetings.
Proportionality – The share of seats each political party wins should be similar to the share of voters in the jurisdiction/district who favor that party, as measured by voter preferences in historical elections. For example, if the population of a jurisdiction/district allotted 10 seats is 60% Democratic and 40% Republican, the jurisdiction/district would ideally elect 6 Democrats and 4 Republicans as representatives. This criterion is most common in multi-member districts.
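
The proportionality example can be worked through in a few lines. This is a minimal sketch: the function names are hypothetical, the 60/40 split and 10 seats come from the example above, and the 8-to-2 outcome is an invented comparison case rather than a real result.

```python
def proportional_seats(vote_shares, total_seats):
    """Seats each party would hold if seats tracked voter share exactly."""
    return {party: share * total_seats for party, share in vote_shares.items()}


def seat_vote_gap(vote_shares, seats_won):
    """Seat share minus voter share for each party (positive = over-represented)."""
    total = sum(seats_won.values())
    return {party: seats_won.get(party, 0) / total - share
            for party, share in vote_shares.items()}


shares = {"Democratic": 0.60, "Republican": 0.40}

# Ideal outcome from the example: 6 Democratic seats and 4 Republican seats.
print(proportional_seats(shares, 10))

# Hypothetical map electing 8 Democrats and 2 Republicans: the gap is about
# +0.2 for one party and about -0.2 for the other.
print(seat_vote_gap(shares, {"Democratic": 8, "Republican": 2}))
```

A gap near zero for every party indicates a proportional outcome; how large a gap is tolerable is a policy choice this sketch does not settle.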

Understanding and adhering to such a myriad of criteria can be a difficult task, but HaystaqDNA can help you draw fair and compliant districts. For more information on Haystaq’s services call us at 202-548-2562 or visit our contact page – https://haystaqdna.com/contact/

Common Redistricting Terms

At-Large Districts

Populations in at-large districts are not divided into several districts and the entire region votes as a cohesive unit. For example, states that are assigned one member in the House hold at-large elections across the entire state.

Reapportionment

The Permanent Apportionment Act of 1929 set the number of seats in the U.S. House of Representatives to 435. After each decennial Census, seats are redistributed to ensure proportional representation in Congress for all states.

P.L. 94-171 Redistricting Data

Public Law (P.L.) 94-171 requires the Census Bureau to provide detailed demographic data needed for redistricting. At the most granular level, the data summarizes the demographic makeup (race/ethnicity, age, etc.) of each Census Block.

Voting Age Population (VAP)

Total population of individuals who are at least 18 years old.

Citizen Voting Age Population (CVAP)

Total population of individuals who are of voting age and are U.S. citizens. This data is usually derived from the American Community Survey (ACS) data set.

Ideal Population

The ideal population size for each district is found by dividing the total population of a region by the number of districts in the respective representative body. Ideal Population = Total Population/ # of Districts

Raw Deviation

The numerical difference of a district’s population from the ideal population. Raw Deviation = District Population – Ideal Population

Percent of Deviation

The proportional difference of a district’s population from the ideal population expressed as a percentage. % of Deviation = (Raw Deviation / Ideal Population) × 100

Deviation Range

Total deviation across all districts. Found by calculating the percent of deviation for each individual district and adding the absolute values of the minimum and maximum deviations. Deviation Range = abs(min(% of Deviation)) + abs(max(% of Deviation))
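
The four definitions above (Ideal Population, Raw Deviation, Percent of Deviation, Deviation Range) chain together, so a small sketch may help. The district names and populations below are hypothetical; the formulas are the ones given in this glossary.

```python
def deviation_report(district_pops):
    """Apply the glossary formulas to a dict of district -> population."""
    # Ideal Population = Total Population / # of Districts
    ideal = sum(district_pops.values()) / len(district_pops)
    rows = {}
    for name, pop in district_pops.items():
        raw = pop - ideal          # Raw Deviation
        pct = 100 * raw / ideal    # Percent of Deviation
        rows[name] = {"raw": raw, "pct": pct}
    pcts = [row["pct"] for row in rows.values()]
    # Deviation Range = abs(min(% of Deviation)) + abs(max(% of Deviation))
    dev_range = abs(min(pcts)) + abs(max(pcts))
    return ideal, rows, dev_range


# Hypothetical three-district plan with 30,000 people in total.
pops = {"District A": 9700, "District B": 10400, "District C": 9900}
ideal, rows, dev_range = deviation_report(pops)

print(ideal)      # 10000.0
for name, row in rows.items():
    print(name, row["raw"], row["pct"])
print(dev_range)  # 7.0 (from -3.0% and +4.0%)
```

Note that the Deviation Range formula as written assumes the least populated district is below the ideal and the most populated is above it, which holds whenever the districts are not all exactly equal.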

Cracking

Splitting communities into various districts to dilute said group’s voting power.

Packing

The opposite of cracking; concentrating certain populations into a limited number of districts to restrict the number of representatives they can elect.

Voting Rights Act (VRA)

Passed in 1965, the Federal VRA protects voters belonging to minority and/or protected groups from discrimination.

Majority-Minority District

Electoral districts in which the majority of constituents (more than 50% of the citizen voting-age population) belong to racial or ethnic minorities. Where possible, creating/preserving such a district would be required by the VRA to ensure these groups can elect their candidates of choice.

Coalition District

Districts in which individual minority groups do not form a majority but vote together with other minority groups to form a coalition vote bloc in which the sum of these groups forms the majority and is able to elect a candidate of choice. Such a district, however, is not legally required by the VRA.

Opportunity District

A district where a minority group is able to elect their candidate of choice because the majority group votes similarly to them. This is also not legally required by the VRA.
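
The demographic part of these three definitions can be sketched as a simple check on CVAP counts. The function name, group labels, and counts are hypothetical, and the strict "more than half" threshold is an assumption based on the "50% plus one" wording used earlier in this document. Whether groups actually vote together, or whether a district is an opportunity district, depends on voting behavior and cannot be read from these counts alone.

```python
def classify_district(cvap_by_group, total_cvap):
    """Label a district from hypothetical CVAP counts per minority group."""
    # Majority-minority: one group alone exceeds half of the district's CVAP.
    for group, count in cvap_by_group.items():
        if 2 * count > total_cvap:
            return f"majority-minority ({group})"
    # Possible coalition district: the groups exceed half only when combined.
    if 2 * sum(cvap_by_group.values()) > total_cvap:
        return "possible coalition district"
    return "neither"


print(classify_district({"Group A": 5200, "Group B": 800}, 10000))
# majority-minority (Group A)
print(classify_district({"Group A": 3000, "Group B": 2500}, 10000))
# possible coalition district
print(classify_district({"Group A": 2000, "Group B": 1000}, 10000))
# neither
```

The word "possible" is deliberate: a coalition district also requires evidence that the groups vote as a bloc, which is the kind of question a racial bloc voting analysis addresses.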

Community of Interest (COI)

A community of interest (COI) is a group of people with shared concerns, interests, and characteristics. Every COI is unique; they might be formed around neighborhoods or the physical landscape, cultures, values, and many other things. Because of these shared interests and concerns, a common redistricting criterion is that COIs be considered during the process.

Incumbent

The current holder of an office or position.

Incumbency Criteria

Requires the incumbent’s house to remain in the district they represent.

Haystaq – Abortion Impact on State Legislatures 10/11/2022

Posted on October 11, 2022 (updated February 2, 2023) by Haystaq DNA

Expanding Democrats’ Holdings in State Legislatures by Leveraging Abortion

Download a PDF of this Report

This white paper takes advantage of Haystaq’s national issue scores to identify expansion targets for the Democratic Party in state legislative elections.^[1] Based on feedback we have received, we focused on the following chambers in bold that we believe are pickup opportunities when using abortion-related messaging. This analysis considers all seats being contested by the two major parties, excluding only those rated as “Solid D” by CNalysis (whose seat ratings this paper adopts).

Expansion Targets 2022^[2]	Protects 2022 Cycle	Long-term Opportunities
Michigan House	Nevada House and Senate	Georgia House only
Pennsylvania House	Maine Senate	North Carolina House
Arizona Senate	Minnesota House	North Carolina Senate
Michigan Senate

The table below briefly summarizes the legal status of abortion in the targeted states and public opinion in each state overall. It is quite possible that abortion may prove to be a more salient issue in states where the current abortion policy is at odds with the opinion of the majority of voters.

State	Legal	Constitutional Protection	Legislative Control	% supporting Abortion Statewide	Abortion Rights Under Threat
Arizona	No (pending court challenges)	No	GOP	56%	Yes
Pennsylvania	Yes	No	GOP	56%	Yes^[3]
Minnesota	Yes	Yes (non-explicit)	Split	54%	No
Michigan	Yes (pending court action)	No	GOP	55%	No
Nevada	Yes	Yes	Dem	63%	No
Maine	Yes	No	Dem	62%	No
North Carolina	Yes (viability)	No	GOP	49%	Yes
Georgia	Yes (6-weeks)	No	GOP	49%	Yes
Texas	No	No	GOP	46%	Yes

The key finding is that, given that the vast majority of Americans disapprove of the overturning of Roe v. Wade, there are several dozen GOP-held state legislative seats with pro-choice majorities where Haystaq’s data can be used to identify and target pro-choice voters, and many more seats where vulnerable Democrats can use the abortion issue to bolster their electoral position. If Democrats succeed in unifying the pro-choice vote, the party could gain control of both houses of the Michigan legislature along with the Pennsylvania House and the Arizona Senate. Of these legislative chambers, the Michigan Senate will be the easiest to flip, followed by the PA House and AZ Senate.^[4] Evidence from field testing and from recent special elections shows that this issue indeed moves vote choice.

The tables below highlight seats in each legislative chamber where abortion-related messaging can make a difference, with expansion opportunity seats in bold and districts ranked by the salience of the abortion issue. The race ratings key below is taken from CNalysis. For each chamber, the number of seats currently held by the GOP that have pro-choice majorities is noted, and the information is underlined when the number of seats is enough to flip control of the chamber to the Democrats.

Race Ratings Key

Solid R
Very Likely R
Likely R
Lean R
Tilt R
Tossup
Tilt D
Lean D
Likely D
Very Likely D

Expansion Targets

MI Senate (Chamber Rating: Tossup)

3 flips possible
3 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
13	D	203,108	96,391	124,993	59,925	62.2%
21	R	198,356	72,090	124,412	42,377	58.8%
4	D	209,569	71,730	121,671	41,947	58.5%
14	R	201,075	82,691	109,008	47,801	57.8%
28	R	181,207	74,818	104,858	42,582	56.9%
9	D	185,843	75,889	102,357	40,839	53.8%
11	D	198,566	64,910	108,308	33,591	51.8%
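
As a cross-check on how these tables can be read, the sketch below recomputes "Percent Pro Choice" and the "flips possible" count from the MI Senate rows above. It assumes, based on the numbers themselves, that the percentage is the pro-choice likely-voter count divided by likely voters, and that a possible flip is a GOP-held seat with a pro-choice majority; the function names are hypothetical.

```python
# Rows copied from the MI Senate table above:
# (district, current hold, likely voters, pro-choice likely voters)
mi_senate = [
    (13, "D", 96391, 59925),
    (21, "R", 72090, 42377),
    (4, "D", 71730, 41947),
    (14, "R", 82691, 47801),
    (28, "R", 74818, 42582),
    (9, "D", 75889, 40839),
    (11, "D", 64910, 33591),
]


def pct_pro_choice(likely_voters, pro_choice_likely):
    """Pro-choice share of likely voters, as a percentage (assumed definition)."""
    return 100 * pro_choice_likely / likely_voters


def possible_flips(rows):
    """Districts currently held by the GOP where pro-choice likely voters exceed 50%."""
    return [district for district, hold, likely, pro_choice in rows
            if hold == "R" and pct_pro_choice(likely, pro_choice) > 50]


flips = possible_flips(mi_senate)
print(flips)            # [21, 14, 28]
print(len(flips) >= 3)  # True: matches '3 flips possible' against 3 seats needed
```

The same two functions can be pointed at any of the chamber tables that follow, since they share the same columns.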

PA House (Chamber Rating: Likely R)

15 flips identified
12 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
172	D	33,927	10,996	22,451	7,861	71.5%
45	D	43,874	15,931	27,537	10,751	67.5%
121	D	31,109	9,160	19,974	6,054	66.1%
25	D	42,922	15,281	25,405	9,733	63.7%
16	D	40,813	14,148	21,510	8,880	62.8%
118	D	42,605	17,678	23,913	11,075	62.6%
33	R	42,033	17,389	22,957	10,729	61.7%
2	D	40,142	15,108	22,620	9,219	61.0%
151	R	45,408	20,190	25,941	12,161	60.2%
61	D	45,892	22,495	25,552	12,769	56.8%
29	R	49,874	24,003	26,528	13,555	56.5%
137	R	44,561	15,873	23,385	8,956	56.4%
189	R	37,397	8,393	20,618	4,691	55.9%
53	D	40,307	14,872	22,866	8,240	55.4%
115	D	37,081	8,163	20,666	4,517	55.3%
39	R	44,855	17,716	21,461	9,763	55.1%
146	D	41,054	12,209	22,100	6,662	54.6%
30	R	44,426	20,948	22,894	11,429	54.6%
168	R	42,387	19,484	22,816	10,466	53.7%
3	D	44,772	19,822	22,179	10,629	53.6%
26	R	42,572	16,872	22,166	9,037	53.6%
142	R	43,865	17,959	22,188	9,544	53.1%
18	R	39,719	12,975	22,449	6,835	52.7%
51	R	38,160	12,060	16,212	6,323	52.4%
144	R	45,006	18,380	22,831	9,625	52.4%
74	D	37,401	10,581	18,538	5,449	51.5%
82	R	37,986	13,096	20,020	6,678	51.0%
120	R	39,922	15,036	18,435	7,586	50.5%

AZ Senate (Chamber Rating: Very Likely R)

3 flips possible
2 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
11	R	112,243	21,923	79,713	16,064	73.3%
18	D	158,642	75,347	111,147	48,576	64.5%
8	R	128,059	36,355	87,767	21,919	60.3%
23	R	120,100	24,701	68,004	12,805	51.8%

MI House (Chamber Rating: Lean R)

8 flips possible
3 seats needed to gain control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
20	D	74,703	33,317	44,633	21,164	63.5%
2	D	65,447	18,182	40,242	10,875	59.8%
21	D	58,957	28,645	36,086	16,881	58.9%
40	D	67,752	30,234	41,519	17,688	58.5%
69	D	72,560	22,945	41,253	13,418	58.5%
22	D	74,522	38,574	42,596	21,876	56.7%
73	R	55,292	23,075	31,876	13,015	56.4%
80	R	66,445	27,813	38,367	15,675	56.4%
81	R	68,707	30,413	38,364	16,737	55.0%
27	D	71,170	27,671	38,741	15,117	54.6%
31	D	72,470	25,464	38,074	13,908	54.6%
55	D	66,620	31,869	38,131	17,284	54.2%
28	D	69,534	22,056	36,538	11,800	53.5%
83	R	59,140	14,944	36,566	7,973	53.4%
54	D	69,211	32,757	40,086	17,424	53.2%
84	R	66,713	23,496	38,486	12,497	53.2%
57	R	63,492	19,714	31,956	10,376	52.6%
58	R	65,522	21,407	34,247	11,175	52.2%
48	R	73,170	35,774	36,904	18,598	52.0%
61	D	71,308	24,653	38,596	12,660	51.4%
76	D	70,433	29,252	37,021	14,962	51.1%
68	D	73,016	26,474	39,197	13,513	51.0%

Protects

NV House (Chamber Rating: Lean D)

1 flip possible

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
42	D	44,553	12,768	25,434	7,464	58.5%
8	D	45,109	11,885	24,845	6,932	58.3%
1	D	51,497	17,487	28,641	10,081	57.6%
16	D	44,548	11,222	26,180	6,403	57.1%
9	D	45,980	14,436	24,380	7,992	55.4%
34	D	46,428	14,224	26,023	7,855	55.2%
41	D	49,126	15,291	26,150	8,440	55.2%
35	D	49,196	15,937	25,225	8,609	54.0%
5	D	46,656	15,138	25,280	8,115	53.6%
29	D	47,750	15,236	25,112	8,103	53.2%
3	D	43,360	12,115	24,085	6,425	53.0%
25	R	49,102	25,773	25,767	13,467	52.3%
21	D	49,808	18,523	25,623	9,670	52.2%
12	D	43,998	13,569	24,805	6,950	51.2%

NV Senate (Chamber Rating: Very Likely D)

0 flips possible
None of the districts up for election this cycle has a pro-choice majority among likely voters, so the districts below are analyzed based on registered rather than likely voters.

District With CN Analysis Rating	Current Hold	Reg Voters	Count Pro Choice	Percent Pro Choice
8	D	96,819	48,480	50%
9	D	90,550	49,824	55%
12	R	98,969	51,790	52.3%

ME Senate (Chamber Rating: Tossup)

0 flips possible
GOP needs nine seats to gain control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
21	D	25,931	7,124	17,802	4,036	56.7%
32	D	26,344	8,889	14,976	4,884	54.9%
31	D	28,733	9,933	15,916	5,400	54.4%
34	D	31,246	12,996	15,178	6,576	50.6%

MN House (Chamber Rating: Tilt R)

0 flips possible
GOP needs four seats to gain control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
53B	D	25,309	7,593	16,415	4,114	54.2%

Long Term Opportunities

NC House (Chamber Rating: Very Likely R)

8 flips possible
18 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
50	D	59,865	27,811	38,018	19,911	71.6%
54	D	61,270	30,619	38,775	21,297	69.6%
36	D	61,463	25,334	39,993	17,206	67.9%
40	D	65,822	31,750	41,228	20,272	63.8%
2	R	59,451	25,007	34,722	15,938	63.7%
115	D	60,045	24,852	35,735	15,817	63.6%
47	D	47,726	11,876	25,628	7,539	63.5%
35	D	60,762	24,066	37,173	15,226	63.3%
48	D	48,841	13,585	28,474	8,535	62.8%
45	R	47,078	12,215	28,417	7,652	62.6%
105	D	56,324	20,484	35,701	12,659	61.8%
98	R	60,543	22,631	34,947	13,450	59.4%
104	D	59,160	27,187	35,184	16,127	59.3%
9	D	55,901	19,298	32,233	11,185	58.0%
103	D	60,482	26,478	35,050	15,210	57.4%
62	R	64,967	28,217	38,258	16,097	57.0%
32	D	52,567	17,380	29,592	9,815	56.5%
37	R	61,579	23,314	33,283	12,959	55.6%
73	R	51,777	17,394	27,506	9,460	54.4%
43	R	52,756	16,831	27,273	8,767	52.1%
74	R	62,114	26,190	32,873	13,619	52.0%
20	R	62,597	24,229	30,376	12,350	51.0%
24	D	54,889	18,154	28,798	9,097	50.1%

NC Senate (Chamber Rating: Very Likely R)

2 flips possible
4 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
17	D	136,972	53,338	81,240	32,867	61.6%
24	R	115,536	30,281	63,238	18,564	61.3%
18	D	135,437	53,694	79,153	32,830	61.1%
19	D	128,229	37,142	76,536	22,350	60.2%
42	R	143,920	59,588	87,923	35,736	60.0%
3	D	136,151	47,483	69,517	24,703	52.0%

GA House (Chamber Rating: Very Likely R)

1 flip possible
14 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
54	D	38,541	14,555	28,391	9,010	61.9%
106	D	39,923	15,464	22,184	8,580	55.5%
50	D	36,503	13,086	23,946	7,105	54.3%
101	D	38,378	12,387	21,243	6,550	52.9%
35	R	37,535	11,982	19,972	6,280	52.4%
154	D	40,566	14,684	25,057	7,670	52.2%
105	D	38,221	12,695	20,704	6,582	51.8%

TX House (Chamber Rating: Very Likely R)

2 flips possible
10 seats needed for control

District With CN Analysis Rating	Current Hold	Reg Voters	Likely Voters	Count Pro Choice	Count Pro Choice Likely Voters	Percent Pro Choice
35	D	72,311	12,898	62,774	10,016	77.7%
41	D	97,046	24,425	81,649	17,763	72.7%
74	D	109,198	25,412	87,426	17,676	69.6%
47	D	129,733	70,610	83,307	43,880	62.1%
135	D	99,870	25,133	66,184	15,587	62.0%
37	D	96,832	23,604	75,297	14,532	61.6%
148	D	89,635	22,825	59,411	13,594	59.6%
45	D	121,123	43,602	74,356	24,199	55.5%
105	D	75,258	20,157	50,794	11,005	54.6%
34	D	103,130	24,252	46,370	12,661	52.2%
118	R	113,023	29,691	70,522	15,032	50.6%
31	R	111,130	35,801	71,456	18,112	50.6%

Works Cited

“Abortion in Nevada.” Wikipedia, 3 Aug. 2022, en.wikipedia.org/wiki/Abortion_in_Nevada#History. Accessed 9 Aug. 2022.

“Arizona Senate.” Wikipedia, 25 July 2022, en.wikipedia.org/wiki/Arizona_Senate. Accessed 9 Aug. 2022.

“Arizona State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/arizona. Accessed 9 Aug. 2022.

Center for Reproductive Rights. “Abortion Laws by State.” Center for Reproductive Rights, reproductiverights.org/maps/abortion-laws-by-state/.

Cohn, Nate. “Do Americans Support Abortion Rights? Depends on the State.” The New York Times, 4 May 2022, www.nytimes.com/2022/05/04/upshot/polling-abortion-states.html.

“Georgia House of Representatives.” Wikipedia, 16 June 2022, en.wikipedia.org/wiki/Georgia_House_of_Representatives. Accessed 16 Aug. 2022.

“Maine State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/maine. Accessed 9 Aug. 2022.

“Michigan Senate.” Wikipedia, 18 May 2022, en.wikipedia.org/wiki/Michigan_Senate. Accessed 9 Aug. 2022.

“Minnesota House of Representatives.” Wikipedia, 24 Mar. 2022, en.wikipedia.org/wiki/Minnesota_House_of_Representatives. Accessed 9 Aug. 2022.

“Minnesota State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/minnesota. Accessed 9 Aug. 2022.

“Nevada Assembly.” Wikipedia, 7 June 2021, en.wikipedia.org/wiki/Nevada_Assembly.

“Nevada State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/nevada. Accessed 9 Aug. 2022.

“North Carolina House of Representatives.” Wikipedia, 1 Aug. 2022, en.wikipedia.org/wiki/North_Carolina_House_of_Representatives. Accessed 16 Aug. 2022.

“North Carolina Senate.” Wikipedia, 8 Oct. 2020, en.wikipedia.org/wiki/North_Carolina_Senate.

“North Carolina State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/north-carolina. Accessed 16 Aug. 2022.

“Pennsylvania House of Representatives.” Wikipedia, 30 July 2022, en.wikipedia.org/wiki/Pennsylvania_House_of_Representatives. Accessed 9 Aug. 2022.

“Pennsylvania State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/pennsylvania. Accessed 9 Aug. 2022.

“Michigan House of Representatives.” Wikipedia, 31 Oct. 2019, en.wikipedia.org/wiki/Michigan_House_of_Representatives. Accessed 7 Nov. 2019.

“Texas House of Representatives.” Wikipedia, 25 Nov. 2019, en.wikipedia.org/wiki/Texas_House_of_Representatives.

[1] In Wyoming and Oklahoma, Democrats are not contesting enough seats to take control of the legislature. In West Virginia, the Dakotas, Tennessee, Kentucky, Indiana, and Utah, not enough Democratic candidates filed to take control of the upper houses of the legislature. In Massachusetts, the GOP is not contesting either house of the legislature, and in California the GOP is not contesting the state senate.

[2] The AZ House was not analyzed because CNalysis does not rate these races.

[3] Both houses of the Pennsylvania legislature are controlled by anti-abortion forces, and GOP gubernatorial nominee Doug Mastriano supports a total ban on abortion.

[4] The Michigan House will be difficult to flip because the Dems need to play defense in several challenging seats.

Report on Racial Bloc Voting Analysis for the State of Montana 10/03/2022

Posted on October 3, 2022 (updated February 2, 2023) by Haystaq DNA

Report on Racial Bloc Voting Analysis for the State of Montana

Prepared for:

The Montana Districting and Apportionment Commission

Download a PDF of this Report

May 6, 2022

Synopsis

Analysis of the State of Montana’s election results going back to 2014 shows evidence of racial bloc voting (RBV) in the five regions requested by the Montana Districting and Apportionment Commission. The analysis used general and party primary elections for ten races over that period. We show the results of Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis to support these findings. All analyses were run using two definitions of American Indian, “American Indian Any” and “American Indian Alone”, using Citizen Voting Age Population (CVAP) data. The following report walks through our findings.

Introduction

Figure 1: Map of the 5 regions for analysis defined by the Montana Districting and Apportionment Commission

Since 2004, Montana’s legislative maps have included 6 majority-minority House Districts (out of 100 districts total) and 3 majority-minority Senate Districts (out of 50 districts total). These majority-minority House and Senate Districts cover all or part of 7 reservations in Montana, which guided our RBV analysis. In addition to the three regions spelled out below that surround current majority-minority legislative districts, the Montana Districting and Apportionment Commission has requested that Haystaq DNA analyze the two major cities with the highest American Indian populations, Great Falls and Billings.

As of the 2016-2020 American Community Survey 5-year Estimates released by the U.S. Census, Montana has a citizen voting age population statewide that is 6.53% “American Indian Alone” and 8.09% “Any Part American Indian.” For minority demographic groups, it was requested that HaystaqDNA look at citizen voting age populations for “Any Part American Indian,” as well as “American Indian Alone” to determine if racially polarized voting exists and to what extent amongst those two categories.

Regions

HaystaqDNA performed the analysis on five regions in the State of Montana to determine if there was RBV in past elections. Our analysis includes all precincts within 17 counties, broken down into the following five regions at the request of the Redistricting Commission. While regions 1 through 3 were selected because they encompass current majority-minority House and Senate Districts, it was proposed that our analysis include all precincts within the 17 counties listed below to ensure coverage of precincts both on and off reservation lands and both within and outside of current majority-minority legislative districts.

Region 1 – Blackfeet & Flathead Reservations (SD 8)

Reservation	Senate District	House District	County
Blackfeet	SD 8	HDs 15 & 16	Glacier Co. Precincts
Blackfeet	SD 8	HD 15	Pondera Co. Precincts
Flathead	SD 8	HD 15	Lake Co. Precincts
Flathead			Sanders Co. Precincts

Region 2 – Rocky Boy’s, Fort Belknap, & Fort Peck Reservations (SD 16)

Reservation	Senate District	House District	County
Rocky Boy’s	SD 16	HD 32	Hill Co. Precincts
Rocky Boy’s	SD 16	HD 32	Chouteau Co. Precincts
Fort Belknap	SD 16	HD 32	Blaine Co. Precincts
Fort Belknap	SD 16	HD 32	Phillips Co. Precincts
Fort Peck	SD 16	HD 31	Roosevelt Co. Precincts
Fort Peck	SD 16	HD 31	Valley Co. Precincts
Fort Peck			Daniels Co. Precincts
Fort Peck			Sheridan Co. Precincts

Region 3 – Crow & Northern Cheyenne Reservations (SD 21)

Reservation	Senate District	House District	County
Crow	SD 21	HDs 41 & 42	Big Horn Co. Precincts
Crow	SD 21	HD 42	Yellowstone Co. Precincts 42.1 and 42.2 only (remaining Yellowstone Co. Precincts part of Billings Region)
Crow	SD 21	HD 41	Rosebud Co. Precincts
	SD 21	HD 41	Powder River Co. Precincts

Region 4 – City of Billings

Reservation	Senate District	House District	County
			Yellowstone County Precincts (excluding Precincts 42.1 and 42.2 which are included in a majority-minority House and Senate District and part of Region 3)

Region 5 – City of Great Falls (Little Shell)

Reservation	Senate District	House District	County
			Cascade County Precincts

Elections

The following elections were analyzed using Ecological Inference, Homogeneous Precinct Analysis and Bivariate Regression Analysis in each of the 5 regions. There were 10 races for analysis total, with 4 from each presidential election cycle and 1 from each midterm between 2014 and 2020. For each of the races listed below, we analyzed the election results from the General, Democratic Primary, and Republican Primary elections where applicable.

2014 U.S. Senate
2016 President
2016 Congressional
2016 Governor
2016 Attorney General
2018 U.S. Senate
2020 President
2020 U.S. Senate
2020 Attorney General
2020 Auditor

Data

Election Results & Precinct Shapefiles

Our process began with creating precinct shapefiles joined to election results for each year of elections, to reflect precinct geographies in place at the time of the election. We obtained the shapefile of Montana Voting Precincts from the Montana State Library Services repository. A shapefile is a geospatial vector data format for geographic information system (GIS) software. We use these shapefiles to spatially analyze the election results in comparison with population data. We obtained historical election results at the precinct level from the Montana Secretary of State website. Using the information below, we manually consolidated election results or precinct geographies.

Four of the 17 counties included in the region of study have had a precinct change between 2014 and 2020. The following is a list of the Precinct Changes between 2014 and 2020 which were made manually on the 2020 precinct shapefile before running the rest of the analysis.

Precinct Changes between 2014 and 2016

Yellowstone County (Region 4 – City of Billings)

o Consolidation: Precincts 40.2 and 45.1 consolidated into Precinct 40-45

Precinct Changes between 2016 and 2018

Lake County (Region 1 – Flathead Reservation)

o Split: Precinct Pab 1 split from Ron 1

o Split: Precinct Pab 2 split from Ron 2

*The precinct geography for Ron 1 in 2014 and 2016 was equivalent to the current Pab 1 and Ron 1 combined on the Census Bureau’s 2020 Lake County VTDs. The precinct geography for Ron 2 in 2014 and 2016 was equivalent to the current Pab 2 and Ron 2 combined on the Census Bureau’s 2020 Lake County VTDs.

Phillips County (Region 2 – Fort Belknap Reservation)

o Consolidation: Precincts 2s, 5, 6, 8s, 11 and 12s consolidated into 11S

o Consolidation: Precincts 2n, 7, 8n, 9-1, 9-2, 12n and 16 consolidated into 11N

Valley County (Region 2 – Fort Peck Reservation)

o Consolidation: Precincts 1 and 2 consolidated into 31

o Consolidation: Precincts 2 and 4 consolidated into 33

o Consolidation: Precincts 5, 6, 7 and 8 consolidated into 34
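The consolidations listed above work like a crosswalk from older precinct names to the precinct names on the 2020 shapefile. The following is a minimal sketch of applying such a crosswalk to precinct-level results. It is not the workflow used for the report, which was done manually; the function name, the candidate labels, and the vote counts are hypothetical, and only the Valley County mapping of Precincts 5, 6, 7 and 8 into 34 is taken from the list.

```python
from collections import defaultdict

# Crosswalk from the list above (Valley County): old precinct -> consolidated precinct.
# Precincts that do not appear in the crosswalk keep their own name.
VALLEY_CROSSWALK = {"5": "34", "6": "34", "7": "34", "8": "34"}


def consolidate(results, crosswalk):
    """Sum candidate votes from old precincts into their consolidated precinct.

    results: {old_precinct: {candidate: votes}}
    crosswalk: {old_precinct: consolidated_precinct}
    """
    merged = defaultdict(lambda: defaultdict(int))
    for precinct, votes in results.items():
        target = crosswalk.get(precinct, precinct)
        for candidate, count in votes.items():
            merged[target][candidate] += count
    return {precinct: dict(votes) for precinct, votes in merged.items()}


# Hypothetical vote counts, for illustration only.
example_results = {
    "5": {"A": 10, "B": 20},
    "6": {"A": 5, "B": 5},
    "7": {"A": 1, "B": 2},
    "8": {"A": 4, "B": 3},
    "99": {"A": 7, "B": 7},  # hypothetical precinct with no boundary change
}
consolidated = consolidate(example_results, VALLEY_CROSSWALK)
```

Precinct splits, such as the Lake County changes, run in the opposite direction: the newer precincts would be merged back into the older geography for the earlier election years.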

Population Data

For the demographic population data, we used Citizen Voting Age Population (CVAP) from the American Community Survey 5-year Estimates released by the U.S. Census. These data are available at the Block Group level. We disaggregated the data from the Block Group level to the Block level, and then aggregated the data to precincts. This same process was followed for every election year, following the manual modifications made to the precinct shapefile reflecting the consolidations and changes made to precinct geography outlined above, to join CVAP from the year of the election to election results.
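The disaggregation and aggregation steps described above can be sketched as follows. The source does not state how Block Group CVAP was allocated to Blocks, so this sketch assumes an allocation proportional to each Block's population, which is one common choice. The function name, field names, and all numbers are hypothetical.

```python
from collections import defaultdict


def block_group_to_precinct(bg_cvap, blocks):
    """Allocate Block Group CVAP to Blocks, then sum Blocks into precincts.

    bg_cvap: {block_group_id: cvap_estimate}
    blocks: list of dicts with keys 'block_group', 'precinct', 'population'
    Assumption: CVAP is split across Blocks in proportion to Block population.
    """
    # Total Block population within each Block Group (the allocation denominator).
    bg_population = defaultdict(float)
    for block in blocks:
        bg_population[block["block_group"]] += block["population"]

    precinct_cvap = defaultdict(float)
    for block in blocks:
        total = bg_population[block["block_group"]]
        share = block["population"] / total if total else 0.0
        # Disaggregate to the Block, then aggregate to the Block's precinct.
        precinct_cvap[block["precinct"]] += bg_cvap[block["block_group"]] * share
    return dict(precinct_cvap)


# Hypothetical inputs, for illustration only.
bg_cvap = {"BG1": 100.0, "BG2": 50.0}
blocks = [
    {"block_group": "BG1", "precinct": "P1", "population": 30},
    {"block_group": "BG1", "precinct": "P2", "population": 70},
    {"block_group": "BG2", "precinct": "P1", "population": 20},
    {"block_group": "BG2", "precinct": "P2", "population": 20},
]
precinct_cvap = block_group_to_precinct(bg_cvap, blocks)
```

Because every Block Group's CVAP is fully distributed across its Blocks, the precinct totals sum back to the Block Group totals, which is a useful sanity check on any allocation rule.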

We performed the analysis using the “American Indian Alone” category, and also on an “Any Part American Indian” category that we created by combining “American Indian Alone”, “American Indian and White” and “American Indian and Black or African American.” We compared these American Indian categories to “White Alone” and “Other”, which is the combination of all remaining variables.

Methodologies

Homogeneous Precinct Analysis

Homogeneous precinct analysis is a method for estimating voting behavior by race or ethnicity by comparing voting patterns in “homogeneous precincts,” i.e. precincts that are composed of a single racial or ethnic group.

For example, if there is a precinct composed entirely of American Indian voters, and the voters within that precinct give 90% of their votes to Candidate X, then we know that 90% of the American Indian voters supported Candidate X. Since precincts are usually not exclusively one race or ethnicity, precincts that are 90% or more of a single race or ethnicity are usually considered homogeneous for the purposes of this analysis.

A drawback of homogeneous precinct analysis is that we are only able to perform it in a given region that has homogeneous precincts. For example, if a region does not have any precincts that are over 90% CVAP for the race or ethnicity of interest, we are unable to perform homogeneous precinct analysis.

For the purposes of our analysis, we define a homogeneous American Indian precinct to be any precinct that is 90% or more American Indian Alone or American Indian Any.
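As a rough illustration of the definition above, the sketch below selects precincts in which a group makes up 90% or more of CVAP and pools their votes. It is a simplified sketch, not the code used for the report; the function name, the group and candidate labels, and the counts are all hypothetical.

```python
from collections import defaultdict


def homogeneous_precinct_summary(precincts, group, threshold=0.90):
    """Pool votes across precincts where `group` is at least `threshold` of CVAP.

    precincts: list of dicts with 'cvap' {group: count} and 'votes' {candidate: count}
    Returns None when the region has no homogeneous precincts for the group.
    """
    selected = []
    for precinct in precincts:
        total_cvap = sum(precinct["cvap"].values())
        if total_cvap > 0 and precinct["cvap"].get(group, 0) / total_cvap >= threshold:
            selected.append(precinct)
    if not selected:
        return None

    votes = defaultdict(int)
    for precinct in selected:
        for candidate, count in precinct["votes"].items():
            votes[candidate] += count
    total_votes = sum(votes.values())
    group_cvap = sum(p["cvap"].get(group, 0) for p in selected)
    all_cvap = sum(sum(p["cvap"].values()) for p in selected)
    return {
        "n_precincts": len(selected),
        "group_share": group_cvap / all_cvap,
        "vote_shares": {c: v / total_votes for c, v in votes.items()} if total_votes else {},
    }


# Hypothetical precincts, for illustration only.
example_precincts = [
    {"cvap": {"AI": 95, "White": 5}, "votes": {"Candidate X": 90, "Candidate Y": 10}},
    {"cvap": {"AI": 50, "White": 50}, "votes": {"Candidate X": 50, "Candidate Y": 50}},
    {"cvap": {"AI": 180, "White": 20}, "votes": {"Candidate X": 160, "Candidate Y": 40}},
]
ai_summary = homogeneous_precinct_summary(example_precincts, "AI")
white_summary = homogeneous_precinct_summary(example_precincts, "White")
```

In this made-up data, two precincts qualify as homogeneous for the "AI" group and none qualify for "White", so the second call returns None, mirroring the drawback noted above for regions without homogeneous precincts.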

In Figure 2, we have an example of homogeneous precinct analysis from the 2016 General Election for Attorney General in Region 1.

Figure 2: Homogeneous precinct analysis: Attorney General, 2016 General Election, Region 1.

In Figure 2, we see in the first table that in all of Region 1, there are 3 homogeneous American Indian precincts, all in Glacier County. We see in the Total row that these precincts are overall 96.2% American Indian Any, and the vast majority of voters, 85.4%, voted for Larry Jent. In the second table, we can see that there are many more homogeneous white precincts in Region 1, and overall they are 94.6% white. The majority of voters in these precincts, 78.8%, voted for Tim Fox. These results show evidence of racial bloc voting in this election and region.

In the results section, we consider evidence of RBV to be present when we see greater than a 50% preference for the American Indian and White candidate of choice, and when those candidates differ.

Bivariate Regression Analysis

Bivariate regression analysis provides estimates of voting patterns by race or ethnicity across precincts, regardless of the existence of homogeneous precincts. The analysis shows the relationship between each candidate’s precinct-level vote share and the precinct-level CVAP for each race or ethnicity.

For example, in Figure 3, we can see the plot of % American Indian Alone, White Alone, and Other CVAP in comparison with the % of Votes for Tim Fox and Larry Jent in the 2016 General Election for Attorney General in Region 1.

Figure 3: Bivariate regression plot: Attorney General, 2016 General Election, Region 1.

In Figure 3, each point represents a precinct and its share of CVAP on the x-axis vs the candidate votes in that precinct on the y-axis. In the top left, we can see as the proportion of American Indian Alone CVAP increases, the share of votes for Tim Fox decreases. In the bottom right, we can see that as the proportion of American Indian Alone CVAP increases, the share of votes for Larry Jent increases. The inverse is true for the White Alone category. For the Other category, the % CVAP is so small that we cannot draw any conclusions.

In Figure 4, we can see the correlation coefficients for each bivariate relationship shown in Figure 3.

Figure 4: Bivariate regression correlation coefficients: Attorney General, 2016 General Election, Region 1.

For each plot, the correlation coefficient ranges from -1 to 1, where -1 is a perfect negative correlation and 1 is a perfect positive correlation. Its sign matches the direction of the slope of a line of best fit drawn on the corresponding plot in Figure 3, and its magnitude indicates how closely the precincts cluster around that line. Here, in Region 1, we see a strong positive correlation between American Indian Alone CVAP and votes for Larry Jent, with a coefficient of 0.9265. We also see a strong positive correlation between White Alone CVAP and votes for Tim Fox in the 2016 Attorney General’s race. These results show evidence of racial bloc voting in this election in Region 1.

In the results section, we consider evidence of RBV to be present when we see a strong positive correlation of greater than 0.5 for the American Indian and White candidate of choice, and when those candidates differ.
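To make the distinction between the correlation coefficient and the slope of the best-fit line concrete, here is a minimal sketch that computes both for one group and one candidate. It is not the code used for the report; the function name and the precinct values are hypothetical.

```python
from math import sqrt


def pearson_and_slope(x, y):
    """Return (Pearson correlation coefficient, least-squares slope) of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)  # strength and direction, always within [-1, 1]
    slope = sxy / sxx          # change in vote share per unit change in CVAP share
    return r, slope


# Hypothetical precincts: group CVAP share (x) and one candidate's vote share (y).
cvap_share = [0.1, 0.3, 0.5, 0.7, 0.9]
vote_share = [0.2, 0.35, 0.5, 0.65, 0.8]
r, slope = pearson_and_slope(cvap_share, vote_share)

# Illustrative threshold from the text: a strong positive correlation above 0.5.
meets_threshold = r > 0.5
```

These made-up points lie exactly on a line, so the correlation is at its maximum while the slope is a different number. That is why the 0.5 threshold in the text applies to the correlation coefficient rather than to the slope.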

Ecological Inference Analysis

Ecological Inference (EI) is the process of drawing conclusions about individual-level behavior from aggregate-level data. The process involves using aggregate (historically called “ecological”) data to draw conclusions about individual-level behavior when no individual-level data are available. The fundamental difficulty with such inferences is that many different possible relationships at the individual level can generate the same observation at the aggregate level. For example, there are a very large number of ways in which electoral support for a political candidate can break down among individual voters and still produce the same aggregate level of support. In the absence of individual-level measurement (for example in the form of surveys), such information needs to be inferred.

EI analysis builds on ecological regression analysis by incorporating method of bounds and maximum likelihood estimation statistical techniques. For our analysis, we use the eiCompare package in R, which builds on Gary King’s ei package in R.
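The method of bounds mentioned above can be illustrated on its own. The sketch below computes, for a single precinct, the lowest and highest support a group could have given a candidate, given only the group's share of the precinct and the candidate's overall vote share. It shows only this one ingredient, not the full EI model implemented in eiCompare or ei; the function name and the inputs are hypothetical.

```python
def method_of_bounds(group_share, candidate_share):
    """Deterministic bounds on a group's support for a candidate in one precinct.

    group_share: the group's share of the precinct electorate (0 < share <= 1)
    candidate_share: the candidate's share of the precinct vote (0 to 1)
    """
    if not 0 < group_share <= 1:
        raise ValueError("group_share must be in (0, 1]")
    if not 0 <= candidate_share <= 1:
        raise ValueError("candidate_share must be in [0, 1]")
    # Lowest support: everyone outside the group voted for the candidate.
    lower = max(0.0, (candidate_share - (1.0 - group_share)) / group_share)
    # Highest support: no one outside the group voted for the candidate.
    upper = min(1.0, candidate_share / group_share)
    return lower, upper


# Hypothetical precinct where the group is 80% of the electorate.
tight_lower, tight_upper = method_of_bounds(0.8, 0.7)

# Hypothetical precinct where the group is only 10% of the electorate.
wide_lower, wide_upper = method_of_bounds(0.1, 0.5)
```

When the group is a large share of the precinct, the bounds are narrow and informative. When the group is a small share, they can span the whole range from 0 to 1, which is why EI combines the bounds with a statistical model rather than relying on them alone.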

Figure 5 provides an example of EI analysis for the same election as the previous examples, the 2016 Attorney General election in Region 1.

Figure 5: Ecological Inference Plot: Attorney General, 2016 General Election, Region 1.

In Figure 5, we see the estimates of the EI analysis. The green dots represent the EI estimate, and the lines on either side of the dots represent the confidence interval of the statistical estimate. When looking at the estimates for American Indian Alone in the top box, you can see that the estimate of votes is strongly for Larry Jent. For White CVAP, the preference is for Tim Fox.

Figure 6: Ecological Inference Estimates: Attorney General, 2016 General Election, Region 1.

Figure 6 gives the precise estimates shown in the chart above. In the above example, the predicted support for Larry Jent among American Indian Alone CVAP was 90% and for White CVAP was 14.8%. The predicted support for Tim Fox among White CVAP was 85%. The confidence intervals (ci_95_lower and ci_95_upper) indicate that for this estimate, there is a 95% confidence that the true value of this statistically predicted support for Tim Fox among White CVAP is between 83.75% and 86.63%. These results show evidence of racial bloc voting in this election and region.

In the results section, we consider evidence of RBV to be present when we see a clear majority preference for both the American Indian and White candidate of choice and when those candidates differ.

Results

In looking at elections in Montana going back to 2014, we found evidence of racial bloc voting in each of the five regions we analyzed. That is, we found evidence that using either the “American Indian Alone” or the “American Indian Any” definition, American Indian voters vote cohesively in support of their candidate of choice, and that White voters often vote in a bloc for a different candidate of choice.

Across regions, we only saw a few instances where there was RBV for only one of “American Indian Alone” or “American Indian Any”, but not for both. Overall, the results were very similar between the two definitions of American Indian.

In the following sections, we provide summary level results of our findings for each Region that we analyzed. We are also providing a Supporting Appendix with the charts for each methodology and definition of American Indian where we found evidence of RBV.

Region 1 – Blackfeet & Flathead Reservations (SD 8)

Figure 7: Region 1

Region 1 consists of all precincts in Glacier, Pondera, Lake, and Sanders counties. Region 1 has a total CVAP of approximately 47,434. The American Indian Alone CVAP is approximately 12,693, or 26.76%, and the American Indian Any CVAP is 14,021, or 29.57% of the total CVAP.

The examples provided in the methodology show evidence of racial bloc voting in the 2016 General Election for Attorney General.

In Region 1, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, General Election
2016 President, General Election
2016 Congressional, General Election
2016 Attorney General, General Election
2016 Governor, General Election
2018 U.S. Senate, General Election
2020 President, General Election
2020 U.S. Senate, General Election
2020 Attorney General, General Election
2020 Auditor, General Election

In addition, we found evidence of RBV with the following methods and elections:

Homogeneous Precinct Analysis showed RBV in the 2016 Presidential Democratic Primary and the 2020 Attorney General Republican Primary.
Bivariate Regression Analysis showed RBV in the 2014 US Senate Republican Primary and the 2016 Gubernatorial Republican Primary for both American Indian Alone and American Indian Any definitions.

Our analysis did not find evidence of RBV using Ecological Inference in any primary elections in Region 1.

Region 2 – Rocky Boy’s, Fort Belknap, & Fort Peck Reservations (SD 16)

Figure 8: Region 2

Region 2 consists of all precincts in Hill, Chouteau, Blaine, Phillips, Roosevelt, Valley, Daniels, and Sheridan counties. Region 2 has a total CVAP of approximately 40,957. The American Indian Alone CVAP is approximately 10,590, or 25.9%, and the American Indian Any CVAP is 11,581, or 28.3% of the total CVAP.

In Region 2, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, General Election
2016 President, General Election
2016 Congressional, General Election
2016 Governor, General Election
2016 Attorney General, General Election
2018 U.S. Senate, General Election
2020 President, General Election
2020 U.S. Senate, General Election
2020 Attorney General, General Election
2020 Auditor, General Election
2020 Auditor, Democratic Primary

The 2016 Democratic Primary Election for President showed evidence of RBV using the Homogeneous Precinct Analysis and Ecological Inference Analysis, using both the American Indian Alone and American Indian Any definitions.

Region 3 – Crow & Northern Cheyenne Reservations (SD 21)

Figure 9: Region 3

Region 3 consists of all precincts in Big Horn, Rosebud, and Powder River Counties. Region 3 also includes Yellowstone County Precincts 42.1 and 42.2 (the remaining Yellowstone County precincts are part of the Billings region). Region 3 has a total CVAP of approximately 17,532. The American Indian Alone CVAP is approximately 7,615, or 43.4%, and the American Indian Any CVAP is 8,016, or 45.7% of the total CVAP.

In Region 3, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, General Election
2016 President, General Election
2016 Congressional, General Election
2016 Attorney General, General Election
2016 Governor, General Election
2018 U.S. Senate, General Election
2020 President, General Election
2020 U.S. Senate, General Election
2020 Attorney General, General Election
2020 Auditor, General Election

In addition, we found evidence of RBV with the following methods and elections:

Ecological Inference showed evidence of RBV in the 2020 Attorney General Democratic Primary for both American Indian Alone and American Indian Any definitions.
Bivariate Regression Analysis showed evidence of RBV in the 2016 Presidential Republican Primaries for both American Indian Alone and American Indian Any definitions.
Homogeneous Precinct Analysis showed evidence of RBV in the 2016 Gubernatorial Republican Primary.

Region 4 – City of Billings

Figure 10: Region 4

Region 4 consists of all precincts in Yellowstone County, except for precincts 42.1 and 42.2 which are included in Region 3. Region 4 has a total CVAP of approximately 121,688. The American Indian Alone CVAP is approximately 5,484, or 4.5%, and the American Indian Any CVAP is 7,351, or 6% of the total CVAP.

In Region 4, we were not able to perform Homogeneous Precinct Analysis because there were no homogeneous American Indian Alone or American Indian Any precincts, where the American Indian CVAP was 90% or more of the total CVAP.

The Bivariate Regression Analysis and Ecological Inference Analysis showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, General Election
2016 President, General Election
2016 Congressional, General Election
2016 Governor, General Election
2016 Attorney General, General Election
2018 U.S. Senate, General Election
2020 President, General Election
2020 U.S. Senate, General Election
2020 Attorney General, General Election
2020 Auditor, General Election

Additionally, the Ecological Inference alone showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, Democratic Primary
2016 President, Democratic Primary
2016 Governor, Republican Primary
2018 U.S. Senate, Republican Primary
2020 President, Democratic Primary
2020 Attorney General, Democratic Primary
2020 Attorney General, Republican Primary
2020 Auditor, Democratic Primary

Using only the American Indian Alone definition, the Ecological Inference showed RBV in the following elections as well:

2014 U.S. Senate, Republican Primary
2020 Auditor, Republican Primary

Using only the American Indian Any definition, the Ecological Inference showed RBV in the following election as well:

2016 President, Republican Primary

The Bivariate Regression also shows RBV in the 2020 U.S. Senate Republican Primary for American Indian Alone and American Indian Any.

Region 5 – City of Great Falls (Little Shell)

Figure 11: Region 5

Region 5 consists of all precincts in Cascade County. Region 5 has a total CVAP of approximately 63,032. The American Indian Alone CVAP is approximately 3,434, or 5.5%, and the American Indian Any CVAP is 4,697, or 7.5% of the total CVAP.

In Region 5, we were not able to perform Homogeneous Precinct Analysis because there were no homogeneous American Indian Alone or American Indian Any precincts, where the American Indian CVAP was 90% or more of the total CVAP.

The Ecological Inference Analysis showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions:

2014 U.S. Senate, General Election
2016 President, General Election
2016 Congressional, General Election
2016 Attorney General, General Election
2018 U.S. Senate, General Election
2020 President, General Election
2020 U.S. Senate, General Election
2020 Attorney General, General Election
2020 Attorney General, Democratic Primary
2020 Auditor, General Election
2020 Auditor, Democratic Primary

Additionally, the Ecological Inference showed evidence of RBV in the 2016 Presidential Democratic Primary using American Indian Alone only.

Supporting Appendices

Attached to this report is a folder of Supporting Appendices that has the following structure:

Supporting Appendices

Bivariate Regression and Ecological Inference Analysis – this folder contains all charts as shown in the methodology section for these analyses, where we found evidence of RBV. The charts are broken out by Region.

Region 1

Files ending in ‘Bivariate Plot’ and ‘Bivariate Coefficients’ show the results of the Bivariate Regression Analysis.

Files ending in ‘EI Plot’ and ‘EI Estimates’ show the results of the Ecological Inference Analysis.

G stands for General Election, DemP stands for Democratic Primary, RepP stands for Republican Primary in the file names.

AI stands for American Indian in the file names.

Region 2

Region 3

Region 4

Region 5

Homogeneous Precinct Analysis – this folder contains data files (in CSV format) of the results of the homogeneous precinct analysis for all regions that had precincts that were 90% or more American Indian Any/Alone or White.

G stands for General Election, DemP stands for Democratic Primary, RepP stands for Republican Primary in the file names.

AI stands for American Indian in the file names.

Growing Your Audiences 9/20/2022

Posted on September 20, 2022February 2, 2023 by Haystaq DNA

HaystaqDNA’s Predictive Microtargeting – Growing your audiences

Download a PDF of the Case Study

Microtargeting has historically been used by political campaigns to not only identify potential supporters, but to track individual voters. Its success comes in its ability to craft tailored messages to the identifiable targeted subgroup. As the digital footprint of millions of Americans grows larger, so does the ability to identify an individual’s disposition. Campaigns have learned that rather than a single television advertisement, multiple tailored messages to smaller audiences can be more effective. Ensuring your message is in the spoken language of the targeted individual is essential. Today businesses and organizations alike have begun adopting many of these microtargeting practices due to their cost, accuracy, and success rate.

As campaigns and marketing teams rely further on targeted advertisement, it’s no longer about reaching the masses so much as reaching individuals susceptible to the message. Finding those to target has become an increasingly difficult task for many organizations and companies to handle internally. Many, however, underutilize their ability to reach more consumers with data already in their possession. Whether it’s a campaign that’s fundraising or an NGO that is phone banking, data utilization is essential in growing one’s business. At Haystaq, our ability to import, model, and score your data ensures a cost-effective advertising strategy that targets consumers who are receptive to the media we are delivering.

Our data modeling has been used in multiple industries including healthcare, retail, education, entertainment, and professional sports. The basis for our scores comes from our ability to identify individuals. With records on over 220 million adults 18+ and continuously updated national issue scores, we maintain a database that allows us not only to identify and match individuals, but also to create custom models that provide clients with exactly what they need. From hot-button issues to state and local concerns, our scores cover a wide range of topics. These scores paint a picture of individual personality and interests.

Perhaps the most important part of this process is matching the data provided by the client to our records. We can work with what we are given. Our list matching can be done with the most basic individual information like name and address. The match rate depends on the amount of information given as well as the integrity of the information provided. By matching we can connect individual records provided by the client, to their larger digital footprint. This additional information gives the client the latitude to decide how they want to target this individual.

Next we move to translating the data for our models. This allows us to rank the individuals based on the specifications needed. After our models run we look at the attributes of individuals who score highly and report this information back to the client for insight. With candidate support and donor/fundraising models, our proprietary modeling gives clients lists that will best serve them. Scores are important to understanding demographics, views on issues, as well as home life and financial standing.

Recently, we did a project in the state of California where we received statewide membership data on over 10,000 members. We were able to build out models and apply the score to all 20 million+ voters in the state. This allowed advocacy organizations to then target for new members throughout the state, including areas that they previously were not in. At Haystaq we specialize in reading and combining client data in order to expand the targeted audience.

We are then able to split individuals into ranked tiers, based on their level of support. With other indicators, this allows the client to choose which groups of individuals to target per region. These lists can be used for mail and email campaigns, and our cell phone indicator allows for SMS and text-message-based campaigns. In the past we have used this to help clients find new car buyers, restaurant goers, solar panel owners and more. The technology at our disposal allows us to engage in targeted advertising in new and complex ways. We increase the precision and accuracy of our marketing by knowing specific traits, preferences, and interests of the average American.

Who are Black Republicans? 9/02/2022

Posted on September 2, 2022February 2, 2023 by Haystaq DNA

Who are Black Republicans?

Download a PDF of the Case Study

Since 1940, African American voters across the nation have shifted en masse from the GOP to the Democratic Party.¹ This shift was driven by African Americans’ (AA) support for Democratic Party programs including the New Deal, the Great Society, and Lyndon Johnson’s critical support for the civil rights movement of the 1960s.² Once home to relatively liberal African Americans during the Reconstruction era, the GOP lost Black membership due to the disenfranchisement of Black voters in the South after the end of Reconstruction in 1877 and because of the segregationist “lily-white” movement within the party. More recently, the “southern strategy” of appealing to Wallace and Thurmond voters that the GOP has pursued since the Goldwater era has left many feeling that the Republican Party does not care about Black America.

African American political behavior is shaped by a concept that political scientists refer to as “shared fate”: Black Americans feel they gain when representatives of the African American community succeed. This leads to favoritism towards African American candidates in many races and leads most African Americans to vote for left-wing candidates regardless of their other demographic characteristics. This does not mean that African Americans are uniformly liberal. Indeed, roughly a third of them identified as conservative when surveyed as of the 2020 primary, a massive increase since 1972, when ten percent did, even as the 2010s saw a slight increase in the number of African Americans identifying as liberal or moderate.³ Nonetheless, most African Americans (90-95%) vote Democratic because of a belief that Democrats support the AA community and because of intense pressure within this community to vote the Democratic ticket.⁴ Further, under certain circumstances African Americans can be convinced to vote for conservative candidates, and Gen Z AAs express much skepticism about the Democratic Party.⁵ Black Republicans share a sense of shared fate with other African Americans, have strong connections to the African American community, have a strong sense of Black identity, and take a variety of perspectives on whether public policies should be examined using a race-conscious lens.

Across the years, a number of Black conservative pundits including Larry Elder, Carol Swain, Jason Riley, Candace Owens, Herman Cain, and Ben Carson have argued that African Americans should vote for the GOP and that the Democratic Party would collapse if the Republicans could get 20-30% of the AA vote.⁶ These pundits repeat a familiar set of talking points regarding the Democrats’ historical affinity for the Klan and slavery, the supposedly negative effects of the welfare state on African American families, the disproportionate impact of abortion on Black women, and the apparent disconnect between the views of everyday AAs and Black political leadership on a variety of issues such as school choice.⁷ These talking heads challenge the notion that African Americans are racially targeted in police killings and argue that Democratic policies promote a victim mentality and dependence on government. Previous research on AA Republicans suggests that these voters are widely representative of the AA community on a variety of demographic factors and are actually less church attending than AA Democrats.

Methodology

This paper aims to examine African American Republicans (roughly 5-10% of the Black community and 2% of all Republicans) who vote for the Republican Party. It takes advantage of states where voters register by both party and race (NC, FL, and LA) and uses data from L2’s voter file to build demographic profiles of AA GOP registrants (about 8% of all African Americans in the USA). This methodology produces a sample of 136,460 Black GOP voters. The findings cut against those of Corey Fields in Black Elephants. The analysis shows that Black Republicans are disproportionately male, richer, and less likely to vote than most African Americans. Further, HaystaqDNA’s voter targeting models offer several insights on how this segment of the electorate might be reached: 56% of our Black Republicans use Fox News as their primary TV news source, and our models suggest that a supermajority (73%) oppose former president Trump.⁸

The African American population in the US is 48% male, whereas our sample is 55% male, and nearly 70% of our Black Republicans are unmarried, compared to 59% of all voters nationwide.

African American Republican voters appear to skew more middle-class than AAs nationwide in 2019:
- Nationwide, 56% of all native-born African Americans have incomes lower than $50,000; in our sample the number is 35.52%.
- Nationwide, 27% of all native-born African Americans have incomes between $50,000 and $99,999; in our sample the number is 42.61%.
- Nationwide, 17% of native-born African Americans have incomes above $100,000; in our sample the number is 18.18%.

Almost half of these voters (47.99%) live in politically mixed households. This compares to 25.25% of all Americans nationwide.⁹

Despite their relative wealth compared to other African Americans, Black Republicans are still poor compared to the average citizen in their state and disproportionately concentrated in the bottom 20% of the income distribution.

Compared to other African Americans nationwide who are citizens and eligible to vote, our Black Republicans have far lower turnout in some elections and slightly lower turnout in others. This is the opposite of what we would expect, because the national comparison group includes all eligible Black Americans, not only those registered to vote.

While a significant number of African Americans in each state consider themselves conservative (33-45%), at most, between 6-10% of conservative Blacks are registered Republicans, and in most states, less than a quarter of them voted for Trump.

This is in keeping with previous research, which demonstrates that African Americans who consider themselves conservative are more likely to be
- Young
- Female
- Southern
- Economically disadvantaged
- Unmarried
- Uneducated

This is in comparison to Blacks who consider themselves Republicans, and it suggests that for African American men, being older and better off produces an alignment between conservative views and party ID.

Works Cited

Bennett, Lerone. BEFORE the MAYFLOWER: A History of the Negro in America, 1619-1962. Martino Fine Books, 2018, p. 371.

Bracey, Christopher. Saviors or Sellouts: The Promise and Peril of Black Conservatism, from Booker T. Washington to Condoleezza Rice. Beacon Press, 2009, pp. 4–19.

Elder, Larry. Stupid Black Men: How to Play the Race Card– and Lose. New York, St. Martin’s Press, 2008.

Fields, Corey. Black Elephants in the Room: The Unexpected Politics of African American Republicans. Berkeley, University Of California Press, 2016, pp. 13, 15, 16, 19, 38, 62.

“Florida 2020 President Exit Polls.” CNN, www.cnn.com/election/2020/exit-polls/president/florida.

Hansen, John. “Lecture: Black Politics.” 2019.

Hersh, Eitan. Hacking the Electorate: How Campaigns Perceive Voters. Cambridge, Cambridge University Press, 2015, pp. 122–125.

Hetherington, Marc J, and Jonathan Weiler. Prius or Pickup? : How the Answers to Four Simple Questions Explain America’s Great Divide. Boston; New York, Houghton Mifflin Harcourt, 2018, pp. 18, 51–53.

King, Maya. “For Some Black Youth, It’s Time to Question Democratic Loyalties.” Politico, 2020, www.politico.com/news/2020/10/11/gen-z-black-youth-conservatives-trump-421914.

Lepore, Jill. IF THEN: How Simulmatics Corporation Invented the Future. S.L., Liveright Publishing Corp, 2021, pp. 118–120.

“Louisiana Voter Surveys: How Different Groups Voted.” The New York Times, 3 Nov. 2020, www.nytimes.com/interactive/2020/11/03/us/elections/ap-polls-louisiana.html. Accessed 19 July 2022.

Malone, Justin. “Uncle Tom.” Under the Milky Way, 19 Aug. 2020, www.youtube.com/watch?v=UOvuylAnw-E.

Masket, Seth. LEARNING from LOSS: The Democrats, 2016-2020. New York City, Cambridge Univ Press, 2020, p. 45.

“North Carolina Exit Polls: How Different Groups Voted.” The New York Times, 3 Nov. 2020, www.nytimes.com/interactive/2020/11/03/us/elections/exit-polls-north-carolina.html. Accessed 19 July 2022.

Owens, Candace. Blackout: How Black America Can Make Its Second Escape from the Democrat Plantation. New York, Threshold Editions, 2020.

Pew Research Center. “Religious Landscape Study.” Pew Research Center’s Religion & Public Life Project, www.pewresearch.org/religion/religious-landscape-study/compare/political-ideology/by/state/among/racial-and-ethnic-composition/black/. Accessed 19 July 2022.

Philpot, Tasha S. Conservative but Not Republican: The Paradox of Party Identification and Ideology among African Americans. Cambridge, United Kingdom; New York, NY, Cambridge University Press, 2017.

Public Religion Research Institute. “PRRI – American Values Atlas.” Ava.prri.org, Public Religion Research Institute, 2020, ava.prri.org/#politics/2020/States/ideology/. Accessed 19 July 2022.

Richardson, Heather. To Make Men Free: A History of the Republican Party. New York, Basic Books, 2014.

Riley, Jason. Please Stop Helping Us: How Liberals Make It Harder for Blacks to Succeed. New York, Encounter Books, 2016, pp. 28–31.

Tamir, Christine, and Monica Anderson. “5. Household Income, Poverty Status, and Home Ownership among Black Immigrants.” Pew Research Center Race & Ethnicity, Pew Research Center, 20 Jan. 2022, www.pewresearch.org/race-ethnicity/2022/01/20/household-income-poverty-status-and-home-ownership-among-black-immigrants/#:~:text=In%202019. Accessed 14 July 2022.

US Census Bureau. “Historical Reported Voting Rates.” The United States Census Bureau, Department Of Commerce, 7 Oct. 2019, www.census.gov/data/tables/time-series/demo/voting-and-registration/voting-historical-time-series.html.

Yan, Alan, and Hakeem Jefferson. “How the Two-Party System Obscures the Complexity of Black Americans’ Politics.” FiveThirtyEight, 6 Oct. 2020, fivethirtyeight.com/features/how-the-two-party-system-obscures-the-complexity-of-black-americans-politics/. Accessed 9 Oct. 2020.

Identify & Model Owners of Solar Panels

Posted on January 5, 2019 (updated October 20, 2023) by Haystaq DNA

Download a PDF of the Case Study

As of 2017, less than one percent of the 125M-plus residential buildings in the United States have solar panels installed. This low penetration exists despite a 30% federal tax credit and other state and local incentives. At the same time, according to the framework of the recently signed Paris Accords, the United States will need to reduce its greenhouse gas emissions by 26% below 2005 levels by 2025. 30% of US greenhouse gas emissions currently come from electrical generation. While many reductions in this category will come from new solar and wind power plants, residential solar will also play a large role in meeting the Paris obligations. The financial benefit a consumer could realize from installing residential solar differs according to a geography’s level of solar suitability (sunlight) and a state’s financial incentives, but there are a number of states where it is already economically worthwhile to install solar (AZ, CA, CO, DE, FL, HI, MA, MD, NJ, NV, NY, etc.). This raises the question of how we find the individual homeowners most likely to buy or lease solar panels, particularly in these target states.

HaystaqDNA’s interest in this area is not entirely academic. While the company’s origins are as a left-leaning political modeling firm, there is an immediate value in finding residential solar buyers for other industries as well. Individuals who buy in one category of green energy will usually buy in others. Haystaq’s existing clients in the automotive industry face the increasing need to find buyers of electric and electric-hybrid vehicles. In the coming years, manufacturers of LED lights, tankless water heaters, high-efficiency appliances, etc. will be able to market to these same individuals.

Haystaq’s founder and CEO, Ken Strasma, perfected his microtargeting skills and techniques in John Kerry’s 2004 Democratic primary run and in Barack Obama’s 2008 presidential campaign. In 2013, HaystaqDNA was formed to take these techniques and technologies used in the political arena and use them to help companies find and understand their customers in the corporate world. We knew that if we could find a sample of existing solar panel owners, we could use our advanced statistical algorithms to find other consumers who would behave in a similar way, just as we have in politics. Where there is sufficient data available, Haystaq has successfully applied microtargeting techniques to verticals including automotive, healthcare, television programming, professional sports, consumer packaged goods and retail. Unfortunately, outside of the agencies that regulate state incentive programs (which don’t share data), there is no centralized source of solar panel owners. Haystaq needed to create its own method of identifying solar owners.

While solar panel ownership data is not freely available, there are a number of sources for satellite and aerial photos. Through early in-house experiments, we found that if satellite images were overlaid with GPS coordinates (either provided by the vendor or geocoded in-house) we could match images of structures to the owners of those structures. We did this by using a commercial database and either using the provided latitude/longitude values for each household or by geocoding the addresses when this information was not provided. We were then able to manually review images rooftop by rooftop and determine which had solar panels. This initial effort was successful, but time-intensive and wearying for our analysts.

Above: An image from Haystaq’s early internal interface for finding rooftop solar panels

The next step was to turn to Amazon’s Mechanical Turk (MTurk) crowdsourcing marketplace. MTurk allows Haystaq to put out a request for work to be completed by remote workers willing to work for the incentive offered. In this case we wanted users to look at images of roofs and mark whether each one appears to contain a solar panel or not. We automated a backend to carve images into individual residential buildings, match those buildings to households on our consumer file and feed those images into Mechanical Turk’s API. We slip in images that we know contain solar panels and use these images as our Quality Assurance. If a worker cannot correctly identify the QA images, we disregard their work and prevent them from accepting future work from us. For the non-QA images, we feed each image to two different workers. If they agree that an image does or does not contain solar panels, we mark it as such. If the workers disagree, we send the image to a third worker for arbitration. Because of their high prevalence of rooftop solar panels, we have collected test samples from areas of Orange County, CA; Los Angeles County, CA; and Nevada.
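
The labeling rules described above can be sketched in a few lines of Python. This is only an illustration, not Haystaq's actual backend: the function names are hypothetical, and the requirement that a worker get every QA image right is an assumption, since the text does not give a pass threshold.

```python
from typing import Callable, Dict


def worker_passes_qa(answers: Dict[str, bool], qa_truth: Dict[str, bool]) -> bool:
    """Return True if the worker labeled every QA image they saw correctly.

    Assumption: a single wrong QA answer disqualifies the worker.
    """
    return all(
        answers[image] == truth
        for image, truth in qa_truth.items()
        if image in answers
    )


def resolve_label(first: bool, second: bool, arbitrate: Callable[[], bool]) -> bool:
    """Combine two workers' labels; ask a third worker only on disagreement."""
    if first == second:
        return first
    # Disagreement: the third worker's answer decides.
    return arbitrate()
```

Passing the arbitration step as a callable means the third worker is only consulted (and paid) when the first two disagree.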

Above is the initial screen Haystaq’s MTurk workers see.

Our MTurk interface has provided us with huge efficiency gains over our initial method. There are still several drawbacks. 1. With novice MTurk workers, we get some false positives for things easily mistaken for solar panels, like skylights or solar water heaters (this second one is not as problematic as it is still a green product). 2. While this gives us a great way to efficiently sample a geography, it is still too expensive and slow to comprehensively examine the entire country. We can, however, create a model with the geographic samples we have collected.

At this point in the process we turn back to Haystaq’s tried and tested data analytics techniques and code. The consumer file mentioned above consists of roughly 260M US adult consumers. This file contains all of the Personally Identifying Information (PII) as well as over 1200 fields of additional data: Census Data, Property Data, Survey Data, Modeled Data and aggregated data bought from sources like magazines, retailers, airlines, hotels, insurance companies, financial institutions, etc. All of this data is converted to ‘indicator’ or ‘independent variable’ form, where text fields are converted to binary flags and false numeric data is discarded. The solar panel sample data from MTurk is already matched to this dataset at a household level. We create the ‘dependent variables’ or ‘DVs’ for each house using an assigned head of household. If a roof had a solar panel, the owner of that house is designated as a 1; the owner of a house without a solar panel is assigned a 0. We then use Python (and its SciKit-Learn and Pandas libraries) to model these dependent variables. We use a variety of algorithms (Logistic Regression, Decision Trees, Nearest Neighbor, Neural Networks, etc.) to create the initial models. We often blend the results of multiple models, as we find that doing this tends to amplify the underlying signal of each model and cancel out the random errors (noise). No one indicator field will determine a person’s score; our typical models use in excess of 100 coefficients. The final model provides an algorithm that scores each individual in the consumer database for relative likelihood to become a solar panel buyer.
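
Two of the steps above, turning a text field into binary flags and blending several models' scores, can be illustrated with plain Python. This is a minimal sketch under stated assumptions: the function names and the example field are hypothetical, and the equal-weight average is only one possible blend, since the text does not say how the models are weighted.

```python
from typing import Dict, List, Sequence


def to_binary_flags(records: List[Dict[str, str]], field: str) -> List[Dict[str, int]]:
    """Convert one text field into 0/1 indicator columns, one per distinct value."""
    values = sorted({record[field] for record in records})
    return [
        {f"{field}={value}": int(record[field] == value) for value in values}
        for record in records
    ]


def blend_scores(*model_scores: Sequence[float]) -> List[float]:
    """Blend per-person scores from several models.

    Assumption: a simple equal-weight average across models.
    """
    return [sum(person) / len(person) for person in zip(*model_scores)]


# Hypothetical usage: two homes, one text field, two models' scores.
flags = to_binary_flags([{"roof": "tile"}, {"roof": "shingle"}], "roof")
blended = blend_scores([0.25, 1.0], [0.75, 0.5])
```

Averaging is the simplest way to see why blending can help: errors that are independent across models tend to cancel, while a signal the models share is preserved.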

Having a model is meaningless unless we can validate that it works. To this end, we withhold one third of the sample of known solar panel owners from the modeling process; this withheld portion becomes the ‘test set’. We then verify that our scores accurately classify the individuals in the test set who are known to own solar panels. Haystaq’s solar panel models have proved to be highly predictive of the test set.
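
A hold-out split of this kind might look like the following sketch. The function name, the fixed random seed and the use of a simple shuffle are illustrative assumptions, not a description of Haystaq's code.

```python
import random
from typing import Iterable, List, Tuple


def split_holdout(
    ids: Iterable[int], test_fraction: float = 1 / 3, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle the sample and withhold a fraction of it as the test set.

    Returns (training_ids, test_ids). The seed is fixed only so that the
    split is reproducible.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical usage with nine household ids.
train_ids, test_ids = split_holdout(range(9))
```

The model is fit only on the training ids, so the scores it gives to the test ids are genuine predictions rather than memorized answers.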

Two of our QA checks against the test sample, a Hosmer-Lemeshow step chart and a Receiver Operating Characteristic chart, are featured above.
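
A Receiver Operating Characteristic chart is often summarized by the area under its curve (AUC). The text does not say which summary statistics Haystaq reports, so the sketch below is only an illustration of how that number can be computed from test-set labels and scores. It relies on the fact that AUC equals the probability that a randomly chosen owner outscores a randomly chosen non-owner.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by comparing every positive/negative pair.

    A pair counts 1 if the positive scores higher, 0.5 on a tie, 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 means the scores rank owners no better than chance, and 1.0 means every owner outscores every non-owner. The pairwise loop is quadratic, so it suits small test samples rather than a full consumer file.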

Once we have a model that validates well, we score everyone on the consumer file within a similar geography. At this point we have a rank ordering of consumers, ranging from those most likely to buy solar (or other green products) to those least likely to buy. We believe this model has direct value to people marketing these products, but at this point we use it as a filter on, or feeder model for, our electric and electric-hybrid car models.

Next Steps:

Ideally, we would not rely on a model to find solar panel owners, but instead we would be able to identify all users. With our existing process using MTurk, classifying all 125M US rooftops as solar or non-solar would be time- and cost-prohibitive. This is exacerbated by the need to resurvey periodically to monitor solar growth. To solve this problem, we are writing code to have our AWS cluster environment attempt to categorize the rooftop images before we send them to MTurk workers for verification. We can feed a server a set of images known to have solar panels and a set of images that do not. Using image sensing and machine learning algorithms, it will then attempt to categorize new images. Those images are scored, the server learns from its mistakes and this process continues iteratively until the server can correctly categorize the rooftops with solar installs. Using this process, only the images identified by the server as containing solar panels would be sent to MTurk for human validation.

Some images from trial runs of using machine learning techniques to identify solar panels on rooftops

There are a number of challenges with having a computer correctly identify a specific set of black rectangles surrounded by other dark rectangles, but the early results seem promising.

There are two potential challenges specific to solar that might limit the window in which we can use this technology. One is that our models assume that a home’s current owner is the owner who installed the solar panels (or, secondarily, that the owner valued the solar panels equivalently when buying the home). While rooftop solar penetration is low, that is a fairly safe assumption, but as the efficiency of solar panels increases and their penetration increases, that assumption will eventually break. We can get around this challenge by having snapshots in time of solar coverage and comparing these to changes in homeownership that are visible in the consumer file. The other challenge is that our method assumes that solar panels are the easy-to-identify black rectangles we currently see. Recently Tesla announced solar panels that look and act like traditional roofing tiles. Aside from Tesla, it is likely that future panels will diverge into different shapes and arrangements. That said, solar is only one potential application for this technology, as it could also be easily adapted to find things like swimming pools, boats or RVs.

HaystaqDNA’s Automotive Microtargeting Engine 5/09/2017

Posted on May 9, 2017 (updated February 2, 2023) by Haystaq DNA

Download a PDF of the Case Study

Leading microtargeting firm HaystaqDNA has developed an Automotive Microtargeting Engine that allows automotive OEMs and dealers to select conquest targets based on HaystaqDNA’s powerful predictive models. Compared to traditional list providers, these models show up to 80% higher conversion rates on email and direct mail. When used with addressable TV, the HaystaqDNA models yield a 70% higher conversion rate. Marketers for automotive OEMs in the United States must not only sell more cars to existing customers, but also capture new customers from other brands in order to maintain and increase sales. The traditional method of ‘conquesting’ is to buy lists of target customers from generic consumer data vendors such as Experian or Acxiom or from US industry-specific vendors such as Polk or AutoIntenders. Unfortunately for the OEMs buying these lists, their competitors are often buying the exact same targets. These lists are based on basic demographics and targeted to a vehicle segment (example: Luxury Compact Sedans) rather than a brand or specific car line (example: Mercedes-Benz C-Class). Using the technologies and techniques developed in political microtargeting, HaystaqDNA can instead create specific targeting models for individual products. This is accomplished by ingesting existing customer data, augmenting it with original survey research, and using advanced data analytic methods to find the individuals most likely to purchase the target product. The conquest targeting provided by HaystaqDNA is specific not only to a given brand, but also to a given car line. We rank every single consumer (~260M individuals) on their likelihood of buying that specific car. Our modeling methods go far past basic demographics and use over 1,000 distinct indicators to find our conquest targets. This technique is far more accurate than relying solely on age/gender/income/location-based targets.
Below is a case study of how a leading luxury automotive brand used HaystaqDNA’s Automotive Microtargeting Engine to improve its conquest campaign results.

Like most automotive OEMs, our client traditionally bought lists for direct mail and email campaigns from commercial vendors. The brand was consistently delivering year over year sales growth thanks to regular significant new product introductions and excellent customer loyalty, but they understood they needed to dramatically increase conquest sales (automobile buyers who do not currently own a product from that brand) in order to achieve their future growth targets. They also required an easy to use interface to allow marketing staff across the organization to create and utilize lists of conquest customers. Based on HaystaqDNA’s success over several automotive pilot projects, the brand joined in partnership with HaystaqDNA to create and run the Automotive Microtargeting Engine (AME) in the fall of 2014.

A screenshot of HaystaqDNA’s AME Interface.

The AME project consisted of the following parts:

Setting up and receiving recurring feeds from the brand’s internal CRM system.
Matching existing customers to a commercial database.
Surveying new car buyers.
Using Machine Learning Algorithms to model likely buyers and their preferences.
QA and Validation of these models.
Scoring every individual in the country.
Matching in vehicle ownership and in-market timing data.
Creating an interface for marketers to identify, explore and pull conquest targets.

Recurring Feeds: HaystaqDNA worked with the company’s IT department to receive recurring sales, dealer territory, dealer service, accessories, options and event attendance feeds. In the future Haystaq anticipates receiving additional feeds on inventory levels, inventory pipelines and financing received. These feeds come in across secure channels into a firewalled Amazon AWS cloud environment where they are cleaned and formatted. The data is all related back to itself via customer ids, vehicle ids and dealer ids.
Consumer File: On behalf of the client, HaystaqDNA licensed the national infoGroup consumer file, consisting of roughly 260M US adult consumers. This file contains all of the Personally Identifying Information (PII) as well as over 1200 fields of additional data — Census Data, Property Data, Survey Data, Modeled Data and aggregated data bought from sources like magazines, retailers, airlines, hotels, insurance companies, financial institutions, etc. All of this data is converted to ‘indicator’ or ‘independent variable’ form where text fields are converted to binary flags and false numeric data is discarded. The historical brand sales data is then matched to this file using the PII in both.
Surveys: Several times each year, HaystaqDNA conducts a survey of likely car buyers to find things like preferences for particular powertrains and lifestyle choices (like sports attendance and participation). These surveys are primarily conducted through IVR calls to landlines and live calls to cell phones, with online panels and SMS surveys used to supplement where needed.
Dependent Variables and Modeling: Both the customer and the survey data are then transformed into ‘dependent variables’ or ‘DVs’. For example, for a specific car dependent variable, a person who is known to have bought said vehicle will be given a value of 1, while another who did not will be given a value of 0. For a skiing DV, a survey respondent who indicates that they enjoy skiing will be given a value of 1 and another who answered that they never ski will be given a 0. Using our AWS infrastructure, we bring in both of our data sets of DVs along with our massive table of independent variables. We have also found that people’s buying behaviors differ regionally, so we typically divide the US into four regions (Northeast, Southeast, Central and Western) and model the DVs for each region independently. We use Python (and its SciKit-Learn and Pandas libraries) to model these dependent variables. We use a variety of algorithms (Logistic Regression, Decision Trees, Nearest Neighbor, Neural Networks, etc.) depending on the data sets, but we rely most heavily on the Logistic Regression and Decision Tree algorithms. We often blend the results of multiple models, as we find that in doing this we often amplify the underlying signal and cancel out the noise. No one coefficient will determine a person’s score; our typical models use in excess of 100 coefficients.
Quality Assurance and Model Validation: We always withhold 1/3rd of the DVs to serve as a clean test set and we validate all models against this set. Using this test set we know what the model would predict for these individuals and we can compare that to their actual behavior (their car ownership or survey answers).

Two of our QA checks against the test sample, a Hosmer-Lemeshow step chart and a Receiver Operating Characteristic chart, are featured above.
We also use visual tools to see how the different models correlate to one another.

This visualization takes a sample of people and looks at their scores across a number of vehicle segments. We expect there to be a high correlation between models at similar price points and some deviations where the price points or car types are very different. Here we see that the Luxury Midsize Hybrid SUV and Luxury Compact SUV scores correlate highly, while the Luxury Midsize Hybrid Sedan and Luxury Full-Size SUV scores do not correlate.
At this point, the models also provide either coefficient weights or indicator importance ranks, which allow us to see the attributes of individuals who score highly. We often report these attributes back to the client so they can gain insight into their customers.
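
One way to put a number on how two models' scores move together is a correlation coefficient computed over the same sample of people. The text does not say which measure the visualization uses, so the Pearson correlation in this sketch is an assumption, and the function name is hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two models' scores for the same people."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

Values near 1 indicate that people who score highly on one model also score highly on the other, and values near 0 indicate no linear relationship.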

Some examples of recent positive and negative coefficients from a Full-sized Luxury SUV Model.
Scoring: Once the best validated models are selected, our analysts set up a cluster environment on AWS. We use the Python models produced in the modeling phase, but we rely on Spark and Parquet to help us take advantage of the clustered environment. We can assign a score for every car line and lifestyle filter to every individual in a region and ultimately the country in a couple of hours. By default, our scores come out as a value between 0 and 1, but we convert these scores to a rank ordering of individuals within each region.
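
Converting raw 0-to-1 scores into a rank ordering within each region can be sketched as follows. The tuple layout and function name are hypothetical, and this plain-Python version only illustrates the logic. It does not reproduce the Spark and Parquet pipeline described above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rank_within_region(people: List[Tuple[str, str, float]]) -> Dict[str, int]:
    """Map each person id to a rank within their region (1 = highest score).

    `people` holds (person_id, region, score) tuples. Ties keep input order.
    """
    by_region: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for person_id, region, score in people:
        by_region[region].append((score, person_id))

    ranks: Dict[str, int] = {}
    for members in by_region.values():
        ordered = sorted(members, key=lambda member: -member[0])
        for rank, (_, person_id) in enumerate(ordered, start=1):
            ranks[person_id] = rank
    return ranks


# Hypothetical usage with made-up ids and scores.
ranks = rank_within_region(
    [("a", "Northeast", 0.9), ("b", "Northeast", 0.2),
     ("c", "Western", 0.5), ("d", "Northeast", 0.6)]
)
```

Ranking within a region means a marketer asking for the top N targets gets the best prospects in that region, even if raw scores differ in scale between regional models.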
Vehicle Ownership and In-Market Timing Data: Once we have our consumer file scored and ranked, we match in garage and auto intender data from the client’s chosen vendor. The garage data consists of over 168M individual vehicles that are or have been owned by over 110M individuals. Additionally, this vendor provides a database of Auto Intenders — an in-market timing file which indicates individuals likely to buy a car within the next three, six, or twelve months. This file usually consists of 10M-12M individuals. Both the garage data and in market timing data become filters in the AME Interface.
The AME Interface: The client needed these conquest targets to be available to marketers at multiple levels (National staff – both brand and agency, Regional offices, and individual Dealers) for both exploration and list pulling, so HaystaqDNA created the Automotive Microtargeting Engine. This interface allows marketers to specify a geography (dealers are limited to their own boundaries), specify which car line or car lines they are interested in marketing, filter by different lifestyle, car preference, demographic or market timing filters, indicate the desired list size and optionally put in a distance from event limiter.

A screenshot of HaystaqDNA’s AME Interface.
In addition to creating a list query, marketers can merge lists or exclude previously created lists and they can assign individuals in a list to specific car lines/collateral. The target channel for the lists can also be specified by a number of templates. AME lists have been used for direct mail, email and digital outreach.
Using this interface, marketers can explore their target areas in real time and pull lists in near real time, greatly shortening the time to deployment vs. what the client had experienced with traditional list providers.

Results: The client has continually tested AME against its traditional list providers. Time and again AME has achieved better campaign conversion. A recent test showed AME with 50-80% higher conversion (depending on the specific car line). The targeting cost per sale also halved. In experiments with addressable TV, Haystaq has seen a 70% higher conversion rate on automotive vs. tests with a leading addressable TV vendor.

Support for the Affordable Care Act 1/23/2017

Posted on January 23, 2017 (updated February 2, 2023) by Haystaq DNA

Download a PDF of the Case Study

As the new Congress rushes towards a repeal of the Affordable Care Act, many are working against the opinion of the voters in their own districts. Research conducted by HaystaqDNA during the 2016 campaign showed that a majority of Americans support the ACA. However, members of Congress are more concerned with opinions of their constituents than they are with national numbers. Therefore, Haystaq looked at support levels by Congressional District. 253 of 435 or 58% of Congressional Districts show a majority of voters supporting ACA.

Not surprisingly, the majority of these pro-ACA districts are held by Democrats. However, 61 pro-ACA districts are currently held by Republicans. Many of these districts are relatively safely Republican, but in many, the difference in support in favor of the ACA is near or above the margin of victory in the 2016 election. This would suggest that voting to repeal the act puts these candidates at risk next year, even more so once voters realize how they will be personally affected by a repeal of the ACA.

The Haystaq microtargeting models have identified 98,942,762 likely ACA supporters nationwide, 41,697,492 of whom live in Republican districts.

METHODOLOGY

These numbers are based on a national survey of approximately 10,000 registered voters. The survey responses were used to build microtargeting models predicting how any individual voter would have answered the question had they been surveyed. The Congressional District percent in support of ACA is based on the number of voters in each district with an ACA support score of 50% or higher. The ACA support score predicts the likelihood that a voter would say that they support the ACA if surveyed. These numbers differ from poll results in that they are not weighted. A poll is likely to be weighted based on assumptions about likely turnout. The Haystaq models are applied to every registered voter.
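
The district-level calculation described above, the share of voters whose support score is 50% or higher, can be sketched as follows. The function name, the input layout and the district labels are hypothetical, and the scores are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def district_support(
    voters: Iterable[Tuple[str, float]], threshold: float = 0.5
) -> Dict[str, float]:
    """Return, per district, the fraction of voters scoring at or above the threshold.

    `voters` yields (district, support_score) pairs. No turnout weighting
    is applied, matching the unweighted approach described in the text.
    """
    totals: Dict[str, int] = defaultdict(int)
    supporters: Dict[str, int] = defaultdict(int)
    for district, score in voters:
        totals[district] += 1
        if score >= threshold:
            supporters[district] += 1
    return {district: supporters[district] / totals[district] for district in totals}


# Hypothetical usage with made-up districts and scores.
shares = district_support(
    [("D1", 0.7), ("D1", 0.5), ("D1", 0.2), ("D1", 0.9), ("D2", 0.4)]
)
```

Because every registered voter is counted once, the result describes the registered electorate rather than a likely-voter universe.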

The microtargeting models were built using a combination of the survey results and nearly 1,000 fields of commercial marketing data, Census demographics and proprietary derived indicators. Haystaq combines a variety of statistical and machine learning algorithms including Penalized Logistic Regression and Random Forests. The predictive models were validated against a hold-out sample to confirm that they accurately predicted the likely survey responses of individuals whose responses were not used in building the models.

Following is the question wording used in the survey:

Which comes closest to your opinion on the Affordable Care Act or Obamacare: that it is beneficial but doesn’t go far enough, that it is about right, or that it goes too far and should be repealed? Please press 1 if you think Obamacare is beneficial but doesn’t go far enough, press 2 if you like the law as it is, press 3 if you think Obamacare goes too far and should be repealed, or press 4 if you are not sure.

The model predicts the likelihood that a voter with an opinion on ACA would select option 1 (Support ACA but thinks it doesn’t go far enough) or option 2 (like the law as it is) vs. 3 (Goes too far and should be repealed). Because the model is predicting support only among those with an opinion, respondents picking option 4 (unsure) are not included.
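
The mapping from survey keypress to dependent variable described above can be written out directly. The function name is hypothetical, and returning `None` for excluded responses is an illustrative choice rather than a description of Haystaq's code.

```python
from typing import Optional


def aca_dependent_variable(response: int) -> Optional[int]:
    """Map a survey keypress to the modeling target.

    1 (beneficial but doesn't go far enough) or 2 (like the law as it is) -> 1
    3 (goes too far and should be repealed)                                -> 0
    4 (not sure) or anything else                                          -> None (excluded)
    """
    if response in (1, 2):
        return 1
    if response == 3:
        return 0
    return None


# Hypothetical usage: keep only respondents with an opinion.
responses = [1, 4, 3, 2]
targets = [dv for dv in map(aca_dependent_variable, responses) if dv is not None]
```

Dropping the unsure respondents before modeling is what makes the resulting score a measure of support among voters who hold an opinion.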

The survey was conducted using a combination of live calls and IVR (automated phone calls) to a random sample of more than 10,000 voters nationwide.

CD	Name	% of Vote in 2016 Election	% of Voters Supporting ACA
TX23	Will Hurd	50.90%	72.40%
NY11	Daniel Donovan	63.30%	70.40%
FL27	Ileana Ros-Lehtinen	54.90%	67.20%
FL26	Carlos Curbelo	56.30%	65.30%
WA8	Dave Reichert	60.00%	64.90%
CA21	David G. Valadao	93.20%	63.80%
IL12	Mike Bost	57.80%	63.30%
MI11	David Trott	56.90%	61.40%
VA10	Barbara Comstock	52.90%	61.00%
KY6	Andy Barr	61.10%	60.60%
IL13	Rodney Davis	59.70%	60.50%
NJ11	Rodney Frelinghuysen	60.00%	60.40%
NJ7	Leonard Lance	55.70%	59.50%
VA2	Scott Taylor	61.70%	59.10%
MI8	Mike Bishop	58.80%	58.60%
IL6	Peter J. Roskam	59.50%	58.40%
FL18	Brian Mast	55.50%	58.10%
NM2	Steve Pearce	62.80%	57.90%
FL25	Mario Diaz-Balart	62.40%	57.90%
MI6	Fred Upton	61.70%	57.60%
CA25	Stephen Knight	54.20%	57.50%
CO6	Mike Coffman	54.70%	56.70%
FL2	Neal Dunn	69.20%	56.40%
NY24	John Katko	61.00%	55.70%
NY19	John Faso	54.70%	55.60%
AZ2	Martha McSally	56.70%	54.80%
CA39	Edward Royce	57.70%	54.60%
MI7	Tim Walberg	57.90%	54.60%
MI1	Jack Bergman	58.20%	54.60%
PA15	Charles W. Dent	60.60%	54.30%
PA18	Tim Murphy	100.00%	54.20%
PA8	Brian Fitzpatrick	54.50%	54.10%
IL14	Randy Hultgren	59.60%	54.10%
MI4	John Moolenaar	65.80%	54.00%
IA1	Rod Blum	53.90%	53.90%
WA5	Cathy McMorris Rodgers	59.50%	53.90%
TX32	Pete Sessions	100.00%	53.90%
NJ3	Tom MacArthur	60.60%	53.70%
WA3	Jaime Herrera Beutler	61.40%	53.60%
NJ4	Chris Smith	65.50%	53.60%
NJ2	Frank LoBiondo	61.60%	53.60%
MN3	Erik Paulsen	56.90%	53.60%
PA12	Keith Rothfus	61.90%	53.50%
KY1	James Comer Jr.	71.20%	53.30%
MI3	Justin Amash	61.30%	53.00%
ME2	Bruce Poliquin	54.90%	52.70%
GA6	Tom Price	61.60%	52.30%
VA5	Thomas Garrett	58.30%	52.10%
TX27	Blake Farenthold	58.90%	52.10%
LA4	Mike Johnson	65.20%	52.00%
NY2	Peter T. King	62.40%	51.90%
LA5	Ralph Abraham	100.00%	51.80%
TX7	John Culberson	56.20%	51.70%
NC13	Ted Budd	56.10%	51.50%
CA49	Darrell Issa	51.00%	51.40%
NY1	Lee Zeldin	59.00%	51.40%
PA6	Ryan Costello	57.30%	51.20%
FL15	Dennis A. Ross	57.50%	51.10%
OH14	David Joyce	62.70%	51.10%
GA12	Rick Allen	61.60%	50.70%
OH1	Steve Chabot	59.60%	50.40%