Common Redistricting Criteria 1/10/2023



After each decennial Census, governments at all levels (e.g. congressional, state legislative, municipal, county, special districts) adjust the districts of their respective legislative bodies to reflect the changing demographics of their region. The individual or group drawing the map must be given criteria on what they can and cannot do when creating a map. Redistricting processes and rules vary across states, so map-drawers must be aware of the laws of the state and jurisdiction to avoid having their maps rejected or deemed unconstitutional. Some states will strictly define what redistricting criteria must be followed while others will not. Some municipalities or jurisdictions will also have previously defined their own criteria. Other jurisdictions will be able to adopt their own criteria.

This document is not intended to be a legal guide, but simply a discussion of the most common redistricting criteria.

There are some criteria that are part of almost all redistricting guidelines:

  1. Equal Population – The population of each new district should be ‘equal’. When drawing congressional maps, court rulings have largely defined ‘equal’ as “within 1 person” based on the Census Bureau’s latest PL 94-171 file. Unless state statute dictates otherwise, counties, municipalities and special districts are not required to adhere to such a strict definition, and their districts can differ from one another by a small percentage. Some states define the allowed deviation; otherwise the jurisdiction defines it itself. Some jurisdictions allow a deviation as low as 2%, others as high as 10%. When a rule is stated as a percentage, you also have to determine whether it applies to each district’s individual difference from the ideal population or to the cumulative difference across districts. Take a jurisdiction with a 5% deviation rule and a proposed map in which one district has a -3% deviation (underpopulated) and another has a +4% deviation (overpopulated). Is the map acceptable because each district is under the 5% cutoff, or does the cumulative difference of 7% violate the rule?
  2. Contiguity – The district must be one cohesive unit. For example, you can’t draw one part of a district in the southeast corner of a county and add an unattached region in the northwest corner of the county; the two sections need to be physically linked. Definitions of contiguity vary between states: some accept corner or point contiguity, particularly in areas with geographic limitations (e.g. bodies of water). Some areas are not contiguous with the jurisdiction itself (for example, physical islands or annexed pieces of land that do not touch the original jurisdiction), and these cannot be drawn to be contiguous. For more information, visit: https://redistricting.lls.edu/redistricting-101/where-are-the-lines-drawn/#contiguity
  3. Comply with the Federal Voting Rights Act – Section Two of the Federal Voting Rights Act prohibits discrimination on the basis of race, color, or membership in a language minority group. Generally speaking, it means existing majority-minority districts should be preserved and new majority-minority districts should be created where possible. Jurisdictions often use CVAP (Citizen Voting Age Population) data to determine minority percentages, and the allowed source data should be specified (CVAP vs VAP vs PL file demographics). At the federal level, a majority-minority district must contain 50% plus one additional person of the same protected class. If a new majority-minority district can be drawn in a “reasonably compact” way, it must be drawn. Section Five of the VRA required certain states and jurisdictions with a history of racial discrimination in voting to submit their redistricting plans to the Department of Justice for preclearance, but in 2013 the U.S. Supreme Court struck down the formula that determined which jurisdictions were covered (Shelby County v. Holder), leaving the preclearance requirement without effect.
  4. Comply with State or Jurisdiction Voting Rights Laws – Federal VRA law is complicated, but it largely dictates that a majority-minority district consists of a single protected minority comprising more than 50% of the population. State or local law can go further than federal law. State law might favor the creation of ‘coalition districts’ where multiple minorities together can form a minority district. Additionally, states or jurisdictions can write Voting Rights Laws that do not use a hard 50% cutoff, but rather an “ability to elect” cutoff. This means that if a district is 40% minority, but that 40% along with allied voters is large enough to see their ‘candidate of choice’ win, it can still be considered a minority district. Federal law, however, supersedes state and municipal law.
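The question raised under Equal Population, whether a percentage cutoff applies to each district or to the spread across districts, can be made concrete with a short sketch. The function names are hypothetical and the 5% limit is taken from the example in the text; neither reading is presented here as any jurisdiction’s actual rule.

```python
# Illustrative sketch only: two possible readings of a "5% deviation" rule.
# Function names are hypothetical; the figures come from the example in the text.

def within_per_district_limit(deviations_pct, limit_pct):
    """Reading 1: every district must individually be within the limit."""
    return all(abs(d) <= limit_pct for d in deviations_pct)


def within_cumulative_limit(deviations_pct, limit_pct):
    """Reading 2: the spread between the most underpopulated and the most
    overpopulated district must be within the limit."""
    spread = max(deviations_pct) - min(deviations_pct)
    return spread <= limit_pct


example_map = [-3.0, 4.0]  # percent deviations of the two districts
limit = 5.0

per_district_ok = within_per_district_limit(example_map, limit)
cumulative_ok = within_cumulative_limit(example_map, limit)

print("Passes per-district reading:", per_district_ok)
print("Passes cumulative reading:", cumulative_ok)
```

The same map passes under the first reading and fails under the second, which is why the criteria should state which reading applies.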

There are other criteria that are widely considered to be best practices:

  • Use Practical Geographic Boundaries – This criterion means that, where possible, practical geographic boundaries should be used to create the lines of a district. These can include major roadways, municipal boundaries, waterways, park boundaries, school district boundaries, etc.
  • Where Possible, Make Districts Compact – In terms of the overall shape, compactness refers to how closely packed together the district is. A perfectly compact district would be a circle, but it is impossible to fill a map with only circular districts. That said, districts should avoid sprouting ‘arms’, and the distance to the center of a district should be similar all along the boundary (exceptions to compactness are often made for Section Two districts or to keep communities intact). For more information on standard measures of compactness (e.g. Polsby-Popper, Reock, Convex Hull), visit: https://redistricting.lls.edu/redistricting-101/where-are-the-lines-drawn/#compactness
  • Keep as Many Communities of Interest (COIs) Intact as Possible — A community of interest is often defined as a group of people who have common policy concerns and would benefit from being maintained in a single district. Examples of this might be a neighborhood in a city, a HOA, a school district, or a Native American reservation. Redistricting often allows for public input for citizens to define their own communities of interest.
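As a rough illustration of one of the compactness measures named above, the Polsby-Popper score compares a district’s area with the area of a circle that has the same perimeter: 4πA / P². A circle scores 1 and shapes with long ‘arms’ score close to 0. The sketch below computes it for simple polygons given as (x, y) vertices in planar units. The helper names and the two example shapes are hypothetical, and real district boundaries would need projected coordinates from GIS software.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of the edge lengths, closing the polygon back to its first vertex."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def polsby_popper(vertices):
    """Polsby-Popper score: 4 * pi * area / perimeter squared."""
    area = polygon_area(vertices)
    perimeter = polygon_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical shapes: a compact square and a long, thin sliver of equal area.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]

print("Square:", round(polsby_popper(square), 3))
print("Sliver:", round(polsby_popper(sliver), 3))
```

Both shapes enclose the same area, but the sliver’s much longer boundary gives it a far lower score.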

Other criteria that are sometimes used to diminish the effects of gerrymandering or are adopted for other considerations:

  • Document All Iterative Changes to a Map and Summarize the Reasons Why – This criterion is sometimes imposed on map drawers so that they can show they are adhering to the defined criteria and so that the public can see the versions and iterations the map went through. This is done to increase the transparency of the process.
  • Draw Maps Blind to Incumbent Addresses — To make sure that no one is creating maps to advantage or disadvantage certain incumbents, the map drawers can be kept blind to the addresses of the incumbent representatives. This hopefully results in map drawers creating the best maps they can without taking political considerations into account.
  • Draw Maps to Purposefully Avoid Grouping Incumbents into a New District – This criterion is the opposite of the one above. It is sometimes used when one political party has control over the map-drawing process and could use that control to try to squeeze out members of the opposing party. There have been cases where partisan map drawers have purposefully grouped incumbents of the same party into one district so that only one can win, or grouped incumbents of one party with a stronger incumbent of the other party to ensure they are not re-elected. Making sure incumbents are not grouped together can eliminate this, but this criterion has also been used to protect incumbents from political challenge and to keep the detrimental effects of previous maps intact.
  • Multi-Member Districts – While federal law prohibits multi-member districts for Congress, many state legislatures elect several representatives from a single district.
  • Nesting Requirement – In states where districts are “nested,” each lower house (e.g. state house) district is a subdivision of a larger upper house (e.g. state senate) district.
  • Floterial District – Sometimes, individual districts may have a surplus population that is not significant enough to gain additional representation but still causes issues with deviation. A floterial district combines and overlaps several of these districts to meet the population requirement and add an extra representative for the combined population.
  • Draw Maps without Access to Political Data – While VRA evaluation requires political data, redistricting criteria often require map drawers to draw the remaining districts without having election results or voter registration loaded as a layer. This prevents map drawers from trying to create districts that favor a party.
  • Draw to Create as Many Politically Competitive Districts as Possible – This is almost the opposite of the criterion above. In this version, the map drawer is directed to draw as many districts as possible where the results of historic statewide elections in the newly created districts are as close to 50/50 as possible.
  • Draw without Considering Past Districts (aside from VRA districts) — Oftentimes, old districts were drawn for political purposes and map drawers are directed to start from scratch.
  • Draw to Keep the Core of Past Districts Intact – This is the opposite of the criterion above. One of the easiest and perhaps quickest ways to draw a map is to simply update old districts to reflect population changes. However, adopting this criterion risks locking in any detrimental effects of old plans.
  • Preserve as Many Voting Precincts As Possible — Oftentimes, precinct maps will be redrawn on different timelines. It is inconvenient for precincts to be split by one or more districts and have to offer different types of ballots depending which part of the precinct voters live in. For this reason redistricting consultants often try to keep voting precincts intact within a district.
  • Redistributing Prisoners to their Home Addresses – If there are one or more large prisons in a jurisdiction, those prisons can sometimes throw off the average number of voters in a district (in most states, prisoners cannot vote) or distort VRA calculations. Some states or jurisdictions will attempt to assign prisoners back to their last known address for redistricting purposes. This is a good practice, but it is a relatively heavy data lift and one better adopted at the state level than done individually by a county or city. If a state does prisoner allocation, it generally releases a modified PL 94-171 file with the prisoner reallocation already done. Map drawers can incorporate this file into their drawing software to account for this provision. For more information on states that reallocate prisoners to their home addresses, visit: https://redistrictingdatahub.org/data/ongoing-data-projects/states-that-adjust-the-census-data-for-redistricting/.
  • Redistributing College Students to their Home Addresses — Similar to the above. Again, this one is generally better if it is done at the state level.
  • Draw During Public or Recorded Meetings — It is possible to draw the maps during public meetings so that the public can see the process and, in some cases, provide community feedback. This is done to increase transparency and build trust in the system but requires multiple and/or longer public meetings.
  • Proportionality – The share of seats won by each political party in a jurisdiction/district should be similar to the share of voters who favored that party in historical elections. For example, if the population of a jurisdiction/district allotted 10 seats is 60% Democratic and 40% Republican, the jurisdiction/district would ideally elect 6 Democrats and 4 Republicans as representatives. This criterion is most common in multi-member districts.
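The proportionality example above (60% and 40% of voters, 10 seats) can be worked through in a few lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the largest-remainder rounding used when shares do not divide evenly is one common choice assumed here, not something the criterion itself prescribes.

```python
def proportional_seats(votes, seats):
    """Split `seats` among parties in proportion to `votes`.

    Uses integer arithmetic: each party first receives the whole-number part
    of its quota, then any leftover seats go to the parties with the largest
    remainders (an assumed rounding rule).
    """
    total = sum(votes.values())
    allocation = {}
    remainders = {}
    for party, count in votes.items():
        allocation[party], remainders[party] = divmod(count * seats, total)
    leftover = seats - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for party in by_remainder[:leftover]:
        allocation[party] += 1
    return allocation


# The example from the text: a 60/40 split of voters and 10 seats.
example = proportional_seats({"Democratic": 60, "Republican": 40}, seats=10)
print(example)
```

With the figures from the text, the ideal outcome is 6 and 4 seats; the rounding rule only matters when the shares do not divide evenly.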

Understanding and adhering to such a myriad of criteria can be a difficult task, but HaystaqDNA can help you draw fair and compliant districts. For more information on Haystaq’s services call us at 202-548-2562 or visit our contact page – https://haystaqdna.com/contact/

Common Redistricting Terms

At-Large Districts

Populations in at-large districts are not divided into several districts and the entire region votes as a cohesive unit. For example, states that are assigned one member in the House hold at-large elections across the entire state.

Reapportionment

The Permanent Apportionment Act of 1929 set the number of seats in the U.S. House of Representatives to 435. After each decennial Census, seats are redistributed to ensure proportional representation in Congress for all states.

P.L. 94-171 Redistricting Data

Public Law (P.L.) 94-171 requires the Census Bureau to provide detailed demographic data needed for redistricting. At the most granular level, the data summarizes the demographic makeup (race/ethnicity, age, etc.) of each Census Block.

Voting Age Population (VAP)

Total population of individuals who are at least 18 years old.

Citizen Voting Age Population (CVAP)

Total population of individuals who are of voting age and are U.S. citizens. This data is usually derived from the American Community Survey (ACS) data set.

Ideal Population

The ideal population size for each district is found by dividing the total population of a region by the number of districts in the respective representative body. Ideal Population = Total Population / Number of Districts

Raw Deviation

The numerical difference of a district’s population from the ideal population. Raw Deviation = District Population - Ideal Population

Percent of Deviation

The proportional difference of a district’s population from the ideal population, expressed as a percentage. % of Deviation = (Raw Deviation / Ideal Population) × 100

Deviation Range

Total deviation across all districts. Found by calculating the percent of deviation for each individual district and adding the absolute values of the minimum and maximum deviations. Deviation Range = abs(min(% of Deviation)) + abs(max(% of Deviation))
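The four definitions above chain together, so a small worked example may help. The sketch below applies them as written to a set of made-up district populations; the function name and the population figures are hypothetical.

```python
def deviation_report(district_populations):
    """Apply the glossary formulas to a list of district populations."""
    total_population = sum(district_populations)
    ideal = total_population / len(district_populations)

    # Raw Deviation = District Population - Ideal Population
    raw = [pop - ideal for pop in district_populations]

    # % of Deviation = Raw Deviation / Ideal Population, as a percentage
    pct = [100.0 * r / ideal for r in raw]

    # Deviation Range = abs(min(% of Deviation)) + abs(max(% of Deviation))
    deviation_range = abs(min(pct)) + abs(max(pct))
    return ideal, raw, pct, deviation_range


# Hypothetical four-district jurisdiction with 40,000 residents.
populations = [9700, 10400, 9900, 10000]
ideal, raw, pct, deviation_range = deviation_report(populations)

print("Ideal population:", ideal)
print("Raw deviations:", raw)
print("Percent deviations:", [round(p, 1) for p in pct])
print("Deviation range:", round(deviation_range, 1))
```

Here the most underpopulated district is 3% below the ideal and the most overpopulated is 4% above it, giving a deviation range of 7%.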

Cracking

Splitting communities into various districts to dilute said group’s voting power.

Packing

The opposite of cracking; concentrating certain populations into a limited number of districts to restrict the number of representatives they can elect.

Voting Rights Act (VRA)

Passed in 1965, the Federal VRA protects voters belonging to minority and/or protected groups from discrimination.

Majority-Minority District

Electoral districts in which the majority of constituents (more than 50% of the citizen voting-age population) belong to racial or ethnic minorities. Where possible, creating or preserving such a district would be required by the VRA to ensure these groups can elect their candidates of choice.

Coalition District

Districts in which individual minority groups do not form a majority but vote together with other minority groups to form a coalition vote bloc in which the sum of these groups forms the majority and is able to elect a candidate of choice. Such a district, however, is not legally required by the VRA.

Opportunity District

A district where a minority group is able to elect their candidate of choice because the majority group votes similarly to them. This is also not legally required by the VRA.

Community of Interest (COI)

A community of interest (COI) is a group of people with shared concerns, interests, and characteristics. Every COI is unique; they might be formed around neighborhoods or the physical landscape, cultures, values, and many other things. Because of these shared interests and concerns, a common redistricting criterion is that COIs be considered during the process.

Incumbent

The current holder of an office or position.

Incumbency Criteria

Requires the incumbent’s house to remain in the district they represent.

 

Haystaq – Abortion Impact on State Legislatures 10/11/2022

 

Expanding Democrats’ Holdings in State Legislatures by Leveraging Abortion


This white paper takes advantage of Haystaq’s national issue scores to identify expansion targets for the Democratic Party in state legislative elections.[1] Based on feedback we have received, we focused on the following chambers in bold that we believe are pickup opportunities when using abortion-related messaging. This analysis considers all seats being contested by the two major parties, excluding only those rated as “Solid D” by CNalysis (whose seat ratings this paper adopts).

Expansion Targets 2022[2] | Protects 2022 Cycle | Long-term Opportunities
Michigan House | Nevada House and Senate | Georgia House only
Pennsylvania House | Maine Senate | North Carolina House
Arizona Senate | Minnesota House | North Carolina Senate
Michigan Senate | |

The table below briefly summarizes the legal status of abortion in the targeted states and public opinion in each state overall. It is quite possible that abortion may prove to be a more salient issue in states where the current abortion policy is at odds with the opinion of the majority of voters.

State | Legal | Constitutional Protection | Legislative Control | % Supporting Abortion Statewide | Abortion Rights Under Threat
Arizona | No (pending court challenges) | No | GOP | 56% | Yes
Pennsylvania | Yes | No | GOP | 56% | Yes[3]
Minnesota | Yes | Yes (non-explicit) | Split | 54% | No
Michigan | Yes (pending court action) | No | GOP | 55% | No
Nevada | Yes | Yes | Dem | 63% | No
Maine | Yes | No | Dem | 62% | No
North Carolina | Yes (viability) | No | GOP | 49% | Yes
Georgia | Yes (6-weeks) | No | GOP | 49% | Yes
Texas | No | No | GOP | 46% | Yes

The key finding is that, given that the vast majority of Americans disapprove of the overturning of Roe v. Wade, there are several dozen GOP-held state legislative seats with pro-choice majorities where Haystaq’s data can be used to identify and target pro-choice voters, and many more seats where vulnerable Democrats can use the abortion issue to bolster their electoral position. If Democrats succeed in unifying the pro-choice vote, the party could gain control of both houses of the Michigan legislature along with the Pennsylvania House and the Arizona Senate. Of these legislative chambers, the Michigan Senate will be the easiest to flip, followed by the PA House and AZ Senate.[4] Evidence from field testing and from recent special elections shows that this issue does move vote choice.

The tables below highlight seats in each legislative chamber where abortion-related messaging can make a difference, with expansion opportunity seats in bold and districts ranked by the salience of the abortion issue. The race ratings key below is taken from CNalysis. For each chamber, the number of seats currently held by the GOP that have pro-choice majorities is noted, and the information is underlined when that number is enough to flip control of the chamber to the Democrats.
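The per-chamber flip counts can be reproduced from the district tables with a few lines of code. The sketch below uses the MI Senate figures from this report and rests on two assumptions that are inferred from the numbers rather than stated in the text: that Percent Pro Choice is the pro-choice likely-voter count divided by the likely-voter count, and that a possible flip is a GOP-held district where that share exceeds 50%. The variable and function names are illustrative.

```python
# (district, current_hold, likely_voters, pro_choice_likely_voters)
# Figures are the MI Senate rows from this report.
mi_senate = [
    (13, "D", 96391, 59925),
    (21, "R", 72090, 42377),
    (4, "D", 71730, 41947),
    (14, "R", 82691, 47801),
    (28, "R", 74818, 42582),
    (9, "D", 75889, 40839),
    (11, "D", 64910, 33591),
]
seats_needed = 3  # seats needed for control, as listed for the MI Senate


def pct_pro_choice(likely_voters, pro_choice_likely_voters):
    """Assumed definition: pro-choice share of likely voters, in percent."""
    return 100.0 * pro_choice_likely_voters / likely_voters


# GOP-held districts with a pro-choice majority among likely voters.
flip_targets = [
    district
    for district, hold, likely, pro_choice in mi_senate
    if hold == "R" and pct_pro_choice(likely, pro_choice) > 50.0
]
enough_to_flip = len(flip_targets) >= seats_needed

print("Possible flips:", flip_targets)
print("Enough to flip the chamber:", enough_to_flip)
```

Under these assumptions the count matches the three possible flips listed for the MI Senate.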

Race Ratings Key

Solid R | Very Likely R | Likely R | Lean R | Tilt R | Tossup | Tilt D | Lean D | Likely D | Very Likely D

Expansion Targets

MI Senate (Chamber Rating: toss-up)

  • 3 flips possible
  • 3 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
13 | D | 203,108 | 96,391 | 124,993 | 59,925 | 62.2%
21 | R | 198,356 | 72,090 | 124,412 | 42,377 | 58.8%
4 | D | 209,569 | 71,730 | 121,671 | 41,947 | 58.5%
14 | R | 201,075 | 82,691 | 109,008 | 47,801 | 57.8%
28 | R | 181,207 | 74,818 | 104,858 | 42,582 | 56.9%
9 | D | 185,843 | 75,889 | 102,357 | 40,839 | 53.8%
11 | D | 198,566 | 64,910 | 108,308 | 33,591 | 51.8%

PA House (Chamber Rating: Likely R)

  • 15 flips identified
  • 12 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
172 | D | 33,927 | 10,996 | 22,451 | 7,861 | 71.5%
45 | D | 43,874 | 15,931 | 27,537 | 10,751 | 67.5%
121 | D | 31,109 | 9,160 | 19,974 | 6,054 | 66.1%
25 | D | 42,922 | 15,281 | 25,405 | 9,733 | 63.7%
16 | D | 40,813 | 14,148 | 21,510 | 8,880 | 62.8%
118 | D | 42,605 | 17,678 | 23,913 | 11,075 | 62.6%
33 | R | 42,033 | 17,389 | 22,957 | 10,729 | 61.7%
2 | D | 40,142 | 15,108 | 22,620 | 9,219 | 61.0%
151 | R | 45,408 | 20,190 | 25,941 | 12,161 | 60.2%
61 | D | 45,892 | 22,495 | 25,552 | 12,769 | 56.8%
29 | R | 49,874 | 24,003 | 26,528 | 13,555 | 56.5%
137 | R | 44,561 | 15,873 | 23,385 | 8,956 | 56.4%
189 | R | 37,397 | 8,393 | 20,618 | 4,691 | 55.9%
53 | D | 40,307 | 14,872 | 22,866 | 8,240 | 55.4%
115 | D | 37,081 | 8,163 | 20,666 | 4,517 | 55.3%
39 | R | 44,855 | 17,716 | 21,461 | 9,763 | 55.1%
146 | D | 41,054 | 12,209 | 22,100 | 6,662 | 54.6%
30 | R | 44,426 | 20,948 | 22,894 | 11,429 | 54.6%
168 | R | 42,387 | 19,484 | 22,816 | 10,466 | 53.7%
3 | D | 44,772 | 19,822 | 22,179 | 10,629 | 53.6%
26 | R | 42,572 | 16,872 | 22,166 | 9,037 | 53.6%
142 | R | 43,865 | 17,959 | 22,188 | 9,544 | 53.1%
18 | R | 39,719 | 12,975 | 22,449 | 6,835 | 52.7%
51 | R | 38,160 | 12,060 | 16,212 | 6,323 | 52.4%
144 | R | 45,006 | 18,380 | 22,831 | 9,625 | 52.4%
74 | D | 37,401 | 10,581 | 18,538 | 5,449 | 51.5%
82 | R | 37,986 | 13,096 | 20,020 | 6,678 | 51.0%
120 | R | 39,922 | 15,036 | 18,435 | 7,586 | 50.5%

AZ Senate (Chamber Rating: Very Likely R)

  • 3 flips possible
  • 2 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
11 | R | 112,243 | 21,923 | 79,713 | 16,064 | 73.3%
18 | D | 158,642 | 75,347 | 111,147 | 48,576 | 64.5%
8 | R | 128,059 | 36,355 | 87,767 | 21,919 | 60.3%
23 | R | 120,100 | 24,701 | 68,004 | 12,805 | 51.8%

MI House (Chamber Rating: lean R)

  • 8 flips possible
  • 3 seats needed to gain control


District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
20 | D | 74,703 | 33,317 | 44,633 | 21,164 | 63.5%
2 | D | 65,447 | 18,182 | 40,242 | 10,875 | 59.8%
21 | D | 58,957 | 28,645 | 36,086 | 16,881 | 58.9%
40 | D | 67,752 | 30,234 | 41,519 | 17,688 | 58.5%
69 | D | 72,560 | 22,945 | 41,253 | 13,418 | 58.5%
22 | D | 74,522 | 38,574 | 42,596 | 21,876 | 56.7%
73 | R | 55,292 | 23,075 | 31,876 | 13,015 | 56.4%
80 | R | 66,445 | 27,813 | 38,367 | 15,675 | 56.4%
81 | R | 68,707 | 30,413 | 38,364 | 16,737 | 55.0%
27 | D | 71,170 | 27,671 | 38,741 | 15,117 | 54.6%
31 | D | 72,470 | 25,464 | 38,074 | 13,908 | 54.6%
55 | D | 66,620 | 31,869 | 38,131 | 17,284 | 54.2%
28 | D | 69,534 | 22,056 | 36,538 | 11,800 | 53.5%
83 | R | 59,140 | 14,944 | 36,566 | 7,973 | 53.4%
54 | D | 69,211 | 32,757 | 40,086 | 17,424 | 53.2%
84 | R | 66,713 | 23,496 | 38,486 | 12,497 | 53.2%
57 | R | 63,492 | 19,714 | 31,956 | 10,376 | 52.6%
58 | R | 65,522 | 21,407 | 34,247 | 11,175 | 52.2%
48 | R | 73,170 | 35,774 | 36,904 | 18,598 | 52.0%
61 | D | 71,308 | 24,653 | 38,596 | 12,660 | 51.4%
76 | D | 70,433 | 29,252 | 37,021 | 14,962 | 51.1%
68 | D | 73,016 | 26,474 | 39,197 | 13,513 | 51.0%

Protects

NV House (Chamber Rating: Lean D)

  • 1 flip possible

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
42 | D | 44,553 | 12,768 | 25,434 | 7,464 | 58.5%
8 | D | 45,109 | 11,885 | 24,845 | 6,932 | 58.3%
1 | D | 51,497 | 17,487 | 28,641 | 10,081 | 57.6%
16 | D | 44,548 | 11,222 | 26,180 | 6,403 | 57.1%
9 | D | 45,980 | 14,436 | 24,380 | 7,992 | 55.4%
34 | D | 46,428 | 14,224 | 26,023 | 7,855 | 55.2%
41 | D | 49,126 | 15,291 | 26,150 | 8,440 | 55.2%
35 | D | 49,196 | 15,937 | 25,225 | 8,609 | 54.0%
5 | D | 46,656 | 15,138 | 25,280 | 8,115 | 53.6%
29 | D | 47,750 | 15,236 | 25,112 | 8,103 | 53.2%
3 | D | 43,360 | 12,115 | 24,085 | 6,425 | 53.0%
25 | R | 49,102 | 25,773 | 25,767 | 13,467 | 52.3%
21 | D | 49,808 | 18,523 | 25,623 | 9,670 | 52.2%
12 | D | 43,998 | 13,569 | 24,805 | 6,950 | 51.2%

NV Senate (Chamber Rating: Very Likely D)

  • 0 flips possible
  • None of the districts up for election this cycle has a pro-choice majority among likely voters, so the districts below are analyzed based on registered rather than likely voters

District With CN Analysis Rating | Current Hold | Reg Voters | Count Pro Choice | Percent Pro Choice
8 | D | 96,819 | 48,480 | 50%
9 | D | 90,550 | 49,824 | 55%
12 | R | 98,969 | 51,790 | 52.3%

ME Senate (Chamber Rating: Toss-Up)

  • 0 flips possible
  • GOP needs nine seats to gain control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
21 | D | 25,931 | 7,124 | 17,802 | 4,036 | 56.7%
32 | D | 26,344 | 8,889 | 14,976 | 4,884 | 54.9%
31 | D | 28,733 | 9,933 | 15,916 | 5,400 | 54.4%
34 | D | 31,246 | 12,996 | 15,178 | 6,576 | 50.6%

MN House (Chamber Rating: Tilt R)

  • 0 flips possible
  • GOP needs four seats to gain control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
53B | D | 25,309 | 7,593 | 16,415 | 4,114 | 54.2%

Long Term Opportunities

NC House (Chamber Rating: Very Likely R)

  • 8 flips possible
  • 18 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
50 | D | 59,865 | 27,811 | 38,018 | 19,911 | 71.6%
54 | D | 61,270 | 30,619 | 38,775 | 21,297 | 69.6%
36 | D | 61,463 | 25,334 | 39,993 | 17,206 | 67.9%
40 | D | 65,822 | 31,750 | 41,228 | 20,272 | 63.8%
2 | R | 59,451 | 25,007 | 34,722 | 15,938 | 63.7%
115 | D | 60,045 | 24,852 | 35,735 | 15,817 | 63.6%
47 | D | 47,726 | 11,876 | 25,628 | 7,539 | 63.5%
35 | D | 60,762 | 24,066 | 37,173 | 15,226 | 63.3%
48 | D | 48,841 | 13,585 | 28,474 | 8,535 | 62.8%
45 | R | 47,078 | 12,215 | 28,417 | 7,652 | 62.6%
105 | D | 56,324 | 20,484 | 35,701 | 12,659 | 61.8%
98 | R | 60,543 | 22,631 | 34,947 | 13,450 | 59.4%
104 | D | 59,160 | 27,187 | 35,184 | 16,127 | 59.3%
9 | D | 55,901 | 19,298 | 32,233 | 11,185 | 58.0%
103 | D | 60,482 | 26,478 | 35,050 | 15,210 | 57.4%
62 | R | 64,967 | 28,217 | 38,258 | 16,097 | 57.0%
32 | D | 52,567 | 17,380 | 29,592 | 9,815 | 56.5%
37 | R | 61,579 | 23,314 | 33,283 | 12,959 | 55.6%
73 | R | 51,777 | 17,394 | 27,506 | 9,460 | 54.4%
43 | R | 52,756 | 16,831 | 27,273 | 8,767 | 52.1%
74 | R | 62,114 | 26,190 | 32,873 | 13,619 | 52.0%
20 | R | 62,597 | 24,229 | 30,376 | 12,350 | 51.0%
24 | D | 54,889 | 18,154 | 28,798 | 9,097 | 50.1%

NC Senate (Chamber Rating: Very Likely R)

  • 2 flips possible
  • 4 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
17 | D | 136,972 | 53,338 | 81,240 | 32,867 | 61.6%
24 | R | 115,536 | 30,281 | 63,238 | 18,564 | 61.3%
18 | D | 135,437 | 53,694 | 79,153 | 32,830 | 61.1%
19 | D | 128,229 | 37,142 | 76,536 | 22,350 | 60.2%
42 | R | 143,920 | 59,588 | 87,923 | 35,736 | 60.0%
3 | D | 136,151 | 47,483 | 69,517 | 24,703 | 52.0%

GA House (Chamber Rating: Very Likely R)

  • 1 flip possible
  • 14 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
54 | D | 38,541 | 14,555 | 28,391 | 9,010 | 61.9%
106 | D | 39,923 | 15,464 | 22,184 | 8,580 | 55.5%
50 | D | 36,503 | 13,086 | 23,946 | 7,105 | 54.3%
101 | D | 38,378 | 12,387 | 21,243 | 6,550 | 52.9%
35 | R | 37,535 | 11,982 | 19,972 | 6,280 | 52.4%
154 | D | 40,566 | 14,684 | 25,057 | 7,670 | 52.2%
105 | D | 38,221 | 12,695 | 20,704 | 6,582 | 51.8%

TX House (Chamber Rating: Very Likely R)

  • 2 flips possible
  • 10 seats needed for control

District With CN Analysis Rating | Current Hold | Reg Voters | Likely Voters | Count Pro Choice | Count Pro Choice Likely Voters | Percent Pro Choice
35 | D | 72,311 | 12,898 | 62,774 | 10,016 | 77.7%
41 | D | 97,046 | 24,425 | 81,649 | 17,763 | 72.7%
74 | D | 109,198 | 25,412 | 87,426 | 17,676 | 69.6%
47 | D | 129,733 | 70,610 | 83,307 | 43,880 | 62.1%
135 | D | 99,870 | 25,133 | 66,184 | 15,587 | 62.0%
37 | D | 96,832 | 23,604 | 75,297 | 14,532 | 61.6%
148 | D | 89,635 | 22,825 | 59,411 | 13,594 | 59.6%
45 | D | 121,123 | 43,602 | 74,356 | 24,199 | 55.5%
105 | D | 75,258 | 20,157 | 50,794 | 11,005 | 54.6%
34 | D | 103,130 | 24,252 | 46,370 | 12,661 | 52.2%
118 | R | 113,023 | 29,691 | 70,522 | 15,032 | 50.6%
31 | R | 111,130 | 35,801 | 71,456 | 18,112 | 50.6%

Works Cited

“Abortion in Nevada.” Wikipedia, 3 Aug. 2022, en.wikipedia.org/wiki/Abortion_in_Nevada#History. Accessed 9 Aug. 2022.

“Arizona Senate.” Wikipedia, 25 July 2022, en.wikipedia.org/wiki/Arizona_Senate. Accessed 9 Aug. 2022.

“Arizona State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/arizona. Accessed 9 Aug. 2022.

Center for Reproductive Rights. “Abortion Laws by State.” Center for Reproductive Rights, reproductiverights.org/maps/abortion-laws-by-state/.

Cohn, Nate. “Do Americans Support Abortion Rights? Depends on the State.” The New York Times, 4 May 2022, www.nytimes.com/2022/05/04/upshot/polling-abortion-states.html.

“Georgia House of Representatives.” Wikipedia, 16 June 2022, en.wikipedia.org/wiki/Georgia_House_of_Representatives. Accessed 16 Aug. 2022.

“Maine State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/maine. Accessed 9 Aug. 2022.

“Michigan Senate.” Wikipedia, 18 May 2022, en.wikipedia.org/wiki/Michigan_Senate. Accessed 9 Aug. 2022.

“Minnesota House of Representatives.” Wikipedia, 24 Mar. 2022, en.wikipedia.org/wiki/Minnesota_House_of_Representatives. Accessed 9 Aug. 2022.

“Minnesota State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/minnesota. Accessed 9 Aug. 2022.

“Nevada Assembly.” Wikipedia, 7 June 2021, en.wikipedia.org/wiki/Nevada_Assembly.

“Nevada State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/nevada. Accessed 9 Aug. 2022.

“North Carolina House of Representatives.” Wikipedia, 1 Aug. 2022, en.wikipedia.org/wiki/North_Carolina_House_of_Representatives. Accessed 16 Aug. 2022.

“North Carolina Senate.” Wikipedia, 8 Oct. 2020, en.wikipedia.org/wiki/North_Carolina_Senate.

“North Carolina State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/north-carolina. Accessed 16 Aug. 2022.

“Pennsylvania House of Representatives.” Wikipedia, 30 July 2022, en.wikipedia.org/wiki/Pennsylvania_House_of_Representatives. Accessed 9 Aug. 2022.

“Pennsylvania State Legislative.” Projects.cnalysis.com, projects.cnalysis.com/21-22/state-legislative/pennsylvania. Accessed 9 Aug. 2022.

Wikipedia Contributors. “Michigan House of Representatives.” Wikipedia, Wikimedia Foundation, 31 Oct. 2019, en.wikipedia.org/wiki/Michigan_House_of_Representatives. Accessed 7 Nov. 2019.

—. “Texas House of Representatives.” Wikipedia, Wikimedia Foundation, 25 Nov. 2019, en.wikipedia.org/wiki/Texas_House_of_Representatives.


[1] In Wyoming and Oklahoma, Democrats are not contesting enough seats to take control of the legislature. In West Virginia, the Dakotas, Tennessee, Kentucky, Indiana, and Utah, not enough Democratic candidates filed to take control of the upper houses of the legislature. In Massachusetts the GOP is not contesting either house of the legislature and in California the GOP is not contesting the state senate.

[2] The AZ house was not analyzed because CNalysis does not rate these races.

[3] Both houses of the PA legislature are controlled by anti-abortion forces, and GOP gubernatorial nominee Doug Mastriano supports a total ban on abortion.

[4] The Michigan house will be difficult to flip because the Dems need to play defense in several challenging seats.

Report on Racial Bloc Voting Analysis for the State of Montana 10/03/2022

 


Prepared for: 

The Montana Districting and Apportionment Commission


May 6, 2022

 

Synopsis

Analysis of the State of Montana’s election results going back to 2014 shows evidence of racial bloc voting (RBV) in the five regions solicited by the Montana Districting and Apportionment Commission. This analysis used general and party primary elections for ten races going back to 2014. We show the results of Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis to support these findings. All analyses were run using two definitions of American Indian, “American Indian Any” and “American Indian Alone”, using Citizen Voting Age Population (CVAP) data. In the following report, we will walk you through our findings. 

Introduction

Figure 1: Map of the 5 regions for analysis defined by the Montana Districting and Apportionment Commission

Since 2004, Montana’s legislative maps have included 6 majority-minority House Districts (out of 100 districts total) and 3 majority-minority Senate Districts (out of 50 districts total). These majority-minority House and Senate Districts cover all or part of 7 reservations in Montana, which guided our RBV analysis. In addition to the three regions spelled out below that surround current majority-minority legislative districts, the Montana Districting and Apportionment Commission has requested that HaystaqDNA analyze the two major cities with the highest American Indian populations, Great Falls and Billings.

 

As of the 2016-2020 American Community Survey 5-year Estimates released by the U.S. Census, Montana has a citizen voting age population statewide that is 6.53% “American Indian Alone” and 8.09% “Any Part American Indian.” For minority demographic groups, it was requested that HaystaqDNA look at citizen voting age populations for “Any Part American Indian,” as well as “American Indian Alone” to determine if racially polarized voting exists and to what extent amongst those two categories.  

Regions

HaystaqDNA performed the analysis on five regions in the State of Montana to determine if there was RBV in past elections. Our analysis includes all precincts within 17 counties, broken down into the following five regions at the request of the Districting and Apportionment Commission. While regions 1 through 3 were selected because they encompass current majority-minority House and Senate Districts, it was proposed that our analysis include all precincts within the 17 counties listed below to ensure coverage of precincts both on and off reservation lands and both within and outside of current majority-minority legislative districts.

 

Region 1 – Blackfeet & Flathead Reservations (SD 8) 

| Reservation | Senate District | House District | County |
| Blackfeet | SD 8 | HDs 15 & 16 | Glacier Co. Precincts |
| Blackfeet | SD 8 | HD 15 | Pondera Co. Precincts |
| Flathead | SD 8 | HD 15 | Lake Co. Precincts |
| Flathead | | | Sanders Co. Precincts |

 

 Region 2 – Rocky Boy’s, Fort Belknap, & Fort Peck Reservations (SD 16) 

| Reservation | Senate District | House District | County |
| Rocky Boy’s | SD 16 | HD 32 | Hill Co. Precincts |
| Rocky Boy’s | SD 16 | HD 32 | Chouteau Co. Precincts |
| Fort Belknap | SD 16 | HD 32 | Blaine Co. Precincts |
| Fort Belknap | SD 16 | HD 32 | Phillips Co. Precincts |
| Fort Peck | SD 16 | HD 31 | Roosevelt Co. Precincts |
| Fort Peck | SD 16 | HD 31 | Valley Co. Precincts |
| Fort Peck | | | Daniels Co. Precincts |
| Fort Peck | | | Sheridan Co. Precincts |

 

Region 3 – Crow & Northern Cheyenne Reservations (SD 21) 

| Reservation | Senate District | House District | County |
| Crow | SD 21 | HDs 41 & 42 | Big Horn Co. Precincts |
| Crow | SD 21 | HD 42 | Yellowstone Co. Precincts 42.1 and 42.2 only (remaining Yellowstone Co. Precincts part of Billings Region) |
| Crow | SD 21 | HD 41 | Rosebud Co. Precincts |
| | SD 21 | HD 41 | Powder River Co. Precincts |

 

Region 4 – City of Billings   

| Reservation | Senate District | House District | County |
| | | | Yellowstone County Precincts (excluding Precincts 42.1 and 42.2, which are included in a majority-minority House and Senate District and part of Region 3) |

 

Region 5 – City of Great Falls (Little Shell) 

| Reservation | Senate District | House District | County |
| | | | Cascade County Precincts |

Elections

The following elections were analyzed using Ecological Inference,  Homogeneous Precinct Analysis and Bivariate Regression Analysis in each of the 5 regions. There were 10 races for analysis total, with 4 from each presidential election cycle and 1 from each midterm between 2014 and 2020. For each of the races listed below, we analyzed the election results from the General, Democratic Primary, and Republican Primary elections where applicable. 

 

  1. 2014 U.S. Senate 
  2. 2016 President 
  3. 2016 Congressional 
  4. 2016 Governor 
  5. 2016 Attorney General 
  6. 2018 U.S. Senate 
  7. 2020 President 
  8. 2020 U.S. Senate 
  9. 2020 Attorney General 
  10. 2020 Auditor 

Data

Election Results & Precinct Shapefiles

Our process began with creating precinct shapefiles joined to election results for each year of elections, to reflect precinct geographies in place at the time of the election. We obtained the shapefile of Montana Voting Precincts from the Montana State Library Services repository. A shapefile is a geospatial vector data format for geographic information system (GIS) software. We use these shapefiles to spatially analyze the election results in comparison with population data. We obtained historical election results at the precinct level from the Montana Secretary of State website. Using the information below, we manually consolidated election results or precinct geographies. 

Four of the 17 counties included in the region of study have had a precinct change between 2014 and 2020. The following is a list of the Precinct Changes between 2014 and 2020 which were made manually on the 2020 precinct shapefile before running the rest of the analysis. 

Precinct Changes between 2014 and 2016

  • Yellowstone County (Region 4 – City of Billings) 

          o Consolidation:  Precincts 40.2 and 45.1 consolidated into Precinct 40-45 

Precinct Changes between 2016 and 2018 

  • Lake County (Region 1 – Flathead Reservation) 

          o Split:  Precinct Pab 1 split from Ron 1 

          o Split:  Precinct Pab 2 split from Ron 2 

*The precinct geography for Ron 1 in 2014 and 2016 was equivalent to the current Pab 1 and Ron 1 combined on the Census Bureau’s 2020 Lake County VTDs.  The precinct geography for Ron 2 in 2014 and 2016 was equivalent to the current Pab 2 and Ron 2 combined on the Census Bureau’s 2020 Lake County VTDs. 

  • Phillips County (Region 2 – Fort Belknap Reservation) 

          o Consolidation:  Precincts 2s, 5, 6, 8s, 11 and 12s consolidated into 11S 

          o Consolidation:  Precincts 2n, 7, 8n, 9-1, 9-2, 12n and 16 consolidated into 11N 

 

  • Valley County (Region 2 – Fort Peck Reservation) 

          o Consolidation:  Precincts 1 and 2 consolidated into 31 

          o Consolidation:  Precincts 2 and 4 consolidated into 33 

          o Consolidation:  Precincts 5, 6, 7 and 8 consolidated into 34 
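The consolidations listed above amount to a crosswalk from older precinct names to their consolidated successors. The following is a minimal sketch of how election results could be rolled up with such a crosswalk, using the Phillips County changes as the example. The function name, data layout, and vote counts are illustrative assumptions, not the workflow or data actually used for this report.

```python
from collections import defaultdict

# Crosswalk for Phillips County, taken from the consolidation list above.
# Old precinct name -> consolidated precinct name.
CROSSWALK = {
    "2s": "11S", "5": "11S", "6": "11S", "8s": "11S", "11": "11S", "12s": "11S",
    "2n": "11N", "7": "11N", "8n": "11N", "9-1": "11N", "9-2": "11N",
    "12n": "11N", "16": "11N",
}


def consolidate(results, crosswalk):
    """Sum candidate votes from old precincts into their consolidated precinct.

    results: {old_precinct: {candidate: votes}}
    Precincts missing from the crosswalk keep their own name.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for precinct, votes in results.items():
        target = crosswalk.get(precinct, precinct)
        for candidate, count in votes.items():
            merged[target][candidate] += count
    return {p: dict(v) for p, v in merged.items()}


# Hypothetical vote counts, for illustration only.
results_2016 = {
    "2s": {"Fox": 40, "Jent": 10},
    "5": {"Fox": 25, "Jent": 5},
    "2n": {"Fox": 30, "Jent": 60},
    "16": {"Fox": 12, "Jent": 8},
}
consolidated = consolidate(results_2016, CROSSWALK)
print(consolidated)
```

Splits (such as Pab 1 from Ron 1) run the other way: later results for the split precincts would be summed back into the earlier, larger geography using the same kind of mapping.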

Population Data

For the demographic population data, we used Citizen Voting Age Population (CVAP) from the American Community Survey 5-year Estimates released by the U.S. Census. These data are available at the Block Group level. We disaggregated the data from the Block Group level to the Block level, and then aggregated the data to precincts. This same process was followed for every election year, following the manual modifications made to the precinct shapefile reflecting the consolidations and changes made to precinct geography outlined above, to join CVAP from the year of the election to election results.
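The two-step process described above (Block Group to Block, then Block to precinct) can be sketched as follows. The report does not state how CVAP was apportioned among blocks, so this example assumes a proportional split by a per-block weight, such as a block-level population count. All identifiers, field names, and numbers are hypothetical.

```python
from collections import defaultdict


def disaggregate_to_blocks(bg_cvap, blocks):
    """Split each Block Group's CVAP across its blocks in proportion to a weight.

    bg_cvap: {block_group_id: cvap}
    blocks:  list of dicts with keys 'block', 'block_group', 'weight', 'precinct'
    The proportional weighting is an assumption, not the report's stated method.
    """
    weight_totals = defaultdict(float)
    for b in blocks:
        weight_totals[b["block_group"]] += b["weight"]

    block_cvap = {}
    for b in blocks:
        total = weight_totals[b["block_group"]]
        share = b["weight"] / total if total else 0.0
        block_cvap[b["block"]] = bg_cvap[b["block_group"]] * share
    return block_cvap


def aggregate_to_precincts(block_cvap, blocks):
    """Sum block-level CVAP into the precinct that contains each block."""
    precinct_cvap = defaultdict(float)
    for b in blocks:
        precinct_cvap[b["precinct"]] += block_cvap[b["block"]]
    return dict(precinct_cvap)


# Hypothetical blocks: BG1 straddles two precincts, BG2 sits inside P2.
blocks = [
    {"block": "B1", "block_group": "BG1", "weight": 30, "precinct": "P1"},
    {"block": "B2", "block_group": "BG1", "weight": 70, "precinct": "P2"},
    {"block": "B3", "block_group": "BG2", "weight": 50, "precinct": "P2"},
]
bg_cvap = {"BG1": 200.0, "BG2": 80.0}

block_cvap = disaggregate_to_blocks(bg_cvap, blocks)
precinct_cvap = aggregate_to_precincts(block_cvap, blocks)
print(precinct_cvap)
```

Because each Block Group's CVAP is only redistributed, never created or dropped, the precinct totals sum to the Block Group totals whenever every Block Group has at least one weighted block.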

We performed the analysis using the “American Indian Alone” category, and also on an “Any Part American Indian” category that we created by combining “American Indian Alone”, “American Indian and White” and “American Indian and Black or African American.” We compared these American Indian categories to “White Alone” and “Other”, which is the combination of all remaining variables. 
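As a small illustration of how the comparison categories fit together, the sketch below builds the American Indian, White Alone, and Other counts for one precinct under each definition. The field names are hypothetical, and treating “Other” as the remainder of total CVAP is our reading of “all remaining variables”, not a documented formula.

```python
# Hypothetical field names; the report does not list the exact CVAP columns.
AI_ANY_PARTS = ("ai_alone", "ai_and_white", "ai_and_black")


def build_categories(row, ai_definition="any"):
    """Return American Indian, White Alone, and Other CVAP for one precinct."""
    if ai_definition == "any":
        # "Any Part American Indian" is the sum of the three combined categories.
        american_indian = sum(row[key] for key in AI_ANY_PARTS)
    else:
        american_indian = row["ai_alone"]
    white_alone = row["white_alone"]
    # Assumption: Other is whatever remains of total CVAP.
    other = row["total"] - american_indian - white_alone
    return {
        "american_indian": american_indian,
        "white_alone": white_alone,
        "other": other,
    }


# Hypothetical precinct counts, for illustration only.
precinct = {
    "total": 1000,
    "ai_alone": 250,
    "ai_and_white": 30,
    "ai_and_black": 5,
    "white_alone": 650,
}
any_def = build_categories(precinct, "any")
alone_def = build_categories(precinct, "alone")
print(any_def, alone_def)
```

Under the Alone definition, the multiracial American Indian counts fall into Other, which is why the Other share differs between the two runs.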

Methodologies

Homogeneous Precinct Analysis

Homogeneous precinct analysis is a method for estimating voting behavior by race or ethnicity by comparing voting patterns in “homogeneous precincts,” i.e. precincts that are composed of a single racial or ethnic group. 

For example, if there is a precinct composed entirely of American Indian voters, and the voters within that precinct give 90% of their votes to Candidate X, then we know that 90% of the American Indian voters supported Candidate X. Since precincts are usually not exclusively one race or ethnicity, precincts that are 90% or more of a single race or ethnicity are usually considered homogeneous for the purposes of this analysis.

A drawback of homogeneous precinct analysis is that we are only able to perform it in a given region that has homogeneous precincts. For example, if a region does not have any precincts that are over 90% CVAP for the race or ethnicity of interest, we are unable to perform homogeneous precinct analysis. 

For the purposes of our analysis, we define a homogeneous American Indian precinct to be any precinct that is 90% or more American Indian Alone or American Indian Any.

In Figure 2, we have an example of homogeneous precinct analysis from the 2016 General Election for Attorney General in Region 1.

Figure 2: Homogeneous precinct analysis: Attorney General, 2016 General Election, Region 1.

In Figure 2, we see in the first table that in all of Region 1, there are 3 homogeneous American Indian precincts, all in Glacier County. We see in the Total row that these precincts are overall 96.2% American Indian Any, and the vast majority of voters, 85.4%, voted for Larry Jent. In the second table, we can see that there are many more homogeneous White precincts in Region 1, and overall they are 94.6% White. The majority of voters in these precincts, 78.8%, voted for Tim Fox. These results show evidence of racial bloc voting in this election and region. 

In the results section, we consider evidence of RBV to be present when the American Indian candidate of choice and the White candidate of choice each receive more than 50% of the vote in their respective homogeneous precincts, and when those candidates differ. 
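The homogeneous precinct method and the RBV rule described above can be sketched in a few lines. This is an illustrative sketch only: the data layout, function names, and vote counts are hypothetical, and pooling votes across qualifying precincts is an assumption modeled on the Total row described for Figure 2.

```python
def homogeneous_summary(precincts, group, threshold=0.90):
    """Pool votes over precincts where `group` is at least `threshold` of total CVAP.

    Returns None when no precinct qualifies, mirroring the drawback noted above.
    """
    selected = [
        p for p in precincts
        if p["cvap"]["total"] > 0
        and p["cvap"][group] / p["cvap"]["total"] >= threshold
    ]
    pooled = {}
    for p in selected:
        for candidate, votes in p["votes"].items():
            pooled[candidate] = pooled.get(candidate, 0) + votes
    total_votes = sum(pooled.values())
    if total_votes == 0:
        return None
    shares = {c: v / total_votes for c, v in pooled.items()}
    choice = max(shares, key=shares.get)
    return {"n_precincts": len(selected), "shares": shares, "choice": choice}


def shows_rbv(summary_a, summary_b):
    """Each group gives more than 50% to its choice, and the choices differ."""
    if summary_a is None or summary_b is None:
        return False
    a, b = summary_a["choice"], summary_b["choice"]
    return summary_a["shares"][a] > 0.5 and summary_b["shares"][b] > 0.5 and a != b


# Hypothetical precincts, for illustration only.
precincts = [
    {"cvap": {"total": 100, "ai": 95, "white": 4}, "votes": {"Jent": 80, "Fox": 20}},
    {"cvap": {"total": 200, "ai": 10, "white": 185}, "votes": {"Jent": 30, "Fox": 120}},
    {"cvap": {"total": 100, "ai": 50, "white": 45}, "votes": {"Jent": 50, "Fox": 40}},
]
ai_summary = homogeneous_summary(precincts, "ai")
white_summary = homogeneous_summary(precincts, "white")
rbv = shows_rbv(ai_summary, white_summary)
print(ai_summary, white_summary, rbv)
```

The third precinct is mixed (50% American Indian, 45% White), so it contributes to neither summary; only the first two drive the result.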

Bivariate Regression Analysis

Bivariate regression analysis provides estimates of voting patterns by race or ethnicity across precincts, regardless of the existence of homogeneous precincts. The analysis shows the relationship between each candidate’s precinct-level vote share and the precinct-level CVAP for each race or ethnicity. 

For example, in Figure 3, we can see the plot of % American Indian Alone, White Alone, and Other CVAP in comparison with the % of Votes for Tim Fox and Larry Jent in the 2016 General Election for Attorney General in Region 1.  

Figure 3: Bivariate regression plot: Attorney General, 2016 General Election, Region 1.

In Figure 3, each point represents a precinct and its share of CVAP on the x-axis vs the candidate votes in that precinct on the y-axis. In the top left, we can see that as the proportion of American Indian Alone CVAP increases, the share of votes for Tim Fox decreases. In the bottom right, we can see that as the proportion of American Indian Alone CVAP increases, the share of votes for Larry Jent increases. The inverse is true for the White Alone category. For the Other category, the % CVAP is so small that we cannot draw any conclusions. 

In Figure 4, we can see the correlation coefficients for each bivariate relationship shown in Figure 3. 

Figure 4: Bivariate regression correlation coefficients: Attorney General, 2016 General Election, Region 1.

For each plot, the correlation coefficient ranges from -1 to 1, where -1 is a perfect negative correlation and 1 is a perfect positive correlation. The coefficient measures how closely the points follow a line of best fit drawn on each plot in Figure 3, and its sign matches the direction of that line’s slope. Here, in Region 1, we see a strong positive correlation between American Indian Alone CVAP and votes for Larry Jent, with a coefficient of 0.9265. We also see a strong positive correlation between White Alone CVAP and votes for Tim Fox in the 2016 Attorney General’s race. These results show evidence of racial bloc voting in this election in Region 1. 

In the results section, we consider evidence of RBV to be present when we see a strong positive correlation of greater than 0.5 for the American Indian and White candidate of choice, and when those candidates differ. 
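The correlation rule above can be illustrated with a short sketch that computes the Pearson correlation between a group's precinct-level CVAP share and each candidate's vote share, then applies the 0.5 cutoff. The function names and all precinct values are hypothetical, and the sketch assumes the coefficients reported in Figure 4 are Pearson correlations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sd_x * sd_y)


def preferred_candidate(group_share, vote_shares, cutoff=0.5):
    """Return (candidate, r) for the strongest positive correlation.

    The candidate is None when the correlation does not exceed the cutoff.
    """
    correlations = {c: pearson(group_share, ys) for c, ys in vote_shares.items()}
    best = max(correlations, key=correlations.get)
    r = correlations[best]
    return (best if r > cutoff else None), r


# Hypothetical precinct-level shares, for illustration only.
ai_share = [0.05, 0.20, 0.50, 0.80, 0.95]
white_share = [0.93, 0.78, 0.47, 0.18, 0.04]
vote_shares = {
    "Jent": [0.20, 0.30, 0.55, 0.75, 0.85],
    "Fox": [0.78, 0.68, 0.43, 0.24, 0.14],
}

ai_choice, ai_r = preferred_candidate(ai_share, vote_shares)
white_choice, white_r = preferred_candidate(white_share, vote_shares)
rbv = ai_choice is not None and white_choice is not None and ai_choice != white_choice
print(ai_choice, round(ai_r, 4), white_choice, round(white_r, 4), rbv)
```

A correlation describes how vote share moves with CVAP share across precincts; it is not itself an estimate of the percentage of a group that supported a candidate.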

Ecological Inference Analysis

Ecological Inference (EI) is the process of drawing conclusions about individual-level behavior from aggregate-level data. The process involves using aggregate (historically called “ecological”) data to draw conclusions about individual-level behavior when no individual-level data are available. The fundamental difficulty with such inferences is that many different possible relationships at the individual level can generate the same observation at the aggregate level. For example, there are a very large number of ways in which electoral support for a political candidate can break down among individual voters and still produce the same aggregate level of support. In the absence of individual-level measurement (for example in the form of surveys), such information needs to be inferred.

EI analysis builds on ecological regression analysis by incorporating method of bounds and maximum likelihood estimation statistical techniques. For our analysis, we use the eiCompare package in R, which builds on Gary King’s ei package in R. 
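To give a sense of the method of bounds mentioned above, the sketch below computes the deterministic range of a group's possible support for a candidate within a single precinct. It is not the eiCompare or ei implementation, which are R packages; the function name is hypothetical, and the sketch assumes the group's share of CVAP equals its share of the votes cast.

```python
def group_support_bounds(group_share, candidate_share):
    """Method-of-bounds limits on a group's support for a candidate in one precinct.

    group_share:     the group's fraction of the precinct electorate (0 < x <= 1)
    candidate_share: the candidate's fraction of the precinct vote (0 <= t <= 1)

    Since t = x * b_group + (1 - x) * b_rest with b_rest between 0 and 1,
    b_group must lie between (t - (1 - x)) / x and t / x, clipped to [0, 1].
    """
    if not 0 < group_share <= 1:
        raise ValueError("group_share must be in (0, 1]")
    if not 0 <= candidate_share <= 1:
        raise ValueError("candidate_share must be in [0, 1]")
    lower = max(0.0, (candidate_share - (1.0 - group_share)) / group_share)
    upper = min(1.0, candidate_share / group_share)
    return lower, upper


# Hypothetical precincts, for illustration only.
tight = group_support_bounds(0.90, 0.85)   # nearly homogeneous precinct
loose = group_support_bounds(0.30, 0.50)   # mixed precinct
print(tight, loose)
```

In the nearly homogeneous precinct the bounds are narrow, while in the mixed precinct they span the full range from 0 to 1. That is why EI combines these bounds with a statistical model across all precincts rather than relying on the bounds alone.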

Figure 5 provides an example of EI analysis for the same election as the previous examples, the 2016 Attorney General election in Region 1. 

Figure 5: Ecological Inference Plot: Attorney General, 2016 General Election, Region 1.

In Figure 5, we see the estimates of the EI analysis. The green dots represent the EI estimate, and the lines on either side of the dots represent the confidence interval of the statistical estimate. When looking at the estimates for American Indian Alone in the top box, you can see that the estimate of votes is strongly for Larry Jent. For White CVAP, the preference is for Tim Fox.

Figure 6: Ecological Inference Estimates: Attorney General, 2016 General Election, Region 1.

Figure 6 gives the precise estimates shown in the chart above. In the above example, the predicted support for Larry Jent among American Indian Alone CVAP was 90% and among White CVAP was 14.8%. The predicted support for Tim Fox among White CVAP was 85%. The confidence intervals (ci_95_lower and ci_95_upper) indicate that, for this estimate, we can be 95% confident that the true level of support for Tim Fox among White CVAP is between 83.75% and 86.63%. These results show evidence of racial bloc voting in this election and region. 

In the results section, we consider evidence of RBV to be present when we see a clear majority preference for both the American Indian and White candidate of choice and when those candidates differ. 

Results

In looking at elections in Montana going back to 2014, we found evidence of racial bloc voting in each of the five regions we analyzed. That is, we found evidence that using either the “American Indian Alone” or the “American Indian Any” definition, American Indian voters vote cohesively in support of their candidate of choice, and that White voters often vote in a bloc for a different candidate of choice. 

Across regions, we only saw a few instances where there was RBV for only one of “American Indian Alone” or “American Indian Any”, but not for both. Overall, the results were very similar between the two definitions of American Indian. 

In the following sections, we provide summary level results of our findings for each Region that we analyzed. We are also providing a Supporting Appendix with the charts for each methodology and definition of American Indian where we found evidence of RBV. 

Region 1 – Blackfeet & Flathead Reservations (SD 8)

Figure 7: Region 1

Region 1 consists of all precincts in Glacier, Pondera, Lake, and Sanders counties. Region 1 has a total CVAP of approximately 47,434. The American Indian Alone CVAP is approximately 12,693, or 26.76%, and the American Indian Any CVAP is 14,021, or 29.57% of the total CVAP. 

The examples provided in the methodology show evidence of racial bloc voting in the 2016 General Election for Attorney General. 

In Region 1, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, General Election
  • 2016 President, General Election
  • 2016 Congressional, General Election
  • 2016 Attorney General, General Election
  • 2016 Governor, General Election
  • 2018 U.S. Senate, General Election
  • 2020 President, General Election
  • 2020 U.S. Senate, General Election
  • 2020 Attorney General, General Election
  • 2020 Auditor, General Election

 

In addition, we found evidence of RBV with the following methods and elections: 

  • Homogeneous Precinct Analysis showed RBV in the 2016 Presidential Democratic Primary and the 2020 Attorney General Republican Primary. 
  • Bivariate Regression Analysis showed RBV in the 2014 US Senate Republican Primary and the 2016 Gubernatorial Republican Primary for both American Indian Alone and American Indian Any definitions. 

Our analysis did not find evidence of RBV using Ecological Inference in any primary elections in Region 1.  

Region 2 – Rocky Boy’s, Fort Belknap, & Fort Peck Reservations (SD 16)

Figure 8: Region 2

Region 2 consists of all precincts in Hill, Chouteau, Blaine, Phillips, Roosevelt, Valley, Daniels, and Sheridan counties. Region 2 has a total CVAP of approximately 40,957. The American Indian Alone CVAP is approximately 10,590, or 25.9%, and the American Indian Any CVAP is 11,581, or 28.3% of the total CVAP. 

In Region 2, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, General Election
  • 2016 President, General Election
  • 2016 Congressional, General Election
  • 2016 Governor, General Election
  • 2016 Attorney General, General Election
  • 2018 U.S. Senate, General Election
  • 2020 President, General Election
  • 2020 U.S. Senate, General Election
  • 2020 Attorney General, General Election
  • 2020 Auditor, General Election
  • 2020 Auditor, Democratic Primary

The 2016 Democratic Primary Election for President showed evidence of RBV using the Homogeneous Precinct Analysis and Ecological Inference Analysis, using both the American Indian Alone and American Indian Any definitions.

Region 3 – Crow & Northern Cheyenne Reservations (SD 21)

Figure 9: Region 3

Region 3 consists of all precincts in Big Horn, Rosebud, and Powder River counties. Region 3 also includes Yellowstone County Precincts 42.1 and 42.2 (the remaining Yellowstone County precincts are part of the Billings region). Region 3 has a total CVAP of approximately 17,532. The American Indian Alone CVAP is approximately 7,615, or 43.4%, and the American Indian Any CVAP is 8,016, or 45.7% of the total CVAP. 

In Region 3, the Homogeneous Precinct Analysis, Bivariate Regression Analysis, and Ecological Inference Analysis all showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, General Election
  • 2016 President, General Election
  • 2016 Congressional, General Election
  • 2016 Attorney General, General Election
  • 2016 Governor, General Election
  • 2018 U.S. Senate, General Election
  • 2020 President, General Election
  • 2020 U.S. Senate, General Election
  • 2020 Attorney General, General Election
  • 2020 Auditor, General Election

In addition, we found evidence of RBV with the following methods and elections: 

  • Ecological Inference showed evidence of RBV in the 2020 Attorney General Democratic Primary for both American Indian Alone and American Indian Any definitions. 
  • Bivariate Regression Analysis showed evidence of RBV in the 2016 Presidential Republican Primary for both American Indian Alone and American Indian Any definitions. 
  • Homogeneous Precinct Analysis showed evidence of RBV in the 2016 Gubernatorial Republican Primary. 

Region 4 – City of Billings

Figure 10: Region 4

Region 4 consists of all precincts in Yellowstone County, except for precincts 42.1 and 42.2 which are included in Region 3. Region 4 has a total CVAP of approximately 121,688. The American Indian Alone CVAP is approximately 5,484, or 4.5%, and the American Indian Any CVAP is 7,351, or 6% of the total CVAP. 

In Region 4, we were not able to perform Homogeneous Precinct Analysis because there were no homogeneous American Indian Alone or American Indian Any precincts, where the American Indian CVAP was 90% or more of the total CVAP. 

The Bivariate Regression Analysis and Ecological Inference Analysis showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, General Election
  • 2016 President, General Election
  • 2016 Congressional, General Election
  • 2016 Governor, General Election
  • 2016 Attorney General, General Election
  • 2018 U.S. Senate, General Election
  • 2020 President, General Election
  • 2020 U.S. Senate, General Election
  • 2020 Attorney General, General Election
  • 2020 Auditor, General Election

Additionally, the Ecological Inference alone showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, Democratic Primary
  • 2016 President, Democratic Primary
  • 2016 Governor, Republican Primary
  • 2018 U.S. Senate, Republican Primary
  • 2020 President, Democratic Primary
  • 2020 Attorney General, Democratic Primary
  • 2020 Attorney General, Republican Primary 
  • 2020 Auditor, Democratic Primary

Using only the American Indian Alone definition, the Ecological Inference showed RBV in the following elections as well: 

  • 2014 U.S. Senate, Republican Primary
  • 2020 Auditor, Republican Primary

Using only the American Indian Any definition, the Ecological Inference showed RBV in the following election as well: 

  • 2016 President, Republican Primary

The Bivariate Regression Analysis also showed RBV in the 2020 U.S. Senate Republican Primary for both American Indian Alone and American Indian Any. 

Region 5 – City of Great Falls (Little Shell)

Figure 11: Region 5

Region 5 consists of all precincts in Cascade County. Region 5 has a total CVAP of approximately 63,032. The American Indian Alone CVAP is approximately 3,434, or 5.5%, and the American Indian Any CVAP is 4,697, or 7.5% of the total CVAP. 

In Region 5, we were not able to perform Homogeneous Precinct Analysis because there were no homogeneous American Indian Alone or American Indian Any precincts, where the American Indian CVAP was 90% or more of the total CVAP. 

The Ecological Inference Analysis showed evidence of RBV in the following elections, using both the American Indian Alone and American Indian Any definitions: 

  • 2014 U.S. Senate, General Election
  • 2016 President, General Election
  • 2016 Congressional, General Election
  • 2016 Attorney General, General Election
  • 2018 U.S. Senate, General Election
  • 2020 President, General Election
  • 2020 U.S. Senate, General Election
  • 2020 Attorney General, General Election
  • 2020 Attorney General, Democratic Primary
  • 2020 Auditor, General Election
  • 2020 Auditor, Democratic Primary

Additionally, the Ecological Inference showed evidence of RBV in the 2016 Presidential Democratic Primary using American Indian Alone only. 

Supporting Appendices

Attached to this report is a folder of Supporting Appendices that has the following structure: 

Supporting Appendices

    Bivariate Regression and Ecological Inference Analysis – this folder contains all charts as shown in the methodology section for these analyses, where we found evidence of RBV. The charts are broken out by Region.

        Region 1

            Files ending in ‘Bivariate Plot’ and ‘Bivariate Coefficients’ show the results of the Bivariate Regression Analysis.

            Files ending in ‘EI Plot’ and ‘EI Estimates’ show the results of the Ecological Inference Analysis.

            G stands for General Election, DemP stands for Democratic Primary, RepP stands for Republican Primary in the file names.

            AI stands for American Indian in the file names.

        Region 2

        Region 3

        Region 4

        Region 5

    Homogeneous Precinct Analysis – this folder contains data files (in CSV format) of the results of the homogeneous precinct analysis for all regions that had precincts that were 90% or more American Indian Any/Alone or White.

        G stands for General Election, DemP stands for Democratic Primary, RepP stands for Republican Primary in the file names.

        AI stands for American Indian in the file names.

 

 

Growing Your Audiences 9/20/2022

HaystaqDNA’s Predictive Microtargeting – Growing your audiences

Microtargeting has historically been used by political campaigns to not only identify potential supporters, but to track individual voters. Its success comes in its ability to craft tailored messages to the identifiable targeted subgroup. As the digital footprint of millions of Americans grows larger, so does the ability to identify an individual’s disposition. Campaigns have learned that rather than a single television advertisement, multiple tailored messages to smaller audiences can be more effective. Ensuring your message is in the spoken language of the targeted individual is essential. Today businesses and organizations alike have begun adopting many of these microtargeting practices due to their cost, accuracy, and success rate.

As campaigns and marketing teams rely further on targeted advertisement, it is less about reaching the masses than about reaching individuals receptive to the message. Finding those to target has become an increasingly difficult task for many organizations and companies to handle internally. Many, however, underutilize their ability to reach more consumers with data already in their possession. Whether it’s a campaign that’s fundraising or an NGO that is phone banking, data utilization is essential in growing one’s business. At Haystaq, our ability to import your data, model, and score ensures a cost-effective advertising strategy that targets consumers who are receptive to the media we are delivering.

Our data modeling has been used in multiple industries including healthcare, retail, education, entertainment, and professional sports. The basis for our scores comes from our ability to identify individuals. With records on over 220 million adults (18+) and continuously updated national issue scores, we maintain a database that allows us not only to identify and match individuals, but also to create custom models that provide clients with exactly what they need. From hot-button issues to state and local concerns, our scores cover a wide range of topics. These scores paint a picture of individual personality and interests.

Perhaps the most important part of this process is matching the data provided by the client to our records. We can work with what we are given. Our list matching can be done with the most basic individual information, like name and address. The match rate depends on the amount of information given as well as the integrity of the information provided. By matching, we can connect individual records provided by the client to their larger digital footprint. This additional information gives the client the latitude to decide how they want to target this individual.

Next we move to translating the data for our models. This allows us to rank the individuals based on the specifications needed. After our models run we look at the attributes of individuals who score highly and report this information back to the client for insight. With candidate support and donor/fundraising models, our proprietary modeling gives clients lists that will best serve them. Scores are important to understanding demographics, views on issues, as well as home life and financial standing.

Recently, we did a project in the state of California where we received statewide membership data on over 10,000 members. We were able to build out models and apply the score to all 20 million+ voters in the state. This allowed advocacy organizations to then target for new members throughout the state, including areas that they previously were not in. At Haystaq we specialize in reading and combining client data in order to expand the targeted audience.

We are then able to split individuals into ranked tiers, based on their level of support. Combined with other indicators, this allowed the client to choose which groups of individuals to target in each region. These lists can be used for mail and email campaigns, and our cell phone indicator allows for SMS and text-message-based campaigns. In the past we have used this to help clients find new car buyers, restaurant goers, solar panel owners and more. The technology at our disposal allows us to engage in targeted advertising in new and complex ways. We increase the precision and accuracy of our marketing by knowing specific traits, preferences, and interests of the average American.

Who are Black Republicans? 9/02/2022

Who are Black Republicans?

Since 1940, African American voters across the nation have shifted en masse from the GOP to the Democratic Party.1 This shift was driven by African Americans’ (AA) support for Democratic Party programs including the New Deal and the Great Society, and by Lyndon Johnson’s critical support for the civil rights movement of the 1960s.2 Once home to relatively liberal African Americans during the Reconstruction era, the GOP lost Black membership due to the disenfranchisement of Black voters in the South after the end of Reconstruction in 1877 and because of the segregationist “lily-white” movement within the party. More recently, the “southern strategy” of appealing to Wallace and Thurmond voters, which the GOP has pursued since the Goldwater era, has left many feeling that the Republican Party does not care about Black America.

 

African American political behavior is shaped by a concept that political scientists refer to as “shared fate”: Black Americans feel they gain when representatives of the African American community succeed. This leads to favoritism towards African American candidates in many races, and to most African Americans voting for left-wing candidates regardless of their other demographic characteristics. This does not mean that African Americans are uniformly liberal. Indeed, roughly a third of them identified as conservative when surveyed as of the 2020 primary, a massive increase since 1972, when ten percent did, even as the 2010s saw a slight increase in the number of African Americans identifying as liberal or moderate.3 Nonetheless, most African Americans (90-95%) vote Democratic because of a belief that Democrats support the AA community and because of intense pressure within this community to vote the Democratic ticket.4 Further, under certain circumstances African Americans can be convinced to vote for conservative candidates, and Gen Z AAs express much skepticism about the Democratic Party.5 Black Republicans share a sense of shared fate with other African Americans, have strong connections to the African American community, have a strong sense of Black identity, and take a variety of perspectives on whether public policies should be examined using a race-conscious lens. 

 

Across the years, a number of Black conservative pundits, including Larry Elder, Carol Swain, Jason Riley, Candace Owens, Herman Cain, and Ben Carson, have argued that African Americans should vote for the GOP and that the Democratic Party would collapse if the Republicans could get 20-30% of the AA vote.6 These pundits repeat a familiar set of talking points regarding the Democrats’ historical affinity for the Klan and slavery, the supposedly negative effects of the welfare state on African American families, the disproportionate impact of abortion on Black women, and the apparent disconnect between the views of everyday AAs and Black political leadership on a variety of issues such as school choice.7 These talking heads challenge the notion that African Americans are racially targeted in police killings and argue that Democratic policies promote a victim mentality and dependence on government. Previous research on AA Republicans suggests that these voters are broadly representative of the AA community on a variety of demographic factors and actually attend church less often than AA Democrats.

 

Methodology 

This paper aims to examine African American Republicans (roughly 5-10% of the Black community and 2% of all Republicans) who vote for the Republican Party. It takes advantage of states where voters register by both party and race (NC, FL, and LA) and uses data from L2’s voter file to build demographic profiles of AA GOP registrants (about 8% of all African Americans in the USA). This methodology produces a sample of 136,460 Black GOP voters. The findings cut against those of Corey Fields in Black Elephants. The analysis shows that Black Republicans are disproportionately male, richer, and less likely to vote than most African Americans. Further, HaystaqDNA’s voter targeting models offer several insights on how this segment of the electorate might be reached: 56% of our Black Republicans use Fox News as their primary TV news source, and our models suggest that a supermajority (73%) oppose former president Trump.8

  • The African American population in the US is 48% male, whereas our sample is 55% male, and nearly 70% of our Black Republicans are unmarried compared to 59% of all voters nationwide

 

  • African American Republican voters appear to be skewed more middle-class than AAs nationwide in 2019 
    • Nationwide, 56% of all native-born African Americans have incomes lower than $50,000; in our sample the number is 35.52%
    • Nationwide, 27% of all native-born African Americans have incomes between $50,000 and $99,999; in our sample the number is 42.61%
    • Nationwide, 17% of native-born African Americans have incomes above $100,000; in our sample the number is 18.18%
  • Almost half of these voters (47.99%) live in politically mixed households. This compares to 25.25% of all Americans nationwide9
  • Despite their relative wealth compared to other African Americans, Black Republicans are still poor compared to the average citizen in their state and disproportionately concentrated in the bottom 20% of the income distribution. 

  • Compared to other African Americans nationwide who are citizens and eligible to vote, our Black Republicans have far lower turnout in some elections and slightly lower turnout in others. This is the opposite of what we would expect, because the nationwide figures cover all eligible Black Americans, not only those registered to vote.

  • While a significant number of African Americans in each state consider themselves conservative (33-45%), at most, between 6-10% of conservative Blacks are registered Republicans, and in most states, less than a quarter of them voted for Trump.

  •  This is in keeping with previous research, which demonstrates that African Americans who consider themselves conservative are more likely to be
    • Young
    • Female
    • Southern
    • Economically disadvantaged
    • Unmarried
    • Uneducated 

This contrasts with Black Americans who consider themselves Republicans, suggesting that for African American men, being older and better off produces an alignment between conservative views and party ID.

 

Works Cited

Bennett, Lerone. BEFORE the MAYFLOWER: A History of the Negro in America, 1619-1962. Martino Fine Books, 2018, p. 371.

Bracey, Christopher. Saviors or Sellouts: The Promise and Peril of Black Conservatism, from Booker T. Washington to Condoleezza Rice. Beacon Pr, 2009, pp. 4–19.

Elder, Larry. Stupid Black Men: How to Play the Race Card– and Lose. New York, St. Martin’s Press, 2008.

Fields, Corey. Black Elephants in the Room: The Unexpected Politics of African American Republicans. Berkeley, University Of California Press, 2016, pp. 13, 15, 16, 19, 38, 62.

“Florida 2020 President Exit Polls.” Www.cnn.com, www.cnn.com/election/2020/exit-polls/president/florida.

Hansen, John. “Lecture: Black Politics.” 2019.

Hersh, Eitan. Hacking the Electorate: How Campaigns Perceive Voters. Cambridge, Cambridge University Press, 2015, pp. 122–125.

Hetherington, Marc J, and Jonathan Weiler. Prius or Pickup? : How the Answers to Four Simple Questions Explain America’s Great Divide. Boston; New York, Houghton Mifflin Harcourt, 2018, pp. 18, 51–53.

King, Maya. “For Some Black Youth, It’s Time to Question Democratic Loyalties.” Politico, 2020, www.politico.com/news/2020/10/11/gen-z-black-youth-conservatives-trump-421914.

Lepore, Jill. IF THEN: How Simulmatics Corporation Invented the Future. S.L., Liveright Publishing Corp, 2021, pp. 118–120.

“Louisiana Voter Surveys: How Different Groups Voted.” The New York Times, 3 Nov. 2020, www.nytimes.com/interactive/2020/11/03/us/elections/ap-polls-louisiana.html. Accessed 19 July 2022.

Malone, Justin. “Uncle Tom.” Under the Milky Way, 19 Aug. 2020, www.youtube.com/watch?v=UOvuylAnw-E.

Masket, Seth. LEARNING from LOSS: The Democrats, 2016-2020. New York City, Cambridge Univ Press, 2020, p. 45.

“North Carolina Exit Polls: How Different Groups Voted.” The New York Times, 3 Nov. 2020, www.nytimes.com/interactive/2020/11/03/us/elections/exit-polls-north-carolina.html. Accessed 19 July 2022.

Owens, Candace. Blackout: How Black America Can Make Its Second Escape from the Democrat Plantation. New York, Threshold Editions, 2020.

Pew Research Center. “Religious Landscape Study.” Pew Research Center’s Religion & Public Life Project, www.pewresearch.org/religion/religious-landscape-study/compare/political-ideology/by/state/among/racial-and-ethnic-composition/black/. Accessed 19 July 2022.

Philpot, Tasha S. Conservative but Not Republican: The Paradox of Party Identification and Ideology among African Americans. Cambridge, United Kingdom; New York, NY, Cambridge University Press, 2017.

Public Religion Research Institute. “PRRI – American Values Atlas.” Ava.prri.org, Public Religion Research Institute, 2020, ava.prri.org/#politics/2020/States/ideology/. Accessed 19 July 2022.

Richardson, Heather. To Make Men Free: A History of the Republican Party. New York, Basic Books, 2014.

Riley, Jason. Please Stop Helping Us: How Liberals Make It Harder for Blacks to Succeed. New York, Encounter Books, 2016, pp. 28–31.

Tamir, Christine, and Monica Anderson. “5. Household Income, Poverty Status, and Home Ownership among Black Immigrants.” Pew Research Center Race & Ethnicity, Pew Research Center, 20 Jan. 2022, www.pewresearch.org/race-ethnicity/2022/01/20/household-income-poverty-status-and-home-ownership-among-black-immigrants/#:~:text=In%202019. Accessed 14 July 2022.

US Census Bureau. “Historical Reported Voting Rates.” The United States Census Bureau, Department Of Commerce, 7 Oct. 2019, www.census.gov/data/tables/time-series/demo/voting-and-registration/voting-historical-time-series.html.

Yan, Alan, and Hakeem Jefferson. “How the Two-Party System Obscures the Complexity of Black Americans’ Politics.” FiveThirtyEight, 6 Oct. 2020, fivethirtyeight.com/features/how-the-two-party-system-obscures-the-complexity-of-black-americans-politics/. Accessed 9 Oct. 2020.

 

 

Identify & Model Owners of Solar Panels

Download a PDF of the Case Study

As of 2017, less than one percent of the 125M-plus residential buildings in the United States have solar panels installed. This low penetration exists despite a 30% federal tax credit and other state and local incentives. At the same time, according to the framework of the recently signed Paris Accords, the United States will need to reduce its greenhouse gas emissions by 26% below 2005 levels by 2025. 30% of US greenhouse gas emissions currently come from electrical generation. While many reductions in this category will come from new solar and wind power plants, residential solar will also play a large role in meeting the Paris obligations. The financial benefit a consumer could realize from installing residential solar differs according to a geography’s level of solar suitability (sunlight) and a state’s financial incentives, but there are a number of states where it is already economically worthwhile to install solar (AZ, CA, CO, DE, FL, HI, MA, MD, NJ, NV, NY, etc.). This raises the question of how we find the individual homeowners most likely to buy or lease solar panels, particularly in these target states.

HaystaqDNA’s interest in this area is not entirely academic. While the company’s origins are as a left-leaning political modeling firm, there is an immediate value in finding residential solar buyers for other industries as well. Individuals who buy in one category of green energy will usually buy in others. Haystaq’s existing clients in the automotive industry face the increasing need to find buyers of electric and electric-hybrid vehicles. In the coming years, manufacturers of LED lights, tankless water heaters, high-efficiency appliances, etc. will be able to market to these same individuals.

Haystaq’s founder and CEO, Ken Strasma, perfected his microtargeting skills and techniques in John Kerry’s 2004 Democratic primary run and in Barack Obama’s 2008 presidential campaign. In 2013, HaystaqDNA was formed to take these techniques and technologies used in the political arena and use them to help companies find and understand their customers in the corporate world. We knew that if we could find a sample of existing solar panel owners, we could use our advanced statistical algorithms to find other consumers who would behave in a similar way, just as we have in politics. Where there is sufficient data available, Haystaq has successfully applied microtargeting techniques to verticals including automotive, healthcare, television programming, professional sports, consumer packaged goods and retail. Unfortunately, outside of the agencies that regulate state incentive programs (which don’t share data), there is no centralized source of solar panel owners. Haystaq needed to create its own method of identifying solar owners.

     While solar panel ownership data is not freely available, there are a number of sources for satellite and aerial photos.  Through early in-house experiments, we found that if satellite images were overlaid with GPS coordinates (either provided by the vendor or geocoded in-house) we could match images of structures to the owners of those structures.  We did this by using a commercial database and either using the provided latitude/longitude values for each household or by geocoding the addresses when this information was not provided.  We were then able to manually review images rooftop by rooftop and determine which had solar panels.  This initial effort was successful, but time-intensive and wearying for our analysts.
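As a rough illustration of the matching step described above (not Haystaq's production code), the sketch below assigns a rooftop image to the nearest household by latitude/longitude. The function names, the sample coordinates, and the 25-meter cutoff are all assumptions made for the example.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_rooftop(rooftop, households, max_dist_m=25.0):
    """Return the id of the closest household within max_dist_m, else None.

    max_dist_m is a hypothetical cutoff, not a value from the case study.
    """
    best_id, best_d = None, max_dist_m
    for hh in households:
        d = haversine_m(rooftop["lat"], rooftop["lon"], hh["lat"], hh["lon"])
        if d <= best_d:
            best_id, best_d = hh["id"], d
    return best_id

# Made-up households and rooftop center point for illustration.
households = [
    {"id": "hh-1", "lat": 33.7175, "lon": -117.8311},
    {"id": "hh-2", "lat": 33.7180, "lon": -117.8311},
]
rooftop = {"lat": 33.71751, "lon": -117.83111}
matched = match_rooftop(rooftop, households)
print(matched)
```

A linear scan like this is only workable for small samples; a spatial index would be the usual choice at the scale of a national consumer file.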

image06

Above: An image from Haystaq’s early internal interface for finding rooftop solar panels

The next step was to turn to Amazon’s Mechanical Turk (MTurk) crowdsourcing marketplace. MTurk allows Haystaq to put out a request for work to be completed by remote workers willing to work for the incentive offered. In this case we wanted users to look at images of roofs and mark whether each one appears to contain a solar panel or not. We automated a backend to carve images into individual residential buildings, match those buildings to households on our consumer file and feed those images into Mechanical Turk’s API. We slip in images that we know contain solar panels, and we use these images as our Quality Assurance. If a worker cannot correctly identify the QA images, we disregard their work and prevent them from accepting future work from us. For the non-QA images, we feed each image to two different workers; if they agree that an image does or does not contain solar panels, we mark it as such. If the workers disagree, we send the image to a third worker for arbitration. Due to a high prevalence of rooftop solar panels, we have collected test samples from areas of Orange County, CA; Los Angeles County, CA; and Nevada.
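The QA and arbitration rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not the actual backend: the function names, the data layout, and the `min_accuracy` threshold (here, every QA image must be labeled correctly) are assumptions.

```python
def passes_qa(worker_answers, qa_truth, min_accuracy=1.0):
    """True if the worker labeled the known QA images accurately enough.

    min_accuracy is an assumed threshold; the case study does not state one.
    """
    graded = [worker_answers[img] == truth
              for img, truth in qa_truth.items() if img in worker_answers]
    return bool(graded) and sum(graded) / len(graded) >= min_accuracy

def resolve_label(first, second, third=None):
    """Two agreeing workers settle a label; otherwise a third arbitrates.

    Returns None while a disagreement is still waiting for arbitration.
    """
    if first == second:
        return first
    return third

# Hypothetical QA images known to contain solar panels.
qa_truth = {"qa-1": True, "qa-2": True}
worker_a = {"qa-1": True, "qa-2": True, "img-7": False}
worker_b = {"qa-1": True, "qa-2": False, "img-7": True}

print(passes_qa(worker_a, qa_truth))     # worker_a got both QA images right
print(passes_qa(worker_b, qa_truth))     # worker_b missed one QA image
print(resolve_label(True, False, False)) # third worker breaks the tie
```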

image02

Above is the initial screen Haystaq’s MTurk workers see.

Our MTurk interface has provided us with huge efficiency gains over our initial method.  There are still several drawbacks.  1. With novice MTurk workers, we get some false positives for things easily mistaken for solar panels, like skylights or solar water heaters (this second one is not as problematic as it is still a green product).   2. While this gives us a great way to efficiently sample a geography, it is still too expensive and slow to comprehensively examine the entire country.  We can, however, create a model with the geographic samples we have collected.

At this point in the process we turn back to Haystaq’s tried and tested data analytics techniques and code. The consumer file mentioned above consists of roughly 260M US adult consumers. This file contains all of the Personally Identifying Information (PII) as well as over 1200 fields of additional data: Census Data, Property Data, Survey Data, Modeled Data and aggregated data bought from sources like magazines, retailers, airlines, hotels, insurance companies, financial institutions, etc. All of this data is converted to ‘indicator’ or ‘independent variable’ form where text fields are converted to binary flags and false numeric data is discarded. The solar panel sample data from MTurk is already matched to this dataset at a household level. We create the ‘dependent variables’ or ‘DVs’ for each house using an assigned head of household. If a roof had a solar panel, the owner of that house is designated as a 1; the owner of a house without a solar panel is assigned a 0. We then use Python (and its SciKit-Learn and Pandas libraries) to model these dependent variables. We use a variety of algorithms (Logistic Regression, Decision Trees, Nearest Neighbor, Neural Networks, etc.) to create the initial models. We often blend the results of multiple models, as we find that doing this tends to amplify the underlying signal of each model and cancel out the random errors (noise). No one indicator field will determine a person’s score; our typical models will use in excess of 100 coefficients. The final model provides an algorithm that scores each individual in the consumer database for relative likelihood to become a solar panel buyer.
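To make the data preparation and blending ideas concrete, here is a minimal standard-library sketch. It is not the SciKit-Learn pipeline the paragraph refers to: the field names, the example scores, and the choice of an unweighted average as the blend are all assumptions for illustration.

```python
def to_indicators(record, categories):
    """Expand text fields into 0/1 flags, one per (field, value) pair."""
    return {f"{field}={value}": int(record.get(field) == value)
            for field, values in categories.items() for value in values}

def dependent_variable(has_solar_panel):
    """1 for the owner of a house with panels, 0 otherwise."""
    return 1 if has_solar_panel else 0

def blend(scores_by_model):
    """Average per-person scores across models (an assumed, unweighted blend)."""
    n = len(scores_by_model)
    people = next(iter(scores_by_model.values())).keys()
    return {p: sum(m[p] for m in scores_by_model.values()) / n for p in people}

# Hypothetical text field and its allowed values.
categories = {"home_type": ["single_family", "condo"]}
flags = to_indicators({"home_type": "condo"}, categories)

# Hypothetical scores from two already-fitted models for two people.
scores = {
    "logistic_regression": {"a": 0.8, "b": 0.2},
    "decision_tree": {"a": 0.6, "b": 0.4},
}
blended = blend(scores)
print(flags)
print(blended)
```

Averaging is the simplest possible blend; weighted averages or a second-stage model over the individual scores are common alternatives.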

Having a model is meaningless unless we can validate that it works. Towards this end we withhold one third of the sample of known solar panel owners from the modeling process, which becomes the ‘test set’. We then verify that our scores accurately classified the individuals in the test set who are known to own solar panels. Haystaq’s solar panel models have proved to be highly predictive of the test set.
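A hold-out check of this kind might look like the sketch below. It uses only the standard library, and the function names, the fixed random seed, and the choice of AUC (the area under a Receiver Operating Characteristic curve) as the summary metric are assumptions rather than details from the case study.

```python
import random

def split_holdout(ids, test_fraction=1 / 3, seed=0):
    """Shuffle ids and withhold test_fraction of them as the test set."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

train_ids, test_ids = split_holdout(range(9))

# Made-up test-set labels (1 = has panels) and model scores.
test_labels = [1, 0, 1, 0]
test_scores = [0.9, 0.1, 0.8, 0.3]
print(len(train_ids), len(test_ids))
print(auc(test_labels, test_scores))
```

An AUC of 0.5 means the scores rank owners no better than chance, while values near 1.0 mean known owners are consistently scored above non-owners.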

Group 1

Two of our QA checks against the test sample, a Hosmer-Lemeshow step chart and a Receiver Operating Characteristic chart, are featured above.
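The charts themselves are images, so the exact construction is not shown here. One plausible reading of a Hosmer-Lemeshow step chart is the grouping used in the Hosmer-Lemeshow test: sort the test set by score, cut it into equal-size groups, and compare the average predicted score with the observed rate in each group. The sketch below follows that assumption; the function name and sample data are hypothetical.

```python
def calibration_steps(labels, scores, groups=10):
    """Return (mean predicted score, observed positive rate) per score group."""
    ranked = sorted(zip(scores, labels))
    n = len(ranked)
    steps = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # more groups than records
        predicted = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        steps.append((predicted, observed))
    return steps

# Made-up labels and scores, split into two groups for brevity.
steps = calibration_steps([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9], groups=2)
for predicted, observed in steps:
    print(round(predicted, 2), observed)
```

In a well-calibrated model the observed rate climbs step by step alongside the predicted score.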

Once we have a model that validates well, we score everyone on the consumer file within a similar geography. At this point we have a rank ordering of consumers ranging from most likely to buy solar (or other green products) to those least likely to buy. We believe this model has direct value to people marketing these products, but at this point we use this model as a filter on or feeder model for our electric and electric-hybrid car models.

Next Steps:

Ideally, we would not rely on a model to find solar panel owners, but instead we would be able to identify all users. With our existing process using MTurk, classifying all 125M US rooftops as solar or non-solar would be time- and cost-prohibitive. This is exacerbated by the need to resurvey periodically to monitor solar growth. To solve this problem, we are writing code to have our AWS cluster environment attempt to categorize the rooftop images before we send them to MTurk workers for verification. We can feed a server a set of images known to have solar panels and a set of images that do not. Using image sensing and machine learning algorithms, it will then attempt to categorize new images. Those images are scored, the server learns from its mistakes and this process continues iteratively until the server can correctly categorize the rooftops with solar installs. Using this process, only the images identified by the server as containing solar panels would be sent to MTurk for human validation.

Group 2

Some images from trial runs of using machine learning techniques to identify solar panels on rooftops

There are a number of challenges with having a computer correctly identify a specific set of black rectangles surrounded by other dark rectangles, but the early results seem promising.

There are two potential challenges specific to solar that might limit the window in which we can use this technology. One is that our models are making the assumption that a home’s current owner is the owner that installed the solar (or secondarily that the owner valued the solar panels equivalently when buying the home). While rooftop solar penetration is low, that is a fairly safe assumption, but as the efficiency of solar panels increases and their penetration increases, that assumption will eventually break. We can get around this challenge by having snapshots in time of solar coverage and comparing this to changes in homeownership that are visible in the consumer file. The other challenge is that our method assumes that solar panels are the easy-to-identify black rectangles we currently see. Recently Tesla announced solar panels that look and act like traditional roofing tiles, and aside from Tesla it is likely that future panels will diverge into different shapes and arrangements. That said, solar is only one potential application for this technology, as it could also be easily adapted to find things like swimming pools, boats or RVs.
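The snapshot comparison mentioned above could be sketched as follows. This is a hypothetical illustration: the function name, the three outcome labels, and the idea of bracketing the install between the last image without panels and the first image with them are assumptions, not a description of an existing Haystaq process.

```python
from datetime import date

def installer_status(last_without, first_with, owner_since):
    """Classify whether the current owner likely installed the panels.

    last_without: date of the latest image showing no panels
    first_with:   date of the earliest image showing panels
    owner_since:  date the current owner took ownership
    """
    if owner_since <= last_without:
        return "installed_by_owner"   # owned the home before panels appeared
    if owner_since > first_with:
        return "bought_with_panels"   # panels were already visible at purchase
    return "ambiguous"                # purchase fell between the two snapshots

print(installer_status(date(2015, 6, 1), date(2016, 6, 1), date(2012, 3, 15)))
print(installer_status(date(2015, 6, 1), date(2016, 6, 1), date(2017, 1, 10)))
print(installer_status(date(2015, 6, 1), date(2016, 6, 1), date(2015, 12, 1)))
```

More frequent snapshots narrow the ambiguous window between the two images.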

HaystaqDNA’s Automotive Microtargeting Engine 5/09/2017

Download a PDF of the Case Study

Leading microtargeting firm HaystaqDNA has developed an Automotive Microtargeting Engine that allows automotive OEMs and dealers to select conquest targets based on HaystaqDNA’s powerful predictive models. Compared to traditional list providers, these models show up to 80% higher conversion rates on email and direct mail. When used with addressable TV, the HaystaqDNA models yield a 70% higher conversion rate. Marketers for automotive OEMs in the United States must not only sell more cars to existing customers, but also capture new customers from other brands in order to maintain and increase sales. The traditional method of ‘conquesting’ is to buy lists of target customers from generic consumer data vendors such as Experian or Acxiom or from US industry-specific vendors such as Polk or AutoIntenders. Unfortunately for the OEMs buying these lists, their competitors are often buying the exact same targets. These lists are based on basic demographics and targeted to a vehicle segment (example: Luxury Compact Sedans) rather than a brand or specific car line (example: Mercedes-Benz C-Class). Using the technologies and techniques developed in political microtargeting, HaystaqDNA can instead create specific targeting models for individual products. This is accomplished by ingesting existing customer data, augmenting it with original survey research, and using advanced data analytic methods to find the individuals most likely to purchase the target product. The conquest targeting provided by HaystaqDNA is specific not only to a given brand, but also to a given car line. We rank every single consumer (~260M individuals) on their likelihood of buying that specific car. Our modeling methods go far past basic demographics and use over 1,000 distinct indicators to find our conquest targets. This technique is far more accurate than relying solely on age/gender/income/location-based targets.
Below is a case study of how a leading Luxury Automotive Brand used HaystaqDNA Automotive Microtargeting Engine to improve its conquest campaign results.

Like most automotive OEMs, our client traditionally bought lists for direct mail and email campaigns from commercial vendors. The brand was consistently delivering year over year sales growth thanks to regular significant new product introductions and excellent customer loyalty, but they understood they needed to dramatically increase conquest sales (automobile buyers who do not currently own a product from that brand) in order to achieve their future growth targets. They also required an easy to use interface to allow marketing staff across the organization to create and utilize lists of conquest customers. Based on HaystaqDNA’s success over several automotive pilot projects, the brand joined in partnership with HaystaqDNA to create and run the Automotive Microtargeting Engine (AME) in the fall of 2014.

AME interface

A screenshot of HaystaqDNA’s AME Interface.

The AME project consisted of the following parts:

  1. Setting up and receiving recurring feeds from the brand’s internal CRM system.
  2. Matching existing customers to a commercial database.
  3. Surveying new car buyers.
  4. Using Machine Learning Algorithms to model likely buyers and their preferences.
  5. QA and Validation of these models
  6. Scoring every individual in the country.
  7. Matching in vehicle ownership and in-market timing data.
  8. Creating an interface for marketers to identify, explore and pull conquest targets.

 

  1. Recurring Feeds: HaystaqDNA worked with the company’s IT department to receive recurring sales, dealer territory, dealer service, accessories, options and event attendance feeds. In the future Haystaq anticipates receiving additional feeds on inventory levels, inventory pipelines and financing received. These feeds come in across secure channels into a firewalled Amazon AWS cloud environment where they are cleaned and formatted. The data is all related back to itself via customer ids, vehicle ids and dealer ids.
  2. Consumer File: On behalf of the client, HaystaqDNA licensed the national infoGroup consumer file, consisting of roughly 260M US adult consumers. This file contains all of the Personally Identifying Information (PII) as well as over 1200 fields of additional data — Census Data, Property Data, Survey Data, Modeled Data and aggregated data bought from sources like magazines, retailers, airlines, hotels, insurance companies, financial institutions, etc. All of this data is converted to ‘indicator’ or ‘independent variable’ form where text fields are converted to binary flags and false numeric data is discarded. The historical brand sales data is then matched to this file using the PII in both.
  3. Surveys: Several times each year, HaystaqDNA conducts a survey of likely car buyers to find things like preferences for particular powertrains and lifestyle choices (like sports attendance and participation). These surveys are primarily conducted through IVR calls to landlines and live calls to cell phones, with online panels and SMS surveys used to supplement where needed.
  4. Dependent Variables and Modeling: Both the Customer and the Survey data are then transformed into ‘dependent variables’ or ‘DVs’. For example, for a specific car dependent variable, a person who is known to have bought said vehicle will be given a value of 1, while another who did not will be given a value of 0. For a skiing DV, a survey respondent who indicates that they enjoy skiing will be given a value of 1 and another who answered that they never ski will be given a 0. Using our AWS infrastructure, we bring in both our data sets of DVs along with our massive table of independent variables. We have also found that people’s buying behaviors differ regionally, so we typically divide the US into four regions (Northeast, Southeast, Central and Western) and model the DVs for each region independently. We use Python (and its SciKit-Learn and Pandas libraries) to model these dependent variables. We use a variety of algorithms (Logistic Regression, Decision Trees, Nearest Neighbor, Neural Networks, etc.) depending on the data sets, but we rely most heavily on the Logistic Regression and Decision Tree algorithms. We often blend the results of multiple models, as we find that in doing this we often amplify the underlying signal and cancel out the noise. No one coefficient will determine a person’s score; our typical models will use in excess of 100 coefficients.
  5. Quality Assurance and Model Validation: We always withhold 1/3rd of the DVs to serve as a clean test set and we validate all models against this set. Using this test set we know what the model would predict for these individuals and we can compare that to their actual behavior (their car ownership or survey answers).
    QA checks

    Two of our QA checks against the test sample, a Hosmer-Lemeshow step chart and a Receiver Operating Characteristic chart, are featured above.

    We also use visual tools to see how the different models correlate to one another.
    Visual tools checks

    This visualization takes a sample of people and looks at their scores across a number of vehicle segments. We expect there to be a high correlation between models at similar price points and some deviations where the price points or car types are very different. Here we see the Luxury Midsize Hybrid SUV and Luxury Compact SUV scores highly correlate while the Luxury Midsize Hybrid Sedan and Luxury Full-Size SUV scores do not correlate.

    At this point, the models also provide either coefficient weights or indicator importance ranks which allows us to see the attributes of individuals who score highly. We often report these attributes back to the client so they can gain insight into their customers.
    QA checks

    Some examples of recent positive and negative coefficients from a Full-sized Luxury SUV Model.

  6. Scoring: Once the best validated models are selected, our analysts set up a cluster environment on AWS. We use the Python models produced in the modeling phase, but we rely on Spark and Parquet to help us take advantage of the clustered environment. We can assign a score for every car line and lifestyle filter to every individual in a region and ultimately the country in a couple of hours. By default, our scores come out as a value between 0 and 1, but we convert these scores to a rank ordering of individuals within each region.
  7. Vehicle Ownership and In-Market Timing Data: Once we have our consumer file scored and ranked, we match in garage and auto intender data from the client’s chosen vendor. The garage data consists of over 168M individual vehicles that are or have been owned by over 110M individuals. Additionally, this vendor provides a database of Auto Intenders — an in-market timing file which indicates individuals likely to buy a car within the next three, six, or twelve months. This file usually consists of 10M-12M individuals. Both the garage data and in market timing data become filters in the AME Interface.
  8. The AME Interface: The client needed these conquest targets to be available to marketers at multiple levels (National staff – both brand and agency, Regional offices, and individual Dealers) for both exploration and list pulling, so HaystaqDNA created the Automotive Microtargeting Engine. This interface allows marketers to specify a geography (dealers are limited to their own boundaries), specify which car line or car lines they are interested in marketing, filter by different lifestyle, car preference, demographic or market timing filters, indicate the desired list size and optionally put in a distance from event limiter.

    AME interface v2

    A screenshot of HaystaqDNA’s AME Interface.

    In addition to creating a list query, marketers can merge lists or exclude previously created lists, and they can assign individuals in a list to specific car lines/collateral. The target channel for the lists can also be specified by a number of templates. AME lists have been used for direct mail, email and digital outreach.
    Using this interface, marketers can explore their target areas in real time and pull lists in near real time, greatly shortening the time to deployment vs. what the client had experienced with traditional list providers.
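Two of the steps above, converting raw scores into a rank ordering within each region (step 6) and pulling a filtered list of a requested size (step 8), can be illustrated with a small sketch. This is not the AME implementation, which runs on Spark; the field names, the sample people, and the single `in_market_only` filter are assumptions.

```python
def rank_within_region(people):
    """Add a 1-based 'rank' per region, with rank 1 the highest score."""
    by_region = {}
    for p in people:
        by_region.setdefault(p["region"], []).append(p)
    for members in by_region.values():
        members.sort(key=lambda p: p["score"], reverse=True)
        for i, p in enumerate(members, start=1):
            p["rank"] = i
    return people

def pull_list(people, region, size, in_market_only=False):
    """Return the top `size` ids in a region, optionally in-market only."""
    pool = [p for p in people
            if p["region"] == region and (p["in_market"] or not in_market_only)]
    pool.sort(key=lambda p: p["rank"])
    return [p["id"] for p in pool[:size]]

# Made-up scored individuals for one car line.
people = [
    {"id": "a", "region": "Western", "score": 0.91, "in_market": False},
    {"id": "b", "region": "Western", "score": 0.42, "in_market": True},
    {"id": "c", "region": "Western", "score": 0.77, "in_market": True},
    {"id": "d", "region": "Northeast", "score": 0.65, "in_market": True},
]
rank_within_region(people)
top_two = pull_list(people, "Western", 2)
top_two_in_market = pull_list(people, "Western", 2, in_market_only=True)
print(top_two, top_two_in_market)
```

Ranking within a region rather than nationally keeps scores comparable when each region is modeled independently.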

Results: The client has continually tested AME against its traditional list providers. Time and again AME has achieved better campaign conversion. A recent test showed AME with 50-80% higher conversion (depending on the specific car line). The targeting cost per sale also halved. In experiments with addressable TV, Haystaq has seen a 70% higher conversion rate on automotive vs. tests with a leading addressable TV vendor.

Support for the Affordable Care Act 1/23/2017

Download a PDF of the Case Study

As the new Congress rushes towards a repeal of the Affordable Care Act, many are working against the opinion of the voters in their own districts. Research conducted by HaystaqDNA during the 2016 campaign showed that a majority of Americans support the ACA. However, members of Congress are more concerned with opinions of their constituents than they are with national numbers. Therefore, Haystaq looked at support levels by Congressional District. 253 of 435 or 58% of Congressional Districts show a majority of voters supporting ACA.

Not surprisingly, the majority of these pro-ACA districts are held by Democrats. However, 61 pro-ACA districts are currently held by Republicans. Many of these districts are relatively safely Republican, but in many, the difference in support in favor of the ACA is near or above the margin of victory in the 2016 election. This would suggest that voting to repeal the act puts these candidates at risk next year, even more so once voters realize how they will be personally affected by a repeal of the ACA.

The Haystaq microtargeting models have identified 98,942,762 likely ACA supporters nationwide, 41,697,492 of whom live in Republican districts.

METHODOLOGY

These numbers are based on a national survey of approximately 10,000 registered voters. The survey responses were used to build microtargeting models predicting how any individual voter would have answered the question had they been surveyed. The Congressional District percent in support of ACA is based on the number of voters in each district with an ACA support score of 50% or higher. The ACA support score predicts the likelihood that a voter would say that they support the ACA if surveyed. These numbers differ from poll results in that they are not weighted. A poll is likely to be weighted based on assumptions about likely turnout. The Haystaq models are applied to every registered voter.
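The district-level calculation described above can be sketched as follows. The function name and the sample scores are made up for illustration; only the 50% threshold and the unweighted count over all scored voters come from the text.

```python
from collections import defaultdict

def district_support(voters, threshold=0.5):
    """Share of scored voters per district with a support score >= threshold."""
    total = defaultdict(int)
    supporters = defaultdict(int)
    for district, score in voters:
        total[district] += 1
        supporters[district] += score >= threshold
    return {d: supporters[d] / total[d] for d in total}

# Hypothetical (district, ACA support score) pairs, one per registered voter.
voters = [
    ("TX23", 0.81), ("TX23", 0.55), ("TX23", 0.30), ("TX23", 0.50),
    ("OH1", 0.49), ("OH1", 0.62),
]
shares = district_support(voters)
majority_districts = [d for d, share in shares.items() if share > 0.5]
print(shares)
print(majority_districts)

# The headline figure: 253 of 435 districts is about 58%.
print(round(253 / 435 * 100))
```

Because every registered voter is scored and counted once, no turnout weighting enters the calculation.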

The microtargeting models were built using a combination of the survey results and nearly 1,000 fields of commercial marketing data, Census demographics and proprietary derived indicators. Haystaq combines a variety of statistical and machine learning algorithms including Penalized Logistic Regression and Random Forests. The predictive models were validated against a hold-out sample to confirm that they accurately predicted the likely survey responses of individuals whose responses were not used in building the models.

Following is the question wording used in the survey:

Which comes closest to your opinion on the Affordable Care Act or Obamacare: that it is beneficial but doesn’t go far enough, that it is about right, or that it goes too far and should be repealed? Please press 1 if you think Obamacare is beneficial but doesn’t go far enough, press 2 if you like the law as it is, press 3 if you think Obamacare goes too far and should be repealed, or press 4 if you are not sure.

The model predicts the likelihood that a voter with an opinion on ACA would select option 1 (Support ACA but thinks it doesn’t go far enough) or option 2 (like the law as it is) vs. 3 (Goes too far and should be repealed). Because the model is predicting support only among those with an opinion, respondents picking option 4 (unsure) are not included.
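The coding of survey answers into a dependent variable, as described above, amounts to a small mapping. The function name and the sample responses are illustrative assumptions.

```python
def aca_dv(option):
    """Map a keypad response to the dependent variable.

    1 or 2 -> 1 (supports the ACA), 3 -> 0 (repeal),
    4 or anything else -> None (no opinion, excluded from modeling).
    """
    if option in (1, 2):
        return 1
    if option == 3:
        return 0
    return None

# Made-up keypad responses from six respondents.
responses = [1, 3, 2, 4, 3, 1]
dvs = [dv for dv in map(aca_dv, responses) if dv is not None]
print(dvs)
```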

The survey was conducted using a combination of live and IVR (automated phone calls) to a random sample of more than 10,000 voters nationwide.

ACA-support-HaystaqDNA-score-by-county

CD | Name | % of Vote in 2016 Election | % of Voters Supporting ACA
TX23 | Will Hurd | 50.90% | 72.40%
NY11 | Daniel Donovan | 63.30% | 70.40%
FL27 | Ileana Ros-Lehtinen | 54.90% | 67.20%
FL26 | Carlos Curbelo | 56.30% | 65.30%
WA8 | Dave Reichert | 60.00% | 64.90%
CA21 | David G. Valadao | 93.20% | 63.80%
IL12 | Mike Bost | 57.80% | 63.30%
MI11 | David Trott | 56.90% | 61.40%
VA10 | Barbara Comstock | 52.90% | 61.00%
KY6 | Andy Barr | 61.10% | 60.60%
IL13 | Rodney Davis | 59.70% | 60.50%
NJ11 | Rodney Frelinghuysen | 60.00% | 60.40%
NJ7 | Leonard Lance | 55.70% | 59.50%
VA2 | Scott Taylor | 61.70% | 59.10%
MI8 | Mike Bishop | 58.80% | 58.60%
IL6 | Peter J. Roskam | 59.50% | 58.40%
FL18 | Brian Mast | 55.50% | 58.10%
NM2 | Steve Pearce | 62.80% | 57.90%
FL25 | Mario Diaz-Balart | 62.40% | 57.90%
MI6 | Fred Upton | 61.70% | 57.60%
CA25 | Stephen Knight | 54.20% | 57.50%
CO6 | Mike Coffman | 54.70% | 56.70%
FL2 | Neal Dunn | 69.20% | 56.40%
NY24 | John Katko | 61.00% | 55.70%
NY19 | John Faso | 54.70% | 55.60%
AZ2 | Martha McSally | 56.70% | 54.80%
CA39 | Edward Royce | 57.70% | 54.60%
MI7 | Tim Walberg | 57.90% | 54.60%
MI1 | Jack Bergman | 58.20% | 54.60%
PA15 | Charles W. Dent | 60.60% | 54.30%
PA18 | Tim Murphy | 100.00% | 54.20%
PA8 | Brian Fitzpatrick | 54.50% | 54.10%
IL14 | Randy Hultgren | 59.60% | 54.10%
MI4 | John Moolenaar | 65.80% | 54.00%
IA1 | Rod Blum | 53.90% | 53.90%
WA5 | Cathy McMorris Rodgers | 59.50% | 53.90%
TX32 | Pete Sessions | 100.00% | 53.90%
NJ3 | Tom MacArthur | 60.60% | 53.70%
WA3 | Jaime Herrera Beutler | 61.40% | 53.60%
NJ4 | Chris Smith | 65.50% | 53.60%
NJ2 | Frank LoBiondo | 61.60% | 53.60%
MN3 | Erik Paulsen | 56.90% | 53.60%
PA12 | Keith Rothfus | 61.90% | 53.50%
KY1 | James Comer Jr. | 71.20% | 53.30%
MI3 | Justin Amash | 61.30% | 53.00%
ME2 | Bruce Poliquin | 54.90% | 52.70%
GA6 | Tom Price | 61.60% | 52.30%
VA5 | Thomas Garrett | 58.30% | 52.10%
TX27 | Blake Farenthold | 58.90% | 52.10%
LA4 | Mike Johnson | 65.20% | 52.00%
NY2 | Peter T. King | 62.40% | 51.90%
LA5 | Ralph Abraham | 100.00% | 51.80%
TX7 | John Culberson | 56.20% | 51.70%
NC13 | Ted Budd | 56.10% | 51.50%
CA49 | Darrell Issa | 51.00% | 51.40%
NY1 | Lee Zeldin | 59.00% | 51.40%
PA6 | Ryan Costello | 57.30% | 51.20%
FL15 | Dennis A. Ross | 57.50% | 51.10%
OH14 | David Joyce | 62.70% | 51.10%
GA12 | Rick Allen | 61.60% | 50.70%
OH1 | Steve Chabot | 59.60% | 50.40%