Sunday, June 26, 2016

Guest Post: Rethinking STV Fundamentals

This is a guest post from Kevin Baas.  Kevin is a computer programmer from Milwaukee and he has been thinking deeply about the right way to implement STV elections.  His conclusions sharply disagree with long-standing practices for STV elections so it makes for a very interesting read!

I am still unsure if I personally agree with Kevin's conclusions, but his approach is very novel and worth sharing with others.  Kevin's post is very technical so put on your thinking caps and get ready to dive in.

Rethinking STV Fundamentals

Three errors have plagued the counting of ranked choice ballots for over a century, and here’s how to fix them.

 While learning about how ranked choice ballots are traditionally counted, I discovered three critical flaws in the way everyone does it -- indeed, the way everyone has been doing it for over a century. I would like to share with you how I discovered these flaws and how to fix them. In the end I will outline a new and complete method for counting ranked choice ballots that contains none of these flaws, and consequently produces fairer results than any other system. (With the possible exception of CPO-STV, which it might be tied with.) In addition to explaining the reasoning for these changes, I will provide evidence to support my (admittedly bold) claim that the results are more fair.

Firstly let me outline two of the flaws by walking through two hypothetical scenarios.

On passing two tests. 

Failing Test 1

Consider the scenario (from http://www.accuratedemocracy.com/e_shares.htm ):

A ranked-choice election to fill 3 seats. The candidates are A, B, R, and S and 36 ballots are cast. 

Numbers of voters  12   7   9   8 
Their preferences  AB  BA  RS  SR 

Thus 19 voters, a majority, prefer A and B while 17 prefer R and S.

Now the most immediately apparent way to determine the winners, is that, since each candidate represents 1/3 of the voters, or 36/3 voters, then that's how many votes they need to get elected. Any surplus votes should transfer to the next choices. And the final seat will be determined by who has the most votes.

Okay, so 36/3=12... So only A has quota. A is elected, leaving:

Numbers of voters  7   9   8 
Their preferences  B  RS  SR 

Well now no one has quota, so what do we do now? Well the standard thing to do in counting ranked choice ballots is to eliminate the candidate with the fewest votes, and transfer the votes to the next choice. Well, that would be B. That leaves only R and S. There are only 2 candidates left, and 2 seats, so we give them to them. So the final winners are: A,R,S.

But wait.... A and B was preferred by the majority, so shouldn't it have gone to A,B,R? Yes, it should have. We have failed this test.

Passing Test 1

The standard way to "solve" this problem is by using votes/(seats+1) as the quota instead. This is called the "Droop quota” and is used almost universally in ranked choice elections. If we instead used votes/(seats+1) = 9 as the quota, B would have received 3 surplus votes, making it 10 B - 9 RS - 8 SR. B & R make quota, filling the two remaining seats. Thus, the majority wins the majority. We now pass the test.

...but Failing Test 2

But the "Droop" quota introduces its own problem. Consider this scenario:

5 seats, 2 parties (a,b,c,d,e - vs - v,w,x,y,z)

ballots are:

690 ballots: a,b,c,d,e, v,w,x,y,z 
310 ballots: v,w,x,y,z, a,b,c,d,e, 

Using votes/(seats+1) as the quota, quota is 1000/(5+1) = 166.666 Majority party meets quota 4 times, minority party 1 time. No seats remain Thus majority party wins 4 seats, and minority wins 1. Using, instead, votes/seats as the quota, quota is 1000/5 = 200

Majority party meets quota 3 times, minority party 1 time. There's 1 seat left. The final seat is determined by plurality vote. After subtracting used votes, the remaining votes are:

690 - 3*200 = 90 
310 - 1*200 = 110 

The minority party has plurality, and thus gets the final seat, making it 3 to 2. So which one is right? Well the ratio of majority voters to minority voters is closer to 3:2 than it is to 4:1, so the proper quota is more fair; represents the voters more proportionally. Using votes/(seats+1) as the quota systematically over-favors the majority at the expense of minority parties. Test failed.

Passing both Test 1 and Test 2

So how can you solve both problems at once? Well, by realizing that the problem in the first scenario, was never with the quota - it was that after B was eliminated in the first round, he was never re-introduced. Simply re-introducing candidate B -- and more generally re-introducing dropped candidates each round -- allows them to win in a later round. Thus solving the first problem. And does so without having to skew the quota, thus never creating the second problem.

So let's go back to the step where we eliminated B and this time only _temporarily_ eliminate B. 

Numbers of voters 7  9  8 
Their preferences B RS SR 

We see that, as before, it leaves only R and S. Only 2 candidates left, and 2 seats... but remember B can be re-introduced in the next round - so we still have 3 candidates that could win a seat! So we can't just give 'em to R and S. We have to check for quota, which is votes/seats, which is 12. No one has it, so temporarily eliminate the lowest, which is now S. The votes all transfer to R, giving R 17, which meets quota.

Numbers of voters  7  9     8 
Their preferences (B) R(S) (S)R 

So now remember we only _temporarily_ eliminated S, so any surplus votes can still transfer. Well there are 17-12 = 5 surplus votes, and they all transfer to S. So now we elect R, and S now has 5, leaving us with:

Numbers of voters 7 5
Their preferences B S 

No one has quota, so we eliminate the lowest, leaving just B, which thus fills our final seat. And the winners are: A,R,B. The majority wins the majority! And without having to create the second problem discussed above, where the minority can get under-represented! By solving our initial problem correctly we avoided creating a second problem.

You look skeptical...

At this point, a reader knowledgeable in voting system theory will probably be skeptical, particularly on the subject of using the “Hare quota” (votes/seats) instead of the unbalanced “Droop quota” (votes/(seats+1)). They’d say “The Droop quota has been preferred for over a hundred years. You’d have to make a very strong case, if you presume to claim the Hare quota should be used in any scenario.” Well the critical thinker in me would be inclined to respond “That’s the ad populum fallacy! A claim is true or false based on its merits, not its popularity or lack thereof!” But the rhetorician in me would say “Why yes, yes I would have to make a very strong case. Bear with me, and I shall do precisely that!”

Let me begin by backing this up a bit. I came to discover the errors with the current methods when working on an entirely different (though related) problem...

How I discovered a new method of counting ranked choice ballots 

While in the process of generating multi-member redistricting maps for FairVote.org, we were discussing how the winners are calculated. This is when I first learned of the “Droop quota” (votes/(seats+1)), and how basically all STV vote counting methods use it. My first impression was "that's wrong". Where “seats” is the number of seats available, each candidate represents votes/seats voters. So quota should obviously be votes/seats.

Let me explain this in pictures… A “seats-votes curve” graphs the relationship between the fraction of votes a party gets, and the fraction of seats that gives them. When you use votes/(seats+1), then the further the vote is from a 50/50 split, the more distant the resulting winners are from what people want. You can clearly see this in the resulting “seats-votes curve”. Theoretically it should be a straight diagonal line, but when you use a quota of votes/(seats+1), it’s inevitably steeper than it should be. Below are the seats-votes curve for hypothetical multi-member districts (5 seats per district) for Texas, Ohio, and Pennsylvania, when using votes/(seats+1) as the quota.

Votes-seats curves for multi-member districts. Votes-seats curves for multi-member districts. Votes-seats curves for multi-member districts.
Votes-seats curves for multi-member districts (5 seats per district), using a quota of votes/(seats+1). (Namely, Texas, Ohio, and Pennsylvania) (x = % dem votes, y = % dem seat) The +1 in the denominator skews the seats-votes curve in favor of whichever party has the majority, resulting in representation that is not proportional. The darker red/blue shading shows excess seats for that party, beyond what would be proportional to the votes. (The extra-dark shading highlights asymmetries in the curve.) The curve has a slope of (5+1)/5. 

Now there is only one way to solve this problem: Take the +1 out of the denominator. With the +1 in the denominator, you’re essential graphing the function y = x * (seats+1)/seats, which looks, well, like you see above. But to be proportional, the curve should look like y = x, by definition. To get that, you need to change that y = x *(seats+1)/seats to a y = x * (seats+0)/seats. Which means you need to use votes/(seats+0) for the quota instead of votes/(seats+1).

The inescapable conclusion is that in order to get proportional representation, you must use votes/seats as the quota. There’s no two ways about it. So why does everyone use votes/(seats+1) as the quota, if it’s doomed to bias? Well when I asked, I was given that first example I explained above in “On passing two tests”.

Then I found there was another thing wrong with the way ranked choice was traditionally counted. When no one meets quota, they eliminate the candidate that got the fewest votes and look at the next highest choice for those ballots. Okay, that's fine -- that makes sense -- but then after a candidate meets quota, they never reintroduce those candidates! That elimination should have only been temporary -- only for determining the winner for that seat. You should only eliminate a ballot permanently when it's used to meet quota. Otherwise those voters just aren't being represented at all. They might as well not show up to the polls because you just threw away their ballot.

But now I was starting to see why they were using an unbalanced quota of votes/(seats+1) (called the "Droop" quota): because the premature elimination of the ballots was screwing things up, especially for minority candidates. So instead of realizing what was causing this (the premature elimination of ballots), they artificially lowered the bar, and that seemed to work, so they went with it. They added a second error to try to treat a symptom of the first, thereby compounding the error. I'm reminded of the nursery rhyme "There was an old lady who swallowed a fly..." Now it made sense. They were using the wrong quota because they were prematurely eliminating ballots. The correct solution was of course to not prematurely eliminate ballots, and to use the correct quota of votes/seats. Somehow this had gone unnoticed for centuries. The old lady had gone on swallowing things and the fly was still wiggling and jiggling inside her.

The first iteration 

So, equipped now with a proper solution, I did as all good programmers do: I wrote a program. This accomplishes three things:
  1. It helps me make sure I've got every nuance and corner case and all that taken care of; make sure I really have every part in the machine figured out. (And if I do, it demonstrates that.) 
  2. It allows me to quickly and accurately run tests to verify and confirm the method. 
  3. It gives me a working automated counter to demonstrate the method's superiority. (Or discover its inferiority, be that the case.) 
Though I had first to figure out one more thing: how to deal with only partially ranked ballots – that is, what to do when a ballot runs out of choices? It seemed to me the correct way to deal with it was to say that they rank all remaining choices equally – the remaining choices are tied for last. But this seemed like it would make for some complex code, so I looked for a way to simplify it. Sure enough, it turns out that simply reducing the quota as if the ballot was never cast is mathematically equivalent to saying they ranked the remaining candidates as tied for last.

So there was my 3rd change:
  •  Re-calculate the quota just before testing for it, instead of just once at the beginning. This compensates for ballots that have run out of choices. 
Now that I thought I had all the pieces, I wrote it up...

 The second iteration 

Well turns out I didn't have every nuance figured out. It occurred to me that if I tried (temporarily!) eliminating candidate A, then tried eliminating candidate A and B, well wouldn't it be better to try eliminating just candidate B, before trying both A and B together? The standard method was just to add one more candidate to what you’re eliminating, each time, thus skipping eliminating just B alone. That didn’t seem right. More precisely I knew it was wrong (because I had a counterexample) – I just didn’t know what the right way was.

So, I just went with the traditional way for now, see where that gets me. I figured I could revisit the problem later, and be better equipped when I did. All the results of my tests looked pretty good, except for one. And I knew almost right away what the problem was. Because the elimination order was wrong. It was going from A to A,B -- skipping B. it was time to revisit the elimination order.

Thanks to some conversation on google election science forum, I decided to go all the way with this - to order every single possible combination of candidates. The key was switching the sort criteria from how many ballots were affected, to what I called (for lack of a better term) how many "choice demotions" were produced. Going from 1st choice to 2nd was 1 “choice demotion”, whereas going from 1st to 3rd was 2. Now I had a way to score every possible combination of candidates, and thus a way to sort them. Notwithstanding ties (which are unavoidable), I now had a complete ordering.

So I tried that, worked beautifully. And now I have my final algorithm. The 4 major changes from traditional STV are:
  • Uses the correct quota of votes/seats (don’t swallow the spider) 
  • Only permanently eliminate ballots if they are used to make quota (don’t swallow the fly) 
  • Recalculate quota right before checking (‘cause ballots could have run out of choices) 
  • Temporarily eliminate sets of candidates in order of fewest "rank demotions" to most, ordering all possible combinations. 

Validation 

So now I have 3 major changes from the traditional way of counting ranked choice votes:
  1. Restore ballots that haven’t been used to meet a quota. (as opposed to not restoring) 
  2. Use votes/seats as the quota. (commonly known as “Hare quota”). (as opposed to using an unbalanced quota of votes/(seats+1)) 
  3. Temporarily ignore combinations of candidates based on the number of choice demotions it results in. (sorted from least to most) (as opposed to fewest votes) 
Now the first obvious question that comes to mind is “How does applying these three changes compare against not applying any of them?” The answer to this should come pretty easily: we’ve corrected 3 mistakes, so unless there’s some glaring 4th mistake that will just wipe out all our progress, it should on average be better.

But the more curious would ask a few further questions: “What if you implement just some of these changes? E.g. just the first and the second, or just the third? Maybe one of those rules doesn’t really make that big of a difference. Maybe one of those rules makes a huge difference. Maybe two of those rules only make a difference when they’re both applied… etc.”

Well I’ll give you one guess which type of person I am -- that’s right, the more curious type. And I have a very valuable skill that gives me a unique ability to satisfy my curiosity with relative ease: I know how to code. So you can guess what comes next: I wrote a computer program to automatically calculate the results for all 8 different combinations of the rules, then I gathered no less than 42 different scenarios from across the internet, and made my program test all 42 of them, for all 8 different combinations. Then I put the results in a spreadsheet and prettied it up.

A few things stand out in the results:
  • Firstly, and most glaringly, using all 3 changes passes ALL BUT ONE of the tests, and it is the ONLY combination that does
  • Secondly, if you don’t apply change #2 (Hare quota), neither of the other changes make much of a difference, except for a few cases. 
  • Thirdly, the only way to pass the proportionality test is by using change #2 (Hare quota). This also is the only way to make it tactical voting immune, at least for the cases tested. 
  • Fourthly, if you pick ANY of the changes and choose not to apply it, there are tests that that will ALWAYS fail, regardless of whether or not you apply the other 2 changes. This tells you that each original rule is definitely wrong
  • And finally, change #1 (Restore ballots) only makes a difference (save a few cases) when change #2 (Hare quota) is applied. So there is no value in combining the restore ballots rule with the Droop quota. You need to do both Hare & restore or neither
The most significant conclusion we can draw here, is that, using all 3 changes together, at least as far as the 42 tests go, ALWAYS results in a better result than any other combination.

But you say: “This isn’t every possible test case. There may be some scenarios you haven’t tried in which some other combination performs better.” To this I say: “Find one, show it to me, and I’ll concede it. That is the true test of science.” What did Albert Einstein say?: “No amount of experimentation can ever prove me right; a single experiment can prove me wrong.” I welcome such a test -- nay, I invite it! I wrote a user-friendly interface for my vote-counting program, and published it and its source code online, here. Test, test, test away! For with each successful test, my case grows stronger, and with each failure our knowledge increases!

The method 

For the sake of precision, let me lay it all out:

STEP 0: Make a copy of the ballots - these will be referred to as the "original" copy, even though they'll be altered (reweighted) due to surplus vote transfers.

Determining the next winner:

STEP 1: Determine quota 1: Count how many ballots are left (using the weighted value of the ballot), divide by number of seats left to fill. This is the quota.

STEP 2: order all combinations of candidates
  1. Compute all combinations of candidates (e.g. {a},{b},{c},{a,b},{b,c},{a,c},{a,b,c}) 
  2. For each such combination, count how many "choice demotions" would happen if those candidates are removed, in order for every voter to have a 1st choice candidate, or run out of choices. For instance, if a person's choices are {a,b,c,d}, and a,b & d are removed, then the result is {c}, and it counts as 2 demotions. (Since what is now the 1st choice used to be the 3rd). Remember to use the weighted value of the ballot in counting demotions. 
  3. Sort all combinations by how many choice demotions they produce, least to most. 
    1. Ties are broken by fewest ballots affected. 
    2. Ties on that, in turn, are broken by most candidates eliminated this way. 
    3.  Add the "empty set" (no candidates) to the beginning.
STEP 3 (loop): Make a local copy, and try removing the next combination. (first time through, this is removing no combinations)
  1.  Make a local copy of the ballots. For the remainder we will be operating on this local copy. 
  2.  Take the next set of candidates in the sorted demotion list (if this is the first time, this will be an empty set). 
  3.  Remove those candidates from the ballot, and promote all the choices so it doesn't skip any ranks. if a ballot is out of choices, disregard it. 
STEP 4: Check for quota (repeat from step 3 if not reached)
  1. Check if any candidate has reached or exceeded quota (using their weighted value -- always use the weighted value). 
  2.  If one has, the one with the most votes is elected. 
  3.  If none have reached quota, revert to the local copy you made in step 3, and repeat, using the next candidate combination in the sorted list. 
STEP 5: Winner selected, transfer surplus votes proportionally (on _original_ copy).
  1. The candidate with the most first place votes is elected. 
  2. Re-weight the ballots that were used to make quota - that have that person listed as first in the local copy. Their new weight is whatever their previous weight was, times 1-(quota/(the sum of their current weights)). This re-weighting applies to the _original_ ballots, not our local copy. 
  3. Remove the elected candidate from all ballots, and shift the ranks up to make them sequential. If any ballot runs out of choices, remove it. 
STEP 6: Repeat as necessary. Repeat from the step 1 for the next seat.