Thursday, October 30, 2008

When Will Winner Be Known?

2008 November 1. Updated. Still Iowa for Obama at 10 pm, but there are some changes. North Dakota now gets decided at 3:43 am for McCain, instead of after an infinite amount of time. Missouri has flipped into the McCain column and is the last state to be decided, at 5:10 am. Electoral-Vote.com says that if Missouri goes for McCain and Obama wins the election, Missouri loses its bellwether status. It never had that status to begin with: Missouri went the other way from the entire election in 1860, 1872, 1876, 1880, 1888, 1896, 1900, and 1956, when Stevenson took the state. What would happen in this instance is that the statement "The Democrats have never won an election without taking Missouri," which is true now, would become false.

2008 October 31. Updated. Now Iowa wins it for Obama at 10 pm, right as the polls close.

It is now only 5 days until the Great Presidential Election of 2008. After the conventions, the race was a tie, but Barack Obama has piled up a moderate lead in recent weeks. I therefore expect Election Night not to be as long as in 2000 and 2004. The 2004 race was not settled until Wednesday morning, when it became clear that Ohio, the deciding state, was going to go to Bush by a substantial margin. In 2000, the race was not decided until mid-December, with Florida being the deciding state. So I expect this night to be shorter. How much shorter?

To answer that question, look at what happens on Election Night. The networks will not announce any results for any state until the polls close everywhere in the state. However, they will be performing exit polls, and on this basis, when the polls close, they can project ("call") a state for one candidate or the other. If the race in that state is close, they will not call it then but wait until enough votes are in for them to conclude that a candidate has won it. To get an estimate of Decision Time on Election Night, we need to know how the networks are going to call a state. I have only memory to rely on for this, so I will make a guess at a possible calling function.

This function will be a function of the average poll result for the state. For this number, I am going to use the file provided by Electoral-Vote.Com, as well as their procedure, which is to take an average of all the polls for the previous week. I will use only the two major party candidates' poll numbers, which I will normalize by the usual method, namely to set the vote for Barack Obama

= Poll for Obama / (Poll for Obama + Poll for McCain)

and to set the vote for McCain to 1 minus that value. Call the normalized poll result for Obama x, and let c be the closing time of the polls in the state. Then my function is:

f(x) = 0, if |x-0.5| > 0.04 (networks call the state immediately upon poll closing)
     = 336, if x = 0.5 (two weeks. Highly improbable in election but not in polls)
     = 1/(50 |x-0.5|), otherwise.

The value of the function is given in hours. I choose a hyperbolic function on the difference between the vote result and 50%. If a candidate is getting 52% (he leads by 4 points), this function gives one hour; that is, the networks will wait an hour before calling the state. If a candidate gets 51%, the wait is 2 hours, and if it is 50.5% (1 point difference) the wait is 4 hours.

I then add this value to the closing time to get an estimated calling time T for the state; in other words,

T= c + f(x).

I rank-order the states by calling time and add up the electoral votes of the states called for Obama. I then look for the time at which Obama's total reaches 269 electoral votes or more (I assume that Obama wins a 269-269 tie).
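The procedure above can be sketched in a few lines of Python. This is only an illustration of the model as described: the function names and the five toy states (A through E) are hypothetical and are not real 2008 poll data, and closing times are given as hours on a 24-hour clock.

```python
def wait_hours(x):
    """Hours the networks wait after poll closing, given Obama's normalized share x."""
    margin = abs(x - 0.5)
    if margin > 0.04:
        return 0.0            # called immediately upon poll closing
    if margin == 0:
        return 336.0          # exact tie in the polls: two weeks
    return 1 / 50 / margin    # hyperbolic in the distance from 50%


def decision_time(states, needed=269):
    """Return (state, calling time) at which Obama's called electoral votes reach `needed`.

    `states` holds (name, electoral_votes, x, closing_hour) tuples.
    Returns None if Obama never reaches `needed`.
    """
    calls = sorted(
        (closing + wait_hours(x), name, votes, x)
        for name, votes, x, closing in states
    )
    total = 0
    for time, name, votes, x in calls:
        if x > 0.5:           # this state is called for Obama
            total += votes
            if total >= needed:
                return name, time
    return None


# Hypothetical toy data, NOT real polls: (name, electoral votes, x, closing hour)
toy_states = [
    ("A", 200, 0.60, 19.0),
    ("B", 150, 0.40, 19.0),
    ("C", 60, 0.52, 20.0),
    ("D", 30, 0.51, 20.0),
    ("E", 98, 0.495, 21.0),
]
result = decision_time(toy_states)
print(result)
```

In the toy data, state D closes at 8 pm with Obama at 51%, so it is called two hours later, and that call is the one that takes Obama past 269.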

I did that this morning and got that the deciding state will be Indiana, and that the calling time will be 9:45 pm. This will allow for an extra snack or drink and some discussion before going to bed for the night.

Here are the deciding states and times I have been getting the past few days, based on the polls, times in EST:

October 12, North Carolina, 10:40 pm
October 22, California+, 11:00 pm
October 23, California+, 11:00 pm
October 24, Florida, 9:50 pm
October 26, Florida, 10:25 pm
October 27, Florida, 10:00:05 pm
October 28, North Carolina, 10:28 pm
October 29, North Carolina, 10:19 pm
October 30, Indiana, 9:45 pm
October 31, Iowa, 10:00 pm
November 1, Iowa, 10:00 pm

By + for October 22 and 23, I mean that the election is decided by all four states that are immediately called for Obama at 11 pm EST.

From this one can deduce several points. One is that the deciding state is likely to be a large state. Indiana is the smallest state so far that has appeared as a deciding state. For some time earlier this week, McCain was gaining, causing later decision times. But now it's getting earlier again. Yesterday's polls were overwhelmingly for Obama. At 10 pm Obama will either have reached 269 or will be just short of 269. The first state in that hour, from 10-11 pm, that gets called for Obama may very well put him over the top.

This is only a model, and if I change the function, I may change the time and deciding state. But I gather from this that Election Night will be over with by 11 pm EST, and maybe even by 10 pm, and that the deciding state will probably be one of these: California, North Carolina, Indiana, Missouri, Florida, or possibly Nevada, although that is more remote.

Tuesday, March 04, 2008

Ambiguous Children's Number Puzzle

This morning's Kidspot in the Richmond Times-Dispatch features a number puzzle entitled "Sum Fun". The puzzle is given as follows. An array of circles is given as:
 O 
OOO
 O 
The numbers 2, 3, 4, 5, and 6 are also shown. The instructions say "Here are five numbers to play with. Put one number in each circle so that the numbers down and across add up to twelve."

Call the numbers N, C, W, E, and S, for the cells in the northern, central, western, eastern and southern circles. Then we can write down two equations:

N + C + S = 12
W + C + E = 12

We have one other equation:

N + C + S + W + E = 20

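as that is what the numbers sum up to. Further, all five of them have to be distinct integers from 2 to 6. Adding the first two equations and subtracting the third gives C = 4, so the 4 must go in the center. Subtracting each of the first two equations from the third then gives: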

N + S = 8
W + E = 8

So how can one express 8 as the sum of two numbers? 1+7 is no good, because there are no 1s or 7s in the puzzle. 4+4 does not work because there is only one 4. 2+6 and 3+5 are the possibilities, and because of commutativity, 6+2 and 5+3 are also possible. Therefore, one of the pairs N, S and W, E has to consist of 2 and 6 and the other of 3 and 5.

The solution given in the puzzle is:

ANS: ACROSS: 2, 4, 6. DOWN: 5, 4, 3.

Yes, that solves the puzzle. But all the puzzle's conditions imply is that one of the pairs {N, S} and {W, E} is {2,6} and the other is {3,5}. That means that Across: 6, 4, 2; Down: 3, 4, 5 is just as good a solution. In fact there are eight solutions, because one can take either 2+6 or 6+2, either 3+5 or 5+3, and one can put either the 2 and 6 or the 3 and 5 across. The solutions are:

ANS: ACROSS: 2, 4, 6. DOWN: 5, 4, 3.
ANS: ACROSS: 2, 4, 6. DOWN: 3, 4, 5.
ANS: ACROSS: 6, 4, 2. DOWN: 5, 4, 3.
ANS: ACROSS: 6, 4, 2. DOWN: 3, 4, 5.
ANS: ACROSS: 5, 4, 3. DOWN: 2, 4, 6.
ANS: ACROSS: 5, 4, 3. DOWN: 6, 4, 2.
ANS: ACROSS: 3, 4, 5. DOWN: 2, 4, 6.
ANS: ACROSS: 3, 4, 5. DOWN: 6, 4, 2.
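
The count of eight can be checked by brute force. The following is a minimal sketch that tries every placement of the five numbers in the five circles and keeps those where both lines add up to twelve; the variable names follow the N, C, W, E, S labels used above.

```python
from itertools import permutations

solutions = []
for n, c, w, e, s in permutations([2, 3, 4, 5, 6]):
    # Down is N, C, S; across is W, C, E. Both must sum to twelve.
    if n + c + s == 12 and w + c + e == 12:
        solutions.append({"across": (w, c, e), "down": (n, c, s)})

for sol in solutions:
    print("ACROSS:", sol["across"], "DOWN:", sol["down"])
print(len(solutions), "solutions")
```

Every placement it finds has the 4 in the center circle.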

There is a chess-problems term for this type of puzzle that is given as having The Solution, but instead has many solutions. Such a puzzle is said to be cooked. This puzzle is cooked. The authors should have checked this before putting it in the paper. The answer they give is correct, but it is not the only one. A special danger of this type of misproblem is that it teaches children that there is only one way of doing things, that there is only One Answer. This stifles creativity in children. We have enough institutions in our society that insist that there is only One Answer, including government institutions, corporations, special interest groups, and especially religions. United Feature Syndicate should check their Kidspots before they publish them.

Thursday, February 07, 2008

Algebra Problem Helps to Determine Who Wins in November

In Beyond Opinion, I mention that Romney's quitting means that Lichtman Key 2 will probably stand. This could affect who wins in November. So the question is, how many Romney delegates would have to go to Huckabee (or Paul) to topple Key 2?

Here is the present delegate count:

McCain 714
Romney 286
Huckabee 181
Paul 16

Now suppose x of Romney's 286 delegates go to McCain. The remaining 286 - x delegates go to Huckabee, say (they could go to Paul, too, but that does not affect the result). In that case, the delegate counts would be

McCain 714 + x
Romney 0
Huckabee 181 + (286 - x)
Paul 16

We want to know at which value of x McCain's vote is less than twice that of his competitors combined; i.e., when Key 2 falls. The resulting inequality and its solution:

714 + x < 2(181 + (286 - x) + 16)
714 + x < 2(483 - x)
714 + x < 966 - 2x
x + 2x < 966 - 714
3x < 252
x < 84
286 - x > 202
(286 - x) / 286 > 202 / 286 = 70.6%

This means that more than 70.6% of the Romney delegates (at least 203 of the 286) would have to go to Huckabee. Instead, there probably will be pressure on Huckabee to withdraw, and that will clinch it for McCain and secure Key 2 as well.
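
The break-even point can also be found by direct search instead of algebra. This is a small sketch, assuming as above that Key 2 turns on whether McCain holds twice the combined delegates of his rivals; the function name is hypothetical.

```python
mccain, romney, huckabee, paul = 714, 286, 181, 16


def mccain_has_two_thirds(x):
    """True if McCain has at least twice his rivals' combined delegates
    when x of Romney's delegates go to him and the rest go to Huckabee."""
    rivals = huckabee + (romney - x) + paul
    return mccain + x >= 2 * rivals


# Smallest x at which McCain reaches twice his rivals' total (the break-even point).
break_even = min(x for x in range(romney + 1) if mccain_has_two_thirds(x))
to_huckabee = romney - break_even
print(break_even, to_huckabee, round(100 * to_huckabee / romney, 1))
```

The search lands on x = 84, which leaves 202 of Romney's delegates for Huckabee; the last number printed is that share of the 286 as a percentage.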

Elections are good sources of algebra problems, and I will make more mention of these in the future.

Monday, January 21, 2008

Predicting a Primary Winner before CNN Does

It is primary season, and several primaries are being held. After they are held, the networks show the returns and eventually call winners. Sometimes they call winners immediately after the event closes, as with the Nevada Republican caucuses on 2008 January 19, when they called Romney the winner. However, that evening they went a long time without calling the winner of either the South Carolina Republican primary or the Nevada Democratic caucuses, which were held the same day.

Nevertheless, I was able to call the winner in the South Carolina primary at 7:16 pm, as described on my Beyond Opinion site. How did I do this?

It turns out that CNN provides election totals for the candidates after the polls or caucuses close. These don't help with close contests because they can easily reverse. CNN does carry out exit or entrance polls, however. These list the vote according to various aspects of the electorate, including male/female, church attendance, party affiliation, feelings on immigration and so forth. Here is an example of one of the exit polls in the Republican South Carolina primary, after extracting to Microsoft Excel:
Feelings About Bush Administration  Candidate  Huckabee  McCain  Romney  Thompson
Enthusiastic                        -0.17      0.28      0.34    0.18    0.18
Satisfied                           -0.52      0.35      0.3     0.14    0.17
Dissatisfied                        -0.25      0.29      0.38    0.14    0.13
Angry                               -0.05      0.15      0.44    0.22    0.12
The first numeric column shows the percentage of the electorate in each of four categories: Enthusiastic, Satisfied, Dissatisfied, and Angry. CNN gives these percentages in parentheses, which Excel reads as negative numbers, and so it stuck minus signs in front of each entry. But they are really positive, and they give the percentage distribution of the electorate across these four categories. Note that I have deleted the minor candidates to avoid distorting the web page with a wide table. This means that the rows will not add up to 1; the difference is the total of the minor candidates.

For each category, it shows the percentage of each of the votes for the category according to the candidate they voted for or supported. So those who were Satisfied with Bush voted 2% for Giuliani, 35% for Huckabee, 0% for Hunter, and so forth. The 35%, or 0.35, then shows a conditional probability: the probability that a voter voted for Huckabee given that he was satisfied with Bush. The formula for a conditional probability is:

p(A|B) = p(A & B)/p(B)

where A & B means both A and B.

Therefore:

p(Huckabee & satisfied) = p(Huckabee|satisfied) p(satisfied)

Now if one sums over all the categories, one gets:

p(Huckabee|satisfied) p(satisfied) + p(Huckabee|enthusiastic) p(enthusiastic) + p(Huckabee|dissatisfied) p(dissatisfied) + p(Huckabee|angry) p(angry)
= p(Huckabee & satisfied) + p(Huckabee & enthusiastic) + p(Huckabee & dissatisfied) + p(Huckabee & angry)

But this is the same as

p(Huckabee & (satisfied or enthusiastic or dissatisfied or angry))

This is simply p(Huckabee), the percentage of the vote that went to Huckabee, assuming every person polled falls into exactly one category and could not say "none of the above". So multiplying each conditional percentage by its category's share of the electorate and adding up the products gives each candidate's share of the total vote.

To do this, one must take pairwise products of two columns from this array, and add these together. It turns out that Excel has a function, namely SUMPRODUCT, that does this. So one could enter in the box below the Giuliani column:

=-SUMPRODUCT($B2:$B5,C2:C5)

And this gives the Giuliani vote. Then simply copy across the candidates. I put a minus sign in front to counteract the unwarranted assumption that Excel made about the category distribution percentages being negative. The dollar signs tell Excel to keep this coordinate at B; that is, always use the category percentages rather than move across the spreadsheet. The result is:
Feelings About Bush Administration  Candidate  Huckabee  McCain  Romney  Thompson
Enthusiastic                        -0.17      0.28      0.34    0.18    0.18
Satisfied                           -0.52      0.35      0.3     0.14    0.17
Dissatisfied                        -0.25      0.29      0.38    0.14    0.13
Angry                               -0.05      0.15      0.44    0.22    0.12
                                               0.3096    0.3308  0.1444  0.1545
And this shows that McCain won with 33% of the vote, with Huckabee at 31% of the vote, Thompson with 15%, and Romney with 14%. These numbers and hence my spreadsheet analysis were available for 30-45 minutes before CNN called the race for McCain.
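
The same weighted sum can be sketched outside Excel. The snippet below is only an illustration: it uses the table values as printed above, with the category shares entered as positive numbers, and the variable and function names are hypothetical. With these inputs the Huckabee and McCain totals match the spreadsheet row; the Romney and Thompson totals come out a few tenths of a point higher than the row shown, presumably because of a rounding or transcription difference somewhere in the table, but the order of finish is the same.

```python
# Share of the electorate in each category (column B, with the signs corrected).
shares = {"Enthusiastic": 0.17, "Satisfied": 0.52, "Dissatisfied": 0.25, "Angry": 0.05}

# Conditional vote shares: support[candidate][category] = p(candidate | category).
support = {
    "Huckabee": {"Enthusiastic": 0.28, "Satisfied": 0.35, "Dissatisfied": 0.29, "Angry": 0.15},
    "McCain":   {"Enthusiastic": 0.34, "Satisfied": 0.30, "Dissatisfied": 0.38, "Angry": 0.44},
    "Romney":   {"Enthusiastic": 0.18, "Satisfied": 0.14, "Dissatisfied": 0.14, "Angry": 0.22},
    "Thompson": {"Enthusiastic": 0.18, "Satisfied": 0.17, "Dissatisfied": 0.13, "Angry": 0.12},
}


def estimated_vote(candidate):
    """Sum of p(candidate | category) * p(category), like SUMPRODUCT over two columns."""
    return sum(shares[cat] * support[candidate][cat] for cat in shares)


estimates = {name: estimated_vote(name) for name in support}
leader = max(estimates, key=estimates.get)

for name, value in estimates.items():
    print(f"{name}: {value:.4f}")
print("Projected winner:", leader)
```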

I did the same with the Democratic caucuses in Nevada and concluded that Hillary Clinton won. This technique will be useful for all the following primaries, provided CNN provides an exit or entrance poll immediately after the polls close and does not call a winner right away. It is good to double-check by using two or three of the categories, to be sure that about the same results are obtained each time.

How good is this technique? Only as good as the exit polling of CNN (and other networks). Exit polling is much more reliable than early election returns, since it covers the entire state, whereas the returns come in first from the urban areas and then from the rural ones that require hand-counting, for example. There has been only one major error of an exit poll that I know about, namely the call of Florida for Gore in 2000.