This is part one of a two-part editorial series on the recently revealed AMS Electoral Fraud.
While many of our readers have probably read the preliminary report issued by the Elections Committee, and were possibly even present at the last Council meeting, no one has yet provided a detailed technical explanation of exactly how the system was broken. Through brief discussions with members of the EC, I believe I now understand how the attack occurred.
The crux of the matter is that student numbers were not validated during the final submission phase, which allowed for a trivial exploit of the system. Due to the simplicity of this hack, I remain deeply concerned about the validity of any of the election results, and I will be hesitant to accept their accuracy even following the final auditor’s report.
I would, however, like to emphasize that I’m not trying to tear down or belittle the work that this year’s EC did. This year’s elections were probably the smoothest and best-organized elections I have seen during my time at UBC, with this one exception. It is a new system, and mistakes do happen, albeit a rather titanic one in this case. That aside, I think the response following the discovery of the exploit has been handled very well, and I appreciate the level of public disclosure.
The Voting Process (in techspeak)
The exploit itself is frustratingly simple. In fact, I’m frankly irritated I didn’t discover the hole myself and point it out before voting closed. Bear with me if things get technical; I’ll do my best to make things clear.
Note: This information is based on my understanding of the system, the presentation from the EC at AMS Council, and some questions asked of the CRO. No guarantees are provided regarding the specific details of the exploit.
When a voter logged into the system, it asked for their CWL information. Their CWL username and password were then sent to UBC’s authentication server, which then responded with information indicating if the login was successful as well as details about the student. One of these details is the user’s student number. This number was then cross-referenced with a list provided by the Registrar’s office to ensure the user was a valid AMS member who was eligible to vote.
At this point, the system has determined that the student should be able to vote, and thus displays the ballot page. The user then fills out their choices on the page and clicks “Submit”. The data entered in the form is then sent to the server where it is presumably validated and stored in a vote database. From comments made during the EA’s presentation, we also know that a user’s IP address and student number are stored along with their vote in the database.
The Problem
While the above might seem like a logical, straightforward, and superficially secure method of balloting, it suffers from one debilitating flaw. Specifically, validation of the voter is done before displaying the ballot, and not when submitting the vote to the database.
This means that all someone had to do was save the ballot page to disk, and they could submit it as many times as they liked! Yes, folks. It was that simple.
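To make the flaw concrete, here is a minimal Python sketch of the flow as I understand it. Every name in it (ELIGIBLE, show_ballot, submit_ballot, the sample student number) is invented for illustration; this is a toy model under my assumptions, not the actual voting software.

```python
# Toy model of the flawed flow described above. All names and data are
# hypothetical; this is not the real voting system's code.

ELIGIBLE = {"12345678"}   # stand-in for the Registrar's list
votes = []                # stand-in for the vote database


def show_ballot(student_number):
    """Eligibility is checked here, before the ballot is displayed."""
    if student_number not in ELIGIBLE:
        raise PermissionError("not an eligible voter")
    # The returned "page" is just a form the browser can save and resend.
    return {"student_number": student_number, "choice": None}


def submit_ballot(form):
    """The flaw: nothing is re-checked when the ballot arrives."""
    votes.append((form["student_number"], form["choice"]))


saved_page = show_ballot("12345678")   # log in once, save the page
saved_page["choice"] = "Candidate A"
for _ in range(5):                     # resubmit the saved page repeatedly
    submit_ballot(saved_page)

print(len(votes))  # 5 ballots recorded from a single login
```

The loop at the bottom is the whole "attack": one legitimate login, then as many submissions as you care to send.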
Of course, it would also be relatively trivial to create a script which would post ballots repeatedly to the server, thus allowing one to specify a desired degree of manipulation without doing any of the tedious forgery by hand. Since we know at least 731 invalid ballots were cast over a 4-hour span, I find it likely that a script was used in this process. Fortunately for us, the author of said script didn’t feel it was necessary to hide his submissions and they were thus noticed with relative ease.
The question that keeps coming to my mind is quite simply: how on earth did such an obvious and gaping security hole go unnoticed in our voting software? One of the paramount rules in handling form data is to always validate server-side after submission. You have to assume that someone will try to submit faulty (or in this case multiple) data sets, and your script must therefore perform detailed checks on the data once it is received.
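As a rough illustration of what validating after submission could look like, here is a short Python sketch. The names (ELIGIBLE, already_voted, submit_ballot) and the sample student numbers are my own placeholders, and I am assuming the student number comes from the server's own login session and not from the submitted form.

```python
# Sketch of server-side validation at submission time. Names and data are
# hypothetical stand-ins, not the real system's code.

ELIGIBLE = {"12345678", "23456789"}   # stand-in for the Registrar's list
already_voted = set()                 # who has cast a ballot so far
ballots = []                          # recorded choices


def submit_ballot(authenticated_student_number, choice):
    """Re-check the voter when the ballot arrives, then record it.

    The student number must come from the server's own login session,
    never from a field in the submitted form.
    """
    if authenticated_student_number not in ELIGIBLE:
        return False                  # not on the voters list
    if authenticated_student_number in already_voted:
        return False                  # second ballot from the same voter
    already_voted.add(authenticated_student_number)
    ballots.append(choice)
    return True


first = submit_ballot("12345678", "Candidate A")
replay = submit_ballot("12345678", "Candidate A")
outsider = submit_ballot("99999999", "Candidate A")
print(first, replay, outsider, len(ballots))  # True False False 1
```

Keeping the "who has voted" set apart from the list of choices is a design choice of mine in this sketch; it is one way to block repeat ballots without tying a choice to a voter.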
The Solution
From a technical standpoint this problem would be relatively easy to solve. There are multiple ways this could be achieved without compromising the privacy of voters. Upon logging in, the system could verify that you A) are eligible, and B) haven’t voted yet. It would then assign a one-time session key which is stored in a list of active sessions. When your ballot is submitted, it would check whether the associated session key is still valid and, if so, accept the ballot and invalidate the key. Another simple method would be to cryptographically hash student numbers (say using an md5sum or other desired method) and store the hash with the ballot. Since it is (virtually) impossible to break a hash short of using a brute force method, this would ensure privacy while preventing multiple voting.
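Both ideas can be sketched in a few lines of Python. Treat this as an illustration under my own assumptions: the function and variable names are made up, and I have swapped in HMAC-SHA-256 with a server-side secret in place of a plain md5sum, since a keyed hash cannot be recomputed by someone who does not hold the secret.

```python
import hashlib
import hmac
import secrets

# Sketch of the two ideas above. All names are hypothetical, and HMAC-SHA-256
# with a server-side secret is my substitution for the plain md5sum mentioned
# in the text.

ELIGIBLE = {"12345678"}              # stand-in for the Registrar's list
SERVER_SECRET = secrets.token_bytes(32)
active_sessions = {}                 # one-time session key -> voter tag
used_tags = set()                    # tags of voters who have already voted
ballots = []                         # (voter tag, choice)


def voter_tag(student_number):
    """Keyed hash of the student number, stored in place of the number."""
    return hmac.new(SERVER_SECRET, student_number.encode(), hashlib.sha256).hexdigest()


def login(student_number):
    """Issue a one-time session key if the voter is eligible and has not voted."""
    tag = voter_tag(student_number)
    if student_number not in ELIGIBLE or tag in used_tags:
        return None
    key = secrets.token_urlsafe(32)
    active_sessions[key] = tag
    return key


def submit_ballot(session_key, choice):
    """Accept the ballot only if the key is still valid, then invalidate it."""
    tag = active_sessions.pop(session_key, None)   # pop makes the key single-use
    if tag is None or tag in used_tags:
        return False
    used_tags.add(tag)
    ballots.append((tag, choice))
    return True


key = login("12345678")
print(submit_ballot(key, "Candidate A"))   # True: first submission
print(submit_ballot(key, "Candidate A"))   # False: key already used
print(login("12345678"))                   # None: this voter has voted
```

Checking used_tags in submit_ballot as well as in login covers the case where someone logs in twice before voting and ends up holding two valid keys.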
Ramifications
As much as it pains me to say it, I don’t see any way we can conclusively determine whether or not there has been additional tampering with the election results. Due to the nature of the exploit, it is entirely reasonable to believe that multiple people may have found it. While it appears that one person’s ‘hack’ was obvious in nature, it is possible that someone else created a script to submit results from different IPs over a wider range of times. To me, this would be essentially impossible to detect. (I have been informed by a couple of people that there are statistical analysis methods which could reveal whether tampering occurred, but I remain unconvinced. I’ll let our resident statistician/math whiz explain if he so chooses.)
As I find this possibility entirely too real, I’m afraid I won’t be trusting any results from this election, even after the auditor’s report is filed. That said, running another election at this point is completely out of the question. Turnout would be abysmal, the AMS would become more of a joke to the general student body, and it would waste a large amount of time and resources.
So what do I propose? The revised executive, BoG, Senate and I.Rep results from the Elections Committee (provided the auditors can provide some assurance that they believe them to be accurate) should be accepted. However, everything the executive does this year must be taken with a higher degree of oversight and scrutiny. As well, none of them should be operating under the belief that they have a clear ‘mandate’ from students, particularly those in close races.
With regard to the referenda, the results should be nullified. They are the one part of this election that *can* be re-run at a later date without any real negative effect. They also have the most lasting effect on the organization and therefore require the highest degree of certainty.
This was surprisingly easy to understand, I think. (?)
However, you didn’t mention the VFM races in your proposal…which are the only positions (I suppose besides the Executive’s 25K salary) that have money involved.
Dumdumdum…Monday should be interesting. Thanks for your analysis!
But remember, Andrew, all but one of the referenda failed to reach quorum, so if additional faulty votes had been inconspicuously submitted, those referenda would only have fallen further short of quorum without them. It is only for the one that passed that there is any argument to be made.
One thing we can go off of is past results we believe to be secure. It’s reasonable to expect a turnout proportional to the number of votes, and you can generate a model to predict turnout by year.
Further, you can model what random independent identically distributed votes would look like. Then you can analyze clusters to point out times where the data breaks the model. (as I pointedly showed last time I extrapolated this stuff, it’s more complicated than a simple logarithmic distribution)
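A minimal sketch of that idea in Python, assuming a constant-rate Poisson baseline (a deliberate oversimplification) and using made-up hourly counts, not election data. The function names and the threshold alpha are hypothetical.

```python
import math

# Sketch of flagging time windows that break a simple baseline model.
# The hourly counts are made-up illustration data, not election data, and a
# constant-rate Poisson model is a simplifying assumption.


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), with each term computed in log space."""
    if k <= 0:
        return 1.0
    below = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1)) for i in range(k)
    )
    return max(0.0, 1.0 - below)


def flag_bursts(hourly_counts, baseline_rate, alpha=1e-6):
    """Return indices of hours whose count is implausibly high under the model."""
    return [
        hour
        for hour, count in enumerate(hourly_counts)
        if poisson_tail(count, baseline_rate) < alpha
    ]


hourly_counts = [38, 42, 35, 190, 185, 40, 44]   # hypothetical votes per hour
baseline = 40.0                                  # hypothetical expected rate
print(flag_bursts(hourly_counts, baseline))      # [3, 4]
```

A check like this only catches bursts that stand out from the baseline; votes spread thinly over many hours would need a richer model of how turnout varies through the day.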
This analysis is made a lot more accurate if you take the results of last year’s election to be secure. If you don’t, well, then the error becomes a lot larger.
An important thing to note is that there are stats whizzes a-crunchin’ the numbers to pick apart this data. I’m sure they’ll be full and forthright in their report on Monday.
(Also, this stuff is fascinating. Anyone in data forensics looking for a keen mathematician with some firm stats under his belt, get in touch.)
Really? Really?!!
So sending an un-authenticated GET/PUT request will count as a valid vote as far as the system is concerned?
WTF.
“cryptographically hash student numbers (say using an md5sum or other desired method) and store the hash with the ballot. Since it is (virtually) impossible to break a hash short of using a brute force method”
Is this for the voting process or storing the votes until after the election?
Since student numbers all follow patterns, brute forcing them is easy. 8 numbers, third to last probably 0, second to last probably 5-9, first tends to be 6 or 7, etc… you start with the most probable guess and keep going. You can also store each try in a database, so after you crack one student #, you can test all your tries so far against the next hash.
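A rough Python sketch of that search, for illustration only. The student number and the per-position digit guesses are made up to mirror the pattern described above, and the helper names are hypothetical.

```python
import hashlib
from itertools import product

# Sketch of why an unsalted hash of a student number is weak. The student
# number and digit patterns below are made up for illustration.


def md5_hex(student_number):
    return hashlib.md5(student_number.encode()).hexdigest()


def crack(target_hash, digit_choices):
    """Try every number allowed by the per-position digit choices."""
    for digits in product(*digit_choices):
        guess = "".join(digits)
        if md5_hex(guess) == target_hash:
            return guess
    return None


all_digits = "0123456789"
# Hypothetical pattern: first digit 6 or 7, third to last 0, second to last 5-9.
pattern = ["67", all_digits, all_digits, all_digits, all_digits, "0", "56789", all_digits]
# 2 * 10^4 * 1 * 5 * 10 = 1,000,000 candidates, out of 10^8 possible numbers.

target = md5_hex("71234058")     # hash of a made-up student number
print(crack(target, pattern))    # 71234058
```

Even without the pattern, the full space is only 10^8 candidates, so the same loop would eventually cover every possible number. Saving each guess and its hash as you go gives the lookup table mentioned above, which can then be reused against every other stored hash.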
I find this story rather incredible. I’m on the exec at the GSA in Waterloo, and run our election system. Luckily, I know our ballot wouldn’t ever do this.
The way our ballot is designed is that ANYONE can see the ballot. Once you cast your vote, a login screen appears (using our “Quest” UserID and password – I guess like your CWL system). The system checks to see if you’re in the voter list, and then if you’re not on the list of people who have already voted. Your vote is recorded if and only if you pass both tests.
I’m not about to go in-depth as to the nuts and bolts of how our ballot works, but the HTML source code of any voting page only reveals a “shortcode” from WordPress.