Friday, May 31, 2013

On Apathy, Voting, and Scalability

I have mostly focused on web portal bugs here, but doing so has led to a sort of apathy, for a number of reasons.
First, there are too many bugs in web portals. If you look for them, there seems to be, on average, more than one bug per web-page on the internet. Some of these are common, glaring, and recurrent to the point that they become boring to find. Navigating them is a lot like communicating with someone who does not speak the language well and is not paying attention - you just deal with it while trying not to become too bogged down in the workarounds. Facebook interface problems alone could fill a blog with daily entries for several years.
I have also recently spent a good deal of time on Latin American sites, where QA seems to be an alien concept, navigation is atrocious, functionality is spotty at best, and information is tremendously unreliable. Critiquing these is a waste of time, because the list of what works correctly is generally shorter than the list of what does not. Web construction tends to be superficially imitative and full of poorly implemented bells & whistles, while basic navigation and visual organization are clearly not well understood. If you like computing, it's a bit soul-crushing.
So, in the interest of not stopping altogether, here is one I found some time back on YouTube. It is interesting because YouTube seems to be on a continual release cycle, and the number and quality of improvements Google has made to the property since the purchase are absolutely staggering. It seems that, without much of a splash, virtually every pixel on YouTube has been subtly changed and now has slightly different, vastly improved code behind it, while the visual metaphor has been gently shifted.

What's wrong with this picture?
So here's some random video with 5,348 likes and 15 dislikes after 1,448 total views... There you go: Bug du Jour. I'm sure it was fixed in no time, and I bet it's related to scalability. Here's why:

Back in 2007, when I worked at CurrentTV, one of the early problems I faced had to do with a voting feature that was falling on its face, showing the user a beach-ball/hourglass for 15 seconds after the click, at a time when Current.com had about 500 real users. That's abysmal performance by any standard, but when you consider the deals Current TV had in place with vendors running Oracle on fully-loaded, state-of-the-art servers, there's obviously an issue, right?
Well, this is not an unusual scenario. Functionality that develops dynamically pushes prototypes into production, with time-to-market trumping architecture and the belief that things can be fixed later taking center stage in most product and release-planning decisions. All true and reasonable. People, however, have a tendency to freak out at this juncture in a prototype's life, but that's another topic. The story here is that the programmers had conceived of a voting system that showed a tally, and made the mistake of trying to get technical sign-off from non-technical stakeholders*1, which resulted in an additional (and reasonable) requirement: the user's vote had to be reflected instantly in the displayed tally. Naturally, the programmer who wrote this function sent the vote to the SQL server and then ran a query asking for the count of upvotes. The trouble was that the indexes on the table were wrong, so the update query was pretty slow, and the transaction isolation level was set to repeatable read, so the update blocked the count query.
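In code, the original flow was roughly the following (a minimal sketch in Python over a generic DB-API connection; the votes table, its columns, and the function name are my own invention for illustration):

# Naive flow: write the vote, then re-count in the same request.
# With a poorly indexed table the COUNT(*) is a slow scan, and under
# REPEATABLE READ the in-flight vote writes and the tally counts end up
# waiting on each other - hence the 15-second hourglass.
def cast_vote_naive(conn, asset_id, user_id):
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO votes (asset_id, user_id, vote) VALUES (%s, %s, 1)",
        (asset_id, user_id))
    cur.execute(
        "SELECT COUNT(*) FROM votes WHERE asset_id = %s AND vote = 1",
        (asset_id,))
    tally = cur.fetchone()[0]
    conn.commit()
    return tally  # the page renders only after both queries finish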
To solve the problem, I made a transactional table with asset IDs & vote counts (I also collected user IDs, timestamps, & origin IPs for analytical & weighting reasons, but those are not relevant to the story), indexed on these columns separately. Then I made the code get the vote tally once on load over a read-uncommitted connection, send the vote update off on a separate thread, and, instead of re-querying for the tally, simply increment the on-load tally by 1 locally.
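The reworked flow looked roughly like this (again a sketch, not the actual Current.com code; the vote_tallies / vote_events names, the connection-factory argument, and the MySQL-style isolation syntax are all assumptions):

import threading

def load_tally(conn, asset_id):
    # One cheap, indexed read at page-load time. Dirty reads are fine
    # for a counter, so READ UNCOMMITTED avoids waiting on writers.
    cur = conn.cursor()
    cur.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED")
    cur.execute("SELECT vote_count FROM vote_tallies WHERE asset_id = %s",
                (asset_id,))
    row = cur.fetchone()
    return row[0] if row else 0

def record_vote(connect, asset_id, user_id, origin_ip):
    # Fire the write off on its own thread (and its own connection)
    # so the user never waits on it.
    def write():
        conn = connect()
        cur = conn.cursor()
        cur.execute("UPDATE vote_tallies SET vote_count = vote_count + 1 "
                    "WHERE asset_id = %s", (asset_id,))
        cur.execute("INSERT INTO vote_events (asset_id, user_id, origin_ip, ts) "
                    "VALUES (%s, %s, %s, NOW())",
                    (asset_id, user_id, origin_ip))
        conn.commit()
        conn.close()
    threading.Thread(target=write, daemon=True).start()

def cast_vote(connect, asset_id, user_id, origin_ip, tally_on_load):
    record_vote(connect, asset_id, user_id, origin_ip)
    return tally_on_load + 1  # just bump the number we already have

The details will differ, but the essential point is that the displayed tally is never re-derived from the database on the hot path.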
Net effect: wait-time for voting at 500 users was reduced from ~15s to ~240ms, database reads for votes were cut in half, and the expected performance for 1,000,000,000 users is ~470ms*2

So a little planning and application of basic algebraic principles can make the difference between handling a billion users comfortably and reaching a performance limit at 500 users. No doubt, YouTube was in the middle of making a change somewhat along this vein to their voting algorithm when I stumbled onto this bizarre result. I have, in any case, been unable to reproduce it since.

*1 - One might ask why this is a mistake. The reason is that non-technical users don't usually have practice visualizing a complete site from specifications, or noticing the interdependence between logically separate presentation elements unless it is specifically shown to them. When such users are shown partial implementations, it sets their wheels turning about the functionality that will go on top of what they see, and, without consulting the current spec, they will at worst fall into the trap of reinventing already-specified functionality, and at best think of brand-new functionality to add - sometimes resulting in significantly increased scope, regressions, and project slippage.

*2 - The performance prediction is based on the number of seeks it takes to read a b-tree index. Updating or inserting a row will require two additional seeks, so we can add 2 to the result, but what is of interest here is that the number of seeks for a read can be computed pretty reliably using the formula:
log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1

Since block lengths are usually 1024 bytes, an index on a mediumint column is three bytes long, and a data pointer able to address 4G discrete values is 4 bytes, we get, for 500 rows:
log(500) / log(1024 / 3 * 2 / (3 + 4)) + 1 = ln(500)/ln(2048/21) + 1 = 2.35
- a write requires 2 additional seek requests. So if we take the complete operation on a 500-row table to require 6.7 seeks (2.35 for the original read and 4.35 for the update), then the same read on a 1-billion-row table will require ~5.5 seeks, and the update ~7.5, for a total of ~13.
If 6.7 seeks took ~240ms, 13 seeks will take 13/6.7*240 ≈ 470ms
QED
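
If you want to check the arithmetic, the whole footnote fits in a few lines of Python (same assumptions as above: 1024-byte blocks, a 3-byte mediumint key, a 4-byte data pointer, and 2 extra seeks per write):

from math import log

def btree_seeks(row_count, block_len=1024, index_len=3, pointer_len=4):
    # Estimated seeks to walk a b-tree index down to a row
    return log(row_count) / log(block_len / 3 * 2 / (index_len + pointer_len)) + 1

read_small = btree_seeks(500)                # ~2.35
read_big = btree_seeks(1_000_000_000)        # ~5.5
total_small = read_small + (read_small + 2)  # read + write: ~6.7 seeks
total_big = read_big + (read_big + 2)        # read + write: ~13 seeks
print(total_big / total_small * 240)         # ~470ms, scaled from the measured 240ms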