We've made significant updates to the infrastructure supporting Where's It Up this year. Many of these changes were necessitated by rapid growth, from a few thousand tests per day to several million. Frankly, without the first two, I'm not sure we could have remained stable much past a million tests per day.
MongoDB Changes
I blogged about our switch to MongoDB for Where's It Up a while back, and we've been pretty happy with it. When designing our schema, I embraced the "schema-free world" and stored all of the results inside a single document. This is very effective when the whole document is created at once, but it can be problematic when the document will require frequent updates. Our document process was something like this:
- User asks Where's It Up something like: Test google.com using HTTP, Trace, and DNS, from Toronto, Cairo, London, and Berlin.
- A bare-bones Mongo document is created identifying the user, the URI, tests, and cities.
- One gearman job is submitted for each City-Test pair against the given URI. In this instance, that’s 12 separate jobs.
- Gearman workers pick up the work, perform it, and update the document created earlier using $set.
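The flow above can be sketched in a few lines. This is illustrative only: the field names and the `apply_set()` helper are hypothetical, and in the real system MongoDB applies `$set` server-side.

```python
# Illustrative sketch of the original single-document flow.
# Field names are made up; the real workers issue $set updates to MongoDB,
# which performs this merge server-side.

def apply_set(doc, updates):
    """Emulate MongoDB's $set with dotted key paths."""
    for path, value in updates.items():
        node = doc
        keys = path.split(".")
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return doc

# 1. Bare-bones document created when the user submits the request.
doc = {
    "user": "someuser",
    "uri": "http://google.com",
    "tests": ["http", "trace", "dns"],
    "cities": ["toronto", "cairo", "london", "berlin"],
    "results": {},
}

# 2. Each of the 12 gearman jobs later writes its result with $set, e.g.:
apply_set(doc, {"results.toronto.http": {"status": 200, "time": 0.21}})
apply_set(doc, {"results.cairo.trace": {"hops": 14}})
```

Each `$set` grows the same document, which is exactly what triggers the allocation problem described next.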
This is very efficient for reads, but that doesn’t match our normal usage: users submit work, and poll for responses until the work is complete. Once all the data is available, they stop. We’ve optimized for reads in a write heavy system.
The situation for writes is far from optimal. When Mongo creates a document under exact-fit allocation, it considers the size of the data being provided and applies a padding factor. The padding factor maxes out at 1.99x. Because our original document is very small, it's essentially guaranteed to grow by more than 2x, probably with the first traceroute result. As our workers finish, each attempting to add its data to the document, the document will need to be moved, and moved again. MongoDB stores records contiguously on disk, so every time it needs to grow a document beyond its allocated space, it must read the document off disk and rewrite it somewhere else. Clearly a waste of I/O operations.
It's likely that power-of-2 sized allocations would better suit our storage needs, but they wouldn't eliminate moves entirely, just reduce how often they happen.
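A rough sketch of why power-of-2 allocation reduces but doesn't eliminate moves. The byte sizes here are invented for illustration, and MongoDB's actual bucketing differs in detail:

```python
# Rough sketch of power-of-2 record allocation. Sizes are illustrative;
# MongoDB's real implementation has its own bucket boundaries.

def pow2_bucket(size):
    """Round a record size up to the next power-of-2 bucket."""
    bucket = 32  # assume a minimal bucket size
    while bucket < size:
        bucket *= 2
    return bucket

# A small skeleton document lands in a small bucket...
assert pow2_bucket(200) == 256
# ...the first traceroute result overflows it, forcing one move...
assert pow2_bucket(300) == 512
# ...but subsequent small updates fit in the slack, so fewer moves overall.
assert pow2_bucket(500) == 512
```

The slack after each doubling absorbs several updates, but a document that keeps growing will still eventually overflow its bucket and be moved.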
Solution:
Normalize, slightly. Our new structure stores the document framework in the results collection, and each individual work result from a gearman worker is stored in results_details as its own document. This is worse for reads: we need to pull in the parent document, then the child documents. On writes, we're spared the horrible moves we were facing previously. Again, the usage we actually see is: submit some work, poll until done, read the complete dataset, never read again. So this is much better overall.
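The normalized layout looks roughly like this. The collection names come from the post; the `parent_id` linking field and all document contents are assumptions for illustration, modeled here as plain lists rather than live collections:

```python
# Sketch of the normalized two-collection layout.
# "results" and "results_details" are the real collection names; the
# parent_id field and document contents are illustrative assumptions.

results = [
    {"_id": 1, "user": "someuser", "uri": "http://google.com",
     "tests": ["http", "trace", "dns"],
     "cities": ["toronto", "cairo", "london", "berlin"]},
]

# Each gearman worker now inserts its result as its own document,
# so the parent never grows and never needs to be moved.
results_details = [
    {"parent_id": 1, "city": "toronto", "test": "http",
     "result": {"status": 200, "time": 0.21}},
    {"parent_id": 1, "city": "cairo", "test": "trace",
     "result": {"hops": 14}},
]

def fetch(result_id):
    """Reads now cost two lookups: the parent, then its children."""
    parent = next(d for d in results if d["_id"] == result_id)
    children = [d for d in results_details if d["parent_id"] == result_id]
    return parent, children
```

Inserts into `results_details` are append-only from Mongo's perspective, which is what eliminates the document moves.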
It took a little work to make our front end handle both versions, but this change has been very smooth. We continue to enjoy how MongoDB handles replica sets and automatic failover.
Worker Changes
We recently acquired a new client, which drastically increased the number of tasks we need to complete throughout the day: checking a series of URLs using several tests, from every location we have. This forced us to re-examine how we manage our gearman workers to improve performance. Our old system:
- Supervisor runs a series of workers, no more than ~204. (Supervisor's use of Python's select() limits it to 1024 file descriptors, which allows for ~204 workers.)
- Each worker, written in PHP, connects to gearman stating it's capable of completing all of our work types.
- When work arrives, the worker runs a shell command that executes the task on a remote server over a persistent SSH tunnel, waits for the results, then shoves them into MongoDB.
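The worker's job-handling step can be sketched as follows. This is a minimal illustration in Python (our actual workers are PHP), and the command here is a harmless placeholder rather than the real ssh invocation:

```python
# Minimal sketch of what one worker does per job (ours are PHP, not Python).
# The command is a placeholder; the real worker invokes the remote test
# over a persistent ssh tunnel, then writes the result into MongoDB.
import subprocess

def run_check(command):
    """Run a test as a shell command and return its output.
    The worker blocks here until the remote test finishes -- this wait
    is where nearly all of the process's time goes."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.strip()

# Placeholder command standing in for the real remote test:
output = run_check(["echo", "simulated test result"])
```

That blocking wait, multiplied across a couple hundred mostly idle PHP processes, is the root of the problems listed below.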
This gave us a few problems:
- Fewer workers than we’d like, no ability to expand
- High memory overhead for each worker
- The PHP process spends ~99.9% of its time waiting, either for a new job to come in, or for the shell command it executed to complete.
- High load with 1 PHP process per executing job (that actually does work)
We examined a series of options to replace this; writing the job manager as a threaded Java application was seriously considered. It was eventually shot down due to the complexities of maintaining another set of packages,
Truncated by Planet PHP, read more at the original (another 3799 bytes)