dvara: A Mongo Proxy

We wrote dvara, a connection pooling proxy for mongo, to solve an immediate problem we were facing. We were running into the connection limits on some of our replica sets. Mongo through 2.4 had a max-max conn limit of 20,000. As the number of our application servers grew, the number of concurrent active connections to our replica sets grew. Mongo 2.6 removed this limit, but it was unfortunately not ready at that time (we’re still testing it and haven’t upgraded to it yet). Even if it were ready, the cost per connection is 1MB, which takes away precious memory otherwise used by the database. A sharded cluster with mongos as the proxy was another path we considered. Enabling sharding may have helped, but that change would spill over into our application logic and we use at least some of the restricted features. We are experimenting with sharded replica sets in our environment, and from our experience we weren’t confident they would actually help with our connection limit problem. So we set out on what seemed like an ambitious, and in my mind, a difficult goal of building a connection pooling proxy for mongod.

Down to the Wire

We started off with a simple proof of concept, working backwards from legacy wire protocol documentation. We got it far enough to serve basic read/write queries in a few weeks. We attribute the speed at which we got the prototype working to using Go to build it. Go allowed us to write easy to follow code, and yet not pay the cost of a thread per connection, or the alternative of having to write callbacks or some other form of manually managed asynchronous network IO logic. Additionally, while our proxy prefers to not look at the bytes flowing through or decode the BSON for performance reasons, Gustavo Niemeyer‘s excellent mgo driver, along with its bson library made it trivial for us to introspect and mutate the traffic we needed to. The first of these cases was the isMaster and the replSetGetStatus commands. These command return the member/host information the client uses to decide who to connect and talk to. We need to replace the real host/ports with the proxy host/ports.

Yet another command that needed special handling, and one of the known problems we had to solve was to handle the way Mongo 2.4 and earlier require a second follow up call for getLastError. Fortunately this got some much needed love in 2.6, but until 2.4 mutation operations were essentially split into two parts: first, the mutation itself; and second, the getLastError command which included some important options, including the write concern. Consider what a connection pooling proxy does: a client sends a command, we take a connection from our pool, proxy the command and the response, and put the connection back into the pool for someone else to use. A good proxy would hold a connection from the pool for the least amount of time possible. Unfortunately the design of getLastError means we can’t do that, because getLastError is state that exists in mongod per-connection. This design is awkward enough that it actually requires special logic for the mongo shell to ensure it doesn’t get inadvertently reset. It was clear we’ll need to similarly maintain this state per connection in the proxy as well. Our implementation tries to preserve the semantics mongod itself has around getLastError, though once we’ve moved all our servers and clients to 2.6 this will be unnecessary with the new wire protocol.

Proxying in Production

An aspect we refined before we started using this in production was to auto discover replica set configuration from the nodes. At first our implementation required manual configuration that mapped each node we wanted to proxy. We always need a mapping in order to alter the responses for the isMaster and replSetGetStatus responses mentioned earlier. Our current implementation automatically configures this and uses the provided member list as a seed list. We’re still improving how this works, and likely will reintroduce manual overrides to support unusual situations that often arise in real life.

One of the benefits of dvara has been the ability to get metrics about various low level operations which were not necessarily readily available to us. We track about 20 metrics including things like number of mutation operations, number of operations with responses, latency of operations, number of concurrent connections. Our current implementation is tied to Ganglia using our own go client but we’re working on making that pluggable.

We’ve been using dvara in production for some time, but we know there are mongo failure scenarios it doesn’t handle gracefully yet. We also want a better process around deploying new versions of dvara without causing disruptions to the clients (possibly using grace). We want to help improve the ecosystem around mongo, and would love for you to contribute!

June 23, 2014

Fun with TokuMX

TokuMX is an open source distribution of MongoDB that replaces the default B-tree data structure with a fractal tree index, which can lead to dramatic improvements in data storage size and write speeds. Mark Callaghan made a series of awesome blog posts on benchmarking InnoDB, TokuMX and MongoDB, which demonstrate TokuMX’s remarkable write performance and extraordinarily efficient space utilization. We decided to benchmark TokuMX against several real-world scenarios that we encountered in the Parse environment. We also built a set of tools for capturing and replaying query streams. We are open sourcing these tools on github so that others may also benefit from them (we’ll discuss more about them in the last section).

In our benchmarks, we tested three aspects of TokuMX: 1. Exporting and importing large collections; 2. Performance for individual write-heavy apps; and 3. Database storage size for large apps.

1. Importing Large Collections

We frequently need to migrate data by exporting and importing collections between replica sets. However, this process can be painful because sometimes the migration rate is ridiculously slow, especially for collections with a lot of small entries and/or complicated indexes. To test importing and exporting, we performed an import/export on two representative large collections with varying object counts.

  • Collection1: 143 GB collection with ~300 millions of small objects
  • Collection2: 147 GB collection with ~500 thousands of large objects

Both collections are exported from our existing MongoDB collections, where collection1 took 6 days to export and collection2 took 6 hours. We used the mongoimport command to import collections to MongoDB and TokuMX instances. Benchmark results for importing collection1, with a large number of small objects: TokuMX is 3x faster to import.

# Collection1: exported from MongoDB for 6 days

Database         Import Time
MongoDB           58 hours 37 minutes
TokuMX            14 hours 28 minutes

Benchmark results for importing collection2, with a small number of large objects: TokuMX and MongoDB are roughly in parity.

# Collection2: exported from MongoDB for 6 hours

Database         Import Time
MongoDB           48 minutes
TokuMX            53 minutes

2. Handling Heavy Write Loads

One of our sample write-intensive apps issues a heavy volume of “update” requests with large object sizes. Since TokuMX is a write-optimized database, we decided to benchmark this query stream against both MongoDB and TokuMX. We recorded 10 hours of sample traffic, and replayed it against both replica sets. From the benchmark results, TokuMX performs 3x faster for this app with much smaller latencies at all histogram percentiles.

# MongoDB Benchmark Results
- Ops/sec: 1695.81
- Update Latencies:
    P50: 5.96ms
    P70: 6.59ms
    P90: 11.57ms
    P95: 18.40ms
    P99: 44.37ms
    Max: 102.52ms
# TokuMX Benchmark Results
- Ops/sec: 4590.97
- Update Latencies:
    P50: 3.98ms
    P70: 4.49ms
    P90: 6.33ms
    P95: 7.61ms
    P99: 12.04ms
    Max: 16.63ms

3. Efficiently Using Disk Space

Space efficiency is another big selling point for TokuMX. How much can TokuMX save in terms of disk utilization? To figure this out, we exported the data of one of our shared replica sets (with 2.4T data in total) and imported them into TokuMX instances. The result was stunning: TokuMX used only 379G disk space —about 15% of the original size.

Benchmark Tools

Throughout the benchmarks, we focused on:

  • Using “real” query patterns to evaluate the database performance
  • Figuring out the maximal performance of the systems

To achieve those goals, we developed a tool, flashback, that records the real traffic to the database and replays ops with different strategies. You can replay ops either as fast as the database can accept them, or according to their original timestamp intervals. We are open sourcing this tool because we believe it will be also useful for people who are interested in recording their real traffic and replaying it against different production environments, such as for smoke testing version upgrades or different hardware configurations. For more information on using flashback, please refer to this document. We accept pull requests!

Kai Liu
June 20, 2014

Dependency Injection with Go

Dependency Injection (DI) is the pattern whereby the dependencies of a component are provided to it and made part of its state. The pattern is used to isolate components from the implementations of their dependencies. Go, with its interfaces, eliminates many of the reasons for a classic DI system (in Java, etc). Our inject package provides very little of what you’ll find in a system like Dagger or Guice, and focuses on eliminating the need to manually allocate instances and wire up the object graph. This is both because many of those aspects are unnecessary, and also because we wanted to make injection simpler to understand in our Go codebase.

Our path to building inject went through a few stages:



It started with a unanimous, noble goal. We had global connection objects for services like Mongo, Memcache, and some others. Roughly, our code looked like this:

var MongoService mongo.Service

func InitMongoService(url string) {
  MongoService = ...

func GetApp(id uint64) *App {
  a := new(App)
  return a

Typically the main() function would call the various init functions like InitMongoService with configuration based on flags/configuration files. At this point, functions like GetApp could use the service/connection. Of course we sometimes ran into cases where we forgot to initialize the global and so got into a nil pointer panic.

Though in production the globals were shared resources, having them had (at least) two downsides. First, code was harder to ponder because the dependencies of a component were unclear. Second, testing these components was made more difficult, and running tests in parallel was near impossible. While our tests are relatively quick, we wanted to ensure they stay that way, and being able to run them in parallel was an important step in that direction. With global connections, tests that hit the same data in a backend service could not be run in parallel.


Eliminating Globals

To eliminate globals, we started with a common pattern. Our components now explicitly depended on say, a Mongo service, or a Memcache service. Roughly, our naive example above now looked something like this:

type AppLoader struct {
  MongoService mongo.Service

func (l *AppLoader) Get(id uint64) *App {
  a := new(App)
  return a

Many functions referencing globals now became methods on a struct containing its dependencies.


New Problems

The globals and functions went away, and instead we got a bunch of new structs that were created in main() and passed around. This was great, and it solved the problems we described. But… we had a very verbose looking main() now. It started looking like this:

func main() {
  mongoURL := flag.String(...)
  mongoService := mongo.NewService(mongoURL)
  cacheService := cache.NewService(...)
  appLoader := &AppLoader{
    MongoService: mongoService,
  handlerOne := &HandlerOne{
    AppLoader: appLoader,
  handlerTwo := &HandlerTwo{
    AppLoader:    appLoader,
    CacheService: cacheService,
  rootHandler := &RootHandler{
    HandlerOne: handlerOne,
    HandlerTwo: handlerTwo,

As we kept going down this path, our main() was dominated by a large number of struct literals which did two mundane things: allocating memory, and wiring up the object graph. We have several binaries that share libraries, so often we’d write this boring code more than once. A noticeable problem that kept occurring was that of nil pointer panics. We’d forget to pass the CacheService to HandlerTwo for example, and get a runtime panic. We tried constructor functions, but they started getting a bit out of hand, too, and still required a whole lot of manual nil checking as well as being verbose themselves. Our team was getting annoyed at having to set up the graph manually and making sure it worked correctly. Our tests set up their own object graph since they obviously didn’t share code with main(), so problems in there were often not caught in tests. Tests also started to get pretty verbose. In short, we had traded one set of problems for another.


Identifying the Mundane

Several of us had experience with Dependency Injection systems, and none of us would describe it as an experience of pure joy. So, when we first started discussing solving the new problem in terms of a DI system, there was a fair amount of push back.

We decided that, while we needed something along those lines, we needed to ensure that we avoid known complexities and made some ground rules:

  1. No code generation. Our development build step was just go install. We did not want to lose that and introduce additional steps. Related to this rule was no file scanning. We didn’t want a system that was O(number of files) and wanted to guard against an increase in build times.
  2. No subgraphs. The notion of “subgraphs” was discussed to allow for injection to happen on a per-request basis. In short, a subgraph would be necessary to cleanly separate out objects with a “global” lifetime and objects with a “per-request” lifetime, and ensure we wouldn’t mix the per-request objects across requests. We decided to just allow injection for “global” lifetime objects because that was our immediate problem.
  3. Avoid code execution. DI by nature makes code difficult to follow. We wanted to avoid custom code execution/hooks to make it easier to reason about.

Based on those rules, our goals became somewhat clear:

  1. Inject should allocate objects.
  2. Inject should wire up the object graph.
  3. Inject should run only once on application startup.

We’ve discussed supporting constructor functions, but have avoided adding support for them so far.



The inject library is the result of this work and our solution. It uses struct tags to enable injection, allocates memory for concrete types, and supports injection for interface types as long as they’re unambiguous. It also has some less often used features like named injection. Roughly, our naive example above now looks something like this:

type AppLoader struct {
  MongoService mongo.Service `inject:""`

func (l *AppLoader) Get(id uint64) *App {
  a := new(App)
  return a

Nothing has changed here besides the addition of the inject tag on the MongoService field. There are a few different ways to utilize that tag, but this is the most common use and simply indicates a shared mongo.Service instance is expected. Similarly imagine our HandlerOne, HandlerTwo & RootHandler have inject tags on their fields.

The fun part is that our main() now looks like this:

func main() {
  mongoURL := flag.String(...)
  mongoService := mongo.NewService(mongoURL)
  cacheService := cache.NewService(...)
  var app RootHandler
  err := inject.Populate(mongoService, cacheService, &app)
  if err != nil {

Much shorter! Inject roughly goes through a process like this:

  1. Looks at each provided instance, eventually comes across the app instance of the RootHandler type.
  2. Looks at the fields of RootHandler, and sees *HandlerOne with the inject tag. It doesn’t find an existing instance for *HandlerOne, so it creates one, and assigns it to the field.
  3. Goes through a similar process for the HandlerOne instance it just created. Finds the AppLoader field, similarly creates it.
  4. For the AppLoader instance, which requires the mongo.Service instance, it finds that we seeded it with an instance when we called Populate. It assigns it here.
  5. When it goes through the same process for HandlerTwo, it uses the AppLoader instance it created, so the two handlers share the instance.

Inject allocated the objects and wired up the graph for us. After that call to Populate, inject is no longer doing anything, and the rest of the application behaves the same as it did before.


The Win

We got our more manageable main() back. We now manually create instances for only two cases: if the instance needs configuration from main, or if it is required for an interface type. Even then, we typically create partial instances and let inject complete them for us. Test code also became considerably smaller, and providing test implementations no longer requires knowing the object graph. This made tests more resilient to changes far away. Refactoring also became easier as pulling out logic did not require manually tweaking the object graphs being created in various main() functions we have.

Overall we’re quite happy with the results and how our codebase has evolved since the introduction of inject.



You can find the source for the library on Github:


We’ve also documented it, though playing with it is the best way to learn:


We love to get contributions, too! Just make sure the tests pass:


May 13, 2014

Smart Indexing at Parse

We love running databases at Parse! We also believe that mobile app developers shouldn’t have to be DBAs to build beautiful, performant apps.

No matter what your data model looks like, one of the most important things to get right is your indexes. If your data is not indexed at all, every query will have to perform a full table scan to return even a single document. But if your data is indexed appropriately, the number of documents scanned to return a correct query result should be low.

We host over 200k mobile apps at Parse, so obviously we can not sit down and individually determine the correct indexes for each app. Even if we did, this would be an ineffective mechanism for generating indexes because developers can change their app schemas or query access patterns at any time. So instead we rely on our algorithmically generated smart indexing strategy.

We perform two types of index creation logic. The first generates simple indexes for each API request, and the second does offline processing to pick good compound indexes based on real API traffic patterns. In this blog post I will cover the strategies we use to determine simple single-column indexes. We will detail our compound indexing strategies in a followup post.

At a high level, our indexing strategy attempts to pick good indexes based on a) the order of operators’ usefulness and b) the expected entropy of the value space for the key. We believe that every query should have an index; we choose to err on the side of too many indexes rather than too few. (Every index causes an additional write, but this is a problem that is easier to solve operationally than unindexed queries.)

Operator usefulness is determined primarily based on its effectiveness in providing the smallest search space for the query. The order of operators’ usefulness is:

  • equals
  • <, <=, =>, >
  • prefix string matches
  • $in
  • $ne
  • $nin
  • everything else

For data types, we heavily demote booleans, which have very low entropy and are not useful to index. We also throw out relations/join tables, since no values are stored for these keys. We heavily promote geopoints since MongoDB won’t run a geo query without a geo index. Other data types are ranked by their expected entropy of the value space for the key. Data types in order of usefulness are:

  • array
  • pointers
  • date
  • string
  • number
  • other

We score each query according to the above metrics, and run ensureIndex() on the three top-scoring fields for each query. If the query includes a $or, we compute the top three indexes for each branch of the $or.

When we first started using smart indexing, all indexes were ensured inline with each API request. This created some unfortunate pile-on behavior any time a large app had schema or query pattern changes. Instead of ensuring the index in the request path, we now drop the indexing job in a queue where it is consumed by an indexer worker. This ensures that no more than one index can be generated at a time per replica set, which protects us from these pile-on effects.

Even the best indexing strategy can be defeated by suboptimal queries. Please see Abhishek’s primer from last week on how to write queries that aren’t terrible.

Charity Majors
April 1, 2014

Writing Efficient Parse Queries

Any time you store objects on Parse, they get stored on a database. Usually, a database stores a bag of data on disk, and when you issue Parse queries it looks up the data and returns the matching objects. Since it doesn’t make sense for the database to look at all the data present in a particular Parse class for every query, the database uses something called an index. An index is a sorted list of items matching a given criteria. Indexes help because they allow the database to do an efficient search and return matching results without looking at all of the data. For the more algorithmically inclined, this is an O(log(n)) vs O(n) tradeoff. Indexes are typically smaller in size and fit in memory, resulting in faster lookups.

There are a couple of operations – $ne (Not Equal To) and $nin (Not In) which cannot be supported by indexes. When you run a query with a “Not Equal To” clause, the database has to look at all objects in the particular Parse class. Consider an example: we have a column “Color” in our Parse Class. The index helps us know which objects have “Color” blue, green and so on. When running the query:

 { "Color" : { "$ne" : "red" } }

The index is useless, as it can only tell which objects have a particular color. Instead, the database has to look at all the objects. In database language this is called a “Full Table Scan” and it causes your queries to get slower and slower as your database grows in size.

Luckily, most of the time a $ne query can be rewritten as a $in query. Instead of querying for the absence of values, you ask for values which match the rest of the column values. Doing this allows the database to use an index and your queries will be faster.

For example if the “User” object has a column called “State” which has values “SignedUp”, “Verified”, and “Invited”, the slow way to find all users who have used the app at least once would be to run the query:

 { "State" : { "$ne" : "Invited" } }

It would be faster to use the $in version of the query:

 { "State" : { "$in" : [ "SignedUp", "Verified" ] }

In a similar vein a $nin query is bad, too, as it also can’t use an index. You should always try to use the complimentary $in query. Building on the above example, if the “State” column had one more value, “Blocked”, to represent blocked users, a slow query to find active users would be:

 { "State" : { "$nin" : [ "Blocked", "Shadow" ] } }

Using a complimentary $in query will be always be faster:

 { "State" : { "$in" : [ "Active", "Verified" ] } }

We’re working hard here at Parse so that you don’t have to worry about indexes and databases, and knowing some of the tradeoffs in using the $ne, $nin operators goes a long way in making your app faster.

March 27, 2014

MongoDB 2.6 Production Release Highlights

The Parse platform relies heavily on MongoDB. We use MongoDB for a variety of workloads, such as routing and storing application data, monitoring, real-time API request performance analysis, and billing and backend platform analytics. We love the flexibility and richness of the feature set that MongoDB provides, as well as the rock-solid high availability and primary elections.

So we are incredibly excited about the upcoming MongoDB 2.6.0 production release. You can read the full release notes here, but I’d like to highlight a few of the changes and features we are most excited about.

First and foremost, the index build enhancements. You will now be able to build background indexes on secondaries! For those who are unfamiliar with the way mongo performs indexing, the default behavior is to build indexes in the foreground on both primaries and secondaries. Foreground indexing means the index op will grab the global lock and no other database operations will be able to execute until the index has finished building (this includes killing the index build). Obviously this is not a reasonable option for those of us who ensure indexes are built in the normal request path. You can instruct the primary to build indexes in the background, which lets the indexing operation yield to other ops, but until now there has been no similar functionality on the secondaries. Parse makes extensive use of read queries to secondaries for things like our push services, and we have had to implement a complicated set of health checks to verify that secondaries are not locked and indexing while we try to read from them. Background indexes on secondaries will make this process much simpler and more robust.

Another terrific indexing improvement is the ability to resume interrupted index builds.

We are also very excited about a number of query planner improvements. The query planner has been substantially rewritten in 2.6.0, and we are eager to take it for a test drive. Mongo has also now implemented index intersection, which allows the query planner to use more than one index when planning a query.

The explain() function has been beefed up and they have added a whole suite of methods for introspecting your query plan cache. In the past we have often been somewhat frustrated trying to infer which query plan is being used and why there are execution differences between replica set members so it is great to have these decisions exposed directly.

Another interesting change is that PowerOf2Sizes is now the default storage allocation strategy. I imagine this is somewhat controversial, but I think it’s the right call. PowerOf2 uses more disk space, but is far more resistant to fragmentation.  An ancillary benefit is that padding factors are no longer relevant. One issue we have had at Parse (that no one else in the world seems to have, to be fair) is that we cannot do initial syncs or repairDatabase() because it resets all the padding factors to 1.0. This causes all updates or writes to move bits around disk for weeks to come as the padding factors are relearned, which in turn hoses performance. The inability to do initial sync or repair means we have had no way of reclaiming space from the database.

The hard-coded maxConns limit is also being lifted. Previously your connection limit was set to 70% of ulimit or 20k connections, whichever is lower, but the hard cap is now gone.  This totally makes sense and I am glad it has been lifted. However you should still be wary of piling on tens of thousands of connections, because each connection uses 1mb of memory and you do not want to starve your working set of RAM.

Here’s another thing I missed the first dozen or so times I read through the release notes: QUERY RUNTIME LIMITS! MongoDB now lets you tag a cursor or a command with the maxTimeMS() method to limit the length of time a query is allowed to run. This is a thrilling change. Parse (and basically everyone else who runs mongo at scale) has a cron job that runs every minute and kills certain types of queries that have been running past their useful lifespan (e.g. their querying connection has vanished) or are grabbing a certain type of lock and not yielding. If maxTimeMS() works as advertised, the days of the kill script may be gloriously numbered.

Ok, so those are the delicious goodie bits. Lastly let’s take a look at a painful but necessary change that I am afraid is going to take a lot of people by surprise: stricter enforcement of index key length. In all previous versions of mongo, it would allow you to insert a key with an indexed value larger than 1024 bytes, and it would simply warn you that the document would not be indexed. In 2.6 it will start rejecting those writes or updates by default. This is unquestionably the correct behavior, but will probably be very disruptive for a lot of mongo users when previously accepted writes start to break. They have added a flag to optionally preserve the old behavior, but all mongo users should be thinking about how to move their data sets to a place where this restriction is acceptable. The right answer here is probably some sort of prefix index or hashed index, depending on the individual workload.

There is a lot of exceptionally rich feature development and operational enhancements in this 2.6 release. We have been smoke testing the secondaries in production for some time now and we’re very much looking forward to upgrading our fleet. Be sure to check out the rest of the 2.6 release notes for more great stuff!

Charity Majors
March 25, 2014

Building Scalable Apps on Parse

At the SF Parse Meetup earlier this month, I talked with some of you about tips for making your Parse app faster and more reliable. Here is a recap of that presentation.

At Parse, we are constantly improving our infrastructure to make your app scale auto-magically. Our platform load balances your traffic, elastically provisions your app’s backend resources, and backs up your data on redundant storage. On top of our MongoDB datastore, we built an API layer that seamlessly integrates with our client-side SDKs. Our cloud infrastructure uses online learning algorithms to automatically rewrite inefficient queries and generate database indexes based on your app’s realtime query stream.

You can further improve your app’s performance with a few more simple tips. (Some links here are for Android, but we have analogous docs for other platforms too.)

  1. Cache whenever you can. You can turn on client-side query caching using the CachePolicy option.
  2. Use Cloud Code for post-processing. In Cloud Code, you can run custom JavaScript logic whenever Parse Objects are saved or deleted. This is great for validating data and modifying related objects (see the displaying counts and text search examples below). We also offer Background Jobs for running more time-consuming data migrations.
  3. Write restrictive queries. You should write queries that return only what the client actually needs. Returning a ton of unnecessary objects back to the client can make your app slower due to network traffic. Please review our Querying Guide to see whether you should add any constraints on your existing queries. You should also try to avoid non-restrictive query clauses such as whereNotEqualTo. If you are issuing queries on GeoPoints, make sure you specify a reasonable radius. In addition, you can limit the object fields returned with selectKeys so that the phone client doesn’t have to wait for large unrelated fields to download.
  4. Design your data model for efficient querying. Please check out our Relations Guide for details.

We also have a few recommendations for specific scenarios:

  1. Displaying counts on the UI. Suppose you are building a product catalog. You might want to display the count of products in each category on the top-level navigation screen. If you run a count query for each of these UI elements, they will not run efficiently on large data sets because MongoDB does not use counting B-trees. Instead, we recommend that you use a separate Parse Object to keep track of counts for each category. Whenever a product gets added or deleted, you can increment or decrement the counts in an afterSave or afterDelete Cloud Code handler.
  2. Text search. MongoDB is not efficient for doing partial string matching except for the special case where you only want a prefix match. For adding efficient text search to your app, please see our previous post.

We hope that these tips will help you build apps that are responsive and delightful to use. I also want to thank everyone who submitted feedback or questions because this talk was mainly based on the feedback from our wonderful developer community. So keep building awesome apps, and send us your thoughts!

Stanley Wang
December 19, 2013

MongoNYC and Masters Summit Recap

As many of you may already know, the Parse backend makes extensive use of a nonrelational document database called MongoDB. On June 20-21 I was thrilled to attend MongoDB Days in NYC, as well as my first MongoDB Masters’ summit.

The masters’ summit was phenomenal. The event drew 20 or 30 of the world’s most experienced MongoDB engineers of all sorts — CTOs, developers, evangelists, devops, and third-party contributors — to share their experiences and provide feedback on the product. There were breakout sessions on everything from from driver APIs to indexing to deployment best practices. We also got to preview some exciting new features.

The next day was the MongoNYC conference. This was actually my third MongoDB conference, and each one seems to get bigger and better than the last. I particularly enjoyed “High Performance, High Scale MongoDB on AWS: A Hands On Guide” from Miles Ward, a solutions architect at Amazon Web Services, and “MongoDB Hacks of Frustration”, from Leo Kim of Foursquare.

I also gave a talk on “Managing a Maturing MongoDB Ecosystem”, based on some of the lessons we have learned scaling MongoDB at Parse over the past year and a half. My talk covered things like managing multiple clusters with Chef, dealing with fragmentation and compaction on aging data, configuring for high availability on AWS, and other operational issues.
It was a privilege and a treat to meet so many of the fine folks at 10gen, and spend a few days hanging out and talking shop with other engineers who are running MongoDB at scale. We also came away with some ideas for holding great developer events that we are eager to put into practice for the Parse developer community. Can’t wait for the next one!
Charity Majors
July 10, 2013

Summary of the June 11th Service Disruption

We would like to share some information about the service disruption on Tuesday, June 11th. Around 8:00 A.M. PDT, one of our database clusters triggered an indexing bug in the underlying database layer. This data was replicated across four different machines; however, the bug triggered on three of the machines simultaneously. Service was unaffected at this point because our configuration allowed the primary to continue serving traffic.

However, this left the data in a dangerous state where continued operation would be too risky without our typical replication standards. In order to restore acceptable functionality, we began to take the live copy down and perform a snapshot to restore our working replica set members to health. We throttled access to the cluster at 11:30 A.M., which disabled service for a small percentage of our users. Unfortunately, our snapshotting mechanism failed due to a low-level hardware issue that we are still investigating. This caused the recovery procedure to take more time than initially estimated. As a failover recovery mechanism, we provisioned new volumes and performed an rsync of the data, then reattached those volumes to a secondary node. Once data redundancy was restored, we turned back on access to the cluster. Full access was restored to all apps around 4:45 PM.

In response to this incident, we have several planned improvements to improve future performance in similar incidents:

  • We are working to improve our process to allow recovery from multiple database failures without downtime by isolating failed operations.

  • We are developing a mechanism for notifying individual developers of downtime incidents that impact their applications.

  • We are investigating the source of the initial database bug that led to failures in secondaries.

  • We are investigating the source of the underlying snapshot failure that delayed our first recovery efforts.

  • We are working on a monitoring tool to observe recovery progress to enable faster decision making when multiple recovery methods are possible.

We apologize for the outage. We take application reliability very seriously and will strive to apply the lessons learned from this outage to enhance future reliability.


Charity Majors
June 12, 2013

Summary of the May 23rd Parse Service Disruption

We would like to share some more details about the service disruption we experienced on Thursday, May 23rd, and the steps we are taking to mitigate this sort of issue in the future.

At around 4:46 P.M. PDT we experienced a surge of elevated error rates. We tracked this down to an underlying hardware event on the primary node for our application routing data cluster, which caused the kernel to kill the software RAID controller and sent the database primary into an indeterminate state. This also triggered a rare client side driver bug that caused all our application layer services to hang. We failed over to a secondary node and restarted all of the services, and full service was restored by 5:29 P.M.

We are taking the following steps to ensure that we can recover more quickly from similar hardware events in the future:

  • Let the application services read routing data from secondaries, so they do not depend on the availablity of the primary
  • Better detection for when the mongo client drivers need to be reinitialized in the application code
  • Continuous warmup on all secondary nodes, so that failover is consistently fast

We apologize for this outage. We are working very hard to improve the resiliency of our platform, and we know that outages are extremely disruptive to our customers. If you have any questions, please reach out to us through our Help & Community portal.

Charity Majors
May 24, 2013