Parse for PHP: A Fractal of Rad Design

Parse for PHP

Today we are very happy to release the Parse PHP SDK, which will enable Parse integration for a whole new class of apps and different use cases. This is our first SDK for a server-side language, and the first to be truly open-source.

PHP is an incredibly popular programming language and has consistently been in the top 10 on the TIOBE index for the past 15 years. Some metrics report that it is still serving the vast majority of websites on the internet. Until now, if you wanted to access Parse from PHP, the REST API was the only option. A few Parse API wrapper libraries have been released by third parties on GitHub. While we think this is awesome, many developers requested better PHP support and we decided to build a first-party SDK.

Earlier this year at Facebook’s f8 developer conference, we launched the completely re-invented Facebook SDK for PHP. In the 3 months since it was released there have been over 160 commits, with good solid contributions and enhancements from 20 passionate developers who use and care about the product. It has been a lot of fun managing the process for the Facebook SDK, and I’m looking forward to working together with the community on this Parse SDK.

For a fast overview of the SDK, check out the README file. We’ve also updated our documentation with a new PHP guide, and added a PHP Quickstart with installation instructions. We encourage you to report any issues and requests on GitHub.

Go build something awesome with Parse + PHP!

Fosco Marotto
August 5, 2014

Parse Security VI – Quiz Time

In Part VI of our five-part series on how to secure your Parse app, let’s take a quiz and see how well you know your stuff. If you can’t handle having Part VI in a five-part series, then maybe you should go read up on buffer overflow exploits. ;-)

Part I   Part II   Part III   Part IV   Part V

Bryan Klimt
August 4, 2014

Parse Security V – How to Make Friends

In the first four parts of this five-part series on how to secure your Parse app, we’ve taken a look at all of the different features that Parse has to help you secure your app and your users’ data. In Part V, let’s put it all together and take a look at a real example of how you can use these features to solve a complex use case.

Most classes in your app will fall into one of a couple of easy-to-secure categories. For fully public data, you can use class-level permissions to lock down the table to be publicly readable and writeable by no one. For fully private data, you can use ACLs to make sure that only the user who owns the data can read it. But occasionally, you’ll run into situations where you don’t want data that’s fully public or fully private. For example, you may have a social app, where you have data for a user that should be readable only to friends whom they’ve approved. For this you’ll need to use ACLs, roles, and Cloud Code together to enable exactly the sharing rules you desire.

If you aren’t clear on those features, go back and read Parts II – IV of this series. Once you’re clear on what ACLs, roles, and Cloud Code are, let’s dig into some code. The first thing you’ll need to do is set up a place for the user’s social data to live. Let’s assume you want to create a “FriendData” object for each user that stores the data that should be visible to their friends. To create this object for each new user, you can use an afterSave handler.

Parse.Cloud.afterSave(Parse.User, function(request, response) {
    var user = request.object;
    if (user.existed()) { return; }
    var roleName = "friendsOf_" + user.id;
    var friendRole = new Parse.Role(roleName, new Parse.ACL(user));
    return friendRole.save().then(function(friendRole) {
        var acl = new Parse.ACL();
        acl.setReadAccess(friendRole, true);
        acl.setReadAccess(user, true);
        acl.setWriteAccess(user, true);
        var friendData = new Parse.Object("FriendData", {
          user: user,
          ACL: acl,
          profile: "my friend profile"
        });
        return friendData.save();
    });
});

The first couple of lines just set up the request. On line 3, we check to see if the user already existed. If so, then we’ve already completed this setup, and there’s no reason to do it again. On lines 4-6, we create a new role. This role will represent the friends of the user being created. We generate a unique name for the role. The role is created with an ACL that only allows this user to read or write it. In other words, only the user themselves can decide who their friends are.

Once the role is created, lines 7-10 create an ACL that grants read permission to friends of the user. And of course the user receives read and write permission for their own social data. Once the ACL is set up, lines 11-16 actually create the object. The user is set on the object so that it can be found with a query later. The profile field is set as an example of data that will be readable only to the user’s friends.

So, that’s everything you need to set up an object for every user that will be readable only to people in their special friends role. Of course, that’s worthless without any way for people to get added to your friends list, so we need to address that with another function. Technically, this function could be written in any client and be secure, but for the sake of consistency, let’s make another Cloud Code function. This function will be called “friend.”  It will take one parameter: the objectId of the person to friend. This person will then be added to the friends role of the current user.

Parse.Cloud.define("friend", function(request, response) {
    var userToFriend = new Parse.User();
    userToFriend.id = request.params.friendId;

    var roleName = "friendsOf_" + request.user.id;
    var roleQuery = new Parse.Query("_Role");
    roleQuery.equalTo("name", roleName);
    roleQuery.first().then(function(role) {
        role.getUsers().add(userToFriend);
        return role.save();

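    }).then(function() {
        response.success("Success!");
    }, function(error) {
        // Report failures so the caller is not left waiting for a response.
        response.error(error);
    });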
});

Lines 1-3 just set up the function and create a User object to represent the person being added to the friend role. Lines 5-8 fetch the Role object using its name. A Role is just a special kind of Parse object, so we need to fetch it before we can modify it. Once the role has been fetched, we add this user to it on line 9, and save it on line 10. That’s all there is to it! Now we have an easy way to store data that’s only accessible to a user’s friends.

Over the course of this five-part series on how to secure your Parse app, we’ve looked at a lot of features. You know about the various keys used to access your app. We’ve looked at the permissions you can set to lock down a whole class. You’ve learned about ACLs and how they can secure per-user data. And we’ve even dived deep into a particular example of how you can use Cloud Code to tackle even the trickiest data models.

We hope that you’ll use these tools to do everything you can to keep your app’s data and your users’ data secure. Together, we can make the web a safer place.

Part I   Part II   Part III   Part IV

Bryan Klimt
July 28, 2014

Parse Security IV – Ahead in the Cloud

In our first three posts on how to secure your Parse app, we looked at your app’s keys, class-level permissions, and object-level ACLs. For many apps, that’s all you need to keep your app and your users’ data safe. But sometimes you’ll run into an edge case where they aren’t quite enough. For everything else, there’s Cloud Code.

With Cloud Code, you can upload JavaScript to Parse’s servers, where we will run it for you. Unlike client code running on users’ devices that may have been tampered with, Cloud Code is guaranteed to be the code that you’ve written, so it can be trusted with more responsibility. For example, if you need to validate the data that a user has entered, you should do it in Cloud Code so that you know a malicious client won’t be able to bypass the validation logic. Specifically to help you validate data, Cloud Code has beforeSave triggers. These triggers are run whenever an object is saved, and allow you to modify the object, or completely reject a save. For example, this is how you create a Cloud Code trigger to make sure every user has an email address set:

Parse.Cloud.beforeSave(Parse.User, function(request, response) {
  var user = request.object;
  if (!user.get("email")) {
    response.error("Every user must have an email address.");
  } else {
    response.success();
  }
});

To upload this trigger, follow the instructions in our guide for setting up the Parse command line tool.

You can also use your master key in Cloud Code to bypass the normal security mechanism for trusted code. For example, if you want to allow a user to “like” a “Post” object without giving them full write permissions on the object, you can do so with a Cloud Code function. Every API function in the Cloud Code JavaScript SDK that talks to Parse allows passing in a useMasterKey option. By providing this option, you will use the master key for that one request.

Parse.Cloud.define("like", function(request, response) {
  var post = new Parse.Object("Post");
  post.id = request.params.postId;
  post.increment("likes");
  post.save(null, { useMasterKey: true }).then(function() {
    response.success();
  }, function(error) {
    response.error(error);
  });
});

One very common use case for Cloud Code is sending push notifications to particular users. In general, clients can’t be trusted to send push notifications directly, because they could modify the alert text, or push to people they shouldn’t be able to. Your app’s settings will allow you to set whether “client push” is enabled or not; we recommend that you make sure it’s disabled. Instead, you should write Cloud Code functions that validate the data to be pushed before sending a push.
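As a rough sketch of the kind of validation such a function might perform, the helper below checks a push request before anything is sent. The function name `validatePushRequest`, the parameter names `recipientId` and `message`, and the 140-character limit are all assumptions made for illustration; none of them are part of the Parse API. In Cloud Code you would call a check like this inside a `Parse.Cloud.define` handler and only send the push if it passes.

```javascript
// Hypothetical validation helper; names and limits are illustrative only.
var MAX_MESSAGE_LENGTH = 140; // assumed limit, not a Parse requirement

// Returns null when the request looks acceptable, or an error string.
function validatePushRequest(params, senderId, friendIdsOfSender) {
  if (!senderId) {
    return "You must be logged in to send a push.";
  }
  if (typeof params.message !== "string" || params.message.length === 0) {
    return "A message is required.";
  }
  if (params.message.length > MAX_MESSAGE_LENGTH) {
    return "The message is too long.";
  }
  // Only allow pushes to people the sender is allowed to reach.
  if (friendIdsOfSender.indexOf(params.recipientId) === -1) {
    return "You can only push to your friends.";
  }
  return null;
}
```

Because the check runs on the server, a tampered client cannot skip it or change the rules.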

In this post, we’ve looked at how you can use Cloud Code to write trusted code, to keep your data secure in cases where class-level permissions and ACLs aren’t enough. In Part V, we’ll dive into a particular example of how to use ACLs, Roles, and Cloud Code to let users have data that is shared only to their friends.

Part I   Part II   Part III   Part V

Bryan Klimt
July 21, 2014

Parse Security III – Are You On the List?

In Part II, we looked at class-level permissions, which allow you to quickly set permissions for an entire class at once. But often, the different objects in a class will need to be accessible by different people. For example, a user’s private personal data should be accessible only to them. For that, you have to use an Access Control List, usually referred to as an ACL. If you have a production app and you aren’t using ACLs, you’re almost certainly doing it wrong.

An ACL specifies a set of users who can read or write an object’s data. So, before you can use ACLs, you have to have Users. There are many ways to handle users on Parse. You can have usernames and passwords, or you can use Facebook Login. If you don’t want to make your users create usernames just to log in, you can even use Parse’s automatic anonymous users feature. That allows you to create a user and log them in on a particular device in a secure way. If they later decide to set a username and password, or link the account with Facebook, they can then log in from any other device. Setting up automatic anonymous users is easy, so you have no excuse not to protect per-user data!

[PFUser enableAutomaticUser];

Once you have a user, you can start using ACLs. To set an ACL on the current user’s data to not be publicly readable, all you have to do is:

PFUser *user = [PFUser currentUser];
user.ACL = [PFACL ACLWithUser:user];

Most apps should do this. If you store any sensitive user data, such as email addresses or phone numbers, you need to set an ACL like this so that the user’s private information isn’t visible to other users. If an object doesn’t have an ACL, it’s readable and writeable by everyone. The only exception is the _User class. We never allow users to write each other’s data, but they can read it by default. (If you as the developer need to update other _User objects, remember that your master key can provide the power to do this.) To make it super easy to create user-private ACLs for every object, we have a way to set a default ACL that will be used for every new object you create.

[PFACL setDefaultACL:[PFACL ACL] withAccessForCurrentUser:YES];

If you want the user to have some data that is public and some that is private, it’s best to have two separate objects. You can add a pointer to the private data from the public one.

PFObject *privateData = [PFObject objectWithClassName:@"PrivateUserData"];
privateData.ACL = [PFACL ACLWithUser:[PFUser currentUser]];
[privateData setObject:@"555-5309" forKey:@"phoneNumber"];

[[PFUser currentUser] setObject:privateData forKey:@"privateData"];

Of course, you can set different read and write permissions on an object. For example, this is how you would create an ACL for a public post by a user, where anyone can read it:

PFACL *acl = [PFACL ACL];
[acl setPublicReadAccess:YES];
[acl setWriteAccess:true forUser:[PFUser currentUser]];

Sometimes, it’s inconvenient to manage permissions on a per-user basis, and you want to have groups of users who get treated the same. For example, your app may have a group of admins who are the only ones who can write data. Roles are a special kind of object that let you create a group of users that can all be assigned to the ACL. The best thing about roles is that you can add and remove users from a role without having to update every single object that is restricted to that role. To create an object that is writeable only by admins:

PFACL *acl = [PFACL ACL];
[acl setPublicReadAccess:YES];
[acl setWriteAccess:true forRoleWithName:@"admins"];

Of course, this snippet assumes you’ve already created a role called “admins”. In many cases, this is reasonable, because you have a small set of special roles you can set up while developing your app. But sometimes you will need to create roles on the fly. In a future blog post, we’ll look into how you can use ACLs and roles to manage a user’s data so that it will only be visible to their friends.
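The benefit of roles mentioned above, changing membership without touching every object, can be seen in a toy model. Objects refer to a role by name, and membership is looked up at access time. The data shapes and the `canWrite` helper below are hypothetical and written in plain JavaScript for brevity; they are not the Parse SDK.

```javascript
// Toy model of role-based write access. Not the Parse SDK.
var roles = { admins: ["alice"] };      // role name -> member user ids
var post = { writeRoles: ["admins"] };  // the object stores only the role name

// A user may write if they belong to any role listed on the object.
function canWrite(object, userId) {
  return object.writeRoles.some(function (roleName) {
    return (roles[roleName] || []).indexOf(userId) !== -1;
  });
}

var before = canWrite(post, "bob");  // false: bob is not an admin yet
roles.admins.push("bob");            // change the role, not the object
var after = canWrite(post, "bob");   // true: the same object now admits bob
```

Only the role changed between the two checks; `post` itself was never modified.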

So far in this security series, we’ve covered class-level permissions and ACLs. These features are sufficient to secure the vast majority of apps. In Part IV, we’ll look at how you can use Cloud Code to secure apps even in rare edge cases.

Part I   Part II   Part IV   Part V

Bryan Klimt
July 14, 2014

Parse Security II – Class Hysteria!

In Part I, we took a look at the different keys that a Parse app has and what they mean. As we learned, your client keys are easily accessible to anybody, so you need to rely on Parse’s security features to lock down what the client key is allowed to do.

The easiest way to lock down your app is with class-level permissions. Almost every class that you create should have these permissions tweaked to some degree. For classes where every object has the same permissions, class-level settings will be most effective. For example, one common use case is a class of static data that can be read by anyone but written by no one. If you need different objects to have different permissions, you’ll have to use ACLs, or Access Control Lists (discussed in Part III). To edit your class-level permissions, click on “Set permissions” under the “More” menu in the Data Browser for the class you want to configure.

Set permissions menu

For each setting, you can choose to either leave it open to everyone, or to restrict it to certain predefined Roles or Users. A Role is simply a set of Users and Roles who should share some of the same permissions. For example, you can set up a Role called “admins” and make a table writeable only by that role.

Class-level Permissions

Let’s go over what each permission means.

  • Get – With Get permission, users can fetch objects in this table if they know their objectIds.
  • Find – Anyone with Find permission can query all of the objects in the table, even if they don’t know their objectIds. Any table with public Find permission will be completely readable by the public, unless you put an ACL on each object.
  • Update – Anyone with Update permission can modify the fields of any object in the table that doesn’t have an ACL. For publicly readable data, such as game levels or assets, you should disable this permission.
  • Create – Anyone with Create permission can create new objects of a class. As with the Update permission, you’ll want to turn this off for publicly readable data.
  • Delete – This one should be pretty obvious. With this permission, people can delete any object in the table that doesn’t have an ACL. All they need is its objectId.
  • Add fields – This is probably the strangest permission. Parse classes have schemas that are inferred when objects are created. While you’re developing your app, this is great, because you can add a new field to your object without having to make any changes on the backend. But once you ship your app, it’s very rare to need to add new fields to your classes automatically. So you should pretty much always turn off this permission for all of your classes when you submit your app to the public.
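The way class-level permissions and ACLs combine, as described above, can be pictured with a small toy model. This is a simplified sketch for intuition, not Parse’s actual implementation: the function `isAllowed` and the shapes of the `classPermissions` and `acl` objects are assumptions made for this example, and it ignores roles, the master key, and the Add fields permission.

```javascript
// Toy model: an operation must pass the class-level check and then the
// object's ACL, if it has one. All names here are hypothetical.
function isAllowed(operation, userId, classPermissions, acl) {
  // classPermissions maps an operation name to "public" or a list of user ids.
  var rule = classPermissions[operation];
  var classOk = rule === "public" ||
    (Array.isArray(rule) && rule.indexOf(userId) !== -1);
  if (!classOk) {
    return false;
  }
  // An object without an ACL is open to anyone who passed the class check.
  if (!acl) {
    return true;
  }
  var kind = (operation === "get" || operation === "find") ? "read" : "write";
  var allowed = acl[kind] || [];
  return allowed.indexOf("*") !== -1 || allowed.indexOf(userId) !== -1;
}

// Static game levels: anyone can read, nobody can write.
var levelPermissions = {
  get: "public", find: "public", update: [], create: [], delete: []
};
```

With `levelPermissions`, a call such as `isAllowed("find", "u1", levelPermissions, null)` passes while `isAllowed("update", "u1", levelPermissions, null)` does not, which matches the static-data case discussed earlier.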

There’s also an app-level setting that controls client class creation. You should turn client class creation off for production apps so that clients can’t create new classes. Usually this isn’t a big security risk, but it’s nice to stop malicious people from creating new tables that you’ll see in the data browser.

Client Class Creation

Class-level permissions are great for globally-shared data where permissions for each object in the class should be exactly the same. In Part III, we’ll learn about ACLs, which allow you to have different permissions for each object in a class.

Part I   Part III   Part IV   Part V

Bryan Klimt
July 7, 2014

Parse Security I – Are you the Key Master?

So, you’ve finished version 1 of your app, and you’re ready to send it out into the world. Like a child leaving the nest, you are ready to push your app out to the various app stores and wait for the glowing reviews to come streaming in. Not so fast! You wouldn’t send a child out into the world without teaching them how to protect themselves. Likewise, you shouldn’t send your app out into a user’s hands without taking some time to secure it using industry-standard best practices. After all, if your app gets compromised, it’s not only you who suffers, but potentially the users of your app as well. In this five-part series, we’ll take a look at what you can do to secure your Parse app.

Parse App Keys

Security starts with understanding your app’s keys. A Parse app has several different keys: a client key, a REST key, a .NET key, a JavaScript key, and a master key. All of the keys–other than the master key–have basically the same permissions. They are just intended for use on different platforms. So, we usually refer to any of those keys as a “client key.” The first thing you need to understand is that your client key is not a security mechanism. It’s not even intended to serve as such. Your client key is shipped as a part of your app. Anyone can decompile your app or proxy network traffic from their device and see your client key. With JavaScript, it’s even easier. One can simply “view source” in the browser and immediately know your key. That’s why Parse has many other security features to help you secure your data. The client key is given out to your users, so anything that can be done with just the client key is doable by the general public, even malicious hackers.

The master key, on the other hand, is definitely a security mechanism. Using the master key allows you to bypass all of your app’s security mechanisms, such as class-level permissions and ACLs. Having the master key is like having root access to your app’s servers. You should guard your master key with the same zeal with which you would guard your production machines’ root password. Never check your master key into source control. Never include your master key in any binary or source code you ship to customers. And above all, never, ever give your master key out to strangers in online chat rooms. Stranger danger!
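One common way to keep the master key out of source control is to read it from the environment of a trusted server process at startup. The sketch below assumes a Node.js server and an environment variable named `PARSE_MASTER_KEY`; both the variable name and the `loadMasterKey` helper are hypothetical.

```javascript
// Hypothetical helper: read the master key from the environment so that it
// never appears in source code or in anything shipped to clients.
function loadMasterKey(env) {
  var key = env.PARSE_MASTER_KEY; // assumed variable name
  if (typeof key !== "string" || key.length === 0) {
    // Fail loudly rather than start a server without its key.
    throw new Error("PARSE_MASTER_KEY is not set; refusing to start.");
  }
  return key;
}

// Usage on a trusted server (never in client code):
// var masterKey = loadMasterKey(process.env);
```

Passing the environment in as an argument keeps the helper easy to test and makes it obvious where the secret comes from.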

In Part II, we’ll take a look at Parse’s advanced features, which allow you to control what people with your client key can do.

Part II   Part III   Part IV   Part V

Bryan Klimt
June 30, 2014

Login Love for your Android App

We love UI components–as you may know through the pre-built UI classes in our iOS SDK. Today, we are bringing this same love to Android. We are launching ParseLoginUI, an open-source library project for building login screens on Android with the Parse SDK. This ultra-customizable library implements screens for login, signup, and password help. We are releasing this as a standalone project (apart from the Parse Android SDK) so that you have the flexibility to make further changes to its look and feel when you integrate it into your app.

To use ParseLoginUI with your app, you should import the library project, and add the following to your app’s AndroidManifest.xml:

<activity 
    android:name="com.parse.ui.ParseLoginActivity" 
    android:label="@string/app_name" 
    android:launchMode="singleTop">
    <meta-data 
        android:name="com.parse.ui.ParseLoginActivity.PARSE_LOGIN_ENABLED" 
        android:value="true"/>
</activity>

Then, you can show the login screen by launching ParseLoginActivity with these two lines of code:

ParseLoginBuilder builder = new ParseLoginBuilder(MyActivity.this);
startActivityForResult(builder.build(), 0);

Within ParseLoginActivity, our library project will automatically manage the login workflow. Besides signing in, users can also sign up or ask for an email password reset. The default version of each screen (login, signup, and recover password) is shown below.

Basic Login Screens

Let’s see how we can configure our login to look different.  Make the following changes to our AndroidManifest.xml:

<activity 
    android:name="com.parse.ui.ParseLoginActivity" 
    android:label="@string/app_name" 
    android:launchMode="singleTop">
    <meta-data 
        android:name="com.parse.ui.ParseLoginActivity.PARSE_LOGIN_ENABLED" 
        android:value="true"/>

    <!-- Added these options below to customize the login flow -->
    <meta-data 
        android:name="com.parse.ui.ParseLoginActivity.APP_LOGO"  
        android:resource="@drawable/app_logo"/>
    <meta-data  
        android:name="com.parse.ui.ParseLoginActivity.FACEBOOK_LOGIN_ENABLED"  
        android:value="true"/>
    <meta-data  
        android:name="com.parse.ui.ParseLoginActivity.TWITTER_LOGIN_ENABLED"  
        android:value="true"/>
    <meta-data  
        android:name="com.parse.ui.ParseLoginActivity.PARSE_LOGIN_HELP_TEXT"  
        android:value="@string/reset_password"/>
    <meta-data  
        android:name="com.parse.ui.ParseLoginActivity.PARSE_LOGIN_EMAIL_AS_USERNAME"  
        android:value="true"/>
</activity>

With these simple configurations, we’ve changed the app logo, added Facebook & Twitter logins, and changed the text shown for the password-reset link. We also enabled an option to automatically save the email address as the username, so that you don’t have to manually save it in both fields of the ParseUser object. When this option is turned on, both the login and signup screens are also automatically updated to prompt for email address as username.

Customized Login Screens

Our Android documentation contains guides for both basic and advanced use cases. You can find the source code for ParseLoginUI at our GitHub repository. Try it out and let us know what you think!

Stanley Wang
June 25, 2014

dvara: A Mongo Proxy

We wrote dvara, a connection pooling proxy for mongo, to solve an immediate problem we were facing. We were running into the connection limits on some of our replica sets. Mongo through 2.4 capped the configurable maximum number of connections at 20,000. As the number of our application servers grew, the number of concurrent active connections to our replica sets grew. Mongo 2.6 removed this limit, but it was unfortunately not ready at that time (we’re still testing it and haven’t upgraded to it yet). Even if it were ready, the cost per connection is 1MB, which takes away precious memory otherwise used by the database. A sharded cluster with mongos as the proxy was another path we considered. Enabling sharding may have helped, but that change would spill over into our application logic, and we use at least some of the features that are restricted under sharding. We are experimenting with sharded replica sets in our environment, and from our experience we weren’t confident they would actually help with our connection limit problem. So we set out on what seemed like an ambitious and, in my mind, difficult goal of building a connection pooling proxy for mongod.

Down to the Wire

We started off with a simple proof of concept, working backwards from legacy wire protocol documentation. We got it far enough to serve basic read/write queries in a few weeks. We attribute the speed at which we got the prototype working to using Go to build it. Go allowed us to write easy-to-follow code, and yet not pay the cost of a thread per connection, or the alternative of having to write callbacks or some other form of manually managed asynchronous network IO logic. Additionally, while our proxy prefers to not look at the bytes flowing through or decode the BSON for performance reasons, Gustavo Niemeyer’s excellent mgo driver, along with its bson library, made it trivial for us to introspect and mutate the traffic we needed to. The first of these cases was the isMaster and the replSetGetStatus commands. These commands return the member/host information the client uses to decide who to connect and talk to. We need to replace the real host/ports with the proxy host/ports.
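To make the host rewriting concrete, here is a minimal sketch of the idea in JavaScript (dvara itself is written in Go, and this is not its code). It assumes the reply has already been decoded into a plain object with the `hosts`, `primary`, and `me` fields that isMaster returns, and that `proxyFor` is a hypothetical map from real `host:port` strings to the proxy addresses that stand in for them.

```javascript
// Illustrative only: replace real mongod addresses with proxy addresses
// in a decoded isMaster-style reply. Unknown hosts are left untouched.
function rewriteHosts(reply, proxyFor) {
  function swap(addr) {
    return Object.prototype.hasOwnProperty.call(proxyFor, addr)
      ? proxyFor[addr]
      : addr;
  }
  var out = {};
  Object.keys(reply).forEach(function (key) {
    out[key] = reply[key]; // shallow copy so the original reply is not mutated
  });
  if (Array.isArray(reply.hosts)) {
    out.hosts = reply.hosts.map(swap);
  }
  if (typeof reply.primary === "string") {
    out.primary = swap(reply.primary);
  }
  if (typeof reply.me === "string") {
    out.me = swap(reply.me);
  }
  return out;
}

// Example with made-up addresses.
var sampleReply = {
  ismaster: true,
  hosts: ["db1:27017", "db2:27017"],
  primary: "db1:27017",
  me: "db1:27017"
};
var rewritten = rewriteHosts(sampleReply, {
  "db1:27017": "proxy:6000",
  "db2:27017": "proxy:6001"
});
```

After the rewrite, a client that follows the advertised member list connects to the proxy ports rather than to mongod directly.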

Another command that needed special handling, and one of the known problems we had to solve, was getLastError: Mongo 2.4 and earlier require a second follow-up call for it. Fortunately this got some much needed love in 2.6, but through 2.4 mutation operations were essentially split into two parts: first, the mutation itself; and second, the getLastError command, which included some important options, including the write concern. Consider what a connection pooling proxy does: a client sends a command, we take a connection from our pool, proxy the command and the response, and put the connection back into the pool for someone else to use. A good proxy would hold a connection from the pool for the least amount of time possible. Unfortunately the design of getLastError means we can’t do that, because getLastError is state that exists in mongod per-connection. This design is awkward enough that it actually requires special logic for the mongo shell to ensure it doesn’t get inadvertently reset. It was clear we would need to similarly maintain this state per connection in the proxy as well. Our implementation tries to preserve the semantics mongod itself has around getLastError, though once we’ve moved all our servers and clients to 2.6 this will be unnecessary with the new wire protocol.

Proxying in Production

An aspect we refined before we started using this in production was to auto discover replica set configuration from the nodes. At first our implementation required manual configuration that mapped each node we wanted to proxy. We always need a mapping in order to alter the responses to the isMaster and replSetGetStatus commands mentioned earlier. Our current implementation automatically configures this and uses the provided member list as a seed list. We’re still improving how this works, and likely will reintroduce manual overrides to support unusual situations that often arise in real life.

One of the benefits of dvara has been the ability to get metrics about various low level operations which were not necessarily readily available to us. We track about 20 metrics, including the number of mutation operations, the number of operations with responses, the latency of operations, and the number of concurrent connections. Our current implementation is tied to Ganglia using our own Go client, but we’re working on making that pluggable.

We’ve been using dvara in production for some time, but we know there are mongo failure scenarios it doesn’t handle gracefully yet. We also want a better process around deploying new versions of dvara without causing disruptions to the clients (possibly using grace). We want to help improve the ecosystem around mongo, and would love for you to contribute!

Naitik
June 23, 2014

Fun with TokuMX

TokuMX is an open source distribution of MongoDB that replaces the default B-tree data structure with a fractal tree index, which can lead to dramatic improvements in data storage size and write speeds. Mark Callaghan made a series of awesome blog posts on benchmarking InnoDB, TokuMX and MongoDB, which demonstrate TokuMX’s remarkable write performance and extraordinarily efficient space utilization. We decided to benchmark TokuMX against several real-world scenarios that we encountered in the Parse environment. We also built a set of tools for capturing and replaying query streams. We are open sourcing these tools on GitHub so that others may also benefit from them (we’ll discuss more about them in the last section).

In our benchmarks, we tested three aspects of TokuMX: 1. Exporting and importing large collections; 2. Performance for individual write-heavy apps; and 3. Database storage size for large apps.

1. Importing Large Collections


We frequently need to migrate data by exporting and importing collections between replica sets. However, this process can be painful because sometimes the migration rate is ridiculously slow, especially for collections with a lot of small entries and/or complicated indexes. To test importing and exporting, we performed an import/export on two representative large collections with varying object counts.

  • Collection1: 143 GB collection with ~300 million small objects
  • Collection2: 147 GB collection with ~500 thousand large objects

Both collections were exported from our existing MongoDB collections, where collection1 took 6 days to export and collection2 took 6 hours. We used the mongoimport command to import collections to MongoDB and TokuMX instances. Benchmark results for importing collection1, with a large number of small objects: TokuMX is about 4x faster to import.

# Collection1: exported from MongoDB for 6 days

Database         Import Time
---------------------------------------------------------------------
MongoDB           58 hours 37 minutes
TokuMX            14 hours 28 minutes
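The speedup can be read directly off the table by converting both import times to minutes and dividing. The short calculation below does just that; the helper name `toMinutes` is ours.

```javascript
// Convert "hours and minutes" to minutes and compare the two import times
// from the table above.
function toMinutes(hours, minutes) {
  return hours * 60 + minutes;
}

var mongoMinutes = toMinutes(58, 37); // 3517 minutes
var tokuMinutes = toMinutes(14, 28);  // 868 minutes
var speedup = mongoMinutes / tokuMinutes;

console.log(speedup.toFixed(2)); // prints "4.05"
```

That works out to a little over 4x for the collection made up of many small objects.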

Benchmark results for importing collection2, with a small number of large objects: TokuMX and MongoDB are roughly in parity.

# Collection2: exported from MongoDB for 6 hours

Database         Import Time
---------------------------------------------------------------------
MongoDB           48 minutes
TokuMX            53 minutes

2. Handling Heavy Write Loads


One of our sample write-intensive apps issues a heavy volume of “update” requests with large object sizes. Since TokuMX is a write-optimized database, we decided to benchmark this query stream against both MongoDB and TokuMX. We recorded 10 hours of sample traffic, and replayed it against both replica sets. From the benchmark results, TokuMX sustains about 2.7x the throughput for this app, with much lower latencies at every percentile.

# MongoDB Benchmark Results
- Ops/sec: 1695.81
- Update Latencies:
    P50: 5.96ms
    P70: 6.59ms
    P90: 11.57ms
    P95: 18.40ms
    P99: 44.37ms
    Max: 102.52ms
# TokuMX Benchmark Results
- Ops/sec: 4590.97
- Update Latencies:
    P50: 3.98ms
    P70: 4.49ms
    P90: 6.33ms
    P95: 7.61ms
    P99: 12.04ms
    Max: 16.63ms

3. Efficiently Using Disk Space


Space efficiency is another big selling point for TokuMX. How much can TokuMX save in terms of disk utilization? To figure this out, we exported the data of one of our shared replica sets (2.4T of data in total) and imported it into TokuMX instances. The result was stunning: TokuMX used only 379G of disk space, about 15% of the original size.

Benchmark Tools

Throughout the benchmarks, we focused on:

  • Using “real” query patterns to evaluate the database performance
  • Figuring out the maximal performance of the systems

To achieve those goals, we developed a tool, flashback, that records the real traffic to the database and replays ops with different strategies. You can replay ops either as fast as the database can accept them, or according to their original timestamp intervals. We are open sourcing this tool because we believe it will also be useful for people who are interested in recording their real traffic and replaying it against different production environments, such as for smoke testing version upgrades or different hardware configurations. For more information on using flashback, please refer to this document. We accept pull requests!
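The two replay strategies can be sketched as a small scheduling function. This is an illustration of the idea rather than flashback’s actual code; the function name `replaySchedule`, the mode names `"fast"` and `"real"`, and the `ts` field (a recorded timestamp in milliseconds) are assumptions made for the example.

```javascript
// Given recorded ops with timestamps, compute how long to wait before
// sending each op. Hypothetical sketch, not flashback's implementation.
function replaySchedule(ops, mode) {
  return ops.map(function (op, i) {
    if (mode === "fast" || i === 0) {
      return 0; // send immediately, as fast as the database accepts ops
    }
    // "real" mode: preserve the original gap between consecutive ops
    return Math.max(0, op.ts - ops[i - 1].ts);
  });
}

var recorded = [{ ts: 1000 }, { ts: 1250 }, { ts: 1300 }];
var fastDelays = replaySchedule(recorded, "fast"); // delays of 0, 0, 0 ms
var realDelays = replaySchedule(recorded, "real"); // delays of 0, 250, 50 ms
```

The "fast" mode is the one to use when looking for maximal throughput, while the "real" mode keeps the recorded traffic pattern intact.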

Kai Liu
June 20, 2014