Case Study: The missing guide to Memcached - Scaling PHP Applications

Scaling PHP Applications (2014)

Memcached is cool. It was one of the original bad-boys in the scaling world and sits at the core-foundation of many of the biggest websites on the internet. Why? It’s simple, fast, and convenient. Memcached can be installed easily on almost every platform and is simple to configure.

It’s tempting to consider many of the Memcached-alternatives out there. I’ve seen some “modern-takes” that speak the same protocol but are implemented in Golang or Scala. I would avoid any serious usage of Memcached servers written in a Garbage collected language— you don’t want your site’s bottleneck to be caused by a stop-the-world garbage collector in your Memcached server.

The original C-based memcached is still a great choice— it’s simple, stable, and reliable.

Choosing Memcached over Redis?

So when should you use Memcached over Redis? For MOST purposes, I prefer to use Redis. Redis is very fast, has a bunch of convenient data types not found in Memcached like Sets and Lists, and stores data to disk. Redis is extremely convenient for keeping cached data persistent across restarts. For instance, it’s common for sites to run very slow after a Memcached restart when the cache is cold and MySQL is getting hammered because of the empty cache. You avoid this problem with Redis entirely because the cache is immediately hot (although, Redis can take a significant amount of time, 60s+, to initially read the stored data into memory on restart).

For most deployments, I recommend only using Redis. It’s not worth adding a new moving part to your infrastructure— since Redis can do everything that Memcached can (and more). However, Memcached does bring these two key advantages to the table:

1. It’s multi-threaded. Redis, being evented and single-threaded, can only execute one command at a time. Even without threads, Redis is DAMN fast, but multi-threaded Memcached can process requests from many connections in parallel and take full advantage of all of the CPU cores available to it (Redis executes commands on a single core).

2. It uses less memory. Redis has to keep several pointers for each item it stores. Memcached allocates memory in slabs, avoids malloc entirely, and can keep down the bookkeeping overhead. There are some ways to mitigate this using Hashes in Redis (Hint: it’s ugly), but item-for-item, Memcached will use less memory. While barely apparent in small numbers, this can make a difference over hundreds-of-millions of keys and save you GBs of expensive memory.

Installing Memcached

On Ubuntu, installing Memcached is dead simple, and gets you a very recent version.

$ apt-get install memcached

Edit the config file /etc/memcached.conf and change the maximum memory (-m), maximum number of connections (-c), and maximum number of threads (-t). I like to use the number of CPU cores as a starting point for the number of threads. You’ll also want to bump up the open-file limit and tune the network settings via sysctl, as covered in earlier chapters. Remember, the memory limit (-m) is a hard cap: when memcached hits this limit, it’ll start throwing out old data.

The configuration file, after tuning, should look something like this:

# Run memcached as a daemon. This command is implied, and is not needed for the
# daemon to run. See the README.Debian that comes with this package for more
# information.
-d

# Log memcached's output to /var/log/memcached
logfile /var/log/memcached.log

# Be verbose
# -v

# Be even more verbose (print client commands as well)
# -vv

# Start with a cap of 64 megs of memory. It's reasonable, and the daemon default
# Note that the daemon will grow to this size, but does not start out holding
# this much memory
-m 24000

# Default connection port is 11211
-p 11211

# Run the daemon as root. The start-memcached will default to running as root
# if no -u command is present in this config file
-u memcached

# Specify which IP address to listen on. The default is to listen on all
# IP addresses. This parameter is one of the only security measures that
# memcached has, so make sure it's listening on a firewalled interface.
-l 10.0.1.10

# Limit the number of simultaneous incoming connections. The daemon default
# is 1024
-c 4096

# Lock down all paged memory. Consult with the README and homepage before
# you do this
# -k

# Return error when memory is exhausted (rather than removing items)
# -M

# Maximize core file limit
# -r

What happens when Memcached runs out of memory? Memcached uses an LRU (least recently used) algorithm, which tracks how recently each key was accessed. When Memcached runs out of space and there are no expired keys to delete, it will start evicting the least recently used keys available. You can track the evictions value in the stats output to see what your churn rate looks like. Typically, a high eviction count signifies that you need more cache space than you have.
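To make the eviction policy concrete, here is a minimal sketch of an LRU cache in plain PHP. It only illustrates the policy; it is not how Memcached implements it internally, and the `TinyLru` class and its method names are hypothetical.

```php
<?php
// Illustrative only: a tiny LRU cache with a fixed item capacity.
final class TinyLru
{
    private $items = [];   // key => value, ordered least to most recently used
    private $capacity;
    public $evictions = 0; // how many keys have been pushed out so far

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null; // cache miss
        }
        // Move the key to the "most recently used" end of the array.
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value;
        return $value;
    }

    public function set(string $key, $value): void
    {
        unset($this->items[$key]);
        if (count($this->items) >= $this->capacity) {
            // Evict the least recently used key (the first one in the array).
            reset($this->items);
            unset($this->items[key($this->items)]);
            $this->evictions++;
        }
        $this->items[$key] = $value;
    }
}

$cache = new TinyLru(2);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->get('a');      // 'a' is now the most recently used key
$cache->set('c', 3);   // the cache is full, so 'b' is evicted
```

Reading a key protects it from eviction for a while, which is why keys that are requested often tend to survive even on a server that is constantly evicting.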

Hooking up PHP to Memcached

If you look in PHP’s PECL Library, you’ll find two different Memcached extensions available to you. The difference is subtle, but it matters: there’s the memcache extension and the memcached extension (notice the d).

The memcache extension is older, has fewer features, and uses its own homebrew library for connecting to and interacting with Memcached. There are new beta versions that can be installed by specifying the flag state=beta when installing, but the latest stable version is several years old. I recommend avoiding this extension.

The newer, and better, extension to use is the memcached PECL extension. It uses libmemcached to connect to Memcached, so it has a well-supported and tested mechanism for connecting to Memcached. In addition, the memcached extension offers a host of new features that are not available in the older memcache extension.

To install, just run this command:

$ pecl install memcached

Protocols

Clients connecting to Memcached server can talk to it in two different ways— via an ASCII protocol and a Binary protocol. The default method is ASCII, which is convenient for debugging (because you can easily read the communication back-and-forth), but it’s completely unnecessary for production and just increases network chatter. Change it to Binary Protocol—

$m = new Memcached;
$m->setOption(Memcached::OPT_BINARY_PROTOCOL, true);

By the way, in case you were curious, the ASCII protocol is dead simple, and I’ll show you how you can use it to debug Memcached from telnet at the end of this chapter.

As an example, the ASCII protocol looks like this (lines marked > are sent by the client, lines marked < come back from the server):

$ telnet localhost 11211
> get hello
< VALUE hello 0 5
< world
< END

Here we retrieve the key named hello. The server answers with the key name, its flags (0), the length of the value in bytes (5), and then the value itself, world.

Compression and Serialization

A common use-case for Memcached is to use it to store more complicated values, like PHP arrays or objects. Typically, if possible, you want to avoid this altogether. Serialized objects take up quite a bit of space, and the bigger the objects, the more memory you’ll chew through. If you find yourself needing to store complex objects or arrays often, I suggest considering Redis as an alternative, as it has more complex data types better designed for this type of work.

That being said, if you absolutely must store arrays and objects in Memcached, it’s easy. Just pass them into the set function. The default behavior of the Memcached client is to use PHP’s built-in serialize function on the object before sending it. This allows you to take advantage of the custom __serialize() functionality provided by PHP’s magic object methods. You can add a method named __serialize() to any class (PHP 7.4 and later; older versions use __sleep()) and add your own custom logic to properly serialize nested objects or only return a limited subset of values.

Years ago, I worked with a guy who insisted on comma-separating arrays with implode before storing them in Memcached. He would split the strings back into arrays with explode when retrieving the data. Don’t be that guy. PHP does the serialization for you. If you find yourself often needing a more sophisticated data type, you need Redis.

PHP’s built-in serialization can be pretty heavy. Since it’s an ASCII-based serialization, it takes up a non-optimal number of bytes. There is a drop-in serialization replacement extension called igbinary that allows you to use a more complex but smaller-footprint form of binary serialization. It “just works” seamlessly once you install the module. If you’re going to be storing a bunch of serialized data in Memcached, igbinary is a must-have.

$ pecl install igbinary

And then enable it for the Memcached extension:

<?php
$m = new Memcached;
$m->setOption(Memcached::OPT_SERIALIZER, Memcached::SERIALIZER_IGBINARY);
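If you want a rough feel for how heavy a serialized value is before you cache it, you can simply measure it. The sketch below compares the built-in serialize() with json_encode() and, only when the extension happens to be loaded, igbinary_serialize(). The `$profile` array is a made-up sample value, and the numbers will differ for your own data.

```php
<?php
// Hypothetical sample value, standing in for something you might cache.
$profile = [
    'id'    => 42,
    'name'  => 'Ada',
    'roles' => ['admin', 'editor'],
];

$sizes = [
    'serialize'   => strlen(serialize($profile)),
    'json_encode' => strlen(json_encode($profile)),
];

// igbinary is optional; only measure it when the extension is loaded.
if (function_exists('igbinary_serialize')) {
    $sizes['igbinary'] = strlen(igbinary_serialize($profile));
}

foreach ($sizes as $format => $bytes) {
    echo $format, ': ', $bytes, " bytes\n";
}
```

Multiply the difference by the number of items you expect to store to decide whether a different serializer is worth the trouble.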

Note: One downside of using serialization and compression is that generally you won’t be able to read the values from Memcached using any language besides PHP. If building a Service-Oriented-Architecture, with multiple backend languages, this can be a deal breaker. You may want to look at Memcached::SERIALIZER_JSON for this type of work.

Similar to serialization, PHP’s Memcached extension can also compress the values of your items using a standard compression library (zlib or fastlz, depending on how the extension is built and configured). This is convenient because it allows you to use less memory when storing bigger objects. Enable it with the following code:

$m = new Memcached;
$m->setOption(Memcached::OPT_COMPRESSION, true);

With regards to compression, it will only be enabled on larger values (small values will skip compression), and benchmarks generally show an all-around performance improvement when compression is enabled.

Server Pools

It’s often desirable to spread your Memcached usage over several servers, both for increased memory access and to avoid a single point of failure.

information

Note: Out of the box Memcached doesn’t offer anything based around replication. As you know, data in Memcached is considered ephemeral, so loss is considered “okay”. There are some alternative Memcached servers and plugins that offer a replication option.

We use the concept of Server Pools to enable multiple Memcached Servers. In code, it looks like this:

<?php
$m = new Memcached;
$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
$m->addServers(
    array(
        // Each entry is array(host, port); 11211 is the default Memcached port
        array('192.168.0.1', 11211),
        array('192.168.0.2', 11211),
        array('192.168.0.3', 11211)
    )
);

Remember that the data is distributed over the various servers; it’s NOT replicated. We are sharding the data over several servers and no single server holds all of the data. The amount of memory available is additive: if all three servers in the pool have 32GB of memory available to them, you effectively have 96GB of usable cache space.

Consistent Hashing

If you’re wondering how Memcached knows which server to get which keys from, since any of the three servers in the pool can only hold 1/3 of the total keys, GOOD QUESTION!

If you’re also wondering why we set Memcached::OPT_LIBKETAMA_COMPATIBLE in the previous example, another good question!

The Memcached client is responsible for handling all routing. That is to say, if KEY1 is on 192.168.0.1, and you ask 192.168.0.2 for KEY1, it will tell you the value doesn’t exist. The servers don’t talk between themselves and have no idea what keys exist besides the particular set they have stored locally. There is no cross-communication between Memcached servers in a pool.

When you ask the PHP Memcached client for KEY1, it has to figure out which server to connect to, otherwise it won’t get the right answer back (and it would be really bad for scaling if it had to check all of them, imagine if you had 50!). Since keys can be stored on any server in the Memcached pool, we need a way to figure out which server “owns” a particular key. It does this by hashing the key— that is, running the key through a function that will always return a consistent output. In the case of Memcache, the input is the key string and the output is the server in the pool.

Here is a very simplistic memcached key hashing function using strlen and modulo math.

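<?php
$servers = array("192.168.0.1", "192.168.0.2", "192.168.0.3");

// Pick a server for a key based on the key's length.
// $servers is passed in because PHP functions don't see outer-scope variables.
function memcache_hash($key, array $servers) {
    $key_length = strlen($key);
    $no_of_servers = count($servers);

    return $servers[$key_length % $no_of_servers];
}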

Obviously, this is a poor hashing algorithm because most of our keys are going to be short and won’t distribute over Memcached evenly, but you get the idea— with the same key, we will always output the same server. Thus, we can determine which server “owns” a key when setting and retrieving it without having to check all of the servers.

Great, we’ve got hashing down, but what the heck is consistent hashing?

Consistent hashing is the same concept, with a twist. In the example above, if the number of servers changes (we add or remove one), the $no_of_servers count changes, which impacts the end hash that’s produced. That is to say, when you change the count of $servers, most keys will generate a new output, effectively wiping your cache pool clean. Even though the data is still there, the hashing algorithm will produce a different output for previously stored keys, so it won’t map the keys correctly, and PHP won’t be able to find previously stored data.
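To see how disruptive this is, here is a small sketch that counts how many keys change servers when a modulo-based pool grows from three servers to four. It uses crc32 rather than strlen so the keys spread evenly; the function names and the `user:N` key pattern are made up for the illustration, and it assumes 64-bit PHP, where crc32() never returns a negative number.

```php
<?php
// Map a key to a server index with a simple hash-modulo scheme.
function modulo_server(string $key, int $serverCount): int
{
    return crc32($key) % $serverCount;
}

// Count how many of $total sample keys land on a different server
// when the pool changes from $before to $after servers.
function count_remapped(int $total, int $before, int $after): int
{
    $moved = 0;
    for ($i = 0; $i < $total; $i++) {
        $key = "user:$i";
        if (modulo_server($key, $before) !== modulo_server($key, $after)) {
            $moved++;
        }
    }
    return $moved;
}

$total = 10000;
$moved = count_remapped($total, 3, 4);
printf("%d of %d keys moved (%.0f%%)\n", $moved, $total, 100 * $moved / $total);
```

With an evenly distributed hash, a key only stays put when its hash gives the same remainder modulo 3 and modulo 4, which is true for about a quarter of all hashes. The other three quarters of your cache suddenly look empty.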

Consistent hashing fixes this. Using different techniques (pre-defined bucketing, token ring, etc), when the number of servers changes, only a small percentage of keys are remapped, allowing us to add and remove servers without effectively wiping our entire Memcached cluster.
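One way to picture the token ring technique is the sketch below. Each server is placed on a ring at many pseudo-random positions, and a key belongs to the first server position at or after the key’s own position. This is a simplified illustration of the general idea, not the exact Ketama algorithm that libmemcached uses; the function names, the 100 points per server, and the fourth server address are assumptions made for the example, and it assumes 64-bit PHP.

```php
<?php
// Hash a string to a 32-bit position on the ring (first 8 hex chars of its MD5).
function ring_position(string $value): int
{
    return (int) hexdec(substr(md5($value), 0, 8));
}

// Build a sorted ring of position => server, with $points positions per server.
function build_ring(array $servers, int $points = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $points; $i++) {
            $ring[ring_position("$server-$i")] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

// Walk clockwise to the first server position at or after the key's position.
function ring_lookup(array $ring, string $key): string
{
    $position = ring_position($key);
    foreach ($ring as $point => $server) {
        if ($point >= $position) {
            return $server;
        }
    }
    return reset($ring); // wrapped around: use the first position on the ring
}

$three = build_ring(['192.168.0.1', '192.168.0.2', '192.168.0.3']);
$four  = build_ring(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4']);

$total = 2000;
$moved = 0;
for ($i = 0; $i < $total; $i++) {
    if (ring_lookup($three, "user:$i") !== ring_lookup($four, "user:$i")) {
        $moved++;
    }
}
printf("%d of %d keys moved\n", $moved, $total);
```

Adding the fourth server only inserts new positions on the ring, so the only keys that move are the ones the new server takes over. Everything else stays exactly where it was.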

Understanding the underlying algorithm isn’t totally important. What is important is that you set the following option when using multiple Memcached servers in order to enable consistent hashing. By default, PHP will use modulo-based key distribution, which will remap most keys when adding or removing a server.

$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

Enabling this does a few things: it turns on consistent hashing and sets the key hash algorithm to MD5. Libketama is a general consistent hashing implementation for Memcached that is available in most languages, so using it means your hashing scheme will be easily reproducible from almost every modern programming language.

Adding Weight

One option that the Memcached client gives us when adding servers is the choice to specify a weight. Weighting allows you to influence the proportion of keys that are mapped to a particular server. This is useful when your Memcached servers have different amounts of resources, particularly memory, available to them.

<?php
$m = new Memcached;
$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
$m->addServers(
    array(
        // array(host, port, weight)
        array('192.168.0.1', 11211, 25),
        array('192.168.0.2', 11211, 25),
        array('192.168.0.3', 11211, 50)
    )
);

In the above example, the last server, 192.168.0.3, will get twice as many keys to store as the other two servers, because it has double the weight. This would be ideal, if, for example, 192.168.0.3 had double the memory of the other two boxes.
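As a quick sanity check on a weighting scheme, you can work out the share of keys each server should receive: its weight divided by the sum of all weights. The helper below is a hypothetical convenience for that arithmetic, not part of the Memcached extension, and the real distribution will only approximate these figures.

```php
<?php
// Expected share of keys per server, given the weights passed to addServers().
function expected_shares(array $weights): array
{
    $total = array_sum($weights);
    $shares = [];
    foreach ($weights as $server => $weight) {
        $shares[$server] = $weight / $total;
    }
    return $shares;
}

$shares = expected_shares([
    '192.168.0.1' => 25,
    '192.168.0.2' => 25,
    '192.168.0.3' => 50,
]);

foreach ($shares as $server => $share) {
    printf("%s: %.0f%%\n", $server, $share * 100);
}
```

The weights don’t need to add up to 100. Only the ratios between them matter, so 1, 1, 2 describes the same split.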

Persistent Connections

Instead of opening a new memcached connection on each PHP request, we can use persistent connections and re-use the same connection for multiple requests. Each PHP-FPM worker gets its own persistent connection, so if you have 50 workers, you will have 50 static connections to memcached.

Persistent connections are generally preferred as they are slightly faster because of reduced per-request connect/disconnect overhead and also keep the number of connections to memcached static. They can be especially helpful when you have several memcached servers in your pool that you have to connect to.

The way persistent connections are defined with the memcached extension is slightly confusing, so let me explain how it works.

First, you need to define a persistent connection pool name, by passing a string into the constructor, like so:

<?php
$m = new Memcached("my_pool");

Just like without persistent connections, the next step is to add the servers to the pool, but with one slight change. When we pass a pool name to the Memcached constructor, we’re asking it to grab the existing persistent connection pool named my_pool if it already exists, and otherwise to give us back an empty pool. This is a different PHP paradigm than we’re used to, because the pool lives across multiple requests.

So, before calling addServers, we need to check if the Memcached constructor has returned an already configured persistent connection pool or an empty connection pool. If it’s already configured, we don’t need to do anything! If it’s empty, we need to call addServers and any setOption’s that we may have.

The code looks like this:

<?php
$m = new Memcached("my_pool");

// Only configure the pool the first time this worker sees it
if (empty($m->getServerList())) {
    $m->addServers(array(
        array("192.168.0.1", 11211),
        array("192.168.0.2", 11211)
    ));
}

Note: It’s extremely important to do this check when using persistent connections. If you just call addServers on every request, it will re-add the same servers to the connection pool over and over. It doesn’t do any duplication checking, so you can potentially add thousands of the same server to a single connection pool.

Atomic Counters

One great use case for Memcached is counting things: page visits, image views, active users, etc. There are two reasons. The first is that if you’re currently using MySQL to count something that is updated frequently, you can save a huge amount of resources by moving that counter out of the database and into a cache. Remember, every time you write to a MySQL table, you invalidate every query cache entry that involves that table. If you happen to still have the query cache enabled and don’t see many writes otherwise, moving frequently updated counter columns out of MySQL can make the query cache useful. Likewise, fewer writes to MySQL just means less load. The second reason is that Memcached offers atomic counters, which means they are lock-free (in your code, anyway).

You may be tempted to write your counter using Memcached like so:

<?php
$memcached = new Memcached;
$count = $memcached->get("counter:{$page->id}");
$count += 1;
$memcached->set("counter:{$page->id}", $count);

This is wrong because it creates a race condition. If this code is run in two separate requests at the exact same time, it’s possible that one of the requests will overwrite the other’s increment. You can fix it by using CAS tokens (explained later), or, more easily, by using the atomic counters in Memcached. Consider this:

<?php
$memcached = new Memcached;
$count = $memcached->increment("counter:{$page->id}");

Likewise, Memcached also implements a command for decrementing:

<?php
$memcached = new Memcached;
$count = $memcached->decrement("counter:{$page->id}");

And you can increment or decrement by multiple steps by passing in a second argument with a numeric value.

<?php
$memcached->decrement("counter:{$page->id}", 4);

Also worth noting: increment and decrement (incr and decr at the protocol level) return the newest value of the counter, so they save a round trip compared to the badly designed GET/SET pattern described earlier.
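If the race condition in the GET/SET pattern feels abstract, it can help to replay the interleaving by hand. In the sketch below a plain PHP array stands in for the cache, and two imaginary requests, A and B, are written out step by step. Nothing here talks to a real Memcached server.

```php
<?php
// A plain array stands in for the cache so the interleaving can be replayed by hand.
$cache = ['counter:1' => 10];

// GET/SET pattern: both requests read before either one writes.
$seenByA = $cache['counter:1'];       // request A reads 10
$seenByB = $cache['counter:1'];       // request B reads 10

$cache['counter:1'] = $seenByA + 1;   // A writes 11
$cache['counter:1'] = $seenByB + 1;   // B also writes 11, overwriting A's increment

$afterGetSet = $cache['counter:1'];

// Atomic increment: the server applies one increment at a time,
// so two increments always add up.
$cache['counter:1'] = 10;
$cache['counter:1']++;                // request A
$cache['counter:1']++;                // request B

$afterIncrement = $cache['counter:1'];
```

Two page views were recorded in both cases, but the GET/SET version only counted one of them.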

Avoiding the Rat Race: CAS Tokens

CAS Tokens, also known as Check-And-Set Tokens, allow you to use Memcached for more advanced use cases while avoiding potential race conditions. To use them, you must have a modern version of Memcached (1.4 or above). They work like this:

1. Before making any changes, fetch the current CAS token for the key you plan on changing, along with its value.

2. Calculate the new value you’d like to set for the key.

3. Send the new value of the key, along with the CAS token, to Memcached. Think of the CAS token as a timestamp of when the key value was last modified. Memcached compares the CAS token you sent with its own CAS token for that key. If they match, it means there have been no changes and it’s safe to change the value. If they do not match, it means there has been a change since you got your CAS token, and the value is not safe to change.

4. If the operation returns false (CAS tokens did not match), we try the same operation again.
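The steps above can be sketched with an in-memory stand-in for the cache, which makes it easy to see a stale token being rejected. The `FakeCasStore` class is hypothetical and exists only for this illustration; it uses a simple version counter as the token.

```php
<?php
// In-memory stand-in for a cache that supports check-and-set.
// Illustrative only; FakeCasStore is not part of any extension.
final class FakeCasStore
{
    private $values = [];
    private $tokens = [];   // key => version number, bumped on every write

    public function set(string $key, $value): void
    {
        $this->values[$key] = $value;
        $this->tokens[$key] = ($this->tokens[$key] ?? 0) + 1;
    }

    // Step 1: fetch the value together with its current CAS token.
    public function gets(string $key): array
    {
        return ['value' => $this->values[$key], 'cas' => $this->tokens[$key]];
    }

    // Step 3: write only if nobody has modified the key since the token was issued.
    public function cas(int $token, string $key, $value): bool
    {
        if ($this->tokens[$key] !== $token) {
            return false;
        }
        $this->set($key, $value);
        return true;
    }
}

$store = new FakeCasStore();
$store->set('counter:1', 10);

$a = $store->gets('counter:1');   // request A reads the value and a token
$b = $store->gets('counter:1');   // request B reads the same value and token

$aStored = $store->cas($a['cas'], 'counter:1', $a['value'] + 1);   // succeeds
$bStored = $store->cas($b['cas'], 'counter:1', $b['value'] + 1);   // rejected: stale token

// Step 4: B retries with a fresh read.
$b = $store->gets('counter:1');
$bRetried = $store->cas($b['cas'], 'counter:1', $b['value'] + 1);
```

Request B’s first attempt fails instead of silently overwriting A’s update, and the retry picks up A’s value before incrementing it.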

For example, we could implement a simple counter like so (although, in practice, please use increment and decrement).

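<?php
$m = new Memcached;

$tries = 0;
while ($tries < 10) {
    // Fetch the value and its CAS token in one call (memcached extension 3.x)
    $item = $m->get("counter:1", null, Memcached::GET_EXTENDED);
    if ($item === false) {
        break; // key does not exist yet; nothing to update
    }

    $value = $item['value'] + 1;

    // cas() only stores the value if the token still matches
    if ($m->cas($item['cas'], "counter:1", $value)) {
        break;
    }

    $tries++;
}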

Two things to notice. The first is that we have to wrap our algorithm in a while loop, because it can fail multiple times in a row if the key sees many updates. This is why it’s better to use atomic operations if possible. The second is that we cap the number of tries. This prevents the loop from spinning and taking too much time. A preferred alternative is to use a time-based deadline rather than a simple counter.

A deadline-based CAS algorithm is implemented below, with a 20ms deadline.

<?php
$m = new Memcached;

$begin = microtime(true); // current time in seconds, as a float
$deadline = 0.020;        // 20ms, expressed in seconds
while (microtime(true) - $begin < $deadline) {
    $item = $m->get("counter:1", null, Memcached::GET_EXTENDED);
    if ($item === false) {
        break; // key does not exist yet; nothing to update
    }

    if ($m->cas($item['cas'], "counter:1", $item['value'] + 1)) {
        break;
    }
}

For convenience, it’s generally a better idea to wrap this sort of functionality into a helper instead of reinventing the wheel every time you need to write this code.

<?php
// Retry $block until it returns true or $deadline_ms milliseconds have passed.
function cas_deadline($deadline_ms, $block) {
    $begin = microtime(true);

    while ((microtime(true) - $begin) * 1000 < $deadline_ms) {
        if ($block()) {
            return true;
        }
    }

    return false; // deadline reached without a successful update
}

// Used like so
$m = new Memcached;
cas_deadline(20, function() use ($m) {
    $item = $m->get("counter:1", null, Memcached::GET_EXTENDED);
    if ($item === false) {
        return true; // key does not exist; stop retrying
    }

    return $m->cas($item['cas'], "counter:1", $item['value'] + 1);
});

Watch out for timeouts

My favorite topic of discussion! Timeouts! Luckily, the memcached PECL extension has very fine-grain support for timeouts down to the millisecond level. Perfect. In older versions of the libmemcached library, there were some bugs around timeouts not working properly, but they are fixed now and we can use timeouts reliably with high precision (just make sure you have a recent version of libmemcached installed on your box).

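<?php
$m = new Memcached;
// Units differ per option (see the PHP manual's Memcached constants)
$m->setOption(Memcached::OPT_CONNECT_TIMEOUT, 50);  // milliseconds
$m->setOption(Memcached::OPT_POLL_TIMEOUT, 50);     // milliseconds
$m->setOption(Memcached::OPT_SEND_TIMEOUT, 50000);  // microseconds (50ms)
$m->setOption(Memcached::OPT_RECV_TIMEOUT, 50000);  // microseconds (50ms)
$m->setOption(Memcached::OPT_RETRY_TIMEOUT, 1);     // seconds before retrying a failed server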

The default for each of these is either 1000 (1s) or 0 (forever), which is bad news on a production server. Depending on your setup, you may want to change my setting of 50ms, but I find that Memcached typically responds within milliseconds across local ethernet, so I keep it at 50ms to allow some breathing room without impacting the speed of the site. Remember, if PHP-FPM starts blocking on Memcached connections you will run out of PHP-FPM workers and your app will go down.

Using Memcached for Sessions

The Memcached extension has a session handler which allows you to use PHP’s built-in session handling, except it stores the sessions in Memcached instead of on the filesystem. This is necessary for scaling your app servers horizontally. Some bad guides on the internet will tell you to use NFS or MySQL. Don’t do that. Using Memcached for your sessions is as easy as adding some php ini values.

<?php
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', '192.168.0.1:11211:50,192.168.0.2:11211:50');

ini_set('memcached.sess_consistent_hash', 1);
ini_set('memcached.sess_binary', 1);
ini_set('memcached.sess_prefix', 'app_session.');
ini_set('memcached.sess_remove_failed', 1);
ini_set('memcached.sess_number_of_replicas', 1);
ini_set('memcached.sess_connect_timeout', 50);

I like to use ini_set for these types of per-app values instead of defining them in php.ini, but it’s certainly an option to put them in php.ini if you’d like.

Notice that session.save_path can include multiple servers, each with a port number and weight. This gives you the ability to spread your session storage across multiple servers for failover and scalability.

The rest of the values are pretty self-explanatory, but I’ll run through them quickly.

· memcached.sess_consistent_hash: Turn on consistent hashing using libketama. A must with multiple Memcached servers.

· memcached.sess_binary: Use the binary protocol for session connections.

· memcached.sess_prefix: Set a unique prefix for these session keys. Necessary if you have multiple PHP apps sharing your Memcached pool; without it you may have session collisions.

· memcached.sess_remove_failed: If one of your Memcached servers becomes unavailable, remove it from the pool automatically.

· memcached.sess_number_of_replicas: One cool feature of the PECL Memcached extension is the ability to replicate the session data to multiple memcached servers, so if one dies, you don’t lose the data. Set it to the number of replicas (extra copies) that you want.

· memcached.sess_connect_timeout: The connection timeout before the server is marked bad and skipped. Default is 1000 (1s), which is too high. On local ethernet, the connection time should not be more than a couple of milliseconds. Set it to 50ms to give yourself some breathing room while staying quick enough to not slow down your app if a server goes down.

Anyways, after you set your session handler to memcached, you can use PHP sessions like normal—

<?php
session_start();
$_SESSION["foo"] = "bar";

Debugging Memcached from Telnet

Even if you’re using the binary protocol to access Memcached from PHP, that doesn’t mean you can’t use the ASCII protocol for testing! It makes it very easy to test Memcached from Telnet.

telnet 127.0.0.1 11211
>

From here, you can run any commands, like get, set, stats, and flush_all.

Warning: If you don’t know what flush_all does, be careful! It flushes the entire cache from memory, without warning or confirmation. You’ve been warned.

The most useful debugging command is stats. It will give you a bunch of useful information such as connections used, number of items, hit/miss ratios, etc.

telnet 127.0.0.1 11211
> stats
STAT pid 4061
STAT uptime 9
STAT time 1390339143
STAT version 1.4.14
STAT pointer_size 64
STAT rusage_user 0.000000
STAT rusage_system 0.004994
STAT curr_connections 5
STAT total_connections 6
STAT connection_structures 6
STAT reserved_fds 20
....
STAT threads 4
STAT conn_yields 0
STAT hash_power_level 16
STAT hash_bytes 524288
STAT hash_is_expanding 0
STAT expired_unfetched 0
STAT evicted_unfetched 0
STAT bytes 0
STAT curr_items 0
STAT total_items 0
STAT evictions 0
STAT reclaimed 0
END
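The stats output doesn’t print a hit ratio directly, but it does report get_hits and get_misses counters that you can combine yourself. Here is a minimal sketch, assuming you already have the stats as an array keyed by stat name (for example, one of the per-server arrays returned by Memcached::getStats()). The `hit_ratio` helper and the sample numbers are made up for illustration.

```php
<?php
// Compute the cache hit ratio from a stats array keyed by stat name.
// Returns null when no get requests have been served yet.
function hit_ratio(array $stats): ?float
{
    $hits   = (int) ($stats['get_hits'] ?? 0);
    $misses = (int) ($stats['get_misses'] ?? 0);
    $total  = $hits + $misses;

    return $total === 0 ? null : $hits / $total;
}

// Made-up numbers for illustration, not taken from a real server.
$sample = ['get_hits' => 900, 'get_misses' => 100, 'evictions' => 0];
$ratio  = hit_ratio($sample);
```

A ratio that drops while evictions climbs is a strong hint that the cache is too small for your working set.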

You can also get and set values pretty easily from the command line for debugging purposes.

telnet 127.0.0.1 11211
> set foo 0 100 4
> data
< STORED

Here’s a quick ASCII protocol primer. set is the command, foo is the key name, 0 is an arbitrary unsigned integer flag that Memcached lets us set (it’s how PHP marks a key as compressed, so it can uncompress the value later, for example), 100 is the expiration time in seconds, and 4 is the number of bytes in the value for this key. Lastly, we have data, which is the value we want to store for the key foo.
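To tie the pieces of the set command together, here is a small sketch that assembles the exact bytes a client would send. The `build_set_command` function is a hypothetical helper written for this example; in real code the Memcached extension does this for you.

```php
<?php
// Build the text-protocol "set" command for a key and value.
// Format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
function build_set_command(string $key, string $value, int $flags = 0, int $exptime = 0): string
{
    return sprintf("set %s %d %d %d\r\n%s\r\n", $key, $flags, $exptime, strlen($value), $value);
}

// Same command as the telnet session: key foo, flags 0, expires in 100s, value "data"
$command = build_set_command('foo', 'data', 0, 100);
```

Note that the byte count is computed with strlen, which counts bytes rather than characters, and that every line ends with a carriage return and line feed.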

Retrieving an item is just as easy.

telnet 127.0.0.1 11211
> get foo
< VALUE foo 0 4
< data
< END

Similar to the set command, we just pass the key name to get and it returns the value of the key, the flag, and the byte size of the data. Easy.

My Memcached Setup

What would a case study be without some real-world knowledge? This is how we have our Memcached servers set up in their current incarnation.

Hardware

· Dual Xeon 2620 (12x 2.0GHz)

· 128GB Memory

· 1Gbps Ethernet

We have a handful of these servers— lots of cores because Memcached is multi-threaded and lots of memory because we have an incredible amount of data to cache.

Why run Memcached on its own servers?

One common misconception is that you can simply run a Memcached daemon anywhere that you have free memory. Let me tell you why this is not always a great idea— I used to do this. I ran memcached on all of our app servers, and after much trouble switched to dedicated Memcached boxes.

My motivation was to take advantage of all available resources. I noticed that the app servers had quite a bit of free memory, and across 25 or so boxes, it amounted to an extra 250GB that could be used for memcached. Score!

Except it caused problems. And the worst kind of problems, they were non-obvious and took almost a year to track down.

Let’s say there are 5 servers, 192.168.0.11 through 192.168.0.15. These servers are running both Memcached and PHP-FPM, serving web requests. We’re sharding our memcached load across all 5 boxes, like so:

<?php
$m = new Memcached;
$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
$m->setOption(Memcached::OPT_CONNECT_TIMEOUT, 100);

$m->addServers(
    array(
        array('192.168.0.11', 11211, 20),
        array('192.168.0.12', 11211, 20),
        array('192.168.0.13', 11211, 20),
        array('192.168.0.14', 11211, 20),
        array('192.168.0.15', 11211, 20),
    )
);

All is great, timeouts are set, sharding is working, and we have a ton of extra cache available to us now.

Here’s the problem: load can be highly irregular on any of the individual app servers. Load spikes on 192.168.0.11 because of an increase in web requests, and the box slows down. Normally, this would be no problem. haproxy would simply fail it out of the pool gracefully and life would continue on.

But that doesn’t work as cleanly as we’d expect with this setup. Even though 192.168.0.11 is out of the web pool, memcached requests are still going to it, and they’re taking longer to respond because of the high load on that box. Connections start timing out, tying up PHP-FPM workers across the cluster for longer, and the infection spreads.

Now, with more PHP-FPM workers across the entire cluster slowing down because 192.168.0.11 is taking longer to respond, load increases across the entire cluster. It gets worse. As the load increases, 192.168.0.12 also starts to respond a little bit slower to memcached requests. And so on and so forth. The end result is that increased load on one server propagates to the entire cluster like falling dominoes, and they all go down.

This is because we’ve broken a principal rule of scaling! We want things to be horizontally scaled but when we add multiple roles to a single server, and introduce cross-communication (for instance, a single app server ends up talking to all app servers), we’re no longer horizontally scaled.

Recommended Options

Here are all of the PHP Memcached Client options that I recommend using for the best performance and future scaling options (ability to add and remove servers, for instance).

<?php
// Use persistent connections
$m = new Memcached("pool");

if (empty($m->getServerList())) {
    // Set timeouts to roughly 50ms (units differ per option)
    $m->setOption(Memcached::OPT_CONNECT_TIMEOUT, 50);  // milliseconds
    $m->setOption(Memcached::OPT_POLL_TIMEOUT, 50);     // milliseconds
    $m->setOption(Memcached::OPT_SEND_TIMEOUT, 50000);  // microseconds
    $m->setOption(Memcached::OPT_RECV_TIMEOUT, 50000);  // microseconds
    $m->setOption(Memcached::OPT_RETRY_TIMEOUT, 1);     // seconds

    // Turn on consistent hashing
    $m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

    // Enable compression
    $m->setOption(Memcached::OPT_COMPRESSION, true);

    // Enable Binary Protocol and igbinary serializer
    $m->setOption(Memcached::OPT_BINARY_PROTOCOL, true);
    $m->setOption(Memcached::OPT_SERIALIZER, Memcached::SERIALIZER_IGBINARY);

    $m->addServers(array(
        array("192.168.0.1", 11211),
        array("192.168.0.2", 11211)
    ));
}

Further Reading and Tools

twemproxy

An open-source tool by Twitter, twemproxy acts as a proxy server for memcached. It can shard data automatically, handle persistent connections, provide stats, and lower the connection count to the backend memcached daemon when you have many, many web workers.

Link to GitHub

peep

A tool for dumping the memory contents of a live memcached server to peek at what values it is storing. Great if you want to know exactly how memcached is being used.

Link to GitHub

PHP igbinary / compression / memcached benchmarks

Benchmarks showing different combinations of serialization, compression and memcached extensions (spoiler: igbinary + binary protocol + compression + memcached extension wins).

Link to GitHub

twemperf

Another open-source tool by twitter that can be used to benchmark memcached server performance.

Link to GitHub