Optimization Pitfall: Memcached Memory Limit

Google describes Python as a language that is easy to develop with, one that can greatly improve developer productivity and code readability. This strength comes at the price of slow performance: interpreted high-level languages are inherently slower than compiled ones.

Fortunately, Django comes with great caching backends that can bypass expensive calculations at the Python level. Memcached is the fastest of them all, as the Django documentation notes. To use it effectively, though, you should understand how Memcached works, or you may run into the same problem I encountered this week.

I spent a long time figuring out why Memcached had suddenly stopped caching the data. Here is the code:

    from django.core.cache import cache
    from django.shortcuts import render_to_response

    def my_django_view(request, template, page):
        articles = cache.get('all_popular_articles')
        if articles is None:
            articles = some_complex_query()
            cache.set('all_popular_articles', articles, 60 * 60)
        return render_to_response(template,
                                  {'articles': paginate_func(articles, page)})

Of course this worked during development and even in production at first. But as the data grew, caching suddenly stopped. When I ran the same code with another Django caching backend, such as local memory (CACHE_BACKEND = 'locmem://'), it worked perfectly.
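For reference, switching backends in that era of Django was a one-line settings change using the legacy CACHE_BACKEND URI syntax (the host and port below are placeholders for illustration):

```python
# settings.py -- legacy (pre-Django 1.3) cache backend syntax
CACHE_BACKEND = 'locmem://'                       # per-process memory, masks the problem
# CACHE_BACKEND = 'memcached://127.0.0.1:11211/'  # shared memcached, 1 MB per-item limit
```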

I realized later that Memcached has a default limit of 1 MB per cached item. Caching all these articles under a single key eventually exceeded that limit as the data grew. Worse, memcached rejects oversized values silently, so cache.set appeared to succeed while cache.get kept returning None.
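One way to make that silent failure visible is to check the serialized size before writing. This is a minimal sketch under my own assumptions: the helper name safe_cache_set is mine, the 1 MB constant mirrors memcached's default (tunable with the -I flag in newer memcached versions), and the cache argument stands in for any object with Django-style get/set:

```python
import pickle

# Memcached's default per-item size limit (1 MB)
MEMCACHED_ITEM_LIMIT = 1024 * 1024

def safe_cache_set(cache, key, value, timeout):
    """Cache the value only if its serialized form fits under the limit.

    Returns True when the value was stored, False when memcached would
    have silently rejected it.
    """
    size = len(pickle.dumps(value, pickle.HIGHEST_PROTOCOL))
    if size > MEMCACHED_ITEM_LIMIT:
        return False
    cache.set(key, value, timeout)
    return True
```

With this wrapper the view could at least log when a value is too big to cache, instead of quietly re-running the expensive query on every request.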

To stay under that limit, I split the articles into fixed-size chunks that Memcached could manage and cached each chunk under its own key.
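The chunking idea can be sketched roughly as follows. The helper names and key scheme (a ':count' key plus numbered chunk keys) are my own invention, and the cache argument again stands in for any object with Django-style get/set; a read treats a missing chunk as a full cache miss so the caller falls back to the query:

```python
def cache_in_chunks(cache, base_key, items, chunk_size, timeout):
    """Store a large list as several small values, each under its own key."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    cache.set(base_key + ':count', len(chunks), timeout)
    for i, chunk in enumerate(chunks):
        cache.set('%s:%d' % (base_key, i), chunk, timeout)

def get_chunked(cache, base_key):
    """Reassemble the list, or return None if any chunk is missing."""
    count = cache.get(base_key + ':count')
    if count is None:
        return None
    items = []
    for i in range(count):
        chunk = cache.get('%s:%d' % (base_key, i))
        if chunk is None:
            return None  # a chunk expired or was evicted; treat as a miss
        items.extend(chunk)
    return items
```

Note the trade-off: one logical read now becomes count + 1 round trips to memcached, and any single evicted chunk invalidates the whole set, which foreshadows the update below.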

Update (Mar 7, 2012): This technique proved to be worse than not using Memcached at all. :(