June 2008


If you have a database of zipcodes and their latitudes and longitudes, you can use a version of this query to get the geographically closest zipcodes:

SELECT b.zipcode, b.city, b.state, b.latitude, b.longitude,
       ACOS(SIN(RADIANS(a.latitude))
          * SIN(RADIANS(b.latitude)) +
            COS(RADIANS(a.latitude))
          * COS(RADIANS(b.latitude))
          * COS(RADIANS(a.longitude - b.longitude))) as distance
 FROM zipcodes.zip_to_latlong a,
      zipcodes.zip_to_latlong b
WHERE a.zipcode=?
ORDER BY distance
LIMIT 20

The “distance” there is…I dunno…radians? I think the original is assuming the points are on a sphere, and converts from radians to degrees to miles using the 1.1515 statue miles per nautical mile standard.

I’m mostly a Perl guy (with secret love of Javascript), so I try to stay out of the Python stuff at dayjob where possible. But recently I’ve been taking the lead on a bunch of Memcached optimizations, which are starting to trickle over into the Python side.

A nice feature of the Perl Cache::Memcached module is the ability to define a “namespace” when you create the Memcached object:

my $memd = new Cache::Memcached (namespace => "foo_");

Then, any keys passed to the $memd object via get/set/etc. are automatically prefixed with “foo_”: $memd->get("123") actually requests the memcached key “foo_123”.

Python’s memcache module supports namespaces for the *_multi methods, but not on the individual get/set/etc calls. Also, the namespace must be passed on each call — you can’t specify it in the constructor. Well, subclassing saves the day again:

class Client(memcache.Client):
    def __init__(self, servers=None, debug=0, namespace=None):
        super(Client, self).__init__(servers, debug=debug)

        if namespace:
            self._namespace = namespace
        else:
            self._namespace=""

    # GET
    def get(self, key):
        try:
            val=self.get_multi([ key ])[key]
        except KeyError:
            val=None
        return val

    def get_multi(self, keys, key_prefix=''):
        if self._namespace: key_prefix=self._namespace + key_prefix
        return super(Client, self).get_multi(keys, key_prefix=key_prefix)

    # SET
    def set(self, key, val, time=0, min_compress_len=0):
        return self.set_multi({ key : val }, time=time, min_compress_len=min_compress_len)

    def set_multi(self, mapping, time=0, key_prefix='', min_compress_len=0):
        if self._namespace: key_prefix=self._namespace + key_prefix
        return super(Client, self).set_multi(mapping, time=time, key_prefix=key_prefix, min_compress_len=min_compress_len)

    # DELETE
    def delete(self, key, time=0):
        return self.delete_multi([key], time=time)

    def delete_multi(self, keys, seconds=0, key_prefix=''):
        if self._namespace: key_prefix=self._namespace + key_prefix
        return super(Client, self).delete_multi(keys, seconds=seconds, key_prefix=key_prefix)

    # EVERYTHING ELSE
    def add(self, key, val, time=0, min_compress_len=0):
        if self._namespace: key=self._namespace + str(key)
        super(Client, self).add(key, val, time=time, min_compress_len=min_compress_len)

    def incr(self, key, delta=1):
        if self._namespace: key=self._namespace + str(key)
        super(Client, self).incr(key, delta=delta)

    def replace(self, key, val, time=0, min_compress_len=0):
        if self._namespace: key=self._namespace + str(key)
        super(Client, self).replace(key, val, time=time, min_compress_len=min_compress_len)

    def decr(self, key, delta=1):
        if self._namespace: key=self._namespace + str(key)
        super(Client, self).decr(key, delta=delta)

The __init__ method is overridden to take an additional “namespace” parameter, which is stored in self._namespace. The get/set/delete methods all have namespace-capable *_multi versions, so for those I just pass the calls off to the appropriate one. The *_multi methods themselves are subclassed to check the self._namespace value as well as the namespace parameter, like normal. Finally, the add/incr/replace/decr methods are all modified to check the self._namespace value and prefix it to the key. Obviously, get/set/delete could have been done the same way.