1. Get an EC2 account.
2. Choose an international server region from the drop-down in the upper right.
3. Create a tiny instance, keys, etc.
4. Edit the security group to allow connections from your network to the instance on some non-privileged port, say, 1999.
5. SSH to the new instance.
6. Use ssh-keygen to generate a local keypair on the server and add the public key to ~/.ssh/authorized_keys.
7. Run: ssh -gND 1999 localhost
8. Use your server instance’s IP and the port from steps 4 & 7 in your proxy settings.

If you’re trying to use the proxy from a Mac or an iOS device, create a .pac file that looks like this, replacing IP_ADDRESS and PORT to match the server:

function FindProxyForURL(url, host) {
  return "SOCKS <IP_ADDRESS>:<PORT>";
}

Obviously, if you’d like to keep the international spooks out of your traffic, you could do step seven between your local machine and the EC2 instance and configure your client to proxy locally. The traffic would then be encrypted by SSH while traveling over FLAG Atlantic 1, but if that’s the problem you’re trying to solve, Tor is probably a better solution.
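That local variant looks something like this (substitute your instance’s address; the login user depends on your AMI):

# ssh -ND 1999 root@<EC2_IP>

Then point your proxy settings at localhost:1999 instead of the instance’s public IP. Dropping the -g means the tunnel only accepts connections from your own machine.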

BEWARE: this is a completely open proxy. Don’t leave it running. If you do, someone will find it and proxy their child porn through it and the NSA will find you.

Normally git uses your default keys for authentication – ~/.ssh/id_rsa or whatever. Sometimes, though, you want to use some other key pair. The fix is simple, but not obviously documented in the few places I looked, so here you go:

Let’s say you would typically use something like this:

# git remote add origin git@git.foo.com:bar/baz.git

But you want to use ~/.ssh/my_other_key to authenticate.

In ~/.ssh/config add a block like this:

Host git-foo-com-other-key
  HostName git.foo.com
  IdentityFile ~/.ssh/my_other_key

Then, instead of the above git-remote, use this:

# git remote add origin git@git-foo-com-other-key:bar/baz.git

Now, git push origin will use the appropriate key (the corresponding public key is known to git.foo.com, right?)
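To confirm which key is being offered before you push, you can test the host alias directly; ssh’s verbose output will show the IdentityFile it actually tries:

# ssh -vT git@git-foo-com-other-key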

In httpd.conf:

RewriteEngine On
RewriteMap urlshort txt:/path/to/file/map.txt
RewriteRule ^/s/([a-zA-Z0-9]+)$ ${urlshort:$1} [R]

In map.txt:

abcdef http://www.foo.com/blah/
bcdefg http://www.bar.com/foo/

The RewriteRule can be tweaked (removing the /s, using a fixed-length key, whatever) as long as whatever matches in the parens is the key to the map.txt table. And map.txt could be a different kind of map (including an external program that looks the key up in a DB, logs the request, etc.)
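For example, mod_rewrite’s prg: map type hands each key to a long-running external program on STDIN and reads the replacement back on STDOUT. A minimal lookup script might look like this (the path and the hard-coded hash are placeholders; a real one would hit a DB):

RewriteMap urlshort prg:/path/to/lookup.pl

#!/usr/bin/perl
# lookup.pl: mod_rewrite sends one key per line; reply with the target
# URL, or the literal string "NULL" when there's no match.
use strict;
$| = 1;  # prg: maps require unbuffered output

my %map = (
  abcdef => 'http://www.foo.com/blah/',
  bcdefg => 'http://www.bar.com/foo/',
);

while (my $key = <STDIN>) {
  chomp $key;
  print exists $map{$key} ? $map{$key} : 'NULL', "\n";
}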

I was recently asked about enabling mod_concat on the work servers to make it easier to combine things like .js and .css files into a single request. It works nicely:

<script src="foo.js" type="text/javascript"></script>
<script src="bar.js" type="text/javascript"></script>

becomes

<script src="./??foo.js,bar.js" type="text/javascript"></script>

This is nice and it behaves reasonably with Last-Modified headers, etc. But if you have some PHP files in your document tree and request something like this:

http://www.foo.com/blah/??foo.php,bar.php

It’ll expose unparsed PHP source, which is almost never what you want to happen.

The right solution, obviously, would be to use subrequests to get the content back from each file and then concatenate all of that. But that’s much more complicated and 90% of the time unnecessary: how often do you really want to combine the output of two dynamic URLs but can’t just do it within the application?

Anyway, the obvious answer was to just not do it: let mod_concat return static files that would have been served by the standard static content handler, but bail on everything else. This patch does exactly that.

So you have a running EC2 instance. It works great, except it’s one of the ephemeral, kill-it-and-you-lose-everything kind. An EBS-backed instance is the logical choice, so how do you convert it? Easy:

Step 1: Create the EBS volume

Just do it in the web interface. You could use the command-line tools, but why? While you’re there, attach it to your running EC2 instance, making note of the volume-id and the device it’s connected to, e.g. vol-abcd1234 and /dev/sdf.

While you’re in the web interface, make a note of the ramdisk and kernel your running instance is using. This will be important later. They’ll be something like “ari-12345678” and “aki-abcdef12”, respectively.

Step 2: Sync your running instance

If you have things like mysql running, shut them down. It’ll save you hassles later. Then create a FS on your EBS volume:

# mkfs.ext3 /dev/sdf

Next, mount it:

# mkdir /mnt/ebs

# mount /dev/sdf /mnt/ebs

Now, use rsync to copy everything over to the EBS volume:

# rsync -a --delete --progress -x / /mnt/ebs

You won’t have /dev/sda2 for your /mnt partition on EBS, so you need to remove it from the copied fstab in /mnt/ebs/etc/fstab. Comment it out, remove it, whatever.
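Commented out, the line in /mnt/ebs/etc/fstab would look something like this (the exact device and options vary by AMI; this one is illustrative):

# /dev/sda2  /mnt  ext3  defaults  0 0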

(Added 11/2010 thanks to Mark Smithson in the comments)

You may need to create some device files on the new EBS volume. If console, zero or null don’t exist in /mnt/ebs/dev, create them using some or all of these:

# MAKEDEV -d /mnt/ebs/dev -x console
# MAKEDEV -d /mnt/ebs/dev -x zero
# MAKEDEV -d /mnt/ebs/dev -x null

Unmount the EBS volume:

# umount /mnt/ebs

Step 3: Get your keys in order

You’ll need an EC2 X.509 cert and private key. You get these through the web interface’s “Security Credentials” area. This is NOT the private key you use to SSH into an instance. You can have as many as you want; just keep track of the private key, because Amazon doesn’t keep it for you. If you lose it, it’s gone for good. Once you have the files, set some environment variables to make things easy:

# export EC2_CERT=$(ls $PWD/cert-*.pem)

# export EC2_PRIVATE_KEY=$(ls $PWD/pk-*.pem)

(The ls forces the glob to expand; assigning `pwd`/cert-*.pem directly can leave a literal * in the variable, which the tools won’t understand.)

Step 4: Make your AMI

Now you can make a snapshot of your EBS volume. This is the basis of the AMI you’ll be creating. Whatever you copied to the EBS volume in step 2 will be there — user accounts, database data, etc. First, the snapshot (using the volume-id from step 1):

# ec2-create-snapshot vol-abcd1234

That’ll give you a snapshot-id back. You then need to wait for the snapshot to finish. Keep running this until it says it’s “completed”:

# ec2-describe-snapshots snap-1234abcd
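If you’d rather wait automatically, a small loop does it (assuming the API tools are on your PATH and using the snapshot-id from above):

while ! ec2-describe-snapshots snap-1234abcd | grep -q completed; do
  sleep 10
done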

Finally, you can register the snapshot as an AMI:

# ec2-register --snapshot snap-1234abcd --description "your description here" --name "something-significant-here" --ramdisk ari-12345678 --kernel aki-abcdef12

Step 5: Launch!

At this point, you should see your EBS volume, the snapshot, and your AMI in their respective areas of the web interface. Launch an instance from the AMI and you’ll find it pretty much exactly where you left your original instance.

Edward Tufte’s sparklines are a nice way to present small, word-sized graphs in contexts where larger, full charts aren’t needed or warranted. The growing support for the HTML <canvas> tag gives an opportunity to create these graphs dynamically in JavaScript.

(Download yahoo-widget-sparkline-1.0b.zip.)

This widget was built against YUI version 2.7.0 but may work in other versions. It subclasses the “charts” module and is easily loaded via YUILoader. The included test.html creates an auto-updating sparkline, based on a simple FunctionDataSource, within a paragraph of text.

Features:

  • If passed a height of “font-size” or “line-height”, the module will examine the CSS of the element it renders into to determine its height. This allows you to easily insert a sparkline into blocks of text without worrying about scaling the graph by hand.
  • By default it will remove all content within the element that will contain the <canvas>. This allows “alt=” type data to be displayed when <canvas> is not available.
  • Arbitrary highlighting of points with custom colors.
  • Mean, median, and standard-deviation lines, plus arbitrary “normal” range shading.
  • Uses the Chart object’s polling setup to auto-update charts by re-polling the DataSource.

Usage:

Pull in the module:

var loader = new YAHOO.util.YUILoader();
loader.addModule({ name: 'sparklinecss', type: 'css',
                   fullpath: 'Sparkline.css'});
loader.addModule({ name: 'widget.Sparkline', type: 'js',
                   requires: ['charts', 'sparklinecss'],
                   fullpath: 'Sparkline.js'});
loader.require('widget.Sparkline', 'datasource', 'element');

Then create a sparkline:

var sl = new YAHOO.widget.Sparkline("sparkline-container",
                                    myDataSource,
                                    { height: 20, yField: "value" });
sl.subscribe("sparklineUpdate", updateHandler);

Configuration Options:

max
Force a maximum y-value. Default is the largest value in the data.
min
Force a minimum y-value. Default is the smallest value in the data.
width
Width of the graph. Default is one pixel per data point.
height
A pixel value or “font-size” or “line-height”. Default is 20 pixels.
logscale
Change the y-axis to a logarithmic scale.
clearParent
Remove any children of the node passed to render() before adding the canvas. Default is true.
zero, mean, median
Add horizontal lines at the appropriate places. Each can have a boolean true value or a string specifying the color for the line.
stddev
Calculate the standard deviation of the data and shade the background between +/- one standard deviation from the mean. Can be true or a color.
normal
An object with two or three fields: min, max, and (optionally) color. A background rectangle covering the range will be drawn in the specified (or default) color.
color
Graph color
highlightPoints
An array of two-field objects: x and color. x is the index of the point in data to highlight and color is the color to use. x-values of max and min highlight the maximum and minimum values. An x-value of -1 highlights the last point in the data.
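As an illustration, several of these options combined (the values are arbitrary, and myDataSource is whatever DataSource you already built):

var sl = new YAHOO.widget.Sparkline("sparkline-container",
                                    myDataSource,
                                    { height: "line-height",
                                      yField: "value",
                                      mean: true,
                                      stddev: "#eee",
                                      highlightPoints: [ { x: "max", color: "#c00" },
                                                         { x: -1,    color: "#00c" } ] });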

To Do:

  1. Graceful failure when <canvas> is unavailable.
  2. Allow graphing of data extracted from HTML markup.

(updated 7/7/2009)

Many years ago I wrote a simple perl+mysql polling script for work. It collected votes for multiple-choice polls, stored them in a database, and returned a results page. It was so simple and easy to use, we’ve continued using it for YEARS. Every so often we ask a question that is HIGHLY controversial (I’m looking at you NOW) and we are suddenly inundated with votes, causing much of the site to slow down or return errors.

The last time this happened I made some changes to the code that made it more difficult to vote more than once. Part of that involved storing the current vote totals in Memcached instead of asking the database each time. But even with that change, we’re still running a steady 3 or 4 votes per second and would be vulnerable to legitimate spikes in traffic from on-air promotion or something similar.

So today, I’ve reworked the script to run ALL traffic through memcached and NEVER hit the database at all unless memcached is completely unavailable. The basis for this is BroddlIT’s “memcached as simple message queue,” at least as far as I could figure from his rough sketch.

The general idea: there’s a lock, a pointer to the next slot in the queue, and the queue entries themselves.
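Everything below assumes a Cache::Memcached handle like this one (the server address is an assumption, and no_rehash is explained in the update at the end):

  use Cache::Memcached;
  use Time::HiRes qw(usleep);

  my $memd = Cache::Memcached->new({
    servers   => ['127.0.0.1:11211'],  # assumed single local daemon
    no_rehash => 1,                    # don't remap keys to another server on failure
  });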

Getting the lock

This waits up to a second to get the lock by sleeping for 1/1000th of a second between attempts. You can implement multiple queues by varying the $key_prefix. If the loop exits without getting the lock, you have to decide what to do.

  my $key_prefix = "poll-queue";
  my $total_sleep = 1000000;
  while($total_sleep > 0) {
    my $lock_key = $memd->incr("$key_prefix-lock");

    if(!defined $lock_key) { # No key in cache yet.
      # add() is atomic: it only succeeds if the key doesn't exist, so
      # exactly one client can claim the lock this way.
      if($memd->add("$key_prefix-lock", 1)) {
        $memd->set("$key_prefix-curr", 0);
        last;
      }
    } elsif($lock_key == 1) { # We got the lock.
      last;
    }

    # Someone else has the lock, so wait 1/1000th of a second.
    $total_sleep -= usleep(1000);
  }
  if($total_sleep <= 0) {
    # You didn't get the lock. Figure something out.
  }

If you make sure you normally hold the lock for a very small amount of time, you can assume that waiting significantly longer than that without success gives you reasonable cause to take over the lock. The only trick there, as you might expect, is that you need to then release it before another process makes the same assumption.

Get Next Key Pointer

The $key_prefix-curr memcached key contains the index of the last entry in the queue. By calling the $memd->incr() method we get the next one and update the memcache at the same time. If for some reason the pointer doesn’t exist, it is created. The $key variable becomes the memcached key for this message.

  my $next_key = $memd->incr("$key_prefix-curr");
  if(!defined($next_key)) { $memd->set("$key_prefix-curr", $next_key = 1); }
  my $key = sprintf("$key_prefix-key-%d", $next_key);

Store the Message in the Queue

Simple. Run this and the previous step in a loop to insert multiple values. But be careful: the lock is held, so the longer you take, the longer you block other requests.

  $memd->set($key, $message);

Release the Lock

This releases the lock, which allows another process to insert its messages.

  $memd->set("$key_prefix-lock", 0);

Reading and Flushing the Queue

Obtain the lock in the same way as above, then get the $key_prefix-curr value and loop from 1. Just make sure you process the values after you release the lock so you don’t block new entries.

  my $last_key = $memd->get("$key_prefix-curr");
  for(my $i=1; $i<=$last_key; $i++) {
    my $key = sprintf("$key_prefix-key-%d", $i);
    push @messages, $memd->get($key);
  }
  $memd->set("$key_prefix-curr", 0);
  $memd->set("$key_prefix-lock", 0);

  process_messages(@messages);

So that’s it. It should require only Cache::Memcached and Time::HiRes.

Update: if you run multiple memcached daemons and connect to them all (allowing key-hash load balancing), you will probably want to set the “no_rehash” flag and add a bit more error checking. If you allow Cache::Memcached to re-hash your keys when a server connection fails, you could end up losing the various keys temporarily.

So you have a directory structure that contains all of the applications needed to run a set of servers, but no server needs the entire tree. This rsync configuration is the easiest way I can come up with to rsync only what is needed for a single server:

If /export/apps contains bin, lib, var, etc, share, and sbin, but you only want bin, lib, and one subdirectory of share synced, then `hostname`-exclude-list.txt looks like this:

+ /apps/bin/
+ /apps/lib/
+ /apps/share/vim/
- /apps/share/*
- /apps/*

Using this command, you can then sync the directories:

# rsync -avz --delete-excluded \
        --exclude-from=`hostname`-exclude-list.txt \
        /export/apps /local

With the --delete-excluded, if you change the exclude file, newly excluded entries will be removed from the destination. The copy will end up in /local/apps. More usefully, /export/apps could be on a remote server and the destination would be /export.
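For that remote case, the command might look like this (appserver is a stand-in for the file server’s hostname):

# rsync -avz --delete-excluded \
        --exclude-from=`hostname`-exclude-list.txt \
        appserver:/export/apps /export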

A method for randomly selecting N values from a finite list of unknown length, without replacement (i.e., reservoir sampling). The algorithm is from Taming Uncertainty. This version takes lines of input on STDIN (or files named on @ARGV), selects 20 of them at random, and prints the results to STDOUT.

#!/usr/bin/perl
# Reservoir sampling: keep $N lines chosen uniformly at random from a
# stream of unknown length, without replacement.

use strict;
use warnings;

my $N = 20;   # sample size
my $k = 0;    # number of lines seen so far
my @r;        # the reservoir

while(<>) {
  if(++$k <= $N) {
    # Fill the reservoir with the first $N lines.
    push @r, $_;
  } elsif(rand(1) <= ($N/$k)) {
    # Keep line $k with probability $N/$k, replacing a random slot.
    $r[rand(@r)] = $_;
  }
}

print @r;

The VGT Omnivore’s Hundred

1) Copy this list into your blog or journal, including these instructions.
2) Bold all the items you’ve eaten.
3) Cross out any items that you would never consider eating.
4) Optional extra: Post a comment at www.verygoodtaste.co.uk linking to your results.

I’ve eaten 60/100.
I’ll skip 4/100.

(via Dean Sabatino)

1. Venison
2. Nettle tea
3. Huevos rancheros
4. Steak tartare
5. Crocodile

6. Black pudding
7. Cheese fondue
8. Carp
9. Borscht
10. Baba ghanoush
11. Calamari
12. Pho
13. PB&J sandwich
14. Aloo gobi
15. Hot dog from a street cart

16. Epoisses
17. Black truffle
18. Fruit wine made from something other than grapes
19. Steamed pork buns
20. Pistachio ice cream
21. Heirloom tomatoes
22. Fresh wild berries
23. Foie gras
24. Rice and beans

25. Brawn, or head cheese
26. Raw Scotch Bonnet pepper
27. Dulce de leche
28. Oysters
29. Baklava

30. Bagna cauda
31. Wasabi peas
32. Clam chowder in a sourdough bowl
33. Salted lassi
34. Sauerkraut
35. Root beer float

36. Cognac with a fat cigar
37. Clotted cream tea
38. Vodka jelly/Jell-O
39. Gumbo
40. Oxtail
41. Curried goat

42. Whole insects
43. Phaal
44. Goat’s milk
45. Malt whisky from a bottle worth £60/$120 or more
46. Fugu
47. Chicken tikka masala
48. Eel
49. Krispy Kreme original glazed doughnut

50. Sea urchin
51. Prickly pear
52. Umeboshi
53. Abalone
54. Paneer
55. McDonald’s Big Mac Meal
56. Spaetzle
57. Dirty gin martini
58. Beer above 8% ABV

59. Poutine
60. Carob chips
61. S’mores
62. Sweetbreads
63. Kaolin
64. Currywurst
65. Durian
66. Frogs’ legs
67. Beignets, churros, elephant ears or funnel cake
68. Haggis
69. Fried plantain
70. Chitterlings, or andouillette
71. Gazpacho
72. Caviar and blini
73. Louche absinthe
74. Gjetost, or brunost
75. Roadkill
76. Baijiu
77. Hostess Fruit Pie
78. Snail
79. Lapsang souchong
80. Bellini
81. Tom yum
82. Eggs Benedict
83. Pocky

84. Tasting menu at a three-Michelin-star restaurant.
85. Kobe beef
86. Hare
87. Goulash
88. Flowers
89. Horse
90. Criollo chocolate
91. Spam
92. Soft shell crab
93. Rose harissa
94. Catfish
95. Mole poblano
96. Bagel and lox
97. Lobster Thermidor
98. Polenta
99. Jamaican Blue Mountain coffee

100. Snake