Ouch!

Welcome to my blog.

I normally get around 10,000-11,000 readers a day poking their noses in here. By blog standards, that's a bit hardcore — but it seldom spikes over 15,000 visitors, even if I post something interesting enough to get BoingBoing'd or front page coverage on Hacker News or Slashdot. The most visitors I'd ever seen until yesterday were 120,000 on two consecutive days, when I bloviated about the iPad and struck a nerve.

It's a good thing I upgraded my server last month ...

In the past 24 hours, the blog entry I posted yesterday has had 1.44 million readers. No, I'm not making that up; I wish I was. I burned through about a year's worth of bandwidth in one day. The comment system turned to sludge and I had to switch it off for a while. The hits are still incoming (many thanks to Neil Gaiman for tweeting the URL to his million and a half followers!) and I've taken down all the comments (and switched off comment posting) on that article, reducing its page size from 250Kb to 56Kb in the faint hope of not having to, like, pay for the consequences of the entire internet suddenly deciding to flashmob my server.
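To put rough numbers on that page-size cut, here is a back-of-the-envelope sketch in Python. It assumes (and this is only an assumption) that each of the 1.44 million readers fetched the article page exactly once and that "Kb" means kilobytes, with 1 KB taken as 1000 bytes. The function name `transfer_gb` is made up for illustration.

```python
# Back-of-the-envelope bandwidth estimate using the figures quoted above.
# Assumptions: one page fetch per reader, and "Kb" means kilobytes
# (1 KB = 1000 bytes, 1 GB = 1,000,000 KB).

READERS = 1_440_000
PAGE_WITH_COMMENTS_KB = 250
PAGE_WITHOUT_COMMENTS_KB = 56


def transfer_gb(views: int, page_kb: int) -> float:
    """Total transfer in gigabytes for `views` fetches of a `page_kb` page."""
    return views * page_kb / 1_000_000


with_comments = transfer_gb(READERS, PAGE_WITH_COMMENTS_KB)
without_comments = transfer_gb(READERS, PAGE_WITHOUT_COMMENTS_KB)
saving = 1 - PAGE_WITHOUT_COMMENTS_KB / PAGE_WITH_COMMENTS_KB

print(f"with comments:    {with_comments:.1f} GB")
print(f"without comments: {without_comments:.1f} GB")
print(f"saving per view:  {saving:.0%}")
```

Under those assumptions the full page would cost roughly 360 GB in a day, against roughly 81 GB with the comments stripped. Real traffic is messier (repeat visits, partial loads, other pages), so treat this as an order-of-magnitude estimate only.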

I don't have an exact breakdown of what just happened, because my logfile analyser is still churning, from 2am last night. That's because the Apache logfile today is over 500Mb — it's normally about 150Kb.

Lessons learned?

1. Static HTML, with no graphics, is king. If you accidentally break a major news story, it will save your ass. They say one picture says a thousand words, but a thousand words of HTML is about 7-8Kb; a picture can easily be ten times the size. And dynamically generating content on the fly is expensive; while my server could probably keep up with 10k visits/day, 100k would kill it. By pre-generating the pages, my blog was able to keep up with millions of visits/day — as long as it wasn't running CGI scripts to regenerate the comments.

2. Even a PC with mundane desktop specs (dual 2.4GHz Athlon, 2Gb RAM, 1Tb disk space) can serve pages at an insane pace today — it may be a budget box, but it kicks sand in the face of the serious production servers I was working with at Datacash in 1998-2000.

3. You can optimize the hell out of CGI scripts but that won't do you much good if your load spikes by two orders of magnitude. So: aim to bulletproof your server for a one order of magnitude load spike (to cover every case but the once-a-decade flashmob), and be prepared to degrade service gracefully (e.g. by disabling comments) if you get that mobbing.

4. Tune Apache now — you won't get a chance to do it when the stampede arrives!
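As a rough illustration of lesson 1, here is a minimal Python sketch of pre-generating pages at publish time. It is not how this blog's software actually works; `render_entry`, `publish`, and the slug-to-(title, body) layout are hypothetical names chosen for the example.

```python
import html
from pathlib import Path


def render_entry(title: str, body: str) -> str:
    """Build a complete, image-free HTML page for one entry."""
    paragraphs = "\n".join(
        f"<p>{html.escape(p)}</p>" for p in body.split("\n\n") if p.strip()
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body><h1>{html.escape(title)}</h1>\n{paragraphs}\n</body></html>\n"
    )


def publish(entries: dict[str, tuple[str, str]], out_dir: Path) -> list[Path]:
    """Write every entry to disk once, at publish time.

    After this runs, serving a page is a plain file read; no script
    executes per visitor.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, body) in entries.items():
        path = out_dir / f"{slug}.html"
        path.write_text(render_entry(title, body), encoding="utf-8")
        written.append(path)
    return written
```

The point of the design is that the expensive step (rendering) happens once per change rather than once per reader, so a hundredfold jump in readers costs the web server only file reads.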

We now (I hope) return you to your regularly scheduled service. If things go dark, it's probably because my hosting company pulled the plug. If they didn't, I'd like to take this opportunity to plug the hell out of ByteMark!

75 Comments

1:

Since you yanked comments on the previous post, here's the pithy comment I wanted to make:

Hey Charlie, I hope this doesn't mean you have to re-write the sequel to Halting State again.

2:

It is to be hoped, as you say, that the comments will reappear. Or, at least post them as a static page, just so we can all see the fun ....

3:

Out of curiosity, where did everyone come from? If BoingBoing/Slashdot only peak at 150k, then 1.5m suggests much more widespread coverage.

4:

Well, Charlie said that the log analyzer is still churning...

5:

Well, it hit the front page of Reddit and stayed there for a significant time. I saw it a couple of other places too. StumbleUpon and one other place.

6:

Probably from word of mouth on Twitter. I saw Ben Goldacre pick it up as well as Neil Gaiman, who regularly makes websites fall over when he links to them. Those two, plus retweets from their followers, are easily enough to generate those kinds of numbers.

Incidentally, a good idea for hosting a robust blog I heard is to host it on Google App Engine. Most of the time you'll use so little capacity they won't bother to charge, and then their prices are reasonable if you do break the limits. Also, being Google, their capacity is, to a good approximation, infinite.

7:

You made front page on reddit too (twice). Isn't reddit larger than slashdot now?

8:

All that pain for an OITC hoax? Ouch!

9:

As mentioned, Neil Gaiman's 1.5 million Twitter followers were notified. I'd expect a reasonably large proportion of them to follow such a juicy story to here.

10:

Well congratulations for building a webserver that didn't melt when hit with that kind of traffic.
Kick arse author and grade A sysadmin, quite a combination :)

11:

The comments will be restored when the hordes directed my way by Neil et al subside. Hopefully tomorrow.

12:

Maybe sooner -- we're down to only 18,000 readers per hour!

13:

Nick: Ben Goldacre, The Guardian, Reuters business blog, Reddit, and more than 500 tweets in the first two hours. That'll do it.

14:

These days I outsource a lot of the sysadmin work. But I'm not a total numpty, and I tend to think in terms of identifying weak spots and ensuring they aren't brittle. You may have noticed that the core of this blog is image-free, and the pages are all pre-generated -- CGI scripts are only run when someone posts a comment or a new article.

15:

I'd go less "static HTML" and more "cache friendly". If you install something like Varnish (varnish-cache.org), anything which sets reasonable cache-control headers will be limited only by network speed and, importantly, will gain protection against things like many slow clients, which traditional servers like Apache weren't designed to handle. The other nice aspect is that it'll limit backend requests to one per cacheable resource, so you avoid your server repeating the same work for many visitors.

I'd also plug the disqus-style AJAX comments approach, which allows the page to load even if the comment system is overloaded. This works nicely with the caching philosophy since you don't invalidate the whole page when someone posts a comment.
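The "one backend request per cacheable resource" idea can be sketched in a few lines of Python. This is emphatically not Varnish, just a toy in-memory cache with a time-to-live standing in for a `Cache-Control: max-age` value; the class name `TTLCache` and its parameters are invented for the example, and it ignores concurrency entirely.

```python
import time
from typing import Callable


class TTLCache:
    """Serve repeated requests for a URL from memory until the entry expires."""

    def __init__(
        self,
        backend: Callable[[str], str],
        max_age: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend = backend  # the expensive page generator
        self._max_age = max_age  # seconds, playing the role of max-age
        self._clock = clock      # injectable so the cache can be tested
        self._store: dict[str, tuple[float, str]] = {}
        self.backend_calls = 0

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]  # fresh copy: the backend is not touched
        self.backend_calls += 1
        body = self._backend(url)
        self._store[url] = (now, body)
        return body


# A thousand visitors asking for the same page within the TTL.
cache = TTLCache(lambda url: f"<html>{url}</html>", max_age=60)
for _ in range(1000):
    cache.get("/ouch.html")
print(cache.backend_calls)
```

However many visitors arrive inside the 60-second window, the page generator runs once per URL; everything else is served from memory.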

16:
and more than 500 tweets in the first two hours. That'll do it.

Does this mean the term "slashdotted" is obsolete and we should now use "twitted"? :)

17:

me, I prefer to go the not-even-Apache route (gatling, in my case) and use a static generator (lots of them out there) for my blog .. not that it ever gets a lot of visitors what with it being squarely aimed at my relatives and no one else ;)

still, the problem remains that in order to allow comments, some sort of dynamic thingy is needed in the background.

18:

Interesting. I can't wait to see the headline, "SF author scoops major news networks!".

It's a shame that you don't have books coming out next week. Unless you do and I haven't been paying attention again.

Do you have anything set up to auto-shutdown areas of the blog? I'm thinking of a tracker that notices you're getting 100k hits in the last 10 minutes and shuts off comments. Might help limit the damage somewhat.
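A tracker along those lines might look something like the Python sketch below. It is only an illustration: the class name `CommentKillSwitch`, the idea of feeding it one timestamp per request, and the manual `reset` are all assumptions, and it expects timestamps to arrive in non-decreasing order.

```python
from collections import deque


class CommentKillSwitch:
    """Turn comments off once too many hits land inside a sliding window."""

    def __init__(self, max_hits: int = 100_000, window_seconds: float = 600.0) -> None:
        self.max_hits = max_hits
        self.window = window_seconds
        self._hits: deque[float] = deque()
        self.comments_enabled = True

    def record_hit(self, timestamp: float) -> bool:
        """Record one request; return whether comments are still enabled."""
        self._hits.append(timestamp)
        # Drop hits that have fallen out of the window.
        while self._hits and self._hits[0] <= timestamp - self.window:
            self._hits.popleft()
        if len(self._hits) >= self.max_hits:
            # Stays off until someone calls reset(), so the switch does not
            # flap on and off while the stampede is still in progress.
            self.comments_enabled = False
        return self.comments_enabled

    def reset(self) -> None:
        """Manually re-enable comments once the load has subsided."""
        self.comments_enabled = True
```

Keeping the switch latched until a human resets it is a deliberate choice in this sketch; an automatic re-enable would need some hysteresis to avoid toggling during a spike.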

19:

Charlie's post is now a Wikipedia footnote link in the article on the Peer in question. There could be a fair amount of traffic from that.
http://en.wikipedia.org/wiki/Lord_James_of_Blackheath

20:

Do you get to write the costs off against tax as a business expense? It's effectively publicity, and a necessary function of your job as an author.

21:

Varnish is on the to-do list once the shit-storm dies down -- not while it's in progress.

AJAX -- hate it. I like to ensure my web experience is solid and runs cross-platform. And my choice of browser experience involves NoScript being enabled by default. And I cater to folks who think like me. So: no AJAX-based comments.

22:

Yes, my colo server is a business expense.

23:
…I like to ensure my web experience is solid and runs cross-platform.…

Bless you. And if that is not enough, I can repeat this from my cellphone.

24:

@Charlie - thanks for giving me one of the biggest "WTF???" moments of my life, ever. A marvellously surreal bit of reading.

@Andy Simpson #6 - I picked it up from Ben Goldacre's BadScience link, too

@Vincent Archer #16 - Maybe "twatted" would be more appropriate?

25:

I posted the URL to the UKNot email list, and got the following intriguing tip back: the Foundation X may be the so-called "Office of International Treasury Control", URL http://en.wikipedia.org/wiki/Office_of_International_Treasury_Control

These individuals seem in effect to be a Government-scale 419 scam, albeit one fond of threatening litigation. I say this not as an accusation of dishonesty on their part, but more as speculation, as the sort of money these guys casually refer to does not casually swan about on the world stage unnoticed.

The United Nations and US Federal Reserve say that they have never heard of this group, and the group consistently refuses to give any proof of its status, hiding instead behind some claim of secrecy.

This office was first heard of in 2005, when it made an offer to buy the Rover Group car company, putting down a postal order for one pound as a deposit. The offer was not taken seriously by anyone; subsequently they have gone on to strut the world stage as a sort of international Arthur Daley.

This is their website: http://www.unoitc.org/ and profoundly unimpressive it is, too.

26:

Yes, exactly. Internal monologue:

"What?" "Naah; it's a setup for something." "Wait, that appears to be a genuine Hansard link."

...

"That is a genuine Hansard link."

...Googling...

"And this guy has genuine financial chops. WTF?!"

27:

You have at least some readers who think like you and appreciate your web design choices.

You are, however, in my NoScript whitelist, because I do comment here occasionally. If you AJAXed up the site, I'd dump you from the whitelist and move you to the "only turn on javascript when I want to comment" category.

Again, thanks for clean design. It is a thing of beauty from this side of the internet.

28:

Yep - that was the conclusion that the comments on the blog entry yesterday came to, and it seems to be generally accepted elsewhere by now as well.

However, given that those comments are currently invisible to keep the bandwidth overload down, I can't blame you for not being able to know the latest state of that discussion.

29:

Just so you know: the Disqus stuff is actually quite nice. When integrated w/ WordPress, it actually keeps the original comments on the page to allow for spiders, but allows for caching for a longer time. The active (AJAX) users post onto the Disqus servers, and see the Disqus threading. It bidirectionally moves comments between the 2 systems to keep things in sync, and it was fairly nice looking.

Mobile folks still see a working comment system, as do NOSCRIPT folks, just a slower version, that refreshes less often.

Very handy for large comment sites- it's very hard to deal w/ this otherwise.

30:

Disqus is... well, interesting. I wouldn't be leaping to move to it.

Out of curiosity, why stick to Apache when nginx and PHP play nicely these days? Apache's quite flexible but nginx has a massive performance advantage over it, and since it's event-driven it scales extremely well. Combine it with memcached and you can cache static gzipped assets in RAM directly, not to mention static pages- it's a cache server and web server all rolled into one. PHP-FPM for the backend and you're set. I migrated from a very busy Apache box to nginx, and was somewhat shocked by the difference - 10MB of RAM out of the box where a well-tuned Apache was using 600MB.

31:

PHP -- over my dead body. (It's a root exploit in motion.) nginx -- I'm not used to it, and I prefer to stick to what I know reasonably well. Folks aren't paying me to administer a web server, they're paying me to write fiction.

32:

I know that everyone is currently going gaga over the fact that you've publicised the UK buyout, but just let me say that
PHP -- over my dead body. (It's a root exploit in motion.)
Really made my day :-).

33:

It may be a momentary respite, but it looks from here like the load has dropped right off in the last hour.

34:

I looked up its "who is" information and got back a few interesting bits.

It's coming from one IP address 'Stateside (70.87.111.98), but is co-hosted with a fine selection of sites that appear to be someone else's old domains and which are forwarding aggregated content.

A quick and dirty "who is" report here: http://www.robtex.com/dns/www.unoitc.org.html#summary Source code for the front page says that it was built with Wysiwyg page editor, an online service at http://www.wysiwygwebbuilder.com/

Smells like a small-time programmer or wannabe IT type with too much time on his or her hands.

35:

I noticed it took a while to load (took me about 5 or 6 tries to get a full page view, rather than a time out message from Firefox), but since I'm over in Australia, I assumed the problem was at our end rather than yours.

Hope things calm down soonish.

36:

Sometime over the past couple of hours the http://www.unoitc.org/ website has apparently exceeded its allocated bandwidth.

What is the world coming to when bandwidth costs more than even a shadow government can afford?

37:

We are now back to normal. Phew!

38:

Houston, baby! (with a possible Dallas co-location) Commercial server @ ThePlanet.com, street address, everything.

http://whois.domaintools.com/70.87.111.98

Reports going in....

39:

Well, if people that high in government and finance fall for something like this, that's arguably relevant news in itself.

40:

It will be interesting to see what your traffic is like in 6 months. I for one got sucked in by the Hansard story, via the Delta Green mailing list, but will likely return regularly now I know you are here. I never realised you had a blog!

41:

Apologies if you've answered this elsewhere, but what do you use for content management? Is it a name we might recognize or have you rolled your own?

Congratulations on weathering the traffic spike.

42:

Can I just observe that IME Disqus is missing a "t" at the end? Seriously, it just plain doesn't work properly for anyone who doesn't have a Disqus account.

And totally off-topic, but several of my real-World friends are getting Charlie Stross books as part of their Christmas presents this year!

43:
it just plain doesn't work properly for anyone who doesn't have a Disqus account.

Friendly Atheist switched to Disqus a while back, and I was a bit surprised that it wasn't just rabid Noscript luddites (like myself) who were having problems.

The tech guy responsible spent a couple of days explaining that actually, it really does work just fine as long as all you commenters switch to Firefox, so please do that. What do you mean you can't install software at work? ... "Don't want to?" You're all wrong!

IIRC it lasted a couple of days before they switched back.

I guess this all means that if Charlie caters to people like me, I'm morally obliged to buy more of his books. Damn. I guess I don't really need both food and that hardware upgrade. Good thing I've stocked up on instant noodles.

44:

I would have agreed if he hadn't already made my day with the line about his choice of browser experience involves NoScript being enabled by default (and how he caters to folks who think like him).

45:

Another source of the heavy hits could be Pharyngula; someone posted a link there, on The Endless Thread.

46:

And totally off-topic, but several of my real-World friends are getting Charlie Stross books as part of their Christmas presents this year!

I beg to differ. Real-World purchases have plenty to do with a major traffic spike. Tea and scones all round! And cat kibble. Just for the cats, not all round.

47:

You'll be chuffed to hear Sky News just linked to the original post...

48:

I'm calling it now: there's no more actual readers than usual; this was all a massive DDoS attack orchestrated by Iain Banks in long-awaited vengeance for a Westercon snub back in '05.

Best served cold, indeed. YOU HEARD IT HERE FIRST

49:

I'm an SRE on Blogger and while I'm sure you are happy with your setup (so I'm honestly not trying to convert you here) dealing with this kind of burst is one of the better reasons for using one of the larger hosted services (such as Blogger).

We are a regular target of DoS attacks so we've got very good at dealing with traffic bursts, although the kind of traffic you are talking about wouldn't really change our day to day numbers.

Notably, Neil Gaiman uses Blogger: try running 'host journal.neilgaiman.com' on a command line.

50:

You might find the bit.ly statistics for the piece interesting...

51:

I submit that the Twitter replacement for slashdotted should be:

TWITTERFRIED

52:

Pleased to see the sensible PHP attitude.

On my box, I have one or two users making very light use of PHP. I force it through fastCGI, which means the PHP processes can be swapped out independently of the httpd processes, which helps. (my box is not very large).

53:

Does this strike anyone as a design flaw in the internet? The more people are interested in something, the worse it performs.

54:

The term has long since generalized, (boing'd, notably, but Penny Arcade and a rogues gallery of big name bloggers can do it) and I'm not even sure /. has that kind of juice anymore. Twitter storms and reddit/stumbleupon are multiplier effects, not root causes.

55:

Not in all cases. With BitTorrent-style peer-to-peer protocols, the more people interested in something, the better it performs.

I leave it as an exercise for the reader to implement a distributed blog architecture.

56:

Apache serves static files really quite well, but I'll have to second the recommendation for nginx, the performance really is astounding.

We recently switched to it from Apache (serving up static files and reverse proxying cinema website software, those Harry Potter midnight premieres are busier than one would think).

It's well worth putting time into it, if one foresees load spikes, but it does have a few surprising gotchas. The reverse proxy has some odd corner cases when coupled with SSL and the cache module is IMHO awful (we use varnish for the cache instead). The documentation is comprehensive, but linguistically confusing. By 'a bit of time' I mean that setting it up is easy, but getting it right is trickier for setups more complex than just static files and a bit of php.

57:

I'm obviously biased, but the idea of moving to App Engine isn't a bad one. My own blog system, Bloggart, renders all content at post-time, then serves it statically. Which, as you suggest, is about as efficient as you can get.

Feel free to get in touch if you're interested in help migrating. ;)

58:

I'd love to know if traffic like that translated into book sales. Do authors get a nice sales/second ticker from online book sales or is that sort of delicious data reserved for publishers?

59:

AhahaHaHA! ... Oops.

Tim, that would be logical. But this is the publishing industry you're talking about. Even the big publishers don't get that kind of real time fine-grained information. Bookscan -- the best book sales tracking network out there -- only captures about 50% of sales (in the US) and can tell you how they're doing from week to week. (In the UK, bookscan is a lot more efficient and, I'm told, captures everything, with daily granularity. But even so ...)

60:

Something shady all around...(ok, ok, I put on my tinfoil hat this morning)

Do some research, this time about ThePlanet.com...talk about a convoluted corporate structure that could probably use a visit from one or more of the investigative arms of the United States government.

I hunted around ThePlanet.com's website and eventually came across yet another group of owners/investors??? http://www.gipartners.com/investments

Now that was interesting, and I am sure I am delving into a lot of corporate shell games. But although everyone here has already figured out that unoitc.org is hosted by ThePlanet.com, why is it that www.netcraft.com doesn't list/find unoitc.org among the multitude of websites it tracks, and yet has a rather thorough alphabetical listing of sites run by ThePlanet.com?

One can start here: http://toolbar.netcraft.com/netblock?q=netblk-theplanet-blk-14,74.52.0.0,74.55.255.255 and narrow down one's search by using that marvelous troubleshooting technique known as half-splitting.

61:

Scratch my last, Mr. Stross...found it on netcraft after a lot of additional searching...

http://toolbar.netcraft.com/site_report?url=http%3A%2F%2Fwww.unoitc.org

62:

all the traffic is from the estonia sf award, we should know kilgore trout one one years ago...

63:

I like to ensure my web experience is solid and runs cross-platform. And my choice of browser experience involves NoScript being enabled by default. And I cater to folks who think like me. So: no AJAX-based comments.

I agree 100%. However, that doesn't rule out a well-written AJAX addition. If it is done right, it wouldn't be a problem for browsers that don't do JS well, nor for those with noscript.

Note that Disqus is not an example of AJAX comments done right.

64:

@Thorne: theplanet.com is an ISP. They host all sorts of stuff that has as much connection to them as people with gmail.com addresses have to Google.

65:

Better this than bacon cat.

66:

On the subject of ever so mundane and commonplace conspiracies of the dull but remunerative variety ..along the corrupt lines of .." what harm can it do since THEY are Bound to do something about it and I really NEED the Money!" ...

"Highly enriched uranium that could be used to make a nuclear bomb is on sale on the black market along the fringes of the former Soviet Union, according to evidence emerging from a secret trial in Georgia.

Two Armenians, a businessman and a physicist, have pleaded guilty to smuggling highly enriched uranium (HEU) into Georgia in March, stashing it in a lead-lined package on a train from Yerevan to Tbilisi."

http://www.guardian.co.uk/world/2010/nov/07/nuclear-material-black-market-georgia

It is always ... Always ! .. in my 'umble opinion a question of some-one saying to -say - a humble Jr Technician ..' very clever young man ..so You noticed that the trucks were delivering concrete at one side of the Building site and other trucks were collecting said concrete at the other side ? Well if you don't keep your clever little 16 year old mouth shut you will wind up under some of that concrete!' This back in the very cheerful peace and love ridden mid 1960s United Kingdom...that was also ridden with local government and police corruption.

Billions of euros and Us of Avian Dollars and a few decades later and beyond the borders of euro land? It's still pretty much the same sort of currency but with a few zeros added to the equation .. Conspiracy Theories exist because Conspiracies exist.

The dafter conspiracies are usually veils for the real ..and really Mundane ... financial conspiracies along the lines of my favorite ...

"In January 1984 Adriaan Nieuwoudt started the so-called "Kubus" scheme with an apparent beauty product in South Africa. Subscribers to the scheme bought a supposedly biological substance called an "activator", that was used to grow cultures in milk. After growing for a week or two, the cultures were harvested and dried, and sold back to the scheme. The cultures were never used for a beauty product but were simply ground up and resold to further investors as activators.[5] Other schemes by Nieuwoudt include investment in a holiday resort and a scheme involving collecting useless old postage stamps. He is currently seeking investors for a get-rich-quick coaline mining operating on his farm. "

http://en.wikipedia.org/wiki/List_of_Ponzi_schemes As for the latest scam? ..

" Georgia trial reveals how sting netted highly enriched uranium that had been smuggled via train inside lead-lined cigarette box "

Evil Bastards! A 'Cigarette Box'! Clearly the smugglers were .... Smokers! Who planned their nefarious scheme whilst they were taking a ciggy break!

Vast Schemes to make Billions through supposedly Ever So Clever politicians and their advisors? .. Just follow the Money.

67:

Can you tell us what blog engine you use?

68:

Movable Type, somewhat customised, backing onto MySQL, on Linux (not saying what flavour lest someone goes dumpster-diving for backdoors).

69:

Well, that was fun. Reddit and StumbleUpon alone would be enough to melt most servers. (I'm not going to mention which prominent weblog known for slashdotting smaller sites has been slashdotted by StumbleUpon, but you can work it out.) Neil tweeting, ditto. Throw in Boing Boing, Hacker News, Ben Goldacre, the Guardian, and David Ickes and his ilkes, and it's impressive that your site held up at all.

I assumed when I first read the story that it had to be an attempted fraud, because otherwise it just didn't make sense. I'll grant it's more sophisticated than most. Lord James of Blackheath was a good choice of target.

Re your comment #59, are you sure you'd want institutions knowing exactly which books everyone is reading?

70:

Teresa, I don't think Bookscan in the UK knows who is buying what books -- merely how many copies of each title are sold. (I'm told Hachette's ability to track sales around the UK is astonishing to an almost scary degree -- able to track down to number of copies of a given title per individual store per day -- but again: they don't know who is buying them, much less who is reading them.)

The most reassuring thing about the server-melting experience was learning that my ISP doesn't consider it to be an unusual load -- they have other clients who get hammered a lot harder! (Also: I have a damn fine sysadmin on call to help me out with the stuff I'm not competent to deal with myself.)

71:

Well Charlie, make that "they don't always know...buying them, at least in principle" and I'll agree. With the way that card transactions are recorded on EPOS systems, the only reason that every retailer (rather than just major E-tailers) can't produce data that says that "Paws4thot bought 'The Atrocity Archives', 'Iron Sunrise', 'Michael Tolliver Lives' and 'Chasm City' last month" is that high street retailers don't all maintain the relevant "customer purchased titles" table for card transactions. E-tailers do, which is what drives their "recommendations for you" list systems.

72:

It's. the. royals. buying. post. collapse. asylum. ;)

73:

I think the term should be twitterpated. Heh heh heh. Suffering from too much affection.

I guess I picked a bad day to catch up on the blog. At least I'm an irregular regular, rather than one of the horde.

74:

I think the term should be twitterpated. Heh heh heh. Suffering from too much affection.

75:

errr. doh. can't edit or remove a comment once I belatedly realized that I was responding to old news as if it were current. oops.

About this Entry

This page contains a single entry by Charlie Stross published on November 4, 2010 11:31 AM.

Did somebody just try to buy the British government? was the previous entry in this blog.

In case you were wondering ... is the next entry in this blog.
