
How I got here in the end, part eleven: the music stops

I don't remember very much about 1999; it was all a bit of a blur.

I know I was there, physically: I remember 1998, and I wouldn't be here today if I hadn't lived through 1999. But 1999 itself is an enigma wrapped in a puzzle enrobed in a chocolate coating of darkest mystery. Lots of things happened in 1999 and I remember some of those things but I don't remember much about being there, probably because it was so intense ...

In my spare time, I wrote "Lobsters". Emailed it to a friend; "this is really great, but you'll never sell it — the audience would have to have been overdosing on slashdot for six months before they got it!" Emailed it to the editor of Spectrum SF, who bounced it rapidly: "vapid, style-obsessed, pointless rubbish". Stuck it in the post to Gardner Dozois at Asimov's SF in the faint hope of getting a handwritten explanation when he, too, rejected it ... Then I went back to work on another project for my spare time. In 1993-95 I'd written a novel that didn't work. (John Jarrold, then editorial director at Earthlight, explained at length in his rejection just why it didn't work, so convincingly that I led it out round the back of the barn and shot it rather than sending it out to face the big bad world of publishing once more.) But some bits of it were still twitching in my hind-brain. A couple of elements showed up in the short story "Antibodies". Now I had some other bits — and some more ideas that went together in a short novel called "The Atrocity Archive". Which took up rather a lot of my spare time that autumn.

Spare time of which I didn't have a lot. That summer, the development group of Datacash moved to new premises. There were about eight of us by the time we moved (Dave and I began looking at new space when there were five of us, knowing we'd be mushrooming rapidly) and by March 2000, when I left, there were about thirty-something folks working there. Every week: new faces. I was still holding the servers I'd written together by hand and trying to jump through the certification hoops for new banks; meanwhile, new jobs were appearing about as fast as Dave could hire new programmers, and the nasty collision between Brooks's Law and that ongoing 10-25% monthly compound growth prevented us from getting any traction in turning my workload into a team process.

One day, Dave asked me to write up my to-do list and email it to him. Three hours later: "Charlie, where's that to-do list I asked you for?" "Dave, I'm still writing it." In the end it ran to about four thousand words describing between two and eight man-years of work that was graded between "critical" and "very urgent".

Given this kind of pressure, it shouldn't surprise you to learn that things broke. In this case, two things broke: the Y2K bug, and transaction reconciliation.

In 1999, there was something of a panic over the Year 2000 problem. Today, in 2009, most folks laugh it off if you mention it, but it was deadly serious at the time. The banks in particular, being obsessed with risk management, were just about crapping themselves over it. Standard process in 1999 was to spend the first six months of the year badgering their suppliers, EPOS software vendors, and others to check their software; then in mid-1999 they went into a total lockdown mode, not permitting any changes to systems other than Y2K bugfixes, throwing barrels of money at 70-year-old COBOL programmers to come out of retirement for one last month, and so on.

Datacash was written in Perl between 1997 and 1999. What could it possibly have to do with Y2K? Well, I had completely failed to foresee the possibility that the demo program might still be in use, much less in large-scale bet-the-company production use, two years later. It was a demo, built to be a dog-and-pony show in front of an audience and be thrown on the scrapheap after six months, replaced by something newer and shinier. Except that it wasn't. Gavin had been selling it as a service, and customers were running on it, and there was no time and no programmer resources available to write the replacement. And at numerous places in the back end, there were lines of code like this:

printf("It shall be 19%d\n", $that_year);

that should have looked like this:

printf("It shall be %d\n", 1900 + $that_year);
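A minimal sketch of the failure mode, assuming (the post does not say so) that $that_year came from Perl's localtime(), which counts years from 1900, so that 1999 arrives as 99 and 2000 arrives as 100. The helper names buggy_year and fixed_year are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: the year value is an offset from 1900, as returned by
# Perl's localtime(). Both helper names are hypothetical.

# The "demo" form: glue a literal "19" in front of the offset.
sub buggy_year {
    my ($year_offset) = @_;
    return sprintf("19%d", $year_offset);
}

# The corrected form: add the offset to 1900 before formatting.
sub fixed_year {
    my ($year_offset) = @_;
    return sprintf("%d", 1900 + $year_offset);
}

# Compare the two forms either side of the rollover.
for my $year_offset (99, 100) {
    printf("offset %3d: buggy=%s fixed=%s\n",
        $year_offset, buggy_year($year_offset), fixed_year($year_offset));
}
```

With an offset of 100 the first form yields "19100", which is the same artefact several commenters below report seeing in the wild.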

We got it right ... in time. Just. We had a bad hiccup on January 2nd, when a chunk of code that generated the batch transaction files for one of our banks turned out to harbour a lurking Y2K issue that turned the file into gibberish. Luckily the bank mainframe bounced the batchfile and we tracked down the bug and squished it before the customers, hung over from the millennium celebrations, came back into work and started asking where their money was. But the millennium bug wasn't a joke; at Datacash, it was a bullet dodged — unnecessarily, if we'd taken the time to go through the iterated design process properly rather than shoving a prototype out the door to tout for trade on the street corner.

Batchfile reconciliation ... that was even more arcane, but even more of a toxic lurking hazard, and it came closer to killing Datacash in the early days than Y2K ever did. First, some background:

When you hand your credit or debit card over to a shop assistant with an EPOS terminal, several things happen to make the money disappear from your bank account and appear in the shop's suspense account. First, they establish a brief online connection to the bank's realtime system to authorize a withdrawal from your account. This is done in front of you, in the store, using the EPOS terminal; they stick in your card and the value of the transaction, and the bank's realtime system comes back with an answer: either an authorization code, or a message like DECLINED (you don't have enough money in your account) or KEEP CARD (the latter is a bad one).

Later, probably some time around midnight, the EPOS terminal (which keeps a journal file of all the day's transactions) goes online to talk to a different bank computer. It uploads a batch file full of transactions, one per line. These are processed in bulk by the bank's mainframe cluster, which at that point shuffles the actual money and balances the books between accounts.

Now, the bit you don't see: the mainframe batchfile run generates an output report, which describes (hopefully) what it's just done, and what total number of credit card transactions have been successfully processed, which were rejected because something has gone wrong between the initial card authorization request and the batchfile run, and whether the batchfile was overall a success.

Got that? Batchfile containing umpty-something transactions hits the mainframe; some time later the mainframe burps up a report that might indicate that not all transactions have been processed successfully.

Datacash was growing exponentially at that time. So were our customers. Let me give you a handle on that: UK2.net is a major British hosting and domain registration business today, with a couple of hundred staff. Back in 1998, around September, one guy called Bo got in touch with Dave to start processing credit card transactions. Bo had written this neat little script that could register domains and do DNS registry searches, and he wanted to sell domains over the internet. In his first month, he sold about 200 domains. In his second month, he did rather more business. In his third month, the bank yanked the emergency brake handle and cleaned out his account, because no way could this be a legitimate business, not with 10,000% monthly compound growth! The first anyone knew about this was when Bo phoned Dave and asked, plaintively, "where's my money?" The sorry saga came to a happy ending about a week later with a new bank account at a much more cooperative institution (who were more than happy to take advantage of a competitor's idiocy) ...

Anyway, as we passed five hundred customers, and found ourselves processing multiple thousands of transactions through half a dozen different banks every night, the batchfile reconciliation headache began to emerge as a critical issue. A single dodgy card could cause a batch of several thousand transactions for between ten and a hundred different merchants to be rejected. Worse, it could be very hard to work out which ones were affected. So we began talking to the banks. "Can you supply us with a daily report broken down by merchant and card number showing which transactions have succeeded or failed?" we asked them. "Eh?" came the reply. The banks were used to dealing with individual merchants, not organizations like Datacash, a Payment Service Provider.
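The kind of per-merchant breakdown being asked for can be sketched roughly as follows. This is an illustration only: the record layout (id, merchant, pence), the reconcile function, and the idea that a bank report boils down to a list of rejected transaction ids are all assumptions, not a description of the real batch files or of Datacash's code.

```perl
use strict;
use warnings;

# Hypothetical reconciliation: split a submitted batch into settled and
# rejected totals per merchant, given the ids the bank says it rejected.
sub reconcile {
    my ($batch, $rejected_ids) = @_;
    my %is_rejected = map { $_ => 1 } @$rejected_ids;
    my %by_merchant;

    for my $txn (@$batch) {
        # Create the merchant's summary record on first sight.
        my $summary = $by_merchant{ $txn->{merchant} }
            //= { settled => 0, rejected => 0, rejected_ids => [] };

        if ($is_rejected{ $txn->{id} }) {
            $summary->{rejected} += $txn->{pence};
            push @{ $summary->{rejected_ids} }, $txn->{id};
        }
        else {
            $summary->{settled} += $txn->{pence};
        }
    }
    return \%by_merchant;
}

# Made-up sample data: three transactions, one bounced by the bank.
my @batch = (
    { id => 'T1', merchant => 'M1', pence => 1000 },
    { id => 'T2', merchant => 'M1', pence => 2500 },
    { id => 'T3', merchant => 'M2', pence =>  750 },
);
my $result = reconcile(\@batch, ['T2']);

for my $merchant (sort keys %$result) {
    my $summary = $result->{$merchant};
    printf("%s settled=%d rejected=%d rejected_ids=[%s]\n",
        $merchant, $summary->{settled}, $summary->{rejected},
        join(',', @{ $summary->{rejected_ids} }));
}
```

Amounts are kept in integer pence to avoid floating-point rounding in the totals. The hard part in practice, as the surrounding text makes clear, was getting the list of rejected transactions out of the banks at all.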

PSPs need different types of report from merchants. Some of the banks couldn't even produce a report on a per-merchant basis showing the total number of transactions and total amount of money that had gone into their account on a given night. One of the banks could ... but it came printed in 132-column capitals on music-ruled line printer paper, and they insisted on faxing it to us. Sometimes the drone operating the fax machine got it wrong and faxed the report to us upside-down. And if something went wrong with the day's process, the bank IT staff couldn't work late to help us fix it — all the phone and fax lines were switched off at 6pm promptly to stop the staff from using them for personal calls!

By late 1999, getting the day's batchfiles reconciled was an urgent (but very labour-intensive) job. We had one guy whose primary responsibility was to keep tabs on where everything was going using a spreadsheet, and to make sure everything added up. In late August, he didn't come in to work one day. It turned out he'd had a mild stroke; he was in no shape to return to work for months. And then I noticed a puzzling discrepancy. It turned out that he'd been cutting corners, not keeping proper records. A whole bunch of transactions (several thousand) had been bounced by the banks, and we hadn't been notified.

Untangling that particular mess permanently soured my relationship with Gavin (the CEO, remember); we managed to get it squared away in the end, and the quarter-million-odd pounds' worth of transactions ended up in the right place. But there was much wailing and gnashing of teeth from down south, and ultimately an out-of-court settlement to a threatened lawsuit. And because I was the bearer of the bad news, Gavin really didn't want to hear my name. He was under pressure at the time: both he and Dave, and indeed everyone else, were operating beyond our levels of competence. Datacash was in the process of fumbling its way towards, not an IPO, but a reverse-takeover of a moribund public corporation (a high-level headhunting outfit). Strange selachimorphean people in expensive suits with dorsal fin slits in their jackets began circling. The smell of venture capital in the water brings a very special kind of shark to the banquet: high finance folks who know nothing about the business in question and everything about the value they can squeeze out of it if they can get themselves into the loop.

Datacash's angel investor had bought a stake in an executive recruitment company who were into placing high-end officers with other corporations — the sort of headhunters who needed to fill just five to ten employment slots a year to keep a company of fifty people in Aeron chairs and Armani. They weren't doing too well, but they understood the business of looking businesslike, and they had lots of City connections, and they could see the value proposition inherent in being on the board when they were acquired as a shell by a dot-com run by a bunch of clueless yokels.

A seemingly endless string of empty suits turned up in odd niches on what passed for our org chart; meanwhile, much discussion of stock options and promotions all round fizzled out among the developers with a lousy off-the-shelf option scheme pre-approved by the Inland Revenue, and worth maybe 20% of a year's salary when they vested after five years. At least, that's what the Morlocks in the development group in Edinburgh got offered; Gavin and Dave as co-founders did somewhat better, but most of the shares seem to have vanished into the pockets of the strangely abstract creatures of pure money who appeared out of nowhere three months before the reverse takeover and vanished once the liquidity fountain ran low on juice.

Datacash went public in January of 2000 (with a market cap of something like £32 million) and all I got was this imaginary tee shirt. Well: that and £4000 of shares: just in time for the dot-com bubble to burst and for them to spend the next three years in the tank. I was pretty much burned-out by then, feeling somewhat aggrieved and totally not on speaking terms with our smooth-talking CEO, who I understood to have been rolled by the snake oil salesmen from the City. And so I figured that it was time to move on.

About five years earlier, another acquaintance on the Edinburgh dot-com circuit had tried to recruit me. Andrew had started up another web consultancy in 1996; smaller and less successful than FMA, his client base was also more diversified. By January 2000, NSL Internet had become a thriving web consultancy and colocation provider, and Andrew and his partner were planning on taking the company public via an IPO on AIM backed by 3i. We'd stayed in touch, and now Andrew made me an offer I couldn't refuse. Post-IPO, his plans included setting up a software development division to produce commercial web-related applications. Would I like to set up and run the new development group? I'd figured out by this time that I probably couldn't be any worse at managing a development team in a dot com than Dave ... so in a fit of insanity I said "sure".

So it was that in late January or early February of 2000 I went to visit Dave and, in no uncertain terms, burned my bridges. I was at that point on three months notice either way (I was sufficiently central to Datacash that in 1999 their early due diligence documents had listed losing me as a critical risk to the company). I was okay with serving out my notice period, but a lot of my work could be done from home, and it was made fairly clear that I wasn't expected to turn up at the office any more. (Hints that your employer doesn't want you to turn up at the office: every time you go in, your PC has been dismantled and moved to a different cubicle and someone else is in your old one.) Well, I was okay with that. I was feeling burned out, and I could pretend to work from home just as long as they kept paying me not to turn up at the office. It was quite restful, actually.

But then the bad news arrived. The dot-com bubble burst on March 10th, 2000. There had been warning signs for a while, did I but have the wits to read them. Datacash was probably one of the last British dot-coms to make it to the relative sanctuary of a listing on AIM. With almost mathematical precision, the dominos began to fall: Andrew ran into trouble as 3i caught wind and backed out of NSL's IPO. He'd had an ambitious growth program under way, and was counting on a successful IPO to raise the capital he needed to keep growing — and indeed to maintain the operation. NSL hit the buffers in April, and the result was extremely messy: instead of an IPO there was a hostile takeover under threat of liquidation. Andrew resigned, having negotiated terms such that the purchaser of his company would provide some degree of job security for his employees: Edinburgh is a small city and he knew he might end up re-employing them in a future venture.

But as for me ... there I was, resting up at home as the growing thunderclouds of bad news gathered overhead. I'd been participating in a game of musical chairs — and the music had stopped, just as a recession hit the industry I'd been working in.

I was about to take the biggest gamble of my working life so far.

(To be continued ...)



38 Comments

1:

More! More! More!

Definitely the stuff for some nonfiction book!

2:

Thanks. That gives a beautiful example of non-Y2K-compliant code to explain the issue to a non-believer.

3:

The really important point is that Y2K code was an issue in software written in the late 1990s -- long after it was known to be a problem. (I'd first read Yourdon on the subject in the early 90s. Before he went off the deep end.) And that folks who knew about it could still perpetrate it. "It's just a demo." Words to hang yourself with ...!

4:

That brings back memories. In 1998/99 I worked for a large retailer's POS systems. I remember fixing almost that exact bug. It was in the code that created the data file containing the transaction record transmitted by the stores to the central office. If it had not been fixed, all transactions for a 300 store chain would have been corrupted.

This was code written in the early eighties, though. Fortunately, we caught all the bad ones, though they had the odd report with "Jan 1, 19100" in the header.

5:

Strange selachimorphean people in expensive suits with dorsal fin slits in their jackets

I came so close to spraying my screen and desk with breakfast there.

6:

Y2K:

I was then running the Systems department of my ISP.* We were pretty sure we'd fixed all of our affected code (updated routers, dialup servers, authentication and logging software, reviewed all our website and batch scripts, etc. etc.) but I was one of the volunteers to stay up just in case and make sure things were still working.

At midnight, I had a browser open to the official US Time site, http://www.time.gov/, run by the National Institute of Standards and Technology, to watch the rollover:
23:59:59 December 31, 1999
00:00:00 January 1, 19100

* I had started it as President, gradually melted down under several different kinds of pressure, and the previous year decided to hand the wheel over to someone else rather than take the ship down with me.

7:

I've learnt the hard way that it's never just a demo, no matter how clear you (think you) are about it to the management...

8:

Paragraph 8 ± 1: s/built to to/built to be/

Great reading, especially the life-lessons such as "it was just a demo..." which we (almost) never fully learn. Do please keep the anecdotes coming!

9:

Christopher: not many more pieces to come. In fact, the next installment (which I wrote this evening and will post on Monday) is the final one in my pre-writing autobiography. And I'm not ready to autobiographize my current career track just yet!

10:
I was at that point on three months notice either way (I was sufficiently central to Datacash that in 1999 their early due diligence documents had listed losing me as a critical risk to the company).

Please accept my apology in advance, but morbid curiosity moves me to ask: were there others within the company accorded such status, or did Datacash have a bus number of one?

11:

On the subject of the bus number business: I think Sam Kington may have been in a similar position (he single-handedly wrote the customer-facing reporting system), and Gavin and Dave certainly were (CEO and CTO respectively). Other folks, not so much: but if any of the four of us had been hit by the proverbial omnibus prior to September 1999 (or even after it), things would have been very dicey for the survivors.

12:

Charlie, seems like you were the one to keep the engine running, but didn't get a whole lot of thanks for it.

I would guess most of us readers here have been in similar situations (I know I have). Sometimes I wonder why people keep doing this, giving up large chunks of their life doing a critical-part job where there is not a lot of reward for it...

13:

Alex: Charlie, seems like you were the one to keep the engine running, but didn't get a whole lot of thanks for it.

In the medium term, I got HALTING STATE. Which, I suspect, will be worth more in the long term than any sane amount of stock they might have bestowed on a mere artisan programmer.

14:

"Strange selachimorphean people in expensive suits with dorsal fin slits in their jackets"

There are whole industries which get ripped off by those types.

We are the Underpeople.

(In my case, Estate Agents... Is it any wonder I have a selection of anarchist songs on my mp3 playlist?)

15:

Charlie,

Here's what I can't understand from the write-up. You knew you were key to the viability of the business, and that bundles of shares were being cast around with the IPO. Why didn't you make them an offer they couldn't refuse? "x% of the company, or I walk out the door"

OK, maybe you were too nice, but you sound like you weren't exactly friendly with the board and are too switched on to get stiffed like that. Particularly with Y2K around the same time and thus code and programmers on an all time high.

I've seen similar things in the past, key individuals getting fractions of a percent of the value, and just wonder why YOU didn't play hardball?

16:

I've seen / caught the occasional Y2K bug since the millennium - amazing how many developers regress immediately to old habits.

And yes, we're still producing demo software that then gets sold onto customers. Although, to be fair, the customers rarely want to pay the costs, or wait the time, for non-demo quality software.

17:

Ian @15: it simply didn't occur to me.

18:

Charlie @ #3:

Printing years as "19%d" is no worse than the absolute worst Y2K "work-around" I have ever seen: "%s%d", (year

Essentially, year is the current year - 1900 (so in 2009, year is 109). The above-code ONLY works between 1950 and 1999. Written in 1997, by someone actively trying to avoid introducing Y2K issues.

19:

Argh. The arguments to the format string should've been:
(year < 50 ? "20" : "19"), year

20:

Charlie @17:
Yep, that seems to be the usual reason. Knowledge based businesses pull the wool over key individuals' eyes and give them pennies while entirely replaceable (and planned to be replaced) managers make a killing.

I remember seeing it with one business where the key team members, who had basically the entire value of the business in their heads and could walk out the door, ended up sharing 0.5% of the total value.

There's probably a viable business, in better times, coming in on behalf of the key workers and negotiating on their behalf during IPOs, MBOs, takeovers, etc. for a share of the value recovered.

I wonder if there are any stories of IPOs being scuttled because the key workers got a clue and demanded their share, or indeed finance geeks being forced to hand over much bigger percentages of the loot? Seems so obvious it's suspicious you never hear of it....

21:

Ian: I should probably add that I'm currently earning more as a novelist than I ever did as a programmer. Admittedly, wages for programmers are lower in the UK than in silicon valley, and lower still in Edinburgh than in the South East. And also admittedly, I'm more highly motivated (I get to keep what I earn) and I'm successful (my entire back-list has stayed in print, so far). But yes ... the techies tend to get the shaft because they've spent years learning the tech, rather than learning how to stroke the ego of the senior management who hand out the bonuses.

22:

Charlie @21

In the situation you describe you should have been able to clear £500k out of a £32m IPO - utilising the right 'leverage' and threats. Only 1.5% of total value. That would have helped a lot in setting you up to be a novelist I'll bet?

I do think techies are purposely kept from thinking such thoughts, and as a consequence aren't valued as highly as they should be by senior management. To paraphrase Ferris Bueller, 'cause you can't respect somebody who kisses your ass'.

Maybe I should give something back to society by going out to 1st year S&T undergrads and explaining a few facts of techie life to them; understand and nurture your value; understand business and balance sheets; pay scales are designed to give an excuse NOT to reward you; always consider if you should fire your manager; there is no such thing as security; networks are key; a moving target is harder to hit, etc.

Someone should be teaching it, a one hour lesson is all it should take.

23:

Sure Ian,

but I doubt I would read the books of a smug techie who got rich with a dot.com with the same enthusiasm that I read the books of a guy who actually has to make sure people read what he writes.

24:

"It's just a demo"

Happened to me too, I wrote a utility for a small group of people to solve a problem, didn't know they were still using it (the whole department now) 9 months later. Guess where the ONLY Y2K bug hit?

25:

I worked for a Value Subtracted Reseller around 2000 (like a Value Added Reseller, but for negative amounts of Value) that made a killing pre-2000 selling new computers and networks to nervous business types -- Replacing some ancient Netware 2.11 server with the then current Small Business Server offering from Microsoft or similar.
No Linux though, it was a Microsoft "Partner" (trying to point out that they could sell Linux servers and support without having to pay Microsoft gave the jumped up accountant-turned-salesman a confused look).

They went bust a couple of years later since they had nothing else to offer (and didn't pay employees' superannuation requirements, the bastards)

26:

Ah... Y2k. Easiest 800 quid I ever made (on-call bonus). And if my boss of the time is here, I was at home waiting by the phone completely sober ;-)

I actually saw a Y2k bug that made it through the net: HP-UX screwed up 29th Feb 2000. The day ran 1/365 too fast (or was it 1/366, can't recall). Which resulted in jobs running a couple of minutes early after a few hours (I contracted to a London black cab company, so we noticed that kind of thing).

HP denied it was a Y2k bug, of course. "No, that's a 29 February issue." That only happens in 2000.

27:

Of course it's not the Y2K bug, it's the Y400*n bug. Although I imagine, that the Y100*n bug would have been worse as far as leap years go. If the .com bubble had been in 1899, I don't know how many people would not have realized that 1900 would *not* be a leap year in addition to changing from the 18xx to the 19xx.

With all the dangers that leap years present, I can't imagine though, why nobody takes my warning seriously that unless we change our calendar, we will be 3 days ahead of the real date by the year 10,000 ... so maybe 4000, 6000 and 10000 shouldn't be leap years after all.

;)

28:

Charlie "There had been warning signs for a while, did I but have the wits to read them"

I seem to recall using the words 'tulip craze' to you some time about then. Do I win £5?

On Y2k: (1) it did for the US company which had the global lock on ballistic imaging machines, leaving a gap through which their junior competition (now monopolist...) Forensic Technologies, could step.

(2) Jan 1st, 2000, 10am on the north Norfolk coast. Just out to sea, an RAF Tornado flew east-west, very close to an AWACS, at about a thousand feet. I imagined the conversation the week previous: "We need someone to check if it's still working. So it's no hangovers for you, you, and you."

29:

Re: "Halting State" moments ...
Is THIS:
http://news.bbc.co.uk/1/hi/technology/8129261.stm
An "Unwiring" moment (from your collection "Wireless") or am I imagining things?

30:

I remember an IBM Mainframe having a 32nd January in 1992 (I think, or possibly 1988). Presumably there was an 'off by one' error in the 'add an extra day in a leap year' code. The Y1.992K problem.

31:

http://www.theregister.co.uk/2009/07/06/police_headcam_fireballs/

That looked like a Halting State moment, but turned out to be slightly less exciting.

32:

haha i remember fixing that exact same perl bug in 99 along with a mess of other localtime ones. I also remember sitting around at 12:01am on 1/1/2000 with a bunch of engineers trying to make sure that 1:) the site didn't go down and 2:) the world was not crashing. Certainly we took it quite seriously.

The .357 I got as a final piece of insurance still has a "Y2K Ready" sticker on the butt...

33:

Have been reading Atrocity Archives and Jennifer Morgue along with the autobiography of this part of your career. It's made your novels seem "realistic" (if that makes any sense).

I worked one day in a government agency and literally could not go back for a second day. You have truly captured the horror of this type of work.

Great stories, great bio.

34:

The Y2K Perl sample code brought back a flood of memories. I still did a lot of work in Perl at the time, and there were a few places where I'd done the same thing that needed fixing.

But what I remember most about Y2K were the less technical amongst my friends and family being the most panicked and some of them not getting why I did *nothing* to prepare for it. I didn't think for a minute there would be any major problems, I knew how much prep had been done.

I wasn't on duty that night - I'd quit my last job and had the next lined up to start in January, so I was taking it easy.

35:

It is valuable to prepare for the next internet industry boom and bust cycle by reflecting on lessons learned from Charles Stross's exegesis. We are in another boom, as the valuation of Facebook indicates. Just as there were lessons learned in the Great Depression which could have reduced the severity of the current Global Recession, and lessons learned from Robert McNamara's war in Vietnam that could have prevented Donald Rumsfeld's war in Iraq. Those who do not know History...

37:

You know what annoys me? When people now talk about Y2K as if it was all snake oil sold by bullshitting contractors. It wasn't, it was a real problem, as serious as described, and we all dodged a bullet by working our arses off too. Fuck.

38:

I worked as a programmer for one of the major high street banks during this period. Charlie, I can completely sympathise with you over the stress of reconciling transactions when statements throw up an error. Finding the errors and proving why they will not happen again to satisfy testers, analysts and the business can be a lengthy and gruelling process.

Banking IT can seem a strange world to outsiders as there are lots of business analysts, project managers and process co-ordinators to meet until you find the technical people who know how it all works. The bank has thousands of people working for its IT section out of which just a handful can probably answer deeply technical questions relevant to a business unit and its activities. This is why site visits can be useful because you get to meet people who you might never have known existed.

Y2K was a severe problem and I recall stats suggesting that a significant proportion of the bank's systems would experience total failure. The majority of work was done in '98 with '99 there to pick up any pieces and do vast amounts of testing and re-testing. Y2K was a success because it was a programme that was properly funded and the management completely supported it. Projects which are underfunded or have continually changing requirements litter the world, Y2K was not one of them.