
Michael Walker trained a Markov chain with the King James Bible and Structure and Interpretation of Computer Programs, a classic computer science textbook.

The result is King James Programming:

And Satan stood up against them in that day, and leap for joy: for, behold, your reward is great in heaven: for he maketh his sun to rise on the evil and on the role of procedures in program design.

22:14 The mouth of strange women is a deep and wonderful property of computation.

In APL all data are represented as arrays, and there shall they see the Son of man, in whose sight I brought them out

This was not, obviously, silly enough for my tastes, so ...

Half an hour on CPAN and in vim, and then some discreet dumpster-diving in the nether reaches of the internet, brought me three things:

  • A dodgy copy of the complete works of H. P. Lovecraft
  • The text of the King James Version of the Bible
  • And the first code I've written in, oh, close to two years (please go easy on me)

Here it is:

    #         FILE:
    #        USAGE:  ./ 
    #      VERSION:  1.0
    #      CREATED:  05/12/2013 20:08:15 GMT
    #     REVISION:  ---
    use strict;
    use warnings;
    use Algorithm::MarkovChain;
    use Path::Class;
    use autodie; # die if problem reading or writing a file

    my @inputs = qw(king_james_bible.txt lovecraft_complete.txt); 
    my $dir = dir(".");
    my $f = "";
    my @symbols = ();
    foreach $f (@inputs) {
        my $file = $dir->file($f);
        my $lcounter = 0;
        my $wcounter = 0;
        my $file_handle = $file->openr();
        while( my $line = $file_handle->getline() ) {
            chomp ($line);
            my @words = split(' ', $line);
            push(@symbols, @words);
            $lcounter++;                    # count lines as well as words
            $wcounter += scalar(@words);
        }
        print "$lcounter lines, $wcounter words read from $f\n";
    }

    # Seed the chain with the combined word list from both files
    my $chain = Algorithm::MarkovChain::->new();
    $chain->seed(symbols => \@symbols, longest => 6);
    print "About to spew ...\n";
    print "---\n\n";
    # Emit 20 passages of 40 words, each starting from "the"
    foreach (1 .. 20) {
        my @newness = $chain->spew(length   => 40,
                                   complete => [ qw( the ) ]);
        print join (" ", @newness), ".\n\n";
    }

Yes, it's a Markov chain generator, seeded with the King James Bible and the complete works of H. P. Lovecraft. Sample output:

krina:markov charlie$ ./ 2> /dev/null
99820 lines, 821134 words read from king_james_bible.txt
16536 lines, 775603 words read from lovecraft_complete.txt
About to spew ...
    the backwoods folk -had glimpsed the battered mantel,
    rickety furniture, and ragged draperies. It spread over it a
    robber, a shedder of blood, when I listened with mad
    intentness. At last you know!At last to come to see me. Now
    the absence of any real link with that of 598 Angell Street
    was as the old castle by the shallow crystal stream I saw
    unwonted ripples tipped with yellow light, as if those
    depths of their rhythm. The training saved them.
    the bed, and make thee borders of gold with studs of silver.
    1:12 While the case histories, to expect. As mental
    atmosphere. His eyes were pits of a hundred and fifty
    shekels, 30:24 And he laughed mockingly at the village
    the commandment of the room; then this. If this thing. 25:1
    If he had no way to turn either to the coyote - or to
    something was wrong. Marsh and Marceline represents. I am
    strong. 26:16 I also in me. 14:2.
    the ghouls, whose utter strangeness and their backsliding, I
    will love him, and have redeemed them, yet thou never gavest
    me a people: 8:11 And I said unto them, and I believe that
    the king doth behold the upright. 33:2 Thus.
    the gleaming sand, bobbing lanterns. The Philistines be upon
    thee, and because the famine in the heart proceed evil for
    Israel, with hesitancy, and which I had known it, to
    himself, he said, How shall depart from his house. 7:2 That.
    the results we learned that no harm him, and rent it. 7:22
    My face again no not to inform me, even all the heads of the
    unutterable consequences. It could tell, it thunders. The
    thing came out of Egypt. Who knoweth.
    the grass-grown line on the glassy, phantom bones. 50:18
    Therefore the children of Israel dedicated the sea, diverse
    and I hung an air of the war, to rest in my brother for
    nought, and the counsellor, and the cunning workman, and.
    the great hill that put bitter weeping; Rahel weeping for
    Tammuz. 8:15 As it fastened his body to the dead youth who
    would "go the king lifted up his Son of Professor George
    Saintsbury - "the criminal is securely strapped to.

As you can see, the output is pretty crude. Obviously this was a half-hour hack, not a properly finished product; but I think it shows promise — His eyes were pits of a hundred and fifty shekels — and a definite feel of familiarity — It spread over it a robber, a shedder of blood, when I listened with mad intentness.
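
For the curious, here's roughly what a word-level Markov chain generator does, stripped to the bone. This is a minimal sketch in core Perl, not how Algorithm::MarkovChain is actually implemented: the names build_chain and spew_words are made up for illustration, the context is fixed at two words, and the toy input stands in for the real corpora.

```perl
use strict;
use warnings;

# Build a table mapping each run of $order words to the words seen after it.
sub build_chain {
    my ($order, @words) = @_;
    my %next;
    for my $i (0 .. $#words - $order) {
        my $key = join ' ', @words[$i .. $i + $order - 1];
        push @{ $next{$key} }, $words[$i + $order];
    }
    return \%next;
}

# Walk the table: start from $start (assumed to be exactly $order words)
# and keep appending a randomly chosen follower of the last $order words.
sub spew_words {
    my ($chain, $order, $start, $length) = @_;
    my @out = split ' ', $start;
    while (@out < $length) {
        my $key = join ' ', @out[-$order .. -1];
        my $followers = $chain->{$key} or last;   # dead end: stop early
        push @out, $followers->[ int rand @$followers ];
    }
    return @out;
}

# Toy corpus (an assumption, standing in for the two text files).
my @words = split ' ', 'the thing came out of Egypt and the thing came out of the sea';
my $chain = build_chain(2, @words);
my @passage = spew_words($chain, 2, 'the thing', 12);
print join(' ', @passage), "\n";
```

Because followers are stored once per occurrence, picking uniformly from the list already favours the more common continuations; no separate probability table is needed.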

Stuff to do: fine-tune the parameters of the Markov chain output, pick different seed words, possibly filter out chapter headers, titles, and verse numbers, possibly scan the output for sentence-shaped lexical chunks and top and tail them (capitalize and terminate properly).
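
The cleanup steps in that to-do list could look something like this. It's a rough sketch only: strip_verse_numbers and top_and_tail are hypothetical helper names, the verse pattern assumes markers of the form 22:14 standing alone as words, and the list of dangling words to trim is an arbitrary guess.

```perl
use strict;
use warnings;

# Drop chapter:verse markers such as "22:14" from a list of words.
sub strip_verse_numbers {
    return grep { !/^\d+:\d+$/ } @_;
}

# "Top and tail" a chunk of text: trim whitespace, strip trailing
# punctuation and dangling function words, capitalize, add a full stop.
sub top_and_tail {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    1 while $text =~ s/[\s,;:.-]+$//
         || $text =~ s/\s+(?:and|or|but|the|of|to)$//i;
    return ucfirst($text) . '.';
}

my @raw   = split ' ', 'the thing came out of Egypt. 7:22 Who knoweth, and.';
my $clean = top_and_tail(join ' ', strip_verse_numbers(@raw));
print "$clean\n";   # The thing came out of Egypt. Who knoweth.
```

Filtering the verse numbers out of the input before seeding the chain, rather than out of the output, would also stop them acting as links between unrelated passages.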

I wonder: if I run it for long enough, will it emit a fully-formed draft of the Necronomicon?



This is wonderful stuff, thank you for making a fairly dull Saturday a whole lot more amusing, and especially for posting the perl to go with it. It's reminded me how much cool pre-written and easy to use code there is out there.


"will it emit a fully-formed draft of the Necronomicon?"

Stop now - before it's too late!


I think this is an excellent idea and with a few tweaks (matching and removing things like "and.") it could be a really interesting generative Laundry training manual generator!


You cannot comprehend the magnitude of the forces you are meddling with!


The author of King James Programming seems to have an inclination towards Lovecraft as well and mixed it up with the programming book:

'Feeding a Markov chain with CS and Lovecraft: "The recommended method of storage makes use of the great bronze statues on the walls"'


So his tools are mostly in Python, not perl, but you may find the research of Nick Montfort (who for all I know you have met already) on prose generation (including interactive prose generation) interesting:

Relevant examples: "Curveship is an interactive fiction system that provides a world model (of characters, objects, locations, and things that happen) while also modeling the narrative discourse, so that the narration and description of the simulated world can change. Curveship can tell events out of order, using flashback and other techniques, and can tell the story from the standpoint of particular characters and their perceptions and understandings." [this is in Python]

Post about Nanogenmo (National Novel Generation Month):

World Clock (his Nanogenmo entry)

This is not useful but amusing

This BASIC program on the VIC-20 generates two quotations from Samuel Beckett novels:

(full disclosure: I organized Récursion and its ancestors, as well as @party)


Squee! That is all.


But how do you make it maintain a consistent theme over three volumes of a trilogy?


Feeding a Markov chain with CS and Lovecraft

I nearly misread that as C.S. Lewis and Lovecraft. That might make Lewis interesting.

Anyhow, the output seems fairly smooth, though disjointed in spots. Reminds me of one reason I've never been able to get into Lovecraft's writing: he seemed to try for a style similar to the KJV, which I suppose he considered high literature?


I wonder: if I run it for long enough, will it emit a fully-formed draft of the Necronomicon?

I think you're more likely to get the complete works of Cormac McCarthy!



Adding in some of Shakespeare's histories might kick it up a notch.


I think it's been exceeded but it probably was high literature, or at least a partially successful attempt.


Mix in a grimoire or two, chatbot with a voice synthesizer and recognition and you are well on your way to Computational Demonology 101.


"The thing came out of Egypt. Who knoweth."

There is excellence here.



If it was CS Lewis and the Gita would you use Emacs?


I know this algorithm by the name Dissociated Press. Wonderful stuff.

For further tuning, you might consider weighting one source corpus more heavily than another. Say, 75% KJV and 25% H. P. Lovecraft.
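
One way that weighting might be done, sketched under stated assumptions: count each word pair from a corpus a fixed number of times (3 for one source, 1 for the other) and pick followers in proportion to the weighted counts. That only approximates a 75/25 split when the two corpora are of similar size. The names add_corpus and pick_next are invented for this example, and the one-word context and toy inputs are simplifications.

```perl
use strict;
use warnings;

# Add one corpus to a first-order transition table, counting every
# observed word pair $weight times.
sub add_corpus {
    my ($table, $weight, @words) = @_;
    for my $i (0 .. $#words - 1) {
        $table->{ $words[$i] }{ $words[$i + 1] } += $weight;
    }
}

# Pick a follower of $word with probability proportional to its weighted count.
sub pick_next {
    my ($table, $word) = @_;
    my $followers = $table->{$word} or return;
    my $total = 0;
    $total += $_ for values %$followers;
    my $roll = rand $total;
    for my $candidate (sort keys %$followers) {
        $roll -= $followers->{$candidate};
        return $candidate if $roll < 0;
    }
    return;   # not reached when all weights are positive
}

my %table;
add_corpus(\%table, 3, qw(the king doth behold the upright));   # stand-in for the KJV
add_corpus(\%table, 1, qw(the ghouls beheld the king));         # stand-in for Lovecraft
my $next = pick_next(\%table, 'the');
print "the $next\n";
```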


Oh, and doing it by character instead of word (with match windows of about 5 characters) can have nice results as well, including some fun neologisms.
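
A character-level version with a five-character window might look like the sketch below. It illustrates the idea rather than any particular Dissociated Press implementation; build_char_chain and spew_chars are hypothetical names and the sample text is a toy stand-in.

```perl
use strict;
use warnings;

# Map every $window-character substring to the characters seen right after it.
sub build_char_chain {
    my ($window, $text) = @_;
    my %next;
    for my $i (0 .. length($text) - $window - 1) {
        push @{ $next{ substr($text, $i, $window) } },
             substr($text, $i + $window, 1);
    }
    return \%next;
}

# Extend $seed (assumed to be at least $window characters long) one
# character at a time until $length is reached or there is no follower.
sub spew_chars {
    my ($chain, $window, $seed, $length) = @_;
    my $out = $seed;
    while (length($out) < $length) {
        my $followers = $chain->{ substr($out, -$window) } or last;
        $out .= $followers->[ int rand @$followers ];
    }
    return $out;
}

my $text       = 'and the thing came out of Egypt and the king came out of the sea';
my $char_chain = build_char_chain(5, $text);
my $babble     = spew_chars($char_chain, 5, 'and t', 40);
print "$babble\n";
```

Since the window can straddle word boundaries, the walk can splice the front of one word onto the back of another, which is where the neologisms come from.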


I think for maximum effect I'd at least eliminate all the genealogies in the Old Testament and perhaps focus on Isaiah, Daniel, Leviticus, Psalms, and Revelation. Maybe get rid of all of the Gospels except John.


This is the real reason that Bob Howard got the attention of the Laundry.


Yeah, tossing the Gospels and the Epistles is probably a good choice. Too much there about salvation and virtue. A few begats scattered in might be OK. "And Amos begat Saul on the frozen plain" sort of thing.


Once it's appropriately tweaked to production-grade Necronomicon text, bang out a copy in printer's pairs in appropriate German Fraktur or similar textura script, seeded with illuminated versals, and I can get it to some good medieval-type bookbinders to cover it in disturbing leather.


There's a Mr. Angleton for you on line 2; he says it's urgent ...


This may very well be the process that extruded the most recent Dan Brown novel. Tweak the input a bit to be more parameterized and you may also be able to generate the works of Clive Cussler and Tom Clancy.



I've bored through press releases and news articles that made even less sense than that. And presumably people got paid to write those...


The Book of Mormon is another excellent source for high-grade religious gibberish. No offense intended to anyone.


Actually, Charlie - if you get this going well, you could (in a much nicer way) emulate Elron... Start a new religion & make loadsamoney!

And, after all, if you want more smiting & burning in the hells etc, there's always "the recital"

Or maybe not such a good idea?


Here's a much older (1990?) version of the same idea, produced using the posts to the Usenet groups comp.unix.wizards and soc.christian (and possibly alt.evil as well):


Nicely done. If you'd like a slightly nicer interface, or the ability to deal with more input text, consider checking out the "Hailo" module. It's a Markov-chain chatterbot backend that some of us have been working on that's capable of online learning pretty much forever without running you out of RAM like most of the alternatives. It also comes with its own tokenizer, so using it is about as easy as use Hailo; my $hailo = Hailo->new; $hailo->learn(@lines); print $hailo->reply;


Since the late 90's one of my bon mots about Perl has been "Did you know the Necronomicon was written in Perl?". I can't believe you made that real.

Fortunately this copy will hold no power since your invocation was done with a blessed editor rather than the Editor for Manipulating Accursed Circles of Summoning.


Then there is the DaDa engine:

I read some of this out loud to my dad when he asked what I was laughing about. After I was done he read a quite similar excerpt from the William Faulkner novel he was reading.


1- Be famous : check 2- Start a new religion : check 3- World domination : incoming...

For further fun, I'm waiting to see what you could do with a 3D printer ... what's the use of a new religion without new idols and demons?


Very nice!

In another sign that the Great Old Ones are stirring, all I have to say is:


Mr. Stross:

I do think this sort of cyber-sorcery is a promising avenue of exploration, if not the holy grail which alchemists, poets, prophets and programmers have sought for centuries. However – and I’m sure this warning is unnecessary for one such as you – keep in mind that by mixing words and memes in this manner, you are tampering with the very fabric of reality, and risk rending it irreparably. Proceed with caution!


Read "That Hideous Strength" and you would not be surprised by the coupling.

In fact the themes in the book would fit quite well with the Laundryverse.

"The story involves an ostensibly scientific institute, the N.I.C.E., which is a front for sinister supernatural forces." Sound familiar?


Pity Lingua::Romana::Perligata doesn't provide a translation from perl, for further effect.


If you feed it Structure and Interpretation, the collected Lovecraft, the King James Bible, and Atrocity Archives, could you give us a new Laundry story every week?


I would happily pay good money to own a copy of the finished product (Necronomicon or no). This type of thing is right up my alley.


This makes me want to make a markov-like variable rewriter, to modify say C code, and feed the Linux kernel source and Lovecraft in.


Somewhere in the back of my mind, a tiny voice is screaming "nooooo!", and towards the middle is another screaming "UN War Crimes Tribunal!", but what the heck.



I suspect that if you did that then you'd be getting infinite copyright + patent lawsuits from Micros**t for infringing on their method of generating new Windows versions: Markov-chain the old version and the complete Lovecraft, and, if it compiles, it ships!

What misbegotten program provided the initial seed of C, we will never know.


Not on topic - but feeling like Halting State :

(you can read it using "porn mode" if you are above your monthly page count)



I'm just trying to contemplate the resulting confusion should they try this in the shark pool that is EVE Online.

Though I suspect the underlying culture of paranoia there might make them feel right at home.

In other news, I finally shelled out for the bulk of the Cubicle 7 game treatment of the Laundryverse this weekend. I like it, really nicely done stuff.


Or read the non-paywalled Guardian story here.

This, incidentally, is why I will not be writing the third book in the [projected] Halting State trilogy -- ever. Not because I don't want to -- I'd really love to have been able to make it a trilogy -- but those books took too long to write: book #3 was going to take me 2 years, so I was forced to shelve it back in autumn of '12. If I started writing it right now it wouldn't be ready to hand in until December 2015, and couldn't be in print before late 2016. And it'd be in continuity with a novel -- Halting State -- set in a 2017 that was already obsolete.

That doesn't mean that I won't be writing any more near-future Scottish thrillers. Quite the contrary: I'll be discussing future projects with my editor at Ace some time next year, and another one will be on the agenda. But it's not going to be a direct sequel, like "Rule 34" was.

If you think I ought to be writing some kind of near-future circa-2020 panopticon surveillance state thriller in the wake of the Snowden revelations ... well, you'd be absolutely right. And it's the new Merchant Princes trilogy I'm working on right now. I just hope it isn't obsolete before it's finished ...


The Merchant Princes (as well as everything else they are) are also "alternate reality" aren't they? $Thing being "out of date" here just means that the timelines aren't perfectly synchronised.

Oh and good news about more Scottish near future stuff (for the degree of realism of place even if we can't get another Liz Kavanaugh).


Sorry, charlie, but this is just actual proof that you are just dabbling into Evil and still need to learn with a true EVIL OVERLORD(tm).

TRUE EVIL(tm) would not have written a novel, TRUE EVIL(tm) would have applied for a software patent, then sued NSA, GCHQ and the like for breaching their copyright. Oh, and afterwards still write a novel for the evulz. BTW, who's to say the agencies didn't get the idea after reading through the nominations for the Hugo and Locus[1] awards of 2008?


[1] Err, I first read that one as "locust". Strange; though like any (part time) metalhead I kinda liked Revelations, I'm not that much into it. But I think helping a friend writing on those critters is to blame...


On another note, besides wondering who "krina" is:

99820 lines, 821134 words read from king_james_bible.txt 16536 lines, 775603 words read from lovecraft_complete.tx

I'm somewhat surprised old squidlover's oeuvre is that big, though I guess this omits his letters. No idea about the reworks, though.


Err, why nice? Actually, according to some studies about the lay-analysis/self-improvement part of New Religious Movements, OGH's ideas about fertile recruitment pools for the Order of the Black Pharaoh are not that much off the mark. And being mean to mean and greedy people is so much fun...


besides wondering who "krina" is

My guess would be Krina Alizond-114, heroine of a very recently published novel by a certain writer not too far from these columns.


Why is everybody so fixated on Fraktur for "evil" texts?

Personally, I think somewhat more "flowing" scripts[1] like maybe the Merovingian minuscule are much better suited:


[1] OK, every typographer present is invited to stone me.


Err, I haven't read "Neptune's Brood" yet, can I use that as an excuse?


You mean Tori Amos?


Why is everybody so fixated on Fraktur for "evil" texts?

Because evil texts should be difficult to read and, for those not used to it, Fraktur is exactly that.


Problem is, I'm not that sure how Lewis' underlying ideas, e.g. his strain of Christianity, fit into the Laundryverse, and how they would be seen by the other protagonists.

My first guess is somewhat like Tom Clancy's opinions on Dirty Smelly Hippies, e.g. naive at best, more likely dangerous "useful idiots" for some hostile agency or an Old One doing the costume thing. This is not to say Laundry and Christianity can't go together, Bob's friend shows there is some way, it just says that, to use a stereotype, I can quite easily see a Jesuit in the Laundry, a Dominican, not so much.


Well, that's an explanation, though personally, I always thought it somewhat akin to the notorious Metal umlaut:

Thing is, us fiends are a fickle lot, elaborate in our designs, refined in our taste, jaded in our demands. So while I agree on the idea of using a difficult to read script, I somewhat prefer a more fluid, "decadent" style. ;)


This looks a lot like much of the spam I used to get. I guess those bots used Markov chains and classic seeding texts too.


Dear Mr. Stross:

Using the NSA Precrime Futureview-o-matic to steal my ideas again, hey? Well, I'll see your Markov chains and raise you random passes in and out of English via Google Translate before Markov training/processing! haHA!

I've been screwing around with this sort of thing on and off (mostly off in this decade) since the days of the AltaVista Babelfish, but the Markov chain post on Boing Boing fired up the part of my brain where this particular brand of madness is enthroned and back to work!

Your choice of corpus: brilliant. Currently, I'm fiddling with heavily downvoted youtube comments, 80s television show intros/theme song lyrics, and random advertising copy. Results:

I'm currently working on hooking my script up to a single threaded page crawler and turning it loose for some seriously random textual jazz but I won't have time to complete my 21st Century rendition of Finnegan's Wake Performed Live By The Internet if you invoke the Second Coming of Jeezthulu Our Lord And Slaver.


You put me to shame Mr. Stross. All I could manage was this Shakespearean Insult Generator. Written in PHP. I never could get the hang of Perl.


Someone else may have suggested it, but if it's possible to do a three-way version, one with the KJV, Lovecraft, and SICP would produce results suitable to the Laundryverse.


"Why is everybody so fixated on Fraktur for "evil" texts?"

Well, the whole Nazi thing might be part of it. Certainly didn't help.


Oddly (maybe) my first encounter with Fraktur was earlier this year while cataloging new donations to the small synagogue library I look after. I believe it was a Passover Haggadah in Hebrew and German from the 1920s. It was obviously German, but with characters I didn't recognize, but later found what they are from Wikipedia.

Speaking of Metal Umlauts, I've been wanting to make a fake band t-shirt with Shpëelküß in gothic letters. That's supposed to be Shpilkes: Yiddish for the feeling of being on pins and needles.


And ... because, in Germany, fraktur was falling out of favour, and books were being printed in "normal roman" typefaces ... right up until early 1933. Then, Dolfie stated that only pure German (aryan) text should be used ... and Fraktur was more-or-less compulsory until 1945. So it carries a lot of "associations" shall we say?


err, actually it's somewhat more complicated than that. incidentally, i put a link to the wiki article on the antiqua-fraktur debate somewhere up there.

long story short, broken typefaces were originally not only used in german-speaking areas, they originally stem from northern france. but they persisted somewhat longer in those. when napoleon invaded the holy roman empire, they became something of a sign of resistance against the french, where of course, anti-napoleonic sentiments had their usual stints into reactionary and antisemitic ideas, even the modern non-religious racist.

as seen in the article, later on there was some discussion about phasing them out in favour of antiqua, with the support somewhat in line with the nationalist, which, alas, was quite diverse. the passover prayer book mentioned might be an example.

of course, part of those later wound up with the nazis, and first of there was some support for the fraktur types with those. thing is, this was the voelkish movement, where hitler was quite inimical to them, so in 1941 fraktur types were branded as "schwabach jew letters" and declared obsolete in a letter by bormann. the same letter using a fraktur head might show how this was enforced.

i'm aware of the connotations, i just don't think them quite appropriate. that is not to say some of the more "modern" fractured groteske like e.g. tannenberg scream "the black corps" to me. others, otoh, are more in line with sauerbraten. of goethe.


A perl of great price...


Err, if you need any help with the German or the script, feel free to ask. Though I might need some practice, since Fraktur is very seldom used today; usually it's in a similar vein to the Blackletter fonts, e.g. to give a certain "traditional" feel. And quite a lot of the books from before the 20th century are printed in it, so if you have a little bibliophile streak, you are likely to be confronted with it.

Then, there are some guys who like it for various reasons, e.g. aesthetics, it seems one Hermann Hesse wanted his books printed in Fraktur. On the slightly nutty level, we have one society trying to reintroduce it, arguing inter alia it's more applicable to the German language, there are some special signs common to Fraktur fonts usually not implemented in Antiqua fonts, though they forget to mention the same signs already have Antiqua versions, and even with Fraktur, they were infrequently used. And then there is quite some, err, misaimed fandom...

As for the haggadah, as already said I wouldn't be that surprised if some conservative German Jews stood on the Fraktur side of the Antiqua-Fraktur debate, especially since many German Jews were quite open at demonstrating their patriotism, where it might be of interest that full citizenship for all German Jews AFAIK only came with the German Unification of 1871. Let's just say Germany had a strange way of showing its gratitude...

One of the leading states with emancipation was BTW Prussia, which might be somewhat symptomatic for the somewhat complicated heritage of said country: yes, there was quite some militarism, but also a relative openness to minorities and immigrants, e.g. French Huguenots. And at the end of Weimar, one of the last bastions of democracy was Prussia, which was only abolished through something of a coup d'etat, the Preußenschlag.

As for my personal relation to Fraktur, I think it's part of German literary history, the first Gutenberg bible was set in a somewhat related font, and old books are set in it, but I really can't stand the mysticism around it. And then there is the misaimed fandom, for some mostly harmless examples, I think Rammstein somewhat funny, and have some, err, fond though somewhat intoxicated memories about listening to their first album "Herzeleid" with some metalheads during one late 90s, though in general I can't take "Neue deutsche Härte" seriously, and the joke got old quite fast.

As for "schpilkes", while "auf Nadeln sitzen" or "to sit on needles" is a widely used phrase in Modern German, I'm not aware of this word in contemporary German, though there are plenty of Yiddish terms in German, somewhat more in slangs, dialect or sociolects. Since those are somewhat phased out by High German in many areas, actually quite a few of those have a "folkish" tone to them. It seems like it comes from "spilke", a form of "spille", which actually is a word for a spinning spindle. And the surname corresponding to somebody manufacturing those, e.g. "Spilker", is moderately common in some parts of Germany, though there might be some risk of it running you into trouble in others. Not so much for any antisemitism, but it seems to be somewhat confined to Eastern Westphalia, and there are some jokes about the animosity between those and the Westphalians around Münster. Which are only wholly exaggerated...

Whatever, sorry for the excursion, back to text.


As for the Necronomicon, actually it seems that according to

there was a Blackletter version in the 15th century in Germany, which might have been in Schwabacher or some other forerunner of Fraktur, since Fraktur proper is said to stem from the early 16th century. ;)

Note this was the Latin version by Olaus Wormius, and historically, Latin texts were mostly set in Antiqua even in Germany. And as already mentioned, it was likely no Fraktur, but some other Blackletter font. But I guess a version of the Necronomicon with a similar Blackletter font to the ones used in certain psalters, e.g.

with some uncials and differing text colours for some first letters, likely some miniatures etc. would be quite nice. For the appearance of age and reverence, I might add, not for being "evil", since, as we all know, the Old Ones transcend us mortal apes' notions of good and evil, right and wrong, they revel in ...

Er, sorry, got carried away. Of course, if we wanted to go for a true medieval feeling, e.g. some codex, there would be some other font to use, maybe one of the minuscules like the Carolingian

or the Greek one:

For me, Fraktur would be too modern for this.


Seriously, this needs to be made complete.

The uses and applications are endless!


I think you mean 'abuses and applications'.



About this Entry

This page contains a single entry by Charlie Stross published on December 7, 2013 11:00 AM.

Notes from the coal face was the previous entry in this blog.

PSA: Why there won't be a third book in the Halting State trilogy is the next entry in this blog.
