Charlie's Diary: Doing it wrong

Doing it wrong

Three tenuously-related pieces of news have caught my eye recently.

Firstly: NDNAD, the UK's National DNA Database, run by the Forensic Science Service under contract to the Home Office, contains DNA "fingerprints" for lots of folk: 5.2% of the population as of 2005, or 3.1 million people. Some of them are criminals; some of them are clearly innocent, but were either charged with a crime and subsequently found not guilty, or had the misfortune to be detained but not subsequently charged (that is: they're not even suspects). The Home Office takes a rather draconian view of the database's utility, and objects strenuously to attempts to remove the records of innocent people from it. It took threats of legal action before they agreed to remove the parliamentary Conservative Party's immigration spokesman from the database (to which he'd been added in the course of a fruitless investigation into leaked documents that had embarrassed the government). If senior opposition politicians have problems with it, consider the prospects for the rest of us.

In use ...

Whenever a new profile is submitted, the NDNAD's records are automatically searched for matches (hits) between individuals and unsolved crime-stain records and unsolved crime-stain to unsolved crime-stain records - linking both individuals to crimes and crimes to crimes. Matches between individuals only are reported separately for investigation as to whether one is an alias of the other. Any NDNAD hits obtained are reported directly to the police force which submitted the sample for analysis.
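The matching rules quoted above can be sketched in a few lines. This is a toy illustration only: the record layout, the field names, and the use of exact equality between profiles are all assumptions made for the example, not a description of how the NDNAD is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical layout: a profile is a tuple of allele pairs, one per locus.
Profile = Tuple[Tuple[int, int], ...]


@dataclass(frozen=True)
class Record:
    ref: str          # made-up reference for a person or a crime stain
    kind: str         # "individual" or "crime_stain"
    profile: Profile


def speculative_search(new: Record, database: List[Record]) -> Dict[str, List[str]]:
    """Sort exact profile matches into the two reporting routes described above."""
    hits: Dict[str, List[str]] = {"crime_links": [], "possible_aliases": []}
    for rec in database:
        if rec.profile != new.profile:   # assumption: exact match only
            continue
        if new.kind == "individual" and rec.kind == "individual":
            # individual-to-individual: reported separately as a possible alias
            hits["possible_aliases"].append(rec.ref)
        else:
            # individual-to-stain or stain-to-stain: links people or crimes to crimes
            hits["crime_links"].append(rec.ref)
    return hits
```

Note that nothing in a search like this can tell a genuine crime-scene sample from a planted or fabricated one; it only reports that two profiles agree.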

Now, this in itself is merely a steaming turd in the punchbowl of the right to privacy: but its use as a policing intelligence tool is indisputable. While there are some very good reasons for condemning the way it's currently used (for example, its use in the UK has sparked accusations of racism), I can't really see any future government forgoing such a tool completely; a DNA database of some kind is too useful. So what interests me here is the potential for future catastrophic failure modes.

I'd like to note in passing that the cost and effort required to conduct DNA sequencing is dropping like a stone, following a path faster than Moore's Law: an exhaustive personalized genome sequence can now be had for around $50,000 and a couple of weeks' work. For comparison, back in 1998 or thereabouts the same job took several years and $100M. We're en route to hand-held realtime sequencers within the very near (5-10 year) future. And, aside from medicine, the consequences will be interesting ...
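As a rough check on the "faster than Moore's Law" claim, the two price points above imply a halving time that can be worked out directly. The sketch below takes the dates as 1998 and 2009 and assumes a two-year doubling period for Moore's Law; both are approximations.

```python
import math


def halving_time_years(cost_then: float, cost_now: float, years: float) -> float:
    """Years per halving of cost, assuming a steady exponential decline."""
    halvings = math.log2(cost_then / cost_now)
    return years / halvings


# Figures from the paragraph above: ~$100M around 1998, ~$50,000 in 2009.
sequencing = halving_time_years(100e6, 50e3, 2009 - 1998)
MOORE_DOUBLING_YEARS = 2.0  # assumption: the commonly quoted two-year period

print(f"sequencing cost halves roughly every {sequencing:.2f} years")
print(f"that is about {MOORE_DOUBLING_YEARS / sequencing:.1f}x the Moore's Law pace")
```

On these figures the cost has been halving about once a year, roughly twice the pace of the assumed Moore's Law curve.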

This week sees the publication of a paper that suggests that standard molecular biology techniques such as PCR, molecular cloning, and recently developed whole genome amplification (WGA), enable anyone with basic equipment and know-how to produce practically unlimited amounts of in vitro synthesized (artificial) DNA with any desired genetic profile. See also: faking up a crime scene. Because of the nature of DNA evidence it's actually physically easier to distribute it around a location than it would be to fake conventional forensic evidence such as fingerprints.

Meanwhile, in Australia ... oh, this one almost beggars belief:

Police computer security experts claimed responsibility for taking over the r00t-you.org cybercrime forum as part of a sting operation on ABC's Four Corners TV programme ... The Feds had reportedly configured their own systems as a honeypot designed to track and trace denizens logging into the forum. Police gained access to the forum not through infiltration but after raiding the Melbourne home of the forum's alleged administrator last Wednesday. ... Unfortunately the wheels fell off the scheme, because the officers involved failed to set a password on the database behind the honeypot site.

Yes: they tried to guddle a bunch of hackers and forgot to set the root password on the MySQL database they were using to store the evidence.

Combined with other instances of mind-boggling stupidity this is beginning to convince me that policing and IT security work are incompatible; that is, that the culture, training, and career structure of policing is generally inimical to understanding IT security. The vast majority of police work is about tracking down and apprehending lawbreakers after a crime has been committed; the vast majority of offenses are committed on the spur of the moment by not-terribly-bright folks with poor impulse control: and police are frequently expected to multi-task and deal with multiple cases in parallel. But in the INFOSEC sector the paradigm is turned on its head — it's necessary to carefully consider and plan to defend against attacks that haven't happened yet and to work on the assumption that the attacker is intelligent, tenacious, and has invested a vast amount of effort in advance planning. Even if they haven't, even if the attacker is merely a script kiddie playing with a tool someone else invented, you're up against the inventor's brain rather than the idiot attacking you — the rifle designer rather than the trigger man.

What are the risks of a national DNA database maintained for policing intelligence purposes, once DNA evidence faking becomes possible?

Well, one possibility is that, if sequence information for a named individual can be obtained from the database, your upper class of criminal might well use it to frame rivals — spreading it around the site of a bank robbery or wholesale drug distribution hub, for example.

Another possibility is that if the database is inadequately secured (and with cops waving handheld scanners with live broadband connections around, that's not a wild stretch) we might see some alarming injection attacks on the database, along the lines of short tandem repeat sequences tied to the name and other details of an extremely violent criminal. If you really hate someone and want to fuck them up, stick their DNA in such a database, tagged as belonging to a violent serial rapist or armed robber.

Why do I think this is a problem? Well, the NDNAD is a single, fat, juicy target for hackers: to do its job it must remain accessible to police officers all over the country, which in turn means it has to be online, and therefore difficult to secure. To a wily hacker it's a priceless target: one they can use to both mislead ongoing police investigations and assault their rivals (using the police as a proxy). And the singular nature of the database makes it a single point of failure for the forensic science service.

This leads me to a fairly important conclusion: the can of worms — the hackable, fakable, fallible DNA database — is already here, and the law of bureaucracy says it isn't going away. But it needs to be secured. To do so, it's essential that it not be used as an authentication tool for identifying individuals. Moreover, DNA evidence can no longer be seen as sufficient on its own to secure a conviction in court. Online checks will still have a place — but only if they're used to match individuals against evidence found at crime scenes, and even then, only as an indicator (not as evidence in its own right).

Posted by Charlie Stross on August 20, 2009 1:46 PM

69 Comments

bbot | August 20, 2009 15:36

I recall reading somewhere about how DNA databases don't scale, thanks to the Birthday Problem. There may be a one in a billion chance of a false positive, but if you've got ten million people in your database that drops significantly. This wouldn't be a problem with full DNA sequencing, but DNA databases don't store the full sequence, and will not for quite some time.

Lawrence Osborn | August 20, 2009 16:08

Charlie, you mentioned that ‘Because of the nature of DNA evidence it's actually physically easier to distribute it around a location than it would be to fake conventional forensic evidence such as fingerprints.’ In fact, criminals have known this for years. According to Philip Kitcher in his book The Lives to Income, some prostitutes make extra money by selling used condoms to criminals.

Lawrence Osborn | August 20, 2009 16:11

Aargh! Sorry that should have read The Lives to Come.

Charlie Stross | August 20, 2009 16:12

bbot: the UK NDNAD isn't searchable on full sequence -- it's searched on short tandem repeats -- but samples are retained permanently and linked to records; it's entirely possible that if the cost of sequencing drops by another three orders of magnitude they'll start exhaustively sequencing the samples. Following the current cost curve for sequencing, that's about 8-10 years away.
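A back-of-envelope version of that projection, assuming the decline continues at the same exponential rate as between the two price points given in the post (about $100M around 1998, about $50,000 in 2009). The function name is just for illustration.

```python
import math


def orders_of_magnitude_per_year(cost_then: float, cost_now: float, years: float) -> float:
    """Average drop in cost, in powers of ten per year."""
    return math.log10(cost_then / cost_now) / years


rate = orders_of_magnitude_per_year(100e6, 50e3, 2009 - 1998)
years_for_three_orders = 3 / rate

print(f"about {rate:.2f} orders of magnitude per year")
print(f"three more orders of magnitude in about {years_for_three_orders:.1f} years")
```

That comes out at roughly ten years, at the upper end of the 8-10 year estimate; a curve that is still steepening would shorten it.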

truth is life | August 20, 2009 16:12

Well, the NDNAD is a single, fat, juicy target for hackers: to do its job it must remain accessible to police officers all over the country, which in turn means it has to be online, and therefore difficult to secure.

But does it necessarily have to be online? I'm pretty sure at least some branches of the US government have a parallel internet for sensitive data (the NSA and CIA and such like); surely it would be possible to do something similar with this? Of course, given the expense I know that the government is going to throw it all on the Internet and have done with it.

Charlie Stross | August 20, 2009 16:36

@5: designing it so that it's not online may actually make the job harder. Let's posit that cops carry scanners with an offline copy of the database; samples that go into the database are taken at police stations and treated in accordance with the rules of evidence. We've then got a massive synchronization and replication problem -- the NDNAD is almost certainly quite large. (Even if it's compressed to only 1Kb per record it's a multi-gigabyte database; more likely it's a lot bigger. If it eventually goes to full sequence it's going to be on the order of 10Gb to 1Tb per individual, before compression.) Actually, I'd be astonished if it's less than 100Gb; more realistically, 1-5Tb in size, plus a couple of orders of magnitude more once they add full sequences.

Now consider that there are up to 141,000 police in the UK. Even if only one in ten of them is involved in patrol duties and needs access to the database, keeping everything in sync is going to be a big job ... especially as you still have to restrict access to the database, whether it's online or offline.
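To put numbers on the replication problem, here is the arithmetic using only figures already quoted: 3.1 million profiles (the 2005 figure from the post), 1 kB per record, 141,000 officers, and one in ten carrying a device. Reading "1Kb" as roughly 1,000 bytes is an assumption.

```python
RECORDS = 3_100_000        # profiles in the database (2005 figure from the post)
BYTES_PER_RECORD = 1_000   # assumption: "1Kb per record" read as ~1,000 bytes
OFFICERS = 141_000         # upper bound on UK police numbers quoted above
DEVICE_SHARE = 10          # one officer in ten carries an offline copy

db_bytes = RECORDS * BYTES_PER_RECORD   # size of one full copy
devices = OFFICERS // DEVICE_SHARE      # number of offline copies to keep in sync
fleet_bytes = db_bytes * devices        # data moved by one full refresh of every device

print(f"one copy:        {db_bytes / 1e9:.1f} GB")
print(f"devices:         {devices:,}")
print(f"full re-sync:    {fleet_bytes / 1e12:.2f} TB across the fleet")
```

Even at the optimistic 1 kB per record, a single full refresh of every device moves tens of terabytes, which is why incremental replication (and the access control around it) dominates the design.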

Andrew Suffield | August 20, 2009 17:29

There may be a one in a billion chance of a false positive, but if you've got ten million people in your database that drops significantly.

Let's tack a number on that.

$ perl -e '$n = 1.0; foreach (1..10_000_000) {$n *= (1.0 - ($n/1_000_000_000))}; print 1-$n,"\n"'

0.00990099010875656

(Yeah I know it's inefficient. Easier to pound out the loop than to look up the formula)

About a 1% chance that the database contains false matches at that size.

This is of course complete nonsense, because it's based on the assumption that no two people in the database are related. The probability of false matches on relatives is much, much higher. We do not seem to have good information on exactly how high. So it's hard to say anything about the actual probability, except that it's going to be considerably larger than 1%.
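For comparison, here are the two standard closed-form calculations, under the same unrealistic assumptions as above: every pairwise comparison is independent, nobody in the database is related, and the one-in-a-billion figure is taken at face value. The first is the chance that a single new profile falsely matches at least one of the ten million records; the second is the birthday-problem question of whether any two records in the database coincide.

```python
import math

P_FALSE_MATCH = 1e-9     # per-comparison false-positive chance, as quoted
N_RECORDS = 10_000_000   # database size, as quoted


def p_hit_single_search(p: float, n: int) -> float:
    """P(one new profile falsely matches at least one of n records) = 1 - (1 - p)^n."""
    return -math.expm1(n * math.log1p(-p))


def expected_coincident_pairs(p: float, n: int) -> float:
    """Expected number of coincidentally matching pairs among n records."""
    return p * n * (n - 1) / 2


def p_any_coincident_pair(p: float, n: int) -> float:
    """Poisson approximation to the birthday problem: P(at least one matching pair)."""
    return -math.expm1(-expected_coincident_pairs(p, n))


print(p_hit_single_search(P_FALSE_MATCH, N_RECORDS))        # a little under 1%
print(expected_coincident_pairs(P_FALSE_MATCH, N_RECORDS))  # tens of thousands of pairs
print(p_any_coincident_pair(P_FALSE_MATCH, N_RECORDS))      # effectively certain
```

The 1% figure from the loop is close to the single-search number. On the birthday-problem reading, the same inputs imply that coincidental pairs somewhere in the database are all but certain, even before relatives are taken into account.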

Even if it's compressed to only 1Kb per record

It's 10 small integers plus whatever identifying information they want to add to each record (name, address, ID number?). Probably less than a kilobyte; assume they've stripped it down for this purpose and locked the XML goons in a cupboard.

Replication is a technologically solvable problem (Lotus Notes was doing this stuff efficiently back in the 90s)... but you've accomplished diddly-squat, because now every scanner is a handheld object that you have to steal to gain a copy of the database, and you can't do anything useful with cryptography here. That's easier than breaking into the online version.

You can't secure this thing. Best you could come up with is a handheld "scanner" that just displays the 10 numbers and has no database lookup capacity; the user calls it in on their radio like a numberplate, and somebody in the office runs the search. That's still hopeless, given how many people have bribed their way into accessing police records in the past.

Stephen Harris | August 20, 2009 17:33

Add in your previous concerns about errors in database management (my DNA tagged with your name? The more DNA samples taken the greater the risk of mistake. Handheld scanners lead to almost certain guarantees of mistakes) and a recent high court decision that protects the state from blame for these errors ( http://www.theregister.co.uk/2009/08/05/moj_andre_power/ ) and we've got massive problems even before you start to worry about hackers.

Kurt H. | August 20, 2009 17:41

There was an interesting news bit from Australia five years ago about how easy it is to fake DNA evidence with a PCR machine. It involved getting a sample from someone, synthesizing a bunch of their DNA, and sticking it in a perfume bottle. When sprayed over the crime scene, it overwhelms any other DNA left behind. http://www.abc.net.au/catalyst/stories/s1199805.htm

Robert Sneddon | August 20, 2009 17:43

10:

@6: I think the previous commentator @5 is saying that the database system will not be accessible via the "Internet" but run on a parallel system with separate hardware, routing and connections as military and other high-security systems currently do today.

You may recall the secure financial systems I worked on and the care we took to keep them from being contaminated by physical cross-connections to the public Internet. It's one reason we frowned so much on the idea of wireless connectivity, eventually only permitting it via phone dial-up to our own proxy servers where we could verify the connection's trustworthiness before allowing a remote user to access the company's systems. AFAIK they still don't permit 3G access to the core secure systems.

heteromeles | August 20, 2009 18:00

11:

Interestingly, a story about how to fake DNA in blood evidence came out this week as well, in the New York Times no less. http://www.nytimes.com/2009/08/18/science/18dna.html?_r=1&ref=science

Basically, you take the DNA specimen of your choice, do a genome wide amplification to get lots of DNA (or if you know what the forensics dudes are amplifying, just amplify that stretch). Then you take a blood sample, centrifuge it to remove all the white cells (i.e. remove most or all of the cells containing DNA), add in your DNA sample, and use the fake material to decorate the crime scene of your choice.

In the article, they say that this is too tricky for the typical criminal to do, and they're probably right. However, anyone who can do the basic chemistry of making crack or meth probably has the basic technical acumen to do this, they just need to invest a few thousand in getting the right equipment and materials (and some time to practice up). Fortunately for us, some of the chemicals in DNA work are pretty toxic, and their distribution is regulated on that basis. However, I predict that there will be an increase in thefts in college labs. Before, the thieves were after precision balances for the crack and meth trade. When they start stealing PCR machines and pipetters, we'll know that the horse has well and truly left the barn.

One thing I'd recommend on the law enforcement side is that someone do a bunch of research fast on how well laundering degrades DNA in clothes, and publish this information. If there's a good way to clean the DNA out of clothes, I'd hope that used-clothing resellers get required by law to run all their clothes through that process before selling them. Otherwise, sooner or later, intelligent thieves are going to realize what a goldmine of DNA misinformation used clothing stores are. Why, grandma's hanky might show up at a murder scene....

Randolph | August 20, 2009 18:04

12:

I think this touches on an issue I've been thinking about lately: for a long time it was so difficult to interconnect computers that everything we did was slanted towards connecting them and making them communicate. But now we need to start working on keeping them from transmitting information and accepting unauthorized information. Maybe the intelligence services know something about this. Maybe. But whether or not they do, we need to begin engineering efforts to develop simple and widely deployed secure computing technologies.

Where's Bell Labs when you need them?

Randolph | August 20, 2009 18:07

13:

Oh, and, high-tech circumstantial evidence is overrated (including DNA evidence), eyewitness evidence is overrated, and old-fashioned low-tech circumstantial evidence is underrated.

heteromeles | August 20, 2009 19:07

14:

A blue-sky thought just struck me (ouch!). Thought I'd share.

I just noticed something weird. Nature doesn't bother with DNA analysis. Of course there's a good reason, in that DNA tends to be shielded behind a bunch of membranes, but there's no reason why lysozymes couldn't have developed to pop out DNA or immobilize RNA for analysis, if it really was that efficient a tool for identifying things.

Instead, nature went in for senses of taste and smell, and developed them to the level that bloodhounds (to name one of many examples) can tell monozygotic twins apart by their smell (i.e. their volatile exudates).

While I don't think DNA technology is going to disappear, I suspect that in the long run, sniffers will replace DNA analysis at crime scenes, and it will probably occur around the time when DNA technology (especially DNA production) is so cheap that posted DNA hacks start clogging up the web.

Basically, nature has shown us that sniffers and tasters are more accurate and cheaper in terms of resource use. Hopefully, we'll eventually figure out how they work.

Alastair McKinstry | August 20, 2009 19:09

15:

Indeed: I'd go further and say that maintaining a DNA database should be a criminal offence. No matter who holds it; in fact any organisation under political control (eg. a governmental DB) would be an especially heinous matter.

Given the valuable nature of DNA evidence, anything that hampers its use is a serious matter. If no DNA database exists, you can still use DNA evidence if you have a suspect by other means: you can get the suspect's DNA at the time, compare it to the crime scene DNA, and tie them.
Better to have no DNA database than compromise this possibility.

Someone having DNA (of an opponent, political or criminal) could undermine this. So, having a DNA database and their DNA sequence could be immediately suspect.

Compare for example the German law on having a DB of ID numbers. In the US and elsewhere it's typical for commercial enterprises to draw up lists of national ID numbers (e.g. using SSNs as keys for marketing data). The Germans take an opposite approach: it's not only illegal, it's criminal. Every individual has a unique national ID number, but there are exactly 2 databases allowed to contain them:
(1) The police 'stolen or fraudulent ID card list', which contains stolen ID numbers, but not names
(2) The National ID register's list, which contains the mapping of ID number -> name.
The Germans take the attitude that without the 2nd list, you can't create a valid (but fake) ID card. So having such a list is strictly verboten.

Note that even without such a database, DNA sequencing is still useful. You could still collect anonymized databases; and rapid sequencing means you don't need to keep my DNA on file for medical reasons: if you need it, take a copy now in front of me, do the work such as checking for $rare_disease in front of me, and destroy the sample.

MIchel | August 20, 2009 19:15

16:

Re: 11

Forget used clothing stores, just steal a laundry bag out of some hotel or other, get it while it's fresh.

Charlie Stross | August 20, 2009 19:35

17:

Re 11, 16: top deck of a bus, pocket vacuum cleaner, plenty of shed skin cells in the seat cushions.

Ed | August 20, 2009 20:07

18:

OK, how do you compose a SQL injection attack only using the letters A, C, T and G? :-)

http://xkcd.com/327/
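Joking aside, the usual defences are easy to sketch: restrict the sequence field to its own alphabet, and never build SQL by string concatenation. The example below uses Python's built-in sqlite3 module with a hypothetical `samples` table; the table and column names are invented for illustration and say nothing about how any real forensic database is built.

```python
import re
import sqlite3

ACGT_ONLY = re.compile(r"[ACGT]+")


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a made-up samples table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (label TEXT, sequence TEXT)")
    return conn


def store_sequence(conn: sqlite3.Connection, label: str, sequence: str) -> None:
    """Validate the sequence alphabet, then insert using bound parameters."""
    if not ACGT_ONLY.fullmatch(sequence):
        raise ValueError("sequence must contain only A, C, G and T")
    # The ? placeholders keep both values as data, whatever characters they contain.
    conn.execute(
        "INSERT INTO samples (label, sequence) VALUES (?, ?)",
        (label, sequence),
    )


conn = make_db()
store_sequence(conn, "Robert'); DROP TABLE samples;--", "ACGTTGCA")
print(conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0])  # table survives
```

A four-letter alphabet cannot carry SQL syntax on its own, so the realistic injection risk sits in the free-text fields attached to a profile (names, addresses, case notes), which is where bound parameters matter.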

Omri Schwarz | August 20, 2009 20:51

19:

I'll keep it short and sweet:

If you have an abusive partner who has access to this database, you cannot escape him.

I've already heard anecdotes of women in Belgium being advised that the only way to be out of reach of a boyfriend who is a cop is to leave the country. Does Britain want this?

Mattan Ingram | August 20, 2009 20:58

20:

What about just generic security checkpoints for office buildings or whatever. Would a quick pin-prick and DNA scan be harder to trick than optical scanning or scent sniffing as someone earlier suggested?

Would you rather have your bank account linked to your DNA or your iris pattern?

heteromeles | August 20, 2009 21:55

21:

@17: Oh yeah, that's right Charlie, you did write Halting State, didn't you? If I see someone with gloves on vacuuming a bus, perhaps I should call the cops?

The point is that there are a bunch of ways to spoof DNA detection, and if you're worried about your privacy, I'd suggest spoofing as much as possible. Simply make sure that you don't leave any trace of yourself where it won't be over-ridden by tens to hundreds of others.

@15: I suspect that, were you to criminalize DNA libraries, only criminals would study evolution. Guess what the taxonomists (what few of them are left) use as tool #1? There are a lot of people who would love that in the US.

Anyway, there are two countervailing issues here. One is the question of how private our DNA should be, which is an ethical question with some real-world consequences. The other issue is the bureaucratic issue of how much people want to know in order to make their work doable, and both how to manage that data and how to secure it against theft, misuse, or corruption.

Looking at it, I think the big issue is the information management problem: getting the data out of the cell, into the computer, error correct (forgot that step? there are a lot of mistakes), store, search, and keep protected. Right now, we're bottlenecked at the first few steps, but I suspect that, in the longer run, keeping the database up to date, readable, and secure are going to be the critical issues.

Not that I'm going to entirely denigrate privacy, but this is very similar to the issue with leaving traces on the web.

@20: I think the problem with the DNA scan is a) it still takes a while to get the DNA, amplify and confirm, so you really need to log in the day before you go to work, and b) as I noted, there are a lot of ways that the process can go haywire, especially in the strongly non-sterile conditions of an entry lock. Waiting a week for a clean sequence so you can go to work probably is a non-starter so far as security is concerned.

As for bank account, forget it. I don't really want it linked with either, because I don't want to be transmitting my DNA or iris pattern over the web. Also, what about shared or corporate accounts?

Graydon | August 20, 2009 22:01

22:

Heh.

No one's mentioned the best hacker tactic for such a database; assigning guilt.

If you've really got the database hacked, you catch the crime scene DNA sample results and make them match the records of the guy you don't like.

Of course, so can any police officer (or public employee) who transcribes results into the database.

Given that it'll take at least a decade and a very, very public, very high profile disaster to get the DNA=WordOfGod meme out of public perception, that's at least a decade of police and security forces having the perfect framing tool at their disposal.

Alastair McKinstry | August 20, 2009 22:13

23:

@21:
Absolutely, call the police. In Northern Ireland people now understand that if you see a bunch of people in white CSI-like forensic overalls, get away quickly. They've been used by the provisional IRA to avoid leaving DNA evidence.

For DNA libraries and online lookups: doing away with them and still doing science is doable. Beyond anonymised libraries (just keeping indexes of relationships, etc.) imagine a scheme where my online proxy holds my genome, and answers search requests of it on my behalf. Your research agent needs access from me before it gets access.
e.g. "Will you allow this software agent access to your genome? It claims to be searching for Huntington's disease trait XXX, will output yes/no to its author, and has been validated by institution XXX?"
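A minimal sketch of what such a proxy might look like. Everything here is hypothetical: the class and field names, the idea of representing the genome as a set of marker names, and the approval callback standing in for the owner's decision. The point is only that the researcher's agent gets a yes/no answer, or nothing, and never the sequence itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass(frozen=True)
class Request:
    requester: str      # who is asking
    purpose: str        # what they claim to be looking for
    validated_by: str   # institution vouching for the agent
    marker: str         # the single trait being queried


class GenomeProxy:
    """Holds the owner's data and answers queries only with the owner's consent."""

    def __init__(self, markers: Set[str], approve: Callable[[Request], bool]):
        self._markers = markers    # toy stand-in for a genome
        self._approve = approve    # owner's consent policy (assumed interface)

    def query(self, req: Request) -> Optional[bool]:
        """Return True/False if consent is given, or None if it is refused."""
        if not self._approve(req):
            return None
        return req.marker in self._markers


def trusted_only(req: Request) -> bool:
    """Example policy: accept only requests vouched for by one named institution."""
    return req.validated_by == "Example University"


proxy = GenomeProxy({"marker-A", "marker-B"}, trusted_only)
print(proxy.query(Request("agent-1", "rare disease study", "Example University", "marker-A")))
print(proxy.query(Request("agent-2", "marketing", "Nobody", "marker-A")))
```

A real scheme would also have to stop an approved agent from reconstructing the genome by asking many narrow questions, which this sketch does not attempt.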

I have to agree with Charlie that bureaucratic inertia makes it unlikely to eliminate the current UK database, but elsewhere saner models can be introduced.

Kevin | August 20, 2009 22:52

24:

The real problem, as Charlie has implied but not said (at least here), is that the courts and police LOVE DNA evidence. This is the elephant in the room. Reasonable doubt? Away! The numbers say there is a less than 1% chance the evidence is incorrect. Basically, something is proved "beyond reasonable doubt", the criminal standard in both the US and Britain, at anywhere upwards of 70% to 80% probability. DNA evidence against you at present means you WILL be convicted and all your protestations of innocence will be discounted. The possibility of error in the database will not be considered. End of.

@13 you are right, scientifically it is overrated but guess what, the legal system ain't run by scientists.

And @11 re: criminals obtaining DNA scanners and the like. Precursor chemicals and most weapons are supposed to be regulated but sufficiently determined criminals can get hold of anything. If the basic chemistry is no harder than making crystal meth, assume that criminals will be doing it and that right soon.

heteromeles | August 20, 2009 23:08

25:

@24: Yes Kevin, I think the real issue is whether it's worth the trouble. For many years, labs kept an eye on their scales, because they were a) easy to steal, and b) extremely useful in the drug trade (which is, to cross-post, a perverse but effective way to teach dumb people to do math in their heads properly). Doing PCR isn't that hard, but it does take both a few hundred to a few thousand dollars in equipment (including a -80 freezer), plus a fairly sterile room and some toxic chemicals. Generating sequences is more difficult, and mostly it's done by machines in a dedicated lab. I'd guess that the easiest way to get them is to buy such a machine through a dummy corporation when someone surpluses it.

I doubt very many crooks are bothering with the setups, yet, because, as Charlie noted above, it's easier to spoof the DNA detectors by adding massive amounts of DNA from other sources, than it is to set up a dedicated spoofing lab. The other thing is that getting caught with a spoofing lab would be a Bad Thing.

When we see thefts of such equipment rising, and/or grad students or company employees getting convicted for running DNA spoofs in their labs in their spare time, that's when we'll know that the crooks think it's cost-effective to bother.

@23: Alastair, Your proposal could be read as a blanket ban on all DNA databases, whether someone's studying obscure diseases, rare whales, chicken development, or British criminals. If your intent is to keep human DNA from being used for unethical purposes, a more sophisticated proposal might be in order. DNA sequencing is a necessary tool in too many parts of the life sciences, and I'd go so far to say that given the choice between surrendering the privacy of my genome vs. giving up all libraries, I'd surrender privacy first. Going blind is not a good option.

Alex | August 21, 2009 00:52

26:

I've already started referring to the bus hoover as the Stross attack. I'm actually very surprised that we've not yet had a documented case in the wild. Perhaps it works?

12: The answer is "France".

steveg | August 21, 2009 01:56

27:

The Four Corners show on ABC was actually quite good. It explained things accurately and used terminology correctly. Not too dumbed down for techno-illiterate: http://www.abc.net.au/4corners/content/2009/s2655088.htm. There's a link there to the video (not sure if it's viewable outside Australia).

Recommended.

My brother works in computer forensics for [can't say. Enigmatic, no?]. His opinion of Australian police capabilities is not high -- basically under-resourced and typically behind the eight-ball. I expect that's typical around the world; there are very smart people fighting the good fight, but not enough of them.

The other rumour I've heard recently from a good source inside one of Australia's Big Four banks: IT related bank fraud is *incredibly* common. No numbers to back it up, though.

Keeping it in the family, my mum worked for [another organisation. Seriously :-)] who kept databases about [stuff]. It was not untypical for staff there, before purchasing a house, to check out the entire street for undesirables. Highly frowned upon, and technically illegal. But hey, I would too.

Now... to turn this into storyable material. Assume 1) DNA database is hacked. 2) Assume low to no-cost DNA replication. 3) add 1 + 2 together. Smear an area with DNA samples and then add in an "incident". The Bank Job, with not just missing photos of Royals, but implicated by DNA.

Or maybe make it an inside job -- Charlie, you know how easy it can be to "tee" an input stream. What if someone replicates the database for their own ends? It takes one dodgy firewall setting to open up an organisation (and it's not unusual for an organisation to have a (cough) secret "don't filter or log" proxy setting for IT staff to buck the system, ostensibly for servers to connect outwards).

Or, projecting forward enough in time, when full-sequencing takes 3 seconds, we could end up in a situation analogous to now; where new evidence from full sequencing overturns results from partial sequencing.

Or, in The Future When Space is Colonised by Clones, DNA is pretty well useless. Hmmm. What about fingerprints, are they identical for clones?

Enough rambling.

johnny chimpo | August 21, 2009 02:05

28:

WGA does not add epigenetic modifications like methylation onto DNA - in the near future mePCR will have to be the standard.

P J Evans | August 21, 2009 03:34

29:

@27
Identical twins - the nearest we have to clones - don't have identical fingerprints, AFAIK. Apparently there's some randomness built into that system.

Vic | August 21, 2009 03:52

30:

Cheap DNA scanners could replace Astrology! Are you and your mate compatible? How should you raise your children? Where can I pre-order?

DaveBell | August 21, 2009 09:44

31:

I understand that a big problem with DNA is that the figures used as a base for the false-match calculations were not a good sample.

At one extreme, your sample is people from the same village, all from families which have been there for many generations. You have a much higher number of false matches, both admitted blood relationships and otherwise.

At the other extreme, you compare a remote South American tribe with a remote African tribe, and get a low number of false matches.

In the early days, some of the research was close to the second extreme (and giving some scientifically useful information about human populations), and that's what the legal system (and the companies selling this wonderful new weapon against crime) used as a baseline.

I've my own doubts about whether the fingerprinting can work in a multiple-DNA situation. Maybe the Stross attack is slightly adrift of the point, and the CSI types are concentrating on specific DNA sources, such as bloodstains, where the target DNA overwhelms the environmental background.

Is spoofing going to protect a criminal who leaves a high-density sample of their personal DNA?

Hacking the database seems likely to be much more effective.

DaveBell | August 21, 2009 09:54

32:

The UK lost a European court case (I don't recall just which court it was--European Convention on Human Rights, I think, was the baseline) on keeping this DNA data of unconvicted persons. And the government is dragging its feet on implementing the changes to obey the ruling.

But it hasn't been such a dreadfully long time, yet. Slower than I'd like, and reeking of BigBrotherism, and time for some arse-kicking, but we're waiting for rules to be defined, and there are consultations, and...

No fucking respect for the law, dammit!

[Not sorry for the language, Charlie]

Soon Lee | August 21, 2009 10:11

33:

heteromeles @11:
DIYBIO (http://diybio.org/about/) and Openwetware (http://openwetware.org/wiki/Main_Page) among others are coming up with ways to conduct molecular biology at home and on the cheap. How long before formerly expensive & tricky DNA synthesis & manipulations are doable in one's basement or garage?

Dan H. | August 21, 2009 13:34

34:

Actually, DNA spoofing is already moderately commonplace in the criminal fraternity. As a class, a lot of criminals smoke and some are stupid enough to smoke whilst out trying to burgle places, presumably to calm their nerves.

For this reason, the more alert, go-ahead type of burglar now routinely covertly picks up cigarette ends outside the seedier pubs which his criminal associates frequent, and scatters a few of these decoys at each burglary job. If found, the instant "evidence" gives the police a nice strong lead in the wrong direction, and even occasionally puts the wrong man behind bars.

Nix | August 21, 2009 16:00

35:

heteromeles@#14: 'no reason why lysozymes couldn't have developed to pop out DNA or immobilize RNA for analysis, if it really was that efficient a tool for identifying things.'

Well, that's basically how RNAi works already, except that it matches and blocks after transcription into mRNA. But it mostly seems to be an antiviral weapon and internal regulation mechanism. Some DNA viruses also insert themselves at specific places in genomes in similar fashion (although most are random). Also chunks of the immune system can match some common bacterial mRNAs, and IIRC intracellular alarms leading to interferon secretion can get triggered if certain common classes of viral DNA/RNA are identified. (very faint memories, may be totally inaccurate).

Identifying whole organisms that way seems impractical: any DNA you inhale is far more likely to belong to a random unicellular organism than to a large multicellular one which organisms might care about, far more likely yet to belong to a bacterium, and far more likely than that to belong to a virus (viral RNA/DNA is fearfully common). And most of that will be packaged away inside one of a variety of envelopes: the only naked DNA you're likely to find is a degraded plasmid or two, which is pretty much completely uninteresting to multicellular life.

So this is an immune system trick only, really.

ajay | August 21, 2009 16:00

36:

Another possibility is that if the database is inadequately secured — and with cops waving handheld scanners with live broadband connections around, that's not a wild stretch

Why on earth does Charlie assume this is going to happen? Normal cops don't have live fingerprint scanners now, and that would be a lot easier to do, technically speaking, than a hand-held DNA analyser. I would imagine that crime scene evidence will be processed at police stations, just as it is now.

Charlie Stross | August 21, 2009 17:11

37:

ajay: What's a "police station"?

See also current fads for hot-desking, etcetera ...

Yes, you need somewhere to store the customers while they're awaiting processing, but apart from that, why are your everso expensive cops clogging up offices doing paperwork and drinking tea instead of pounding pavement?

(NB: I don't actually believe this is a good idea, but I can see some future numpty at the Home Office deciding that if it's good enough for the dot-com sector it's good enough for the Police ... and don't you think those big old stations and their playing fields and car parks might be worth a pretty penny? See also older inner-city hospitals, PFI, the pages of Private Eye passim, etcetera.)

ajay | August 21, 2009 17:16

38:

Charlie: OK, in that case, I agree that if a very large and unsecured DNA database is set up
and if all the police stations and forensics labs are shut down and sold off to developers, except for a few cells in the back of the local Tesco, and ordinary police officers are left to do all their own admin and SOCO work with laptops and handheld DNA scanners...
then there will, indeed, be problems.

Kevin | August 21, 2009 17:35

39:

@31. Sure but the real problem is that what you might regard as a serious "false positive" problem (1% chance? 10%?) is regarded, for legal purposes as near-certainty. For near certainty, as the legal system doesn't like fuzzy results, read "certain" and "you're going down, sonny".

Once convicted, the "criminal" has the burden of proof to show he is innocent, which is already very difficult and becomes nigh-impossible when contradicted by DNA evidence which is regarded by lawyers and judges (and politicians and newspapers) as basically infallible.
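
To put rough numbers on the false-positive worry, here is a minimal Python sketch of what happens when one crime-scene profile is trawled against a whole database. The 3.1 million figure is the NDNAD size quoted in the post; the one-in-a-million random match probability is purely an illustrative assumption, not a measured figure, and the sketch assumes every comparison is independent (which relatives in the database would violate). The function names are made up for the example.

```python
# Sketch: chance matches when one profile is searched against a database.
# The match probability below is an illustrative assumption, not real data.


def expected_chance_matches(database_size: int, match_probability: float) -> float:
    """Expected number of unrelated people who match purely by chance."""
    return database_size * match_probability


def prob_at_least_one_chance_match(database_size: int, match_probability: float) -> float:
    """Probability of one or more chance matches, assuming independent comparisons."""
    return 1.0 - (1.0 - match_probability) ** database_size


NDNAD_PROFILES = 3_100_000          # database size quoted in the post (2005)
ASSUMED_MATCH_PROBABILITY = 1e-6    # hypothetical: one in a million

print(expected_chance_matches(NDNAD_PROFILES, ASSUMED_MATCH_PROBABILITY))
print(prob_at_least_one_chance_match(NDNAD_PROFILES, ASSUMED_MATCH_PROBABILITY))
```

Under these assumed numbers a trawl would be expected to throw up a few chance matches, so a database hit on its own is a long way from certainty.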

Kevin | August 21, 2009 17:38

40:

I meant to say, Charlie has touched on this here before. And he is absolutely right.

heteromeles | August 21, 2009 19:43

41:

General comment on the idea of DNA scanners.

My recollection from my foray into the great world of molecular biology is that contamination is a huge problem, and also that getting the damn system to amplify the sample correctly was a bit of an experimental art.

Most of the labs I worked in had an altar to "The Gel God" and there were jokes about "bandology" (i.e. the art of resolving bands in a gel and understanding what they meant).

A lot of this has been ameliorated by tech improvements, but the basic problem of cleaning a sample, extracting the DNA, cleaning the DNA, and amplifying it correctly is going to be tricky to miniaturize in a handheld, especially if your chore is figuring out which member of a family killed another family member, based on DNA evidence.

I agree with the people who say that it's going to stay in the hands of forensics for a while longer, just as I agree with the people who note that good beat cops aren't good at spotting or solving cybercrime.

OTOH, I think there's a place for handheld scanners with non-human evidence, because there's some decent "barcode" DNA that can be used to identify a sample to species or clade pretty quickly (at least if it's a eukaryote), and that can be implemented on a portable detector (although the results may take a few hours). I did some work on that years ago. In any case, contraband could be identified to species in the field, unless it was (as noted above) salted with a bunch of foreign DNA. Hmmmmm. That might be a profitable place for DNA spoofing.

So the first people we might see with DNA scanners would be customs and ag inspectors. Cops on similar beats might start using them as well. For human-on-human crimes, I'm not sure whether Barney the Beat Cop needs his DNA tricorder just yet.

Christopher Hawley | August 21, 2009 20:08

42:

It occurs to me that if genemod therapy ever becomes a viable/practical art, there will be an entirely unforeseen application: replacing non-critical sections of one's genome to make it clearly distinct from DNA evidence collected at the scene of a crime.

Not that this is the easiest way to avoid identification -- the dustbuster plus bus seat method in Halting State is far more effective -- but it might be used retroactively by someone too addled to plan in advance or clean up after themself.

Charlie Stross | August 21, 2009 20:16

43:

What I see as a likely application for handheld scanners is: well, what heteromeles says -- customs and agriculture inspectors -- but also (if the NDNAD remains in use in its current form) identification of uncooperative suspects and mobile capture of samples taken from persons of interest.

Forensic examination of crime scenes is, indeed, likely to remain the remit of specialists for the indefinite future (although portable scanners would cut a chunk out of the evidence preparation cycle -- just as digital cameras remove the lab time in preparing photographic records of crime scenes).

Marilee J. Layman | August 21, 2009 21:10

44:

Charlie, @37, some US municipalities are putting police and their computers on Segways. They have instant shutdowns on the laptops and Segways, and have to bind criminals to lightpoles and such.

Bruce Cohen (Speaker to Managers) | August 21, 2009 22:54

45:

It's been quite a while since I worked in a bio lab, so I've no idea if this is reasonable, or what time frame it might be implemented in, but ... bio-MEMS and microfluidics are at the point where we can put multiple (up to 100, I think) reagent or immunological match tests on a cheap (once somebody actually manufactures a bunch of them, anyway), throwaway chip. Just put a drop of fluid on the input port, and push the button. Couldn't that be done with PCR for target stretches of DNA? Certainly not the whole genome at once, at least not anytime soon, and just one test per chip, but make the chips cheap enough and it might be an attractive solution.

On the other hand, any crime scene test that doesn't allow archiving the original sample and the intermediate results would be ripe for a "broken chain of evidence" attack by a defense attorney. Same with any technique that uses indirect queries of the database; if all the steps can't be shown explicitly, the defense can claim that it didn't work correctly this time, and the burden of proof would be on the prosecution. Those conclusions are based on what I know of US law, and may or may not be valid in the UK or elsewhere.

Adrian Smith | August 21, 2009 22:59

46:

Alex@26: I've already started referring to the bus hoover as the Stross attack. I'm actually very surprised that we've not yet had a documented case in the wild. Perhaps it works?

You could steal rubbish bags from hairdressers, that'd get you plenty of material to work with and all.

heteromeles | August 22, 2009 00:21

47:

@45: Bruce, I was thinking along those lines when I was working on barcoding stuff: a fairly self-contained, cheap system for checking to see whether a particular specimen has a particular sequence of DNA.

You run into two huge problems with this approach. One is that you can only spot what's already on the chip, so it's not only impossible to identify unknowns, it's impossible to tell when you have an unknown (as it will react as a null). The other problem is that these things can be spoofed by false positives, especially if they're based on DNA. A strand that's 1 or 2 nucleotides off can still give you enough of a signal that you might think you've got a positive.
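
Both problems show up even in a toy model. The Python sketch below treats a chip as a fixed dictionary of probes and calls a hit whenever the target differs from a probe by at most a couple of bases. That tolerance is a deliberately crude stand-in for real hybridization chemistry, and the probe names, sequences, and threshold are all invented for illustration.

```python
# Toy model of a fixed-probe DNA chip. The mismatch tolerance is a crude
# stand-in for hybridization behaviour; all names and sequences are invented.


def mismatches(probe: str, target: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(probe) != len(target):
        raise ValueError("probe and target must be the same length")
    return sum(p != t for p, t in zip(probe, target))


def chip_hits(probes: dict, target: str, max_mismatches: int = 2) -> list:
    """Return the names of every probe the target is 'close enough' to."""
    return [name for name, probe in probes.items()
            if mismatches(probe, target) <= max_mismatches]


CHIP = {"suspect_A": "ACGTACGTACGTACGTACGT"}    # the only thing the chip knows

near_miss = "ACGTACGTACCTACGTACGA"              # two bases off suspect_A
unknown = "TTTTTTTTTTGGGGGGGGGG"                # not on the chip at all

print(chip_hits(CHIP, near_miss))   # still reported as a hit
print(chip_hits(CHIP, unknown))     # empty list: looks the same as "nothing there"
```

The near miss is reported as a match, and the unknown sample returns nothing at all, which the operator cannot tell apart from a blank.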

And, as you rightly noted, the chain of evidence gets legally interesting, too.

These types of limited monitors are really useful in certain applications, and I think people are already developing them. I was thinking of using them for mapping soil-borne fungal diseases, for example, and I'll bet people working on cholera or other diseases would love to be able to check samples fast.

For police work, though, they're really no better than the cop who goes into a case thinking he knows who did it. All you can load in your detector are "the usual suspects." Anyone who is not on the list is invisible to the detector.

Alex Tolley | August 22, 2009 04:39

48:

Won't the spoofing of DNA by foreign samples inevitably lead to court challenges to its validity? I don't see the Gattaca style as being very likely in practice, but it makes you think whether spoofed DNA could be used to confuse a rape case. A few good examples of a +ve DNA match against a cast iron alibi could start to invalidate the technique's aura of high-tech infallibility.

After the 1970's(?) scandal involving police forensic labs faking evidence for convictions, is planted evidence still a bit of an issue? I recall OJ's attorney implied that the blood evidence was planted too.

truth is life | August 22, 2009 08:51

49:

@10: That is in fact exactly what I was asking: Would it not be possible to build a separate, but parallel network for police-related work (including this database)? Not that that would necessarily *help* much, as #7 points out.

@Alex:

After the 1970's(?) scandal involving police forensic labs faking evidence for convictions, is planted evidence still a bit of an issue? I recall OJ's attorney implied that the blood evidence was planted too.

Yes, very much so...around Houston, there was a *big* scandal a few years back involving the crime lab distorting evidence to help the prosecution--there are still people coming out on relateds *today*, IIRC.

@31 and 13: Wasn't fingerprinting treated much like DNA analysis is today back in the early 1900s? Also, (this one at #13 specifically), what is "low-technology" circumstantial evidence? Things like the murder weapon belonging to the suspect?
(Related to my previous statement, there have recently been a number of people in Texas who have been released from prison based on new DNA evidence who had been convicted based on eyewitnesses or older forensic methods--puts that in relief!)

Charlie Stross | August 22, 2009 13:08

50:

Alex @48: does the Shirley McKie case ring any bells?

Clifton | August 22, 2009 19:00

51:

@12: Bell Labs? Gone whither the woodbine twineth, after the Death of a Thousand Cuts.

Partially chopped up in the '80s, much of the remainder turned into a for-profit-company (Lucent) in the '90s, with most of the remainders downsized in the because they didn't contribute to short-term profits. Lucent was bought by Alcatel (that short answer 'France' above) and last year Alcatel announced it's discontinuing all remaining basic research.

So much for the home of the transistor, radio astronomy, background radiation, lasers, information theory, statistical quality control, Unix and C.

I don't know of any company that still funds that kind of basic research.

Greg. Tingey | August 22, 2009 22:51

52:

@ 24 & 50
So - the answer is:
A Guvmint agency that doesn't like you.

AND you get comprehensively framed, with (almost) no possibility of a get-out!

Looks like (50) it's already happened, no?
If not, it soon will .....

Alex | August 22, 2009 22:52

53:

Statistical control? Much older, from Guinness at the fin de siecle. As I pointed out way back on the space colony thread, beer is the driving force in world history.

Alex Tolley | August 23, 2009 01:26

54:

Charlie@50 - I left Britain in 1988. I hadn't heard of this case, but it is an interesting link. You may know that there has always been an assumption that fingerprints are unique, although it has never been proven and there have been few large-scale empirical tests. I recall that the assumptions behind DNA fingerprint uniqueness were wrong and that more points had to be taken to overcome the randomness assumption.

The problem I see is that regardless of the database quality and security issues, the means to collect and analyze evidence can be subverted and thus pervert justice. The science behind the technique just becomes the club to ensure juries convict. In the US, with crappy public defenders, this could mean (and does) that the innocent are deprived of their freedom.

Alex Tolley | August 23, 2009 01:26

55:

BigHank53 | August 23, 2009 02:31

56:

Actually, you left out one hack that I can think of. A perpetrator who has left DNA behind at a scene would now be very motivated to place some false positives in the database. "A double murder? Don't forget to charge me for Nicole Brown Simpson. And the Virginia Tech shootings. Have a nice day, officer."

Nicholas Hardison | August 23, 2009 05:58

57:

Charlie @6: I think the information requirements would be much smaller than that - the full sequence is about 3 billion characters, so you can store it uncompressed in about a GB and a half (two bases per byte). And since there is a _bunch_ of repeat sequences in there (many of them the same across the whole population), it compresses further nicely. It might even be more feasible to store a single reference sequence, and just record diffs from it.
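
A quick Python sketch of both ideas, for the curious. It is illustrative only: the function names are invented, the diff handles substitutions but not insertions or deletions, and it packs four bases per byte (2 bits each, which is all four letters need), which would bring the uncompressed figure down to roughly 750 MB from the 1.5 GB you get at two bases per byte.

```python
# Illustrative sketch: 2-bit packing of DNA and substitution-only diffs
# against a reference. All function names here are hypothetical.

CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    return "".join(BASES[(data[i // 4] >> (2 * (i % 4))) & 3]
                   for i in range(length))


def diff(reference: str, sample: str) -> list:
    """List (position, base) pairs where the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def apply_diff(reference: str, changes: list) -> str:
    """Rebuild a sample from the reference plus its recorded differences."""
    seq = list(reference)
    for pos, base in changes:
        seq[pos] = base
    return "".join(seq)


GENOME_BASES = 3_000_000_000
print(GENOME_BASES // 2, "bytes at two bases per byte")
print(GENOME_BASES // 4, "bytes at four bases per byte")
```

A real system would also need to cope with insertions, deletions, and ambiguous base calls, which this sketch ignores.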

Huh. I hate the concept of having one of these things, but it seems like I'm enough of a geek to figure out how to implement it...

Though there is work being done on methylation patterns and chromatin modification, I think that is more about understanding biological processes than of being used for identification purposes, and wouldn't be very suited for them anyway.

Robin | August 23, 2009 07:59

58:

A friend of mine who was a nanny was accused of stealing some money from her employer when she left. Despite there being no possible means of securing a conviction the officer arrested her and processed her, including taking DNA evidence. When I protested, his statement was 'I can arrest her so I'm going to'.

Needless to say the CPS dropped the prosecution within two weeks as there was not a shred of evidence other than the accusation.

The police in the UK have definitely switched to a system of arresting and collecting evidence before even thinking about whether a crime has been committed and I think it's because they know the system does the automatic DNA matching - therefore anyone accused of a crime is likely to be a handy catch for something else. 'They've done one crime, let's see what else we can stitch them up for' is probably the thought that whoever instituted this was having.

Police see criminals all the time, so their thinking is biased; leaving them in charge of a generic evidence-gathering tool is dangerous.

DaveBell | August 23, 2009 09:33

59:

As for ID, a Playboy model who was murdered in Los Angeles was identified by the serial numbers of her breast implants.

That's something else to wonder about licit and illicit fabricators in near-future SF. Though I recall it was something mentioned in Cybergeneration version of the Cyberpunk roleplaying game. Something about hacking the fabricators in shopping malls, after hours, to produce unrecorded goods.

I'm having rather twisted thoughts about smuggling as well. In this case, the victim's teeth and fingers were reported to have been removed. Sooner or later, there's going to be a repeat where the breast implants are removed as well. But could you smuggle something inside breast implants which would be worth the initial surgical procedures, and then a murder?

And do you use this to justify a DNA sample from everyone coming into the country?

Adrian Midgley | August 23, 2009 10:58

60:

A recipe not a blueprint.

The fingerprints are assembled from a recipe, hence variation in the details despite identical genomes.

But the themes in the result remain tied to the genome.

Denni | August 23, 2009 12:51

61:

DIY biology is hugely overrated. They make it sound as if you could do molecular biology in the kitchen sink, but you still need access to special reagents and these days the suppliers require confirmation that the request comes from a licensed lab. Even the hardware isn't that easy to come by. Openwetware works with bona fide institutions.

And sure, grad students could run some amplifications in their spare time, but the cost is forbidding and it's hugely ineffective compared with just harvesting natural DNA, which also solves the methylation problem.

BTW Nix@35, retroviruses are RNA viruses.

Robert | August 23, 2009 15:18

62:

But could you smuggle something inside breast implants which would be worth the initial surgical procedures, and then a murder?

Cocaine, probably.

Chrisj | August 24, 2009 11:14

63:

@59,62

But could you smuggle something inside breast implants which would be worth the initial surgical procedures, and then a murder?

Data. The value-density of a flash memory card containing complete information about $COUNTRY's current military and intelligence capacities and dispositions far exceeds that of any mere illegal chemical. Various types of industrial (including, but not limited to, military-industrial) espionage spring to mind, too.

guthrie | August 24, 2009 13:01

64:

Robin #58 - the other problem is that New Labour have removed all independent action from the police. This means that they have a binary choice - arrest someone, or walk away claiming not to have seen anything. In "The good old days"* they had more discretion and could have talked to some people before arresting anyone, or in other cases would let some off with a talking to or as appropriate just ignore the complaint. You see quite a few policemen on blogs complaining about this; it means that even regular complaints from mentally ill people or neds, the kind that occur every Friday evening when they are drunk, consume paperwork. Instead of telling them "We know you do this every week so we aren't paying any attention to you, if you stop drinking the problem will go away" they have to fill out the forms and if necessary visit the complainant. Policing as we have the concept of the police in this country is a lot more complex than people like to think. Here the old standard was policing with consent, although I get the impression New Labour have moved it more towards the evil foreign and possibly American version, which is "We're in charge motherfuckers and if you don't do what we want you to do you'll go to jail". Oddly enough a lot of long time served coppers don't like this.

*Which obviously were never as good as people like to think they were.

efnord | August 25, 2009 04:13

65:

Chrisj@63: But then implants are overkill- MicroSD cards are cheap, reliable, and tiiiiny. You can swallow one or smuggle it subcutaneously, as the Japanese mafia are rumored to do with razor blades in prison.

Chris Williams | August 26, 2009 06:56

66:

Where to start . . . I'd be as sanguine as Ajay about all this, were it not for the fact that the Home Office's policy on police data processing _since 1969_ has been to put the officer on the street as close to the database as possible. See dtels.org and check out back issues of 'Intercom' for the MADE experiment. On the plus side, about four years ago HMIC woke up to the insecurity of the PNC, which is nice.

As for the role of Guinness, huzzah! but again, the Home Office was there way before: in the 1870s, using random ('promiscuous') sampling on their own database of criminal names and addresses, to prove that it was an overly-expensive dog and should be scaled down. See my chapter in Lomell et al 'By the very act of counting' (Routledge, real soon now).

Tangurena | August 30, 2009 06:24

67:

What about fingerprints, are they identical for clones?

One can examine that hypothesis with identical twins - who are effectively clones of each other: they have different prints.

gregm | September 7, 2009 17:45

68:

efnord @ 65:

Another flaw in Chrisj@63's reasoning: they're also more likely to be, shall we say, 'closely scrutinized'.

zunge | September 29, 2009 07:46

69:
