There’s a big debate going on in Norway these days about Datalagringsdirektivet, the Norwegian implementation of the EU Data Retention Directive (2006/24/EC). Basically, the directive gives the police access to communication data for everyone: who is talking to whom, who is sending mail to each other, and so on.
Now, think about this. What the police get here is a graph of people who are talking with each other. With a little bit of statistical analysis and the application of a couple of graph clustering algorithms, they can find groups of people that talk to each other regularly. Now, these groups might be called families, friends and such – but what if the cluster is something more nefarious?
Imagine this scenario:
- The police gather a huge database of communication patterns in Norway. Each act of communication creates a link between two phone numbers or IP addresses, and in sum all the links create a graph. Not every address or number can be linked to a specific person, but many can, so the data would still be very useful. Data mining the whole graph would be very difficult – but luckily, there are ways of improving the situation.
- Then you look at subsets of the graph. Let’s assume the police have a list of known neo-nazis, so they take a closer look at which numbers these people are calling, then go down the chain (looking at the next number, finding which numbers that one is calling, and so on) and start looking for clusters.
- When you find some clusters, you do a little digging. Who are these people? A few lookups in the phone directory should work, and with some luck you might find a couple more people who the police already know are neo-nazis. The clusters that have multiple “interesting” people in them are exactly what the police are looking for, and finding suspects suddenly became a lot easier.
- Now the interesting bit: What if there hasn’t been a crime? This data is still very useful! Pick any group of people that the population at large might distrust, and start looking for clusters that can be identified by their known members. And then, start looking for politicians within or near those groups.
- Let’s assume a politician has a friend with friends within one of those “bad” groups. That politician is now “one degree of separation” from someone “bad.” This piece of information can be useful! (Here’s a good place for being creative.)
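To show how little machinery the scenario above needs, here is a rough Python sketch. Everything in it is my own assumption for illustration: the function names, the idea that the records arrive as simple (caller, callee) pairs, and the toy data. Plain connected components stand in for the "real" clustering algorithms, which would be more sophisticated.

```python
from collections import defaultdict, deque

def build_graph(records):
    """Turn (caller, callee) pairs into an undirected contact graph."""
    graph = defaultdict(set)
    for a, b in records:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def hops_from(graph, seeds, max_hops):
    """Breadth-first search outward from the seed numbers.

    Returns {number: hop count} for everything within max_hops of a seed.
    """
    dist = {s: 0 for s in seeds if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not walk further down the chain than allowed
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def components(graph, nodes):
    """Connected components among `nodes` (a crude stand-in for clustering)."""
    nodes = set(nodes)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.add(n)
            stack.extend((graph.get(n, set()) & nodes) - seen)
        result.append(comp)
    return result

# Made-up toy data: A and B are on the hypothetical "known" list,
# P is a hypothetical politician.
records = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"),
           ("D", "P"), ("X", "Y"), ("Y", "Z")]
known = {"A", "B"}
politicians = {"P"}

graph = build_graph(records)
dist = hops_from(graph, known, max_hops=3)

for cluster in components(graph, dist):
    print(sorted(cluster), "known members:", len(cluster & known))
for p in sorted(politicians & dist.keys()):
    print(p, "is", dist[p], "hops from a known member")
```

The point is not the code itself but how short it is: once the graph exists, walking down the chain and counting "interesting" members per cluster takes a few dozen lines.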
How in the world could information like this be used by the police? Or for that matter, how could information like this be used by the database administrators? Or the data analysis people? Or anyone else who has access to the database? Easy: You send an anonymous letter to that politician, threatening to expose that link and thereby create a scandal that this politician might have to spend a year or two cleaning up – UNLESS that politician votes a certain way in some legislative vote or budget meeting. Make sure to pick something where you can reap slow benefits from any secondary effects of that vote. Lather, rinse, repeat.
With a little patience, anyone with access can get actual and real power. And what’s the politician to do? Call the police? Hahahahah! It’s so funny that it’s sad.
No, the data retention directive is the first serious step towards a police state. Say bye-bye to your democracy, folks.