20 Mar Catching Criminals with Big Data
“Your honor, I plead not guilty.” This familiar phrase can be heard daily, in many languages, all around the world. As the Universal Declaration of Human Rights states, and as democratic countries generally accept, suspects are innocent until proven guilty. However, this does not mean a prosecutor won’t do everything in their power to get the required evidence. Traditional methods such as hearings have been perfected over centuries, but as our lives shift to the digital world, so must the prosecutor’s methods.
Today’s computers and smartphones record people’s actions and behaviour second by second. If this data can be accessed, it would be child’s play to convict someone who has done something wrong. Perhaps luckily, the government still cannot track you and every other person every second of every day, but we’re getting there. You happily tweet, message or otherwise let other people know about all your adventures. So do most petty criminals. That’s not really strange, as you might think these drops will get lost in the sea of data published every minute. But as you might know, this data can be used to catch criminals!
To catch a criminal
What steps should we follow to find the evidence needed to prosecute a criminal or fraudster?
- Identify a digital threat or actual crime, on the basis of previous cases.
- Collect the necessary data to confirm the threat or crime, by structuring unstructured big data such as tweets, emails and phone-call metadata. This will likely take hours and maybe even days, but in due time you’ll find correlations.
- Derive patterns, and then pattern anomalies, in your huge dataset. Be sure to throw in some statistical magic!
- Preserve the collected data. A judge will ask for your entire dataset and investigation methods. Be sure to free up some space on your hard disk!
- Create a report on the threat or crime you discovered. It should not be too long, but it must contain everything important, and it should not seem biased. Not really the most exciting stage, but you’ll have to do this to collect your paycheck.
- Last but not least, present your findings to either your boss or directly to a judge. This might not be your favourite part of the job, but it is your chance to grab the spotlight and show that you’re the expert in the data forensics field. Make it visual with graphs and charts, to make it more understandable for the average Joe.
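To make the middle steps a little more concrete, here is a minimal Python sketch of structuring raw messages, flagging an anomaly and fingerprinting the dataset so you can later show it was not altered. Everything in it is an assumption for illustration: the raw line format, the sample users, the helper names `structure`, `find_anomalies` and `fingerprint`, and the choice of a median-based score with a 3.5 cut-off. Real investigations will use far richer data and methods.

```python
import hashlib
import json
import re
import statistics
from collections import Counter

# Assumed raw format (hypothetical): "<ISO timestamp> @<user> <free text>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+)\s+@(?P<user>\w+)\s+(?P<text>.*)$")


def structure(lines):
    """Step 2: turn raw text lines into records, skipping lines that do not parse."""
    return [m.groupdict() for m in map(LINE_PATTERN.match, lines) if m]


def find_anomalies(counts, threshold=3.5):
    """Step 3: flag keys whose count lies far from the median.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. The 3.5 cut-off is a common rule of thumb.
    """
    values = list(counts.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no measurable spread, so nothing is flagged
    return sorted(k for k, v in counts.items()
                  if 0.6745 * abs(v - median) / mad > threshold)


def fingerprint(records):
    """Step 4: SHA-256 digest of the records, to show later that they are unchanged."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Made-up sample: four ordinary users and one unusually chatty one.
activity = {"alice": 2, "bob": 3, "carol": 2, "dave": 3, "mallory": 12}
raw_lines = [f"2015-03-20T09:{i:02d}:00 @{user} message {i}"
             for user, n in activity.items() for i in range(n)]
raw_lines.append("a line that does not follow the format")

records = structure(raw_lines)
per_user = Counter(r["user"] for r in records)
print(find_anomalies(per_user))   # users that stand out
print(fingerprint(records))       # digest to store alongside the dataset
```

A flagged user is only a lead, not proof: the score says someone behaves differently from the rest of the sample, and a human still has to find out why.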
Really big data!
But do we have enough data to actually conclude anything useful? Is there enough reference material to derive patterns and thus find anomalies? Check out the stats below and find the answer! Every minute:
- YouTube users upload 72 hours of new video content
- Facebook users share 2,460,000 pieces of content
- WhatsApp users send 347,222 photos
- Twitter users send over 277,000 tweets of up to 140 characters
- Google receives 4,000,000 search queries
It does not stop there. In total, 2.5 Exabytes of data are created every day. You might not know what an Exabyte is, so we’ll put it in perspective.
One Exabyte is:
- One billion Gigabytes
- 64,782,000,000,000 pages in Office Word (an average of 64,782 pages per Gigabyte, times one billion Gigabytes)
- 1/5 of all the words ever spoken by mankind
- 200,000,000,000 HTML websites
To make this even more amazing, it is estimated that the amount added daily doubles every 40 months.
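If you want to check the arithmetic yourself, the numbers above fit in a few lines of Python. This is only a back-of-the-envelope sketch: it takes the article's figures at face value (64,782 Word pages per Gigabyte, 2.5 Exabytes a day, a doubling every 40 months) and assumes the doubling continues smoothly, which is an assumption rather than a forecast. The function names are made up for this example.

```python
GIGABYTES_PER_EXABYTE = 10**9        # one billion Gigabytes in an Exabyte
WORD_PAGES_PER_GIGABYTE = 64_782     # average figure quoted in the article
DAILY_EXABYTES_TODAY = 2.5           # data created per day, as quoted
DOUBLING_PERIOD_MONTHS = 40          # daily volume doubles every 40 months


def word_pages_per_exabyte():
    """Pages of Office Word text that fit in one Exabyte."""
    return WORD_PAGES_PER_GIGABYTE * GIGABYTES_PER_EXABYTE


def projected_daily_exabytes(months_from_now):
    """Daily volume after the given number of months, assuming steady doubling."""
    return DAILY_EXABYTES_TODAY * 2 ** (months_from_now / DOUBLING_PERIOD_MONTHS)


print(f"{word_pages_per_exabyte():,} pages per Exabyte")
for months in (0, 40, 80, 120):
    print(f"in {months} months: {projected_daily_exabytes(months)} Exabytes per day")
```

In other words, if the trend holds, the daily pile grows eightfold in ten years.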
What if they want you?
The legal system might be interested in your daily escapades. Maybe you’ve been looking up Al-Qaida’s Inspire web magazine for ‘educational’ purposes, or your Twitter account is affiliated with the wrong people. These factors could get you a couple of checkmarks next to your name.
In The Netherlands, this data can be freely accessed, so it’s probably not too smart to fill your Facebook feed with Jihadist propaganda or posts bragging about how you collaborated with Anonymous in DDoS-ing PayPal. However, you could still be telling your mother, a friend or a fellow petty criminal about your latest adventures in private.
This is where the minister of justice comes in. He can order a telephone tap, which allows the secret service to collect and listen to all the calls you make. Other forms of data that are not immediately visible to the general public can also be accessed with the right tools and government approval.
Threats or situations that deviate from the general pattern of today’s society could become a point of interest. But luckily, they’ll still have to prove you did something wrong. If you did do something wrong, beware, as everything you’ve done over the past couple of years is stored somewhere in a datacentre.
Careers for everyone
As you’ve seen, the steps to catch a criminal could all be carried out by a single person with a high level of expertise. However, bigger organisations will of course split these tasks and find the right person for every part of the process: a technical administrator to keep the servers running, an expert who’s specialised in finding usable datasets, a statistical magician who works out all the odds and a great visualiser skilled in the arts.
Could there be a career for you in the big data forensics industry? Let us know in the comment section below!