What Do They Know About You?

In my opinion, one of the biggest things standing in the way of online security is an archaic understanding of “Knowledge”. Specifically, the way that humans “know” things is fundamentally different from the way in which machines (specifically computers) “know” things. To begin, let’s try to define knowledge. This is my personal definition, made up for the purpose of this discussion.

Knowledge:

Having data

Ability to access it

Ability to use it for some purpose

Let me give an example. I ‘know’ my parents’ phone number. Somewhere in my physical brain, there are some neurons aligned in a certain way to store a 10-digit series of numbers. In addition to that, there’s another bit of data there that says those numbers go to my parents. I have another part of my brain that knows that the xxx-xxx-xxxx format with a name attached is a phone number. All of this is data that’s stored in my brain. My conscious mind has the ability to access that data. I could say those numbers aloud in the correct sequence right now if I wanted. Lastly, I have the physical ability and knowledge of technology to use that. I could pick up a phone, push those numbers in that sequence, and I know it would ring through to their phone. If one of those links is broken, however, my ability to make that phone call is gone. If I can’t recall the number right now because I’m stressed (I still know it, because I’ll be able to recall it later without the stress hormones’ influence), no call. If I have brain damage and the data is physically erased, no call. If you hand me a foreign phone number that’s, say, 12 digits in a different format, with a foreign name that I don’t recognize as a name, then I might have no idea what to do with it, even though the part of my brain that knows how to dial the phone is still fully functional.

This may be a little bit of a silly example, but I’m trying to illustrate how many complex pieces go into ‘knowing’ something in a useful manner.

Now that we’re thinking about these aspects of ‘knowledge’, let’s compare how humans and computers ‘know’ things:

Humans:

HIGHLY imperfect data storage compared to a computer

Very disorganized retrieval system

Very good at using data  

Very good at extrapolating/enhancing data (e.g., watching something from one viewpoint and still understanding what happened)

Tremendous amount of “unconscious” knowledge

Computers:

“Perfect” storage

“Perfect” retrieval

Less power to apply the data (but growing every day)

Has to be told what to care about (for now)

Different ability to extrapolate data than a human being (for now)

Different ability to aggregate knowledge

No “unconscious” knowledge

Before we proceed, I want to clarify what I mean by ‘unconscious knowledge’. As humans, we all have a tremendous number of things that we just “know”. For example, most of us are able to speak articulately in our native language with little to no understanding of the nuances of grammar. Even those who are educated in grammar don’t generally consciously think through the grammar of every sentence. Another example is physical abilities. Any decent basketball player can walk out onto a driveway with a goal, pick a spot, and start making baskets somewhat consistently. If she doesn’t make the first few, she’ll probably take 5-6 shots from the same spot, then start hitting them. However, she probably can’t tell you exactly how far away from the goal she is, or how high the goal is if it’s not regulation, and she almost certainly couldn’t tell you how many newtons of force she’s exerting on the ball when she makes a basket or which muscles are being used to aim. Her brain is unconsciously estimating all of that, but we don’t consider it, and, unless we’ve trained ourselves to, we really can’t translate an innate ability to hit baskets into an ability to estimate distance. On the other hand, for a computer to control a machine that shoots baskets, it must measure all of those variables and more, and it will store those variables in a manner that can be retrieved and/or used for other things. So, while a human driver might know that they were “going kinda fast” when they hit that power pole, a computer in their car should be able to say they were traveling at 140.7 kph, at what point they swerved, and when the wheels lost traction. Hopefully that clarifies this idea of “unconscious knowledge”.

Most of us are accustomed to thinking of computers as storage and retrieval devices. Any time you record a video on your smartphone and play it over and over, your phone is storing and retrieving a quantity of audio and visual data that’s far in excess of what your memory can handle. Every detail is preserved. However, Artificial Intelligence (AI) is rapidly moving beyond that. Computers have always been able to do complex tasks, and that’s what they’re good for. What they lack (for now) is the ability to extrapolate information, learn on their own and make decisions outside of a predefined framework.

All of this is rapidly changing, but our thinking on how that affects “knowing” things is not changing so quickly.

The implications of “knowledge” in a computer-enhanced society:

Up until very recently, we’ve been using computers to record and display information, and do searches that dramatically speed up certain tasks, but not make decisions, or decide what to look for themselves. So, from a surveillance perspective, this means that while we can dragnet emails, and do keyword searches, there still has to be a human looking at the flagged emails to suss out the meaning. While we could record video and run facial recognition, we needed a human to determine what the identified person was doing, and how to proceed with the information. That has been the limiting factor.
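To make that limiting factor concrete, here is a minimal sketch of a keyword search over a handful of messages. The function name `flag_messages`, the sample messages, and the keyword list are all made up for illustration; this is not how any real system is known to work. The point is only that a keyword match produces a pile of flagged text, and someone still has to read it to decide what it means.

```python
import re

def flag_messages(messages, keywords):
    """Return (index, matched_keywords) for each message containing a watched keyword."""
    # Whole-word, case-insensitive patterns, one per keyword (hypothetical watch list).
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    flagged = []
    for index, text in enumerate(messages):
        hits = sorted(kw for kw, pattern in patterns.items() if pattern.search(text))
        if hits:
            flagged.append((index, hits))
    return flagged

# Made-up sample data for illustration only.
messages = [
    "That concert last night was the bomb!",
    "Lunch at noon?",
    "Can you bring the fertilizer for Mom's garden?",
]
keywords = ["bomb", "fertilizer"]

# Every flagged message still needs a human to read it in context.
review_queue = flag_messages(messages, keywords)
for index, hits in review_queue:
    print(index, hits, messages[index])
```

Two of the three messages get flagged here, and both are harmless. The computer has the data and can retrieve it perfectly, but it has no idea what any of it means.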

Before that (and this is how most of us average citizens still think), surveilling someone required an actual person spending a lot of time. For someone to sift through EVERYTHING you ever wrote would have been a literally impossible task. Today, while we will probably always have a little bit of writing that happens on paper, the vast majority of writing occurs in a digital medium. That means it can be recorded and searched, probably throughout your lifetime. Additionally, more and more speech is through a digital medium, or is within range of a recording device (e.g., a quiet conversation at a rally, next to someone recording it on their phone). This means that theoretically, we could eventually have a 1984-style society where everything is watched via the telescreen. Then it’s recorded, and continuously compiled and compared to the rest of your ‘profile’. I don’t believe this is close to possible today, but we’re moving in that direction in terms of technical ability.

To summarize the state of things today: I believe that we have a tremendous amount of data being recorded, but because it’s in different systems and human beings are needed to analyze it, much of it can only be used retroactively, or in small targeted ways. For example, if someone commits a serious crime today, it’s quite likely that we can find video footage surrounding the event, and perhaps emails etc. This makes it easier to convict them, but doesn’t help with preventing the crime. Similarly with corporations, there is lots and lots of “big data” and profiling going on, but it’s somewhat distributed between companies, and is usually only focused on predicting your likelihood to buy things.

Is it really that bad?

Take heart: the situation I’ve described, where everything is monitored, would take an astronomical amount of computing power, even by today’s standards. To simply monitor every person and store all the data they create would be a staggeringly huge endeavor, and a machine that could compile it and make decisions would demand as much computing power or more. On top of that, there ARE ways that we can protect ourselves from a panopticon state or company through technology.

There are two things that I think will best help prevent this.

First are zero-knowledge encryption systems, in which only the user is able to decrypt their data, and the service provider only ever sees encrypted data. Encryption is becoming more and more common as a security measure, but unfortunately end-to-end/zero-knowledge systems are still fairly uncommon because they block the profiling activity that most online companies use to make money.
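Here is a rough sketch of that idea: the key is derived and kept on the user's device, and the service only ever stores bytes it cannot read. Everything in it is an assumption made up for illustration, including the `Provider` class, the function names, and the passphrase. The cipher is a toy built from Python's standard library so the example runs anywhere; it has no tamper detection and is not something to protect real data with. A real service would use a vetted, authenticated cipher from a maintained library.

```python
import hashlib
import hmac
import secrets

# ILLUSTRATION ONLY: a toy stream cipher, not a vetted encryption scheme.

def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a passphrase. This runs only on the user's device."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 200_000)

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` pseudorandom bytes using HMAC-SHA256 over a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return a fresh random nonce followed by the XOR-encrypted data."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

def decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce and reverse the XOR with the same keystream."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))

class Provider:
    """Hypothetical storage service. It never sees the passphrase or the key."""

    def __init__(self):
        self._blobs = {}

    def put(self, name: str, blob: bytes) -> None:
        self._blobs[name] = blob

    def get(self, name: str) -> bytes:
        return self._blobs[name]

# On the user's device (made-up passphrase and note):
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)
note = b"Meet me at the usual place on Thursday."

provider = Provider()
provider.put("note", encrypt(key, note))   # only encrypted bytes leave the device

# Later, back on the device, the same key recovers the note.
recovered = decrypt(key, provider.get("note"))
```

The provider in this sketch has the data but lacks the ability to use it for any purpose, which is exactly the broken link in the "knowledge" chain that we want.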

The second thing is to have more distributed systems. Unfortunately, with cloud computing we’re moving in the opposite direction. I think that mobile devices should have larger storage, and more things should be stored ON the device, and run on the device. An example is navigation software. When you use Google Maps, everything’s running through their online servers. This works well, but means that Google has to know where you are and where you’re going. If you use a local system (such as an independent GPS, or the OsmAnd app), then everything is running on your device, and no one else has to see it.

These two areas are things that you and I can strive to use as much as possible in our daily lives. If enough people do that, it will tip the balance towards privacy, and even if it doesn’t, it will still protect those who use privacy-oriented services, so you can’t really lose. Stay tuned for some future posts on some of the tools that I use to protect myself.
