Last week I introduced the topic of artificial intelligence and technological singularities, and why they remain grounded more in fantasy than reality. We continue that discussion today.
You've probably heard of Google's recent push for their indoor mapping product Project Tango -- they are arguably the most notable company making forays into this technology, though they are hardly the first. Digital wayfinding is a new and booming industry, and it is prepared to push the boundaries of how people interact with and navigate spaces.
I'm sure many of us are familiar with wayfinding by vehicle, and have made it a near-daily part of our lives with the rise (and fall) of Mapquest, personal GPS devices and the ubiquitous Google Maps. But indoor navigation (and its eventual integration) is not as simple as roadway navigation; road systems generally employ naming conventions thorough enough that they would never need to rely on an instruction like "turn right at the gas station." This is what has allowed outdoor/vehicular navigation to proliferate while indoor mapping lags behind: very little relies on line of sight beyond identifying roads, which are predictable and usually static. But in an environment like a mall, where hallways do not have street names and stores change quite frequently, a system of text directions ends up somewhat ambiguous. Stores come and go, the same way paintings in a museum rotate. How does a computer guide a user when its reference points are as temporary as a store's lease? Or, perhaps more difficult: how does a computer understand what a human could identify as a reference point?
Picture this problem:
- You are a computer generating path directions for a user. Your programming tells you that a store on the left is closest to your current position, and thus it serves as an ideal reference point. You tell the user, "Make a left turn at [Store]." It is all your code has instructed you to do (a minimal sketch of this logic follows the list).
- You are a user, navigating a mall with a navigational system. It directs you down a hall and then tells you that you must make a left turn at [Store]. However, you cannot see any signage identifying that store: its entrance is around the corner, and to see it you would have to make the turn already anticipating it. The map says you will be walking 42 metres, but is your perception of distance that accurate? What if it's the NEXT corner?
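To make the computer's half of that exchange concrete, here is a minimal sketch in Python of the distance-only logic described above. Everything in it is hypothetical (the Landmark class, generate_turn_instruction, the sample stores); the point is simply that the program picks whichever mapped store sits closest to the turn and names it, with no notion of whether its signage can actually be seen from where the user is standing.

```python
import math
from dataclasses import dataclass

@dataclass
class Landmark:
    name: str
    x: float   # mapped coordinates, in metres
    y: float

def generate_turn_instruction(turn_x, turn_y, direction, landmarks):
    """Pick the landmark mapped closest to the turn point and use it as the
    reference, with no idea whether its signage is visible to the user."""
    nearest = min(
        landmarks,
        key=lambda s: math.hypot(s.x - turn_x, s.y - turn_y),
    )
    return f"Make a {direction} turn at {nearest.name}."

# Hypothetical mall data: Store A is mapped closest to the corner,
# but its entrance (and signage) sits around that corner.
stores = [
    Landmark("Store A", 10.0, 42.5),   # closest, but hidden from the approach
    Landmark("Store B", 14.0, 40.0),   # slightly farther, clearly visible
]

print(generate_turn_instruction(10.0, 42.0, "left", stores))
# -> "Make a left turn at Store A."
```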
I am sure most people could wing this direction and end up in the right place. After all, it's a mall, not a maze, and critical thinking makes up for the lack of nuance in the computer's code. But as technology advances towards machines doing our thinking for us, we're seeing more and more people driving their cars into rivers or ending up in the wrong city because a GPS told them to -- we want artificial intelligence to wrangle it for us.
On the computer's end, the current lack of nuance is an issue with line of sight. The computer cannot replicate line of sight; it has no eyes, no brain, and only a vague, pre-programmed concept of space. If the application does not involve indoor positioning -- not all clients want to install hundreds of little beacons, after all -- then it cannot even offer "turn left in 42 metres." It is limited to contextual locations, and it has no way of knowing where signage appears, what the mall looks like at any given moment, or that despite having been mapped closer, Store A is a poorer visual reference point than Store B.
Writing rules to get a computer program to wrangle a concept like line of sight requires the program not only to have "eyes", but also to process a mall spatially as a human would. It demands a form of artificial intelligence.
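Even a crude, decidedly non-intelligent approximation of "can the user see this sign?" makes the data problem obvious. The sketch below is hypothetical throughout: it treats line of sight as a simple segment-intersection test between the user's position and a store's sign, blocked by wall segments, which already presupposes that someone has modelled every wall and every sign location, and kept that model current as the mall changes.

```python
def _ccw(a, b, c):
    """True if points a, b, c are in counter-clockwise order."""
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 crosses segment p3-p4 (ignoring collinear edge cases)."""
    return (_ccw(p1, p3, p4) != _ccw(p2, p3, p4)) and (_ccw(p1, p2, p3) != _ccw(p1, p2, p4))

def sign_is_visible(user, sign, walls):
    """A sign counts as visible if no wall segment crosses the sightline to it."""
    return not any(segments_cross(user, sign, w_start, w_end) for w_start, w_end in walls)

# Hypothetical geometry: a single wall hides Store A's sign from the user.
user = (10.0, 30.0)
store_a_sign = (10.0, 44.0)            # around the corner
store_b_sign = (14.5, 40.0)            # down the visible hallway
walls = [((8.0, 42.0), (12.0, 42.0))]  # the corner the user has not yet turned

print(sign_is_visible(user, store_a_sign, walls))  # False
print(sign_is_visible(user, store_b_sign, walls))  # True
```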
I illustrate this not to hammer home the computer's lack of intelligence but rather the simplicity of the machine. A computer is, at its core, a series of switches, and programming is the translation of combinations of those switches into human-identifiable commands. Computer code is a language in its own right, expressed through a variety of programming languages that are more legible to humans and then translated again into human languages. Any command a computer carries out has been fed through this chain of languages: an idea spoken and communicated in English, put into practice in a coding language, and finally issued to the machine as 1s and 0s.
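For a small, concrete glimpse of one of those translation layers, Python's standard dis module will print the lower-level bytecode instructions the interpreter actually executes for a line of human-legible code. Bytecode is still a step above raw 1s and 0s, but the layering is the point; the function below is my own hypothetical example.

```python
import dis

def turn_left_at(store):
    # The idea, expressed in something close to English.
    return "Make a left turn at " + store + "."

# The same idea one translation layer down: the instructions the interpreter
# actually runs (opcode names vary between Python versions, e.g. LOAD_FAST,
# LOAD_CONST and a binary add/concatenation op).
dis.dis(turn_left_at)
```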
Put like this, computer communication is essentially about instructions. If x, then y. When q, do z. These instructions can get complicated and overlap, and the more complex they are, the bigger the risk of conflict, not to mention the massive undertaking of keeping the instructions sensible. There's also a broad distinction between following instructions and intelligence: at no time is a computer ever doing anything it was not specifically (or accidentally) programmed to do.
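Those if/then instructions can be written out almost literally. The sketch below is hypothetical, but it shows how quickly two individually sensible rules overlap on the same turn, and how a factor nobody wrote a rule for, in this case visibility, is simply never consulted.

```python
def choose_instruction(distance_to_turn, nearest_store, store_visible):
    """Two hand-written rules, each sensible on its own."""
    instructions = []

    # Rule 1: if the distance to the turn is known, give a metric instruction.
    if distance_to_turn is not None:
        instructions.append(f"Turn left in {distance_to_turn} metres.")

    # Rule 2: if a store is mapped nearby, use it as the reference point.
    if nearest_store is not None:
        instructions.append(f"Turn left at {nearest_store}.")

    # store_visible is accepted but never consulted: no rule was written for it,
    # so the program cannot "notice" that the store it just named is hidden.
    # With no rule to arbitrate, it also has no basis for preferring one
    # instruction over the other.
    return instructions

print(choose_instruction(42, "Store A", store_visible=False))
# -> ['Turn left in 42 metres.', 'Turn left at Store A.']
```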
Next week, we touch on those old ghosts in the machine, and the bugs that will stand in the way of genuine AI.