I have always been fascinated by the human brain (well, consider the source being fascinated :-P), so when I saw a class on the subject opening up at my university, I immediately signed up for it.
The basic goal of neuroscience is to reverse-engineer a complete blueprint of the brain. I am not simply referring to its actual physical makeup:
The interesting part is how it actually works. Its API if you will, or “instructions”. One of the main topics covered in this class was sight. Most of us are familiar with the “lies-to-children” version of how vision works:
Well, the human brain does perform some sort of transformation, but it is a much more useful one.
To simplify matters, let’s consider the light-sensitive surface of the human eye (more specifically, the retina) to be a round disc (like a vinyl music record). I assume all of you are familiar with Cartesian coordinates, more commonly known as X-Y coordinates. Cartesian coordinates are very convenient when working with square surfaces (like a map or graph), but less so when the surface is circular. A more appropriate approach is to use a polar coordinate system. Instead of the two values X and Y, we have R and θ (Theta): R is the length of the straight line between the center of the circle and the point, and Theta is the angle between that straight line and the horizontal axis. For example:
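In code, going from X-Y to R and Theta could look something like this. It is only a small Python sketch, and the function name `to_polar` is my own invention:

```python
import math

def to_polar(x, y):
    """Convert a point (x, y) on the disc to polar coordinates (r, theta)."""
    r = math.hypot(x, y)        # distance from the center of the disc
    theta = math.atan2(y, x)    # angle from the horizontal axis, in radians
    return r, theta

# A point one unit to the right of the center and one unit above it.
r, theta = to_polar(1.0, 1.0)
print(r, math.degrees(theta))
```

The point (1, 1) sits at a distance of the square root of 2 from the center, at an angle of 45 degrees.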
Everybody knows that the right part of the brain controls most of the left side of the body and vice versa. Well, in the field of vision it’s a bit more complicated. The left side of the brain does not handle visual information from the right eye, but from the right field of vision. So it will handle data from the right half of what the right eye sees, and from the right half of what the left eye sees. But that is less relevant for what I want to talk about, and we could just as well assume we are all Cyclopes, i.e. we only have one eye (the second eye is important for detecting depth, but that’s a whole other subject by itself). That means the left side of the brain handles information from the right half of that eye’s field of vision and vice versa.
The part of the brain that is first in line to handle information from the eyes, and that is in charge of sending the most “raw” data about what we see to the rest of the brain, is called the visual cortex. Again, we simplify its surface area to make things clearer. With that in mind, one could consider the visual cortex to be rectangular.
So suppose a beam of light hits our eyes. What happens now? The same image will be replicated on the visual cortex, but only after undergoing a transformation along the way. That transformation takes the parameters R and Theta and displays them on the visual cortex in Cartesian coordinates – the Y value will be equal to Theta (in radians), and the X value will be equal to log(R). That’s right. The function log(). The one in your calculator.
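Written as code, the mapping described above is tiny. This is a sketch, not a model of real neurons: the name `retina_to_cortex` is made up, and I am assuming the natural logarithm (a different base would only stretch the X axis by a constant factor).

```python
import math

def retina_to_cortex(r, theta):
    """Map a point on the eye (r, theta) to a point (x, y) on the idealized visual cortex.

    Follows the rule from the text: x = log(r), y = theta (in radians).
    The natural log is an assumption; the text does not name a base.
    """
    if r <= 0:
        # log(r) is undefined at the exact center of the disc
        raise ValueError("r must be positive")
    return math.log(r), theta

# A point at distance e from the center, straight up (90 degrees).
x, y = retina_to_cortex(math.e, math.pi / 2)
print(x, y)
```

Note that the exact center of the disc (R = 0) has no image under this simplified rule, since log(0) is undefined.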
After you get over the initial shock of realizing that your brain is carrying out log calculations as you are reading these words (Yes, even now. And now. :-P), you start thinking: “why log?” “What makes it so special?” The only answer that comes to mind is evolution.
First you must realize that we do not see 3 dimensions. Sure, we’re all experts at not bumping into things, but we have no receptors for depth. Our eye is flat, and everything we see can be considered to be a 2D image, just like the ones on television. So how come we still have fighter pilots? Our brain uses very elegant algorithms to determine parameters such as depth from the image it receives. One of the algorithms relies on this log transformation – it helps us detect movement. Consider a 2D surface (we’ll consider a circular 2D area – the surface of our eye). A 2D object on that flat surface can transform in three ways: scaling, rotation and translation (moving from one place to another).
Think about what each of these transformations means in the real world. Suppose something is becoming bigger and bigger in your field of vision… you should probably duck for cover. Rotation is another good indicator that something is being hurled at you, or is simply moving in a manner you should pay attention to, unless you want to finish your day as fast food. Either way, you would want to decide whether to duck, run or attack as fast as you can.
Suppose you were the programmer of the human brain, and you needed to build an algorithm that detects objects growing in size. Remember – we don’t see things getting closer. We just see them getting bigger. Look at the animated scale gif. Your algorithm needs to detect that the same object is growing in all directions at once, which will probably not be a very time-efficient algorithm. The same goes for detecting rotation (we will get to translation later). This is where the beauty of log comes in. The mathematical properties of this function dictate that if someone throws a wrench in your direction, the image shown on your visual cortex will resemble this:
(left side – what your eye “sees”. right side – what your visual cortex “sees”.)
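In general, scaling will be transformed into movement of the same sized object along the X axis, while rotation will be transformed into movement of the same sized object along the Y axis. The reason is that log(k·R) = log(k) + log(R): making everything k times bigger adds the same constant to every X value, and rotating everything by some angle adds that angle to every Y value. The algorithm for detecting such a change is much more time-efficient, since it only has to notice an unchanged shape sliding along one axis. The reason our brain uses the log function stems simply from evolution – it’s a good way to detect danger, so we use it (when I say “we”, I obviously don’t mean just human beings. This didn’t happen overnight :-)).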
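If you want to convince yourself, here is a small Python sketch that scales and rotates a made-up three-point “object” and checks how each point moves on the idealized cortex. All the names (`to_cortex`, `scale`, `rotate`, `shape`) are mine, and the natural log is assumed:

```python
import math

def to_cortex(x, y):
    """Eye point (x, y) -> cortex point (log(r), theta)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)

def scale(points, k):
    """Scale every point by factor k around the center of the eye."""
    return [(k * x, k * y) for x, y in points]

def rotate(points, phi):
    """Rotate every point by phi radians around the center of the eye."""
    c, s = math.cos(phi), math.sin(phi)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# A hypothetical object: three points, none of them at the center.
shape = [(1.0, 0.5), (2.0, 1.0), (1.5, 2.0)]

before = [to_cortex(x, y) for x, y in shape]
scaled = [to_cortex(x, y) for x, y in scale(shape, 2.0)]
rotated = [to_cortex(x, y) for x, y in rotate(shape, 0.3)]

# How far each point moved on the cortex, as (dx, dy) pairs.
scale_shift = [(a[0] - b[0], a[1] - b[1]) for a, b in zip(scaled, before)]
rotate_shift = [(a[0] - b[0], a[1] - b[1]) for a, b in zip(rotated, before)]

print(scale_shift)   # every dx is about log(2), every dy is about 0
print(rotate_shift)  # every dx is about 0, every dy is about 0.3
```

Every point moves by the same amount, so the shape on the cortex keeps its size and simply slides. One caveat of this sketch: the angle wraps around at ±π, so a rotation that pushes a point past that line would make it jump to the other edge of the Y axis.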
I haven’t discussed the last possible movement – translation, i.e. an object moving from one place to another. But we get over that obstacle using our eyes themselves. Place your finger between you and your monitor. Focus on your finger. Now start moving it in front of you. Almost instinctively, your eyes will follow your finger, keeping it in the middle of your field of vision and effectively nullifying its translation. So that’s how we deal with translation.
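As a rough analogy in code, following the finger is like re-measuring every point relative to the object itself before doing anything else. This Python sketch uses the object’s centroid as the “point we look at”, which is my own simplification, and the names `centroid` and `fixate` are hypothetical:

```python
def centroid(points):
    """Average position of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def fixate(points):
    """Re-express points relative to the object's centroid,
    as if the eye had turned to put the object in the middle."""
    cx, cy = centroid(points)
    return [(x - cx, y - cy) for x, y in points]

# A hypothetical object, and the same object moved somewhere else.
shape = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
moved = [(x + 5.0, y - 2.0) for x, y in shape]

# After "following" the object, both versions look identical.
print(fixate(shape))
print(fixate(moved))
```

Once the object is centered like this, only scaling and rotation are left to detect.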
Bear in mind: “If the brain were so simple we could understand it, we would be so simple we couldn’t.” – Lyall Watson