What is the difference between computer vision and human vision?
What is computer vision?
Computer vision (CV) enables computers to “see” and understand digital images, such as photographs and videos. It attempts to mimic human vision by recognizing objects in images and automatically extracting information about them, so computers can make inferences about images without any human help. A computer vision system might determine whether an object appears in a photo, count how many objects are in the photo, locate where the objects are within the frame, and much more. This seems simple because humans can effortlessly view the world around them; however, teaching a computer to see like a human is difficult, partly because we still don’t fully understand how human vision works.
How do computers see?
Computers see the world through pixels. Pixels are the tiny squares that make up a digital image; each pixel’s brightness and color are stored as numbers (typically red, green, and blue intensity values).
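The idea that an image is just a grid of numbers can be shown with a tiny sketch (the pixel values below are made up for illustration):

```python
# A minimal sketch of how a computer "sees" an image: a grid of numbers.
# Here, a 2x2 grayscale image, where 0 is black and 255 is white.
image = [
    [0, 255],    # black pixel, white pixel
    [128, 64],   # medium gray, dark gray
]

# A color image stores several numbers per pixel: an (R, G, B) triple.
# This pixel is pure red: full red intensity, no green, no blue.
red_pixel = (255, 0, 0)

print(image[0][1])  # brightness of the top-right pixel
```

Everything a computer vision system does, from counting sheep to spotting tumors, ultimately starts from arrays of numbers like these.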
So how do machines recognize which pixels make up which objects? Computer vision essentially revolves around pattern recognition. For CV to work, machines must learn from people how to recognize images, and this is accomplished through repetition: computers need to be fed as many labeled images as possible. For example, if you wanted to teach a computer program to identify sheep, you would show it numerous pictures with labeled sheep. To label a sheep, you would simply draw a box around it and write “sheep”. The computer would record which pixels fall inside the box and then associate that pattern of pixels with sheep.
After viewing enough images, the computer can successfully infer on its own which pictures contain sheep. Computers might have to view thousands, or even millions, of labeled images to detect sheep reliably, but the enormous amount of data now available (more than 3 million images are shared online every day) makes this possible.
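The learn-from-labeled-examples idea above can be sketched as a toy classifier. This is not a real CV pipeline: each “image” is reduced to a single invented feature (its average brightness), and all the values and labels below are made up for illustration.

```python
# Toy sketch of learning from labeled examples (not a real CV pipeline).
# Each "image" is one made-up feature: its average brightness (0-255).
labeled_examples = [
    (210, "sheep"), (200, "sheep"), (220, "sheep"),  # sheep: bright/white
    (60, "grass"), (70, "grass"), (55, "grass"),     # grass: darker
]

def train(examples):
    """Average the feature value for each label (a nearest-centroid model)."""
    totals, counts = {}, {}
    for value, label in examples:
        totals[label] = totals.get(label, 0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

def predict(model, value):
    """Pick the label whose average feature is closest to the new image's."""
    return min(model, key=lambda label: abs(model[label] - value))

model = train(labeled_examples)
print(predict(model, 205))  # near the sheep average, so -> "sheep"
```

Real systems use far richer features (and models such as convolutional neural networks), but the principle is the same: more labeled examples make the learned patterns more reliable.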
How do humans see?
Human vision revolves around light and does not involve learned repetition or patterns; we do not need to learn how to see, because the ability is biologically ingrained in us. Human vision consists of multiple steps. First, light bounces off the objects in front of you and enters your eyes through the cornea. The cornea bends the light toward the pupil, the opening whose size the iris adjusts to control how much light enters the eye. The light is then focused onto the retina, which contains special sensors called rods and cones: rods respond to light levels, while cones detect color.
Once the rods and cones have been exposed to light, they translate the visual information into electrical signals. The optic nerve carries these signals to your brain, where the visual cortex interprets them, allowing you to form a visual map of the scene.
As you can see, human vision is a very complex process, which can explain why computer vision is so difficult to develop.
Computer and human vision biases
Both computer and human vision can be fooled. Humans, for example, are biased to recognize faces even where none exist. In the rotating mask illusion, as a hollow mask rotates you should see its concave back, but your brain automatically perceives it as a normal, convex face. Computer vision has no such bias toward faces; however, it might classify images that are unrecognizable to humans as familiar objects. For instance, a computer might identify a random conglomeration of colors as a robin.
Such an image is known as a white noise image. White noise images present issues because, purely by chance, they can contain significant amounts of some object’s characteristic colors in a concentrated location, even though that object is not really there. If enough of these colors appear next to each other, the program may classify them as the object.
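Why random noise contains object-like colors can be shown with a short sketch. The “robin-like” color range below is invented for illustration, and the noise image is just random RGB triples:

```python
import random

random.seed(0)  # make the noise reproducible

# Build a 32x32 "white noise" image: every pixel is a random (R, G, B) triple.
WIDTH, HEIGHT = 32, 32
noise = [
    (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    for _ in range(WIDTH * HEIGHT)
]

def looks_robin_like(pixel):
    """Made-up test: an orange-ish breast color (high red, mid green, low blue)."""
    r, g, b = pixel
    return r > 180 and 60 < g < 140 and b < 80

matches = sum(looks_robin_like(p) for p in noise)
# Pure noise still contains some pixels inside any given color range, so a
# classifier keyed to color patterns can be misled by chance clusters.
print(matches > 0)
```

The same statistical effect, scaled up to real feature detectors rather than raw colors, is what lets noise images trigger confident but wrong classifications.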
How is computer vision being used today?
Although computer vision has taken decades to develop, huge strides are being made in the field. Mobile technology with built-in cameras, advanced computing power, accessible hardware designed for computer vision, new algorithms like convolutional neural networks, and an abundance of data have all contributed to the rise of computer vision.
CV is currently employed in self-driving cars, facial recognition, healthcare, and many other fields. It is enabling doctors to analyze chemotherapy response assessments with greater speed and accuracy, helping identify how cancer patients are responding to treatment. Additionally, the U.S. Air Force is discovering the speed and power of CV in its dogfighting simulator.
Computer vision can also be much faster than human vision: in a computer, the signal travels as electrical impulses, while in humans it is relayed through a chain of chemical events involving sodium and potassium ions. In sum, computer vision is already capable of drastically improving various systems, and we haven’t come close to fully tapping its abilities.