In reality, it’s not quite that simple. The human brain and visual system have had millions of years of evolution behind them, which allows us to identify and differentiate between objects after a single exposure. Humans need to see only one example of a cat to recognise any other cat as a cat, regardless of its size, colour, breed, the angle we view it from, or even which part of the animal is visible. Computer Vision is still new by comparison and requires far more ‘learning’: exposure to hundreds of images of cats of different breeds, colours and poses before it can achieve the same level of recognition accuracy.
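To make that ‘learning’ concrete, here is a minimal sketch of what exposure to hundreds of labelled images looks like in code. It assumes PyTorch and torchvision are available, a hypothetical data/train folder with one subfolder per class, and illustrative hyperparameters; it is an outline of the approach, not a production training script.

```python
# A minimal sketch of supervised "learning" from many example images.
# Assumes PyTorch/torchvision; folder paths and hyperparameters are illustrative.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Standard preprocessing: every photo is resized and converted to a tensor
# so the network sees inputs of a consistent shape.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical folder layout: data/train/cat/*.jpg, data/train/not_cat/*.jpg
train_set = datasets.ImageFolder("data/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Start from a pretrained backbone and replace the final layer
# with a two-way cat / not-cat classifier.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Each epoch is another pass over the hundreds of example images;
# this repetition is the "exposure" described above.
for epoch in range(5):
    for images, labels in train_loader:
        optimiser.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimiser.step()
```

Each pass over the folder is another round of ‘exposure’; only after seeing many varied examples does the model begin to generalise the way a human does from a single one.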
However, this slower learning is more than compensated for by Computer Vision’s ability to process vastly higher volumes of data than any human could. Where a human might review hundreds of photos per day to detect and classify cats, Computer Vision can process hundreds of thousands.
More important still is the difference in scalability. Detecting cats in tens of millions of images per month would require an impractically large number of human reviewers. Achieving the same with Computer Vision is simply a matter of adding a few more CPU cores and more memory, as the sketch below illustrates.
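As an illustrative sketch of that kind of scaling (assuming only Python’s standard library, with a hypothetical classify_image function standing in for a real model call), throughput grows simply by raising the worker count:

```python
# Illustrative sketch of horizontal scaling: spread image files across
# CPU cores with a process pool. classify_image is a hypothetical stand-in
# for a real model call; more cores means more images processed per hour.
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def classify_image(path: Path) -> tuple[str, bool]:
    """Placeholder: in a real system this would run the trained detector."""
    return (path.name, "cat" in path.stem.lower())

if __name__ == "__main__":
    image_paths = sorted(Path("images").glob("*.jpg"))  # hypothetical folder

    # max_workers controls how many cores are used; raising it is the
    # "adding a few more CPU cores" described above.
    with ProcessPoolExecutor(max_workers=8) as pool:
        for name, is_cat in pool.map(classify_image, image_paths):
            print(name, "cat" if is_cat else "not a cat")
```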
This processing efficiency is enabling remarkable technological breakthroughs, from self-driving cars to cancer screening, and is giving rise to entirely new industries such as brand monitoring and automated retail-shelf management.