# Computer Vision
Computer vision is a field focused on methods for analyzing and understanding images and video, with the intent of producing symbolic information about their content. It is closely related to image processing, building on basic image processing algorithms to extract richer, more structured information from images. The study of computer vision generally focuses on a few common applications, detailed below, though it is a rapidly expanding field.
## Common Applications

### Recognition

The most fundamental problem solved through computer vision is recognizing objects or people, either identifying specific subjects or classifying them into broader categories. While this comes extremely naturally to humans, it is a very complex problem for computers: an image is usually stored as an array of light intensity values, and converting those values into information about the nature of the subject(s) of the photograph is a nontrivial problem. Most recognition applications fall into one of three categories:
- Object recognition (or object classification): the process of identifying certain classes of objects in an image. This is often accomplished by training software, commonly a convolutional neural network, on examples of those objects (see the sketch after this list). The position, orientation, and possibly other attributes of the objects are also usually determined.
- Identification: the process of identifying specific instances of an object, e.g. a person's face or a license plate number.
- Detection: the triggering of a system's behavior upon detection of a type of object, e.g. a camera that begins recording once movement is detected.
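As a rough illustration of the classification case, the sketch below runs a pretrained convolutional neural network over a single image and reports the most likely category. It assumes PyTorch and torchvision are installed; the filename `photo.jpg` is a placeholder. It is a minimal example of CNN-based recognition, not a production pipeline.

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights
from PIL import Image

# Load a small CNN pretrained on ImageNet (assumes torchvision >= 0.13).
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights)
model.eval()

# The weights object bundles the preprocessing the network expects
# (resize, crop, normalization) and the category names it was trained on.
preprocess = weights.transforms()
categories = weights.meta["categories"]

image = Image.open("photo.jpg")          # placeholder filename
batch = preprocess(image).unsqueeze(0)   # shape: (1, 3, H, W)

with torch.no_grad():
    probs = model(batch).softmax(dim=1)[0]

top = probs.argmax().item()
print(f"{categories[top]}: {probs[top].item():.2%}")
```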
### Motion Analysis

Motion analysis applications attempt to determine the velocity of objects moving through space by analyzing video of those objects in motion. More specifically, these applications involve the analysis of optical flow, the apparent motion of image content between consecutive frames.
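One common way to approximate dense optical flow is Farnebäck's algorithm, available in OpenCV. The sketch below assumes OpenCV (`cv2`) is installed and uses a placeholder video file `clip.mp4`; it computes a per-pixel displacement field between consecutive frames and prints the average motion magnitude.

```python
import cv2

cap = cv2.VideoCapture("clip.mp4")   # placeholder video file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Dense optical flow: a (dx, dy) displacement estimate for every pixel.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    print("mean pixel displacement:", magnitude.mean())

    prev_gray = gray

cap.release()
```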
### Scene Reconstruction

Scene reconstruction applications attempt to produce a 3D map of a scene based on multiple images of that scene. By analyzing the disparity between the locations of corresponding points in images taken of the same scene from different angles, the z-distance, or distance from the camera, of points in the scene can be estimated. These data are used to construct the scene in 3D.
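A standard way to estimate that disparity is block matching on a rectified stereo pair. The sketch below assumes OpenCV and NumPy, placeholder image files `left.png` and `right.png`, and made-up camera parameters; it computes a disparity map and converts it to depth using z = focal_length × baseline / disparity.

```python
import cv2
import numpy as np

# Rectified left/right views of the same scene (filenames are placeholders).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching: for each pixel in the left image, find the best-matching
# block in the right image; the horizontal offset is the disparity.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point output

# Depth (z-distance) is inversely proportional to disparity.
focal_length_px = 700.0   # assumed focal length, in pixels
baseline_m = 0.1          # assumed distance between the two cameras, in metres

valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = focal_length_px * baseline_m / disparity[valid]
```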
For further reading, see the Wikipedia page on the subject.