One of humankind's most important senses for navigating our world is our ability to see in three dimensions (3D), so it makes sense to give our robots the same capability. But how we see may or may not be the best approach for machines to use. To guide robot operation with depth-sensing vision, three key approaches need consideration: stereoscopic, structured light, and time-of-flight.

Robot systems have been around for decades, but until recently they have mostly been working blind. Equipped only with contact, proximity, and position sensors, robots manipulate heavy materials, perform delicate assembly, or weld complex structures with elegantly choreographed and endlessly repetitive motion. But their successful operation depends on a precisely controlled environment, exact placement of the materials they handle, and careful mapping and programming of their movements.

The situation is changing, however. Research in machine vision and visual intelligence, advances in semiconductor fabrication, and emergence of the cellphone market for image sensors have simplified the development and lowered the cost of vision systems, making them an increasingly cost-effective option for extending robotic capabilities. With vision, especially 3D, to help guide them, robots are becoming more capable of interacting with an unstructured world, more flexible in their operation, and easier to adapt to new tasks.

The vision characteristics a given robot requires, though, are highly application-dependent. Robots that must visually navigate through a busy warehouse will need long-range sensing of a dynamic environment, but with only modest accuracy. These are completely different needs from those of stationary robots that retrieve parts mixed together in a bin and sort them into piles of the same type. Those robots might require high-precision vision within a limited range. Robots performing precision assembly have yet another set of needs. Thus, determining which 3D vision approach to employ begins with understanding the principles governing the way machines "see."

Stereoscopic machine vision
Because of its similarity to the way in which we see, the easiest 3D approach to understand is stereoscopic vision. This is a form of triangulation, in which two (or more) images captured by cameras a distance apart (or one camera that moves between images) get compared to determine the distance to objects in the camera fields of view. The camera separation creates parallax, in which the alignment of nearer objects against a distant background differs, and the closer the object is to the cameras the greater the differences.

This can be seen in the simple example shown in Figure 1. Two cameras, pointing along parallel axes with their sensors aligned and separated by baseline distance B, each capture an image of point (P) in 3D-space (X, Y, Z). The images they have captured will contain that point at different locations (uL and uR) in their 2D image planes. Geometrically, that position is equivalent to where a ray from the point to a camera passes through a plane perpendicular to the camera's optical axis (ZA), located a distance equal to the camera lens' focal length (f).

Figure 1 Simplified stereoscopic vision geometry
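The projection described above can be sketched in a few lines of Python. This is only an illustration under an ideal pinhole model with no lens distortion; the function name project_u and all numeric values (an 800-pixel focal length, a 0.125 m baseline, a point 2 m away) are hypothetical and chosen for easy arithmetic.

```python
def project_u(x, z, f, camera_x=0.0):
    """Horizontal image coordinate of a 3D point for an ideal pinhole camera.

    The camera sits at (camera_x, 0, 0) and looks along +Z. The focal
    length f and the returned coordinate share the same units (pixels here).
    """
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return f * (x - camera_x) / z


# Hypothetical setup: f = 800 px, baseline B = 0.125 m, point at X = 0.25 m, Z = 2 m
f, B = 800.0, 0.125
X, Z = 0.25, 2.0

u_left = project_u(X, Z, f, camera_x=0.0)   # left camera at the origin
u_right = project_u(X, Z, f, camera_x=B)    # right camera shifted by the baseline
print(u_left, u_right)
```

The same point lands at a different horizontal position in each image, which is the parallax that stereoscopic triangulation relies on.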

If you take the point where each camera's ZA intersects that plane and treat it as the origin of that image's 2D coordinate system, then the difference between the two image positions, uL and uR, gives the point's disparity (d). The point's distance from the image plane (depth) is then easily calculated as:

depth = f * B/d
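As a quick numeric check of this relationship, the following minimal sketch evaluates the formula for a few disparities. The function name depth_from_disparity and the values used (focal length of 800 pixels, baseline of 0.125 m) are assumptions for illustration only. With f and d in pixels and B in meters, the depth comes out in meters.

```python
def depth_from_disparity(f, baseline, disparity):
    """Depth of a point from its stereo disparity: depth = f * B / d.

    f and disparity must share units (pixels here); the result takes
    the units of the baseline.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return f * baseline / disparity


f_px = 800.0        # hypothetical focal length, in pixels
baseline_m = 0.125  # hypothetical camera separation, in meters

for d_px in (100.0, 50.0, 25.0):
    print(d_px, "px ->", depth_from_disparity(f_px, baseline_m, d_px), "m")
```

Halving the disparity doubles the computed depth, so nearer objects produce larger disparities. It also means a fixed error in measured disparity costs more depth accuracy at long range than at short range.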

In general, though, real-world systems are not so conveniently aligned. Figure 2 shows the more generalized setup, in which each camera has its own coordinate system based on the direction of its optical axis and the rotational orientation of its image sensor's pixel grid. Determining the image point's disparity becomes more complicated than a simple distance calculation, involving coordinate transformations and geometrical corrections, but the triangulation principle is the same.

Figure 2 Real-world stereoscopic vision geometry
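One way to picture the generalized case is to express each camera's viewing ray in a shared coordinate frame and then find where the two rays come closest. The sketch below uses the midpoint of the shortest segment between the rays, which is one common triangulation approach and not necessarily what any particular vision library does. The helper names, the 5-degree toe-in, and the test point are all hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rotate_y(v, angle):
    """Rotate vector v about the Y axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)


def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + s*d1 and o2 + t*d2."""
    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is undefined")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Hypothetical scene: right camera offset by 0.125 m and toed in by 5 degrees.
point = (0.25, 0.1, 2.0)
left_origin = (0.0, 0.0, 0.0)
right_origin = (0.125, 0.0, 0.0)
toe_in = math.radians(-5.0)

# Ray as the left camera sees it (its frame is taken as the shared frame).
ray_left = tuple(p - o for p, o in zip(point, left_origin))

# Ray as the right camera measures it in its own rotated frame...
ray_right_cam = rotate_y(tuple(p - o for p, o in zip(point, right_origin)), -toe_in)
# ...then transformed back into the shared frame before triangulating.
ray_right = rotate_y(ray_right_cam, toe_in)

estimate = triangulate_midpoint(left_origin, ray_left, right_origin, ray_right)
print(estimate)
```

With noise-free rays the estimate coincides with the original point up to floating-point rounding. With real measurements the rays rarely intersect exactly, which is why a closest-approach estimate is used here instead of an exact intersection.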

Fortunately, there is a large body of both commercial and open-source software available that can handle those calculations. There is also software that uses camera images of a calibration grid to determine all the necessary coordinate transformations, so developers do not need to precisely measure camera orientations. Calculating the depth information for a single point in space is thus a relatively straightforward operation in machine vision systems.

But many other computational challenges remain. One of the most significant is having the system determine which points in the various camera images correspond to the same physical point in space. Making that determination can involve a highly complex correlation process: a small group of pixels from one image is compared against all the groups in the other image to find the one that matches, and that process is then repeated for every small group in the first image.
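The correlation search described above can be sketched as a simple block matcher. This minimal example assumes the two images have already been aligned so that matching points lie on the same row, which limits the search to one dimension, and it scores candidates with the sum of absolute differences (SAD). SAD is just one of several possible matching costs, and the function names and synthetic images here are hypothetical.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two square pixel blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_l + dx] - right[row + dy][col_r + dx])
    return total


def match_disparity(left, right, row, col, half=1, max_disp=8):
    """Disparity whose right-image block best matches the left block at (row, col).

    The caller must keep the block around (row, col) inside the left image.
    """
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        col_r = col - d
        if col_r - half < 0:
            break  # candidate block would fall off the image edge
        cost = sad(left, right, row, col, col_r, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic test images: every value within a row is distinct, and the
# left image is the right image shifted 3 pixels to the right.
height, width, shift = 5, 20, 3
right_img = [[(c * 37 + r * 11) % 256 for c in range(width)] for r in range(height)]
left_img = [[row[c - shift] if c >= shift else 0 for c in range(width)] for row in right_img]

found = match_disparity(left_img, right_img, row=2, col=10)
print(found)  # prints 3, the shift used to build the images
```

Even in this reduced form, each pixel costs (max_disp + 1) block comparisons, so a full-resolution disparity map multiplies that work by the number of pixels. That is a large part of why correspondence matching dominates the computational load.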


Rich Quinnell is an engineer, writer, and Global Managing Editor for the AspenCore Network.

