What can human vision do that computer vision can’t? Humans perceive the world in three dimensions, and depth sensors are key to enabling next-level machine vision and unlocking autonomy.
More and more machines are endowed with the ability to sense, act, and interact with their environment, supported by recent advances in sensing technologies. EE Times Europe scanned the 3D vision landscape to get a clearer picture of the market drivers, the opportunities and challenges for component suppliers, and the technologies emerging to enable higher levels of depth sensitivity.
At the module level, the 3D sensing market is currently valued at US$6.8 billion and will grow at a 15% CAGR to US$15 billion by 2026, according to Yole Développement.
“In the mobile and consumer markets, which are the driving force, there is a temporary hiatus in growth due to the Huawei ban as well as the fact that the Android camp has de facto abandoned 3D sensing,” Pierre Cambou, principal analyst in the photonics and sensing division at Yole Développement, told EE Times Europe. On the other hand, he added, “Apple is accelerating the trend by including LiDAR sensors in iPads and iPhones.”
Also accelerating is the use of 3D in the automotive context, said Cambou. LiDAR sensors and in-cabin 3D cameras are being adopted, and “we are very optimistic about 3D sensing in the automotive market, which should quadruple in the next five years.”
At present, the prevalent 3D imaging technologies are stereo vision, structured light, and time of flight (ToF).
Stereo vision has been very strong for long-range sensing applications beyond 10 meters, such as consumer drones from companies like DJI and forward-facing ADAS cameras in Mercedes, Jaguar, and Subaru models, said Cambou.
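The long-range stereo systems mentioned above all rest on the same textbook relation, depth from disparity: z = f·B/d, where f is the focal length in pixels, B the baseline between the two cameras, and d the disparity in pixels. The numbers below are illustrative assumptions, not figures from any specific ADAS camera:

```python
# Depth from stereo disparity: z = f * B / d.
# f: focal length in pixels, B: camera baseline in meters,
# d: disparity in pixels. All values here are illustrative.

def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point observed with the given pixel disparity."""
    return focal_px * baseline_m / disparity_px

# With a 1,200-px focal length and a 30-cm baseline, an object at 10 m
# produces a 36-px disparity. At 100 m the disparity shrinks to 3.6 px,
# which is why long-range stereo sensing demands wide baselines and
# sub-pixel matching accuracy.
print(round(stereo_depth_m(1200, 0.30, 36.0), 1))  # prints 10.0
```

The rapid falloff of disparity with distance is the core limitation that ToF and LiDAR approaches, discussed next, sidestep by measuring time directly.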
Structured light has been the preferred approach for short-range sensing below 1 meter, typically in the Apple iPhone for front-mounted Face ID but also in some industrial applications, addressed by companies such as Photoneo.
Time-of-flight systems are mainly used for medium ranges and currently come in two flavors, Cambou said. Indirect ToF was used in Android phones (from vendors such as Huawei, Samsung, and LG) on the rear side in 2019 and 2020 for photographic purposes. Direct ToF is used by Apple in its most advanced smartphones. “Direct time of flight is the technology being used for LiDARs [for example, by Velodyne, Innoviz, Ibeo, Hesai, and RoboSense], which may eventually use a matrix-shaped sensor on the receiver side,” said Cambou. “It is gaining ground due to the excitement around autonomy.”
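The direct-ToF principle behind these LiDARs reduces to timing a light pulse's round trip: distance d = c·t/2. A minimal sketch (the round-trip time below is an illustrative value, not from the article; real receivers use SPAD arrays and time-to-digital converters with picosecond resolution):

```python
# Direct time-of-flight: distance from the round-trip time of a laser pulse.
# Illustrative sketch only; production LiDAR receivers resolve picoseconds.

C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_s: float) -> float:
    """Distance to target given the pulse's round-trip time in seconds."""
    return C * round_trip_s / 2.0

# A pulse returning after ~1.33 microseconds corresponds to roughly 200 m,
# the long-range regime front-facing automotive LiDARs target.
print(round(tof_distance(1.334e-6), 1))
```

Indirect ToF instead infers the delay from the phase shift of a modulated signal, which is cheaper to implement but limits unambiguous range.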
EELs or VCSELs?
LiDAR’s ability to capture the entire scene makes it a valuable technology for machine-vision applications. The two systems most commonly used to get a three-dimensional point cloud are flash LiDAR and scanning LiDAR. In a scanning LiDAR system, a focused pulsed laser beam is directed into a small solid angle by either a mechanical rotating mirror or a microelectromechanical-system (MEMS) mirror, said Matthias Hoenig, global marketing manager at Ams-Osram. Because the high-power laser beam is confined to that small solid angle, the achievable range for a given optical power is much greater than with a 3D flash system. “Edge-emitting lasers [EELs] are the product of choice for this system architecture, as they deliver a particularly large amount of light in a small space via a small emission area and thus also score in terms of power and range,” Hoenig said.
Osram, now part of Ams, said it has recently made progress with the waveguide stability of its lasers as the temperature in the package rises during application. Products with a higher number of wavelengths for LiDAR applications are also being explored.
Regarding laser diodes, EELs are currently the largest market opportunity, but vertical-cavity surface-emitting lasers (VCSELs) will rapidly catch up in the future, Yole predicts. VCSELs combine the high power density and simple packaging of an infrared LED with the spectral width and speed of a laser.
“The advantages of the technology, including excellent beam quality, simple design, and advances in miniaturization, explain the growth of the VCSEL market,” said Hoenig. “In general, they require somewhat more installation space than EEL emitters but offer advantages for certain fields of application.” For instance, their radiation characteristics make them particularly suitable for flash LiDAR systems as well as for active stereo vision in industrial applications such as robotics and logistics vehicles, he said.
As for technical challenges related to VCSELs, Hoenig said Ams-Osram is working on higher optical output. Following its acquisition of Vixar in 2018, it demonstrated dual- and triple-junction VCSELs that provide better efficiency and speed than single-junction VCSELs. At this year’s Photonics West, it launched the PowerBoost VCSEL portfolio based on this multi-junction technology. The company said it is also exploring various ideas to improve heat dissipation — for instance, by making a change from top- to bottom-emitting components.
All common 3D sensing approaches depend on the smooth interaction of the various system building blocks, said Lei Tu, senior marketing manager at Ams-Osram. Usually, these systems consist of a light source, special optics, a detector, and downstream software that processes the detected signals accordingly. In the future, she continued, “for component manufacturers such as Ams-Osram, the focus will be on meeting customers’ requirements in the best possible way. This includes the miniaturization of components [and optimization of] their optical performance, lifetime, and, of course, ease of use.” Tu added that some customers prefer “a ready-made, pick-and-place solution,” while others are more inclined to assemble the individual components themselves or have them assembled by third parties into a complete solution.
Depth- and side-sensing for blind-spot detection
Depth perception is the ability to see in three dimensions and to judge how far away an object is. LiDAR acts as an eye for a self-driving car, and many carmakers use it to build a three-dimensional map of the environment around the vehicle. To date, however, development has focused predominantly on front-facing LiDAR systems with a long detection range (beyond 200 meters) but a relatively narrow field of view (about 20° to 30°).
OQmented, a 2019 spinoff from Fraunhofer Institute for Silicon Technology (ISIT) in Germany, is working to change that. The company says it has developed a MEMS mirror technology that enables side LiDARs with a 180° field of view.
“The side-looking LiDAR systems are more targeting the short range” to enable blind-spot detection, said Ulrich Hofmann, founder and managing director of OQmented. Blind-spot detection is an important safety feature that makes the short-range side-scanning systems “even more relevant than far-looking systems,” he added. For example, “you need those observing LiDAR systems for the short range when entering intersections, because there is a lot of traffic from pedestrians, cyclists, cars, etc., which can easily lead to confusion and accidents. For that reason, it is important to have a clear overview over a wide angle but also high lateral resolution to discriminate between different objects —static and moving.”
To enable 180° laser scanning, OQmented caps its MEMS mirror device with a curved glass lid rather than a plane-parallel one. The patented Bubble MEMS technology not only offers “hermetic vacuum packaging and protection” from environmental contaminants but also ensures that the laser beam passes cleanly into and out of the package, because the beam always strikes the glass perpendicularly, Hofmann said. That is not the case with a planar parallel lid: at large scan angles, part of the light is reflected back into the package at the lid, which is unacceptable for any kind of LiDAR solution, he said.
Closer to the data source
Image sensors generate an enormous quantity of data. While most of the processing currently resides in the cloud or in the central processing unit, the trend is to take computing closer to the source of data and embed intelligence near or within the sensor.
For viewing purposes, the data is usually compressed with H264, which means it can be funneled through bandwidths in the 100-Mbps range, said Yole’s Cambou. “In the context of sensing, data streams are typically 10× to 100× larger — 1 Gbps for machine vision is very typical — and if 10 cameras are being used simultaneously, then you very quickly reach 10 Gbps and beyond. The necessity to manage the data close to the sensor arises from the burden at the CPU level. All preprocessing, cleaning, and AI enhancement, if needed, must be done closer to the sensor in order to lower the burden on the CPU.”
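Cambou’s bandwidth arithmetic can be sketched as follows. The resolution, bit depth, and frame rate below are illustrative assumptions chosen to land near his quoted 1-Gbps-per-camera figure, not values from the article:

```python
# Rough arithmetic behind the sensing-bandwidth figures quoted by Yole.
# Resolution, bit depth, and frame rate are illustrative assumptions.

def raw_stream_gbps(width: int, height: int, bits_per_pixel: int, fps: int) -> float:
    """Uncompressed pixel stream in gigabits per second."""
    return width * height * bits_per_pixel * fps / 1e9

# One 2-MP machine-vision camera at 12 bits/pixel and 40 fps: ~1 Gbps.
one_camera = raw_stream_gbps(1920, 1080, 12, 40)

# Ten such cameras running simultaneously approach 10 Gbps, which is
# why preprocessing must migrate toward the sensor instead of the CPU.
ten_cameras = 10 * one_camera
print(f"{one_camera:.2f} Gbps per camera, {ten_cameras:.1f} Gbps for ten")
```

Compressing for human viewing (e.g., H.264) cuts this by an order of magnitude or more, but sensing pipelines generally need the raw or lightly processed stream.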
Today, however, there is little computation done on the sensor itself because it generates heat, said Cambou.
Image sensors are a key enabler of autonomy, but their number cannot grow indefinitely; the required computing power would explode. One solution is to improve data quality, the Yole analyst said. “If you really want to solve autonomy, you will need more diversity quickly.”
New technologies are emerging to add a level of sensitivity and build machines that can see better. Cambou identifies two directions: neuromorphic sensing, in which each pixel acts as a neuron and embeds some level of intelligence, and quantum imaging, which detects each photon individually.
France-based neuromorphic startup Prophesee has rolled out what it says is the first event-based vision sensor in an industrial package: its third-generation Metavision sensor. “If you couple this Metavision sensor with a VCSEL projector or another kind of projector that can project a suitable pattern, you can realize an event-based structured light sensor,” said Simone Lavizzari, product marketing and innovation director at Prophesee. Why does that matter? Today’s state-of-the-art depth-sensing techniques force a tradeoff among exposure time, accuracy, and robustness.
Coupling an IR projector with Prophesee’s Metavision sensor yields a fast response time for each independent pixel, in turn allowing for temporal pattern identification and extraction directly inside the sensor, said Lavizzari. “If you use an event-based sensor to do structured light, the response is very fast. We can have a 50× [scanning time] improvement, so [you need] 1 millisecond to get the full 3D scanning versus the conventional 10 to 33 milliseconds with frame-based approaches.” The accuracy is state-of-the-art, but the “software complexity is reduced to the minimum, because we don’t need to do matching in post-processing.”
Matching is done not on frames in post-processing but pixel by pixel, at the sensor level. Among other benefits, “there is no motion blur, because we can capture the point cloud very fast, and we are compatible with outdoor applications,” said Lavizzari. Ultra-fast pulse detection also permits an increase in laser power while maintaining the technology’s eye-safe rating.
On the quantum imaging side, Cambou mentioned Gigajot Technology’s Quanta Image Sensors (QIS), which are single-photon image sensors with photon-counting capabilities. Gigajot, a California-based startup, claims dynamic scenes can be reconstructed from a burst of frames at a photon level of 1 photon per pixel per frame.
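The photon-counting regime Gigajot describes can be illustrated with a toy simulation. Photon arrivals at a pixel follow Poisson statistics, so at ~1 photon per pixel per frame any single frame is almost pure shot noise; averaging a burst of frames recovers the underlying brightness. This is a hedged sketch of the statistical principle only, not Gigajot’s reconstruction algorithm:

```python
# Toy model of photon-counting imaging at ~1 photon/pixel/frame.
# Averaging a burst of low-count frames recovers pixel brightness
# despite the shot noise dominating each single-photon measurement.
import math
import random

def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson-distributed photon count (Knuth's method, small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def burst_estimate(mean_photons: float, n_frames: int, rng: random.Random) -> float:
    """Average photon count per frame over a burst, for one pixel."""
    return sum(poisson(mean_photons, rng) for _ in range(n_frames)) / n_frames

rng = random.Random(0)
# True brightness: 0.8 photons/frame. A 1,000-frame burst estimates it
# to within a few percent, letting dynamic scenes be rebuilt burst by burst.
print(round(burst_estimate(0.8, 1000, rng), 2))
```

The estimator’s noise shrinks as the square root of the burst length, which is what makes scene reconstruction feasible at such extreme photon starvation.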
This article was originally published on EE Times Europe.
Anne-Françoise Pelé is editor-in-chief of eetimes.eu and EE Times Europe.