Goes on Mobileye hunt with ability to process two video streams at once
PARIS — The quest for a “Mobileye-killer” computer vision SoC has attracted no shortage of vision chip vendors. Ambarella is prominent among them, fighting to place itself toe-to-toe with the firm that pioneered and still dominates the vision-based ADAS and highly automated vehicle market.
Ambarella is rolling out Wednesday (March 28) a new camera SoC called CV2, designed to offer both deep neural network and stereovision processing, targeting ADAS and autonomous vehicles. Ambarella boasts two elements that distinguish it from the competition. The first is a new computer vision architecture developed by VisLab, a European developer of computer vision and intelligent automotive control systems that Ambarella acquired in 2015. The other is Ambarella’s own field-proven, low-power, HD and Ultra HD video processing chips used in IP security cameras, sports cameras, drones, and aftermarket dash cams.
By combining that video processing heritage with VisLab’s expertise, Ambarella has developed the automotive-qualified CV2, which integrates advanced computer vision, image processing, 4Kp60 video encoding, and stereovision in a single chip. CV2 follows Ambarella’s CV1, announced at the Consumer Electronics Show earlier this year. The company claims that CV2 delivers up to 20 times the deep neural network performance of its predecessor.
CV1 and CV2 offer both monocular and stereo processing in the same chip. “Monocular vision detects and classifies objects further in the distance — up to 180 meters away — while a stereo vision camera captures the shape of objects in 3D and detects generic obstacles without training,” Alberto Broggi, a founder of VisLab, explained to EE Times.
Broggi said that Ambarella’s vision SoC, “designed for urban driving,” does “all the perception processing.” Other tasks such as path-planning and maneuvering (steering and stopping) are currently left to an onboard PC. However, Broggi added that his team plans to port the fusion and decision-making layers to a future generation of CV chips.
The computer vision team led by Professor Broggi, spun out of the University of Parma, Italy, forms the backbone of Ambarella’s AV software stack for highly automated vehicles.
Asked about Ambarella’s vision solutions, Egil Juliussen, director of research for infotainment and ADAS at IHS Markit, said, “Ambarella’s new products look very good and have potential for ADAS, L4, and L5.”
He added, “I think it’s their technology from algorithms to patents to SoC” that separates Ambarella’s AV platform from the crowd.
Ambarella is inviting media and automotive stakeholders this week to the company’s Santa Clara headquarters for a fully autonomous test drive in a car built on what Ambarella calls its embedded vehicle autonomy (EVA) platform. The car will drive on public roads near the Santa Clara Convention Center, according to a company spokesperson.
“It is good they also have an AV testing permit in California — even if they are the 52nd company to get a license,” said Juliussen. He stressed the fact that Ambarella already “has the development tools and APIs for T1s and OEMs to interface with ADAS and L3 to L5 systems.”
Ambarella plans to start sampling CV2 in the second quarter.
Asked why the company is rolling out two vision SoCs so close together, Broggi said that Ambarella already had CV1 in hand last May. “But we spent the last six months in the development of software and tools for CV1, and we made sure that the chip is fully working when installed on a vehicle,” he said.
Because the groundwork was already laid out and the software for CV2 is compatible with that of CV1, “it is quite easy for our customers to swap the two,” added Broggi. CV2 is automotive-qualified, but CV1 wasn’t.
Mono vs. stereo
Monocular versus stereoscopic vision is a decade-old debate among those developing ADAS and fully automated vehicles. Mobileye — now an Intel subsidiary after being acquired last year — has always maintained that monocular cameras do just fine in identifying lanes, pedestrians, and many traffic signs and other vehicles on the road.
But stereo proponents often argue that a monocular system is not as robust and reliable when calculating a 3D view of the world from the planar 2D frames that it receives from a single camera sensor.
Ambarella melds both mono and stereo worlds on its chip. Its EVA platform, for example, offers 4K CV1-based stereo-vision cameras, with a perception range beyond 150 meters for stereo obstacle detection. Simultaneously, it offers monocular classification of objects over 180 meters, according to the company.
Broggi believes that stereo is a major upgrade. “So much information is contained inside all the 3D shapes of objects,” he said. “Even when the cameras see an object with an unknown shape — something the system has never seen before and it is not trained for identifying — stereo will get that.”
Having both monocular and stereoscopic capabilities also introduces vital redundancies. Stereo vision can presumably spot objects unseen or unrecognizable by a monocular camera.
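The “without training” claim rests on geometry rather than learning: any pixel whose disparity is significantly larger than what a flat ground plane would produce at that image row must stick up from the road, whether or not a classifier has ever seen that object. A minimal sketch of this ground-plane test (a common stereo technique; the parameters and threshold here are illustrative, not Ambarella’s actual pipeline):

```python
import numpy as np

def detect_obstacles(disparity, ground_disparity_per_row, margin=2.0):
    """Flag pixels whose measured disparity exceeds the disparity the
    flat ground plane would produce at that image row. No classifier,
    no training data: pure stereo geometry."""
    # Broadcast the per-row ground-plane disparity across image columns.
    expected = ground_disparity_per_row[:, np.newaxis]
    return disparity > expected + margin  # True where something sticks up

# Toy 4x6 disparity map: ground disparity grows toward the bottom rows,
# and an "unknown object" adds extra disparity in the middle columns.
ground = np.array([1.0, 2.0, 3.0, 4.0])
disp = np.tile(ground[:, None], (1, 6))
disp[1:3, 2:4] += 5.0  # object closer to the camera than the road surface

mask = detect_obstacles(disp, ground)
print(mask.sum())  # 4 pixels flagged as a generic obstacle
```

The object is detected purely because it violates the road geometry, which is why stereo can flag shapes the system “has never seen before.”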
Asked to weigh in on mono versus stereo, Juliussen said, “I think Mobileye is the main company with this [monocular camera] strategy. That worked well when cameras were costlier. But the price of stereo cameras is coming down. My perspective is that both mono and stereo are good and give Ambarella a larger market and application segments.”
High dynamic range matters
Broggi often likes to tell us, “There couldn’t have been a better union than VisLab and Ambarella.” There is no overlap between what each team does. More importantly, CV1 and CV2 both exploit Ambarella’s image signal processing pipelines for high-dynamic-range (HDR) imaging, Ultra HD processing, and automatic calibration of stereo cameras.
Beyond all the power of CNNs and DNNs, vision sensors deployed in ADAS and AVs benefit from the very clean, high-definition images and high-performing image pre-processing functions offered by vision SoCs.
Speaking of stereo cameras that need to be very stable, Broggi said, “Calibration in stereo cameras can be a challenge, especially in automotive applications,” because cars vibrate and operate in a wide temperature range. With Ambarella’s CV1 and CV2 chips, “we do real-time auto calibration on the fly on the chip,” he said.
Also relevant is the confusion that ADAS and AVs can experience when facing LED-based headlights and traffic signs. Because an LED blinks, switching on and off rapidly, its flickering can trigger machine-vision glitches when the frequencies of the LED lights and the cameras are not in sync, explained Broggi. Ambarella’s CV1 and CV2 can mitigate such artifacts by averaging across a number of frames. “We can clean it up by using pre-processing at an early stage in our image pipeline,” he explained.
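Broggi did not detail the algorithm, but the averaging he describes can be pictured as a simple temporal filter: if the camera’s exposure catches an LED lit in some frames and dark in others, averaging a few consecutive frames recovers a steady signal. An illustrative sketch, not Ambarella’s implementation:

```python
import numpy as np

def temporal_average(frames):
    """Average a stack of frames pixel-wise to suppress LED flicker.
    frames: array of shape (n_frames, height, width)."""
    return frames.mean(axis=0)

# Toy example: a single "LED pixel" that is dark in half the frames
# because the camera exposure misses the LED's on-phase.
frames = np.zeros((4, 2, 2))
frames[0::2, 0, 0] = 255.0  # LED captured lit in frames 0 and 2 only

averaged = temporal_average(frames)
print(averaged[0, 0])  # 127.5: the flickering LED reads as steadily half-bright
```

The LED that flashed between 0 and 255 now reads as a constant value, so a downstream detector sees a continuously lit light instead of one that appears and disappears.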
While Broggi withheld comment on the recent Uber accident that killed a woman crossing a street at night, Ambarella claimed that its chips, offering HDR features, “can process images in very low-light conditions.”
For test drives, Ambarella developed an autonomous vehicle platform called EVA. Built on a Lincoln MKZ, EVA’s sensing suite is heavily based on vision sensors, enabling autonomous driving without using lidars. The EVA-based AV, however, also carries front-bumper-mounted radars made by Bosch to reinforce forward perception.
Broggi maintained that vision is the only sensor that gives automated vehicles “full confidence.”
Broggi said that while a lidar generates 2 million 3D points per second, the long-range stereoscopic camera captures 800 to 900 million 3D points per second.
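The stereo figure follows from dense, per-pixel depth: every pixel of every frame contributes a 3D point. A back-of-envelope check (the frame rate here is an assumption for illustration, not a figure quoted by Ambarella):

```python
# Dense stereo assigns a 3D point to every pixel of every frame.
pixels_4k = 3840 * 2160            # ~8.3 megapixels per frame
fps = 60                           # assumed frame rate, not quoted
points_per_sec = pixels_4k * fps   # one stereo pair, dense disparity

print(points_per_sec)              # 497,664,000 -> roughly half a billion
print(points_per_sec / 2_000_000)  # ~249x the lidar's 2 million points/s
```

One 4K pair at 60 fps already yields roughly half a billion points per second; the quoted 800 to 900 million would follow from higher frame rates or from combining stereo pairs. Either way, the density advantage over a 2-million-point lidar is two orders of magnitude.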
The vehicle comes with two different sets of stereo cameras.
One set does long-range perception, the other short-range. The long-range cameras use two 4K sensors (8 megapixels) with a 30-cm baseline and a 75-degree horizontal view. The short-range system uses four stereo cameras using a 2-megapixel sensor with a 10-cm baseline and fish-eye lenses.
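The 30-cm baseline is what pushes the stereo range past 150 meters: depth follows Z = f·B/d, so a wider baseline B keeps the disparity d measurable at long range. A sketch with the article’s numbers (the focal length is derived from the stated 75-degree field of view; that derivation is an assumption, since Ambarella did not give the focal length directly):

```python
import math

def disparity_at_range(width_px, hfov_deg, baseline_m, range_m):
    """Disparity (pixels) a stereo pair sees at a given range.
    Focal length in pixels from the horizontal field of view:
    f = (width/2) / tan(hfov/2); disparity d = f * B / Z."""
    f_px = (width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    return f_px * baseline_m / range_m

# Long-range pair: 4K-wide sensor, 75-degree HFOV, 30-cm baseline.
d = disparity_at_range(3840, 75.0, 0.30, 150.0)
print(round(d, 1))  # ~5.0 px at 150 m: small but still measurable
```

With the short-range system’s 10-cm baseline, the same formula gives a third of the disparity at that distance, which is why the wide baseline is reserved for long-range perception and the narrow one for the fish-eye near field.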
Ambarella explained that each long-range camera contains two CV1 SoCs. In the short-range system, all four cameras are connected to a central module in which two CV1 SoCs process all four streams.
CV2 is designed to support four stereo cameras and four mono cameras in one chip.
While CV1 is manufactured in a 14-nm CMOS process, Ambarella’s CV2 will be fabricated in a 10-nm process at Samsung. Despite the added complexity of CV2, Broggi said, “We are able to maintain the same low power — 4 to 5 watts.”
Ambarella, a company in transition
Asked about customers for CV1 or CV2, Chris Day, vice president of marketing and business development at Ambarella, only revealed that the company is in discussions with OEMs and Tier 1s.
As reported before, Ambarella, which once generated as much as 30% of its revenue from GoPro, a leading vendor of mobile action cameras for sports enthusiasts, is in transition. On the eve of CES, GoPro announced plans to exit the drone business and cut 250 jobs. Ambarella’s plan is to make up for lost revenue with the surveillance (professional and consumer) and auto OEM markets.
Juliussen observed that developing computer vision products for automotive is a major effort by Ambarella, to which the company allocated more than 50% of its R&D budget last year. “And it looks like they have made good progress. Their current products have been used for ADAS segments, but primarily by Chinese OEMs,” said Juliussen.
“Their current products are also used for aftermarket cameras to store driving data for event data recorders (EDR-black box) applications. EDR will be required for L4 and L5 AVs,” he added.
This explains why CV2, among its many features, supports 4Kp60 AVC and HEVC video encoding, letting the chip add video recording to automotive ADAS and self-driving systems.
Referring to Ambarella’s 10K reports and fourth-quarter earnings, Juliussen said, “I noticed that their automotive business (which is small now) is mostly with Chinese and Asian companies.” He added, “That can certainly grow a lot. The key is how successful they will be with the European, U.S., and Japanese OEMs and Tier 1s.”
— Junko Yoshida, Chief International Correspondent, EE Times