Putting AI into the Edge Is a No-Brainer

Article By : Duncan Stewart, Jeff Loucks

Deloitte predicts that in 2020, more than 750 million edge AI chips will be sold, representing US$2.6 billion in revenue.

Deloitte predicts that in 2020, more than 750 million edge AI chips — full chips or parts of chips that perform or accelerate machine learning tasks on-device, rather than in a remote data center — will be sold, representing US$2.6 billion in revenue. Furthermore, the edge AI chip market will grow much more quickly than the overall chip market. By 2024, we expect unit sales of edge AI chips to exceed 1.5 billion, possibly by a great deal. That represents compound annual unit sales growth of at least 20%, more than double the longer-term forecast of 9% CAGR for the overall semiconductor industry.
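
As a quick sanity check on that growth arithmetic, here is a minimal sketch using only the unit figures quoted above (the roughly 19% result rises past 20% once sales exceed 1.5 billion units, as the forecast allows):

```python
# Back-of-the-envelope check of the unit-growth figures cited above.
units_2020 = 750e6    # edge AI chips forecast to sell in 2020
units_2024 = 1.5e9    # 2024 forecast ("possibly by a great deal" more)
years = 4

cagr = (units_2024 / units_2020) ** (1 / years) - 1
print(f"Implied unit-sales CAGR: {cagr:.1%}")  # ~18.9%; exceeding 1.5 billion units pushes it past 20%
```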

Figure 1: Locations in which intelligence can be embedded (Image: Deloitte Insights)

These edge AI chips will likely find their way into an increasing number of consumer devices, such as high-end smartphones, tablets, smart speakers, and wearables. They will also be used in multiple enterprise markets: robots, cameras, sensors, and other devices for the internet of things. The consumer market for edge AI chips is much larger than the enterprise market, but it is likely to grow more slowly, with a CAGR of 18% expected between 2020 and 2024. The enterprise edge AI chip market is growing much faster, with a predicted CAGR of 50% over the same time frame.

Figure 2: The edge AI chip market (Image: Deloitte Insights)

Nevertheless, this year, the consumer device market will likely represent more than 90% of the edge AI chip market, both in terms of the numbers sold and their dollar value. The vast majority of these edge AI chips will go into high-end smartphones, which account for more than 70% of all consumer edge AI chips currently in use. Indeed, not just in 2020 but for the next few years, edge AI chip growth will be driven principally by smartphones. We believe that more than a third of the 1.56 billion smartphones expected to sell this year may contain edge AI chips.

Because AI computations are extremely processor-intensive, they have until now almost all been performed remotely in data centers, on enterprise core appliances, or on telecom edge processors — not locally on devices. Edge AI chips are changing all that. They are physically smaller, relatively inexpensive, use much less power, and generate much less heat, making it possible to integrate them into handheld devices as well as non-consumer devices such as robots. By enabling these devices to perform processor-intensive AI computations locally, edge AI chips reduce or eliminate the need to send large amounts of data to a remote location, thereby delivering benefits in usability, speed, and data security and privacy.

Keeping the processing on the device is better in terms of privacy and security; personal information that never leaves a phone cannot be intercepted or misused. And when the edge AI chip is on the phone, it can do all these things even when not connected to a network.

Of course, not all AI computations have to take place locally. For some applications — for instance, when there is simply too much data for a device’s edge AI chip to handle — sending data to be processed by a remote AI array may be adequate or even preferred. In fact, most of the time, AI will be done in a hybrid fashion: some portion on the device and some in the cloud. The preferred mix in any given situation will vary depending on exactly what kind of AI processing needs to be done.

The economics of edge AI in smartphones

Smartphones aren’t the only devices that use edge AI chips; other device categories — tablets, wearables, smart speakers — contain them as well. In the short term, these non-smartphone devices will likely have much less of an impact on edge AI chip sales than smartphones, either because the market is not growing (as for tablets) or because it is too small to make a material difference (for instance, smart speakers and wearables combined are expected to sell a mere 125 million units in 2020). Many wearables and smart speakers depend on edge AI chips, however, so penetration is already high.

Currently, only the most expensive smartphones — those in the top third of the price distribution — are likely to use edge AI chips. But putting an AI chip in a smartphone doesn’t have to be price-prohibitive for the consumer.

It’s possible to arrive at a fairly sound estimate of a smartphone’s edge AI chip content. To date, published images of phone processors from Samsung, Apple, and Huawei show the naked silicon die with all its features visible, allowing identification of which portions of the chips are used for which functions. A die shot of the chip for Samsung’s Exynos 9820 shows that about 5% of the total chip area is dedicated to AI processors. Samsung’s cost for the entire SoC application processor is estimated at US$70.50, making it the phone’s second-most expensive component (after the display) and representing about 17% of the device’s total bill of materials. Assuming that the AI portion costs the same as the rest of the components on a die-area basis, the Exynos’s edge AI neural processing unit (NPU) represents roughly 5% of the chip’s total cost, or about US$3.50 per chip.

Figure 3: A die shot of the chip for Samsung’s Exynos 9820 shows that about 5% of the total chip area is dedicated to AI processors. (Image: ChipRebel; Annotation: AnandTech)

Similarly, Apple’s A12 Bionic chip dedicates about 7% of the die area to machine learning. At an estimated US$72 for the whole processor, that percentage suggests a cost of US$5.10 for the edge AI portion. The Huawei Kirin 970 chip, estimated to cost the manufacturer US$52.50, dedicates 2.1% of the die to the NPU, suggesting a cost of US$1.10. (Die area is not the only way to measure what percentage of a chip’s total cost goes toward AI, however. According to Huawei, the Kirin 970’s NPU has 150 million transistors, representing 2.7% of the chip’s total of 5.5 billion transistors. That would suggest a slightly higher NPU cost of US$1.42.)

Figure 4: Apple’s A12 Bionic chip dedicates about 7% of the die area to machine learning. (Image: TechInsights/AnandTech)
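
To make the arithmetic explicit, the sketch below reproduces the die-area and transistor-count estimates; the assumption that cost scales in proportion to die share is the simplification used above, not a measured figure.

```python
# Rough per-chip NPU cost estimates, reproducing the proportional method described above.

def npu_cost(soc_cost_usd, npu_fraction):
    """Attribute SoC cost to the NPU in proportion to its share of the die."""
    return soc_cost_usd * npu_fraction

# Die-area method
print(npu_cost(70.50, 0.05))   # Samsung Exynos 9820 -> ~US$3.50
print(npu_cost(72.00, 0.07))   # Apple A12 Bionic    -> ~US$5.00-5.10
print(npu_cost(52.50, 0.021))  # Huawei Kirin 970    -> ~US$1.10

# Transistor-count alternative for the Kirin 970 (150 million of 5.5 billion transistors, about 2.7%)
print(npu_cost(52.50, 150e6 / 5.5e9))  # -> ~US$1.42-1.43
```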

Although the cited cost range is wide, it’s reasonable to assume that NPUs cost an average of US$3.50 per chip. Multiplied by half a billion smartphones (not to mention tablets, speakers, and wearables), that makes for a large market, despite the low price per chip. At an average cost of US$3.50 to the manufacturer, and a probable minimum of US$1, adding a dedicated edge AI NPU to smartphone processing chips starts looking like a no-brainer. Assuming normal markup, adding US$1 to the manufacturing cost translates into only US$2 more for the end customer. That means that NPUs and their attendant benefits — a better camera, offline voice assistance, and so on — can be put into even a US$250 smartphone for less than a 1% price increase.
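
The retail impact follows from a similarly simple calculation; the 2x markup from component cost to retail price is the illustrative assumption used above, not an industry constant.

```python
# Illustrative retail impact of adding a minimal NPU to a smartphone SoC.
npu_bom_cost = 1.00     # probable minimum cost to the manufacturer, US$
markup = 2.0            # assumed "normal" markup from component cost to retail price
phone_price = 250.00    # target retail price, US$

added_retail_cost = npu_bom_cost * markup
print(f"Added retail cost: US${added_retail_cost:.2f}")          # US$2.00
print(f"Price increase: {added_retail_cost / phone_price:.1%}")  # 0.8%, i.e. under 1%
```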

Sourcing AI chips: In-house or third party?

Companies that manufacture smartphones and other devices vary in their approaches to obtaining edge AI chips, with the decision driven by factors such as phone model and, in some cases, geography. Some buy application processor/modem chips from third-party providers, such as Qualcomm and MediaTek, which together captured roughly 60% of the smartphone SoC market in 2018.

Both Qualcomm and MediaTek offer a range of SoCs at various prices; while not all of them include an edge AI chip, the higher-end offerings (including Qualcomm’s Snapdragon 845 and 855 and MediaTek’s Helio P60) usually do. At the other end of the scale, Apple does not use external AP chips at all: It designs and uses its own SoC processors, such as the A11, A12, and A13 Bionic chips, all of which have edge AI.

Other device makers, such as Samsung and Huawei, use a hybrid strategy, buying some SoCs from merchant market silicon suppliers and using their own chips (such as Samsung’s Exynos 9820 and Huawei’s Kirin 970/980) for the rest.

Over 50 AI accelerator companies vying for edge AI in enterprise and industrial markets

If edge AI processors used in smartphones and other devices are so great, why not use them for enterprise applications, too? This has, in fact, already happened in some cases. Equipped with a smartphone SoC application processor, an autonomous drone can perform navigation and obstacle avoidance in real time, completely on-device, with no network connection at all.

However, a chip optimized for a smartphone or tablet is not the right choice for many enterprise or industrial applications. As discussed earlier, the edge AI portion of a smartphone SoC accounts for only about 5% of the total area and about US$3.50 of the total cost and would use about 95% less power than the whole SoC does. What if someone built a chip that had only the edge AI portion (along with a few other required functions, such as memory) and that cost less, used less electricity, and was smaller?

Well, they have. In all, as many as 50 different companies are said to be working on AI accelerators of various kinds. The standalone edge AI chips available in 2019 were targeted at developers, who would buy them one at a time for about US$80 each. In volumes of thousands or millions, these chips will likely cost device manufacturers much less to buy: some as little as US$1 (or possibly even less), some in the tens of dollars. We are, for now, assuming an average cost of about US$3.50, using the smartphone edge AI chip as a proxy.

Besides being relatively inexpensive, standalone edge AI processors have the advantage of being small. They are also relatively low-power, drawing between 1 and 10 W. For comparison, a data-center cluster (albeit a very powerful one) of 16 GPUs and two CPUs costs US$400,000, weighs 350 pounds, and consumes 10,000 W.

With chips such as these in the works, edge AI can open many new possibilities for enterprises, particularly with regard to IoT applications. Using edge AI chips, companies can greatly increase their ability to analyze — not just collect — data from connected devices and convert the analysis into action while avoiding the cost, complexity, and security challenges of sending huge amounts of data into the cloud. Issues that AI chips can help address include the following:

Data security and privacy
Collecting, storing, and moving data to the cloud inevitably exposes an organization to cybersecurity and privacy threats, even when companies are vigilant about data protection. And the risk is only becoming more pressing: regulations covering personally identifiable information are emerging across jurisdictions, and consumers are increasingly aware of the data that enterprises collect, with 80% saying that they don't feel companies are doing all they can to protect consumer privacy. Some devices, such as smart speakers, are starting to be used in settings such as hospitals, where patient privacy is regulated even more stringently.

By allowing large amounts of data to be processed locally, edge AI chips can reduce the likelihood that personal or enterprise data will be intercepted or misused. Security cameras with machine-learning processing, for instance, can reduce privacy risks by analyzing the video to determine which segments are relevant and sending only those to the cloud. Machine-learning chips can also recognize a broader range of voice commands, so that less audio needs to be analyzed in the cloud. More accurate speech recognition delivers an additional bonus: it helps smart speakers detect the “wake word” more accurately, preventing them from listening to unrelated conversations.
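
As an illustration of that pattern (analyze locally, upload only what matters), here is a minimal sketch; the capture, inference, and upload hooks are hypothetical placeholders rather than any particular vendor's API.

```python
# Minimal sketch of on-device filtering for a smart camera: run the edge model on each
# frame locally and upload only the segments it flags as relevant. The three hooks below
# are placeholders to be wired up to real hardware and services.
from typing import Iterable

def capture_frames() -> Iterable[bytes]:
    """Yield encoded video frames from the camera (placeholder)."""
    raise NotImplementedError

def is_relevant(frame: bytes) -> bool:
    """Run the on-device (edge NPU) model; return True only for frames worth keeping (placeholder)."""
    raise NotImplementedError

def upload(frame: bytes) -> None:
    """Send a flagged frame to the cloud for deeper analysis or storage (placeholder)."""
    raise NotImplementedError

def run_edge_filter() -> None:
    uploaded = discarded = 0
    for frame in capture_frames():
        if is_relevant(frame):   # inference happens locally, so raw video never leaves the device
            upload(frame)        # only relevant segments consume bandwidth and cloud storage
            uploaded += 1
        else:
            discarded += 1       # everything else is "forgotten" on-device
    print(f"uploaded {uploaded} frames, discarded {discarded} locally")
```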

Low connectivity
A device must be connected for data to be processed in the cloud, but in some cases connecting the device is impractical. Drones are an example: maintaining connectivity with a drone can be difficult depending on where it operates, and both the connection itself and uploading data to the cloud reduce battery life. In New South Wales, Australia, drones with embedded machine learning patrol beaches to keep swimmers safe. They can identify swimmers caught in riptides or warn of sharks and crocodiles before an attack, all without an internet connection.

(Too) big data
IoT devices can generate huge amounts of data. For example, an Airbus A350 jet has more than 6,000 sensors and generates 2.5 terabytes of data each day it flies. Globally, security cameras create about 2,500 petabytes of data per day. Sending all this data to the cloud for storage and analysis is costly and complex. Putting machine-learning processors on the endpoints, whether sensors or cameras, can solve this problem. Cameras, for example, could be equipped with vision-processing units (VPUs) — low-power SoC processors specialized for analyzing or pre-processing digital images. With edge AI chips embedded, a device can analyze data in real time, transmit only what is relevant for further analysis in the cloud, and “forget” the rest, reducing the cost of storage and bandwidth.

Power constraints
Low-power machine-learning chips can allow even devices with small batteries to perform AI computations without undue power drain. For instance, Arm chips are being embedded in respiratory inhalers to analyze data, such as inhalation lung capacity and the flow of medicine into the lungs. The AI analysis is performed on the inhaler, and the results are then sent to a smartphone app, helping health-care professionals to develop personalized care for asthma patients. In addition to the low-power edge AI NPUs currently available, companies are working to develop “tiny machine learning”: deep learning on devices as small as microcontroller units. Google, for instance, is developing a version of TensorFlow Lite that can enable microcontrollers to analyze data, condensing what needs to be sent off-chip into a few bytes.
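
As a rough illustration of that workflow, the sketch below uses the standard TensorFlow Lite converter to shrink a small stand-in Keras model for microcontroller-class deployment; the model itself is hypothetical and not tied to any specific device.

```python
# Minimal sketch: convert a tiny stand-in model to TensorFlow Lite with default
# quantization so it is small enough for microcontroller-class hardware.
import tensorflow as tf

# Stand-in model, e.g. classifying a short window of sensor readings.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantization shrinks the flatbuffer
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)

print(f"Converted model size: {len(tflite_model)} bytes")
# On the device, this flatbuffer is typically embedded as a C array and executed with the
# TensorFlow Lite for Microcontrollers interpreter, so only results leave the chip.
```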

Low-latency requirements
Whether over a wired or wireless network, performing AI computations at a remote data center means a round-trip latency of at least 1 to 2 ms in the best case and tens or even hundreds of milliseconds in the worst case. Performing AI on-device using an edge AI chip would reduce that to nanoseconds — critical for applications in which the device must collect, process, and act upon data virtually instantaneously. Autonomous vehicles, for instance, must collect and process huge amounts of data from computer-vision systems to identify objects, as well as from the sensors that control the vehicle’s functions. They must then convert this data into decisions immediately — when to turn, brake, or accelerate — in order to operate safely. To do this, autonomous vehicles must process much of the data they collect in the vehicle itself. Low latency is also important for robots, and it will become more so as robots emerge from factory settings to work alongside people.
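
To put those latencies in perspective, a back-of-the-envelope calculation (the highway speed is an illustrative assumption, not a figure from the text) shows how far a vehicle travels while waiting on a network round trip:

```python
# Illustrative only: distance travelled while waiting for remote AI results.
speed_kmh = 100                       # assumed highway speed
metres_per_ms = speed_kmh * 1000 / 3_600_000

for latency_ms in (1, 10, 100):       # best-case to worst-case round trips cited above
    print(f"{latency_ms:>3} ms round trip -> {latency_ms * metres_per_ms:.2f} m travelled")
# 1 ms -> ~0.03 m, 10 ms -> ~0.28 m, 100 ms -> ~2.78 m; on-device inference avoids the round trip entirely.
```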

The bottom line: edge AI will be vital for data-heavy apps

The spread of edge AI chips will likely drive significant changes for consumers and enterprises alike. For consumers, edge AI chips can make possible a plethora of features — from unlocking their phone to conversing with its voice assistant to taking mind-blowing photos under extremely difficult conditions — all without the need for an internet connection.

But in the long term, edge AI chips’ greater impact may come from their use in the enterprise, where they can enable companies to take their IoT applications to a whole new level. Smart machines powered by AI chips could help expand existing markets, threaten incumbents, and shift how profits are divided in industries such as manufacturing, construction, logistics, agriculture, and energy. The ability to collect, interpret, and immediately act on vast amounts of data is critical for many of the data-heavy applications that futurists predict will become widespread: video monitoring, virtual reality, autonomous drones and vehicles, and more.

That future, in large part, depends on what edge AI chips make possible: bringing the intelligence to the device.

— Duncan Stewart and Jeff Loucks are with Deloitte’s Center for Technology, Media and Telecommunications. This article is based on an article originally published by Deloitte for its TMT Predictions 2020 report.
