Organizations around the world are increasingly adopting advanced technologies, which drive the Internet of Things (IoT) market. According to a Fortune Business Insights report, the global IoT market was valued at $190 billion in 2018 and is projected to reach $1,111 billion by 2026. The IoT facilitates the exchange of information between machines and devices and can include components such as sensors and meters, network connectivity devices, and software. Vision-based systems in production environments have a long history and are a “must-have” in production lines that require automatic inspection and sorting. However, vision-enabled designs have only recently been adopted outside the production environment and are gradually entering areas such as smart cities, smart homes, elder care, and healthcare.
Challenges in IoT-enabled vision systems
Outside the production environment, the adoption of smart vision systems faces challenges: the systems are bulky, consume considerable power, are costly, and pose a significant privacy risk because sensitive image data are stored on the device or transmitted to a server. A vision chip tailored for IoT applications can overcome many of these obstacles and help vision-based applications enter the IoT market more smoothly. A vision chip integrates all parts of a vision system into a single chip, reducing the overall size of the system considerably. Integrating both camera and processor into a single chip brings three benefits: less complexity when integrating the system into a specific application, lower power consumption since no data need to be transmitted from the camera to an external processor, and a lower production cost for the total vision system.
Force Technology’s microelectronics division (now part of Presto Engineering) has conducted research on solutions for easy implementation of smart vision systems into IoT applications. The research, which was supported by the Danish Ministry of Higher Education and Science, has resulted in the development of a new vision chip called Heimdal 2. The vision chip extracts high-level information from an observed activity at a production cost of less than $2 and with low power consumption. This emerging vision chip is a technology enabler for IoT applications utilizing massively distributed sensors. Characterized according to EMVA 1288, the Standard for Measurement and Presentation of Specifications for Machine Vision Sensors and Cameras, Heimdal 2 incorporates on-chip image processing with configurable algorithms to interpret images and is suitable for multiple applications, such as the detection/tracking of objects and the intelligent interpretation of movement.
Low power consumption in a miniature size
Figure 1: System diagram of Heimdal 2.
Compared to typical image sensors, which offer many megapixels and capture detailed images, Heimdal 2 uses a low resolution of 64×64 pixels. The goal is to achieve a high-level understanding of visual content by quickly interpreting low-resolution images while consuming minimal energy. The small number of pixels reduces both processing time and power consumption as the algorithm processes captured images. Both the memory and the image sensor also need less physical space, shrinking the overall footprint. A further benefit is privacy protection: face recognition, for instance, is not possible at such a low resolution.
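To make the low-resolution approach concrete, the sketch below shows frame differencing, one of the simplest motion-detection techniques, on a 64×64 grayscale frame. This is an illustrative host-side example, not the chip’s actual firmware; the threshold and pixel-count values are assumptions.

```python
# Illustrative frame-differencing motion detector for 64x64 grayscale frames.
# The chip's on-board firmware is not public; thresholds here are assumptions.

WIDTH = HEIGHT = 64   # Heimdal 2 resolution
THRESHOLD = 20        # assumed per-pixel intensity-change threshold

def motion_pixels(prev, curr, threshold=THRESHOLD):
    """Count pixels whose intensity changed by more than `threshold`.

    `prev` and `curr` are flat lists of 64*64 values in 0..255.
    """
    assert len(prev) == len(curr) == WIDTH * HEIGHT
    return sum(1 for a, b in zip(prev, curr) if abs(a - b) > threshold)

def motion_detected(prev, curr, min_pixels=16):
    # A small pixel count suffices: at 64x64 even coarse motion
    # changes many pixels at once.
    return motion_pixels(prev, curr) >= min_pixels
```

At this resolution a full comparison touches only 4,096 pixel pairs per frame, which is why such an algorithm can run quickly on a small embedded core.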
The vision chip includes a 16-bit OpenMSP430 processor running at 16 MHz with 12 kB of SRAM and 32 kB of flash memory. The embedded processor makes the chip flexible when implementing different algorithms, which often need to be tailored to the application at hand. The chip can communicate with an external host or interface with multiple external sensors using the on-board SPI and eight GPIOs. Combined with energy-harvesting technology, it is possible to design a fully autonomous chip that can view, analyze, and act on the outside world while harvesting its energy from its surroundings, without the need for a battery.
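A quick back-of-the-envelope check, using the figures above plus our own assumption of one byte per pixel, shows why these modest memory sizes are sufficient for full-frame processing:

```python
# Rough memory-budget check for Heimdal 2.
# Assumption (ours): 8-bit grayscale, i.e. one byte per pixel.
FRAME_BYTES = 64 * 64      # 4096 bytes per frame
SRAM_BYTES = 12 * 1024     # 12 kB of on-chip SRAM (from the text)

frames_in_sram = SRAM_BYTES // FRAME_BYTES
print(frames_in_sram)  # 3 -> e.g. previous frame, current frame, scratch buffer
```

Holding a previous and a current frame simultaneously, as frame-differencing algorithms require, still leaves room for working data.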
Vision chip in standard CMOS process
Heimdal 2 is manufactured in a standard CMOS process, offering both cost and functionality advantages. The cost of producing a given chip depends not only on the silicon area but is also strongly linked to the complexity of the semiconductor process used. One benefit of using a standard CMOS process is its low cost; another is the availability of IP blocks that can be merged into a single chip. A vision chip is typically produced in a specialized CMOS image sensor (CIS) process, which is not compatible with many of the IP blocks needed for the design of a vision chip. In particular, CIS processes lack compatibility with non-volatile memory, making a single-chip solution expensive and unfeasible for many applications.
The main challenge in designing a vision chip in a standard CMOS process is achieving good photodiode performance in each pixel cell of the image sensor array. As a standard CMOS process does not offer photodiode IP blocks with adequate performance, proprietary photodiodes were designed and refined for this purpose.
Vision algorithms
To benefit from the features in the vision chip, algorithms need to be designed and tailored to the specific application. The algorithms take an image or a sequence of images as input and analyze them to provide meaningful high-level output. In Heimdal 2, the algorithms are implemented in software and run on the embedded processor, making it flexible in terms of implementing new algorithms and updating existing ones. The algorithms combined with the image sensor form an autonomous system suitable for indoor/outdoor navigation, robotic technology, and intelligent monitoring. Two vision algorithms, illustrated by the use cases below, serve as demonstrations.
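As a concrete illustration of the kind of algorithm such a chip can run, the sketch below detects objects in a 64×64 frame by thresholding and labeling connected blobs with a flood fill. The algorithm choice and threshold are our own illustrative assumptions, not the chip’s actual firmware.

```python
# Illustrative object detector for a 64x64 grayscale frame:
# threshold to foreground/background, then group foreground pixels
# into connected blobs (4-connectivity) with an iterative flood fill.

WIDTH = HEIGHT = 64

def find_blobs(frame, threshold=128):
    """Return a list of blobs, each a set of (x, y) foreground pixels."""
    seen = set()
    blobs = []
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if frame[y * WIDTH + x] <= threshold or (x, y) in seen:
                continue
            # Flood-fill one connected region starting at (x, y).
            blob, stack = set(), [(x, y)]
            while stack:
                px, py = stack.pop()
                if (px, py) in seen:
                    continue
                if not (0 <= px < WIDTH and 0 <= py < HEIGHT):
                    continue
                if frame[py * WIDTH + px] <= threshold:
                    continue
                seen.add((px, py))
                blob.add((px, py))
                stack.extend([(px + 1, py), (px - 1, py),
                              (px, py + 1), (px, py - 1)])
            blobs.append(blob)
    return blobs
```

Tracking then amounts to matching blobs between consecutive frames, e.g. by nearest centroid, which is cheap at this resolution.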
Use cases
An example use case is finding a vacant parking space, which can be difficult and time consuming for many drivers, especially in densely populated areas. A system based on vision chips combined with a mobile app can help: solar-cell-powered vision modules based on Heimdal 2 could be distributed throughout the city to cover all parking spaces. Each vision system would continuously monitor its target parking space. When the space changes status, either from occupied to vacant or vice versa, the vision system would wake a wireless module to transmit the new status to a centralized server. The driver could then navigate to the nearest vacant parking space using a mobile app connected to the server.
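The wake-on-status-change logic in this scenario can be sketched as a small state machine. The class name and the three-frame debounce window are illustrative assumptions; the point is that the radio is only woken when the occupancy status has stably flipped.

```python
# Sketch of the parking-space monitor described above: feed it one
# per-frame occupancy classification at a time, and it reports a new
# status (i.e. "wake the radio") only after a stable change.
# The 3-frame debounce window is an assumption for illustration.

class ParkingMonitor:
    def __init__(self, debounce=3):
        self.status = False     # False = vacant, True = occupied
        self.debounce = debounce
        self._streak = 0        # consecutive frames disagreeing with status

    def update(self, occupied_now):
        """Return the new status when it changes, else None."""
        if occupied_now == self.status:
            self._streak = 0
            return None
        self._streak += 1
        if self._streak >= self.debounce:  # change stable long enough
            self.status = occupied_now
            self._streak = 0
            return self.status
        return None
```

Keeping the radio asleep between transitions is what makes a solar-powered, battery-free module plausible: the expensive operation (wireless transmission) happens only on genuine status changes.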
Another application could be gathering information on shopping behavior in the retail industry while upholding customer privacy. Vision chips based on Heimdal 2 would be placed in the ceiling of a store to detect the motion vectors of shoppers. Motion vectors from each vision chip would then be transmitted to a server for further data processing. The server could then analyze shopper behavior patterns, e.g., which items a shopper looked at the most, how long was spent on each product type, and which path the shopper followed through the store. These data could be used to optimize the layout and organization of products in the store. While the resolution of Heimdal 2 is sufficient to extract such high-level information, it is not sufficient for facial recognition algorithms, thereby inherently guaranteeing customer privacy.
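One simple way to compute such a motion vector, assuming a foreground mask has already been extracted, is to track the displacement of the foreground centroid between frames. The functions below are an illustrative sketch, not the chip’s actual algorithm; note that only the two-number vector needs to leave the device, never the image.

```python
# Sketch of a privacy-preserving motion vector: the displacement of the
# foreground centroid between two consecutive frames. Foreground
# extraction (e.g. by frame differencing) is assumed to happen upstream.

def centroid(pixels):
    """Mean (x, y) of a non-empty set of foreground pixel coordinates."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def motion_vector(prev_pixels, curr_pixels):
    """Displacement of the foreground centroid between two frames."""
    (px, py), (cx, cy) = centroid(prev_pixels), centroid(curr_pixels)
    return (cx - px, cy - py)
```

A server receiving a stream of such vectors can reconstruct a shopper’s path and dwell times without ever seeing image data.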
Conclusion
The Heimdal 2 vision chip is suitable for IoT applications that need to see, interpret, and act on the outside world. It is especially suited to applications that are price sensitive, require low power consumption, and are space constrained. With an image resolution of 64×64 pixels, the purpose is not to capture fine detail in a scene but to extract high-level properties such as motion detection, object detection/tracking, and object differentiation. An added benefit of the low-resolution images is that the vision chip also suits applications where privacy is important.