Having considered FPGAs, ASSPs and ASICs, Andreas Schwope turns his attention to modules and possibly an ideal solution for Industry 4.0.
A third multi-protocol option combines the two approaches already discussed. A module-based automation product swaps in a small communication module whenever another communication protocol is required. Using small modules from an external provider to change the higher-level protocol of a system only partially follows the idea of unmodified hardware: the more complex main system remains untouched while only the less complex module is exchanged. This is a good approach for prototype systems and low-volume products. For legacy products that cannot be modified, however, a module can easily adapt the system to a new communication protocol. A further advantage of the module approach is the complete separation of the application side from the communication hardware and stack software.
Module-killing criterion
Because communication modules are relatively expensive, they carry a significant commercial disadvantage. Mechanically, they often need extra space and volume (the module's area times its height, plus its connector) to be integrated into a system. For a small product with a narrow or flat housing, this is frequently the module-killing criterion.
These approaches to multi-protocol each have their strengths and can be a good choice under certain conditions or in certain product phases. But they share a common disadvantage: cost, especially at high volumes.
Wishlist for an "ideal" device
Let us now think about an ideal solution that allows the simple product structure shown in figure 3, compared with the traditional approach. First, our solution should be a single-chip device with the characteristics of a small communication module, provided through a single, well-defined hardware connection with different interface options. Application processing can be optional if the device only has to provide the communication part of the system. In that case, the device must support a flexible, high-speed interface to the system CPU (host), with synchronisation capabilities for event handling and data exchange. To support a wide range of legacy systems, lower-speed interfaces such as UART and SPI should also be available.
Turning to small network nodes that need only low- to mid-level performance to run their application, our ideal device would be a SoC whose own CPU can process both the communication protocol and the application.
Performance often comes with high power dissipation. Many products in the automation arena are sensitive to power and temperature: they run in high-temperature environments and use small housings without active cooling. Our solution should therefore include power-saving features to ease the typical performance/power trade-off. Finally, our ideal single-chip device should provide communication-specific flexibility to support multi-protocol operation across a wide range of industrial Ethernet protocols. All this should come in an ASSP-type device so that the price advantage holds from low to high volumes.
One such hardware architecture is the R-IN Engine ("R-IN" stands for Renesas Industrial Network). The architecture is well suited to different industrial Ethernet protocols and, as a distinct and independent block, is already used in various Renesas product families.
Figure 4: Block diagram of the R-IN Engine.
The R-IN Engine architecture shown in figure 4 is first and foremost a sub-system with an ARM Cortex-M3 CPU and all the components an SoC requires: memory blocks for instructions, data and other functions, an interrupt controller and a multi-master bus system connecting the different internal and external device resources, as well as debug capability.
The multi-master bus allows concurrent data transfers between different areas without having to interrupt the internal or the external host CPU. The components for Ethernet communication include a Gigabit Ethernet switch with three ports, one internal and two external, as already described. The R-IN Engine also includes an Ethernet MAC with an associated DMA controller and dedicated buffer memory used exclusively for Ethernet data transfers. To implement a protocol of the second group, the communication data path can be switched by multiplexers from the standard Ethernet path (IEEE 802.3 switch and MAC) to the industrial Ethernet protocol controller of that group.
Figure 5: Acceleration of frame processing. The special hardware works efficiently and greatly relieves the CPU load.
Beyond this purely functional support for industrial networks, the R-IN Engine includes several accelerators that replace functions typically implemented solely in software. Where real-time requirements apply, these accelerators decisively speed up the processing of network functions at central points in the communication path.
The HW-RTOS accelerator primarily supports software execution with automated, prioritised task scheduling. This special hardware also supports task synchronisation via event flags, semaphores and mailboxes, as well as general task and time management. Finally, certain RTOS functions can be executed directly by interrupt signals without any involvement of the R-IN CPU.
Being closely coupled to the CPU, the HW-RTOS can significantly accelerate both the processing of the stack software and the actual application. Everything inside the HW-RTOS is computed in hardware, fast and deterministically, without the delays and jitter typical of software solutions. From the software perspective, the HW-RTOS is used through a µITRON library with a documented API and the related software components, allowing a smooth project start. The HW-RTOS is therefore completely transparent to the user and requires no detailed knowledge of its control structures.
When receiving or sending frame bytes, the CheckSum accelerator calculates the 4-byte checksum placed at the end of every Ethernet frame on-the-fly. This calculation is done entirely by the accelerator without loading the R-IN CPU. In the receive direction, data integrity can be checked in a single step by comparing the calculated FCS (frame check sequence) value with the FCS field of the received frame.
By contrast, a typical software implementation of the 32-bit CRC calculation consumes nearly 30% of the overall CPU performance required for Ethernet communication. The CheckSum accelerator therefore yields a correspondingly large saving in CPU performance under heavy Ethernet traffic.
Frame data buffers for transmission and reception are normally organised byte-wise. Read access to a particular piece of frame header information therefore requires collecting the necessary bytes from the frame buffer and rearranging them into the right sequence; in the transmit direction, the rearrangement runs the opposite way, back into the compressed frame format. This processing typically consumes about 15% of the overall CPU performance of a pure software-based TCP/IP stack. The Header EnDec accelerator automatically rearranges the data between the compressed frame format and the CPU-oriented, 32-bit-aligned format, giving the CPU fast, direct read and write access to all frame header information without any latency.
The Buffer Management accelerator automatically controls the buffer allocation and release functions in hardware for the Ethernet processing.