TAS: Ternarized Neural Architecture Search for Resource-Constrained Edge Devices

Deep Neural Networks (DNNs) have been successfully adapted to a wide range of computer vision tasks. At the same time, there is an increasing demand to deploy DNNs on resource-constrained edge devices, driven by energy efficiency, privacy, and stable connectivity concerns [1]. However, the enormous computational intensity of DNNs exceeds what resource-constrained edge devices can provide, so existing processing paradigms fail to meet the requirements of modern applications. A Ternary Neural Network (TNN), in which both weights and activations are quantized to ternary tensors, is a network quantization technique that offers both network compression and operation acceleration. However, TNNs still suffer from a substantial accuracy drop, which prevents their wide adoption in practice. Neural Architecture Search (NAS) is a method for automatically designing high-performance networks. The idea behind our proposed framework, dubbed TAS [2], is to integrate the ternarization mechanism into NAS in order to narrow the accuracy gap of TNNs. TAS is a fully automated framework that searches for the best architecture for ternary networks and trains it from scratch. It consists of three main stages: (1) Search Architecture, (2) Train from Scratch, and (3) FPGA Implementation. Thanks to its flexible design, TAS supports any quantization method (Figure 1).
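
Before detailing the stages, the following is a minimal sketch of the ternarization step in the style of Ternary Weight Networks; the 0.7 threshold factor and the per-tensor scaling rule are illustrative assumptions, and the exact quantizer used by TAS may differ.

```python
# Minimal ternary weight quantization sketch (TWN-style); the 0.7 threshold
# factor and per-tensor scaling are illustrative choices, not necessarily
# the quantizer used in TAS.
import torch

def ternarize(w: torch.Tensor) -> torch.Tensor:
    """Map a float tensor to the ternary set {-alpha, 0, +alpha}."""
    delta = 0.7 * w.abs().mean()                                # per-tensor threshold
    mask = (w.abs() > delta).float()                            # weights that stay non-zero
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)  # scaling factor
    return alpha * torch.sign(w) * mask

w = torch.randn(64, 3, 3, 3)        # e.g. a 3x3 convolution kernel tensor
w_t = ternarize(w)
print(torch.unique(w_t))            # at most three distinct values
```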

1. Search Architecture: This stage automatically searches for the best architecture for the given dataset, quantization method, and quantization bit-width (a generic search-space sketch is given after this list).

2. Train from Scratch: To obtain the final accuracy, the best architecture found in the previous stage is trained from scratch (a minimal training sketch is also given after this list).

3. FPGA Implementation: To evaluate TAS performance on real hardware, we deploy the best architecture on an FPGA using the DeepHLS toolchain [3].
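
The search-space sketch referenced in stage 1: one common way to realize weight-sharing architecture search with ternarized operators is a DARTS-style mixed operation, where each candidate convolution ternarizes its weights on the fly and trainable architecture parameters weight the candidates. The candidate set, the `TernaryConv`/`MixedOp` classes, and the straight-through estimator are illustrative assumptions rather than the exact TAS search space; `ternarize` is the helper from the earlier sketch.

```python
# Generic DARTS-style mixed operation with ternarized candidate convolutions.
# This is an illustrative sketch, not the exact TAS search space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryConv(nn.Conv2d):
    """Convolution whose weights are ternarized on the fly in the forward pass."""
    def forward(self, x):
        w = self.weight
        # Straight-through estimator: forward uses ternary weights, backward
        # passes gradients to the latent full-precision weights.
        w_q = w + (ternarize(w) - w).detach()
        return F.conv2d(x, w_q, self.bias, self.stride, self.padding,
                        self.dilation, self.groups)

class MixedOp(nn.Module):
    """One supernet edge: a softmax-weighted sum over candidate operations."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            TernaryConv(channels, channels, 3, padding=1, bias=False),
            TernaryConv(channels, channels, 5, padding=2, bias=False),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),                       # skip connection
        ])
        # One architecture parameter per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

edge = MixedOp(channels=16)
out = edge(torch.randn(2, 16, 32, 32))           # output shape matches the input
```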
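
And the training sketch referenced in stage 2: a minimal train-from-scratch loop, assuming the searched network uses ternarizing layers such as `TernaryConv` above so that gradients update latent full-precision weights while the forward pass uses ternary ones. The optimizer, schedule, and hyperparameters below are placeholders, not the settings reported in [2].

```python
# Minimal train-from-scratch loop for the searched ternary network; the
# hyperparameters below are placeholders, not the settings used in TAS.
import torch
import torch.nn as nn

def train_from_scratch(model, loader, epochs=100, lr=0.1, device="cpu"):
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr,
                          momentum=0.9, weight_decay=1e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()      # gradients flow to the latent float weights (STE)
            opt.step()
        sched.step()
    return model
```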

References:
[1] Loni, Mohammad, et al. “FastStereoNet: A fast neural architecture search for improving the inference of disparity estimation on resource-limited platforms.” IEEE Transactions on Systems, Man, and Cybernetics: Systems (2021).
[2] Loni, Mohammad, et al. “TAS: Ternarized Neural Architecture Search for Resource-Constrained Edge Devices.” 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2022.
[3] Riazati, Mohammad, et al. “DeepHLS: A complete toolchain for automatic synthesis of deep neural networks to FPGA.” 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 2020.