He Meng, Xu Dawen. Fault-tolerant neural network training framework based on client-server[J]. Microelectronics & Computer, 2021, 38(10): 73-78. DOI: 10.19304/J.ISSN1000-7180.2021.0035

Fault-tolerant neural network training framework based on client-server

  • AIoT devices have been widely adopted for deep learning in recent years because they offer low power consumption and real-time inference. However, certain manufacturing processes make these devices prone to soft errors during inference. For a neural network accelerator, which performs a large amount of computation, such errors can corrupt many results and severely reduce prediction accuracy, which is intolerable for precision-sensitive applications such as autonomous drones. Conventional fault-tolerance techniques such as triple modular redundancy incur considerable power and performance penalties. This paper proposes a client-server collaborative fault-tolerant neural network training framework. During training, an AIoT processor subject to soft errors serves as the client, and the server learns the on-site computing errors from the application data of the AIoT processor. Several representative neural network models were selected for the experiments. Compared with models trained offline, models trained with this method improve top-5 accuracy by an average of 2.8%.
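
The client-server loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the fault model, the network, or the update rule. The sketch assumes a single linear neuron, a hypothetical fault model in which each multiply may suffer one random bit flip in an assumed 8-fractional-bit fixed-point format, and a plain squared-error gradient step on the server that treats the client's faulty output as if it were exact. All names (`flip_bit`, `FaultyClient`, `Server`, `train`) are illustrative.

```python
import random

FRAC_BITS = 8  # assumed fixed-point format: 8 fractional bits


def flip_bit(value, bit):
    """Quantize value to fixed point, flip one bit, and convert back to float."""
    q = int(round(value * (1 << FRAC_BITS)))
    q ^= 1 << bit
    return q / (1 << FRAC_BITS)


class FaultyClient:
    """Stands in for the AIoT processor: a forward pass with injected soft errors."""

    def __init__(self, error_rate, seed=0):
        self.error_rate = error_rate      # probability that a multiply is corrupted
        self.rng = random.Random(seed)    # seeded so runs are reproducible

    def forward(self, weights, x):
        total = 0.0
        for w, xi in zip(weights, x):
            product = w * xi
            if self.rng.random() < self.error_rate:
                # Hypothetical fault model: one bit flip among the low 10 bits.
                product = flip_bit(product, self.rng.randrange(10))
            total += product
        return total


class Server:
    """Holds the weights and updates them from the client's on-site outputs."""

    def __init__(self, n_features, lr=0.1):
        self.weights = [0.0] * n_features
        self.lr = lr

    def update(self, x, client_output, target):
        # Squared-error gradient step computed from the (possibly faulty) output.
        error = client_output - target
        for i, xi in enumerate(x):
            self.weights[i] -= self.lr * error * xi


def train(client, server, data, epochs=200):
    """Alternate client forward passes and server weight updates."""
    for _ in range(epochs):
        for x, target in data:
            y = client.forward(server.weights, x)
            server.update(x, y, target)
    return server.weights


if __name__ == "__main__":
    # Toy data generated by the target function y = 2*x0 - x1.
    data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
    print(train(FaultyClient(error_rate=0.05), Server(n_features=2), data))
```

Because the server sees the outputs actually produced on the faulty device, the weights it learns reflect the errors the device makes in practice, which is the intuition behind training on-site rather than offline. With `error_rate=0.0` the sketch reduces to ordinary error-free training.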