Citation: LI X K, WANG M. Attribute controlled face-to-face translation method based on latent space[J]. Microelectronics & Computer, 2024, 41(4): 85-95. doi: 10.19304/J.ISSN1000-7180.2023.0393

Attribute controlled face-to-face translation method based on latent space

  • Abstract: Facial image translation applies a series of conditional operations to an input face image to produce a target face image that matches the desired result. Existing methods, however, often suffer from insufficient model generalization and from attribute coupling. To address these problems, an attribute-controlled facial image translation method based on the latent space is proposed. First, a feature pyramid encoding network extracts feature vectors, which together form the latent space. Second, exploiting the representational capacity of the latent space, classification learning is performed on the feature vectors to obtain attribute normal vectors, which are used to control facial attributes. Finally, two steps, attribute normal vector decoupling and retraining, are applied to resolve problems such as facial attribute coupling. The method achieves fine-grained control of facial attributes while improving image translation quality, and its generalization is verified on the sketch-to-real-face task. Comparative experiments and analysis against mainstream facial image translation methods such as AttGAN show that the method improves on existing methods by 2% to 50% on evaluation metrics such as Fréchet Inception Distance (FID) and by 3% to 30% in attribute generation accuracy, demonstrating that it effectively improves the performance of attribute-controlled facial image translation.
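The idea of an attribute normal vector can be illustrated with a small sketch. The abstract does not give the classifier, the decoupling rule, or the latent dimensionality, so everything below is an assumption made for illustration: a plain logistic-regression classifier on synthetic 8-dimensional latent codes, decoupling by orthogonal projection (a common choice in latent-space editing, not necessarily the one used in the paper), and hypothetical function names `fit_attribute_normal`, `decouple`, and `edit_latent`. The retraining step is not shown.

```python
import numpy as np


def fit_attribute_normal(latents, labels, lr=0.1, steps=500):
    """Fit a linear (logistic) classifier on latent codes and return the
    unit normal of its decision boundary. Assumed stand-in for the
    paper's classification learning step."""
    n, d = latents.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(latents @ w + b)))
        grad = probs - labels                 # gradient of the log-loss w.r.t. logits
        w -= lr * (latents.T @ grad) / n
        b -= lr * grad.mean()
    return w / np.linalg.norm(w)


def decouple(primary, other):
    """Remove from `primary` its component along the unit vector `other`
    (orthogonal projection; an assumed decoupling rule). Undefined if the
    two vectors are parallel."""
    residual = primary - (primary @ other) * other
    return residual / np.linalg.norm(residual)


def edit_latent(latent, normal, strength):
    """Move a latent code along an attribute normal by `strength`."""
    return latent + strength * normal


# Synthetic data: attribute A is entangled with attribute B through dimension 1.
rng = np.random.default_rng(0)
latents = rng.normal(size=(400, 8))
labels_a = (latents[:, 0] + 0.8 * latents[:, 1] > 0).astype(float)
labels_b = (latents[:, 1] > 0).astype(float)

n_a = fit_attribute_normal(latents, labels_a)
n_b = fit_attribute_normal(latents, labels_b)
n_a_decoupled = decouple(n_a, n_b)

# Editing along the decoupled direction leaves the attribute-B score unchanged.
edited = edit_latent(latents[0], n_a_decoupled, strength=3.0)
```

In this sketch, moving a code along `n_a_decoupled` changes its position relative to the attribute-A boundary while its projection onto `n_b` stays fixed, which is the sense in which the two attributes are decoupled here.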

     
