Weekly-220612

本文最后更新于:June 12, 2022 am

本周学习汇报

电子信息实践课大作业
扩散模型相关
Hyperspherical Consistency Regularization
CoMatch- Semi-supervised Learning with Contrastive Graph Regularization
DiVAE : Photorealistic Images Synthesis with Denoising Diffusion Decoder

1. 电子信息实践课大作业

2. 关于扩散模型

参考1
参考2

2.1 生成式学习的三个需求

2.2 扩散模型的思想

  • 正向扩散: 前向扩散过程通过逐渐扰动输入数据将数据映射为噪声。这是通过一个简单的随机过程正式实现的,该过程从数据样本开始,使用简单的高斯扩散核迭代生成噪声较大的样本。也就是说,在这个过程的每一步,高斯噪声都会逐渐添加到数据中。

  • 参数化反向扩散: 参数化的反向过程,取消正向扩散并执行迭代去噪。这个过程代表数据合成,并经过训练,通过将随机噪声转换为真实数据来生成数据。它也被正式定义为一个随机过程,使用可训练的深度神经网络对输入图像进行迭代去噪。

2.3 扩散模型存在的问题-生成过程慢、采样慢

2.3.1 潜空间扩散模型

  • NVIDIA 推出了 基于潜在分数的生成模型 ( LSGM ), 可以在潜在空间而不是直接在数据空间中训练扩散模型。
  • 优势:合成速度、表现力,以及定制的编码器和解码器。
  • 基本上简化了数据本身,首先将其嵌入平滑的潜在空间,在那里可以训练更有效的扩散模型。

2.3.2 临界阻尼朗之万扩散(CLD)

  • 问题: 扩散模型中的一个关键因素是固定前向扩散过程,以逐渐扰动数据。与数据本身一起,它唯一地决定了去噪模型学习的难度。
  • 优势: 使用定制 SDE 解算器加速采样,有更简单的评价函数。
  • 是一种改进的前向扩散过程,特别适合于更简单、更快的去噪和生成。

2.3.3 扩散算子去噪

  • 使用条件 GAN 对去噪分布进行建模。
  • 通过表达性多峰去噪分布,直接学习显著加速的反向去噪过程。

2.4 与VAE的对比

3. 关于对比学习中的正则化

题目: Hyperspherical Consistency Regularization


/2-Learn Method/4-Contrastive Learning/2022-Hyperspherical Consistency Regularization(CVPR).pdf

3.1. Introduction

  • Common scheme of contrasive learning: Jointly training supervised learning and unsupervised learning with a shared feature encoder. Taking advantage of both feature-dependent information from self-supervised learning and label-dependent information from supervised learning.
  • Motivation: This scheme remains suffering from bias of the classifier- the classifier which determines the ultimate predictions still suffers from the bias of semi-supervision or weak-supervision.
  • Contributions:
    • Analyze the relationship between the projection head and the classifier.
    • Proposed: hyperspherical consistency regularization (HCR)- to regularize the classifier using feature-dependent information and thus avoid bias from labels.
  • Contrastive learning: HCR builds a bridge between classical supervised learning and pretext tasks in self-supervised learning, and the regularization is plug-and-play to apply in these joint-learning methods.
  • Learning on the hypersphere: Reprojects the Euclidean feature space of the classifier into a hypersphere and explores its connection with the projection head’s hypersphere.

3.3. Methodology

3.4. Experiments

  • Semi-supervised learning
  • Fine-grained classification
  • Noisy label learning

3.5. Conclusion and Future Work

  • Propose a novel consistency regularization method for semi-supervised and weak-supervised learning, called hyperspherical consistency regularization (HCR).
  • HCR: Encourage the pairwise distance distribution of the classifier to be similar to the distribution of the projection head in the latent space.

4. 半监督学习中的对比图正则化

CoMatch- Semi-supervised Learning with Contrastive Graph Regularization


GitHub
/2-Learn Method/4-Contrastive Learning/2021-CoMatch- Semi-supervised Learning with Contrastive Graph Regularization(ICCV).pdf

4.1. Introduction

  • Semi-supervised learning: learning from few labeled data and a large amount of unlabeled data.
    • using the model’s class prediction to produce a pseudo-label for each unlabeled sample as the label to train against.BUT heavily rely on the quality of the model’s class prediction.
    • unsupervised or self-supervised pre-training, followed by supervised fine-tuning and pseudo-labeling.BUT Self-supervised learning methods are task-agnostic.
  • graph-based semi-supervised learning: Not shown competitive performance on ImageNet.
  • CoMatch: a new semi-supervised learning method that addresses the existing limitations.
    • The classification head: Using memory-smoothed pseudo-labels, where pseudo-labels are refined by aggregating information from nearby samples in the embedding space.
    • The projection head: using contrastive learning on a pseudo-label graph, where samples with similar pseudo-labels are trained to have similar embeddings.
  • Consistency regularization.
  • Entropy minimization.
  • Self-supervised contrastive learning.
  • Graph-based semi-supervised learning.

4.3. Methodology

  • Overview
  • CoMatch
    • Memory-smoothed pseudo-labeling
    • Graph-based contrastive learning
  • Scalable learning with an EMA model

4.4. Experiments

4.5. Conclusion and Future Work

  • Co-training of class probabilities and image embeddings.
  • Memory-smoothed pseudo-labeling to mitigate confirmation bias.
  • Graph-based contrastive learning to learn better representations.

5. 图像生成—去噪扩散解码器

DiVAE : Photorealistic Images Synthesis with Denoising Diffusion Decoder


/3-Generate/1-Network/5-DPM/2022-DiVAE - Photorealistic Images Synthesis with Denoising Diffusion Decoder.pdf

5.1. Introduction

  • Diffusion models have shown be capacity to generate high-quality synthetic images.
  • Propose: VQ-VAE architecture model with a diffusion decoder (DiVAE)—work as the reconstructing component in image synthesis.
  • Motivation: generate more detailed and photorealistic images to improve the reconstructing stage of multi stage image synthesis.
  • Compare:
  • Visual Synthesis: AR、GANs、VAEs、diffusion models and flow-based models.
    • GANs: often difficult to train and defective in capturing of diversity.
    • Auto-regressive (AR): generate model have advantages in density modeling and stable training.
  • Diffusion models: likelihood-based models.
    • Denoising diffusion probabilistic models (DDPM): Produce high-quality images and the promising prospect in visual synthesis.
    • Denoising diffusion implicit models (DDIM): developed a approach to fast sampling.
    • Guided Diffusion: find that samples from a class conditional diffusion model with a independent classifier guidance can be significantly improved.
    • Classifier-Free Diffusion: propose classifier-free guidance that does not need to train a separate classifier model.

5.3. Methodology

5.4. Experiments

  • Quantitative Analysis:

  • Qualitative Analysis:

5.5. Conclusion and Future Work

  • Proposes a DiVAE with a diffusion decoder to generate more photorealistic and detailed images to improve the reconstructing stage of multi stage image synthesis.
  • Achieves state-of-the-art results on reconstruction of images comparing with existing approach and samples more detailed images on Text-to-Image tasks.

6. 下一步计划

  • 扩散模型相关论文阅读
  • 对比学习与视觉表征学习