Paper/Code Express, 2022.10.17!

2022-12-11 12:54:17

Compiled by: AI算法与图像处理

CVPR2022 papers and code collection: https://github.com/DWCTOD/CVPR2022-Papers-with-Code-Demo

ECCV2022 papers and code collection: https://github.com/DWCTOD/ECCV2022-Papers-with-Code-Demo

Latest results demo:

Title: SCAM! Transferring humans between images with Semantic Cross Attention Modulation

Project page: https://imagine.enpc.fr/~dufourn/publications/scam.html

Code: https://github.com/nicolas-dufour/SCAM

Paper: https://arxiv.org/pdf/2210.04883v1.pdf

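A large body of recent work targets semantically conditioned image generation. Most of these methods focus on the narrower task of pose transfer and ignore the more challenging task of subject transfer, which means transferring not only the pose but also the appearance and the background. In this work, we introduce SCAM (Semantic Cross Attention Modulation), a system that encodes rich and diverse information in each semantic region of the image (including foreground and background), and thereby achieves precise generation with an emphasis on fine details. This is enabled by the Semantic Attention Transformer Encoder, which extracts multiple latent vectors for each semantic region, and by the corresponding generator, which exploits these latents through semantic cross attention modulation. The model is trained using only a reconstruction setup, while subject transfer is performed at test time. Our analysis shows that the proposed architecture successfully encodes the diversity of appearance in each semantic region. Extensive experiments on the iDesigner and CelebAMask-HQ datasets show that SCAM outperforms SEAN and SPADE; moreover, it sets a new state of the art on subject transfer.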
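To make the idea of region-restricted cross attention more concrete, here is a minimal NumPy sketch. It is an illustration under simplifying assumptions, not the authors' implementation: the function name `semantic_cross_attention`, the argument layout, and the absence of learned query/key/value projections are all hypothetical choices made for this example. Each pixel feature attends only to the latent vectors that carry the same semantic label.

```python
import numpy as np

def semantic_cross_attention(pixels, latents, pixel_labels, latent_labels):
    """Toy masked cross attention (hypothetical helper, not the SCAM code).

    pixels:        (N, d) pixel features, used directly as queries
    latents:       (K, d) latent vectors, used as both keys and values
    pixel_labels:  (N,)   semantic region id of each pixel
    latent_labels: (K,)   semantic region id of each latent
    Assumes every region present in pixel_labels owns at least one latent.
    """
    d = pixels.shape[1]
    # True where a pixel and a latent belong to the same semantic region.
    mask = pixel_labels[:, None] == latent_labels[None, :]
    assert mask.any(axis=1).all(), "each pixel needs a latent in its region"

    # Scaled dot-product scores; latents from other regions are masked out.
    scores = pixels @ latents.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)

    # Numerically stable softmax over the latents of each pixel's region.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each pixel receives a mixture of its own region's latents.
    return weights @ latents, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pixels = rng.normal(size=(4, 8))
    latents = rng.normal(size=(4, 8))
    labels = np.array([0, 0, 1, 1])
    out, w = semantic_cross_attention(pixels, latents, labels, labels)
    print(out.shape, w.round(2))
```

In this simplified picture, subject transfer at test time would amount to passing in latents encoded from a different image for the regions to be swapped, while keeping the target's pixel labels.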

Latest papers

ECCV2022

Updated on : 17 Oct 2022
total number : 4

Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization

  • Paper: http://arxiv.org/pdf/2210.07764
  • Code: None

The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis

  • Paper: http://arxiv.org/pdf/2210.07489
  • Code: https://github.com/naver/garnet

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction

  • Paper: http://arxiv.org/pdf/2210.07424
  • Code: None

Task Grouping for Multilingual Text Recognition

  • Paper: http://arxiv.org/pdf/2210.07423
  • Code: None

CVPR2022

NeurIPS

Updated on : 17 Oct 2022
total number : 13

Learnable Polyphase Sampling for Shift Invariant and Equivariant Convolutional Networks

  • Paper: http://arxiv.org/pdf/2210.08001
  • Code: None

AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments

  • Paper: http://arxiv.org/pdf/2210.07940
  • Code: None

MOVE: Unsupervised Movable Object Segmentation and Detection

  • Paper: http://arxiv.org/pdf/2210.07920
  • Code: None

One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations

  • Paper: http://arxiv.org/pdf/2210.07883
  • Code: https://github.com/KumapowerLIU/FFCLIP

Model-Based Imitation Learning for Urban Driving

  • Paper: http://arxiv.org/pdf/2210.07729
  • Code: https://github.com/wayveai/mile

Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?

  • Paper: http://arxiv.org/pdf/2210.07681
  • Code: https://github.com/dendorferpatrick/QuoVadis

DART: Articulated Hand Model with Diverse Accessories and Rich Textures

  • Paper: http://arxiv.org/pdf/2210.07650
  • Code: None

Mix and Reason: Reasoning over Semantic Topology with Data Mixing for Domain Generalization

  • Paper: http://arxiv.org/pdf/2210.07571
  • Code: None

TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

  • Paper: http://arxiv.org/pdf/2210.07562
  • Code: https://github.com/mlvlab/tokenmixup

Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation

  • Paper: http://arxiv.org/pdf/2210.07506
  • Code: https://github.com/peihaochen/ws-mgmap

Learning Active Camera for Multi-Object Navigation

  • Paper: http://arxiv.org/pdf/2210.07505
  • Code: None

Evaluating Out-of-Distribution Performance on Document Image Classifiers

  • Paper: http://arxiv.org/pdf/2210.07448
  • Code: None

A Consistent and Differentiable Lp Canonical Calibration Error Estimator

  • Paper: http://arxiv.org/pdf/2210.07810
  • Code: https://github.com/tpopordanoska/ece-kde
