Compiled by: AI算法与图像处理
CVPR2022 papers and code collection: https://github.com/DWCTOD/CVPR2022-Papers-with-Code-Demo
ECCV2022 papers and code collection: https://github.com/DWCTOD/ECCV2022-Papers-with-Code-Demo
Latest results demo:
ECCV2022 | XMem: High-Quality Long-Term Video Segmentation!
Outstanding results!
Title: XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Paper: https://arxiv.org/pdf/2207.07115.pdf
Code: https://github.com/hkchengrex/XMem
Abstract:
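We present XMem, a video object segmentation architecture for long videos with unified feature memory stores, inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically uses only one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact, and thus sustained, long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (which do not work on long videos) on short-video datasets.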
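To make the consolidation idea in the abstract concrete, here is a toy Python sketch. It is not XMem's actual implementation: it only assumes a bounded working memory that counts how often each element is read, and, once full, promotes the most-read elements to a long-term store while discarding the rest. The class name `ToyMemory`, the parameters `capacity` and `keep`, and the read-count selection rule are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Toy two-tier memory: a bounded working store plus a compact long-term store."""
    capacity: int = 4   # hypothetical: max working-memory elements before consolidation
    keep: int = 2       # hypothetical: elements promoted per consolidation
    working: dict = field(default_factory=dict)    # key -> feature
    usage: dict = field(default_factory=dict)      # key -> read count
    long_term: dict = field(default_factory=dict)  # key -> feature

    def write(self, key, feature):
        """Add an element to working memory, consolidating first if it is full."""
        if len(self.working) >= self.capacity:
            self.consolidate()
        self.working[key] = feature
        self.usage[key] = 0

    def read(self, key):
        """Return a feature from either store, counting reads of working memory."""
        if key in self.working:
            self.usage[key] += 1
            return self.working[key]
        return self.long_term.get(key)

    def consolidate(self):
        """Promote the most-read working elements to long-term memory; drop the rest."""
        ranked = sorted(self.working, key=lambda k: self.usage[k], reverse=True)
        for key in ranked[: self.keep]:
            self.long_term[key] = self.working[key]
        self.working.clear()
        self.usage.clear()


memory = ToyMemory(capacity=3, keep=1)
for frame in range(3):
    memory.write(frame, f"feat{frame}")
memory.read(1)
memory.read(1)
memory.read(2)
memory.write(3, "feat3")  # working memory is full, so this triggers consolidation
```

In this toy run, frame 1 is read most often, so it is the only element that survives consolidation into the long-term store. The long-term store therefore grows by at most `keep` elements per consolidation rather than by one element per frame, which is the memory-growth trade-off the abstract describes.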
Latest papers
ECCV2022
Updated on: 22 Jul 2022
Total number: 47
Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions
- Paper: http://arxiv.org/pdf/2207.10667
- Code: https://github.com/theo2021/onda
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
- Paper: http://arxiv.org/pdf/2207.10666
- Code: https://github.com/microsoft/cream
Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
- Paper: http://arxiv.org/pdf/2207.10664
- Code: https://github.com/visipedia/ssw60
In Defense of Online Models for Video Instance Segmentation
- Paper: http://arxiv.org/pdf/2207.10661
- Code: https://github.com/wjf5203/vnext
Novel Class Discovery without Forgetting
- Paper: http://arxiv.org/pdf/2207.10659
- Code: None
Generative Multiplane Images: Making a 2D GAN 3D-Aware
- Paper: http://arxiv.org/pdf/2207.10642
- Code: https://github.com/apple/ml-gmpi
Approximate Differentiable Rendering with Algebraic Surfaces
- Paper: http://arxiv.org/pdf/2207.10606
- Code: None
Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression
- Paper: http://arxiv.org/pdf/2207.10564
- Code: https://github.com/jinyeying/night-enhancement
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
- Paper: http://arxiv.org/pdf/2207.10448
- Code: None
Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
- Paper: http://arxiv.org/pdf/2207.10447
- Code: https://github.com/164140757/scm
Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation
- Paper: http://arxiv.org/pdf/2207.10436
- Code: https://github.com/guoleisun/vss-mrcfa
Human Trajectory Prediction via Neural Social Physics
- Paper: http://arxiv.org/pdf/2207.10435
- Code: https://github.com/realcrane/human-trajectory-prediction-via-neural-social-physics
D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights
- Paper: http://arxiv.org/pdf/2207.10398
- Code: https://github.com/vtp-tl/d2-tpred
FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling
- Paper: http://arxiv.org/pdf/2207.10392
- Code: None
Error Compensation Framework for Flow-Guided Video Inpainting
- Paper: http://arxiv.org/pdf/2207.10391
- Code: None
NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition
- Paper: http://arxiv.org/pdf/2207.10388
- Code: None
Pose for Everything: Towards Category-Agnostic Pose Estimation
- Paper: http://arxiv.org/pdf/2207.10387
- Code: https://github.com/luminxu/Pose-for-Everything
Temporal Saliency Query Network for Efficient Video Recognition
- Paper: http://arxiv.org/pdf/2207.10379
- Code: None
LocVTP: Video-Text Pre-training for Temporal Localization
- Paper: http://arxiv.org/pdf/2207.10362
- Code: https://github.com/mengcaopku/locvtp
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
- Paper: http://arxiv.org/pdf/2207.10345
- Code: https://github.com/cheeun/cadyq
UFO: Unified Feature Optimization
- Paper: http://arxiv.org/pdf/2207.10341
- Code: None
OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search
- Paper: http://arxiv.org/pdf/2207.10320
- Code: None
AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection
- Paper: http://arxiv.org/pdf/2207.10316
- Code: https://github.com/zehuichen123/autoalignv2
SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer
- Paper: http://arxiv.org/pdf/2207.10315
- Code: https://github.com/hrzhou2/seedformer
AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields
- Paper: http://arxiv.org/pdf/2207.10312
- Code: None
Towards Accurate Open-Set Recognition via Background-Class Regularization
- Paper: http://arxiv.org/pdf/2207.10287
- Code: None
Grounding Visual Representations with Texts for Domain Generalization
- Paper: http://arxiv.org/pdf/2207.10285
- Code: https://github.com/mswzeus/gvrt
DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta
- Paper: http://arxiv.org/pdf/2207.10271
- Code: https://github.com/bcmi/deltagan-few-shot-image-generation
Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis
- Paper: http://arxiv.org/pdf/2207.10257
- Code: https://github.com/jgkwak95/surf-gan
SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition
- Paper: http://arxiv.org/pdf/2207.10256
- Code: None
SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
- Paper: http://arxiv.org/pdf/2207.10237
- Code: https://github.com/apple/ml-spin
MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis
- Paper: http://arxiv.org/pdf/2207.10228
- Code: None
On Label Granularity and Object Localization
- Paper: http://arxiv.org/pdf/2207.10225
- Code: https://github.com/visipedia/inat_loc
Spotting Temporally Precise, Fine-Grained Events in Video
- Paper: http://arxiv.org/pdf/2207.10213
- Code: None
2D GANs Meet Unsupervised Single-view 3D Reconstruction
- Paper: http://arxiv.org/pdf/2207.10183
- Code: None
Controllable and Guided Face Synthesis for Unconstrained Face Recognition
- Paper: http://arxiv.org/pdf/2207.10180
- Code: None
Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles
- Paper: http://arxiv.org/pdf/2207.10172
- Code: None
GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning
- Paper: http://arxiv.org/pdf/2207.10158
- Code: https://github.com/seleucia/goca
Visual Knowledge Tracing
- Paper: http://arxiv.org/pdf/2207.10157
- Code: https://github.com/nkondapa/visualknowledgetracing
Tackling Long-Tailed Category Distribution Under Domain Shifts
- Paper: http://arxiv.org/pdf/2207.10150
- Code: https://github.com/guxiao0822/lt-ds
Latent Discriminant deterministic Uncertainty
- Paper: http://arxiv.org/pdf/2207.10130
- Code: https://github.com/ensta-u2is/ldu
Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance
- Paper: http://arxiv.org/pdf/2207.10123
- Code: https://github.com/zzh-tech/Animation-from-Blur
BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis
- Paper: http://arxiv.org/pdf/2207.10120
- Code: https://github.com/dmoltisanti/brace
Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
- Paper: http://arxiv.org/pdf/2207.10188
- Code: None
Structural Causal 3D Reconstruction
- Paper: http://arxiv.org/pdf/2207.10156
- Code: None
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
- Paper: http://arxiv.org/pdf/2207.10141
- Code: None
Continual Variational Autoencoder Learning via Online Cooperative Memorization
- Paper: http://arxiv.org/pdf/2207.10131
- Code: https://github.com/dtuzi123/ovae