Timezone: America/Los_Angeles
MON 17 JUN
8 a.m.
Workshop:
(ends 6:00 PM)
Workshop:
(ends 1:00 PM)
8:25 a.m.
8:30 a.m.
Workshop:
(ends 5:30 PM)
Workshop:
(ends 5:30 PM)
Workshop:
(ends 12:00 PM)
Workshop:
(ends 5:30 PM)
Workshop:
(ends 5:30 PM)
Workshop:
The 4th Workshop of Adversarial Machine Learning on Computer Vision: Robustness of Foundation Models
(ends 5:30 PM)
8:45 a.m.
9 a.m.
10 a.m.
noon
12:45 p.m.
1 p.m.
Workshop:
(ends 5:45 PM)
1:20 p.m.
Workshop:
(ends 6:00 PM)
1:30 p.m.
Workshop:
(ends 5:30 PM)
Tutorial:
(ends 5:00 PM)
Tutorial:
(ends 5:30 PM)
Tutorial:
(ends 6:00 PM)
2 p.m.
Tutorial:
(ends 5:30 PM)
3 p.m.
TUE 18 JUN
7:50 a.m.
8 a.m.
Workshop:
(ends 12:05 PM)
Workshop:
(ends 6:00 PM)
8:10 a.m.
8:20 a.m.
8:30 a.m.
Workshop:
(ends 5:30 PM)
Workshop:
(ends 12:30 PM)
Tutorial:
(ends 5:00 PM)
8:45 a.m.
8:50 a.m.
9 a.m.
Workshop:
(ends 6:10 PM)
Workshop:
(ends 5:00 PM)
Tutorial:
(ends 6:00 PM)
Tutorial:
(ends 5:00 PM)
9:30 a.m.
10 a.m.
noon
1 p.m.
1:30 p.m.
Workshop:
(ends 5:30 PM)
Workshop:
(ends 6:00 PM)
Workshop:
(ends 6:00 PM)
Tutorial:
(ends 6:00 PM)
2 p.m.
3 p.m.
WED 19 JUN
8:30 a.m.
9 a.m.
Orals 9:00-10:30
[9:00]
Specularity Factorization for Low-Light Enhancement
[9:18]
FlowIE: Efficient Image Enhancement via Rectified Flow
[9:36]
Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach
[9:54]
Bilateral Event Mining and Complementary for Event Stream Super-Resolution
[10:12]
FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors
[9:18]
Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
[9:36]
Eclipse: Disambiguating Illumination and Materials using Unintended Shadows
[9:54]
Objects as Volumes: A Stochastic Geometry View of Opaque Solids
[10:12]
DiffusionLight: Light Probes for Free by Painting a Chrome Ball
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild
[9:18]
URHand: Universal Relightable Hands
[9:36]
Relightable Gaussian Codec Avatars
[9:54]
Semantic Human Mesh Reconstruction with Textures
[10:12]
Stratified Avatar Generation from Sparse Observations
(ends 10:30 AM)
10:30 a.m.
Posters 10:30-12:00
Adapt or Perish: Adaptive Sparse Transformer with Attentive Feature Refinement for Image Restoration
ZERO-IG: Zero-Shot Illumination-Guided Joint Denoising and Adaptive Enhancement for Low-Light Images
RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation
(ends 12:00 PM)
(ends 6:45 PM)
11 a.m.
noon
1 p.m.
Orals 1:00-2:30
[1:00]
FreeU: Free Lunch in Diffusion U-Net
[1:18]
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
[1:36]
Instruct-Imagen: Image Generation with Multi-modal Instruction
[1:54]
Attention Calibration for Disentangled Text-to-Image Personalization
[2:12]
Style Aligned Image Generation via Shared Attention
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
Neural Redshift: Random Networks are not Random Functions
[1:18]
Neural Lineage
[1:36]
Learning Structure-from-Motion with Graph Attention Networks
[1:54]
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
[2:12]
In Search of a Data Transformation That Accelerates Neural Field Training
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
Point Transformer V3: Simpler Faster Stronger
[1:18]
Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
[1:36]
Seeing the World through Your Eyes
[1:54]
Tri-Perspective View Decomposition for Geometry-Aware Depth Completion
[2:12]
Steerers: A Framework for Rotation Equivariant Keypoint Descriptors
(ends 2:30 PM)
1:15 p.m.
Expo Track Keynote:
Swami Sivasubramanian
(ends 2:15 PM)
2:30 p.m.
2:45 p.m.
3:45 p.m.
4 p.m.
Panel:
Fei-Fei Li · Matt McIlwain · Hadi Partovi · Oren Etzioni · Peter Lee
(ends 5:00 PM)
5 p.m.
Posters 5:00-6:30
AdaShift: Learning Discriminative Self-Gated Neural Feature Activation With an Adaptive Shift Factor
Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning
Learning Adaptive Spatial Coherent Correlations for Speech-Preserving Facial Expression Manipulation
DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation
Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection
AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
(ends 6:30 PM)
7 p.m.
THU 20 JUN
8:30 a.m.
9 a.m.
Orals 9:00-10:30
[9:00]
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
[9:18]
EscherNet: A Generative Model for Scalable View Synthesis
[9:36]
WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects Under Occlusion
[9:54]
Diffusion-FOF: Single-View Clothed Human Reconstruction via Diffusion-Based Fourier Occupancy Field
[10:12]
Rethinking Inductive Biases for Surface Normal Estimation
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
[9:18]
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
[9:36]
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
[9:54]
LISA: Reasoning Segmentation via Large Language Model
[10:12]
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
EventPS: Real-Time Photometric Stereo Using an Event Camera
[9:18]
EvDiG: Event-guided Direct and Global Components Separation
[9:36]
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation
[9:54]
Transcriptomics-guided Slide Representation Learning in Computational Pathology
[10:12]
Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration
(ends 10:30 AM)
10:30 a.m.
(ends 6:45 PM)
Expo Track Keynote:
Andrea Gagliano
(ends 11:30 AM)
Posters 10:30-12:00
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects
Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Virtual Immunohistochemistry Staining for Histological Images Assisted by Weakly-supervised Learning
Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance
ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
(ends 12:00 PM)
11:30 a.m.
noon
1 p.m.
Orals 1:00-2:30
[1:00]
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection
[1:18]
UnO: Unsupervised Occupancy Fields for Perception and Forecasting
[1:36]
EgoGen: An Egocentric Synthetic Data Generator
[1:54]
Learning to Segment Referred Objects from Narrated Egocentric Videos
[2:12]
Producing and Leveraging Online Map Uncertainty in Trajectory Prediction
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes
[1:18]
SpiderMatch: 3D Shape Matching with Global Optimality and Geometric Consistency
[1:36]
PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness
[1:54]
PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar
[2:12]
A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations
[1:18]
An N-Point Linear Solver for Line and Motion Estimation with Event Cameras
[1:36]
RoHM: Robust Human Motion Reconstruction via Diffusion
[1:54]
Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation
[2:12]
FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment
(ends 2:30 PM)
1:30 p.m.
(ends 2:30 PM)
2:30 p.m.
2:45 p.m.
3:45 p.m.
4 p.m.
5 p.m.
(ends 6:30 PM)
7 p.m.
FRI 21 JUN
9 a.m.
Expo Track Keynote:
Ece Kamar
(ends 10:00 AM)
Orals 9:00-10:30
[9:00]
Deep Generative Model based Rate-Distortion for Image Downscaling Assessment
[9:18]
360+x: A Panoptic Multi-modal Scene Understanding Dataset
[9:36]
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
[9:54]
Rich Human Feedback for Text-to-Image Generation
[10:12]
BioCLIP: A Vision Foundation Model for the Tree of Life
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
Grounding and Enhancing Grid-based Models for Neural Fields
[9:18]
NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
[9:36]
Mip-Splatting: Alias-free 3D Gaussian Splatting
[9:54]
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction
[10:12]
Learning to Produce Semi-dense Correspondences for Visual Localization
(ends 10:30 AM)
Orals 9:00-10:30
[9:00]
CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning
[9:18]
MLP Can Be A Good Transformer Learner
[9:36]
From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation
[9:54]
LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content
[10:12]
Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps
(ends 10:30 AM)
10:30 a.m.
(ends 6:45 PM)
Posters 10:30-12:00
NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency
DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization
(ends 12:00 PM)
noon
1 p.m.
Orals 1:00-2:30
[1:00]
LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network
[1:18]
S2MAE: A Spatial-Spectral Pretraining Foundation Model for Spectral Remote Sensing Data
[1:36]
Task-Driven Wavelets using Constrained Empirical Risk Minimization
[1:54]
Image Processing GNN: Breaking Rigidity in Super-Resolution
[2:12]
DART: Implicit Doppler Tomography for Radar Novel View Synthesis
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
Alchemist: Parametric Control of Material Properties with Diffusion Models
[1:18]
Generative Image Dynamics
[1:36]
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
[1:54]
MonoHair: High-Fidelity Hair Modeling from a Monocular Video
[2:12]
Analyzing and Improving the Training Dynamics of Diffusion Models
(ends 2:30 PM)
Orals 1:00-2:30
[1:00]
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
[1:18]
Describing Differences in Image Sets with Natural Language
[1:36]
NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models
[1:54]
MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning
[2:12]
EGTR: Extracting Graph from Transformer for Scene Graph Generation
(ends 2:30 PM)
2:30 p.m.
2:45 p.m.
3:45 p.m.
4 p.m.
5 p.m.
Posters 5:00-6:30
Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay
Byzantine-robust Decentralized Federated Learning via Dual-domain Clustering and Trust Bootstrapping
Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack
PixelRNN: In-pixel Recurrent Neural Networks for End-to-end–optimized Perception with Neural Sensors
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
(ends 6:30 PM)