We introduce a 3D scene generation pipeline that creates immersive scenes with full 360° coverage from text prompts of any level of specificity.
We propose a new implicit neural representation that enables fast and accurate decomposition of face videos into blood and appearance components. This allows contactless estimation of heart rate from challenging out-of-distribution face videos.
We propose a CLIP-based language guidance that achieves SOTA semantic segmentation results in adverse weather on our own WeatherProof dataset, the A2I2-Haze segmentation dataset, and the popular ACDC dataset.
Feature 3DGS distills feature fields from 2D foundation models, opening the door to a new semantic, editable, and promptable explicit 3D scene representation.
We present a framework that leverages explicit radiance fields, monocular depth cues, and generative priors to enable 360° sparse view synthesis using 3D Gaussian Splatting.
We present a method for 3D generation of multi-object realistic scenes from text by utilizing text-to-image diffusion models and Gaussian radiance fields. These scenes are decomposable and editable at the object level.
We propose a semantic segmentation dataset of paired clean and adverse weather images, as well as a general paired-training method that can be applied to current foundation model architectures to improve performance in adverse conditions.
We propose a loss function for latent diffusion models that improves the perspective accuracy of generated images, allowing us to create synthetic data that helps improve SOTA monocular depth estimation models.
A novel algorithm to mitigate skin tone bias in remote heart rate estimation, with improved performance on our diverse, telemedicine-oriented VITAL dataset.
We introduce WeatherStream, an automatic pipeline capturing all real-world weather effects, along with their clean image pairs.
We present a method for inferring dense depth from a camera image and a sparse, noisy radar point cloud.
pCON learns to fit an image by learning a series of reconstructions with different singular values.
A novel pipeline that enables discovery of underlying parameters and equations from videos of physical phenomena.
Rethinking latent topologies for fast and detailed implicit 3D reconstructions.
Inclusion of minority samples improves test error for the majority group.
Bridging the sim2real domain gap by collecting a real paired single-image deraining dataset.
A first attempt at transferring light-skinned subjects to dark skin tones while preserving the pulse signals in the facial videos.
A multimodal fusion approach between camera and radar to achieve more equitable and robust plethysmography.
A scalable biophysical neural rendering method to generate biorealistic synthetic rPPG videos given any reference image and target rPPG signal as input.
Computational photography has become an increasingly active area of research within the computer vision community. The CCD workshop series serves as an annual gathering place for researchers and practitioners who design, build, and use computational cameras.
Medical devices can be biased across multiple axes, including the physics of light. How can we rethink the engineering of medical devices using Pareto principles?
This paper makes a first attempt to re-examine the shape from polarization (SfP) problem using physics-based deep learning.
An overview of the convergence of physics and AI for imaging and vision, and the path ahead.
Novel incorporation of polarization cues towards non-line-of-sight imaging.
The CVPR 2020 Tutorial on Visual Physics with Katerina Fragkiadaki, Laura Waller, Bill Freeman, and Ayan Chakrabarti.
Generalizing Physics-Based Learning (PBL), by making the first attempt to bring neural architecture search (NAS) to the realm of PBL.
A novel non-line-of-sight imaging framework with long-wave infrared.