Abstract: Recently, deepfakes have raised severe concerns about the authenticity of online media. Prior work on deepfake detection has put considerable effort into capturing intra-modal artifacts.
Abstract: Audio-visual zero-shot learning (ZSL) leverages both video and audio information for model training, aiming to classify new video categories that were not seen during training. However, ...
- checkpoints/ - audio-cond_animation/ - avsync15_audio-cond_cfg/ - landscapes_audio-cond_cfg/ - thegreatesthits_audio-cond_cfg/ - avsync/ - vggss_sync_contrast ...
The cinema provided a dedicated space for film screenings during a period when the charity experiences high demand. Crisis’ ...
In this paper, we propose a new multi-modal task, termed audio-visual instance segmentation (AVIS), which aims to simultaneously identify, segment and track individual sounding object instances in ...
SAM Audio is the first unified AI model that can segment sound from complex audio mixtures using text, visual, and time span prompts. This technology has the potential to transform audio and video ...