Abstract: Understanding human interactions is crucial in various applications, including robotics, automated systems, human-computer interaction, and video surveillance. Many studies have ...
Synaesthesia is a perceptual condition where one sense triggers an experience in another sense. For some people, sounds ...
Abstract: Text-based Visual Question Answering (TextVQA) focuses on answering questions about the scene text in images. Most works in this field use transformer-based models to model the ...