VATT Paper Battle Presentation

vatt project preview

Overview

This is a paper battle presentation for the Visual Recognition with Transformers class taught by Gedas Bertasius at UNC in the Spring 2024 semester. In a paper battle, two groups of students each present and defend the strengths of their assigned paper and rebut the opposing paper. The class then votes on which paper should be accepted, simulating a conference paper selection process. I presented the VATT paper in class with David K. and Junjie.

Links:

Presentation slides: link

Main Challenges

  • Master the skill of reading CV research papers critically, and deliver a clear and concise presentation.
  • Use adequate visual cues and top-down structure for an engaging and easy-to-follow presentation.

My Contributions

vatt architecture

  • I created and presented the slides on the motivation, prior work (and its limitations), and the VATT architecture in class.
  • To prepare for this presentation, I read related papers and studied their approaches, bringing together the concepts of vision transformers, self-supervised learning, and multi-modal learning.

Outcome & Impact

The presentation (slides) received good feedback from my instructor and the seminar class, which showed that our hard work over the past week had paid off.

Instructor Feedback

  • Good overview of the problem and the motivation behind it.
  • Excellent discussion on prior works in self-supervised learning and its limitations.
  • Good top-down structure of your presentation. Starting with high-level overview of the key components, then gradually transitioning into lower-level technical details. Clear and easy to follow.
  • I really like your high-level overview of the VATT architecture: ViT + MMV = VATT.
  • Overview of modality-agnostic and modality-specific aspects is very intuitive.
  • Good technical descriptions. They were all detailed, yet easy to follow. The motivation behind technical designs was also well explained.
  • Excellent design of the slides. The slides were simple and easy to follow. You used text sparingly without overwhelming the audience. You used lots of visual cues, which made your presentation engaging and easy to understand the key messages associated with each slide.
  • Excellent empirical analysis with interesting insights and good experimental descriptions.
  • A well-timed presentation. You covered a lot of information but still managed to fit your presentation in time.