Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix

About

In multi-modality knowledge distillation research, existing methods have mainly focused on learning only the teacher's final output. As a result, deep differences remain between the teacher network and the student network. It is necessary to force the student network to learn the modality relationship information of the teacher network. To transfer knowledge from teachers to students effectively, a novel modality relation distillation paradigm is adopted that models the relationship information among different modalities, that is, learning the teacher's modality-level Gram Matrix.
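
The summary above does not give the exact loss, so the following is only a minimal sketch of what matching a modality-level Gram matrix could look like. It assumes each network exposes one pooled embedding per modality, that the Gram matrix is built from cosine similarities between those embeddings, and that the student is trained to match the teacher's matrix under a mean squared error. The function names `modality_gram` and `gram_distillation_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np


def modality_gram(features):
    """Return the (num_modalities x num_modalities) Gram matrix.

    `features` has shape (num_modalities, dim): one pooled embedding per
    modality (e.g. one row for image, one row for text).
    Assumption: rows are L2-normalised first, so each entry is a cosine
    similarity between two modalities.
    """
    f = np.asarray(features, dtype=float)
    norms = np.linalg.norm(f, axis=1, keepdims=True)
    f = f / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return f @ f.T


def gram_distillation_loss(teacher_feats, student_feats):
    """Mean squared difference between teacher and student Gram matrices.

    Assumption: MSE is used as the matching criterion; the paper may use
    a different distance.
    """
    g_teacher = modality_gram(teacher_feats)
    g_student = modality_gram(student_feats)
    return float(np.mean((g_teacher - g_student) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(2, 8))  # 2 modalities, teacher width 8
    student = rng.normal(size=(2, 4))  # 2 modalities, student width 4
    print(gram_distillation_loss(teacher, student))
```

Because the Gram matrix has one row and column per modality, its shape does not depend on the embedding width, so under these assumptions a narrower student can be compared with a wider teacher without an extra projection layer.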

Peng Liu • 2021

Related benchmarks

Task                    Dataset                   Metric            Result  Rank
Visual Entailment       SNLI-VE (test)            Overall Accuracy  72.45   199
Visual Entailment       SNLI-VE (val)             Overall Accuracy  72.66   111
Hateful Meme Detection  Hateful Memes (test)      --                --      67
Hate Speech Detection   Hateful-Memes (HM) (val)  Accuracy          69.85   2
Visual Reasoning        NLVR (test)               Accuracy          75.33   2
Visual Reasoning        NLVR (val)                Accuracy          75.06   2
