

Poster

Mean-Shift Feature Transformer

Takumi Kobayashi

Arch 4A-E Poster #120
[ Paper PDF ] [ Poster ]
Wed 19 Jun 5 p.m. PDT — 6:30 p.m. PDT

Abstract:

Transformer models developed in NLP have made a great impact on computer vision, producing promising performance on various tasks. While multi-head attention, a characteristic mechanism of the transformer, attracts keen research interest, such as for reducing computation cost, we analyze the transformer model from the viewpoint of feature transformation based on the distribution of input feature tokens. The analysis inspires us to derive a novel transformation method from the mean-shift update, an effective gradient ascent for seeking a local mode of distinctive representation on the token distribution. We also present an efficient projection approach to reduce the parameter size of the linear projections constituting the proposed multi-head feature transformation. In experiments on the ImageNet-1K dataset, the proposed methods are embedded into various network models in place of the transformer module and exhibit favorable performance improvement.
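To illustrate the idea behind a mean-shift update on token features, the following is a minimal sketch, not the paper's actual module: each token is moved toward a kernel-weighted mean of all tokens, which is the mean-shift gradient-ascent step toward a local mode of the token distribution and has the same form as softmax attention with distance-based weights. The function name, the Gaussian kernel, and the `bandwidth` parameter are assumptions for illustration; the paper's multi-head form with learned linear projections is not reproduced here.

```python
import torch

def mean_shift_update(tokens: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """One illustrative mean-shift step over a set of feature tokens.

    tokens: (N, D) matrix of N feature tokens of dimension D.
    """
    # Pairwise squared Euclidean distances between tokens: (N, N)
    d2 = torch.cdist(tokens, tokens).pow(2)
    # Gaussian-kernel weights, row-normalized (analogous to a softmax attention map)
    w = torch.softmax(-d2 / (2.0 * bandwidth ** 2), dim=-1)
    # Mean-shift step: move each token toward the weighted mean of all tokens,
    # i.e. a gradient-ascent step toward a local mode of the token distribution
    return w @ tokens

# Usage: apply one update to 196 random 64-dimensional tokens
x = torch.randn(196, 64)
x_shifted = mean_shift_update(x, bandwidth=2.0)
```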
