Attention Paper Reading

Attention

Paper Reading

Published: 2023-04-22

Updated: 2023-04-29

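The goal of attention is to tell the model which inputs to focus on: it assigns a weight to each input and then combines the inputs into an output according to those weights. The input to attention can be anything, including all kinds of hidden variables.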
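As a minimal sketch of "weight each input, then combine", here is single-query scaled dot-product attention in plain Python. The function names and the toy keys and values are illustrative assumptions, not taken from any of the papers listed in this post.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Single-query scaled dot-product attention (illustrative sketch)."""
    d = len(query)
    # One score per input: how well the query matches each key.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # The output is the weighted combination of the values.
    dim_v = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(dim_v)]
    return output, weights

# Toy data (hypothetical): three inputs, each with a 2-d key and a 1-d value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
output, weights = attention([1.0, 0.0], keys, values)
```

The weights are larger for the keys that align with the query, and the output is a weighted average of the values.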

Attention also comes with the concept of positional encoding; the most common positional encoding is the sinusoidal one.
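The sinusoidal encoding can be sketched as follows, using the formula from "Attention Is All You Need": dimension 2i uses sin(pos / 10000^(2i/d_model)) and dimension 2i+1 uses the matching cosine. The function name and the table sizes are assumptions chosen for illustration.

```python
import math

def sinusoidal_positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal encodings."""
    table = []
    for pos in range(num_positions):
        row = []
        for j in range(d_model):
            # Dimensions 2i and 2i+1 share the same frequency.
            i = j // 2
            angle = pos / (10000 ** (2 * i / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# Hypothetical sizes: 4 positions, 6-dimensional model.
pe = sinusoidal_positional_encoding(num_positions=4, d_model=6)
```

Each row of `pe` is typically added to the embedding of the token at that position, so that otherwise order-agnostic attention can tell positions apart.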

Attention is currently used mainly in large-model architectures such as Transformer and BERT. Self-attention is only one module inside such a model.

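Normally q and k are N-dimensional, so the number of q*k products is very large: it grows as N^2 when every query is scored against every key. Much of the research on attention asks whether part of this computation can be fixed by human-designed rules or chosen by the model itself.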
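To make the cost concrete, the sketch below counts the (query, key) pairs that get scored under full attention and under one hand-written rule, a fixed local window. The window rule is only an assumed example of a "human-designed" pattern, of the kind used as one ingredient in Longformer; the function names are hypothetical.

```python
def full_attention_pairs(n):
    """Every query scores every key: n * n pairs."""
    return [(i, j) for i in range(n) for j in range(n)]

def sliding_window_pairs(n, window):
    """Hand-designed rule: query i only scores keys within `window` positions."""
    return [(i, j)
            for i in range(n)
            for j in range(max(0, i - window), min(n, i + window + 1))]

n = 8
full = full_attention_pairs(n)
local = sliding_window_pairs(n, window=1)
```

With a fixed window the number of pairs grows linearly in N rather than quadratically, at the price of each position seeing only its neighbours directly.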

[Paper Reading] Attention Is All You Need

[Paper Reading] Self-Attention Generative Adversarial Networks

Applies the self-attention mechanism to GANs.

[Paper Reading] Attention is not Explanation

[Paper Reading] Attention Augmented Convolutional Networks

[Paper Reading] Stand-Alone Self-Attention in Vision Models

[Paper Reading] Attention is not not Explanation

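[Paper Reading] Reformer: The Efficient Transformer

Finds the key positions in attention by clustering (here, by bucketing similar queries and keys with locality-sensitive hashing).

[Paper Reading] On the Relationship between Self-Attention and Convolutional Layers

The relationship between self-attention and convolutional neural networks.

[Paper Reading] Sparse Sinkhorn Attention

Uses another neural network to find the key positions in attention.

[Paper Reading] Efficient Content-Based Sparse Attention with Routing Transformers

Finds the key positions in attention by clustering (k-means in this paper).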

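[Paper Reading] Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

[Paper Reading] Learning to Encode Position for Transformer with Continuous Dynamical Model

[Paper Reading] Longformer: The Long-Document Transformer

Part of the computation follows human-designed rules (fixed sliding-window and global attention patterns).

[Paper Reading] ResNeSt: Split-Attention Networks

[Paper Reading] Exploring Self-attention for Image Recognition

[Paper Reading] Linformer: Self-Attention with Linear Complexity

[Paper Reading] Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

The relationship between self-attention and recurrent neural networks.

[Paper Reading] Big Bird: Transformers for Longer Sequences

Part of the computation follows human-designed rules (random, sliding-window, and global attention patterns).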

[Paper Reading] Efficient Transformers: A Survey

[Paper Reading] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Vision Transformer (ViT)

[Paper Reading] Long Range Arena: A Benchmark for Efficient Transformers

[Paper Reading] Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks

[Paper Reading] Attention Mechanisms in Computer Vision: A Survey
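The link between self-attention and recurrent networks noted for "Transformers are RNNs" in the list above can be sketched in a few lines. This is a simplified, single-head version under stated assumptions: the softmax is replaced by the feature map elu(x) + 1, and causal attention is computed step by step from a fixed-size running state. The function names and toy inputs are hypothetical and this is not the paper's reference implementation.

```python
import math

def feature_map(x):
    """elu(x) + 1, which keeps every feature positive."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def causal_linear_attention(queries, keys, values):
    """Process the sequence one step at a time, like an RNN."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    norm = [0.0] * d_k                         # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = feature_map(q), feature_map(k)
        # Update the state with the current key and value only.
        for a in range(d_k):
            norm[a] += phi_k[a]
            for b in range(d_v):
                state[a][b] += phi_k[a] * v[b]
        # Read out: a normalised weighted average of the values seen so far.
        denom = sum(phi_q[a] * norm[a] for a in range(d_k))
        outputs.append([sum(phi_q[a] * state[a][b] for a in range(d_k)) / denom
                        for b in range(d_v)])
    return outputs

# Toy sequence of length 2 (hypothetical values).
queries = [[0.5, -1.0], [1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 2.0]]
values = [[1.0], [3.0]]
outputs = causal_linear_attention(queries, keys, values)
```

The state has a size that does not depend on the sequence length, so each new position costs the same amount of work, which is what makes the recurrent view attractive for autoregressive decoding.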

JiJunhao

http://jijunhao.github.io/2023/04/22/article20230422/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit JiJunhao as the source when reposting.
