https://arxiv.org/pdf/2001.04451.pdf

Abstract

Introduction

Locality-sensitive hashing attention

Dot-product attention

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/0d0b0f0c-e777-47e4-bbd6-f95b7ac120d3/Untitled.png

Memory-efficient attention