Attention Demo

Self-attention heatmaps for token sequences and Vision Transformer (ViT) patch grids, with Q/K/V and multi-head views.
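
As a rough illustration of what such a heatmap displays, the sketch below computes one scaled dot-product attention matrix per head, softmax(Q K^T / sqrt(d_head)), from random inputs. It is not the demo's own code: the names `attention_weights`, `to_grid`, and `random_matrix`, the random projection matrices, and the sizes are all hypothetical. It uses only the standard library so it runs anywhere; a real demo would more likely use NumPy or PyTorch. V is left out because the heatmap depends only on Q and K.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(x, w_q, w_k, num_heads):
    """Return one n x n attention matrix per head.

    x: n x d_model inputs (token or patch embeddings).
    w_q, w_k: d_model x d_model projections (hypothetical; random below).
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    q, k = matmul(x, w_q), matmul(x, w_k)
    heads = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        # scores[i][j] = (q_i . k_j); scaled by sqrt(d_head) before softmax
        k_t = [list(col) for col in zip(*k_h)]
        scores = matmul(q_h, k_t)
        heads.append(
            [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        )
    return heads


def to_grid(row, width):
    """Reshape one query's attention row into a patch grid (assumes no class token)."""
    return [row[i:i + width] for i in range(0, len(row), width)]


def random_matrix(rows, cols, rng):
    """Matrix of standard-normal samples, used here as stand-in data."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d_model, num_heads = 4, 8, 2  # 4 tokens, or a 2 x 2 patch grid
x = random_matrix(n, d_model, rng)
w_q = random_matrix(d_model, d_model, rng)
w_k = random_matrix(d_model, d_model, rng)

weights = attention_weights(x, w_q, w_k, num_heads)  # num_heads matrices, each n x n
patch_view = to_grid(weights[0][0], width=2)  # head 0, query 0, as a 2 x 2 grid
```

Each row of `weights[h]` sums to 1 and can be drawn directly as one row of a token-by-token heatmap for head `h`. For the patch-grid view, `to_grid` reshapes a single query's row so it can be overlaid on the image.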