<aside> 💡 For more efficient Transformer implementations.

</aside>

My Publications


Scientific LLM

StableMask: Refining Causal Masking in Decoder-only Transformer

DLCNet: Enabling Long-Range Convolution with Data Dependency

MLSys + AI

Triton Tutorials

Guides & Processes


[IMPORTANT] All I hope to learn

MikaStars @ Research

Reports