Metadata-Version: 2.1
Name: faster-norm
Version: 0.3.0
Summary: A fast, yet specialized, RMSNorm/LayerNorm implementation
Home-page: https://github.com/yuantailing/faster-norm
Author: Tailing Yuan
Author-email: yuantailing@gmail.com
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.3
License-File: LICENSE

# faster-norm

A fast, yet specialized, RMSNorm/LayerNorm implementation

This library is under development. Currently, only some special cases are supported, and performance is not yet fully optimized.

- [x] RMSNorm
- [x] LayerNorm
- [x] Float16 and BFloat16
- [ ] More data types
- [x] More shapes
- [x] Optimize for no wgrad
- [ ] Performance tuning
- [ ] Optimize compilation time
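
For readers unfamiliar with the two operations, here is a minimal pure-Python sketch of what RMSNorm and LayerNorm compute over a single row. This is not this library's API or its CUDA implementation. The function names `rms_norm` and `layer_norm`, the `eps` defaults, and the use of a per-element `weight` and `bias` are all assumptions chosen for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm for one row: x / sqrt(mean(x^2) + eps) * weight.

    Hypothetical helper for illustration only; not part of faster-norm.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def layer_norm(x, weight, bias, eps=1e-5):
    """Reference LayerNorm for one row: (x - mean) / sqrt(var + eps) * weight + bias.

    Hypothetical helper for illustration only; not part of faster-norm.
    """
    mean = sum(x) / len(x)
    # Biased (population) variance, as is conventional for LayerNorm.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]


if __name__ == "__main__":
    row = [1.0, 2.0, 3.0, 4.0]
    ones = [1.0] * len(row)
    zeros = [0.0] * len(row)
    print(rms_norm(row, ones))
    print(layer_norm(row, ones, zeros))
```

The difference between the two is that RMSNorm skips the mean subtraction and the bias term, so it needs one fewer reduction per row.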


## Statement

This work was independently completed by me at home using my personal RTX 3080.
