v0.5.0

EricLBuehler released this 24 Mar 04:16
7c086a9

Highlights

Blog post: https://huggingface.co/blog/EricB/mistralrs-v0-5-0

Thank you to all contributors to this release! Beyond the highlights listed below, it includes many smaller improvements, fixes, and optimizations.

  • Support for many more models:
    • Gemma 3
    • Qwen 2.5 VL
    • Mistral Small 3.1
    • Phi 4 Multimodal (image only)
  • Native tool calling support for:
    • Llama 3.1/3.2/3.3
    • Mistral Small 3
    • Mistral Nemo
    • Hermes 2 Pro
    • Hermes 3
  • Tensor Parallelism support (NCCL)!
  • FlashAttention V3 support and integration in PagedAttention
  • 30x reduction in ISQ (in-situ quantization) times on Metal!
  • Revamped prefix cacher system

Full Changelog: v0.4.0...v0.5.0