v0.4.0

@XkunW released this 28 Nov 18:21
d221dae
  • Onboarded various new models and new model types: text embedding models and reward reasoning models.
  • Added a metrics command that streams performance metrics for the inference server.
  • Added more options to the launch command: --max-num-seqs, --model-weights-parent-dir, --pipeline-parallelism, --enforce-eager.
  • Improved support for launching custom models.
  • Improved command response time.
  • Improved the visual output of the list command.