
Releases: VectorInstitute/vector-inference

v0.5.0

20 Mar 20:36
45ef380

What's Changed

  • Decoupled the model config from the repo; config priority, in descending order: user-defined config, cached config, default config
  • Refactored the code base; command logic moved to helper classes
  • Retired the server launch bash script; server launching logic moved to Python
  • Updated the metrics command to use the metrics API endpoint
  • Exposed additional launch parameters in the CLI
  • Automated the Docker image build and push to Docker Hub
  • Added unit tests for _cli.py, _utils.py, and imports; improved test coverage
  • Added server launch info JSON to logging
  • Miscellaneous housekeeping fixes
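One way to read the descending config priority is a first-match fallback over config sources. This is a minimal sketch, assuming each source resolves to a config path or an empty string when absent; the function name and file names are illustrative, not actual vec-inf internals:

```shell
resolve_config() {
  # Return the first non-empty config in descending priority:
  # user-defined, then cached, then default.
  for cfg in "$@"; do
    if [ -n "$cfg" ]; then
      echo "$cfg"
      return 0
    fi
  done
  return 1
}

# A user-defined config takes precedence over cached and default configs.
resolve_config "user-models.yaml" "cached-models.yaml" "default-models.yaml"
```

With no user-defined config (an empty first argument), the cached config would win, falling back to the default config last.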

New Models

  • VLM:
    • Phi-3.5-vision-instruct
    • Molmo-7B-D-0924
    • InternVL2_5-8B
    • glm-4v-9b
    • deepseek-vl2
    • deepseek-vl2-small
  • Reward Modelling:
    • Qwen2.5-Math-PRM-7B

New Contributors

Other Contributors

@amrit110 @XkunW @fcogidi @xeon27 @kohankhaki

Full Changelog: v0.4.1...v0.5.0

v0.4.1

14 Feb 19:28

What's Changed

New Contributors

Full Changelog: v0.4.0...v0.4.1

v0.4.0.post1

28 Nov 19:23
dba901b
  • Fixed an incorrect dependency
  • Updated README files

v0.4.0

28 Nov 18:21
d221dae
  • Onboarded various new models and new model types: text embedding models and reward reasoning models.
  • Added a metrics command that streams performance metrics for the inference server.
  • Enabled more launch command options: --max-num-seqs, --model-weights-parent-dir, --pipeline-parallelism, --enforce-eager.
  • Improved support for launching custom models.
  • Improved command response time.
  • Improved visuals for list command.
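The new launch options above might be combined as follows. The model name, path, and option values are placeholders, and the exact flag syntax may differ from what vec-inf actually accepts:

```shell
# Hypothetical invocation combining the new launch options
vec-inf launch Meta-Llama-3.1-8B \
  --max-num-seqs 256 \
  --model-weights-parent-dir /path/to/model-weights \
  --pipeline-parallelism True \
  --enforce-eager True
```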

v0.3.3

03 Sep 21:53
d10758d
  • Added missing package in dependencies
  • Fixed pre-commit hooks
  • Linted and formatted code
  • Updated outdated examples

v0.3.2

03 Sep 18:27
39b98a2
  • Added support for custom models; users can now launch custom models as long as the model architecture is supported by vLLM
  • Minor updates to multi-node job launching to better support custom models
  • Add Llama3-OpenBioLLM-70B to supported model list

v0.3.1

29 Aug 13:41
f43d7bf
  • Added a model-name argument to the list command to show the default setup of a specific supported model
  • Improved command option descriptions
  • Restructured the models directory
  • Added some default values for using a custom model

v0.3.0

29 Aug 06:09
156dfa5
  • Added vec-inf CLI:

    • Install vec-inf via pip
    • launch command to launch models
    • status command to check inference server status
    • shutdown command to stop inference server
    • list command to see all available models
  • Upgraded vLLM to 0.5.4

  • Added support for new model families:

    • Llama 3.1 (Including 405B)
    • Gemma 2
    • Phi 3
    • Mistral Large
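A typical end-to-end session with the new CLI might look like the following. The model name and job ID are placeholders, and the exact argument shapes for status and shutdown are assumptions here:

```shell
# Hypothetical vec-inf workflow (placeholders throughout)
pip install vec-inf
vec-inf list                        # see all available models
vec-inf launch Meta-Llama-3.1-8B    # launch an inference server
vec-inf status 12345678             # check inference server status
vec-inf shutdown 12345678           # stop the inference server
```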

v0.2.1

06 Jul 15:58
2c43a25
  • Add CodeLlama
  • Update model variant names for Llama 2 in README

v0.2.0

04 Jul 14:29
635e13f
  • Updated the default environment to use a Singularity container; added the associated Dockerfile
  • Updated vLLM to 0.5.0, added VLM support (LLaVA-1.5 and LLaVA-NeXT), and updated example scripts
  • Refactored the repo structure for a simpler model onboarding and update process