Background

vLLM currently supports various model features through configuration parameters, but lacks support for passing additional model-specific parameters through `extra_body`, which is particularly important for features such as structured output. See:
https://github.com/vllm-project/vllm/blob/v0.6.0/vllm/engine/arg_utils.py#L276
Current OpenAI implementation

The OpenAI Python client already accepts an `extra_body` argument whose keys are merged into the request body:

```python
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Generate a user profile"}],
    extra_body={
        "guided_json": Test.schema_json(),  # JSON schema of a pydantic model
        "guided_decoding_backend": "lm-format-enforcer",
    },
)
```
Proposed implementation

```go
resp, err := integrations.LLMClient.Client.CreateChatCompletion(
	ctx,
	openai.ChatCompletionRequest{
		Model:     "...",
		Messages:  []openai.ChatCompletionMessage{ ... },
		ExtraBody: map[string]any{ ... },
	},
)
```
This can be useful for other engines as well, e.g. NVIDIA NIMs - https://build.nvidia.com/nvidia/nv-embedqa-e5-v5?snippet_tab=Python