
Conversation

@krrishdholakia
Contributor

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

naaa760 and others added 3 commits January 28, 2026 17:19
fix(vertex_ai): convert image URLs to base64 in tool messages for Anthropic (#19896)

* fix(vertex_ai): convert image URLs to base64 in tool messages for Anthropic

Fixes #19891

Vertex AI Anthropic models don't support URL sources for images. LiteLLM
already converted image URLs to base64 for user messages, but not for tool
messages (role='tool'). This caused errors when using ToolOutputImage with
image_url in tool outputs.

Changes:
- Add force_base64 parameter to convert_to_anthropic_tool_result()
- Pass force_base64 to create_anthropic_image_param() for tool message images
- Calculate force_base64 in anthropic_messages_pt() based on llm_provider
- Add unit tests for tool message image handling
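
A minimal sketch of the conversion this commit describes, assuming an Anthropic-style base64 image block; the helper names below are illustrative and not LiteLLM's actual functions or signatures:

```python
# Illustrative only: `image_url_to_base64_block` and `to_tool_result_content`
# are assumed names, not the functions referenced in the commit message.
import base64

import httpx


def image_url_to_base64_block(image_url: str) -> dict:
    """Download an image URL and return an Anthropic-style base64 image block."""
    response = httpx.get(image_url)
    response.raise_for_status()
    media_type = response.headers.get("content-type", "image/png")
    data = base64.b64encode(response.content).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def to_tool_result_content(blocks: list[dict], force_base64: bool) -> list[dict]:
    """Rewrite URL image sources as base64 when the provider (e.g. Vertex AI
    Anthropic) does not accept URL sources inside tool results."""
    converted = []
    for block in blocks:
        if (
            force_base64
            and block.get("type") == "image"
            and block.get("source", {}).get("type") == "url"
        ):
            converted.append(image_url_to_base64_block(block["source"]["url"]))
        else:
            converted.append(block)
    return converted
```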

* chore: remove extra comment from test file header
* fix(proxy_server): pass search_tools to Router during DB-triggered initialization

* fix search tools from db

* add missing statement to handle from db

* fix import issues to pass lint errors
@vercel

vercel bot commented Jan 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm | Deployment: Error | Review: Error | Updated (UTC): Jan 29, 2026 6:36am

…9654)

Fixes #19478

The stream_chunk_builder function was not handling image chunks from
models like gemini-2.5-flash-image. When streaming responses were
reconstructed (e.g., for caching), images in delta.images were lost.

This adds handling for image_chunks similar to how audio, annotations,
and other delta fields are handled.
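
A hedged sketch of the idea (not LiteLLM's exact stream_chunk_builder code): accumulate `delta.images` entries across chunks the same way other delta fields are gathered, so images survive reconstruction. Chunk and delta shapes below are simplified assumptions.

```python
# Simplified chunk/delta shapes for illustration only.
from typing import Any, Dict, List


def collect_image_chunks(chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Gather every image emitted in streaming deltas, in order."""
    images: List[Dict[str, Any]] = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("images"):
                images.extend(delta["images"])
    return images


def rebuild_message(chunks: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine streamed text content and images into one assistant message."""
    content = "".join(
        choice.get("delta", {}).get("content") or ""
        for chunk in chunks
        for choice in chunk.get("choices", [])
    )
    message: Dict[str, Any] = {"role": "assistant", "content": content}
    images = collect_image_chunks(chunks)
    if images:
        message["images"] = images
    return message
```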
…sing (#19776)

Fixes #16920 for users of the stable release images.

The previous fix (PR #18092) added libsndfile to docker/Dockerfile.alpine,
but stable releases are built from the main Dockerfile (Wolfi-based),
not the Alpine variant.
… list (#19952)

The /health/services endpoint rejected datadog_llm_observability as an
unknown service, even though it was registered in the core callback
registry and __init__.py. Added it to both the Literal type hint and
the hardcoded validation list in the health endpoint.
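
A sketch of the shape of this fix, under the assumption that the service must appear both in the Literal type hint and in a runtime allow-list; the other service names below are placeholders, not the endpoint's actual list:

```python
from typing import Literal

ServiceName = Literal[
    "slack",                        # placeholder entry
    "langfuse",                     # placeholder entry
    "datadog",                      # placeholder entry
    "datadog_llm_observability",    # the newly allowed service
]

ALLOWED_HEALTH_SERVICES = [
    "slack",
    "langfuse",
    "datadog",
    "datadog_llm_observability",    # keep in sync with the Literal above
]


def validate_service(service: str) -> None:
    """Reject services not present in the allow-list, as /health/services does."""
    if service not in ALLOWED_HEALTH_SERVICES:
        raise ValueError(f"Service '{service}' is not supported by /health/services")
```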
* fix(proxy): prevent provider-prefixed model leaks

Proxy clients should not see LiteLLM internal provider prefixes (e.g. hosted_vllm/...) in the OpenAI-compatible response model field.

This patch sanitizes the client-facing model name for both:
- Non-streaming responses returned from base_process_llm_request
- Streaming SSE chunks emitted by async_data_generator

Adds regression tests covering vLLM-style hosted_vllm routing for both streaming and non-streaming paths.
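
A minimal sketch of the non-streaming restamp described above, assuming a dict-like response; this is not the proxy's actual code, only the behavior the commit describes (warn on mismatch, then overwrite the public `model` field):

```python
import logging

logger = logging.getLogger("proxy")


def restamp_response_model(response: dict, client_requested_model: str) -> dict:
    """Ensure the public `model` field matches what the client requested,
    not an internal provider-prefixed name such as "hosted_vllm/my-model"."""
    downstream_model = response.get("model")
    if downstream_model != client_requested_model:
        logger.warning(
            "Response model '%s' differs from client-requested model '%s'; restamping.",
            downstream_model,
            client_requested_model,
        )
        response["model"] = client_requested_model
    return response
```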

* chore(lint): suppress PLR0915 in proxy handler

Ruff started flagging ProxyBaseLLMRequestProcessing.base_process_llm_request() for too many statements after the hotpatch changes.

Add an explicit '# noqa: PLR0915' on the function definition to avoid a large refactor in a hotpatch.

* refactor(proxy): make model restamp explicit

Replace silent try/except/pass and type ignores with explicit model restamping.

- Logs an error when the downstream response model differs from the client-requested model
- Overwrites the OpenAI `model` field to the client-requested value to avoid leaking internal provider-prefixed identifiers
- Applies the same behavior to streaming chunks, logging the mismatch only once per stream

* chore(lint): drop PLR0915 suppression

The model restamping bugfix made `base_process_llm_request()` slightly exceed Ruff's
PLR0915 (too-many-statements) threshold, requiring a `# noqa` suppression.

Collapse consecutive `hidden_params` extractions into tuple unpacking so the
function drops back under the lint limit, and remove the suppression.

No functional change intended; this keeps the proxy model-field bugfix intact
while aligning with project linting rules.

* chore(proxy): log model mismatches as warnings

These model-restamping logs are intentionally verbose: a mismatch is a useful signal
that an internal provider/deployment identifier may be leaking into the public
OpenAI response `model` field.

- Downgrade model mismatch logs from error -> warning
- Keep error logs only for cases where the proxy cannot read/override the model

* fix(proxy): preserve client model for streaming aliasing

Pre-call processing can rewrite request_data['model'] via model alias maps.

Our streaming SSE generator was using the rewritten value when restamping chunk.model, which caused the public 'model' field to differ between streaming and non-streaming responses for alias-based requests.

Stash the original client model in request_data as _litellm_client_requested_model after the model has been routed, and prefer it when overriding the outgoing chunk model. Add a regression test for the alias-mapping case.
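
A hedged sketch of the stash-and-prefer logic, assuming request_data is the mutable dict the proxy threads through pre-call processing; the `_litellm_client_requested_model` key comes from the commit message, everything else is an assumption:

```python
def stash_client_model(request_data: dict) -> None:
    """Record the model the client originally sent before alias rewriting."""
    request_data.setdefault(
        "_litellm_client_requested_model", request_data.get("model")
    )


def restamp_stream_chunk(chunk: dict, request_data: dict) -> dict:
    """Prefer the stashed client model when overriding the outgoing chunk model."""
    client_model = (
        request_data.get("_litellm_client_requested_model")
        or request_data.get("model")
    )
    if client_model and chunk.get("model") != client_model:
        chunk["model"] = client_model
    return chunk
```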

* chore(lint): satisfy PLR0915 in streaming generator

Ruff started flagging async_data_generator() for too many statements after adding model restamping logic.

Extract the client-model selection + chunk restamping into small helpers to keep behavior unchanged while meeting the project's PLR0915 threshold.
cfchase and others added 2 commits January 28, 2026 22:33
…verify (#19893)

* fix(hosted_vllm): route through base_llm_http_handler to support ssl_verify

The hosted_vllm provider was falling through to the OpenAI catch-all path,
which doesn't pass ssl_verify to the HTTP client. This adds an explicit
elif branch that routes hosted_vllm through base_llm_http_handler.completion(),
which properly passes ssl_verify to the httpx client.

- Add explicit hosted_vllm branch in main.py completion()
- Add ssl_verify tests for sync and async completion
- Update existing audio_url test to mock httpx instead of OpenAI client
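
A usage sketch based on this PR's description; passing ssl_verify as a per-call keyword is assumed from the tests described above, and the model name and endpoint are illustrative:

```python
import litellm

# hosted_vllm completion routed through base_llm_http_handler so ssl_verify
# reaches the underlying httpx client (per this PR's description).
response = litellm.completion(
    model="hosted_vllm/my-model",                  # illustrative model name
    messages=[{"role": "user", "content": "hello"}],
    api_base="https://vllm.internal:8000/v1",      # illustrative endpoint
    ssl_verify="/etc/ssl/certs/internal-ca.pem",   # or False to disable verification
)
print(response.choices[0].message.content)
```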

* feat(hosted_vllm): add embedding support with ssl_verify

- Add HostedVLLMEmbeddingConfig for embedding transformations
- Register hosted_vllm embedding config in utils.py
- Add lazy import for embedding transformation module
- Add unit test for ssl_verify parameter handling
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
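
An embedding usage sketch under the same assumption that ssl_verify is honored for hosted_vllm after this PR; the model name and endpoint are illustrative:

```python
import litellm

embedding_response = litellm.embedding(
    model="hosted_vllm/my-embedding-model",     # illustrative model name
    input=["some text to embed"],
    api_base="https://vllm.internal:8000/v1",   # illustrative endpoint
    ssl_verify=False,                           # e.g. self-signed certs in a test setup
)
print(len(embedding_response.data))
```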