Add micro batching and endpoints for v1 list_models and get_model #41

Open
baixiac wants to merge 1 commit into main from llm-gen2

@baixiac baixiac commented Mar 6, 2026

feat: add endpoints for v1 models and list models
feat: add micro batching and lower CPU usage during model loading
feat: ensure the pad token for generative models
feat: use the async streamer during async generation
feat: apply timeout to text generation
fix: fix the property name for stop sequences in OpenAI requests
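
For reviewers unfamiliar with the pattern, here is a minimal sketch of what micro batching with a per-request timeout can look like in an asyncio service. It is not the code in this PR: `MicroBatcher`, `run_batch`, `max_batch_size`, `max_wait_s`, and `timeout_s` are hypothetical names chosen for illustration, and `run_batch` stands in for whatever batched generation call the service actually makes.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


class MicroBatcher:
    """Hypothetical helper that groups concurrent requests into small batches."""

    def __init__(
        self,
        run_batch: Callable[[List[str]], Awaitable[List[str]]],
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self._run_batch = run_batch          # assumed batched generation call
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue: Optional[asyncio.Queue] = None
        self._worker: Optional[asyncio.Task] = None

    async def submit(self, prompt: str, timeout_s: float = 30.0) -> str:
        """Queue one prompt and wait for its result, up to timeout_s seconds."""
        if self._worker is None:
            # Create the queue and worker lazily, inside the running event loop.
            self._queue = asyncio.Queue()
            self._worker = asyncio.create_task(self._loop())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((prompt, fut))
        # wait_for cancels the future if the timeout expires.
        return await asyncio.wait_for(fut, timeout=timeout_s)

    async def _loop(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives.
            batch = [await self._queue.get()]
            deadline = loop.time() + self._max_wait_s
            # Collect more requests until the batch is full or the window closes.
            while len(batch) < self._max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    item = await asyncio.wait_for(self._queue.get(), timeout=remaining)
                except asyncio.TimeoutError:
                    break
                batch.append(item)
            prompts = [prompt for prompt, _ in batch]
            try:
                outputs = await self._run_batch(prompts)
            except Exception as exc:
                # Propagate a batch failure to every caller still waiting.
                for _, fut in batch:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), out in zip(batch, outputs):
                if not fut.done():  # skip callers that already timed out
                    fut.set_result(out)
```

Each caller awaits `submit(prompt)`; requests that arrive within the same `max_wait_s` window share one `run_batch` call, which is the usual trade of a few milliseconds of latency for better throughput.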

