max_tokens shouldn't be used to define the context window size #2397

@k33g

I think docker-agent uses max_tokens for both the context window size and the maximum response length, which are two different things (at least with regard to DMR). As a result, I can't configure a 32768-token context window while limiting responses to 4096 tokens.

For example, with llama.cpp-based runtimes, the context size is set separately (--ctx-size in llama.cpp's server, num_ctx in Ollama), and max_tokens only caps the number of tokens in the response.

Apparently, Docker Compose uses context_size for DMR.
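
To make the distinction above concrete, here is a minimal sketch of what keeping the two settings separate could look like. The `ModelSettings` class and its `prompt_budget` helper are hypothetical names for illustration only; they are not docker-agent's actual configuration schema. The field name `context_size` is borrowed from the Compose setting mentioned above, and the sketch assumes the response cap should never exceed the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSettings:
    """Hypothetical settings object; NOT docker-agent's real schema."""

    context_size: int  # total window: prompt tokens + response tokens
    max_tokens: int    # cap on generated (response) tokens only

    def __post_init__(self) -> None:
        # Both values must be positive to make sense.
        if self.context_size <= 0 or self.max_tokens <= 0:
            raise ValueError("context_size and max_tokens must be positive")
        # Assumption: a response cap larger than the window is a config error.
        if self.max_tokens > self.context_size:
            raise ValueError("max_tokens cannot exceed context_size")

    def prompt_budget(self) -> int:
        """Tokens left for the prompt if the response uses its full cap."""
        return self.context_size - self.max_tokens


# The case from this issue: a 32768-token window with 4096-token responses.
settings = ModelSettings(context_size=32768, max_tokens=4096)
print(settings.prompt_budget())  # 28672
```

With a single max_tokens value driving both settings, this combination cannot be expressed: either the window shrinks to 4096 or responses are allowed to grow to 32768.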

Labels: kind/bug (Something isn't working)
