
context_length_exceeded when generating title #377

Open
endolith opened this issue Jun 28, 2023 · 9 comments

Comments

@endolith

Error generating title!
{
  "error": {
    "message": "This model's maximum context length is 4097 tokens. However, your messages resulted in 7403 tokens. Please reduce the length of the messages.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}

Either use the 16k model to generate the title, or just truncate the input (which should be good enough for generating a title).
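A minimal sketch of the first option, assuming the caller already has a token estimate for the conversation (the model names and context size are the published ones; the response budget and function name are illustrative, not BetterChatGPT's actual code):

```typescript
const DEFAULT_CONTEXT = 4097;      // gpt-3.5-turbo context length
const TITLE_RESPONSE_BUDGET = 100; // rough allowance for the generated title (illustrative)

// Sketch: fall back to the larger-context model for the title request when
// the conversation would not fit in the default 4k context.
function pickTitleModel(conversationTokens: number): string {
  return conversationTokens + TITLE_RESPONSE_BUDGET < DEFAULT_CONTEXT
    ? "gpt-3.5-turbo"
    : "gpt-3.5-turbo-16k";
}

// The failing request above had 7403 prompt tokens, so this would pick the 16k model.
console.log(pickTitleModel(7403));
```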

@XOKP

XOKP commented Jun 29, 2023

I am encountering the same error.

@niccolofavari

I'm getting the same error but it looks like the max_tokens data isn't even sent to the API

[screenshot]

Is this the expected behavior? I don't see any changes when I edit the max_tokens setting.

@endolith
Author

endolith commented Jul 10, 2023

> I'm getting the same error but it looks like the max_tokens data isn't even sent to the API
>
> [screenshot]
>
> Is this the expected behavior? I don't see any changes when I edit the max_tokens setting.

Max tokens is a property of each model, but it isn't published through the API. I've asked them to add that: openai/openai-python#448

@niccolofavari

Also, I think that max_tokens should be the model's maximum token count (e.g. 16384 for gpt-3.5-turbo-16k) minus the length of the previous messages, minus some extra "margin" tokens to be safe.

example:
16384 (max model) - 8985 (previous content) = 7399 (remaining max_tokens)

Unfortunately, using that exact number can still result in an error, so it's usually better to set max_tokens 1% or 2% lower (I'd send 7300 for the example above).
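A minimal sketch of that calculation (the margin and names are illustrative, not BetterChatGPT's actual code):

```typescript
// Sketch: derive a per-request max_tokens from the model's context length,
// the tokens already used by the conversation, and a small safety margin.
const MODEL_CONTEXT = 16384; // gpt-3.5-turbo-16k

function remainingMaxTokens(previousContentTokens: number, marginPct = 0.02): number {
  const remaining = MODEL_CONTEXT - previousContentTokens;
  // Leave a 1-2% margin so small token-count mismatches don't push the
  // request over the limit.
  return Math.max(0, Math.floor(remaining * (1 - marginPct)));
}

// Example from above: 16384 - 8985 = 7399, minus ~2% ≈ 7251 sent as max_tokens.
console.log(remainingMaxTokens(8985));
```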

@niccolofavari

> I'm getting the same error but it looks like the max_tokens data isn't even sent to the API
>
> [screenshot]
>
> Is this the expected behavior? I don't see any changes when I edit the max_tokens setting.
>
> Max tokens is a property of each model, but isn't published through the API. I've asked them to add that openai/openai-python#448

I'm confused. It's already implemented in the API: https://platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens

I copied the request from the browser's network inspector (in curl format), set max_tokens, and ran it in the terminal. It looks like it's working. I must be missing something...
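For reference, a sketch of the same test done in code rather than curl (the endpoint and the max_tokens parameter are from the public chat completions API; the key, model, and messages are placeholders):

```typescript
// Sketch: send a chat completion request with max_tokens set explicitly,
// to confirm the API accepts the parameter.
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: "Bearer <YOUR_API_KEY>", // placeholder
  },
  body: JSON.stringify({
    model: "gpt-3.5-turbo-16k",
    messages: [{ role: "user", content: "Say hello" }],
    max_tokens: 7300, // explicitly included, unlike the request in the screenshot
  }),
});
console.log(await response.json());
```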

@endolith
Author

endolith commented Jul 10, 2023

> I'm confused. It's already implemented in the api: platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens

Ah, that's the maximum number of tokens to generate, not the maximum supported by the model.

(Which I guess would actually be called context_length?)

> The token count of your prompt plus max_tokens cannot exceed the model's context length.

Context length VS Max token VS Maximum length

When BetterChatGPT is trying to auto-generate a title, it's feeding more tokens to the model than the model supports, producing this error.

The maximum context lengths for each GPT and embeddings model can be found in the model index.

(Though it is confusingly called "Max tokens" in the model index table.)
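Putting that together, a minimal sketch of the check that would need to happen before the request is sent (the constant and function are illustrative, not BetterChatGPT's code):

```typescript
// Sketch: the prompt tokens plus max_tokens must stay within the model's
// context length, so clamp (or reject) before sending the request.
const CONTEXT_LENGTH = 4097; // gpt-3.5-turbo, per the model index

function clampMaxTokens(promptTokens: number, requestedMaxTokens: number): number {
  const available = CONTEXT_LENGTH - promptTokens;
  if (available <= 0) {
    // This is the title-generation case above: the prompt alone (7403 tokens)
    // already exceeds the 4097-token context, so no max_tokens value can help.
    throw new Error("Prompt exceeds the model's context length; truncate it first.");
  }
  return Math.min(requestedMaxTokens, available);
}
```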

@niccolofavari

niccolofavari commented Jul 10, 2023

It is a bit confusing indeed, but the max_tokens parameter is never sent to begin with. It should be calculated and sent with each request, something like context_length - content_tokens = max_tokens (for lack of better wording).

As I said, it should probably be 1% or 2% less than that to avoid errors (I tried with the exact number and it gave me errors anyway).

So, in summary, this parameter varies from call to call (i.e. the maximum range of the slider should become smaller and smaller each time we send a request and get a response).

@endolith
Author

[screenshot]

I get this every time; it's frustrating.

endolith added a commit to endolith/BetterChatGPT that referenced this issue Aug 13, 2023
Crude fix for ztjhz#377

Ideally would be based on tokens, not characters.
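The commit itself isn't shown here, but a crude character-based truncation along the lines it describes might look something like this (the limit is purely illustrative, not taken from the commit):

```typescript
// Sketch: cap the text sent to the title generator at a fixed number of
// characters. Counting characters is cheap but imprecise; a token-based
// limit would be more accurate, as the commit message notes.
const TITLE_INPUT_CHAR_LIMIT = 2000; // illustrative, not from the commit

const truncateForTitle = (text: string): string =>
  text.slice(0, TITLE_INPUT_CHAR_LIMIT);
```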
@jackschedel
Contributor

jackschedel commented Aug 17, 2023

This is fixed in my fork. Unfortunately, I fixed it after fixing a lot of other things related to model context and max tokens (and detaching the fork from its parent), so I can't easily make a diff, but feel free to try to steal my implementation.
