Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Inability to Parse API Response Data Due to Missing Space After … #932

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

sunhailin-Leo
Copy link

Copy link

codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.89%. Comparing base (774fc9d) to head (b1fd9e6).
Report is 87 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #932      +/-   ##
==========================================
+ Coverage   98.46%   98.89%   +0.43%     
==========================================
  Files          24       27       +3     
  Lines        1364     1812     +448     
==========================================
+ Hits         1343     1792     +449     
+ Misses         15       14       -1     
  Partials        6        6              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sunhailin-Leo sunhailin-Leo marked this pull request as draft February 15, 2025 03:37
@sunhailin-Leo sunhailin-Leo marked this pull request as ready for review February 15, 2025 03:38
@sunhailin-Leo
Copy link
Author

sunhailin-Leo commented Feb 18, 2025

@sashabaranov @liushuangls

  • Have any other problem or question in this pull-request?

@sashabaranov
Copy link
Owner

@sunhailin-Leo Thank you for updates, the approach looks good, but the processLines() is really hard to read. Partly because it has //nolint:gocognit before it disabling cognitive complexity linter (cc @liushuangls)

I propose we remove //nolint:gocognit and try to simplify processLines(). I can't understand what this method is doing by reading the diff.

Here's an idea to simplify, but I'm not sure it's correct

func trimAndCompare(s []byte, cutset string) ([]byte, bool) {
	originalLen = len(s)
	updated := bytes.Trim(s, cutset)
	lengthChanged := originalLen != len(updated)
	return updated, lengthChanged
}

func (stream *streamReader[T]) processLines() ([]byte, error) {
	var (
		emptyMessagesCount uint
		hasErrorPrefix     bool
		noPrefixLine       []byte
	)

	for {
		line, readErr := stream.reader.ReadBytes('\n')
		if readErr != nil || hasErrorPrefix {
			respErr := stream.unmarshalError()
			if respErr != nil {
				return nil, fmt.Errorf("error, %w", respErr.Error)
			}
			return nil, readErr
		}

		line = bytes.TrimSpace(rawLine)
		line, gotHeader = trimAndCompare(line, headerData)
		line = bytes.TrimSpace(line)
		if gotHeader {
			hasErrorPrefix = bytes.HasPrefix(line, errorPrefix)
		}
		if !gotHeader || hasErrorPrefix {
			writeErr := stream.errAccumulator.Write(line)
			if writeErr != nil {
				return nil, writeErr
			}
			emptyMessagesCount++
			if emptyMessagesCount > stream.emptyMessagesLimit {
				return nil, ErrTooManyEmptyStreamMessages
			}

			continue
		}

		if string(noPrefixLine) == "[DONE]" {
			stream.isFinished = true
			return nil, io.EOF
		}

		return noPrefixLine, nil
	}
}

@sashabaranov
Copy link
Owner

Side idea: maybe we can re-factor this as a state machine here, with states clearly enumerated

@sunhailin-Leo
Copy link
Author

Side idea: maybe we can re-factor this as a state machine here, with states clearly enumerated

If want to re-factor, I think we can refer to https://github.com/openai/openai-go/blob/main/packages/ssestream/ssestream.go#L80

@sashabaranov
Copy link
Owner

@sunhailin-Leo that's a way to go as well!

@sunhailin-Leo
Copy link
Author

sunhailin-Leo commented Feb 18, 2025

@sashabaranov

  • I think this implementation may be better than before.
var (
	//headerData      = []byte("data:")
	//errorPrefix     = []byte(`{"error":`)
	doneFlag        = []byte("[DONE]")
	dataFlag        = "data"
	errorFlag       = []byte(`"error"`)
	errorPrefixFlag = []byte(`{"error`)
)


func (stream *streamReader[T]) processLines() ([]byte, error) {
	var (
		emptyMessagesCount uint
		hasErrorPrefix     bool
	)

	for {
		rawLine, readErr := stream.reader.ReadBytes('\n')
		if readErr != nil || (hasErrorPrefix && readErr == io.EOF) {
			respErr := stream.unmarshalError()
			if respErr != nil {
				return nil, fmt.Errorf("error, %w", respErr.Error)
			}
			return nil, readErr
		}

		// Split a string like "event: bar" into name="event" and value=" bar".
		name, value, _ := bytes.Cut(rawLine, []byte(":"))
		value = bytes.TrimSpace(value)

		// Consume an optional space after the colon if it exists.
		if len(value) > 0 && value[0] == ' ' {
			value = value[1:]
		}

		switch string(name) {
		case dataFlag:
			if bytes.Equal(value, doneFlag) {
				stream.isFinished = true
				return nil, io.EOF
			}
			if bytes.HasPrefix(value, errorPrefixFlag) {
				if writeErr := stream.writeErrAccumulator(value); writeErr != nil {
					return nil, writeErr
				}
				respErr := stream.unmarshalError()
				if respErr != nil {
					return nil, fmt.Errorf("error, %w", respErr.Error)
				}
				continue
			}

			return value, nil
		default:
			if writeErr := stream.writeErrAccumulator(rawLine); writeErr != nil {
				return nil, writeErr
			}
			if bytes.Equal(name, errorFlag) {
				hasErrorPrefix = true
				continue
			}

			emptyMessagesCount++
			if emptyMessagesCount > stream.emptyMessagesLimit {
				return nil, ErrTooManyEmptyStreamMessages
			}

			continue
		}
	}
}

@sunhailin-Leo
Copy link
Author

sunhailin-Leo commented Feb 24, 2025

@sashabaranov @liushuangls

  • There are some problems in golangci.yml ==> linters-settings > varcheck

@sashabaranov
Copy link
Owner

@sunhailin-Leo Could you please rebase on the latest master for CI to pass?

@sashabaranov
Copy link
Owner

There's also this PR: #941 @sunhailin-Leo do you think it makes sense or not?

mazyaryousefinia and others added 5 commits February 26, 2025 01:30
* ref: add image url support to messages

* fix linter error

* fix linter error
* fix: remove validateO1Specific

* update golangci-lint-action version

* fix actions

* fix actions

* fix actions

* fix actions

* remove some o1 test
…nov#934)

* feat: add Anthropic API support with custom version header

* refactor: use switch statement for API type header handling

* refactor: add OpenAI & AzureAD types to be exhaustive

* Update client.go

need explicit fallthrough in empty case statements

* constant for APIVersion; addtl tests
* fix lint

* remove linters
@sunhailin-Leo
Copy link
Author

sunhailin-Leo commented Feb 25, 2025

There's also this PR: #941 @sunhailin-Leo do you think it makes sense or not?

I think that PR is pretty interesting, and it looks like the way to go with minimal changes.
But it's still not elegant enough, and the function complexity doesn't go down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants