xds: ensure node ID is populated in errors from the server #8140

Open
wants to merge 3 commits into
base: master

easwars
Contributor

@easwars easwars commented Mar 5, 2025

Addresses #7931

This PR ensures that the following scenarios in the xDS-enabled server result in errors that contain the xDS node id:

  • The requested listener resource is not found
    • This results in the server switching to NOT_SERVING, and the error sent to the callback contains the node id
  • A received listener resource that passes xDS client validation but fails subsequent validation at the server
    • This results in the server switching to NOT_SERVING, and the error sent to the callback contains the node id
  • The route configuration is either NACKed or the resource is not found (without a previous good update)
  • RPC time errors
    • Matched route configuration contains an error (either NACKed or resource not found)
    • Matched route contains no matching virtual host
    • Matched route contains a non-forwarding route action
    • No matching route found
    • RPC denied by RBAC policy

The PR also includes a whole bunch of test cleanup. But essentially, the only important things changing are the following:

  • Checks that errors returned from failed RPCs because of server issues contain the xDS node ID
  • Checks that errors pushed to the serving mode change callbacks contain the xDS node ID

Apart from that, it moves tests around quite a bit so that most server tests live in the top-level xds directory, where the server code lives. Moving these tests out of the test/xds package also causes an overall reduction in test run time.

RELEASE NOTES: none

@easwars easwars requested a review from dfawley March 5, 2025 01:05
@easwars easwars added the Type: Feature New features or improvements in behavior label Mar 5, 2025
@easwars easwars added this to the 1.72 Release milestone Mar 5, 2025

codecov bot commented Mar 5, 2025

Codecov Report

Attention: Patch coverage is 94.44444% with 2 lines in your changes missing coverage. Please review.

Project coverage is 82.22%. Comparing base (8ae4b7d) to head (718611f).
Report is 8 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| xds/internal/server/listener_wrapper.go | 90.47% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8140      +/-   ##
==========================================
- Coverage   82.34%   82.22%   -0.13%     
==========================================
  Files         389      392       +3     
  Lines       39103    39106       +3     
==========================================
- Hits        32200    32154      -46     
- Misses       5579     5619      +40     
- Partials     1324     1333       +9     
| Files with missing lines | Coverage Δ |
|---|---|
| internal/testutils/xds/e2e/clientresources.go | 98.03% <ø> (-0.19%) ⬇️ |
| xds/internal/server/rds_handler.go | 87.36% <100.00%> (-3.94%) ⬇️ |
| xds/internal/xdsclient/xdsresource/filter_chain.go | 93.31% <100.00%> (+0.43%) ⬆️ |
| xds/server.go | 82.53% <100.00%> (+0.10%) ⬆️ |
| xds/internal/server/listener_wrapper.go | 75.00% <90.47%> (-0.84%) ⬇️ |

... and 26 files with indirect coverage changes

xds/server.go Outdated
		}
	}
	return nil
}

func annotateErrorWithNodeID(nodeID string, err error) error {
	return fmt.Errorf("[xDS node id: %v]: %w", nodeID, err)
Member

These were status errors and now they will be wrapped status errors.

Is this desirable/OK?

Also, WDYT about something like:

func (rc *UsableRouteConfiguration) statusErrWithNodeID(c codes.Code, msg string, args ...any) error

to avoid the need to pass NodeID explicitly, like you did with listenerWrapper?

Contributor Author

> Is this desirable/OK?

I guess if the caller deals with wrapped errors properly, it should be fine. And if they use our status package functions to read the code/message out of it, they should be fine too. But I guess the approach you suggested avoids that and is nicer than what I had. So, switched to that. Thanks.

@easwars easwars removed their assignment Mar 7, 2025
@dfawley dfawley requested a review from purnesh42H March 7, 2025 22:54
@dfawley dfawley assigned purnesh42H and unassigned dfawley Mar 7, 2025
Member

@dfawley dfawley left a comment

The main part of the changes LGTM.

I was thinking this would impact the generic xdsclient, so assigned to @purnesh42H, but actually it looks like it probably won't since it's all in the resources. But can you still please review the e2e test changes @purnesh42H?
