Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in GRPC_XDS_BOOTSTRAP behaviour between grpc-go versions #8152

Open
davinci26 opened this issue Mar 6, 2025 · 6 comments
Open

Change in GRPC_XDS_BOOTSTRAP behaviour between grpc-go versions #8152

davinci26 opened this issue Mar 6, 2025 · 6 comments
Assignees
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Bug

Comments

@davinci26
Copy link

davinci26 commented Mar 6, 2025

I am not sure if this is considered a bug / breaking change or by design

Previous versions of grpc-go checked the existence of XDSBootstrapFileNameEnv = "GRPC_XDS_BOOTSTRAP" at init() but the latest version is creating the xds client pool at init. Which means that previously we could pass an env variable that points to a file and have the program itself construct the config.

The snippet below shows the current behaviour

func init() {
	internal.TriggerXDSResourceNotFoundForTesting = triggerXDSResourceNotFoundForTesting
	xdsclientinternal.ResourceWatchStateForTesting = resourceWatchStateForTesting

	// DefaultPool is initialized with bootstrap configuration from one of the
	// supported environment variables. If the environment variables are not
	// set, then fallback bootstrap configuration should be set before
	// attempting to create an xDS client, else xDS client creation will fail.
	config, err := bootstrap.GetConfiguration()
	if err != nil {
		if logger.V(2) {
			logger.Infof("Failed to read xDS bootstrap config from env vars:  %v", err)
		}
		DefaultPool = &Pool{clients: make(map[string]*clientRefCounted)}
		return
	}
	DefaultPool = &Pool{clients: make(map[string]*clientRefCounted), config: config}
}

At init() we call bootstrap.GetConfiguration() and then construct a pool that is invalid because config is a nil.

This is a breaking change because the program here https://gist.github.com/davinci26/4d4a9dcd1e5e7d7842037a438132e813 worked with v.1.70 but does not work with v.1.71

On why we construct the json file in-process is mostly as a convenience factor so we can add the locality field based on the process information.

@davinci26
Copy link
Author

I suspect the commit that caused the difference is this one 79b6830#diff-8c0a0efa1bb340ffe5846ebebb64f9182dcbb1218d0a51a6b3d6d340072ab07e

@arjan-bal arjan-bal added the Area: xDS Includes everything xDS related, including LB policies used with xDS. label Mar 6, 2025
@davinci26 davinci26 changed the title Change in behaviour between grpc-go versions Change in GRPC_XDS_BOOTSTRAP behaviour between grpc-go versions Mar 6, 2025
@purnesh42H
Copy link
Contributor

purnesh42H commented Mar 7, 2025

@davinci26 yeah the library now expect the bootstrap file to be present at the init time after the changes in #7898. It is because internally we have a way to set fallback bootstrap config but only if the env vars were not set https://github.com/grpc/grpc-go/blob/master/xds/internal/xdsclient/pool.go#L148 but its not exposed. It was done mainly because we wanted to disallow overriding. Our example recommend the bootstrap file to be present beforehand https://github.com/grpc/grpc-go/blob/master/examples/features/xds/README.md#the-client but since the previous versions allowed creating and setting the bootstrap file at runtime, I will ask other maintainers if we want to expose setting fallback config for backward compatibility.

@davinci26
Copy link
Author

davinci26 commented Mar 7, 2025

@purnesh42H totally fair, we are maybe in a hyrum law situation where you are a victim of your own success. This is a breaking change for us so we would happily take the escape hatch for back compatibility. We are excited to try out other features like fallback and the new metrics you added so we are super keen to upgrade to v1.71.0 but this makes it a bit tricky.

So we can plan a bit internally:

  • When would you know if the maintainers would be ok with exposing a setting back-compat config option?
  • if this option where to be exposed would it be backported to v1.71.0 as a bug-fix or would it be available in v1.72.0?

@easwars
Copy link
Contributor

easwars commented Mar 7, 2025

Reading the bootstrap configuration is supposed to happen once during program init and the fact that whatever you are doing was working earlier is exactly a Hyrum's law situation. We are sorry for breaking you, but we don't have any plans of exposing the API to set fallback bootstrap configuration (that gets used when the environment variables are not set). In fact, all the APIs used by the xDS client are internal only APIs, and exposing anything from there would lock us in since we are planning to make some major changes there.

Maybe, if we understand your deployment better, we might be able to suggest ways that you can use to get things to work with the existing functionality/behavior.

Do you run in a containerized environment? If so, we do have a binary which can be run as an init container to generate bootstrap configuration and place it in a well known location for your program to pick up. Please see: https://github.com/GoogleCloudPlatform/traffic-director-grpc-bootstrap

@davinci26
Copy link
Author

Do you run in a containerized environment? If so, we do have a binary which can be run as an init container to generate bootstrap configuration and place it in a well known location for your program to pick up.

This is the alternative we are considering along with other k8s specific alternative. We have ways out to solve the problem but since this is a breaking change for us, we need to plan/refactor and change existing applications before we upgrade so we are going to have to take engineering/development cost before we can upgrade our grpc version.

Maybe, if we understand your deployment better, we might be able to suggest ways that you can use to get things to work with the existing functionality/behavior.

100% up to you as the maintainer team. I think the perspective is that having the ability for the program to configure the library is a nice feature and makes the UX much better. Because while the problem can be addressed with an init-container (or other infrastructure solutions) requiring special infrastructure to configure a facet of your application increases friction.

Especially since bootstrap.json contains fields like locality and metadata which are not known at build time

So I guess the verdict is that we should solve this at the infrastructure layer? (I am ok with either outcome just wanted the clarity so we can plan internally next steps)

@easwars
Copy link
Contributor

easwars commented Mar 7, 2025

I checked with other maintainers and looks like other gRPC languages don't read the bootstrap file pointed to by the environment variable at init time. Instead they do it lazily when the first xDS client is created (which is triggered either when the first xDS enabled channel or server is created). So, we will make the same change in grpc-go as well.

Thanks for filing the issue and engaging in a useful conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Bug
Projects
None yet
Development

No branches or pull requests

4 participants