stats/opentelemetry: separate out interceptors for tracing and metrics #8063

Open: wants to merge 37 commits into base: master

Contributor

@janardhanvissa janardhanvissa commented Feb 3, 2025

RELEASE NOTE: None

codecov bot commented Feb 3, 2025

Codecov Report

Attention: Patch coverage is 68.99225% with 40 lines in your changes missing coverage. Please review.

Project coverage is 82.30%. Comparing base (5668c66) to head (2a914e3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| stats/opentelemetry/client_tracing.go | 58.82% | 23 Missing and 5 partials ⚠️ |
| stats/opentelemetry/server_tracing.go | 60.00% | 9 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8063      +/-   ##
==========================================
- Coverage   82.32%   82.30%   -0.03%     
==========================================
  Files         392      392              
  Lines       39140    39207      +67     
==========================================
+ Hits        32222    32269      +47     
- Misses       5597     5611      +14     
- Partials     1321     1327       +6     
| Files with missing lines | Coverage Δ |
|---|---|
| stats/opentelemetry/client_metrics.go | 87.79% <100.00%> (-1.61%) ⬇️ |
| stats/opentelemetry/opentelemetry.go | 76.82% <100.00%> (+1.82%) ⬆️ |
| stats/opentelemetry/server_metrics.go | 89.30% <100.00%> (-0.52%) ⬇️ |
| stats/opentelemetry/server_tracing.go | 68.42% <60.00%> (-31.58%) ⬇️ |
| stats/opentelemetry/client_tracing.go | 59.49% <58.82%> (-16.98%) ⬇️ |

... and 18 files with indirect coverage changes

@janardhanvissa janardhanvissa force-pushed the refactor-tracing-metrics branch from 69df069 to 71804b4 Compare February 3, 2025 11:48
@purnesh42H
Contributor

@janardhanvissa it's not clear what the intention of this refactor is. The follow-up from the OpenTelemetry tracing API PR was to create separate interceptors for metrics and traces. Right now, a single interceptor handles both the tracing and metrics options. Once we have a separate unary and stream interceptor for each of tracing and metrics, we won't have to check whether options are enabled or disabled every time.

Contributor

@purnesh42H purnesh42H left a comment

@purnesh42H purnesh42H changed the title from "opentelemetry: Refactor tracing and metrics separately" to "stats/opentelemetry: separate out interceptors for tracing and metrics" Feb 19, 2025

type clientMetricsStatsHandler struct {
	*clientStatsHandler
}
type clientTracingStatsHandler struct {

Contributor

Add a blank line between the struct declarations.

Contributor Author

done

method: h.determineMethod(method, opts...),
}
ctx = setCallInfo(ctx, ci)
if h.options.MetricsOptions.pluginOption != nil {

Contributor

This metadata part is not applicable to tracing. It should only be in the metrics interceptor. Please remove it.

Contributor Author

done

}
ctx = setCallInfo(ctx, ci)

if h.options.MetricsOptions.pluginOption != nil {

Contributor

Same here: this metadata part is not applicable to tracing. It should only be in the metrics interceptor. Please remove it.

Contributor Author

done

}

// perCallTraces records per call trace spans.
func (h *clientTracingStatsHandler) perCallTraces(_ context.Context, err error, _ time.Time, _ *callInfo, ts trace.Span) {

Contributor

Since we don't need the time and callInfo, we should drop these parameters altogether instead of replacing them with _.

Contributor Author

done

@@ -211,24 +218,41 @@ func (h *serverStatsHandler) TagRPC(ctx context.Context, info *stats.RPCTagInfo)

// HandleRPC implements per RPC tracing and stats implementation.
func (h *serverStatsHandler) HandleRPC(ctx context.Context, rs stats.RPCStats) {
if h.options.isMetricsEnabled() {

Contributor

We should have separate HandleRPC implementations for the metrics and tracing stats handlers instead of a common one that has to check what is enabled.

Contributor Author

done

}

type clientTracingStatsHandler struct {
	*clientStatsHandler

Contributor

Why are we extending clientStatsHandler? Let's just keep every stats handler embedding estats.MetricsRecorder. Adding levels to the hierarchy doesn't help in any way, because the metrics and tracing stats handlers don't share common functionality.

Contributor Author

I think keeping the inheritance (clientMetricsStatsHandler and clientTracingStatsHandler embedding *clientStatsHandler) would be beneficial because it avoids duplicating the common fields and methods in both metrics and tracing handlers.

Contributor

As far as I know, there is nothing common between metrics and traces, so it's not providing any help. Which common things did you find?

Contributor Author

Exactly, you're right. After further review and refinement, there isn't significant functional commonality between the metrics and tracing handlers themselves. I considered inheritance or embedding, thinking there could be shared setup or code reduction, but now I see that the benefits of separation for clarity and maintainability are more important here.

@janardhanvissa janardhanvissa removed their assignment Mar 7, 2025
return joinDialOptions(grpc.WithChainUnaryInterceptor(csh.unaryInterceptor), grpc.WithChainStreamInterceptor(csh.streamInterceptor), grpc.WithStatsHandler(csh))
var do []grpc.DialOption

if o.isMetricsEnabled() {

Contributor

We don't need to check isMetricsEnabled, since the metrics handler is added as a no-op and the check already exists in initializeMetrics.

Contributor Author

done

return joinServerOptions(grpc.ChainUnaryInterceptor(ssh.unaryInterceptor), grpc.ChainStreamInterceptor(ssh.streamInterceptor), grpc.StatsHandler(ssh))
var so []grpc.ServerOption

if o.isMetricsEnabled() {

Contributor

Same here: we don't need to check whether metrics are enabled.

Contributor Author

done

options: o,
}
tracingHandler.initializeTraces()
do = append(do,

Contributor

same: one line

Contributor Author

done

)
}
if o.isTracingEnabled() {
tracingHandler := &clientTracingHandler{

Contributor

nit: one line

Contributor Author

done

var do []grpc.DialOption

if o.isMetricsEnabled() {
metricsHandler := &clientStatsHandler{

Contributor

Why are we adding these new changes? We should keep the existing code as is and only add the tracing part. Please revert the refactoring of clientStatsHandler here:

csh := &clientStatsHandler{options: o}
csh.initializeMetrics()
do := joinDialOptions(grpc.WithChainUnaryInterceptor(csh.unaryInterceptor), grpc.WithChainStreamInterceptor(csh.streamInterceptor), grpc.WithStatsHandler(csh))

if o.isTracingEnabled() {
	...
	// Reassign rather than redeclare, so the outer do is not shadowed.
	do = joinDialOptions(do, grpc.WithChainUnaryInterceptor(th.unaryInterceptor), .....)
}

return do

Contributor Author

done

}

func newServerStatsHandler(options MetricsOptions) metricsRecorderForTest {
return &serverStatsHandler{options: Options{MetricsOptions: options}}
rm := &registryMetrics{optionalLabels: options.OptionalLabels}

Contributor

Same here: we shouldn't need to change anything here.

Contributor Author

done

@@ -47,11 +47,13 @@ type metricsRecorderForTest interface {
}

func newClientStatsHandler(options MetricsOptions) metricsRecorderForTest {
return &clientStatsHandler{options: Options{MetricsOptions: options}}
rm := &registryMetrics{optionalLabels: options.OptionalLabels}

Contributor

we shouldn't need to change anything here. Please revert.

Contributor Author

done

@@ -178,6 +201,14 @@ func (h *serverStatsHandler) TagConn(ctx context.Context, _ *stats.ConnTagInfo)
// HandleConn exists to satisfy stats.Handler.
func (h *serverStatsHandler) HandleConn(context.Context, stats.ConnStats) {}

// TagConn exists to satisfy stats.Handler for tracing.
func (h *serverTracingHandler) TagConn(ctx context.Context, _ *stats.ConnTagInfo) context.Context {
	return ctx

Contributor

nit: single line

Contributor Author

After saving the file, it ends up on a new line.

@@ -170,6 +185,14 @@ func (h *serverStatsHandler) streamInterceptor(srv any, ss grpc.ServerStream, _
return err
}

func (h *serverTracingHandler) unaryInterceptor(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {

Contributor

nit: single line

Contributor Author

After saving the file, it ends up on a new line.

}
ai := ri.ai

ctx, _ = h.traceTagRPC(ctx, ai)

Contributor

nit: can directly pass ri.ai

Contributor Author

done

@@ -71,10 +76,18 @@ func (h *clientStatsHandler) initializeMetrics() {
rm.registerMetrics(metrics, meter)
}

func (h *clientTracingHandler) initializeTraces() {

Contributor

same. Should go to client_tracing.go

Contributor Author

done

}

func (h *clientTracingHandler) unaryInterceptor(ctx context.Context, method string, req, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
ci := &callInfo{

Contributor

It looks like we are setting callInfo again here. Since tracingHandler is added after clientStatsHandler, the callInfo will already have been created. Similar to what we do with the attempt info, we should try fetching it here.

Contributor Author

done

}

func (h *clientTracingHandler) streamInterceptor(ctx context.Context, desc *grpc.StreamDesc, cc *grpc.ClientConn, method string, streamer grpc.Streamer, opts ...grpc.CallOption) (grpc.ClientStream, error) {
ci := &callInfo{

Contributor

Same comment about callInfo as on the tracingHandler's unaryInterceptor.

Contributor Author

done

}
ai := ri.ai

ctx, _ = h.traceTagRPC(ctx, ai)

Contributor

nit: can pass ri.ai directly

Contributor Author

done

}
ai := ri.ai

ctx, _ = h.traceTagRPC(ctx, ai)

Contributor

We shouldn't ignore the attempt info, because it contains the trace span. We should set it in the context again using setRPCInfo.

Contributor Author

done

// Fetches rpcInfo from context. This handler requires a preceding stats handler
// in the interceptor chain to have already created and set the rpcInfo.
func (h *clientTracingHandler) TagRPC(ctx context.Context, _ *stats.RPCTagInfo) context.Context {
ri := getRPCInfo(ctx)

Contributor

Add a comment explaining why we are fetching rpcInfo with getRPCInfo.

@@ -71,10 +68,12 @@ func (h *clientStatsHandler) initializeMetrics() {
rm.registerMetrics(metrics, meter)
}

// unaryInterceptor implements the UnaryClientInterceptor to record metrics for
// unary calls.
func (h *clientStatsHandler) unaryInterceptor(ctx context.Context, method string, req, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {

Contributor

Add a doc string:
// unaryInterceptor records metrics for unary RPC calls. It updates the context
// with call info, adds plugin metadata if configured, and tracks call duration.

4 participants