-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GODRIVER-3255 Await heartbeat checks upto freq when polling #1720
Conversation
API Change ReportNo changes found! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with one suggestion. 👍
mongo/integration/sdam_prose_test.go
Outdated
ticker := time.NewTicker(heartbeatInterval / 4) | ||
t.Cleanup(ticker.Stop) | ||
|
||
timer := time.NewTimer(heartbeatInterval - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: This seems to rely on precise sleep synchronization and may intermittently fail if the scheduler responds to the timer
goroutine slightly after the ticker
goroutine (on the 4th "tick"). Consider using a more reliable assertion, like asserting a maximum number of heartbeats during an elapsed time.
E.g.
start := time.Now()
time.Sleep(heartbeatInterval * 4)
got := heartbeatStartedCount.Load()
elapsed := time.Since(start)
wantMax := elapsed / heartbeatInterval
assert.LessOrEqual(mt,
got,
wantMax,
"expected fewer than %d heartbeats in %v",
wantMax,
elapsed)
It may also be helpful to further reduce the heartbeat interval to speed up the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! A more robust solution would be to assert that the number of handshakes align with the number of discovered servers within minHeartbeatFrequencyMS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, you'd have to know how many servers connected 🤔 . I believe you could derive that from the heartbeat events by counting distinct addresses.
235a5b7
mongo/integration/sdam_prose_test.go
Outdated
// servers. | ||
time.Sleep(500 * time.Millisecond) | ||
|
||
assert.LessOrEqual(mt, heartbeatStartedCount.Load(), int64(len(servers))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also make this a strong equality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that could cause failures in some cases. Go timers (including tickers) guarantee that a goroutine will continue no earlier than the deadline set. However, there are no strong guarantees for how soon after the deadline the goroutine will continue. As a result, it's possible (though unlikely) that the polling loop will not continue as many times as expected during the test duration.
|
||
mt.Run("polling must await frequency", func(mt *mtest.T) { | ||
var heartbeatStartedCount atomic.Int64 | ||
servers := map[string]bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fairly certain there are no concurrency concerns here since this is a set (i.e. nothing is aggregated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go maps are not concurrent safe except for read-only use, independent of the keys or data written. Here's the data race detector error:
WARNING: DATA RACE
Read at 0x00c000511470 by goroutine 122:
go.mongodb.org/mongo-driver/mongo/integration.TestServerHeartbeatStartedEvent.func2()
/mongo-go-driver/mongo/integration/sdam_prose_test.go:266 +0x370
go.mongodb.org/mongo-driver/mongo/integration/mtest.(*T).Run.(*T).RunOpts.func1()
/mongo-go-driver/mongo/integration/mtest/mongotest.go:267 +0x2ac
testing.tRunner()
/usr/local/go/src/testing/testing.go:1689 +0x180
testing.(*T).Run.gowrap1()
/usr/local/go/src/testing/testing.go:1742 +0x40
Previous write at 0x00c000511470 by goroutine 195:
runtime.mapaccess2_faststr()
/usr/local/go/src/runtime/map_faststr.go:108 +0x42c
go.mongodb.org/mongo-driver/mongo/integration.TestServerHeartbeatStartedEvent.func2.2()
/mongo-go-driver/mongo/integration/sdam_prose_test.go:249 +0xb8
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Topology).publishTopologyDescriptionChangedEvent()
/mongo-go-driver/x/mongo/driver/topology/topology.go:1067 +0x224
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Topology).apply()
/mongo-go-driver/x/mongo/driver/topology/topology.go:973 +0xe98
go.mongodb.org/mongo-driver/x/mongo/driver/topology.New.func1()
/mongo-go-driver/x/mongo/driver/topology/topology.go:163 +0x78
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).updateDescription()
/mongo-go-driver/x/mongo/driver/topology/server.go:707 +0x140
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).update.func3()
/mongo-go-driver/x/mongo/driver/topology/server.go:624 +0xd8
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).update()
/mongo-go-driver/x/mongo/driver/topology/server.go:654 +0x4f0
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).Connect.gowrap1()
/mongo-go-driver/x/mongo/driver/topology/server.go:252 +0x34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of this test it seems like it shouldn't matter. We're just checking that the number of heartbeats doesn't get ahead of discovery, AFAIK the size of the set doesn't have to be precise at the time we make the assertion. Anyway, updated with a mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though there is a race concern I don't think you can actually break what is being tested. I wasn't sure the race detector would care in this case 🤷
…iver into GODRIVER-3255
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT!
Summary
Align the heartbeat check with driver specifications:
Background & Motivation
When serverMonitoringMode=poll, SDAM awaits is not awaiting the full heartbeatFrequency before handshaking with the server.