-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Job timeout applied after middleware #752
Comments
Hmm, so I didn't consider it one way or the other super deeply admittedly, but including the middleware in the timeout doesn't seem super obviously more or less correct to me. An argument could be made that that when setting work timeouts, users are trying to set a timeout specifically on their code as opposed to River's internal code, which to their eye would be an implementation detail. So similarly how the timeout doesn't include the time it takes to complete a job, it doesn't include middleware, which is largely made up of other internal stuff. |
The risk I see is that as it currently is, the middleware have no timeout whatsoever applied to them. Beyond that I don't think it matters a lot either way, but I do worry about the risk of a middleware getting stuck and the job just running forever despite whatever timeouts are configured. |
Yeah I suppose that could be a danger depending on what you're trying to do in the middleware. Probably a separate timeout like you'd find in |
While reading through the executor for another change, I realized that as of #632 / #584 we're applying the job timeout after stepping through all the middleware:
river/job_executor.go
Lines 209 to 227 in fc667eb
I'm not sure if this was an intentional decision and I couldn't find any conversations where it was mentioned, so I thought we should at least talk about it. Do you think this is the right place for it, or should the timeout happen before the middleware stack gets called? I was thinking of a worst case scenario where a middleware got stuck or took awhile on a blocking operation and the job ended up running far longer than either the job's timeout or the client's default timeout.
The text was updated successfully, but these errors were encountered: