Monday, 7 October 2019

Do people use domain in production?

Hey /r/node,

I'm wondering how you guys all handle uncaught exceptions in production. Right now at my company, an uncaught exception causes the entire Node process to die: it doesn't send a 500, and it immediately terminates all concurrent requests without sending their responses either.

This is a huge availability problem, especially as the app takes ~5-10 seconds to boot.

I looked into domains as a solution for this, and it seems to work beautifully. The only catch is that the docs say:

By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where it left off", without leaking references, or creating some other sort of undefined brittle state.

What is the risk of that? The docs suggest:

The better approach is to send an error response to the request that triggered the error, while letting the others finish in their normal time, and stop listening for new requests in that worker.

So I've written code that does that, but also allows the "dying" worker to continue to accept requests for ~10 seconds to allow the new worker to boot up and avoid downtime.

Is this approach safe? Do you guys do something similar or completely different? Is there any feasible scenario where accepting a new request after an uncaught exception is somehow worse than just allowing the existing concurrent ones to finish up?

Thanks!
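For reference, here is a rough sketch of the kind of setup described above. It is not the actual production code: `createGuardedServer`, `drainAfter`, and the `onFatal` callback are hypothetical names made up for illustration, and the grace period is just a parameter. It assumes each request runs in its own domain, the failing request gets a 500, and the server keeps listening until the grace period ends.

```javascript
'use strict';
// Note: `domain` is marked deprecated in the Node.js docs but still ships with Node.
const domain = require('domain');
const http = require('http');

// Hypothetical helper: run every request inside its own domain so that an
// uncaught exception fails only that request instead of killing the process.
function createGuardedServer(handler, onFatal) {
  return http.createServer((req, res) => {
    const d = domain.create();
    d.add(req);
    d.add(res);
    d.on('error', (err) => {
      if (res.headersSent) {
        // Too late for a clean 500; drop the half-written response.
        res.destroy();
      } else {
        res.statusCode = 500;
        res.setHeader('Connection', 'close');
        res.end('Internal Server Error\n');
      }
      // Let the caller decide how to replace this worker.
      onFatal(err);
    });
    d.run(() => handler(req, res));
  });
}

// Hypothetical helper: keep accepting requests for `graceMs`, then stop
// listening and call `onDrained` once in-flight connections have finished.
// Returns false if a drain was already scheduled by an earlier error.
function drainAfter(server, graceMs, onDrained) {
  if (server.draining) return false;
  server.draining = true;
  setTimeout(() => server.close(onDrained), graceMs);
  return true;
}
```

In a cluster worker, `onFatal` could ask the master to fork a replacement and then call something like `drainAfter(server, 10000, () => process.exit(1))`. Passing a grace period of 0 gives the behaviour the docs recommend, where the worker stops listening right away.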

Submitted October 07, 2019 at 05:54PM by PM_ME_RAILS_R34
