[master] f3575e382 Avoid deadlock in VRT_AddDirector() when VCL is going cold
Nils Goroll
nils.goroll at uplex.de
Mon Feb 5 23:44:03 UTC 2024
commit f3575e382cf00f5cb5a69cbc1f5eb1f42246c87a
Author: Nils Goroll <nils.goroll at uplex.de>
Date: Wed Jan 31 09:25:04 2024 +0100
Avoid deadlock in VRT_AddDirector() when VCL is going cold
If VRT_AddDirector() was called from handling a VCL_COLD event or,
indirectly, from another thread which the VCL_COLD event handler was
waiting for, varnishd would deadlock and prevent any CLI or director
changes, because VRT_AddDirector() requires the vcl_mtx, which is held
during vcl_BackendEvent() to ensure a consistent view of the director
list. Because of the early return from VRT_AddDirector() this likely
only happened in VTC mode, but the underlying race existed
nevertheless.
This patch _almost_ fixes the issue with the intend of making it
highly unlikely to occur without getting too involved with the vcl
temperature controls: We now check the same conditions under which
vcl_set_state() would transition the temperature to COOLING and, if
they apply, use Lck_Trylock() in a try/wait loop instead of
Lck_Lock(), avoiding the deadlock.
The patch presumably still does not fix the problem entirely, because
the reads of vcl->busy and vcl->temp before the Lck_Trylock() could
still be outdated. With the temperature controls otherwise unchanged,
the only alternative idea I could come up with was to always use a
try/wait loop, which I dismissed due to the performance impact
(overhead and added latency).
Ref https://github.com/nigoroll/libvmod-dynamic/issues/110
diff --git a/bin/varnishd/cache/cache_vrt_vcl.c b/bin/varnishd/cache/cache_vrt_vcl.c
index a454b14d9..82e6e10a5 100644
--- a/bin/varnishd/cache/cache_vrt_vcl.c
+++ b/bin/varnishd/cache/cache_vrt_vcl.c
@@ -219,8 +219,25 @@ VRT_AddDirector(VRT_CTX, const struct vdi_methods *m, void *priv,
* not be legal to do so. Because we change the VCL temperature before
* sending COLD events we have to tolerate and undo attempts for the
* COOLING case.
+ *
+ * To avoid deadlocks during vcl_BackendEvent, we only wait for vcl_mtx
+ * if the vcl is busy (ref vcl_set_state())
*/
- Lck_Lock(&vcl_mtx);
+
+ while (1) {
+ temp = vcl->temp;
+ if (temp == VCL_TEMP_COOLING)
+ return (vcldir_surplus(vdir));
+ if (vcl->busy == 0 && vcl->temp->is_warm) {
+ if (! Lck_Trylock(&vcl_mtx))
+ break;
+ usleep(10 * 1000);
+ continue;
+ }
+ Lck_Lock(&vcl_mtx);
+ break;
+ }
+ Lck_AssertHeld(&vcl_mtx);
temp = vcl->temp;
if (temp != VCL_TEMP_COOLING)
VTAILQ_INSERT_TAIL(&vcl->director_list, vdir, list);
More information about the varnish-commit
mailing list