You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When retrieving the classic locale the current code calls _Init which takes a lock regardless of initialization state.
Expanded here, the code looks like the following.
_MRTIMP2_PURE const locale& __CLRCALL_PURE_OR_CDECL locale::classic() { // get reference to "C" locale//_Init();
{
locale::_Locimp* ptr = nullptr;
_BEGIN_LOCK(_LOCK_LOCALE) // prevent double initialization// this function just returns a global variable// ptr = _Getgloballocale();
ptr = global_locale;
if (ptr == nullptr) { // create new locales_Setgloballocale(ptr = _Locimp::_New_Locimp());
ptr->_Catmask = all; // set current locale to "C"
ptr->_Name = "C";
_Locimp::_Clocptr = ptr; // set classic to match
_Locimp::_Clocptr->_Incref();
::new (&classic_locale) locale(_Locimp::_Clocptr);
}
// this is always false in the classic() codepath//if (_Do_incref) {// ptr->_Incref();//}_END_LOCK()
//return ptr;
}
return classic_locale;
}
This means that every call to classic() is locking and unlocking a mutex, despite that after its initialization the pointer must always be valid. This can cause huge performance issues in specific scenarios, for example https://twitter.com/aras_p/status/1558063800145813505 shows a measured ~500x performance penalty in some blender loading code due to contention on this lock.
A fast path that avoids locking (by checking whether global_locale is valid, and directly returning before entering the critical section) should drastically improve performance in this and other similar situations.
Looks like a duplicate of #1652. I'm curious what the ABI problem is, though. It seems like you could use double-checked locking if global_locale were upgraded to an atomic pointer. User code doesn't access that pointer directly, so I don't think you have a "print driver scenario" problem, and the semantics of _Init haven't changed, so you also don't have a mix-n-match problem.
It may be possible to fix this without breaking ABI - however, @CaseyCarter noted some potential difficulties:
This needs to be compatible with /clr:pure (which is incompatible with atomics),
This is injected into the STL's import library, where we recently had trouble with atomic appearing (and fixed it by not dragging in atomic)
We didn't fully understand the root cause, but it involved the internal Windows build breaking because the _Init_locks assignment operator was getting dragged in to the import lib by the non-core atomics machinery, and that assignment operator is also exported from the DLL, leading to a duplicate symbol error 😿
When retrieving the classic locale the current code calls _Init which takes a lock regardless of initialization state.
Expanded here, the code looks like the following.
This means that every call to classic() is locking and unlocking a mutex, despite that after its initialization the pointer must always be valid. This can cause huge performance issues in specific scenarios, for example https://twitter.com/aras_p/status/1558063800145813505 shows a measured ~500x performance penalty in some blender loading code due to contention on this lock.
A fast path that avoids locking (by checking whether global_locale is valid, and directly returning before entering the critical section) should drastically improve performance in this and other similar situations.
https://github.com/microsoft/STL/blob/5e7cac7f679/stl/src/locale0.cpp#L138-L141
The text was updated successfully, but these errors were encountered: