Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to fix crash on close with saving enabled #11440

Merged
merged 2 commits into from
Oct 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 37 additions & 13 deletions src/cascadia/WindowsTerminal/AppHost.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ AppHost::AppHost() noexcept :
_window->SetAlwaysOnTop(_logic.GetInitialAlwaysOnTop());
_window->MakeWindow();

_windowManager.GetWindowLayoutRequested([this](auto&&, const winrt::Microsoft::Terminal::Remoting::GetWindowLayoutArgs& args) {
_GetWindowLayoutRequestedToken = _windowManager.GetWindowLayoutRequested([this](auto&&, const winrt::Microsoft::Terminal::Remoting::GetWindowLayoutArgs& args) {
// The peasants are running on separate threads, so they'll need to
// swap what context they are in to the ui thread to get the actual layout.
args.WindowLayoutJsonAsync(_GetWindowLayoutAsync());
Expand Down Expand Up @@ -401,23 +401,42 @@ void AppHost::LastTabClosed(const winrt::Windows::Foundation::IInspectable& /*se
_HideNotificationIconRequested();
}

// We don't want to try to save layouts if we are about to close
_getWindowLayoutThrottler.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh jeez, this introduced a new one. I'll file it

_windowManager.GetWindowLayoutRequested(_GetWindowLayoutRequestedToken);
Comment on lines +404 to +406
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a weak_ptr for the GetWindowLayoutRequested even handler instead?

Copy link
Contributor Author

@Rosefield Rosefield Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that the event handler can get called during the destructor for the AppHost so technically this is still valid, but half of the state is gone. Removing the event handler makes it so that it (hopefully) can't get called at all. I don't know if revoking the event handler finishes executing outstanding asynchronous events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of a weak_ptr would prevent this. The moment the destructor starts running, a weak_ptr cannot be promoted to a strong com_ptr anymore. So basically if you write code like this:

std::weak_ptr<T> weakSelf{ shared_from_this() };
...
if (const auto self = weakSelf.lock()) {
    ... // <-- This path will not be reached if the destructor is running.
}

Except of course you meant the issue were our WinUI/XAML classes are being Close()d long before they're destroyed... But AppHost is unrelated to WinUI, so this shouldn't apply right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 I'm not sure what to do. Currently AppHost is not std::enabled_shared_from_this, and I don't know if that is a desirable change to make. I'm happy to do whatever, but I'm not confident in my ability to appropriately test this since I haven't been able to reproduce the original exceptions myself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I'm sorry I mixed up std and winrt lingo...
With WinRT we have to use the member function get_weak(), which returns a winrt::weak_ref and to promote it to a strong reference you have to use get() instead of lock().

We can wait for input from other terminal devs. 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppHost also does not seem to be a winrt type either so it does not have get_weak

// Remove ourself from the list of peasants so that we aren't included in
// any future requests. This will also mean we block until any existing
// event handler finishes.
_windowManager.SignalClose();
Copy link
Contributor Author

@Rosefield Rosefield Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this which makes many of the other changes potentially redundant (but it doesn't hurt to have more checks). Assuming this call doesn't fail somehow, we will try to acquire the lock in the monarch which means we block until any outstanding event handlers finish, and then afterwards we are removed from the list of peasants so we shouldn't be called again.

Edit: I remembered that I made this function do nothing if we are in the middle of a quit operation, so it isn't quite as thorough. But also before a quit happens we stop the automatic saving so it is probably fine anyways. 🤷 Enforcing all of the invariants I want is a pain.


_window->Close();
}

LaunchPosition AppHost::_GetWindowLaunchPosition()
{
// Get the position of the current window. This includes the
// non-client already.
const auto window = _window->GetWindowRect();
LaunchPosition pos{};
// If we started saving before closing, but didn't resume the event handler
// until after _window might be a nullptr.
if (!_window)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is exclusively called on the main thread right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is called in a couple of event handlers that may or may not be on the main thread, so I don't know if I can guarantee it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a race condition with whatever sets this field to null then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its why there is still the try/catch, yes. I admittedly don't have a good sense of the full lifetime of the application, especially as things are closing. I believe in the error that @zadjii-msft found it would be on the main thread (assuming the ui thread is also the one that handles destruction) since it went through GetWindowLayoutsAsync.

Copy link
Contributor Author

@Rosefield Rosefield Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more clear, I believe, but have not verified that this is the order of operations in the crash

  • Event handler gets called which gives the peasant an IAsyncOperation
  • The coroutine progresses until it yields on await resume_foreground ...
  • The application starts getting closed because we called PostQuitMessage
  • The event loop in wWinMain exits so the destructor for AppHost is called
  • we set _window = nullptr and then call App->Close.
  • App->Close() tries to clear the event handlers / flush any outstanding requests which includes our event handler in (2)
  • The event handler resumes and then tries to access _window which is now a a nullptr.
  • crash

I have attempted to solve the problem in the following ways

  • Remove the event handler for WindowLayoutRequested before we call PostQuitMessage so hopefully there are no outstanding events when we call our destructor
  • If we still get into the event handler somehow, we check explicitly if the window is a nullptr so that we don't attempt to use it if it is invalid
  • If we fail later (say on usage of _logic) then we just wrap everything in a try catch and just ignore the error. I have not seen a case where this fails, but it might just be that any crash here would've crashed on the _window usage before.

I think that this is sufficiently resilient to avoiding crashes but it is possible that there is something I'm missing still in my understanding of how the runtime manages the lifetime of IAsyncOperations.

{
return pos;
}

const auto dpi = _window->GetCurrentDpi();
const auto nonClientArea = _window->GetNonClientFrame(dpi);
try
{
// Get the position of the current window. This includes the
// non-client already.
const auto window = _window->GetWindowRect();

// The nonClientArea adjustment is negative, so subtract that out.
// This way we save the user-visible location of the terminal.
LaunchPosition pos{};
pos.X = window.left - nonClientArea.left;
pos.Y = window.top;
const auto dpi = _window->GetCurrentDpi();
const auto nonClientArea = _window->GetNonClientFrame(dpi);

// The nonClientArea adjustment is negative, so subtract that out.
// This way we save the user-visible location of the terminal.
pos.X = window.left - nonClientArea.left;
pos.Y = window.top;
}
CATCH_LOG();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try/catch seems to be redundant with the one on line 746.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt safer to also just put the try/catch here since this function is called in the CloseRequested and WindowCloseButtonClicked event handlers as well, and might be used elsewhere in the future.


return pos;
}
Expand Down Expand Up @@ -723,10 +742,15 @@ winrt::Windows::Foundation::IAsyncOperation<winrt::hstring> AppHost::_GetWindowL
{
winrt::apartment_context peasant_thread;

winrt::hstring layoutJson = L"";
// Use the main thread since we are accessing controls.
co_await winrt::resume_foreground(_logic.GetRoot().Dispatcher());
Copy link
Contributor Author

@Rosefield Rosefield Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this line fail? should that also go in the try {}? If it can fail what state are we even in afterwards?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot fail. At least not in a way that can be caught by try/catch (for instance if GetRoot() were to return a nullptr).

const auto pos = _GetWindowLaunchPosition();
const auto layoutJson = _logic.GetWindowLayoutJson(pos);
try
{
const auto pos = _GetWindowLaunchPosition();
layoutJson = _logic.GetWindowLayoutJson(pos);
}
CATCH_LOG()

// go back to give the result to the peasant.
co_await peasant_thread;
Expand Down
1 change: 1 addition & 0 deletions src/cascadia/WindowsTerminal/AppHost.h
Original file line number Diff line number Diff line change
Expand Up @@ -114,4 +114,5 @@ class AppHost
winrt::event_token _NotificationIconPressedToken;
winrt::event_token _ShowNotificationIconContextMenuToken;
winrt::event_token _NotificationIconMenuItemSelectedToken;
winrt::event_token _GetWindowLayoutRequestedToken;
};