-
Notifications
You must be signed in to change notification settings - Fork 5.3k
[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
mdh1418
wants to merge
1
commit into
dotnet:main
Choose a base branch
from
mdh1418:reorder_previous_signal_handler
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+8
−5
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think this works well when there are multiple runtimes loaded in the process that all want to handle segfaults. It can be multiple .NET runtimes (e.g. CoreCLR + NAOT, or multiple NAOT), or it can be .NET and some other runtime (e.g. Java runtime).
The expected behavior in these situations is that the given runtime will check whether the signal happened in the code that it cares about. If yes, it will handle the signal. If no, it will forward the signal to the next runtime, and so on.
With this change, I think we will shutdown our runtime instance and generate crashdump if there is segfault gracefully handled by some other runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gracefully handled by another runtime, as in sa_sigaction/sa_handler? Wouldn't those still hit PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled in the original implementation? Or do they somehow return from
invoke_previous_actionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the signal is handled in some other runtime, the handler registered by that runtime would not return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so given that we cannot tell how the other runtime will handle the signal, would it make sense then to pivot to an opt-in config switch that allows triggering a shutdown/create dump, even if the other runtime handled it gracefully?
I am not sure yet if there is a way for Android CoreCLR to not have a previously registered signal handler, so in those cases we wouldn't ever create crash dumps for signals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know what the handlers that are registered on Android before us do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can start with that. I do not think we even need to pass the signal along to the previously registered handler if it is an opt-in.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Android we need to call previously handler, or it probably won't generate the tombstone and native crash report making sure logcat and crash artifacts gets uploaded to play console.
The idea on Android was to log a textual format of a managed crash report into logcat (and potentially a json file similar to current --crashreportonly) inside Androids implementation of PROCCreateCrashDumpIfEnabled making sure it runs in all scenarios where we normally would produce a crash dump/report. Since Android have limited capabilities to fork and ptrace parent (need root permissions), this needs to be done in-proc on a best effort basis. The first part of this is just to log the managed stack trace of the crashing thread and then potentially expand it into something similar to --crashreportonly, but in a format suitable for logcat.
On Android apps probably always want to generate the crash report into logcat, so I assume embedding SDKs will enable this by default, like done by dotnet Android SDK on Mono. Alternative is for app developers to opt-in, by setting a runtime config or env variable, and since we already discuss opt-in for the crash chaining, maybe it should be the same setting, doing crash chaining without wanting a crash report doesn't sound like a scenario, then you just don't request crash chaining to begin with and it will behave like it currently do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think writing to logcat is the default given we aren't providing any kind of solution to exfiltrate the data. I do think it would be worth looking into how we can be more friendly with Sentry and other similar solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be writing the unhandled exception message and stacktrace to the logcat already (even before this change). Is that correct?
#101560 is trying to go in this direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly, with the exception of a native crash. If that happens, then we have very little to go off of unless a customer can provide a repro or answer a bunch of sluething questions. This is what @mdh1418 is looking to improve.