13 changes: 8 additions & 5 deletions src/coreclr/pal/src/exception/signal.cpp
@@ -447,7 +447,14 @@ static void invoke_previous_action(struct sigaction* action, int code, siginfo_t
             PROCAbort(code, siginfo, context);
         }
     }
-    else if (IsSaSigInfo(action))
+
+    _ASSERTE(!IsSigDfl(action) && !IsSigIgn(action));
+
+    PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));
Member

@jkotas jkotas Jan 28, 2026

I do not think this works well when there are multiple runtimes loaded in the process that all want to handle segfaults. It can be multiple .NET runtimes (e.g. CoreCLR + NAOT, or multiple NAOT), or it can be .NET and some other runtime (e.g. Java runtime).

The expected behavior in these situations is that the given runtime will check whether the signal happened in the code that it cares about. If yes, it will handle the signal. If no, it will forward the signal to the next runtime, and so on.

With this change, I think we will shut down our runtime instance and generate a crash dump even when the segfault is gracefully handled by some other runtime.
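
The forwarding behavior described in this comment can be sketched as a small decision function. This is only an illustration, not the CoreCLR implementation: `CodeRange`, `OwnsFault`, `ChainDecision` and `DecideChain` are hypothetical names, the ownership test is reduced to a single address range, and obtaining the faulting instruction pointer from the signal context is platform-specific and left out.

```cpp
#include <signal.h>
#include <cstdint>

// Hypothetical description of the code range one runtime owns.
struct CodeRange {
    std::uintptr_t start;  // inclusive
    std::uintptr_t end;    // exclusive
};

// What a chained handler does with a fault it has just received.
enum class ChainDecision {
    HandleLocally,   // fault is in our code: handle it here
    ForwardSigInfo,  // call previous.sa_sigaction(code, info, context)
    ForwardHandler,  // call previous.sa_handler(code)
    RestoreDefault,  // previous action is SIG_DFL
    Ignore           // previous action is SIG_IGN
};

// True when the faulting instruction pointer lies inside our own code.
bool OwnsFault(const CodeRange& range, std::uintptr_t faultIp) {
    return faultIp >= range.start && faultIp < range.end;
}

// Simplified: a real handler would also check for SIG_DFL/SIG_IGN stored
// in sa_sigaction when SA_SIGINFO is set.
ChainDecision DecideChain(const CodeRange& range, std::uintptr_t faultIp,
                          const struct sigaction& previous) {
    if (OwnsFault(range, faultIp)) return ChainDecision::HandleLocally;
    if (previous.sa_flags & SA_SIGINFO) return ChainDecision::ForwardSigInfo;
    if (previous.sa_handler == SIG_DFL) return ChainDecision::RestoreDefault;
    if (previous.sa_handler == SIG_IGN) return ChainDecision::Ignore;
    return ChainDecision::ForwardHandler;
}
```

The point of the sketch is that a runtime which does not own the fault should do nothing irreversible before forwarding, because the next handler in the chain may recover from it.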

Member Author

Gracefully handled by another runtime, as in sa_sigaction/sa_handler? Wouldn't those still hit PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled in the original implementation? Or do they somehow return from invoke_previous_action?

Member

If the signal is handled in some other runtime, the handler registered by that runtime would not return.
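
One way a previously registered handler can fail to return is a non-local jump. The thread does not say which mechanism other runtimes use, so the following is only a sketch under that assumption, using SIGUSR1 instead of a real fault. `OtherRuntimeHandler`, `OurHandler` and `CodeAfterChainedCallRan` are hypothetical names.

```cpp
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf g_recoveryPoint;
static volatile sig_atomic_t g_reachedAfterPrevious = 0;
static struct sigaction g_previousAction;

// Stand-in for another runtime's handler: it recovers by jumping away.
static void OtherRuntimeHandler(int) {
    siglongjmp(g_recoveryPoint, 1);
}

// Stand-in for our handler: chain first, then do the shutdown work.
static void OurHandler(int code) {
    g_previousAction.sa_handler(code);
    g_reachedAfterPrevious = 1;  // skipped when the previous handler jumps away
}

// Returns true if the code placed after the chained call ran.
bool CodeAfterChainedCallRan() {
    struct sigaction other{};
    other.sa_handler = OtherRuntimeHandler;
    sigemptyset(&other.sa_mask);
    sigaction(SIGUSR1, &other, nullptr);

    struct sigaction ours{};
    ours.sa_handler = OurHandler;
    sigemptyset(&ours.sa_mask);
    sigaction(SIGUSR1, &ours, &g_previousAction);  // remember the earlier handler

    g_reachedAfterPrevious = 0;
    if (sigsetjmp(g_recoveryPoint, 1) == 0) {
        raise(SIGUSR1);  // control comes back through siglongjmp instead
    }
    return g_reachedAfterPrevious != 0;
}
```

In this sketch `CodeAfterChainedCallRan()` returns false: anything placed after the chained call is skipped when the earlier handler recovers this way, while anything placed before it runs even though the signal was handled.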

Member Author

I see, so given that we cannot tell how the other runtime will handle the signal, would it make sense then to pivot to an opt-in config switch that allows triggering a shutdown/create dump, even if the other runtime handled it gracefully?

I am not sure yet if there is a way for Android CoreCLR to not have a previously registered signal handler, so in those cases we wouldn't ever create crash dumps for signals.

Member

Do we know what the handlers that are registered on Android before us do?

Member

> we should at least allow an opt-in to generate a crash report without terminating the process, and then pass the signal along to the previously registered handler.

Yes, we can start with that. I do not think we even need to pass the signal along to the previously registered handler if it is an opt-in.

Member

@lateralusX lateralusX Jan 30, 2026

On Android we need to call the previously registered handler; otherwise the system probably won't generate the tombstone and native crash report that make sure logcat and crash artifacts get uploaded to the Play Console.

The idea on Android was to log a textual form of a managed crash report into logcat (and potentially a JSON file similar to the current --crashreportonly) inside Android's implementation of PROCCreateCrashDumpIfEnabled, making sure it runs in all scenarios where we would normally produce a crash dump/report. Since Android has limited ability to fork and ptrace the parent (this needs root permissions), it has to be done in-proc on a best-effort basis. The first part is just to log the managed stack trace of the crashing thread, and then potentially expand it into something similar to --crashreportonly, but in a format suitable for logcat.

On Android, apps will probably always want to generate the crash report into logcat, so I assume embedding SDKs will enable this by default, as the dotnet Android SDK does on Mono. The alternative is for app developers to opt in by setting a runtime config or environment variable. Since we are already discussing an opt-in for crash chaining, maybe it should be the same setting: crash chaining without wanting a crash report doesn't sound like a scenario. In that case you just don't request crash chaining to begin with, and it behaves as it currently does.
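
The opt-in flow discussed here (write a best-effort report in-process, then pass the signal to the previously registered handler) might look roughly like the sketch below. Everything in it is an assumption for illustration: the environment variable name `EXAMPLE_CRASH_REPORT_AND_CHAIN` is made up, the report is a placeholder string written to stderr rather than a managed stack trace written to logcat, and none of the function names exist in the runtime.

```cpp
#include <signal.h>
#include <unistd.h>
#include <cstdlib>
#include <cstring>

// Hypothetical switch name; the thread does not settle on one.
static const char kOptInVariable[] = "EXAMPLE_CRASH_REPORT_AND_CHAIN";

static volatile sig_atomic_t g_reportAndChain = 0;  // cached opt-in flag
static volatile sig_atomic_t g_reportsWritten = 0;  // counts reports, for illustration
static struct sigaction g_previous;                 // handler registered before ours

// Read the switch once at startup; getenv is not safe inside a signal handler.
void InitializeCrashChaining() {
    const char* value = std::getenv(kOptInVariable);
    g_reportAndChain = (value != nullptr && std::strcmp(value, "1") == 0) ? 1 : 0;
}

// Placeholder for the in-process report. write() is async-signal-safe.
static void WriteCrashReport() {
    static const char message[] = "crash report placeholder\n";
    ssize_t ignored = write(STDERR_FILENO, message, sizeof(message) - 1);
    (void)ignored;
    g_reportsWritten = g_reportsWritten + 1;
}

static void ReportThenChain(int code, siginfo_t* info, void* context) {
    if (g_reportAndChain) {
        WriteCrashReport();
    }
    // Pass the signal along so the platform can still produce its own artifacts.
    // SIG_DFL and SIG_IGN are not handled in this sketch.
    if (g_previous.sa_flags & SA_SIGINFO) {
        g_previous.sa_sigaction(code, info, context);
    } else if (g_previous.sa_handler != SIG_DFL && g_previous.sa_handler != SIG_IGN) {
        g_previous.sa_handler(code);
    }
}

// Installs the handler and remembers whatever was registered before it.
bool InstallReportThenChain(int signo) {
    struct sigaction ours{};
    ours.sa_sigaction = ReportThenChain;
    ours.sa_flags = SA_SIGINFO;
    sigemptyset(&ours.sa_mask);
    return sigaction(signo, &ours, &g_previous) == 0;
}
```

Writing the report before chaining matters in this design because the earlier handler may terminate the process or never return, so nothing after the chained call can be relied on.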

Member

Right, CoreCLR post-mortem diagnostics tooling is oriented around crash dumps. It would be about figuring out the whole flow: what people would need to do to opt in on a given retail device (drop a config file into some known user-writable location?) and what they would need to do to exfiltrate the crash dump from the device (find it in some known location?).

I think writing to logcat is the default given we aren't providing any kind of solution to exfiltrate the data. I do think it would be worth looking into how we can be more friendly with Sentry and other similar solutions.

Member

> I think writing to logcat is the default given we aren't providing any kind of solution to exfiltrate the data

We should be writing the unhandled exception message and stacktrace to the logcat already (even before this change). Is that correct?

> it would be worth looking into how we can be more friendly with Sentry and other similar solutions.

#101560 is trying to go in this direction.

Member

> We should be writing the unhandled exception message and stacktrace to the logcat already (even before this change). Is that correct?

Mostly, with the exception of a native crash. If that happens, we have very little to go on unless a customer can provide a repro or answer a bunch of sleuthing questions. This is what @mdh1418 is looking to improve.


+    PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);
+
+    if (IsSaSigInfo(action))
     {
         // Directly call the previous handler.
         _ASSERTE(action->sa_sigaction != NULL);
@@ -459,10 +466,6 @@ static void invoke_previous_action(struct sigaction* action, int code, siginfo_t
         _ASSERTE(action->sa_handler != NULL);
         action->sa_handler(code);
     }
-
-    PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));
-
-    PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);
 }
 
 /*++