Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 8 additions & 5 deletions src/coreclr/pal/src/exception/signal.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -447,7 +447,14 @@ static void invoke_previous_action(struct sigaction* action, int code, siginfo_t
PROCAbort(code, siginfo, context);
}
}
else if (IsSaSigInfo(action))

_ASSERTE(!IsSigDfl(action) && !IsSigIgn(action));

PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));
Copy link
Member

@jkotas jkotas Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this works well when there are multiple runtimes loaded in the process that all want to handle segfaults. It can be multiple .NET runtimes (e.g. CoreCLR + NAOT, or multiple NAOT), or it can be .NET and some other runtime (e.g. Java runtime).

The expected behavior in these situations is that the given runtime will check whether the signal happened in the code that it cares about. If yes, it will handle the signal. If no, it will forward the signal to the next runtime, and so on.

With this change, I think we will shutdown our runtime instance and generate crashdump if there is segfault gracefully handled by some other runtime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gracefully handled by another runtime, as in sa_sigaction/sa_handler? Wouldn't those still hit PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled in the original implementation? Or do they somehow return from invoke_previous_action

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the signal is handled in some other runtime, the handler registered by that runtime would not return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so given that we cannot tell how the other runtime will handle the signal, would it make sense then to pivot to an opt-in config switch that allows triggering a shutdown/create dump, even if the other runtime handled it gracefully?

I am not sure yet if there is a way for Android CoreCLR to not have a previously registered signal handler, so in those cases we wouldn't ever create crash dumps for signals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know what the handlers that are registered on Android before us do?

Copy link
Member

@lateralusX lateralusX Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mono we have this as opt-in feature and it will create the native crash report before chaining any signal handlers, but its off by default and apps opt in to enable the feature through the embedding API's (as described in the PR description). dotnet Android SDK currently opt-in to this feature, meaning that it will dump the native crash report before chaining signal handlers.

The "crash dump" that we collect on Android is mainly crash details going into logcat in text format, since that is the only way to get crash data out of retail devices in cases when using default error reporting, logcat output will be one of few artifacts uploaded from installed app on retail devices, unless app is running 3'rd party crash services that could upload additional data into their own services.

The Android signal handler is most likely invoking the Android crash daemon generating the Android crash report and tombstone and terminate the app, so it won't return, meaning there is no way for us to report additional information in a SIGSEGV scenario unless we do it before that handler gets executed.

There is AFAIK no reliable way to detect that its the Andorid crash report handler that has been registered, maybe we could identify the handler as being part of Android libraries and act based on that information, but that will probably break sooner or later, so probably not a good path forward.

Let say we have this as an opt-in as we do on Mono, meaning we can generate a crash report before chaining signals, should we still call PROCNotifyProcessShutdown or is it enough to just call PROCCreateCrashDumpIfEnabled triggering the Android specific implementation of that function?

Copy link
Member

@lateralusX lateralusX Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could do something along these lines instead (default config value is false):

if (IsSigIgn(action))
{
    ...
}
else if (IsSigDfl(action))
{
    ....
}
else
{
    bool doCrashChaining = Configuration::GetKnobBooleanValue(....);
    if (doCrashChaining)
    {
             //Should we ignore doing PROCNotifyProcessShutdown in this scenario?
            //PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));
            PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);
    }
 
    if (IsSaSigInfo(action))
    {
        _ASSERTE(action->sa_sigaction != NULL);
        action->sa_sigaction(code, siginfo, context);
    }
    else
    {
        _ASSERTE(action->sa_handler != NULL);
        action->sa_handler(code);
    }
 
    if (!doCrashChaining)
    {
            PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));
            PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing that PROCNotifyProcessShutdown doesn't actually handle/signal shutdown. It currently only applies to Unix and just cleans up the debugger transport and diagnostic server.

For non-Android, PROCCreateCrashDumpIfEnabled will trigger PROCCreateCrashDump which will likely terminate the process with exit().

On Android CoreCLR, there is no CreateDump (hopefully we will have one in the future), so its PROCCreateCrashDumpIfEnabled is just

PROCCreateCrashDumpIfEnabled(int signal, siginfo_t* siginfo, void* context, bool serialize)
{
// Preserve context pointer to prevent optimization
DoNotOptimize(&context);
// TODO: Dump all managed threads callstacks into logcat and/or file?
// TODO: Dump stress log into logcat and/or file when enabled?
minipal_log_write_fatal("Aborting process.\n");
}

is collecting a crash dump of a appstore-installed app on retail device something we want to enable?

Do other CoreCLR platforms not do so if crash dumps are enabled?
I think having parity with other platforms makes sense, but has any customer hit a crash on desktop CoreCLR and not produced a dump with DOTNET_DbgEnableMiniDump=1 because of signal handler registered out of their control?
Unless we coordinate with Android's crash reporter resolve our symbols, it feels like we should atleast allow an opt-in to generate a crash report without terminating the process, and then pass the signal along to the previously registered handler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is collecting a crash dump of a appstore-installed app on retail device something we want to enable?

Do other CoreCLR platforms not do so if crash dumps are enabled?
I think having parity with other platforms makes sense,

Right, CoreCLR post-mortem diagnostics tooling is oriented around crash dumps. It would be about figuring out the whole flow - what would people need to do to opt-in on the given retail device (drop a config file to some known user writeable location?) and what would they need to do to exfiltrate the crashdump from the device (find the crash dump in some known location?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should atleast allow an opt-in to generate a crash report without terminating the process, and then pass the signal along to the previously registered handler.

Yes, we can start with that. I do not think we even need to pass the signal along to the previously registered handler if it is an opt-in.


PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);

if (IsSaSigInfo(action))
{
// Directly call the previous handler.
_ASSERTE(action->sa_sigaction != NULL);
Expand All @@ -459,10 +466,6 @@ static void invoke_previous_action(struct sigaction* action, int code, siginfo_t
_ASSERTE(action->sa_handler != NULL);
action->sa_handler(code);
}

PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));

PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);
}

/*++
Expand Down
Loading