
qtvcp: don't pop a modal error dialog when headless #4183

Open
grandixximo wants to merge 1 commit into LinuxCNC:master from grandixximo:fix/qtvcp-headless-excepthook


@grandixximo

Contributor

qtvcp's excepthook shows a modal QMessageBox for uncaught exceptions. Under an offscreen/headless platform (CI) nobody can dismiss it, so msg.exec_() blocks forever. When the exception happens during screen construction, this hangs before the event loop arms the SIGTERM handler, so the process cannot be terminated cleanly and tests/ui-smoke/qtdragon-quit times out with the GUI still alive. This matches the intermittent F43 failures in #4169: any flaky construction-time error lands here, and a rerun that avoids it passes.

When QApplication is absent or offscreen, the hook now logs the traceback, runs the normal shutdown, and exits instead of showing the dialog. The error is still reported (stderr + log + non-zero exit); only the dialog is skipped, and the interactive path is unchanged.
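A minimal sketch of that guard, not the actual patch. The names `is_headless`, `make_excepthook`, and the injected callables are hypothetical. It assumes the application object exposes `platformName()`, as the object returned by `QApplication.instance()` does, and it takes the Qt-specific pieces as parameters so it can be read and run without Qt installed.

```python
import sys
import traceback


def is_headless(app):
    """Return True when nobody could dismiss a modal dialog.

    `app` is whatever QApplication.instance() returned: None when no
    application exists, otherwise an object with platformName().
    """
    if app is None:
        return True
    return app.platformName() == "offscreen"


def make_excepthook(get_app, show_dialog, shutdown, exit_fn=sys.exit):
    """Build an excepthook. All four callables are hypothetical stand-ins."""

    def hook(exc_type, exc_value, exc_tb):
        text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
        if is_headless(get_app()):
            # Headless: report on stderr, shut down, exit non-zero.
            sys.stderr.write(text)
            shutdown()
            exit_fn(1)
        else:
            # Interactive: keep the existing dialog behaviour.
            show_dialog(text)

    return hook
```

In a real Qt program the hook would presumably be installed with `sys.excepthook = make_excepthook(QApplication.instance, ...)`, with `show_dialog` wrapping the existing QMessageBox code.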

Reproduced and verified on a Fedora 43 / Python 3.14.5 uspace build by injecting an exception during construction: before, the test hangs 15s; after, it exits immediately with the traceback in the log.

@BsAtHome, three things:

  • The guard triggers on the offscreen platform, which the qtdragon ui-smoke tests use. The xvfb/xcb smoke tests (touchy, gmoccapy) are not offscreen and so are unchanged. Should I broaden the condition, for example to also fire when there is no controlling tty, so those get the same protection?
  • Could you test whether this clears the intermittent qtdragon-quit failures on your F43 setup? With a complete dependency set I could not reproduce the exact flakiness here, only the underlying hang via fault injection, so a confirmation on the real CI would help.
  • Worth a dedicated ui-smoke regression test for this? I can add one with a deliberately-broken test screen (a handler that raises in __init__) loaded via its own config, asserting the process exits promptly instead of hanging, without putting test-only code in qtvcp.
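The regression test proposed in the last bullet could look roughly like the sketch below. It is only an illustration of the "exits promptly instead of hanging" assertion: `exits_promptly` and the `BROKEN` snippet are hypothetical, and a plain Python child process stands in for qtvcp loading a broken test screen. The 15 second limit mirrors the hang duration mentioned above.

```python
import subprocess
import sys


def exits_promptly(argv, timeout_s=15.0):
    """Run argv and return (returncode, stderr).

    subprocess.run raises subprocess.TimeoutExpired if the child is still
    alive after timeout_s seconds, which is how a hang would show up.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    return proc.returncode, proc.stderr


# Stand-in for the deliberately broken screen: a handler that raises in __init__.
BROKEN = (
    "class Handler:\n"
    "    def __init__(self):\n"
    "        raise RuntimeError('boom')\n"
    "Handler()\n"
)

returncode, stderr = exits_promptly([sys.executable, "-c", BROKEN])
```

A real ui-smoke test would launch the GUI with its own config in place of the `-c` snippet, then check for a non-zero exit status and the traceback in the captured output.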

@BsAtHome

Contributor

Having no controlling TTY can happen in a setting where you actually have an accessible screen, so in itself it cannot be a reason to suppress a dialog. The real problem is that you need to find out whether or not you are in interactive mode. That is often a difficult determination.

The real problem in the test has been the lack of information in the logs. If there is a crash/exception, then there must be a switch to have it always dump the trace to stderr so it will be caught in any test. Usually you want to have "silent" programs, but crashes and unhandled exceptions are rather serious problems that need attention, wouldn't you agree?

A separate test would probably be too much for now. The better solution is to make these types of problems visible in the logs in the existing tests.

The excepthook only showed a modal QMessageBox, so a crash left no trace in
the logs and, offscreen (CI), the dialog blocked forever; during construction
that hung before SIGTERM was armed and qtdragon-quit timed out. Always write
the traceback to stderr and the log, and when offscreen (or no QApplication)
run shutdown and exit instead of the dialog. Interactive path unchanged.
@grandixximo force-pushed the fix/qtvcp-headless-excepthook branch from e706751 to bc37bd9 on June 20, 2026 07:34
@grandixximo

Contributor Author

Agreed on both counts. I dropped the no-tty idea; you are right that it can have an accessible screen and that interactive detection is not reliable.

I reworked it so the excepthook always writes the traceback to stderr and the log for any unhandled exception, so a crash is now visible in every test, not just this one. I kept the dialog skip only for the offscreen platform, which is the one unambiguous "no human" case and the one that actually hangs; everywhere else the dialog behaves as before. I left out the separate regression test.
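The "always write the traceback to stderr and the log" part can be sketched with the standard library alone. This is an assumption about the shape of the change, not the code in the commit: `report_exception` is a hypothetical helper, and the logger stands in for qtvcp's own logging setup.

```python
import logging
import sys
import traceback


def report_exception(exc_type, exc_value, exc_tb, logger, stream=None):
    """Write the formatted traceback to a stream and to a logger.

    stream defaults to sys.stderr, resolved at call time so that a
    redirected stderr is honoured.
    """
    stream = sys.stderr if stream is None else stream
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    stream.write(text)
    stream.flush()
    logger.critical("Qtvcp unhandled exception:\n%s", text)
    return text
```

Calling this first in the excepthook, before any decision about the dialog, is what would make a crash visible in every test log regardless of platform.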

Verified on the Fedora 43 build by injecting an exception during construction: the traceback now shows up in the test log, and the process exits instead of hanging; a normal run still passes.

Could you also run it on your machine and see what you get? With a complete dependency set I could not reproduce the intermittent failure here, only the underlying hang via fault injection, so a check on your real F43 setup would help confirm whether this clears the flakiness.

@BsAtHome

Contributor

Well, didn't fail... So that is good.

Log in result shows:

Traceback (most recent call last):
  File "/...lcnc.../lib/python/qtvcp/widgets/led_widget.py", line 127, in sizeHint
    def sizeHint(self):
    
KeyboardInterrupt
[QTvcp][<esc>[41mCRITICAL<esc>[0m]  Qtvcp unhandled exception:
Traceback (most recent call last):
  File "/...lcnc.../python/qtvcp/widgets/led_widget.py", line 127, in sizeHint
    def sizeHint(self):
    
KeyboardInterrupt
 (qtvcp:564)


2 participants