feature(metrics): SSH tunnel observability and Grafana dashboard #3
Closed
CodeLieutenant wants to merge 1 commit into
Add per-build attribution for SSH-tunneled client traffic and a Grafana
dashboard to observe it, proving traffic flows through the tunnel and
surfacing proxy vitals.
Client (argus/client/session.py):
- Emit X-Argus-Build-Id (JOB_NAME#BUILD_NUMBER, ARGUS_BUILD_ID override)
and X-Argus-Build-Url on tunneled requests only.
Backend (argus_backend.py):
- Add http_request_tunnel_build_total{build_id, build_url} counter;
build_url is 1:1 with build_id (no extra series), carried for linking.
Dashboard (scripts/grafana/argus-overview.json):
- Make datasource portable via a ${datasource} template variable.
- Add "SSH Tunneling" row (tunneled-vs-direct proof, % via tunnel, header
anomalies, SSH auth-attempt rate, registrations, by-endpoint, by-UA,
by-build with a clickable Jenkins data link).
- Add "SSH Proxy Vitals" row (bandwidth/CPU/mem/disk/connections) gated on
a proxy_job variable; requires node_exporter on the proxy host.
Tests: cover build-id composition, build-url, and omission outside CI.
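The build-id composition described above can be sketched as follows. This is a minimal illustration, not the code from argus/client/session.py: the helper name build_attribution_headers is hypothetical, and two behaviours are assumptions the description does not spell out (falling back to the bare JOB_NAME when BUILD_NUMBER is missing, and ARGUS_BUILD_URL taking precedence over BUILD_URL).

```python
import os
from typing import Dict, Mapping, Optional


def build_attribution_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return build attribution headers for a tunneled request.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    headers: Dict[str, str] = {}

    # ARGUS_BUILD_ID is used verbatim when set.
    build_id = env.get("ARGUS_BUILD_ID")
    if not build_id:
        job_name = env.get("JOB_NAME")
        build_number = env.get("BUILD_NUMBER")
        if job_name and build_number:
            build_id = f"{job_name}#{build_number}"
        elif job_name:
            # Assumption: use the bare job name when BUILD_NUMBER is missing.
            build_id = job_name
    if build_id:
        headers["X-Argus-Build-Id"] = build_id

    # Assumption: ARGUS_BUILD_URL overrides the Jenkins-provided BUILD_URL.
    build_url = env.get("ARGUS_BUILD_URL") or env.get("BUILD_URL")
    if build_url:
        headers["X-Argus-Build-Url"] = build_url

    # Outside CI none of the variables are set, so no headers are returned.
    return headers


print(build_attribution_headers({"JOB_NAME": "job/path", "BUILD_NUMBER": "42"}))
# prints {'X-Argus-Build-Id': 'job/path#42'}
```

Passing the environment as a mapping keeps the function easy to test without patching os.environ; the caller would merge the result into the request headers only when the tunnel is in use.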
Overview
Adds observability for the SSH Tunnel client feature so we can prove that traffic is actually flowing through the tunnel when use_tunnel=True, attribute that traffic to the originating Jenkins build, and observe the proxy host's vitals, all surfaced in a portable Grafana dashboard.
Why
When clients route through the SSH tunnel, we previously had no reliable signal that the tunnel was up and carrying traffic, no way to tell which job was using it, and no visibility into the proxy server. This closes those gaps using metrics the backend already exports plus minimal new instrumentation.
What changed
Client: argus/client/session.py
- X-Argus-Build-Id: JOB_NAME#BUILD_NUMBER (full Jenkins folder path + build number), with ARGUS_BUILD_ID as a verbatim override; omitted outside CI.
- X-Argus-Build-Url: BUILD_URL (or ARGUS_BUILD_URL), used to make the dashboard series clickable.
Backend: argus_backend.py
- New counter http_request_tunnel_build_total{build_id, build_url}. build_url is 1:1 with build_id, so it adds no extra series; it is only a carrier so Grafana can link back to the build. Requests without the header land in a single unknown bucket (filtered out in dashboards).
Dashboard: scripts/grafana/argus-overview.json
- Datasource made portable via a ${datasource} template variable (works for both manual import and provisioning); dropped import-only __inputs/__requires scaffolding and live-instance id/version.
- "SSH Proxy Vitals" row gated on a proxy_job variable.
Tests: argus/client/tests/test_tunnel.py
- Cover build-id composition (job/path#42), explicit override, job-name-without-build-number, build-url emission, and omission outside CI. Full tunnel suite: 20 passed.
Reviewer notes / follow-ups (not in this PR)
- The "SSH Proxy Vitals" row requires node_exporter on the proxy host and is gated on proxy_job. The proxy setup script (scripts/tunnel-server-setup.sh) installs sshd config only; adding node_exporter is a separate change.
- JOB_NAME#BUILD_NUMBER granularity means each build is a new Prometheus series (bounded by retention). Intentional, to make builds individually identifiable; dropping #BUILD_NUMBER would collapse to one series per job if it ever becomes a concern.
- Clickable build links require X-Argus-Build-Url (this client version, in CI); older clients show build_id without a link.
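The label handling and the cardinality trade-off noted above can be illustrated with a small stdlib-only sketch. It does not reproduce argus_backend.py: a collections.Counter stands in for the Prometheus counter, the function names are hypothetical, header lookup is simplified to a plain dict, and using an empty build_url when the header is absent is an assumption.

```python
from collections import Counter
from typing import Mapping, Tuple

UNKNOWN = "unknown"

# Stand-in for http_request_tunnel_build_total{build_id, build_url};
# each distinct (build_id, build_url) key corresponds to one series.
tunnel_build_total: Counter = Counter()


def series_labels(headers: Mapping[str, str]) -> Tuple[str, str]:
    """Map request headers to (build_id, build_url) label values."""
    build_id = headers.get("X-Argus-Build-Id", "").strip() or UNKNOWN
    # Assumption: older clients send no URL, so the label is left empty.
    build_url = headers.get("X-Argus-Build-Url", "").strip()
    if build_id == UNKNOWN:
        build_url = ""  # keep a single unknown bucket
    return build_id, build_url


def record(headers: Mapping[str, str]) -> None:
    tunnel_build_total[series_labels(headers)] += 1


def job_only(build_id: str) -> str:
    """Drop a trailing #BUILD_NUMBER, collapsing builds into one job."""
    return build_id.rsplit("#", 1)[0]


record({"X-Argus-Build-Id": "job/path#41", "X-Argus-Build-Url": "http://ci/41/"})
record({"X-Argus-Build-Id": "job/path#42", "X-Argus-Build-Url": "http://ci/42/"})
record({"X-Argus-Build-Id": "job/path#42", "X-Argus-Build-Url": "http://ci/42/"})
record({})  # no header: lands in the unknown bucket

per_build = len(tunnel_build_total)
per_job = len({job_only(build_id) for build_id, _ in tunnel_build_total})
print(per_build, per_job)  # prints 3 2
```

Because build_url is determined by build_id, adding it to the key does not change the number of distinct series. Collapsing to the job name is the escape hatch mentioned in the follow-ups, at the cost of losing per-build identification.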