Generate web audio at the AudioContext's native sample rate #12

Open
lxpollitt wants to merge 1 commit into lanceewing:master from lxpollitt:fix/web-audio-sample-rate

Context

This is the first in a series of PRs aimed at improving the AY-3-8912 sound emulation. These PRs target the web version first as it was the quickest for me to iterate and validate on. I believe that, in most cases, the desktop and Android versions have the same underlying issues in their own copies of the 8912 emulation. My intention is to bring the same improvements to those platforms later, ideally by moving to a shared core implementation rather than porting fixes copy by copy, but I am focusing on the web version first.

I've deployed a WIP build of JOric that includes all of the fixes from this series of PRs, so you can hear the overall impact: https://joric-wip.pages.dev

Scuba Dive is a good reference point: it hits several of the issues addressed in these PRs, so it makes for an easy comparison with the current code.

Problem

The web version generates its audio at a hardcoded 22050 Hz, and forces the AudioContext to match. This causes audible artefacts for higher pitched tones, which come out sounding fuzzy and multi-tonal rather than clean.

An easy way to hear this is to type in BASIC:

SOUND 1,16,15

On a real Oric this produces a steady high pitched tone. On the current web version it carries clearly audible additional tones underneath.

The cause has two layers. The main one: generating at 22050 Hz puts the Nyquist limit at about 11 kHz (11025 Hz), so the square wave harmonics above that fold back down into the audible range (the chip emulation's sample accumulation attenuates them but cannot remove them). The second layer: since real output devices typically run at 44100 or 48000 Hz, the browser then resamples the 22050 Hz stream on top of that.
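As a rough worked example of the folding described above, the sketch below computes where the odd harmonics of the SOUND 1,16,15 tone land at each generation rate. It assumes the AY-3-8912 is clocked at 1 MHz and that the tone frequency is clock / (16 × period), with the SOUND period argument used directly as the tone period. `toneFrequency` and `aliasedFrequency` are illustrative names, not functions from the JOric code.

```typescript
// Assumption: the Oric clocks the AY-3-8912 at 1 MHz.
const AY_CLOCK_HZ = 1000000;

// Assumption: tone frequency = clock / (16 * tone period).
function toneFrequency(period: number): number {
  return AY_CLOCK_HZ / (16 * period);
}

// Frequency at which a component at `hz` appears when it is sampled at
// `sampleRate` without band-limiting (it reflects around Nyquist).
function aliasedFrequency(hz: number, sampleRate: number): number {
  const wrapped = hz % sampleRate;
  return wrapped > sampleRate / 2 ? sampleRate - wrapped : wrapped;
}

const fundamental = toneFrequency(16); // 3906.25 Hz under the assumptions above
for (const harmonic of [1, 3, 5]) {
  const hz = fundamental * harmonic;
  console.log(
    `harmonic ${harmonic}: ${hz} Hz -> ` +
      `${aliasedFrequency(hz, 22050)} Hz at 22050, ` +
      `${aliasedFrequency(hz, 48000)} Hz at 48000`
  );
}
```

Under these assumptions, at 22050 Hz the 3rd harmonic (11718.75 Hz) folds to 10331.25 Hz and the 5th (19531.25 Hz) folds to 2518.75 Hz, neither of which is a multiple of the 3906.25 Hz fundamental. At 48000 Hz both sit below Nyquist and stay where they are.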

Fix

The AudioContext is now opened at the device's preferred sample rate, so no resampling happens at the context boundary, and sample generation runs at that same rate. At 48 kHz the Nyquist limit moves to 24 kHz, so the 3rd harmonics of the test tones (SOUND 1,8,15 and SOUND 1,16,15) render correctly instead of folding down. Note that this fix doesn't resolve all high frequency artefacts (some fuzziness remains from the conversion between the AY-3-8912 emulation's clock rate and 48 kHz), but the output is noticeably cleaner and the remaining artefacts are less prominent.

The rate is capped at 48 kHz: above that the emulation's 32-bit fixed point envelope maths would overflow, and any real audible benefit above 48 kHz is unlikely anyway. If a device reports a higher rate (e.g. 96 kHz audio interfaces), the AudioContext is recreated at 48 kHz and the browser handles that final conversion, which is benign compared to upsampling from 22050 Hz. If AudioContext creation fails entirely there is no audio anyway, but the generation maths falls back to the previous 22050 Hz so that the values derived from the rate remain valid.
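The selection logic described above could be sketched roughly as follows. This is not the PR's code: `ContextLike`, `ContextFactory`, and `openCappedContext` are hypothetical names, and the context is created through an injected factory so the logic can be exercised outside a browser. In a browser the factory would presumably wrap `new AudioContext()` or `new AudioContext({ sampleRate })`.

```typescript
// Minimal shape of the parts of an AudioContext this sketch relies on.
interface ContextLike {
  readonly sampleRate: number;
  close(): unknown;
}

// Hypothetical factory: no argument means "use the device's preferred rate".
type ContextFactory = (sampleRate?: number) => ContextLike;

const MAX_SAMPLE_RATE = 48000;      // cap stated in the PR description
const FALLBACK_SAMPLE_RATE = 22050; // previous fixed rate, used if creation fails

function openCappedContext(
  create: ContextFactory
): { context: ContextLike | null; generationRate: number } {
  try {
    let context = create();
    if (context.sampleRate > MAX_SAMPLE_RATE) {
      // Device prefers a higher rate: reopen at the cap and let the
      // browser do the final conversion to the device rate.
      context.close();
      context = create(MAX_SAMPLE_RATE);
    }
    return { context, generationRate: context.sampleRate };
  } catch (_err) {
    // No audio output is possible, but keep the generation maths well-defined.
    return { context: null, generationRate: FALLBACK_SAMPLE_RATE };
  }
}
```

The returned `generationRate` is the value that would then be handed to the worker so that sample generation matches the context.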

Since the rate is now only known once the AudioContext exists on the UI thread, it is passed to the web worker as a new field in the existing Initialise message. The values that previously assumed the fixed rate are now derived from the actual rate: the cycles-per-sample pacing used by the emulation loop, the sample latency target (now expressed as SAMPLE_LATENCY_MS = 140, roughly the same duration as the previous 3072 samples at 22050 Hz, which is about 139 ms, so perceived latency is unchanged), and the DC blocker coefficient (derived so its corner frequency stays at ~17.5 Hz regardless of rate).
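To make the derived values concrete, here is a minimal sketch of how they might be computed from the rate. Only SAMPLE_LATENCY_MS = 140 and the ~17.5 Hz corner frequency come from the description above. The 1 MHz emulated clock, the one-pole DC blocker form, the small-angle coefficient approximation, and the names `deriveFromSampleRate` and `RateDerivedValues` are assumptions for illustration and may not match the actual code.

```typescript
const SAMPLE_LATENCY_MS = 140; // from the PR description
const DC_CORNER_HZ = 17.5;     // from the PR description
const CPU_CLOCK_HZ = 1000000;  // assumption: 1 MHz emulated clock

interface RateDerivedValues {
  cyclesPerSample: number;
  latencySamples: number;
  dcBlockerCoefficient: number;
}

function deriveFromSampleRate(sampleRate: number): RateDerivedValues {
  return {
    // Emulated clock cycles to run per generated audio sample.
    cyclesPerSample: CPU_CLOCK_HZ / sampleRate,
    // Latency target converted from milliseconds to whole samples.
    latencySamples: Math.round((SAMPLE_LATENCY_MS * sampleRate) / 1000),
    // Assumed one-pole DC blocker y[n] = x[n] - x[n-1] + R * y[n-1],
    // using the approximation R = 1 - 2*pi*fc/fs for fc much smaller than fs.
    dcBlockerCoefficient: 1 - (2 * Math.PI * DC_CORNER_HZ) / sampleRate,
  };
}

console.log(deriveFromSampleRate(22050));
console.log(deriveFromSampleRate(48000));
```

With these formulas, 22050 Hz gives a latency target of 3087 samples (close to the previous 3072) and a coefficient of about 0.995, while 48000 Hz gives 6720 samples and a coefficient closer to 1.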

Scope

Web platform only - five files (GwtAYPSG, PSGAudioWorklet, GwtJOricRunner, JOricWebWorker, sound-renderer.js), no changes to core or the other platforms. The chip emulation logic itself is untouched: its updateStep scaling already had a sample rate term, this just feeds it the true rate.

Testing

Tested on macOS in Chrome and Safari, and on iOS with Safari. High pitched SOUND commands (such as SOUND 1,8,15 and SOUND 1,16,15) now produce cleaner tones, confirmed by ear and by spectrum analysis. Verified overall pitch and tempo are unchanged (the emulation pacing loop adapts to the new rate). I also played a range of games checking for regressions and didn't spot any. The browser console logs the sample rate the AudioContext actually opened at, which may be useful when testing on other devices.

Previously the sample rate was hardcoded to 22050 Hz, both for sample
generation and for the AudioContext, forcing the browser to resample
to the device rate (typically 44100 or 48000 Hz) and, more
significantly, putting the generation Nyquist at 11 kHz, so square
wave harmonics above that folded back down as audible inharmonic
artefacts (easily heard with e.g. SOUND 1,8,15 or SOUND 1,16,15).

The AudioContext is now opened at the device's preferred rate, capped
at 48 kHz (above which the chip emulation's 32-bit fixed point
envelope maths would overflow, with no real audible benefit anyway),
and the rate is passed to the web worker in the Initialise message so
that sample generation matches. The sample latency target is now
expressed in milliseconds (SAMPLE_LATENCY_MS = 140, equivalent to the
previous 3072 samples at 22050 Hz) and the DC blocker coefficient is
derived from the rate to keep its corner frequency at ~17.5 Hz. If
AudioContext creation fails, the sample generation maths falls back
to the previous fixed 22050 Hz rate.