Generate web audio at the AudioContext's native sample rate #12
Open
lxpollitt wants to merge 1 commit into
Previously the sample rate was hardcoded to 22050 Hz, both for sample generation and for the AudioContext, forcing the browser to resample to the device rate (typically 44100 or 48000 Hz) and, more significantly, putting the generation Nyquist at 11 kHz, so square wave harmonics above that folded back down as audible inharmonic artefacts (easily heard with e.g. SOUND 1,8,15 or SOUND 1,16,15). The AudioContext is now opened at the device's preferred rate, capped at 48 kHz (above which the chip emulation's 32-bit fixed point envelope maths would overflow, with no real audible benefit anyway), and the rate is passed to the web worker in the Initialise message so that sample generation matches. The sample latency target is now expressed in milliseconds (SAMPLE_LATENCY_MS = 140, equivalent to the previous 3072 samples at 22050 Hz) and the DC blocker coefficient is derived from the rate to keep its corner frequency at ~17.5 Hz. If AudioContext creation fails, the sample generation maths falls back to the previous fixed 22050 Hz rate.
Context
This is the first in a series of PRs aimed at improving the AY-3-8912 sound emulation. These PRs target the web version first as it was the quickest for me to iterate and validate on. I believe that, in most cases, the desktop and Android versions have the same underlying issues in their own copies of the 8912 emulation. My intention is to bring the same improvements to those platforms later, ideally by moving to a shared core implementation rather than porting fixes copy by copy, but I am focusing on the web version first.
I've deployed a WIP version of JOric that includes all of the fixes from this series of PRs, so you can hear the overall impact: https://joric-wip.pages.dev
Scuba Dive is a good reference point: it hits several of the issues addressed in these PRs, so it offers an easy comparison with the current code.
Problem
The web version generates its audio at a hardcoded 22050 Hz, and forces the AudioContext to match. This causes audible artefacts for higher pitched tones, which come out sounding fuzzy and multi-tonal rather than clean.
An easy way to hear this is to type in BASIC:
SOUND 1,16,15
On a real Oric this produces a steady high pitched tone. On the current web version it carries clearly audible additional tones underneath.
The cause is in two layers. The main one: generating at 22050 Hz puts the Nyquist limit at 11 kHz, so the square wave harmonics above that fold back down into the audible range (the chip emulation's sample accumulation attenuates them but cannot remove them). The second layer: since real output devices typically run at 44100 or 48000 Hz, the browser then resamples the 22050 Hz stream on top of that.
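The fold-back described above can be illustrated with a little arithmetic. This is only a sketch, not code from the PR: it assumes the PSG is clocked at 1 MHz, that the second SOUND argument is the raw tone period, and that the tone frequency is clock / (16 * period). The helper names are hypothetical, and ideal sampling is assumed (the emulation's sample accumulation would attenuate these components, not remove them).

```javascript
// Assumption: 1 MHz PSG clock, tone frequency = clock / (16 * period).
const PSG_CLOCK_HZ = 1000000;

// Fundamental frequency for a given tone period (hypothetical helper).
function toneFrequency(period) {
  return PSG_CLOCK_HZ / (16 * period);
}

// Frequency at which a component at `freq` shows up after ideal sampling
// at `sampleRate`: anything above Nyquist is mirrored back below it.
function aliasedFrequency(freq, sampleRate) {
  const folded = freq % sampleRate;
  return folded > sampleRate / 2 ? sampleRate - folded : folded;
}

// A square wave contains only odd harmonics; list the first `count` of them
// alongside where each one lands at the given sample rate.
function oddHarmonics(period, sampleRate, count) {
  const f0 = toneFrequency(period);
  const result = [];
  for (let k = 0; k < count; k++) {
    const n = 2 * k + 1;
    const actual = n * f0;
    result.push({ harmonic: n, actual, heard: aliasedFrequency(actual, sampleRate) });
  }
  return result;
}

console.log(oddHarmonics(16, 22050, 3));
console.log(oddHarmonics(16, 48000, 3));
```

Under these assumptions, SOUND 1,16,15 has a fundamental of 3906.25 Hz, and its 3rd harmonic at 11718.75 Hz lands at 10331.25 Hz when sampled at 22050 Hz, which is not a multiple of the fundamental and so sounds like an extra tone. At 48000 Hz the same harmonic is below Nyquist and stays where it belongs.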
Fix
The AudioContext is now opened at the device's preferred sample rate, so no resampling happens at the context boundary, and sample generation runs at that same rate. At 48 kHz the Nyquist limit moves to 24 kHz, so the 3rd harmonics of the test tones above render correctly instead of folding down. Note this fix doesn't resolve all high frequency artefacts (there is still some fuzziness and artefacts associated with the conversion from the AY-3-8912 emulation's clock rate to 48 kHz) but it is noticeably cleaner and the remaining artefacts are less prominent.
The rate is capped at 48 kHz: above that the emulation's 32-bit fixed point envelope maths would overflow, and any real audible benefit above 48 kHz is unlikely anyway. If a device reports higher (e.g. 96 kHz audio interfaces), the AudioContext is recreated at 48 kHz and the browser handles that final conversion, which is benign compared to upsampling from 22050. If AudioContext creation fails entirely there is no audio anyway, but the generation maths falls back to the previous 22050 Hz so it stays sane.
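The rate selection described above might look roughly like the following. This is a minimal sketch rather than the PR's actual code: the function and constant names are hypothetical, and the AudioContext constructor is passed in as a parameter purely so the logic can be exercised outside a browser.

```javascript
const MAX_SAMPLE_RATE = 48000;      // cap named in the description
const FALLBACK_SAMPLE_RATE = 22050; // previous fixed rate, used if creation fails

// Hypothetical helper: open a context at the device's preferred rate,
// reopening at the cap if the device reports something higher.
function openAudioContext(AudioContextCtor) {
  try {
    let ctx = new AudioContextCtor(); // no options: device's preferred rate
    if (ctx.sampleRate > MAX_SAMPLE_RATE) {
      ctx.close(); // discard the high-rate context
      ctx = new AudioContextCtor({ sampleRate: MAX_SAMPLE_RATE });
    }
    return { ctx, sampleRate: ctx.sampleRate };
  } catch (e) {
    // No audio is possible here, but downstream maths still needs a sane rate.
    return { ctx: null, sampleRate: FALLBACK_SAMPLE_RATE };
  }
}
```

In a browser this would be called with the real constructor, for example openAudioContext(window.AudioContext), and the returned sampleRate is the value that would then be sent to the worker.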
Since the rate is now only known once the AudioContext exists on the UI thread, it is passed to the web worker as a new field in the existing Initialise message. The values that previously assumed the fixed rate are now derived from the actual rate: the cycles-per-sample pacing used by the emulation loop, the sample latency target (now expressed as SAMPLE_LATENCY_MS = 140, roughly the same time value as the previous 3072 samples at 22050 Hz, so perceived latency is unchanged), and the DC blocker coefficient (derived so its corner frequency stays at ~17.5 Hz regardless of rate).
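As a rough illustration of how those three values could be derived from the rate, here is a sketch under stated assumptions. It assumes a 1 MHz emulated clock for the pacing calculation and a standard one-pole DC blocker of the form y[n] = x[n] - x[n-1] + R*y[n-1], whose corner frequency is approximately (1 - R) * rate / (2π). The function name and the exact formulas are illustrative and may differ from what the PR does.

```javascript
const SAMPLE_LATENCY_MS = 140; // from the description
const DC_CORNER_HZ = 17.5;     // approximate corner frequency from the description
const CPU_CLOCK_HZ = 1000000;  // assumption: 1 MHz emulated clock

// Hypothetical helper: everything that used to assume 22050 Hz,
// recomputed from the rate the AudioContext actually opened at.
function deriveAudioParams(sampleRate) {
  return {
    // Emulated clock cycles to run per generated audio sample.
    cyclesPerSample: CPU_CLOCK_HZ / sampleRate,
    // Latency target converted from milliseconds to a sample count.
    latencySamples: Math.round((SAMPLE_LATENCY_MS * sampleRate) / 1000),
    // One-pole DC blocker coefficient R for the chosen corner frequency
    // (small-angle approximation, valid when the corner is far below the rate).
    dcBlockerCoeff: 1 - (2 * Math.PI * DC_CORNER_HZ) / sampleRate,
  };
}

console.log(deriveAudioParams(22050));
console.log(deriveAudioParams(48000));
```

With these formulas, 22050 Hz gives a latency target of 3087 samples (close to the previous 3072) and a coefficient of about 0.995, while 48000 Hz gives 6720 samples for the same 140 ms.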
Scope
Web platform only - five files (GwtAYPSG, PSGAudioWorklet, GwtJOricRunner, JOricWebWorker, sound-renderer.js), no changes to core or the other platforms. The chip emulation logic itself is untouched: its updateStep scaling already had a sample rate term, this just feeds it the true rate.
Testing
Tested on macOS in Chrome and Safari, and on iOS with Safari. High pitched SOUND commands (such as SOUND 1,8,15 and SOUND 1,16,15) now produce cleaner tones, confirmed by ear and by spectrum analysis. Verified overall pitch and tempo are unchanged (the emulation pacing loop adapts to the new rate). I also played a range of games checking for regressions and didn't spot any. The browser console logs the sample rate the AudioContext actually opened at, which may be useful when testing on other devices.