Humble attempt to fix typos by adding codespell action, config etc. by yarikoptic · Pull Request #53 · canlab/CanlabCore

yarikoptic · 2023-11-28T21:01:57Z

original description: but the number is way too many. If someone could help and push more skips to add -- would be great.

adjusted one -- mostly redone with my codespell claude skill -- since 260 files changed, it might need some quality time to click through all of them, marking "viewed -- legit" ;):

Adds codespell spell-checking infrastructure and fixes all existing typos it detects (~250 unique typos across ~245 files), so the next typo to land will be caught in CI rather than after the fact.

I've introduced this exact pattern to over a hundred projects with positive feedback (see improveit-dashboard's notes for context). The GitHub Actions workflow has permissions: contents: read only.

There are about 25 prior typo-related commits in master, so this is a recurring issue worth automating.

What's in the PR

Infrastructure (4 commits)

1. Add CI workflow to codespell master on push and PRs – .github/workflows/codespell.yml, pinned to a commit SHA for actions-codespell@v2.2, runs on push and on PRs targeting master.
2. Add codespell configuration – .codespellrc with:
   - Skip patterns for binary/data files, External/ (vendored), docs_sphinx_old/, etc.
   - ignore-regex for URLs and for base64-encoded image data embedded by MATLAB live-script export in docs/markdown_tutorials/*.m.
   - ignore-words-list for MATLAB function names (ttest, assignin, evalin), variable-name conventions (indx, alph, te, selt, etc.), domain proper names (Sepulcre, Shepard), and a few struct-field / file names that are public API (prctiles, efficency).
3. Fix ambiguous typos requiring context review – 29 files, ~32 lines. Each fixed manually after reading the surrounding code; commit body lists every fix with file and rationale (e.g. interation -> iteration in an algorithm-convergence comment, not interaction).
4. Fix non-ambiguous typos with codespell -w – 244 files, ~405 lines. All single-suggestion typos auto-applied by codespell -w, then I reviewed the diff and reverted the changes that would have broken things:
   - desc.prctiles is a documented struct field of descriptives() output – kept (would have broken downstream readers that codespell missed due to apostrophe-transpose syntax).
   - efficency.m filename retained – the function name must match the file; the sibling efficiency.m (newer version) was correctly cleaned up.
   - groupt variable in effect_size_map.m kept (with inline codespell:ignore) – example workflow variable, renaming would silently break user scripts.
   - Two docs cells that intentionally cite source-code typos (continguous, classfy) restored with inline pragmas.
   - Pipe-column alignment in canlab_glm_dsgninfo.txt re-spaced after word-length changes (because/performed/with/details).
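
Since codespell's ignore-regex option takes a Python regular expression, a candidate pattern can be tried on sample lines before it goes into .codespellrc. The sketch below is illustrative only: the pattern, the helper name strip_ignored, and the sample lines are assumptions, not the values used in this PR, and blanking the matched spans only approximates how codespell skips words covered by a match.

```python
import re

# Hypothetical pattern (NOT the one from this PR's .codespellrc):
# skip http(s) URLs and long base64-looking runs.
IGNORE_REGEX = re.compile(r"https?://\S+|[A-Za-z0-9+/]{60,}={0,2}")


def strip_ignored(line: str) -> str:
    """Blank out ignored spans, leaving roughly what a spell checker would still see."""
    return IGNORE_REGEX.sub(" ", line)


# Made-up sample lines in the style of MATLAB comments.
url_line = "% see https://example.org/saggital_view for details"
b64_line = "% data:image/png;base64," + "iVBORw0KGgoAAAANSUhEUgAA" * 4
plain_line = "% dispaly the saggital slice"

for sample in (url_line, b64_line, plain_line):
    print(repr(strip_ignored(sample)))
```

The typo inside the URL and the base64 payload disappear from the checked text, while the typos in the plain comment remain visible and would still be reported.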

One real bug found as a side effect

Visualization_functions/tor_wedge_plot.m line 468 had handels(i).texth(1) = ... (typo of the function's return variable handles). The text-handle assignment was being silently discarded into a phantom variable; codespell -w corrected it.
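
MATLAB creates a variable on first assignment, which is why the misspelled name raised no error. A rough analogue in Python (not MATLAB, and with hypothetical function names and values) shows the same failure mode: the value lands in a brand-new local, and the structure that is actually returned never sees it.

```python
def make_handles_buggy():
    """Mimics the bug: the assignment goes to a misspelled, brand-new variable."""
    handles = {"texth": None}
    # Typo: this silently creates a separate local named `handels`.
    handels = {"texth": "text-handle"}
    return handles  # the text handle was never stored here


def make_handles_fixed():
    """The corrected version assigns into the variable that is returned."""
    handles = {"texth": None}
    handles["texth"] = "text-handle"
    return handles


print(make_handles_buggy())
print(make_handles_fixed())
```

In both languages nothing fails at run time; the only symptom is that callers receive a result without the text handle.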

Most-frequent fixes

typo	count	replacement
saggital	40	sagittal
expermental	28	experimental
dispaly	22	display
atleast	17	at least
aproach	15	approach
concensus	12	consensus
initalize	12	initialize
Nneeded handling, fucntion etc.	9 each	function, etc.
signficant	8	significant
homogenous	7	homogeneous
accomodate	7	accommodate
re-used	7	reused
analagous	6	analogous
efficency (in `efficiency.m` only)	6	efficiency
~230 other unique typos	1–5 each	–

Verification

$ uvx codespell
# (no output – clean)

The CI workflow runs this same command on every push to master and on PRs.

Generated with Claude Code and love to typo-free code.

jcf2 · 2024-09-19T15:39:27Z

In the 'sagittal' case, the change could be extended slightly:

case {'sagg', 'sagittal', 'saggital'}

really needs the first as well as the 2nd/3rd to be fixed?

yarikoptic · 2025-09-13T20:52:17Z

so, overall, would you like me to polish it up? -- I could push more fixes (as you can see there is still a good number)...

yarikoptic · 2025-09-13T20:54:27Z

                whcol = 3;

-            case {'sagg', 'sagittal', 'saggital'}
+            case {'sagg', 'sagittal', 'sagittal'}


@jcf2, following your comment in the main thread (sorry -- missed it): do you want to expand this to contain all of them? Then we would need to ignore the line so codespell does not fix them up, so something like

Suggested change

-            case {'sagg', 'sagittal', 'sagittal'}
+            case {'sagg', 'sagittal', 'sagittal', 'saggital'} # codespell:ignore

?

jcf2 · 2025-09-15

I'm not sure exactly what I meant there, but looking at that case, 'sagg' and 'saggital' are misspellings, though possibly intentionally handled in this case statement? So this may be less a "typo" issue than a design one.

The original has case {'sagg', 'sagittal', 'saggital'}. That handles both correct and misspelled sagittal, but only misspelled sag. So I think the correct line might be

case {'sag', 'sagg', 'sagittal', 'saggital'} # codespell:ignore

IF the intention is indeed to gently support misspelled versions?...

yarikoptic · 2025-09-15

FWIW git history is not useful here

❯ git blame CanlabCore/@fmridisplay/addpoints.m | grep case.*sagg
d7bb2736 fmridisplay/@fmridisplay/addpoints.m (Zeb Delk 2014-08-12 15:45:56 -0600 181) case {'sagg', 'sagittal', 'saggital'}
❯ git show d7bb2736 | head -n 20
commit d7bb27368a04c20a3bb62672ed5854bb698c4aab
Author: Zeb Delk <elizabeth.delk@colorado.edu>
Date: Tue Aug 12 15:45:56 2014 -0600

    Import the SCN Core Support.

diff --git a/@canlab_dataset/add_var.m b/@canlab_dataset/add_var.m
new file mode 100644
index 0000000..1876c74
--- /dev/null
+++ b/@canlab_dataset/add_var.m
@@ -0,0 +1,57 @@
+% Not complete yet. Please edit me
+%
+% Function for adding a variable to a dataset in a systematic way.
+% - Checks IDs of subjects to make sure data is added in the correct order.
+% - Values for missing data are coded as missing values, as specified in dat.Description.Missing_Values
+% - Handles Subject_Level or Event_Level data
+
+varname = 'ValenceType';

But indeed, I would say the idea likely was to allow for human errors. I think that was a mistake, and one we should not amplify here: allowing errors in one place in the code creates the need to allow them in every other place where such mistakes could be made. So I would really advise against expanding, and would just keep the correct value here. Looking back at my own mistake of adding the correct value a second time, I think we just need

Suggested change

-            case {'sagg', 'sagittal', 'sagittal'}
+            case {'sagg', 'sagittal', 'sagittal'} # codespell:ignore

and be done here!

jcf2 · 2025-09-15

I agree with that reasoning, but I don't understand the change line. Why two copies of 'sagittal'? Why not the correct spelling of 'sag'?

I would think it would be just case {'sag', 'sagittal'} to follow those principles (optionally with codespell:ignore but I'm guessing it cannot "correct" anything in this line erroneously so that part seems unnecessary, if not intentionally preserving bad spelling...?).
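
The two options debated in this thread can be compared side by side. This is a sketch in Python rather than MATLAB, and the table names, the resolve_orientation helper, and the error message are all hypothetical, not code from addpoints.m: a strict table accepts only correctly spelled names, while a tolerant table also accepts the legacy misspellings and therefore needs a codespell pragma.

```python
# Strict option: only correctly spelled names are accepted.
STRICT_ALIASES = {"sag": "sagittal", "sagittal": "sagittal"}

# Tolerant option: legacy misspellings are kept on purpose, so the
# spell checker has to be told to leave this line alone.
TOLERANT_ALIASES = {**STRICT_ALIASES, "sagg": "sagittal", "saggital": "sagittal"}  # codespell:ignore


def resolve_orientation(name: str, aliases: dict) -> str:
    """Map a user-supplied orientation name to its canonical spelling."""
    key = name.strip().lower()
    if key not in aliases:
        raise ValueError(f"unknown orientation: {name!r}")
    return aliases[key]


print(resolve_orientation("sag", STRICT_ALIASES))
print(resolve_orientation("saggital", TOLERANT_ALIASES))
```

With the strict table a misspelled name fails loudly at the call site instead of being accepted in one function and rejected in the next, which is the argument made above against expanding the list.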

yarikoptic · 2025-09-13T20:55:24Z

should I just ignore docs_sphinx_old ?

torwager · 2026-06-19T05:19:17Z

Thank you, @yarikoptic! Now there are substantial documentation changes and some updates made with claude's help. Let me know if you see other problems, or if it would be useful to run codespell regularly.

torwager · 2026-06-19T05:21:16Z

you could ignore docs_sphinx_old yes. @yarikoptic

GitHub Actions workflow runs codespell on every push to master and on pull requests targeting master. The workflow is pinned to a commit SHA for reproducibility and uses 'permissions: contents: read' for safety.

Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Configure codespell to skip vendored / data / build artifacts and to ignore short variable names, MATLAB built-in function names, proper names, and domain abbreviations that would otherwise be flagged. The ignore-regex skips URLs (so we don't "fix" typos in third-party links) and the base64-encoded image data embedded by MATLAB live-script export in docs/markdown_tutorials/*.m.

Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>

These typos have multiple plausible corrections (e.g. trough -> through or trough). Fixed manually after reading each occurrence's context:

- regoin -> region (@atlas/threshold.m: brain region splitting)
- interals -> intervals (@canlab_dataset/plot_var.m: 95% confidence intervals)
- converstion -> conversion (×2: @fmri_data/fmri_data.m, Model_building_tools/design_matrix.m)
- extacted -> extracted (×2: @fmri_data/fmri_data.m, @fmri_timeseries/fmri_timeseries.m)
- interation -> iteration (@fmri_glm_design_matrix/robustfit.m: algorithm iteration)
- exlude -> exclude (@image_vector/image_similarity_plot.m: exclude empty data)
- followd -> followed (@image_vector/orthviews.m: followed by integer)
- pring -> print (Cifti_plotting/plot_surface_map.m: figure to print)
- numbe -> number (GLM_Batch_tools/spm_splines.README)
- nexted -> nested (HRF_Est_Toolbox4/.../testFMinSearchNew.m: nested function)
- fiels -> fields (Image_computation_tools/image_eval_function_multisubj.m: .n fields)
- shoul -> should (Image_computation_tools/image_histogram1d.m: trailing comment fragment)
- inbetween -> in between (×2: Model_building_tools/design_matrix.m)
- specificy -> specify (OptimizeDesign11/.../README_gst_notes)
- propotions -> proportions (OptimizeDesign11/core_functions/calcFreqDev.m)
- defauls -> defaults (OptimizeDesign11/other_functions/designsim_gui_script.m)
- cicles -> circles (Statistics_tools/.../xval_SVM_BKedit25Dec.m: plot legend)
- foor -> for (Statistics_tools/cluster_confusion_matrix.m)
- decription -> description (×5 in Visualization_functions/riverplot/*.m)
- psace -> space (canlab_canonical_brains/.../spherical_icosahedral_interpolation.m)
- covert -> convert (diagnostics/effect_size_map.m: convert to power estimate)
- coverge -> converge (diagnostics/fmri_mask_thresh_canlab.m: mixture models converge)
- sensivity -> sensitivity (web_repository_tools/...: sensitivity, specificity, PPV)
- warpper -> wrapper (×2 CBIG_RF_*.sh: wrapper script)

Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Single-suggestion typos applied automatically by `codespell -w`, plus a few targeted reverts for false positives the auto-fix had introduced:

* `desc.prctiles` (struct field, documented public API) reverted from the `percentiles` "fix" in @image_vector/descriptives.m, Image_computation_tools/mean_image.m, and the docs table.
* `efficency.m` function name kept (must match file name); sibling `efficiency.m` correctly renamed to its filename.
* `groupt` variable in diagnostics/effect_size_map.m kept (script-level example variable); inline `codespell:ignore` annotation added.
* `continguous` and `classfy` in docs intentionally retained as they reference real source identifiers; inline pragmas added.
* Pipe-column alignment in canlab_glm_dsgninfo.txt restored after word-length changes (because, performed, with, details).

The dominant fixes by frequency: saggital -> sagittal (40 occurrences), expermental -> experimental (28), dispaly -> display (22), atleast -> at least (17, all in comments), aproach -> approach (15), concensus -> consensus (12), initalize -> initialize (12), fucntion -> function (9), signficant -> significant (8), homogenous -> homogeneous (7), accomodate -> accommodate (7), re-used -> reused (7), efficency -> efficiency (in efficiency.m only), analagous -> analogous (6), and ~230 unique single-suggestion fixes across docs, comments, strings, and code.

Also fixes a long-standing latent bug in Visualization_functions/tor_wedge_plot.m where `handels(i).texth(1) = ...` was a typo of the return variable `handles` -- the text-handle assignment was being discarded into a phantom variable.

Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>

torwager approved these changes Sep 13, 2025

torwager closed this Jun 19, 2026

torwager reopened this Jun 19, 2026

yarikoptic and others added 4 commits June 19, 2026 08:59

yarikoptic force-pushed the enh-codespell branch from 86207f5 to c897fd9 Compare June 21, 2026 19:48

yarikoptic wants to merge 4 commits into canlab:master from yarikoptic:enh-codespell
