Humble attempt to fix typos by adding codespell action, config etc. #53
yarikoptic wants to merge 4 commits into
In the 'sagittal' case, the change could be extended slightly:
really needs the first as well as the 2nd/3rd to be fixed?
so, overall, would you like me to polish it up? -- I could push more fixes (as you can see there is still a good number)...
  whcol = 3;
- case {'sagg', 'sagittal', 'saggital'}
+ case {'sagg', 'sagittal', 'sagittal'}
@jc2, following your comment in the main thread (sorry -- missed it), you want to expand this to contain all of them, and then we need to ignore the line so they are not fixed up, so something like
- case {'sagg', 'sagittal', 'sagittal'}
+ case {'sagg', 'sagittal', 'sagittal', 'saggital'} # codespell:ignore
?
I'm not sure exactly what I meant there, but looking at that case, 'sagg' and 'saggital' are misspellings, though possibly intentionally handled in this case statement? So this may be less a "typo" issue than a design one.
The original has case {'sagg', 'sagittal', 'saggital'}. That handles both correct and misspelled sagittal, but only misspelled sag. So I think the correct line might be
case {'sag', 'sagg', 'sagittal', 'saggital'} # codespell:ignore
IF the intention is indeed to gently support misspelled versions?...
FWIW git history is not useful here
❯ git blame CanlabCore/@fmridisplay/addpoints.m | grep case.*sagg
d7bb2736 fmridisplay/@fmridisplay/addpoints.m (Zeb Delk 2014-08-12 15:45:56 -0600 181) case {'sagg', 'sagittal', 'saggital'}
❯ git show d7bb2736 | head -n 20
commit d7bb27368a04c20a3bb62672ed5854bb698c4aab
Author: Zeb Delk <elizabeth.delk@colorado.edu>
Date: Tue Aug 12 15:45:56 2014 -0600
Import the SCN Core Support.
diff --git a/@canlab_dataset/add_var.m b/@canlab_dataset/add_var.m
new file mode 100644
index 0000000..1876c74
--- /dev/null
+++ b/@canlab_dataset/add_var.m
@@ -0,0 +1,57 @@
+% Not complete yet. Please edit me
+%
+% Function for adding a variable to a dataset in a systematic way.
+% - Checks IDs of subjects to make sure data is added in the correct order.
+% - Values for missing data are coded as missing values, as specified in dat.Description.Missing_Values
+% - Handles Subject_Level or Event_Level data
+
+varname = 'ValenceType';
but indeed I would say the idea likely was to allow for human errors, though I think that was a mistake we should not amplify here: allowing errors in one place in the code leads to the need to allow them in every place where such mistakes could be made. So I would really advise against expanding, and instead just add the correct value here... Looking back at my own mistake of adding the correct value a second time, I think we just need
- case {'sagg', 'sagittal', 'sagittal'}
+ case {'sagg', 'sagittal', 'sagittal'} # codespell:ignore
and be done here!
I agree with that reasoning, but I don't understand the change line. Why two copies of 'sagittal'? Why not the correct spelling of 'sag'?
I would think it would be just case {'sag', 'sagittal'} to follow those principles (optionally with codespell:ignore but I'm guessing it cannot "correct" anything in this line erroneously so that part seems unnecessary, if not intentionally preserving bad spelling...?).
should I just ignore docs_sphinx_old?
Thank you, @yarikoptic! Now there are substantial documentation changes and some updates using Claude's help. Let me know if you see other problems or whether it would be useful to run codespell regularly.
You could ignore docs_sphinx_old, yes. @yarikoptic
GitHub Actions workflow runs codespell on every push to master and on pull requests targeting master. The workflow is pinned to a commit SHA for reproducibility and uses 'permissions: contents: read' for safety. Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Configure codespell to skip vendored / data / build artifacts and to ignore short variable names, MATLAB built-in function names, proper names, and domain abbreviations that would otherwise be flagged. The ignore-regex skips URLs (so we don't "fix" typos in third-party links) and the base64-encoded image data embedded by MATLAB live-script export in docs/markdown_tutorials/*.m. Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
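To illustrate what an ignore-regex for URLs achieves, here is a minimal Python sketch. The pattern `URL_PATTERN` and the helpers `mask_ignored` and `words_to_check` are assumptions made up for this illustration; they are not the value used in this PR's `.codespellrc`, nor codespell's internal code. The idea is only that text matching the regex is taken out of consideration before any words are checked, so a "typo" inside a third-party link is never touched.

```python
from __future__ import annotations

import re

# Hypothetical pattern for illustration only; the PR's real ignore-regex
# is not shown in this thread.
URL_PATTERN = re.compile(r"https?://\S+")


def mask_ignored(line: str, pattern: re.Pattern = URL_PATTERN) -> str:
    """Replace every match of `pattern` with spaces of the same length,
    so a later word check never sees the matched text."""
    return pattern.sub(lambda m: " " * len(m.group(0)), line)


def words_to_check(line: str) -> list[str]:
    """Return the words a spell checker would still look at after masking."""
    return re.findall(r"[A-Za-z']+", mask_ignored(line))


if __name__ == "__main__":
    sample = "See https://example.org/saggital-view for the dispaly options"
    # 'saggital' inside the URL is masked; 'dispaly' in the prose is still visible.
    print(words_to_check(sample))
```

Masking with spaces rather than deleting the match keeps column offsets stable, which matters if a tool reports or rewrites positions within the line.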
These typos have multiple plausible corrections (e.g. trough -> through or trough). Fixed manually after reading each occurrence's context:
- regoin -> region (@atlas/threshold.m: brain region splitting)
- interals -> intervals (@canlab_dataset/plot_var.m: 95% confidence intervals)
- converstion -> conversion (×2: @fmri_data/fmri_data.m, Model_building_tools/design_matrix.m)
- extacted -> extracted (×2: @fmri_data/fmri_data.m, @fmri_timeseries/fmri_timeseries.m)
- interation -> iteration (@fmri_glm_design_matrix/robustfit.m: algorithm iteration)
- exlude -> exclude (@image_vector/image_similarity_plot.m: exclude empty data)
- followd -> followed (@image_vector/orthviews.m: followed by integer)
- pring -> print (Cifti_plotting/plot_surface_map.m: figure to print)
- numbe -> number (GLM_Batch_tools/spm_splines.README)
- nexted -> nested (HRF_Est_Toolbox4/.../testFMinSearchNew.m: nested function)
- fiels -> fields (Image_computation_tools/image_eval_function_multisubj.m: .n fields)
- shoul -> should (Image_computation_tools/image_histogram1d.m: trailing comment fragment)
- inbetween -> in between (×2: Model_building_tools/design_matrix.m)
- specificy -> specify (OptimizeDesign11/.../README_gst_notes)
- propotions -> proportions (OptimizeDesign11/core_functions/calcFreqDev.m)
- defauls -> defaults (OptimizeDesign11/other_functions/designsim_gui_script.m)
- cicles -> circles (Statistics_tools/.../xval_SVM_BKedit25Dec.m: plot legend)
- foor -> for (Statistics_tools/cluster_confusion_matrix.m)
- decription -> description (×5 in Visualization_functions/riverplot/*.m)
- psace -> space (canlab_canonical_brains/.../spherical_icosahedral_interpolation.m)
- covert -> convert (diagnostics/effect_size_map.m: convert to power estimate)
- coverge -> converge (diagnostics/fmri_mask_thresh_canlab.m: mixture models converge)
- sensivity -> sensitivity (web_repository_tools/...: sensitivity, specificity, PPV)
- warpper -> wrapper (×2 CBIG_RF_*.sh: wrapper script)
Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
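The split between single-suggestion typos (safe to auto-apply) and multi-suggestion typos (needing a human to read the context) can be sketched in Python as follows. This is a rough illustration under assumptions: the `typo->fix` / `typo->fix1, fix2,` entry style is how I understand codespell-style dictionaries to look, the sample entries are made up from typos mentioned in this thread, and `parse_entry` and `split_by_ambiguity` are hypothetical helpers, not part of codespell.

```python
from __future__ import annotations

# Illustrative entries only; assumed for this sketch rather than copied
# from any shipped dictionary file.
SAMPLE_ENTRIES = [
    "saggital->sagittal",
    "dispaly->display",
    "interation->iteration, interaction,",
]


def parse_entry(entry: str) -> tuple[str, list[str]]:
    """Split 'typo->fix1, fix2,' into the typo and its candidate fixes."""
    typo, _, fixes = entry.partition("->")
    candidates = [c.strip() for c in fixes.split(",") if c.strip()]
    return typo.strip(), candidates


def split_by_ambiguity(
    entries: list[str],
) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Return (auto, manual): typos with exactly one candidate fix go to
    `auto`; everything else (several candidates, or none) goes to `manual`."""
    auto: dict[str, str] = {}
    manual: dict[str, list[str]] = {}
    for entry in entries:
        typo, candidates = parse_entry(entry)
        if len(candidates) == 1:
            auto[typo] = candidates[0]
        else:
            manual[typo] = candidates
    return auto, manual


if __name__ == "__main__":
    auto, manual = split_by_ambiguity(SAMPLE_ENTRIES)
    print("auto-fixable:", auto)
    print("needs review:", manual)
```

Keeping the ambiguous cases in a separate bucket mirrors the workflow in this PR: the unambiguous ones were applied in bulk, while cases like interation (iteration vs. interaction) were decided by reading the surrounding comment.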
Single-suggestion typos applied automatically by `codespell -w`, plus a
few targeted reverts for false positives the auto-fix had introduced:
* `desc.prctiles` (struct field, documented public API) reverted from
the `percentiles` "fix" in @image_vector/descriptives.m,
Image_computation_tools/mean_image.m, and the docs table.
* `efficency.m` function name kept (must match file name); sibling
`efficiency.m` correctly renamed to its filename.
* `groupt` variable in diagnostics/effect_size_map.m kept (script-level
example variable); inline `codespell:ignore` annotation added.
* `continguous` and `classfy` in docs intentionally retained as they
reference real source identifiers; inline pragmas added.
* Pipe-column alignment in canlab_glm_dsgninfo.txt restored after
word-length changes (because, performed, with, details).
The dominant fixes by frequency: saggital -> sagittal (40 occurrences),
expermental -> experimental (28), dispaly -> display (22), atleast -> at
least (17, all in comments), aproach -> approach (15), concensus ->
consensus (12), initalize -> initialize (12), fucntion -> function (9),
signficant -> significant (8), homogenous -> homogeneous (7), accomodate
-> accommodate (7), re-used -> reused (7), efficency -> efficiency (in
efficiency.m only), analagous -> analogous (6), and ~230 unique
single-suggestion fixes across docs, comments, strings, and code.
Also fixes a long-standing latent bug in
Visualization_functions/tor_wedge_plot.m where `handels(i).texth(1) =
...` was a typo of the return variable `handles` -- the text-handle
assignment was being discarded into a phantom variable.
Co-Authored-By: Claude Code 2.1.138 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
86207f5 to
c897fd9
original description: but the number is way too many. If someone could help and push more skips to add -- would be great.
adjusted one -- mostly redone with my codespell claude skill -- since 260 files changed, might need some quality time to click through all them marking "viewed -- legit" ;):
Adds codespell spell-checking infrastructure and fixes all existing typos it detects (~250 unique typos across ~245 files), so the next typo to land will be caught in CI rather than after the fact.
I've introduced this exact pattern to over a hundred projects with positive feedback (see improveit-dashboard's notes for context). The GitHub Actions workflow has `permissions: contents: read` only.
There are about 25 prior typo-related commits in `master`, so this is a recurring issue worth automating.
What's in the PR
Infrastructure (4 commits)
- `.github/workflows/codespell.yml`, pinned to a commit SHA for `actions-codespell@v2.2`, runs on push and on PRs targeting `master`.
- `.codespellrc` with:
  - `External/` (vendored), `docs_sphinx_old/`, etc.
  - `ignore-regex` for URLs and for base64-encoded image data embedded by MATLAB live-script export in `docs/markdown_tutorials/*.m`.
  - `ignore-words-list` for MATLAB function names (`ttest`, `assignin`, `evalin`), variable-name conventions (`indx`, `alph`, `te`, `selt`, etc.), domain proper names (Sepulcre, Shepard), and a few struct-field / file names that are public API (`prctiles`, `efficency`).
- `interation -> iteration` in an algorithm-convergence comment, not `interaction`).
- `codespell -w`, then I reviewed the diff and reverted the changes that would have broken things:
  - `desc.prctiles` is a documented struct field of `descriptives()` output – kept (would have broken downstream readers that codespell missed due to apostrophe-transpose syntax).
  - `efficency.m` filename retained – the function name must match the file; the sibling `efficiency.m` (newer version) was correctly cleaned up.
  - `groupt` variable in `effect_size_map.m` kept (with inline `codespell:ignore`) – example workflow variable, renaming would silently break user scripts.
  - `continguous`, `classfy`) restored with inline pragmas.
  - `canlab_glm_dsgninfo.txt` re-spaced after word-length changes (because/performed/with/details).
One real bug found as a side effect
`Visualization_functions/tor_wedge_plot.m` line 468 had `handels(i).texth(1) = ...` (typo of the function's return variable `handles`). The text-handle assignment was being silently discarded into a phantom variable; `codespell -w` corrected it.
Most-frequent fixes
efficiency.m only)
Verification
$ uvx codespell # (no output – clean)
The CI workflow runs this same command on every push to `master` and on PRs.
Generated with Claude Code and love to typo-free code.