fix: preserve consecutive backslashes in attribute values #309

Merged
taoqf merged 1 commit into taoqf:main from spokodev:fix-consecutive-backslash-attributes
Jun 22, 2026
@spokodev

Problem

quoteAttribute serializes an attribute value with JSON.stringify (which doubles every backslash) and then tries to undo the doubling with .replace(/([^\\])\\/g, '$1'). That regex consumes a non-backslash character before each backslash it removes, and its matches cannot overlap, so it strips only the first backslash of each run and cannot collapse a run of consecutive backslashes. A run of two or more backslashes therefore comes back longer than it went in (two become three in the example below), and the value no longer round-trips:

const { parse } = require('node-html-parser'); // 8.0.2

const el = parse('<div></div>').firstChild;
el.setAttribute('path', 'C:\\Users\\me');      // value: C:\Users\me  (single)  -> OK
el.setAttribute('path', 'C:\\\\Users\\\\me');  // value: C:\\Users\\me (doubled)
el.toString();                                 // <div path="C:\\\Users\\\me">  (extra backslash)
parse(el.toString()).firstChild.getAttribute('path'); // 'C:\\\Users\\\me'  (corrupted)

The single-backslash case fixed in #306 works; consecutive backslashes (e.g. a JSON-escaped Windows path or a regex stored in an attribute) remained unhandled.
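
The mechanism can be reproduced without the library. The sketch below applies only the two string operations named above; stringifyThenCollapse is a hypothetical name for illustration, not the library's actual source, which may do more than this.

```javascript
// Hypothetical reconstruction of the two steps described in the Problem
// section; this is not node-html-parser's actual code.
function stringifyThenCollapse(value) {
  // JSON.stringify wraps the value in double quotes and doubles each backslash.
  const doubled = JSON.stringify(value);
  // Removes a backslash only when a non-backslash character precedes it,
  // so at most the first backslash of each run is removed.
  return doubled.replace(/([^\\])\\/g, '$1');
}

const single = 'C:\\Users';    // actual characters: C:\Users
const double = 'C:\\\\Users';  // actual characters: C:\\Users

console.log(stringifyThenCollapse(single)); // prints "C:\Users"   (one backslash, preserved)
console.log(stringifyThenCollapse(double)); // prints "C:\\\Users" (three backslashes, one too many)
```

A run of one backslash is doubled to two and cut back to one, which is why the single-backslash case works. A run of two is doubled to four and loses only one.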

Fix

Attribute values are literal text, so only the double quote needs escaping (as &quot;). Quote the value directly instead of round-tripping it through JSON.stringify. This preserves any number of backslashes and still escapes embedded quotes (the #62 behaviour).
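
A minimal sketch of the direct-quoting approach described above, assuming the value is already a string. quoteAttributeSketch is a hypothetical name; the merged implementation may differ in details such as how it handles missing values.

```javascript
// Hypothetical helper illustrating the approach in the Fix section;
// not the code merged in this pull request.
function quoteAttributeSketch(value) {
  // Escape only the double quote; every other character, including
  // backslashes, is emitted unchanged.
  return '"' + value.replace(/"/g, '&quot;') + '"';
}

// String.raw keeps the backslashes in the template exactly as typed.
const winPath = String.raw`C:\\Users\\me`; // actual characters: C:\\Users\\me

console.log(quoteAttributeSketch(winPath));    // prints "C:\\Users\\me"
console.log(quoteAttributeSketch('say "hi"')); // prints "say &quot;hi&quot;"
```

Because no backslash is ever added or removed, the number of backslashes in the serialized attribute always equals the number in the stored value.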

Test

Added a round-trip assertion for an attribute value with consecutive backslashes. It fails before the change and passes after; the full suite (263 passing) and lint are green.

quoteAttribute serialized attribute values with JSON.stringify (which
doubles every backslash) and then tried to undo the doubling with
`.replace(/([^\\])\\/g, '$1')`. That regex needs a non-backslash before
each backslash and matches non-overlapping, so it cannot collapse runs of
consecutive backslashes: a value such as "C:\\Users\\me" gained an extra
backslash on every serialization and no longer round-tripped through
setAttribute -> toString -> parse. (The single-backslash case fixed in
issue taoqf#306 worked; consecutive backslashes were the unhandled residual.)

Attribute values are literal text, so only the double quote needs escaping
(as &quot;). Quote the value directly, which preserves any number of
backslashes and still escapes embedded quotes (issue taoqf#62 behaviour).
@taoqf merged commit 6f42a4e into taoqf:main Jun 22, 2026
3 checks passed