I parsed many different pages and tried to collect all links from them. I store every acquired URL in an array, but over time the process fails with a 'heap out of memory' error. I took a heap dump and found that your library returns sliced strings in some cases: the small strings I store stay linked to the large source strings, so those are never freed and memory eventually runs out. I used the 'getAttribute' function. I recommend adding a note to the docs so that users can avoid this situation in the future, or adding a function that returns a deep copy.
Code to reproduce (any HTML file with links will do):
const fs = require('fs');
const HTMLParser = require('node-html-parser');

(async () => {
  // Long-lived array: every stored href keeps its source page reachable.
  const array = [];
  setInterval(() => {
    fs.readFile('tmp.html', (err, buf) => {
      if (err) throw err;
      const lines = buf.toString();
      const root = HTMLParser.parse(lines);
      for (const elem of root.querySelectorAll('a')) {
        array.push(elem.getAttribute('href'));
      }
    });
  }, 2000);
})();
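A possible workaround on the caller's side, as a minimal sketch: copy each attribute value into a freshly allocated string before storing it. `detachString` is a hypothetical helper name, not part of node-html-parser, and the assumption that the returned value shares memory with the page text comes only from the heap dump described above. The example uses a plain substring to stand in for the parsed attribute.

```javascript
// Hypothetical helper (not part of node-html-parser): returns a copy of
// `value` that is built from its own bytes instead of the source string.
function detachString(value) {
  // Pass missing attributes (for example undefined) through unchanged.
  if (typeof value !== 'string') return value;
  // Encoding to UTF-8 bytes and decoding again allocates a new string.
  return Buffer.from(value, 'utf8').toString('utf8');
}

// Simulate a large page and a short substring cut out of it.
const page = '<a href="https://example.com/">x</a>' + ' '.repeat(1000000);
const href = page.slice(9, 29);

// The copy has the same content but no link to `page`.
const stored = detachString(href);
console.log(stored === href); // true (same characters)
```

In the reproduction above, this would mean calling `array.push(detachString(elem.getAttribute('href')))` instead of pushing the attribute value directly. The UTF-8 round trip is fine for URLs, but it would alter strings that contain unpaired surrogates.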