Extract the innerText from a snippet of HTML
npm install innertext
Pass it a string containing some HTML.
var innertext = require('innertext');
var text = innertext('<h1>Heading text <em>with</em> <b>some</b> <u>markup</u></h1>');
console.log(text); // 'Heading text with some markup'
The current implementation favors speed and simplicity over other considerations like perfect web browser compatibility. For instance:
- malformed HTML (e.g., un-encoded <, &, > characters, etc…) will generally break the text extraction process
- whitespace around HTML tag/element boundaries gets collapsed into a single space, whereas browsers will typically preserve newlines
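If you assemble the HTML yourself from untrusted text, one way to avoid the un-encoded character problem is to encode those characters before building the markup. The snippet below is a minimal sketch of that idea; `escapeHtml` is a hypothetical helper written for this example and is not part of innertext.

```javascript
// Hypothetical helper (not provided by innertext): encode the three
// characters that would otherwise leave the markup malformed.
function escapeHtml(raw) {
  return String(raw)
    .replace(/&/g, '&amp;') // ampersands first, so the entities added below are not re-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

var untrusted = 'if a < b && b > c';
var html = '<p>' + escapeHtml(untrusted) + '</p>';
console.log(html); // '<p>if a &lt; b &amp;&amp; b &gt; c</p>'
```

The resulting string contains no stray <, & or > characters, so it should be a safer input for text extraction than the raw concatenation would be.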
So if you trust the incoming HTML, things will typically be OK, but don't use this as the basis for creating a browser or anything.
npm install
npm test
ISC