A lightweight, DOM-like HTML and CSS parser for Node.js that creates a simple tree structure (Simple Object Model - SOM) for easy manipulation and serialization back to HTML/CSS strings. 21kb minified, zero dependencies.
- HTML Parsing: Parse HTML into a tree structure with proper handling of nested elements
- CSS Parsing: Parse inline <style> tags with support for modern CSS features
- DOM Manipulation: Insert, move, replace, and remove nodes
- Query Selectors: Find elements using CSS-like selectors
- Preserves Formatting: Maintains whitespace and indentation when manipulating nodes
- No Dependencies: Pure JavaScript implementation
Add to your project via pnpm or npm:
pnpm install simple-html-parser
# or
npm install simple-html-parser
Or include manually by downloading the minified ESM dist/simple-html-parser.min.js file.
import { SimpleHtmlParser } from 'simple-html-parser';
const parser = new SimpleHtmlParser();
const dom = parser.parse('<div id="app"><h1>Hello World</h1></div>');
// Query elements
const app = dom.querySelector('#app');
const heading = dom.querySelector('h1');
// Manipulate
heading.setAttribute('class', 'title');
// Output
console.log(dom.toHtml());
// <div id="app"><h1 class="title">Hello World</h1></div>
Parses an HTML string into a SOM tree structure.
const parser = new SimpleHtmlParser();
const dom = parser.parse('<div>Hello</div>');
Returns the parser version.
The core building block of the SOM tree. Every element, text node, and comment is a Node.
- type: 'root' | 'tag-open' | 'tag-close' | 'text' | 'comment'
- name: Tag name (for element nodes)
- attributes: Object containing element attributes
- children: Array of child nodes
- parent: Reference to parent node
- content: Text content (for text/comment nodes)
Find the first element matching a CSS selector.
const div = dom.querySelector('div');
const byId = dom.querySelector('#myId');
const byClass = dom.querySelector('.myClass');
const complex = dom.querySelector('div.container > p');
Supported selectors:
- Tag names: div, p, span
- IDs: #myId
- Classes: .myClass, .class1.class2
- Attributes: [data-id], [data-id="value"]
- Descendant: div p (p inside div)
- Pseudo-classes: :not(selector)
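
To make the tag, ID, and class forms in the list above concrete, here is a minimal sketch of matching one compound selector against a node-like object. The `matchesCompound` helper and the plain-object node are hypothetical and are not part of simple-html-parser; attribute selectors, combinators, and `:not()` are deliberately left out, so this is only a picture of the idea, not how the library resolves selectors.

```javascript
// Hypothetical helper for illustration; not part of simple-html-parser.
// Matches one compound selector (tag, #id, .class) against a node-like object.
function matchesCompound(node, selector) {
  // Split "div#app.active" into ["div", "#app", ".active"].
  const parts = selector.match(/^[a-zA-Z][\w-]*|[#.][\w-]+/g) || [];
  if (parts.length === 0 || parts.join('') !== selector) {
    throw new Error(`Unsupported selector in this sketch: ${selector}`);
  }
  const attributes = node.attributes || {};
  const classes = (attributes.class || '').split(/\s+/).filter(Boolean);
  return parts.every((part) => {
    if (part.startsWith('#')) return attributes.id === part.slice(1);
    if (part.startsWith('.')) return classes.includes(part.slice(1));
    return node.name === part; // plain tag name
  });
}

// Assumed node shape, borrowing the documented name/attributes properties.
const exampleNode = {
  type: 'tag-open',
  name: 'div',
  attributes: { id: 'app', class: 'container active' },
};

console.log(matchesCompound(exampleNode, 'div'));               // true
console.log(matchesCompound(exampleNode, '#app'));              // true
console.log(matchesCompound(exampleNode, '.container.active')); // true
console.log(matchesCompound(exampleNode, 'p.container'));       // false
```

Every part of a compound selector has to match the same node, which is why `.class1.class2` is stricter than `.class1` alone.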
Find all elements matching a CSS selector.
const allDivs = dom.querySelectorAll('div');
const allLinks = dom.querySelectorAll('a[href]');
Find all nodes with a specific attribute.
const withDataId = dom.findAllByAttr('data-id');
Add child nodes to this node.
const div = dom.querySelector('div');
const p = new Node('tag-open', 'p', {}, div);
div.appendChild(p);
Insert nodes before this node (outside the element).
Note: target.insertBefore(node) inserts node before target.
const b = dom.querySelector('#B');
const a = dom.querySelector('#A');
a.insertBefore(b); // Inserts B before A
Insert nodes after this node (outside the element).
Note: target.insertAfter(node) inserts node after target.
const a = dom.querySelector('#A');
const b = dom.querySelector('#B');
b.insertAfter(a); // Inserts A after B
Insert an HTML string at a specific position relative to this element. Mimics the browser's insertAdjacentHTML API.
Parameters:
- position (string): One of:
  - 'beforebegin': Before the element (outside)
  - 'afterbegin': At the start of element's children (inside)
  - 'beforeend': At the end of element's children (inside)
  - 'afterend': After the element (outside)
- html (string): HTML string to parse and insert
Returns: Node - This node for chaining
const container = dom.querySelector('#container');
// Insert before element
container.insertAdjacentHTML('beforebegin', '<p>Before</p>');
// Insert at start of children
container.insertAdjacentHTML('afterbegin', '<span>Start</span>');
// Insert at end of children
container.insertAdjacentHTML('beforeend', '<span>End</span>');
// Insert after element
container.insertAdjacentHTML('afterend', '<p>After</p>');
Note: The parser instance used to create the DOM tree is automatically used for parsing the HTML string, preserving parser configuration (e.g., special tags).
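
The four positions can be pictured with plain arrays. The sketch below is a hypothetical `insertAdjacent` helper working on node-like objects, not the library's implementation; it assumes the closing tag is stored as a sibling of the opening tag, so 'afterend' has to land after that closing-tag sibling.

```javascript
// Hypothetical helper for illustration; not part of simple-html-parser.
// `siblings` is the parent's children array, `element` is an opening-tag node in it.
function insertAdjacent(siblings, element, position, node) {
  const openIndex = siblings.indexOf(element);
  if (openIndex === -1) throw new Error('element is not in siblings');
  switch (position) {
    case 'beforebegin': // outside, before the opening tag
      siblings.splice(openIndex, 0, node);
      break;
    case 'afterbegin': // inside, first child
      element.children.unshift(node);
      break;
    case 'beforeend': // inside, last child
      element.children.push(node);
      break;
    case 'afterend': { // outside, after the matching closing tag (if any)
      const closeIndex = siblings.findIndex(
        (n, i) => i > openIndex && n.type === 'tag-close' && n.name === element.name
      );
      const last = closeIndex === -1 ? openIndex : closeIndex;
      siblings.splice(last + 1, 0, node);
      break;
    }
    default:
      throw new Error(`Unknown position: ${position}`);
  }
}

const container = { type: 'tag-open', name: 'div', attributes: { id: 'container' }, children: [] };
const siblings = [container, { type: 'tag-close', name: 'div', children: [] }];
const label = (content) => ({ type: 'text', content, children: [] });

insertAdjacent(siblings, container, 'beforebegin', label('before'));
insertAdjacent(siblings, container, 'afterbegin', label('start'));
insertAdjacent(siblings, container, 'beforeend', label('end'));
insertAdjacent(siblings, container, 'afterend', label('after'));

console.log(siblings.map((n) => n.content || n.type));
// [ 'before', 'tag-open', 'tag-close', 'after' ]
console.log(container.children.map((n) => n.content));
// [ 'start', 'end' ]
```

The two "outside" positions change the parent's array, while the two "inside" positions change the element's own children.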
Replace this node with other nodes.
const old = dom.querySelector('#old');
const newNode = dom.querySelector('#new');
old.replaceWith(newNode); // Removes old, replaces with new
Remove this node from the tree. Automatically removes matching closing tags.
const div = dom.querySelector('div');
div.remove();
Get an attribute value.
const href = link.getAttribute('href');
Set an attribute value.
div.setAttribute('class', 'container');
Remove an attribute.
div.removeAttribute('class');
Append to an attribute value.
div.updateAttribute('class', 'active'); // class="container active"
CSS methods are available when parsing <style> tags.
Find at-rules (@media, @keyframes, @supports, etc.) in the CSS tree.
// Find all @media rules
const mediaRules = style.cssFindAtRules('media');
// Find all at-rules
const allAtRules = style.cssFindAtRules();
Find CSS rules matching a selector.
Options:
- includeCompound (default: true) - Include compound selectors like .card.active
- shallow (default: false) - Exclude nested children and descendant selectors
// Find all .card rules (includes .card.active)
const cardRules = style.cssFindRules('.card');
// Find only exact .card rules
const exactCard = style.cssFindRules('.card', { includeCompound: false });
// Find #wrapper rules, excluding nested rules
const wrapperOnly = style.cssFindRules('#wrapper', { shallow: true });
Find a specific CSS variable (custom property) by name.
// Find --primary-color
const primary = style.cssFindVariable('--primary-color');
// Find variable without -- prefix
const spacing = style.cssFindVariable('spacing');
Find all CSS variables with their scope paths.
Options:
- includeRoot (default: false) - Include 'root' in scope path for root-level variables
const vars = style.cssFindVariables();
// [{name: '--primary', value: '#007bff', scope: ':root', rule: Node}]
Convert CSS rules to a formatted CSS string.
Behavior:
- Called with nodes: Converts those specific nodes
- Called on HTML node: Finds and combines all <style> tags
- Called on CSS/style node: Converts this node's CSS tree
Options:
- includeComments (default: false) - Include CSS comments
- includeNestedRules (default: true) - Include nested rules within parent rules
- flattenNested (default: false) - Flatten nested rules to separate top-level rules with full selectors
- includeBraces (default: true) - Include { } around declarations
- includeSelector (default: true) - Include the selector
- combineDeclarations (default: true) - Merge declarations from multiple rules
- singleLine (default: false) - Output on single line
- indent (default: 0) - Indentation level in spaces
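
As a rough picture of what flattening nested rules into top-level rules with full selectors means (the flattenNested option above), here is a small standalone sketch. The rule shape (`selector`, `declarations`, `children`) and the `flattenRule` and `ruleToString` helpers are assumptions made for illustration; they are not the library's CSS node format, and selector lists with commas are not handled.

```javascript
// Hypothetical rule shape for illustration only; not the library's CSS nodes.
const card = {
  selector: '.card',
  declarations: { background: 'white', padding: '1rem' },
  children: [
    { selector: '.title', declarations: { 'font-weight': 'bold' }, children: [] },
    { selector: '&:hover', declarations: { background: '#eee' }, children: [] },
  ],
};

// Turn a rule and its nested rules into a flat list with full selectors.
function flattenRule(rule, parentSelector = '') {
  let fullSelector = rule.selector;
  if (parentSelector) {
    fullSelector = rule.selector.includes('&')
      ? rule.selector.replace(/&/g, () => parentSelector) // "&" stands for the parent
      : `${parentSelector} ${rule.selector}`;             // otherwise a descendant
  }
  const flat = [{ selector: fullSelector, declarations: rule.declarations }];
  for (const child of rule.children) {
    flat.push(...flattenRule(child, fullSelector));
  }
  return flat;
}

// Format one flat rule on a single line.
function ruleToString({ selector, declarations }) {
  const body = Object.entries(declarations)
    .map(([prop, value]) => `${prop}: ${value};`)
    .join(' ');
  return `${selector} { ${body} }`;
}

const flattenedCss = flattenRule(card).map(ruleToString).join('\n');
console.log(flattenedCss);
// .card { background: white; padding: 1rem; }
// .card .title { font-weight: bold; }
// .card:hover { background: #eee; }
```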
// Get the style tag first; the calls below all use it
const style = dom.querySelector('style');
// Convert specific rules
const rules = style.cssFindRules('.card');
const cardCss = style.cssToString(rules, { includeNestedRules: false });
// Convert entire style tag
const flatCss = style.cssToString({ flattenNested: true });
// Combine all styles in document
const allCss = dom.cssToString();
// Just declarations
const declarations = style.cssToString(rules, {
  includeSelector: false,
  includeBraces: false
});
// "background: white; padding: 1rem;"
Convert the node tree back to an HTML string.
const html = dom.toHtml();
const htmlWithComments = dom.toHtml(true);
Alias for toHtml(true).
Nodes are iterable, allowing depth-first traversal:
for (const node of dom) {
  if (node.type === 'tag-open') {
    console.log(node.name);
  }
}
const table = dom.querySelector('table');
const rowA = dom.querySelector('#rowA');
const rowB = dom.querySelector('#rowB');
// Swap rows - insert B before A
rowA.insertBefore(rowB); // B now comes before A
// Method 1: Create nodes manually
const div = new Node('tag-open', 'div', { class: 'new' });
const text = new Node('text');
text.content = 'Hello';
div.appendChild(text);
const parent = dom.querySelector('#parent');
parent.appendChild(div);
// Method 2: Use insertAdjacentHTML (simpler for HTML strings)
const parent2 = dom.querySelector('#parent');
parent2.insertAdjacentHTML('beforeend', '<div class="new">Hello</div>');
const style = dom.querySelector('style');
// Get all CSS variables
const variables = style.cssFindVariables();
console.log(variables);
// [{ name: '--primary', value: '#007bff', scope: ':root', rule: Node }]
// Find specific variable
const primaryColor = style.cssFindVariable('--primary-color');
// Get .card rules (shallow - no nested)
const rules = style.cssFindRules('.card', { shallow: true });
// Convert to CSS string without nested rules
const css = style.cssToString(rules, { includeNestedRules: false });
// ".card { background: white; padding: 1rem; }"
The parser treats certain tags specially:
- Void elements (img, br, hr, input, etc.): No closing tag created
- Style tags: Contents parsed as CSS
- Script tags: Can be configured via the specialTags parameter
const parser = new SimpleHtmlParser(['script', 'custom-tag']);
The parser creates a tree where:
- Opening and closing tags are siblings in the parent's children array
- Element content is in the opening tag's children array
- Text nodes (including whitespace) are preserved
Example:
<div>
<p>Hello</p>
</div>
Becomes:
root
└─ <div>
├─ text "\n "
├─ <p>
│ └─ text "Hello"
├─ </p>
├─ text "\n"
└─ </div>
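
The structure above can be sketched with plain objects to show why keeping closing tags as nodes makes serialization a simple walk. Everything here is hypothetical (`textNode`, `openTag`, `closeTag`, `serialize`, and `walk` are illustration-only names, not the library's Node class), and it assumes each closing tag sits next to its opening tag in the parent's children array, with a two-space indent in the whitespace text node.

```javascript
// Hypothetical plain-object stand-ins for SOM nodes; not the library's Node class.
const textNode = (content) => ({ type: 'text', content, children: [] });
const openTag = (name, children = [], attributes = {}) =>
  ({ type: 'tag-open', name, attributes, children });
const closeTag = (name) => ({ type: 'tag-close', name, children: [] });

// Closing tags are siblings of their opening tags; content lives in the opening tag.
const root = {
  type: 'root',
  children: [
    openTag('div', [
      textNode('\n  '),
      openTag('p', [textNode('Hello')]),
      closeTag('p'),
      textNode('\n'),
    ]),
    closeTag('div'),
  ],
};

// Serialize by emitting each node, then its children, in document order.
function serialize(node) {
  const inner = node.children.map(serialize).join('');
  switch (node.type) {
    case 'text':
      return node.content;
    case 'tag-close':
      return `</${node.name}>`;
    case 'tag-open': {
      const attrs = Object.entries(node.attributes)
        .map(([key, value]) => ` ${key}="${value}"`)
        .join('');
      return `<${node.name}${attrs}>${inner}`;
    }
    default: // 'root'
      return inner;
  }
}

// Depth-first traversal, similar in spirit to iterating a node with for...of.
function* walk(node) {
  yield node;
  for (const child of node.children) yield* walk(child);
}

const tagNames = [...walk(root)]
  .filter((node) => node.type === 'tag-open')
  .map((node) => node.name);

console.log(JSON.stringify(serialize(root))); // "<div>\n  <p>Hello</p>\n</div>"
console.log(tagNames);                        // [ 'div', 'p' ]
```

Because whitespace is kept as ordinary text nodes, the serialized string reproduces the input formatting without any pretty-printing step.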
- Regex patterns are extracted to module-level constants for reuse
- Whitespace-only text nodes are only checked during manipulation, not parsing
- Methods use private helpers to avoid duplication
Commons Clause with MIT
Contributions welcome! Please ensure all tests pass and add tests for new features.
Christopher Keers - caboodle-tech