Skip to content

rmraya/TypesXML

Repository files navigation

TypesXML

npm version npm license TypeScript

TypesXML is a TypeScript / Node.js XML library for parsing, validating, and processing XML documents. It provides both DOM and streaming (SAX) APIs, along with support for DTD and XML Schema 1.0 validation and XML Catalog resolution.

The library is implemented entirely in TypeScript (no native bindings) and is validated against the official W3C XML test suites for DTD and XML Schema.

Quick example

Parse an XML file and build a DOM document:

import { DOMBuilder, SAXParser } from "typesxml";

const handler = new DOMBuilder();
const parser = new SAXParser();

parser.setContentHandler(handler);
parser.parseFile("example.xml");

const document = handler.getDocument();
console.log(document.toString());

Features

Core capabilities for parsing, validation, and integration of XML documents:

Parsing

  • DOM builder (DOMBuilder) for in-memory document trees with lexical preservation
  • Streaming SAX parser with file, string, and Node.js stream support

Validation

  • Complete DTD parser/validator (including conditional sections and parameter entities)
  • XML Schema 1.0 validation with support for complex types
  • Strict XML 1.0 / 1.1 character validation

Integration

  • OASIS XML Catalog resolver for public/system identifiers
  • XML ↔ JSON conversion (lightweight and lossless modes)

Compliance

  • 100% W3C XML DTD test suite
  • 95.8% W3C XML Schema test suite

SAX Parser

SAXParser drives any ContentHandler implementation. A handler receives structured callbacks during parsing:

interface ContentHandler {
    initialize(): void;
    setCatalog(catalog: Catalog): void;
    startDocument(): void;
    endDocument(): void;
    xmlDeclaration(version: string, encoding: string, standalone: string | undefined): void;
    startElement(name: string, atts: XMLAttribute[]): void;
    endElement(name: string): void;
    internalSubset(declaration: string): void;
    characters(text: string): void;
    ignorableWhitespace(text: string): void;
    comment(text: string): void;
    processingInstruction(target: string, data: string): void;
    startCDATA(): void;
    endCDATA(): void;
    startDTD(name: string, publicId: string, systemId: string): void;
    endDTD(): void;
    skippedEntity(name: string): void;
    getGrammar(): Grammar | undefined;
    setGrammar(grammar: Grammar | undefined): void;
    getCurrentText(): string;
}

The built-in DOMBuilder implements this interface to provide DOM support out of the box.

Installation

npm install typesxml

Usage

import { DOMBuilder, SAXParser } from "typesxml";

const handler = new DOMBuilder();
const parser = new SAXParser();
parser.setContentHandler(handler);

// Parse from a file
parser.parseFile("example.xml");
const document = handler.getDocument();
console.log(document.toString());

// Parse from a string
parser.parseString("<root><child/></root>");

// Parse from a stream
// await parser.parseStream(fs.createReadStream("example.xml"));

To enable XML Catalog resolution or validation, configure the parser before invoking parse* methods:

parser.setCatalog(myCatalog);
parser.setValidating(true); // Turns on DTD and XML Schema validation

Documentation & Samples

  • Read the step-by-step TypesXML tutorial for guided workflows.
  • Use the JSON and XML Conversion Guide to translate between XML documents and JSON objects, with guidance on when to enable the metadata-preserving round-trip mode.
  • Explore the runnable examples under samples/ to see the code in action.

Benchmark

The following benchmark compares TypesXML with fast-xml-parser and tXml using the same input files and runtime environment. Each result is the best of three runs after a warmup pass. Throughput is calculated as file_size / duration.

This comparison focuses on parsing speed only. Feature sets and parsing models differ significantly between libraries.

Small to Medium Files

Size: 1.858 MB | Elements: 41,349
+-----------------+---------------+-------------------+---------+
| Parser          | Duration (ms) | Throughput (MB/s) | Success |
+-----------------+---------------+-------------------+---------+
| TypesXML        |     165.20 ms |        11.25 MB/s | yes     |
| fast-xml-parser |     154.41 ms |        12.03 MB/s | yes     |
| tXml            |      17.19 ms |       108.06 MB/s | yes     |
+-----------------+---------------+-------------------+---------+

Size: 63.215 MB | Elements: 817,216
+-----------------+---------------+-------------------+---------+
| Parser          | Duration (ms) | Throughput (MB/s) | Success |
+-----------------+---------------+-------------------+---------+
| TypesXML        |    5444.54 ms |        11.61 MB/s | yes     |
| fast-xml-parser |    4294.62 ms |        14.72 MB/s | yes     |
| tXml            |     555.80 ms |       113.74 MB/s | yes     |
+-----------------+---------------+-------------------+---------+

Size: 121.517 MB | Elements: 1,883,407
+-----------------+---------------+-------------------+---------+
| Parser          | Duration (ms) | Throughput (MB/s) | Success |
+-----------------+---------------+-------------------+---------+
| TypesXML        |    8530.47 ms |        14.25 MB/s | yes     |
| fast-xml-parser |    8615.05 ms |        14.11 MB/s | yes     |
| tXml            |    1169.80 ms |       103.88 MB/s | yes     |
+-----------------+---------------+-------------------+---------+

tXml achieves significantly higher throughput on smaller inputs because it parses from a fully loaded in-memory string and performs minimal processing.

Large Files (Streaming vs In-Memory)

Size: 574.672 MB | Elements: 7,853,048
+-----------------+---------------+-------------------+---------+
| Parser          | Duration (ms) | Throughput (MB/s) | Success |
+-----------------+---------------+-------------------+---------+
| TypesXML        |   57134.36 ms |        10.06 MB/s | yes     |
| fast-xml-parser |           n/a |               n/a | no      |
| tXml            |           n/a |               n/a | no      |
+-----------------+---------------+-------------------+---------+

Both fast-xml-parser and tXml fail on this input with: Error: Cannot create a string longer than 0x1fffffe8 characters

These parsers require loading the entire document into a single JavaScript string. Node.js imposes a maximum string size (~512 MB), which causes parsing to fail for large inputs.

TypesXML uses a streaming SAX pipeline and processes input in chunks, allowing it to handle arbitrarily large files without hitting this limitation.

Summary

  • tXml: Extremely fast for small to medium files, but limited by in-memory string size
  • fast-xml-parser: Competitive speed, but same memory limitation
  • TypesXML: Consistent performance and capable of processing very large files reliably

If your use case involves large XML documents or streaming pipelines, TypesXML provides predictable performance where in-memory parsers cannot operate.

W3C XML Test Suite

The repository includes code that runs the official W3C XML Conformance Test Suite for DTD and XML Schema grammars.

DTD

  1. Download the latest archive from the W3C XML Test Suite (e.g., xmlts20080827.zip).

  2. Extract the archive into ./tests/xmltest so the valid, invalid, and not-wf folders sit under that path.

  3. Install dependencies if needed: npm install.

  4. Run the suite:

    npm run testDtd

The script compiles the TypeScript sources and executes ts/tests/DTDTestSuite.ts, reporting any conformance failures.

XML Schema

TypesXML currently passes 95.8% of the W3C XML Schema Test Suite (2006 edition, ~40,000 tests), the only native TypeScript implementation of XML Schema 1.0 validated against the official W3C test suite.

  1. Download the latest archive from the XML Schema Version 1.0, 2nd Edition.

  2. Extract the archive into ./tests/ so the test cases are available under ./tests/xmlschema2006-11-06.

  3. Install dependencies if needed: npm install.

  4. Run the suite:

    npm run testXmlSchema

The script compiles the TypeScript sources and executes ts/tests/XmlSchemaTestSuite.ts, reporting any conformance failures.