A fast, lightweight HTML formatter written in Zig that adds proper indentation and line breaks to HTML content. Similar to WebStorm's "Reformat Code" functionality, this tool transforms minified or poorly formatted HTML into clean, readable code.
- Smart Formatting: Intelligently handles block vs inline elements
- Preserves Content: Keeps formatting intact inside <script>, <style>, and <pre> tags
- Configurable Indentation: Customize indentation size (default: 2 spaces)
- Fast Processing: Single-pass algorithm for efficient formatting
- Flexible I/O: Supports files, stdin/stdout, and pipe operations
- Comment Handling: Properly formats HTML comments with correct indentation
- Memory Efficient: Manages memory explicitly through Zig's GeneralPurposeAllocator, with proper cleanup
- Zig (version 0.11.0 or later)
# Clone the repository
git clone <repository-url>
cd format_html
# Build the executable
zig build
# The binary will be created at zig-out/bin/format_html
# Debug build (default)
zig build
# Optimized release build
zig build -Doptimize=ReleaseFast
# Run without installing
zig build run -- [args]
format_html [OPTIONS] [INPUT_FILE]
Options:
  -h, --help          Show help message
  -v, --version       Show version information
  -i, --input FILE    Input HTML file (default: stdin)
  -o, --output FILE   Output file (default: stdout)
  --stdin             Force reading from stdin
  --stdout            Force writing to stdout
  --verbose           Show processing statistics
  --indent N          Set indentation size (default: 2)
# Format a file and save to another file
format_html input.html -o output.html
# Format using pipes
cat minified.html | format_html > formatted.html
# Custom indentation (4 spaces)
format_html --indent 4 input.html -o output.html
# Show processing statistics
format_html --verbose input.html -o output.html
# Quick test with zig run
zig run src/main.zig -- --stdin
Before formatting:
<div class="product"><h1>Title</h1><p>Description text.</p><ul><li>Item 1</li><li>Item 2</li></ul></div>
After formatting:
<div class="product">
  <h1>Title</h1>
  <p>Description text.</p>
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>
</div>
The formatter uses a single-pass algorithm with a state machine approach:
- Tag Recognition: Identifies opening, closing, self-closing, and comment tags
- Smart Indentation: Tracks nesting depth and adds appropriate indentation
- Element Classification: Distinguishes between block and inline elements
- Content Preservation: Maintains original formatting in special tags like <script>, <style>, and <pre>
- Whitespace Normalization: Cleans up excessive whitespace while preserving meaningful spaces
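The tag recognition and depth tracking steps above can be sketched roughly as follows. This is a minimal illustration, not the code in src/html_formatter.zig: the names TagKind, tagKind, and maxDepth are hypothetical, and the sketch assumes that attribute values never contain a '>' character and that void elements are written with a trailing "/>".

```zig
const std = @import("std");

/// Hypothetical token kinds; the names are illustrative, not the project's.
const TagKind = enum { opening, closing, self_closing, comment };

/// Classifies one complete tag such as "<div>", "</div>", "<br/>" or "<!-- x -->".
fn tagKind(tag: []const u8) TagKind {
    if (std.mem.startsWith(u8, tag, "<!--")) return .comment;
    if (std.mem.startsWith(u8, tag, "</")) return .closing;
    if (std.mem.endsWith(u8, tag, "/>")) return .self_closing;
    return .opening;
}

/// Walks the input once and returns the deepest nesting level reached.
fn maxDepth(html: []const u8) usize {
    var depth: usize = 0;
    var deepest: usize = 0;
    var i: usize = 0;
    while (i < html.len) {
        if (html[i] != '<') {
            i += 1;
            continue;
        }
        // Comments end at "-->", every other tag at the next '>'.
        const is_comment = std.mem.startsWith(u8, html[i..], "<!--");
        const terminator: []const u8 = if (is_comment) "-->" else ">";
        const rel = std.mem.indexOf(u8, html[i..], terminator) orelse break;
        const end = i + rel + terminator.len;
        switch (tagKind(html[i..end])) {
            .opening => {
                depth += 1;
                if (depth > deepest) deepest = depth;
            },
            .closing => {
                if (depth > 0) depth -= 1;
            },
            .self_closing, .comment => {},
        }
        i = end;
    }
    return deepest;
}

test "depth tracking" {
    try std.testing.expectEqual(TagKind.closing, tagKind("</ul>"));
    try std.testing.expectEqual(@as(usize, 3), maxDepth("<div><ul><li>a</li></ul></div>"));
}
```

A real formatter would emit depth times the indent size in spaces before each block-level tag instead of only recording the maximum, but the single forward scan is the same.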
- Block Elements: div, p, h1-h6, ul, li, table, tr, td, etc.
- Inline Elements: span, strong, em, a, img, button, etc.
- Self-Closing: br, img, input, meta, etc.
- Special Tags: script, style, pre (content preserved as-is)
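One way to express this classification is a few lookup tables checked in a fixed order. The sketch below is an assumption about how such a lookup could look, not the project's actual code: ElementKind, classify, and the table names are hypothetical, the tables contain only the tags listed above, and unknown tags are treated as inline. Because img appears in both the inline and self-closing lists, the sketch checks the self-closing table first.

```zig
const std = @import("std");

/// Hypothetical categories mirroring the list above.
const ElementKind = enum { block, inline_element, self_closing, preserved };

const preserved_tags = [_][]const u8{ "script", "style", "pre" };
const self_closing_tags = [_][]const u8{ "br", "img", "input", "meta" };
const block_tags = [_][]const u8{
    "div", "p",  "h1", "h2",    "h3", "h4", "h5",
    "h6",  "ul", "li", "table", "tr", "td",
};

/// Case-insensitive membership test over a small tag table.
fn contains(list: []const []const u8, name: []const u8) bool {
    for (list) |candidate| {
        if (std.ascii.eqlIgnoreCase(candidate, name)) return true;
    }
    return false;
}

/// Maps a bare tag name (no angle brackets) to its category.
fn classify(name: []const u8) ElementKind {
    if (contains(&preserved_tags, name)) return .preserved;
    if (contains(&self_closing_tags, name)) return .self_closing;
    if (contains(&block_tags, name)) return .block;
    // Assumption: anything not listed is formatted as an inline element.
    return .inline_element;
}

test "classification" {
    try std.testing.expectEqual(ElementKind.block, classify("DIV"));
    try std.testing.expectEqual(ElementKind.preserved, classify("pre"));
}
```

Linear scans are adequate for tables this small; a larger tag set might justify a comptime-built map instead.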
format_html/
├── src/
│ ├── main.zig # CLI interface and argument parsing
│ └── html_formatter.zig # Core formatting logic
├── examples/
│ ├── sample.html # Example input file
│ └── formatted_sample.html # Example output file
├── build.zig # Build configuration
├── build.zig.zon # Package configuration
└── README.md # This file
# Build the project
zig build
# Run tests
zig build test
# Run example formatting
zig build example
# Build and run with arguments
zig build run -- --verbose examples/sample.html -o output.html
- Module System: Uses Zig's module system with html_formatter as a separate module
- Memory Management: Uses GeneralPurposeAllocator with proper cleanup
- Error Handling: Comprehensive error handling for file I/O and parsing
- Testing: Embedded tests using Zig's built-in testing framework
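The memory management and testing points follow a common Zig pattern, sketched here under some assumptions. The trimmedCopy helper is hypothetical and stands in for any function that returns caller-owned memory. The initialization syntax shown matches Zig 0.11-era releases and may differ on newer versions of the standard library.

```zig
const std = @import("std");

/// Hypothetical helper: returns a caller-owned copy of `input` with ASCII
/// whitespace trimmed from both ends.
fn trimmedCopy(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    const trimmed = std.mem.trim(u8, input, " \t\r\n");
    return allocator.dupe(u8, trimmed);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    // deinit checks for leaked allocations; the result is ignored here.
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const copy = try trimmedCopy(allocator, "  <p>hi</p>\n");
    // Every allocation is paired with a free on the same allocator.
    defer allocator.free(copy);

    std.debug.print("{s}\n", .{copy});
}

test "trimmedCopy returns owned, trimmed memory" {
    // std.testing.allocator fails the test if the copy is never freed.
    const copy = try trimmedCopy(std.testing.allocator, "  <p>hi</p>\n");
    defer std.testing.allocator.free(copy);
    try std.testing.expectEqualStrings("<p>hi</p>", copy);
}
```

Passing the allocator as a parameter lets the same function run against the general-purpose allocator in the CLI and against the leak-checking test allocator under zig build test.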
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Ensure all tests pass with zig build test
- Submit a pull request
- Text content formatting needs refinement for better inline handling
- Some edge cases in comment parsing may need attention
- Test suite currently has some failing tests that need fixes
[Add your license information here]
Current version: 0.1.0