Parser is a lightweight, efficient Go library for filtering slices of structs using a SQL-like query language. It enables in-memory data filtering without a database, ideal for applications like data processing, configuration management, or API response filtering. Built with Go generics, it offers type-safe queries and supports complex expressions, nested fields, and case-insensitive matching.
- SQL-Like Query Language: Filter structs with intuitive queries (e.g.,
Age > 25 AND Skills CONTAINS 'Go'
). - Type-Safe with Generics: Works with any struct type using Go’s generics.
- Nested Field Access: Query nested structs and maps using dot notation (e.g.,
Department.Name
). - Humanized Values Support: Parse human-readable values like time units (
10m
,2h30m
), byte units (10GB
/10GiB
,2TB
/2TiB
), SI prefixes (1.5K
,2.3M
), and comma-separated numbers (1,000
) automatically. - Rich Operators: Supports
=
,!=
,<
,>
,<=
,>=
,CONTAINS
,IS NULL
,ANY
,NOT
,AND
,OR
. - Case-Insensitive Matching: Field names and keywords (e.g.,
AND
,OR
) are case-insensitive. - Efficient Parsing: Uses an enhanced lexer with support for negative numbers, scientific notation, and comma-separated numbers.
- Robust Error Handling: Detailed error messages for syntax and evaluation errors.
- Zero Dependencies: Pure Go implementation with built-in support for time, byte, and SI unit parsing.
- Go 1.24.1 or higher (for generics and latest features)
- No external dependencies
Install the library using Go modules:
go get github.com/zveinn/parser
Filter a slice of structs using a query:
package main
import (
"fmt"
"log"
"github.com/zveinn/parser"
)
type Person struct {
Name string
Age int
IsEmployed bool
Skills []string
Salary float64
Department *Department
}
type Department struct {
Name string
Location string
}
func main() {
people := []Person{
{Name: "Alice", Age: 30, IsEmployed: true, Skills: []string{"Go", "Python"}, Salary: 75000.50, Department: &Department{Name: "Engineering", Location: "New York"}},
{Name: "Bob", Age: 25, IsEmployed: false, Skills: []string{"Java", "C++"}, Salary: 65000.25, Department: &Department{Name: "Engineering", Location: "Remote"}},
{Name: "Charlie", Age: 35, IsEmployed: true, Skills: []string{"Go", "Rust"}, Salary: 85000.75, Department: nil},
}
results, err := parser.Parse("Age > 25 AND isemployed = true", people)
if err != nil {
log.Fatalf("Error parsing query: %v", err)
}
for _, p := range results {
fmt.Printf("Match: %s (Age: %d)\n", p.Name, p.Age)
}
}
Output:
Match: Alice (Age: 30)
Match: Charlie (Age: 35)
The query language supports a variety of operators and expressions:
Operator | Description | Example |
---|---|---|
= |
Equal | Name = 'Alice' |
!= |
Not equal | Age != 30 |
> |
Greater than | Salary > 70,000 |
< |
Less than | Age < 35 |
>= |
Greater than or equal | Salary >= 75000.50 |
<= |
Less than or equal | Age <= 30 |
CONTAINS |
String or slice contains | Skills CONTAINS 'Go' |
Operator | Description | Example |
---|---|---|
AND |
Logical AND | Age > 25 AND IsEmployed = true |
OR |
Logical OR | Name = 'Alice' OR Name = 'Bob' |
NOT |
Logical NOT | NOT (Age < 30) |
Operator | Description | Example |
---|---|---|
IS NULL |
Check for nil/zero value | Department IS NULL |
IS NOT NULL |
Check for non-nil value | Department IS NOT NULL |
ANY |
Match any value in a list | ANY(Skills) = ANY('Go', 'Rust') |
# Basic filtering
Name = 'Alice'
Salary > 80,000
Skills CONTAINS 'Go'
# Time-based filtering (converted to seconds)
ResponseTime < 30s
Timeout > 5m
CacheExpiry < 2h
Uptime > 1d
# Byte size filtering
Memory > 8GB
Storage < 1TiB
BackupSize > 500MiB
# SI prefix filtering (uppercase only)
Population > 1.5M
Records < 10K
Distance >= 2.5G
# Nested fields and maps
Department.Location = 'Remote'
Tags.level = 'senior'
# Complex logic with mixed units
(Age > 30 AND Salary > 75,000) OR IsEmployed = false
ResponseTime < 1m AND Memory > 8GB AND Uptime > 1d
ANY(Skills) = 'Go' AND NOT (Department IS NULL)
Query nested fields or map values using dot notation:
query := "Department.Name = 'Engineering' AND Tags.level = 'senior'"
results, err := parser.Parse(query, people)
The parser supports advanced numeric formats:
- Negative numbers:
Salary > -1000
- Scientific notation:
Salary > 7.5e4
- Comma-separated numbers:
Salary > 1,000,000.50
- Time durations:
ResponseTime < 30s
,Timeout > 2h30m
- Byte sizes:
Memory > 8GB
,Storage < 1TiB
- SI prefixes:
Population > 1.5M
,Count < 5K
(uppercase only)
The parser automatically converts humanized values to their numeric equivalents with unambiguous parsing rules. Values are parsed in the following priority order:
- Time Duration Units (parsed first to avoid conflicts)
- Byte Size Units (decimal and binary)
- SI Prefixes (case-sensitive, uppercase only)
- Comma-Separated Numbers
Time Duration Units: Time units are converted to total seconds and support compound durations:
# Single time units
ResponseTime < 30s # 30 seconds
Timeout > 5m # 300 seconds (5 minutes)
CacheExpiry < 2h # 7200 seconds (2 hours)
Retention > 7d # 604800 seconds (7 days)
# Compound time units (multiple units combined)
Duration = 2h30m # 9000 seconds (2 hours + 30 minutes)
Delay < 1m30s # 90 seconds (1 minute + 30 seconds)
Uptime > 1d12h # 129600 seconds (1 day + 12 hours)
# Supported time units:
# ns - nanoseconds, us/µs - microseconds, ms - milliseconds
# s - seconds, m - minutes, h - hours, d - days
Byte Size Units (Decimal and Binary):
# Decimal units (powers of 1000) - International System of Units
Drive.Size > 10GB # 10,000,000,000 bytes
Memory > 1.5TB # 1,500,000,000,000 bytes
Storage < 500MB # 500,000,000 bytes
Buffer < 100KB # 100,000 bytes
# Binary units (powers of 1024) - Computer memory standards
Backup > 2.5GiB # 2,684,354,560 bytes
Cache > 512MiB # 536,870,912 bytes
Temp < 100KiB # 102,400 bytes
Archive > 1TiB # 1,099,511,627,776 bytes
# Supported byte units:
# Decimal: B, KB, MB, GB, TB, PB, EB, ZB, YB
# Binary: B, KiB, MiB, GiB, TiB, PiB, EiB, ZiB, YiB
SI Prefixes (Case-Sensitive, Uppercase Only): SI prefixes are now case-sensitive and only recognize uppercase letters to avoid conflicts with time units:
Population > 1.5M # 1,500,000 (mega = 10^6)
Count < 5K # 5,000 (kilo = 10^3)
Records >= 2.3G # 2,300,000,000 (giga = 10^9)
Distance < 500K # 500,000 (kilo = 10^3)
# Supported SI prefixes (uppercase only):
# K (kilo = 10^3), M (mega = 10^6), G (giga = 10^9)
# T (tera = 10^12), P (peta = 10^15), E (exa = 10^18)
# Z (zettabyte = 10^21), Y (yottabyte = 10^24)
# Note: Lowercase prefixes (k, m, g, etc.) are NOT supported
# to avoid conflicts with time units (m = minutes, s = seconds)
Comma-Separated Numbers:
Price > 1,000,000 # 1000000
Users >= 50,000 # 50000
Transactions < 2,500 # 2500
Example with Real Data:
type Server struct {
Name string
Memory int64 // in bytes
Storage int64 // in bytes
ResponseTime int64 // in seconds
Uptime int64 // in seconds
}
servers := []Server{
{Name: "web1", Memory: 8589934592, Storage: 536870912000, ResponseTime: 30, Uptime: 86400}, // 8GB, 500GB, 30s, 1 day
{Name: "db1", Memory: 34359738368, Storage: 2199023255552, ResponseTime: 120, Uptime: 604800}, // 32GB, 2TB, 2m, 7 days
}
// Query using time units (converted to seconds)
results, _ := parser.Parse("ResponseTime < 1m AND Uptime > 1d", servers)
// Query using decimal byte units (powers of 1000)
results, _ := parser.Parse("Memory > 16GB AND Storage < 1TB", servers)
// Query using binary byte units (powers of 1024)
results, _ := parser.Parse("Memory > 16GiB AND Storage < 1TiB", servers)
// Query using SI prefixes (case-sensitive, uppercase only)
results, _ := parser.Parse("Memory > 8G AND Storage > 500M", servers) // Treating as generic numbers
// Mixed units work correctly due to unambiguous parsing
results, _ := parser.Parse("ResponseTime < 2m AND Memory > 8GB AND Uptime > 1d", servers)
Important Notes:
- Time units (
m
,s
,h
,d
) take precedence over SI prefixes - SI prefixes are case-sensitive and only recognize uppercase (
K
,M
,G
, etc.) - Byte units support both decimal (GB, MB) and binary (GiB, MiB) standards
- The parser automatically resolves conflicts by checking units in priority order
The parser uses a priority-based system to handle potential conflicts between different unit types:
- Time Units First:
10m
is always parsed as 10 minutes (600 seconds), never as 10 milli-units - Byte Units Second:
10GB
is parsed as 10 gigabytes (10,000,000,000 bytes) - SI Prefixes Third:
10K
is parsed as 10,000 using the kilo prefix - Comma-Separated Last:
10,000
is parsed as ten thousand
Case Sensitivity Rules:
- Time units are case-insensitive:
10M
=10m
= 10 minutes - Byte units are case-sensitive:
10GB
≠10gb
(only10GB
is valid) - SI prefixes are case-sensitive:
10K
is valid,10k
is not supported - This prevents conflicts like
m
(minutes) vsm
(milli-prefix)
Examples of Conflict Resolution:
# These are unambiguous and work as expected:
Duration < 5m # 5 minutes = 300 seconds (time unit)
Size > 5MB # 5 megabytes = 5,000,000 bytes (byte unit)
Count > 5K # 5 thousand = 5,000 (SI prefix)
# These demonstrate the priority system:
Value > 10m # Always 10 minutes (600 seconds), never 10 milli-units
Storage > 10M # 10 megabytes if comparing to bytes, otherwise 10 million
Population > 10M # 10 million (SI prefix) when comparing to numbers
Based on benchmark results:
- Efficient for Small to Medium Datasets: Queries on datasets of 10–1000 structs are fast, with simple queries (e.g.,
Age > 30
) taking microseconds. - Unit Parsing Overhead: Time, byte, and SI unit parsing adds minimal overhead and is optimized for common cases.
- Reflection Overhead: Minimal reflection is used during evaluation, with no reflection during query compilation.
- Scalability: Performance scales linearly with dataset size. For very large datasets (>10,000 items), consider batching.
- Query Complexity: Complex queries with nested logic or
ANY
operators are slightly slower but optimized with short-circuit evaluation. - Memory Usage: Low memory footprint, with minimal allocations for simple queries (benchmarks show 1–2 allocations per query).
The parser provides detailed error messages:
_, err := parser.Parse("Age >", people)
if err != nil {
fmt.Println(err) // Output: "failed to parse query: unexpected EOF"
}
_, err = parser.Parse("InvalidField = 10", people)
if err != nil {
fmt.Println(err) // Output: "evaluation error: field 'InvalidField' not found"
}
Clone the repository and build:
git clone https://github.com/zveinn/parser.git
cd parser
Run tests to verify functionality:
go test -v ./...
Run benchmarks to measure performance:
go test -bench=. ./...
- API Reference: Available via GoDoc.
- Examples: See the examples/ directory for sample queries (create this directory if needed).
- Source Code Insights:
parser.go
: Core parsing logic with AST evaluation.enhanced_lexer.go
: Tokenization with support for advanced numeric formats.parser_test.go
: Comprehensive test suite for all operators and edge cases.
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch (
git checkout -b feature/my-feature
). - Commit your changes (
git commit -m 'Add my feature'
). - Push to the branch (
git push origin feature/my-feature
). - Open a Pull Request.
This project is licensed under the Apache 2.0 License. See LICENSE for details.
- Built by zveinn.
- Inspired by SQL query engines and libraries like rql.
- Thanks to the Go community for feedback and inspiration.
⭐ Star this project if you find it useful!
💬 Report issues or suggest features in Issues.