ZeroCostAutomation/wappalyzer-fingerprints

Wappalyzer Fingerprints

A GitHub Actions workflow that automatically fetches, processes, and publishes the latest Wappalyzer fingerprints for technology detection.

What is this?

This repository contains a GitHub workflow that:

  1. Automatically runs daily to fetch the latest Wappalyzer technology fingerprints
  2. Processes and standardizes the data format
  3. Creates GitHub Releases with the fingerprint data
  4. Updates a "latest" tag for easy access

The data is sourced from:

Download the Latest Data

You can always download the latest fingerprints using these commands:

# Download the latest technologies data
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json

# Download the latest categories
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/categories.json

# Download the latest groups
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/groups.json

# Download everything in a ZIP archive
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/wappalyzer-fingerprints.zip

Or visit the latest release page.

Data Files

The following files are included in each release:

  • technologies.json: Technology fingerprints for detection (flat array)
  • categories.json: Technology categories
  • groups.json: Category grouping information
  • README.md: Documentation and statistics
  • wappalyzer-fingerprints.zip: All files in a single ZIP archive
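
If you download the ZIP archive instead of the individual files, the JSON documents can be read straight from the archive without unpacking it to disk. The following is a minimal sketch using only the Python standard library. The function name `read_fingerprint_archive` is hypothetical, and the sketch assumes the archive members carry the same file names as the release assets listed above.

```python
import io
import json
import zipfile


def read_fingerprint_archive(zip_bytes: bytes) -> dict:
    """Return {member name: parsed JSON} for every .json member of the archive."""
    parsed = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            # Non-JSON members such as README.md are skipped.
            if member.endswith(".json"):
                parsed[member] = json.loads(archive.read(member))
    return parsed


def download_archive(url: str) -> bytes:
    """Fetch the archive bytes; urlopen follows the release redirect itself."""
    from urllib.request import urlopen

    with urlopen(url, timeout=30) as response:
        return response.read()
```

Passing the result of `download_archive(...)` for the `wappalyzer-fingerprints.zip` URL above into `read_fingerprint_archive` should yield one dictionary entry per JSON file, provided the member names match the assumption.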

Data Models

The fingerprint data follows specific structures that can be modeled in various languages:

JSON Schema

Technologies

[
  {
    "name": "technology_name",
    "cats": [1, 2],
    "description": "Technology description",
    "website": "https://example.com",
    "cpe": "cpe:/a:vendor:product:version",
    "icon": "icon_file.png",
    "cookies": {
      "cookie_name": "cookie_pattern"
    },
    "headers": {
      "header_name": "header_pattern"
    },
    "html": ["html_pattern1", "html_pattern2"],
    "scripts": ["script_pattern1", "script_pattern2"],
    "meta": {
      "meta_name": ["meta_pattern"]
    },
    "js": {
      "object.property": ""
    },
    "implies": ["other_technology"]
  }
]
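
To illustrate how a `headers` entry from this structure might be used, here is a rough Python sketch that checks a technology's header patterns against a set of HTTP response headers. It rests on several assumptions that this README does not confirm: that patterns are regular expressions as in upstream Wappalyzer, that any text after a `\;` separator is metadata (such as a version or confidence hint) rather than part of the regex, and that an empty pattern means "the header is present". The helper names and the sample technology are hypothetical.

```python
import re
from typing import Optional


def compile_pattern(raw: str) -> Optional[re.Pattern]:
    """Compile the regex part of a pattern, or return None if Python rejects it."""
    # Assumption: text after the first "\;" is metadata, not regex.
    regex = raw.split("\\;", 1)[0]
    try:
        return re.compile(regex, re.IGNORECASE)
    except re.error:
        return None  # JavaScript-only regex syntax; treated as "no match" here


def matches_headers(tech: dict, response_headers: dict) -> bool:
    """True if any header pattern of `tech` matches the given response headers."""
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    for name, raw in (tech.get("headers") or {}).items():
        value = lowered.get(name.lower())
        if value is None:
            continue
        pattern = compile_pattern(raw)
        if pattern is not None and pattern.search(value):
            return True
    return False


# Hypothetical fingerprint, shaped like the schema above.
example = {
    "name": "Example Server",
    "cats": [22],
    "headers": {"Server": r"^nginx(?:/([\d.]+))?\;version:\1"},
}
print(matches_headers(example, {"server": "nginx/1.25.3"}))  # True
```

A real detector would also evaluate cookies, HTML, scripts, and meta tags, and would extract the version from the capture group. This sketch only shows the matching step for one field.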

Categories

{
  "category_id": {
    "name": "Category Name",
    "priority": 1,
    "groups": ["group_id"]
  }
}
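
Because `cats` holds numbers while the keys of `categories.json` are strings (JSON object keys are always strings), a lookup needs a conversion step. The sketch below shows one way to resolve a technology's category names. The function name and the sample data are hypothetical.

```python
def category_names(tech: dict, categories: dict) -> list:
    """Return the names of the categories referenced by tech["cats"]."""
    names = []
    for cat_id in tech.get("cats", []):
        # Convert the numeric ID to the string key used in categories.json.
        category = categories.get(str(cat_id))
        if category is not None:
            names.append(category["name"])
    return names


# Hypothetical sample data, shaped like the structures above.
sample_categories = {
    "1": {"name": "CMS", "priority": 1, "groups": ["3"]},
    "11": {"name": "Blogs", "priority": 1, "groups": ["3"]},
}
sample_tech = {"name": "Example CMS", "cats": [1, 11, 999]}

print(category_names(sample_tech, sample_categories))  # ['CMS', 'Blogs']
```

Unknown IDs (999 in the sample) are skipped silently. Raising an error instead may be preferable if you want to detect inconsistent data.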

Groups

{
  "group_id": {
    "name": "Group Name"
  }
}
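
Groups tie categories together through the `groups` list on each category. As a sketch of how the two files relate, the hypothetical helper below builds a mapping from group name to the categories it contains. Group IDs are passed through `str()` in case they appear as numbers in the data, which is an assumption rather than something this README specifies.

```python
from collections import defaultdict


def categories_by_group(categories: dict, groups: dict) -> dict:
    """Map each group name to the sorted names of its member categories."""
    result = defaultdict(list)
    for category in categories.values():
        for group_id in category.get("groups") or []:
            group = groups.get(str(group_id))
            if group is not None:
                result[group["name"]].append(category["name"])
    return {name: sorted(members) for name, members in result.items()}


# Hypothetical sample data, shaped like the structures above.
sample_groups = {"3": {"name": "Content"}, "9": {"name": "Web development"}}
sample_categories = {
    "1": {"name": "CMS", "priority": 1, "groups": ["3"]},
    "11": {"name": "Blogs", "priority": 1, "groups": ["3"]},
    "12": {"name": "JavaScript frameworks", "priority": 8, "groups": ["9"]},
}

print(categories_by_group(sample_categories, sample_groups))
```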

TypeScript Models

// models.ts
export interface Technology {
  name: string;
  cats: number[];
  description?: string;
  website?: string;
  cpe?: string;
  icon?: string;
  cookies?: Record<string, string>;
  headers?: Record<string, string>;
  html?: string[];
  scripts?: string[];
  scriptSrc?: string[];
  meta?: Record<string, string[]>;
  js?: Record<string, any>;
  implies?: string[];
}

// technologies.json is a flat array of Technology objects
export type WappalyzerTechnologies = Technology[];

export interface Categories {
  [categoryId: string]: {
    name: string;
    priority: number;
    groups?: string[];
  };
}

export interface Groups {
  [groupId: string]: {
    name: string;
  };
}

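// Usage example (in a separate file that imports from ./models)
import { WappalyzerTechnologies } from './models';

async function loadTechnologies(): Promise<WappalyzerTechnologies> {
  const response = await fetch('https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json');
  // Fail on HTTP errors instead of trying to parse an error page as JSON
  if (!response.ok) {
    throw new Error(`Failed to download technologies.json: HTTP ${response.status}`);
  }
  return (await response.json()) as WappalyzerTechnologies;
}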

Go Structs

// models.go
package wappalyzer

import (
	"encoding/json"
	"net/http"
)

// technologies.json is a flat array of Technology objects
type Technologies []Technology

type Technology struct {
	Name        string                `json:"name"`
	Cats        []int                 `json:"cats"`
	Description string                `json:"description,omitempty"`
	Website     string                `json:"website,omitempty"`
	CPE         string                `json:"cpe,omitempty"`
	Icon        string                `json:"icon,omitempty"`
	Cookies     map[string]string     `json:"cookies,omitempty"`
	Headers     map[string]string     `json:"headers,omitempty"`
	HTML        []string              `json:"html,omitempty"`
	Scripts     []string              `json:"scripts,omitempty"`
	ScriptSrc   []string              `json:"scriptSrc,omitempty"`
	Meta        map[string][]string   `json:"meta,omitempty"`
	JS          map[string]any        `json:"js,omitempty"`
	Implies     []string              `json:"implies,omitempty"`
}

type Categories map[string]Category

type Category struct {
	Name     string   `json:"name"`
	Priority int      `json:"priority"`
	Groups   []string `json:"groups,omitempty"`
}

type Groups map[string]Group

type Group struct {
	Name string `json:"name"`
}

// Usage example
func LoadTechnologies() (Technologies, error) {
	resp, err := http.Get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var technologies Technologies
	err = json.NewDecoder(resp.Body).Decode(&technologies)
	return technologies, err
}

Python Classes

# models.py
from typing import Any, Dict, List, Optional
from pydantic import BaseModel

# Pydantic model
class Technology(BaseModel):
    name: str
    cats: List[int]
    description: Optional[str] = None
    website: Optional[str] = None
    cpe: Optional[str] = None
    icon: Optional[str] = None
    cookies: Optional[Dict[str, str]] = None
    headers: Optional[Dict[str, str]] = None
    html: Optional[List[str]] = None
    scripts: Optional[List[str]] = None
    scriptSrc: Optional[List[str]] = None
    meta: Optional[Dict[str, List[str]]] = None
    js: Optional[Dict[str, Any]] = None
    implies: Optional[List[str]] = None

# technologies.json is a flat array of Technology objects
WappalyzerTechnologies = List[Technology]

class Category(BaseModel):
    name: str
    priority: int
    groups: Optional[List[str]] = None

class Group(BaseModel):
    name: str

# Usage example
import requests

def load_technologies() -> List[Technology]:
    response = requests.get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json", timeout=30)
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error page
    data = response.json()
    return [Technology(**tech) for tech in data]

PHP Classes

<?php
// models.php

class Technology {
    public string $name;
    public array $cats;
    public ?string $description = null;
    public ?string $website = null;
    public ?string $cpe = null;
    public ?string $icon = null;
    public ?array $cookies = null;
    public ?array $headers = null;
    public ?array $html = null;
    public ?array $scripts = null;
    public ?array $scriptSrc = null;
    public ?array $meta = null;
    public ?array $js = null;
    public ?array $implies = null;
}

// For technologies.json - array of Technology objects
// Usage example
function loadTechnologies(): array {
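    $json = file_get_contents("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json");
    if ($json === false) {
        throw new RuntimeException("Failed to download technologies.json");
    }
    // JSON_THROW_ON_ERROR raises a JsonException on malformed input instead of returning null
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);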
    
    $technologies = [];
    
    foreach ($data as $techData) {
        $tech = new Technology();
        $tech->name = $techData['name'];
        $tech->cats = $techData['cats'];
        $tech->description = $techData['description'] ?? null;
        // populate other properties
        $technologies[] = $tech;
    }
    
    return $technologies;
}

class Category {
    public string $name;
    public int $priority;
    public ?array $groups = null;
}

class Group {
    public string $name;
}

function loadCategories(): array {
    $json = file_get_contents("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/categories.json");
    return json_decode($json, true);
}

Usage Examples

Python

import requests

# Download technologies data
technologies_url = "https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json"
response = requests.get(technologies_url, timeout=30)
response.raise_for_status()  # stop early on HTTP errors
data = response.json()

# Print number of technologies
print(f"Total technologies: {len(data)}")

# Example: Find WordPress data
wordpress_tech = next((tech for tech in data if tech["name"] == "WordPress"), None)
if wordpress_tech:
    print(f"WordPress categories: {wordpress_tech['cats']}")
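
Building on the lookup above, the `implies` field can be followed transitively to list everything a detected technology brings with it. The sketch below is one possible approach and not part of this repository. The function name is hypothetical, and stripping a `\;` suffix from implied names assumes the upstream Wappalyzer convention of appending metadata such as `\;confidence:50`.

```python
def implied_technologies(name: str, technologies: list) -> set:
    """Return the names reachable from `name` by following `implies` links."""
    by_name = {tech["name"]: tech for tech in technologies}
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for raw in by_name.get(current, {}).get("implies") or []:
            # Assumption: drop metadata after "\;" to get the bare name.
            implied = raw.split("\\;", 1)[0]
            # The seen set also guards against cycles in the data.
            if implied != name and implied not in seen:
                seen.add(implied)
                stack.append(implied)
    return seen


# Hypothetical sample data, shaped like technologies.json.
sample = [
    {"name": "Example Shop", "cats": [6], "implies": ["Example CMS"]},
    {"name": "Example CMS", "cats": [1], "implies": ["PHP", "MySQL"]},
    {"name": "PHP", "cats": [27]},
    {"name": "MySQL", "cats": [34]},
]
print(sorted(implied_technologies("Example Shop", sample)))
# ['Example CMS', 'MySQL', 'PHP']
```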

JavaScript/Node.js

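const https = require('https');

const url = "https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json";

// GitHub release download URLs respond with a redirect, and https.get does
// not follow redirects on its own, so follow the Location header manually.
function download(url, onDone, redirectsLeft = 5) {
    https.get(url, (response) => {
        const { statusCode, headers } = response;
        if (statusCode >= 300 && statusCode < 400 && headers.location && redirectsLeft > 0) {
            response.resume(); // discard the redirect body
            return download(new URL(headers.location, url).href, onDone, redirectsLeft - 1);
        }
        if (statusCode !== 200) {
            response.resume();
            return onDone(new Error(`Unexpected HTTP status ${statusCode}`));
        }

        let data = '';
        response.on('data', (chunk) => {
            data += chunk;
        });
        response.on('end', () => onDone(null, data));
    }).on("error", onDone);
}

// Download technologies data
download(url, (err, data) => {
    if (err) {
        console.error("Error: " + err.message);
        return;
    }

    const technologies = JSON.parse(data);
    console.log(`Total technologies: ${technologies.length}`);

    // Example: Find React data
    const reactTech = technologies.find(tech => tech.name === "React");
    if (reactTech) {
        console.log(`React categories: ${reactTech.cats.join(', ')}`);
    }
});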

Go

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// technologies.json is a flat array of Technology objects
type Technology struct {
	Name string `json:"name"`
	Cats []int  `json:"cats"`
	// other fields omitted for brevity
}

func main() {
	// Download technologies data
	resp, err := http.Get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json")
	if err != nil {
		fmt.Printf("Error fetching data: %s\n", err)
		return
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		fmt.Printf("Unexpected HTTP status: %s\n", resp.Status)
		return
	}

	// Read the response body (io.ReadAll replaces the deprecated ioutil.ReadAll)
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("Error reading response: %s\n", err)
		return
	}

	// Parse JSON
	var technologies []Technology
	if err := json.Unmarshal(data, &technologies); err != nil {
		fmt.Printf("Error parsing JSON: %s\n", err)
		return
	}

	fmt.Printf("Total technologies: %d\n", len(technologies))

	// Example: Find jQuery data
	for _, tech := range technologies {
		if tech.Name == "jQuery" {
			fmt.Printf("jQuery categories: %v\n", tech.Cats)
			break
		}
	}
}

Running Locally

If you want to run the fingerprint update script locally:

  1. Clone this repository
  2. Install Python dependencies: pip install requests
  3. Run the script: python .github/scripts/fetch_fingerprints.py
  4. Find the output files in the assets directory
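
After a local run, a quick sanity check of the output can catch an empty or malformed file. The sketch below assumes the script writes `technologies.json`, `categories.json`, and `groups.json` into `assets` under the same names as the release assets. The helper name `summarize_output` is hypothetical.

```python
import json
from pathlib import Path

# Assumption: local output uses the same file names as the release assets.
EXPECTED_FILES = ("technologies.json", "categories.json", "groups.json")


def summarize_output(assets_dir="assets") -> dict:
    """Return {file name: entry count}; raises FileNotFoundError if a file is missing."""
    summary = {}
    for file_name in EXPECTED_FILES:
        with open(Path(assets_dir) / file_name, encoding="utf-8") as handle:
            # len() works for both the technologies array and the keyed objects.
            summary[file_name] = len(json.load(handle))
    return summary
```

Calling `summarize_output()` from the repository root after step 3 should return a non-zero count for each file if the run succeeded.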

Workflow Schedule

The GitHub workflow runs automatically:

  • Daily at 01:00 UTC
  • Manually via GitHub Actions workflow dispatch
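
If you cache the downloaded files, the daily schedule gives a natural refresh point. The sketch below computes the next 01:00 UTC slot from a timezone-aware timestamp. The function name is hypothetical, and since scheduled GitHub Actions runs can start late, the result is best treated as an earliest possible time rather than a guarantee that a new release exists.

```python
from datetime import datetime, time, timedelta, timezone


def next_scheduled_run(now: datetime) -> datetime:
    """Return the next 01:00 UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), time(1, 0), tzinfo=timezone.utc)
    if candidate <= now_utc:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


print(next_scheduled_run(datetime.now(timezone.utc)).isoformat())
```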

License

This data is derived from Wappalyzer, which is licensed under the MIT License. When using this data, please respect the original license terms.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.