Revolutionary approach to compiling, optimizing, and executing C code. Platform-independent, secure, and highly efficient .desf
format and its generator utility c2desf
.
Introduction β’ Features β’ Concept β’ .desf Format β’ c2desf Utility β’ VM β’ Optimization & Protection β’ Portability β’ Installation β’ License
In software development, there's a constant need for optimization and versatility. Traditional approaches to compiling C code into dynamic libraries (.so
for Linux/Unix, .dll
for Windows) have significant drawbacks: they depend on specific operating systems and CPU architectures, complicating portability and deployment.
The .desf
(Description File) format was created to solve these problems β a new platform-independent intermediate format.
The core mission of
.desf
is to provide execution optimization and complete platform independence, moving away from standard compiled libraries. This approach is particularly relevant for improving large script performance (e.g., in BYM environments) and running code on diverse devices β from powerful servers to embedded systems and even IoT devices (yes, we mean that "smart refrigerator"!).
The c2desf
project includes:
.desf
format specification: Detailed description of innovative C code representation.c2desf
utility: Tool for automatic conversion of C source code to.desf
format..desf
Virtual Machine (VM): High-performance interpreter for executing.desf
files on any supported platform.
- π Platform Independence: Run your C code anywhere! From Windows, Linux, macOS to Android, iOS and microcontrollers.
.desf
eliminates the need for architecture-specific compilation. - π Execution Optimization:
- Size: Aggressive minification, smart dead code elimination, and compact command representation significantly reduce file size (potentially 30-50% smaller than original C).
- Speed: Use of SIMD instructions (NEON, AVX2), caching, and potential JIT compilation of "hot" code sections ensure high performance.
- π‘οΈ Enhanced Protection:
- Multi-layer obfuscation: Identifier renaming, string encryption, "dummy code" insertion, Control Flow Flattening.
- Anti-debugging techniques: Complicates analysis and reverse engineering.
- Non-standard format: The
.desf
format itself acts as a barrier against standard decompilation tools.
- π‘ Innovative Concept: Replaces traditional binary libraries with an interpreted yet optimized representation.
- π§ Automation: The
c2desf
utility automates the entire conversion process. - π§© Modularity & Extensibility: Clear project structure with plugin potential.
- π IoT Security: Comprehensive measures for secure code execution on connected devices including cryptography, authentication, and isolation.
Characteristic | Traditional .so /.dll |
.desf Format |
---|---|---|
Size | Often large, compiler-dependent | Significantly smaller due to optimizations and compact command representation |
Analysis Protection | Relatively easy to decompile (IDA Pro, Ghidra) | Substantially higher protection through obfuscation and unknown bytecode |
Platform Independence | None, requires separate compilation | Complete, thanks to interpreter/VM |
Portability | Limited by architecture/OS | Maximum, from servers to microcontrollers |
Unlike direct compilation to machine code for specific architectures, .desf
uses a different approach.
A
.desf
file is not an obscure binary but specially structured C code (or text-binary/binary in final version). Its key feature is transforming all source C program logic into sequences of simple commands.
These commands are stored in an array within the generated file (if C-based) or a specialized structure (if binary). Program execution occurs by iterating through this command array in a for
loop (in conceptual C representation) and interpreting them using a large switch-case
structure in the .desf
VM. Each case
handles a specific command (e.g., assign variable value, call function, jump to code section).
To achieve high performance, size reduction, and analysis complexity:
- Name shortening: All variable/function names become single-character or short identifiers (e.g.,
myCounter
β@a
,calculateSum
βF1
). Implemented via#define
macros in intermediate C or stored in.desf
symbol table. - Command structure: Program logic (operators, function calls) transforms into numeric command codes stored in
enum
(e.g.,enum CMD { SET_A=0, PRINT=1, ADD=2, JUMP_IF_ZERO=3, EXIT=4 };
).
This approach allows standard C compilers (e.g., gcc
) compiling the VM to optimize the generated switch-case
code using target architecture capabilities (caching, branch prediction), while the .desf
file remains compilable with standard tools without external dependencies.
The .desf
format is the project's foundation. Its specification defines how C code transforms into VM-executable commands.
Simplified example showing the concept. Actual .desf
may be binary.
// File: program.desf (compiled as regular C: gcc program.desf -o program)
// OR: Input for .desf VM
#include <stdio.h> // Include standard libraries as needed (abstracted in VM)
// --- Obfuscation via macros (or .desf symbol table) ---
#define var_a vm_registers[0] // 'counter' becomes 'var_a' (VM register array element)
#define var_b vm_registers[1] // Another variable
#define F_print printf // 'printf' becomes 'F_print' (external function call via VM)
// --- Command codes (defined in .desf spec) ---
enum DESF_CMD {
CMD_SET_VAR = 0, // Set variable (register) value
CMD_ADD_VARS = 1, // Add two variables, store in first
CMD_PRINT_VAR = 2, // Print variable
CMD_EXIT_VM = 3 // Terminate VM execution
// ... more commands for loops, conditions, function calls
};
// --- Command sequence with arguments (core of .desf) ---
int desf_commands[] = {
CMD_SET_VAR, /*reg_idx*/ 0, /*value*/ 5, // var_a = 5
CMD_SET_VAR, /*reg_idx*/ 1, /*value*/ 10, // var_b = 10
CMD_ADD_VARS, /*dest_reg*/ 0, /*src_reg1*/ 0, /*src_reg2*/ 1, // var_a += var_b
CMD_PRINT_VAR, /*reg_idx*/ 0, // F_print("...", var_a)
CMD_EXIT_VM
};
int desf_command_count = sizeof(desf_commands) / sizeof(int); // Simplified
// --- Execution logic (implemented in .desf VM) ---
/*
int main_vm_loop() {
int pc = 0; // Program Counter
while (desf_commands[pc] != CMD_EXIT_VM && pc < desf_command_count) {
int current_command_code = desf_commands[pc];
switch (current_command_code) {
case CMD_SET_VAR:
// vm_registers[desf_commands[pc+1]] = desf_commands[pc+2];
// pc += 3;
break;
case CMD_ADD_VARS:
// vm_registers[desf_commands[pc+1]] = vm_registers[desf_commands[pc+2]] + vm_registers[desf_commands[pc+3]];
// pc += 4;
break;
case CMD_PRINT_VAR:
// F_print("Result: %d\n", vm_registers[desf_commands[pc+1]]);
// pc += 2;
break;
// ... other command cases ...
default:
// Handle unknown command
return 1;
}
}
return 0;
}
*/
Key for consistent generator (c2desf) and interpreter (VM) operation:
// Version and magic number
#define DESF_VERSION 0x00010000 // Version 1.0.0
#define DESF_MAGIC 0xDEADBEEF // File identifier
// Command types
enum DesfCommandType {
// Arithmetic (integer, float, vector)
// Logical operations
// Comparisons
// Control flow (jumps, calls, returns)
// Loop operations
// Memory operations
// I/O operations
// Debug commands (e.g., DESF_BREAKPOINT)
};
// Command structure
typedef struct {
uint8_t type; // Command type
uint8_t flags; // Additional flags
union {
struct {
DesfOperand op1;
DesfOperand op2;
DesfOperand op3;
} operands;
char* label; // Jump labels
} data;
} DesfCommand;
// Operand structure
typedef struct {
DesfOperandType type; // Constant, register, memory address, label
union {
int32_t constant_val_i32;
float constant_val_f32;
uint16_t register_idx;
uint32_t memory_addr;
char* label_name;
} value;
} DesfOperand;
Key evolution points:
- I/O abstraction via
desf_stdio.c
- Support for 3-operand commands
- Memory-efficient unions
- Plugin support mechanism
- Built-in debugger (
DESF_BREAKPOINT
) - Command encryption prototypes
Converts C source to .desf
format (C-with-switch or binary).
- Parser (Frontend - Python):
- Uses libclang to build AST
- Analyzes AST to build Control Flow Graph (CFG) with NetworkX
- Implements AST caching and error handling
- Obfuscator (Frontend - Python):
- Generates short identifiers
- Applies renaming, string encryption, control flow flattening
- Includes equivalence testing
- .desf Generator (Frontend - Python):
- Creates command arrays
- Handles control structures via jumps/labels
- Supports text-binary or pure binary output
Research & Development Focus:
- Advanced AST analysis with libclang/LLVM
- Obfuscation algorithm research
- Loop/condition representation
- CFG-based optimization
- Architecture-specific optimizations
- Optional AST/CFG GUI visualizer
Tech Stack:
- Python (3.8+): Frontend (parser, analyzer, generator)
- C/C++: Backend (high-performance VM)
- Cython: Python-C/C++ binding
Core component written in C/C++ for performance and portability.
Key Modules:
- Core (core.c):
- Interpreter loop (switch-case)
- Program Counter (PC)
- VM state management
- Memory (memory.c):
- Register array
- Call stack and loop stack
- Optional heap management
- Arithmetic Operations (ops/arithmetic.c):
- Integer/float operations
- Error handling (division by zero)
- SIMD optimizations (NEON/AVX)
- JIT compilation prototypes
- Embedded mode support
- Control Flow (ops/control_flow.c):
- Label management (hash tables)
- Jump/Call/Return implementation
- Conditional handling
- Loop operations
- Error handling
- Other Operations: String handling, I/O, plugin calls
Error Handling:
- Defined error codes (
VMErrorCode
) - Context preservation (error message, command address)
- Informative error messages
Advanced techniques for compact, fast, and secure code.
Size Optimization:
- Smart dead code elimination
- Compact command representation
- Efficient encoding (1-2 byte opcodes)
- String/symbol tables to avoid duplication
- Potential 30-50% size reduction vs original C
Reverse Engineering Protection:
- Aggressive Obfuscation:
- Renaming (
@a
,F1
) - String/number encryption
- Control Flow Flattening
- Dummy code insertion
- Renaming (
- Dynamic Logic:
- Specialized interpreter barrier
- Runtime decryption with environment keys
- Anti-Debugging:
- Debugger detection
- Behavior modification when debugged
- Format Specification:
- Native barrier against standard tools
Performance Optimizations:
- SIMD Utilization:
- Automatic NEON/AVX detection
- Vectorized data processing
- Alignment checks with fallbacks
- JIT Compilation:
- Dynamic native code generation for hotspots
- Secure executable memory allocation
- VM-level Caching:
- Simulated CPU caches
- Specialized Commands:
- Complex operations as single optimized commands
Run anywhere philosophy with specialized modes.
Device Spectrum:
- Servers/Desktops (Linux, Windows, macOS): Full optimizations (AVX2/JIT)
- Mobile (Android/iOS):
- Compile via NDK/Xcode
- Embed as native library
- Sandbox/JIT limitations
- Embedded/IoT ("Refrigerator"):
--target=embedded
mode- 8-bit operation support
- Minimal dependencies (no stdlib)
- Microcontrollers (ARM, RISC-V): Cross-compile VM
Portability Keys:
- Architecture-agnostic commands
- Portable VM interpreter (C11/C++17)
- Minimal dependencies
- Cross-compilation support
- QEMU/real-device testing
Addressing risks of arbitrary code execution on IoT devices.
Comprehensive Measures:
- Cryptography:
- Command encryption (XOR/AES)
- Mandatory TLS/SSL for communication
- Authentication & Authorization:
- Strict device/user authentication
- Role-based access control
- Isolation & Monitoring:
- Sandboxed execution
- Activity monitoring with kill switches
- Functionality Restrictions:
- Least-privilege principle
- Untrusted source prohibition
- Secure Updates:
- Signed firmware updates
- Physical Security:
- Hardware tamper protection
- Emergency Stop:
- Hardware/software kill switch
- "Moral Code":
- Asimov-inspired rules for IoT
Organized for modularity and growth:
c2desf/
βββ src/ # Core source
β βββ frontend/ # Python components (parser, obfuscator)
β βββ backend/ # C/C++ components (VM core)
βββ include/ # Headers
β βββ desf/ # .desf specifications
βββ dependencies/ # Third-party libraries
βββ docs/ # Documentation
βββ tests/ # Test suites
βββ tools/ # Development tools
Dependency Management:
- Isolation in
dependencies/
directory - Documentation:
dependencies/dependencies_rule_docs.md
: Policies and licensesdependencies/dependency_list.md
: Dynamic library list (e.g., uthash.h)
Detailed instructions will accompany releases. Conceptual workflow:
Install c2desf:
git clone https://github.com/Ferki-git-creator/c2desf
cd c2desf
# Python environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows
# Install dependencies (Poetry or pip)
poetry install
# pip install -r requirements.txt
# Build C/C++ components
mkdir build && cd build
cmake ..
make
Usage:
# Convert C to .desf
c2desf input.c -o output.desf [OPTIONS]
# Options examples:
# --obfuscate-level=high
# --optimize=neon
# --target=embedded
# --output-format=binary
# --debug
# Execute .desf file
desf_vm output.desf
This project is licensed under the MIT License. See LICENSE for details.
Third-party libraries have their own licenses documented in dependencies/dependency_list.md
.
Thank you for your interest in c2desf! Together we can change how we work with C code.
"The future belongs to those who believe in the beauty of their dreams... and write efficient code for them." π