Netwerx is a lightweight, extensible deep learning library for Java 23+.
It's designed for learning, prototyping, and research, with full transparency into what your neural network is doing under the hood. No magic, no black boxes.
- Features
- Quickstart
- Core Concepts
- Activation Functions
- Optimizers
- Loss Functions
- Training Executors
- Scoring Functions
- Early Stopping (Stopping Advisors)
- Regularization
- Parameter Initialization
- Training Listeners
- Model Types
- Matrix Abstraction
- Titanic Example
- Extending Netwerx
- Roadmap
- Contributing
- License
- Acknowledgements
- Fully connected feed-forward networks
- Binary/multi-class classifiers, regressors, autoencoders
- Mini-batch or full-batch training
- Dropout support for regularization
- Modular components (optimizers, activations, loss functions, etc.)
- Pluggable matrix backend
- Lightweight: depends only on EJML
- Training listeners and early stopping
- Reproducibility with pluggable random sources
// Build a binary classifier trainer: a hidden layer of 8 ReLU units and a single sigmoid output
var trainer = new DefaultNeuralNetworkTrainerBuilder<>(factory, 5)
    .defaultOptimizer(() -> Optimizers.adam(0.01))
    .denseLayer(layer -> layer.units(8).activationFunction(ActivationFunctions.relu()))
    .denseLayer(layer -> layer.units(1).activationFunction(ActivationFunctions.sigmoid()))
    .buildBinaryClassifierTrainer();

// Toy data: a 5x10 feature matrix and a 1x10 label matrix, filled with constant values
var trainFeatures = factory.filled(5, 10, 0.5);
var trainLabels = factory.filled(1, 10, 1.0);
var dataset = new Dataset<>(trainFeatures, trainLabels);

// Train and get back the fitted network
var network = trainer.train(dataset);
- Activation Functions: introduce non-linearity into your network
- Regularization: penalize model complexity
- Loss Functions: define how wrong your model is
- Optimizers: control how weights are updated
- Training Executors: manage batching and parallel execution
- Scoring Functions: evaluate training progress
- Stopping Advisors: control when training halts
- Parameter Initialization: choose sensible starting points for weights and biases
- Training Listeners: observe and respond to training events
- Matrix Abstraction: plug in your own backend
Netwerx supports ReLU, Sigmoid, Tanh, LeakyReLU, Softmax, and Linear. You can plug in your own:
var relu = ActivationFunctions.relu();
var custom = (ActivationFunction) (input) -> ...
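Built-in or custom, an activation is attached per layer through the trainer builder, exactly as in the quickstart. A minimal sketch, given a matrix factory as in the quickstart (the tanh() factory name is an assumption based on the supported list above; only relu() and sigmoid() appear elsewhere in this README):

```java
var trainer = new DefaultNeuralNetworkTrainerBuilder<>(factory, 5)
    .denseLayer(layer -> layer.units(8).activationFunction(ActivationFunctions.tanh()))
    .denseLayer(layer -> layer.units(1).activationFunction(ActivationFunctions.sigmoid()))
    .buildBinaryClassifierTrainer();
```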
Optimizers update your weights each step:
- SGD: basic gradient descent
- Momentum: adds inertia
- Adam: adaptive learning rate + momentum
- RMSProp: adaptive learning rate only
Optimizers.adam(0.01, 0.9, 0.999, 1e-8); // learning rate, beta1, beta2, epsilon
Use a loss function suited to your task:
- MSE: regression
- MAE: regression
- Binary Cross Entropy: binary classification
- Categorical Cross Entropy: multi-class
- Hinge: SVM-style classifiers
LossFunctions.bce();
Training executors handle how training samples are fed to the network:
- Full Batch: use entire dataset each epoch
- Mini Batch: configurable size, shuffling, and parallelism
TrainingExecutors.miniBatch(32, new Random(), Executors.newFixedThreadPool(4)); // batch size, shuffle source, executor for parallel batches
Scoring functions monitor progress, typically by evaluating validation loss or accuracy. Use one of ours or create your own:
ScoringFunctions.validationLoss();
Stop training when it's no longer improving:
- Max Epochs
- Score Threshold
- Patience
StoppingAdvisors.patience(10, 1e-4);
Avoid overfitting by penalizing weights:
- L1 (sparsity)
- L2 (shrinkage)
- Elastic Net (combines both)
Regularizations.l2(1e-4);
Choose how weights and biases are initialized:
ParameterInitializers.heUniform();
ParameterInitializers.zeros();
Attach listeners to monitor progress:
TrainingListeners.logging(logger, 100);
Custom listeners can log metrics, write to disk, update UIs, etc.
Netwerx supports:
- BinaryClassifierTrainer: one output, sigmoid, BCE loss
- MultiClassifierTrainer: softmax, categorical loss
- RegressionTrainer: identity output, MSE/MAE
- AutoencoderTrainer: encoder/decoder pattern with MSE loss
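Each trainer comes out of the same builder used in the quickstart; only the final build call changes. A sketch of the regression case (the buildRegressionTrainer() and ActivationFunctions.linear() names are assumptions inferred from the lists above, not confirmed API):

```java
var regressionTrainer = new DefaultNeuralNetworkTrainerBuilder<>(factory, 5)
    .defaultOptimizer(() -> Optimizers.adam(0.01))
    .denseLayer(layer -> layer.units(8).activationFunction(ActivationFunctions.relu()))
    .denseLayer(layer -> layer.units(1).activationFunction(ActivationFunctions.linear()))
    .buildRegressionTrainer();
```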
All computations are built on a pluggable matrix abstraction:
Matrix<M> matrix = factory.random(rows, cols);
Plug in your own backend (e.g., EJML, ND4J) by implementing Matrix<M>.
A binary classifier predicts Titanic survival:
- Input: class, age, sex, fare, family members
- Layers: [8 → 4 → 1]
- Activation: ReLU + Sigmoid
- Loss: Binary Cross Entropy
- Optimizer: SGD
Accuracy: ~83%, F1 Score: 0.75
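Wired up with the builder from the quickstart, the configuration above looks roughly like this (a sketch: the Optimizers.sgd(...) factory, its 0.01 learning rate, and reading the builder's second argument as the five input features are assumptions):

```java
var titanicTrainer = new DefaultNeuralNetworkTrainerBuilder<>(factory, 5)
    .defaultOptimizer(() -> Optimizers.sgd(0.01))
    .denseLayer(layer -> layer.units(8).activationFunction(ActivationFunctions.relu()))
    .denseLayer(layer -> layer.units(4).activationFunction(ActivationFunctions.relu()))
    .denseLayer(layer -> layer.units(1).activationFunction(ActivationFunctions.sigmoid()))
    .buildBinaryClassifierTrainer();
```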
| You want to...          | Implement...                      |
|-------------------------|-----------------------------------|
| Add an activation       | ActivationFunction                |
| Add a loss function     | LossFunction                      |
| Create an optimizer     | Optimizer                         |
| Add scoring/early stop  | ScoringFunction / StoppingAdvisor |
| Monitor training        | TrainingListener                  |
- Binary/Multi/Regression/Autoencoder Trainers
- Dropout support
- Mini-batch + parallel execution
- Early stopping (patience, score threshold)
- Adam, RMSProp, Momentum
- Xavier, He initialization
- Model serialization
- Learning rate schedulers
- CNN, RNN layer support
- Visual training dashboards
Have an idea? Found a bug? Contributions are welcome!
- Fork, branch, submit PRs
- Add your own trainers, layers, components
- Suggest improvements via Issues
Licensed under Apache License 2.0
Inspired by:
- PyTorch
- Keras
- TensorFlow
Built from scratch for Java developers who want to deeply understand what's happening in a neural network.