Renyi Kernel Entropy PyTorch (RKE)

An Information-Theoretic Evaluation of Generative Models in Learning Multi-modal Distributions

Overview

This repository contains an optimized implementation of the Renyi Kernel Entropy (RKE) score using PyTorch. The PyTorch implementation leverages GPU acceleration and parallel computing to significantly speed up the computation of RKE scores for large datasets.

The main repository for the paper is here.
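For orientation (the paper is the authoritative reference, so cross-check the notation there): under one common reading, the RKE mode count is the exponential of the order-2 Rényi entropy of the normalized kernel matrix, which reduces to an inverse squared Frobenius norm:

RKE-MC(X) = exp(H_2(K)) = 1 / ||K||_F^2,   where K = (1/n) [k(x_i, x_j)]_{i,j=1..n}

This reading would also explain the rke_mc_frobenius_norm variant reported in the results below.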

Parallelization with PyTorch

To enhance computational efficiency, the RKE score calculation is implemented in PyTorch. By exploiting the parallel computing capabilities of GPUs, this implementation handles large datasets far more efficiently than the reference NumPy-based approach, reducing computation time while preserving the accuracy of the results.
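The repository's actual kernels live in RKEtorch; purely as an illustration of where the parallelism comes from, here is a minimal sketch of a Gaussian kernel matrix computed in one batched GPU call (the names below are illustrative, not the repository's API):

import torch

def gaussian_kernel_matrix(x, bandwidth):
    """x: (n, d) feature tensor. A single torch.cdist call evaluates
    all n*n pairwise distances in parallel on the GPU."""
    sq_dists = torch.cdist(x, x) ** 2
    return torch.exp(-sq_dists / (2 * bandwidth ** 2))

device = "cuda" if torch.cuda.is_available() else "cpu"
features = torch.randn(1000, 1000, device=device)
K = gaussian_kernel_matrix(features, bandwidth=20.0)

The NumPy equivalent of this computation runs on the CPU and, for large n, loop-based variants in particular become the bottleneck that the GPU batching removes.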

Results

Setting

The following setting was used for the comparison:

number_of_samples = [10, 50, 200, 1000]  # num_real_samples / num_fake_samples take each of these values
feature_dim = 1000
bandwidths = [8, 20, 100]

real_features = np.random.normal(loc=0.0, scale=1.0,
                                 size=[num_real_samples, feature_dim])

fake_features = np.random.normal(loc=0.01, scale=1.01,
                                 size=[num_fake_samples, feature_dim])
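As an illustration, here is a minimal sketch of how the PyTorch side of this comparison can be timed, reusing the constructor and method names from the example later in this README (passing the bandwidth list this way is an assumption):

import time
import numpy as np
from RKEtorch import RKE

feature_dim = 1000
kernel = RKE(kernel_bandwidth=[8, 20, 100])

for num_fake_samples in [10, 50, 200, 1000]:
    fake_features = np.random.normal(loc=0.01, scale=1.01,
                                     size=[num_fake_samples, feature_dim])
    start = time.perf_counter()
    kernel.compute_rke_mc(fake_features)
    print(f"n={num_fake_samples}: {time.perf_counter() - start:.3f} s")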

Computation Time Comparison

The following plots show the computation time for RKE scores using both NumPy and PyTorch implementations. The PyTorch implementation demonstrates a significant speedup, especially for larger datasets.

[Figure: computation times for RKE_MC, NumPy vs. PyTorch implementations]

To put the difference in perspective: in an extreme case of 10,000 samples with 10 feature dimensions, computing the RKE mode count with the Frobenius norm takes the NumPy version a notoriously long time, while the parallel PyTorch version finishes in under a second.

Offset Comparison

The table below shows the average offset between the original NumPy-based RKE implementation and the PyTorch-based implementation, demonstrating the accuracy of the latter. (The residual error is mainly due to randomness in the sampling.)

Method                  Offset
rke_mc                  0.559%
rke_mc_frobenius_norm   5.579e-11%
rrke                    0.001%
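The repository does not spell out how these offsets are computed; one plausible reading (an assumption, not stated in the source) is the relative difference between the two backends' outputs:

# Hypothetical: `score_numpy` and `score_torch` are the same metric
# computed by the NumPy and PyTorch implementations, respectively.
offset_percent = abs(score_torch - score_numpy) / abs(score_numpy) * 100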

Example Code

Here is an example script demonstrating how to use the RKE class to compute RKE scores.

import numpy as np
from RKEtorch import RKE


num_real_samples = num_fake_samples = 10000
feature_dim = 1000

# Draw "real" and "fake" features from the same Gaussian.
real_features = np.random.normal(loc=0.0, scale=1.0,
                                 size=[num_real_samples, feature_dim])

fake_features = np.random.normal(loc=0.0, scale=1.0,
                                 size=[num_fake_samples, feature_dim])

# kernel_bandwidth accepts a list of bandwidths.
kernel = RKE(kernel_bandwidth=[0.2, 0.3, 0.4])

print(kernel.compute_rke_mc(fake_features))               # RKE mode count
print(kernel.compute_rrke(real_features, fake_features))  # relative RKE

Block Operations

With a high feature dimension and a large number of samples, the full kernel computation can exhaust GPU memory. To address this, a block_size parameter performs the parallel computation one block at a time and iterates sequentially over the blocks, giving you an explicit memory/time trade-off. For example, with 10,000 samples of feature dimension 1,000, the NumPy code takes practically forever, while this code with a block_size of 100 returns the result in around 30 seconds.
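A hedged sketch of that trade-off in use; whether block_size is a constructor argument or belongs elsewhere should be verified against RKEtorch, so treat its placement here as an assumption:

# Assumption: block_size is passed to the RKE constructor;
# check the actual signature in RKEtorch.
kernel = RKE(kernel_bandwidth=[0.2, 0.3, 0.4], block_size=100)
print(kernel.compute_rke_mc(fake_features))  # processes the kernel in 100-sample blocks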
