Does it really take 8 seconds to clean 1 second of speech? I built the same generator network, and it takes only 0.2 seconds for 1 second of data.
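To make the two numbers comparable, here is a rough sketch of how the per-second processing time could be measured. It is not the timing code from either setup: `real_time_factor` and `dummy_generator` are hypothetical names, the 16 kHz sample rate is an assumption, and the dummy function only stands in for the real generator call.

```python
import time


def real_time_factor(enhance, samples, sample_rate, repeats=5):
    """Return seconds of compute spent per second of audio (lower is faster)."""
    if not samples or sample_rate <= 0:
        raise ValueError("need non-empty audio and a positive sample rate")
    audio_seconds = len(samples) / sample_rate

    # Warm-up call so one-time setup cost (model load, graph build) is not timed.
    enhance(samples)

    # Keep the fastest of several runs to reduce noise from other processes.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        enhance(samples)
        best = min(best, time.perf_counter() - start)
    return best / audio_seconds


# Hypothetical stand-in for the generator; replace with the real model call.
def dummy_generator(samples):
    return [0.5 * s for s in samples]


one_second = [0.0] * 16000  # assumes 16 kHz audio, i.e. 1 second of samples
rtf = real_time_factor(dummy_generator, one_second, 16000)
print(f"{rtf:.4f} s of compute per 1 s of audio")
```

If the 8 seconds includes model loading or file I/O and the 0.2 seconds does not, the two measurements are not measuring the same thing, so it helps to time only the generator call as above.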