# CFG Parameters in the `[net]` section

## `[net]` section
- `batch=1` - number of samples (images, letters, ...) which will be processed in one batch
- `subdivisions=1` - number of mini_batches in one batch, size `mini_batch = batch/subdivisions`, so the GPU processes `mini_batch` samples at once, and the weights will be updated for `batch` samples (1 iteration processes `batch` images)
- `width=416` - network size (width), so every image will be resized to the network size during Training and Detection
- `height=416` - network size (height), so every image will be resized to the network size during Training and Detection
- `channels=3` - network size (channels), so every image will be converted to this number of channels during Training and Detection
- `inputs=256` - network size (inputs), used for non-image data: letters, prices, any custom data
- `max_chart_loss=20` - max value of Loss in the image `chart.png`
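As an illustration of how the parameters above interact, here is a minimal sketch of a `[net]` block; the concrete values (`batch=64`, `subdivisions=16`) are only illustrative, not required:

```
[net]
# 1 iteration = 64 images; the GPU processes mini_batch = batch/subdivisions = 64/16 = 4 images at once
batch=64
subdivisions=16
# every image is resized/converted to 416 x 416 x 3 before being fed to the network
width=416
height=416
channels=3
```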
## For training only

### Contrastive loss
- `contrastive=1` - use Supervised contrastive loss for training the Classifier (should be used with a `[contrastive]` layer)
- `unsupervised=1` - use Unsupervised contrastive loss for training the Classifier on images without labels (should be used with the `contrastive=1` parameter and with a `[contrastive]` layer)
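A minimal sketch of how these flags could appear in a classifier cfg, assuming a typical 224x224 classifier input; the `[contrastive]` layer is only indicated, its own parameters are not covered in this section:

```
[net]
# illustrative classifier input size and batch settings
batch=64
subdivisions=8
width=224
height=224
channels=3
# supervised contrastive loss for the Classifier
contrastive=1
# optionally also use unlabeled images
unsupervised=1

# ... classifier layers ...

# a [contrastive] layer must be present in the cfg (its own parameters are omitted here)
[contrastive]
```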
### Data augmentation
- `angle=0` - randomly rotates images during training (classification only)
- `saturation = 1.5` - randomly changes saturation of images during training
- `exposure = 1.5` - randomly changes exposure (brightness) during training
- `hue=.1` - randomly changes hue (color) during training, see https://en.wikipedia.org/wiki/HSL_and_HSV
- `blur=1` - blur will be applied randomly 50% of the time: if `1` - the background will be blurred except for objects, with `blur_kernel=31`; if `>1` - the whole image will be blurred with `blur_kernel=blur` (only for detection and only if OpenCV is used)
- `min_crop=224` - minimum size of randomly cropped image (classification only)
- `max_crop=448` - maximum size of randomly cropped image (classification only)
- `aspect=.75` - aspect ratio can be changed during cropping, from `0.75` to `1/0.75` (classification only)
- `letter_box=1` - keeps the aspect ratio of loaded images during training (detection training only; to use it during detection-inference, add the flag `-letter_box` at the end of the detection command)
- `cutmix=1` - use CutMix data augmentation (for Classifier only, not for Detector)
- `mosaic=1` - use Mosaic data augmentation (4 images in one)
- `mosaic_bound=1` - limits the size of objects when `mosaic=1` is used (does not allow bounding boxes to leave the borders of their images when Mosaic data augmentation is used)
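For example, a possible combination of these parameters in the `[net]` section of a detector cfg (a sketch using the values documented above; classifier-only options such as `angle`, `min_crop`, `max_crop`, `aspect` and `cutmix` are left out):

```
# data augmentation for a Detector (illustrative values)
saturation=1.5
exposure=1.5
hue=.1
# blur the background (except objects) in 50% of cases, requires OpenCV
blur=1
# keep the aspect ratio of loaded images
letter_box=1
# combine 4 images into one, and keep boxes inside their image borders
mosaic=1
mosaic_bound=1
```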
Data augmentation in the last `[yolo]`-layer:

- `jitter=0.3` - randomly changes the size of the image and its aspect ratio from `x(1 - 2*jitter)` to `x(1 + 2*jitter)`
- `random=1` - randomly resizes the network size after each 10 batches (iterations) from `/1.4` to `x1.4`, keeping the initial aspect ratio of the network size
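These two parameters go into the last `[yolo]` layer of the cfg, not into `[net]`. A sketch (the other `[yolo]` parameters required by the model, such as `anchors` and `classes`, are omitted here):

```
[yolo]
# ... anchors, classes and other [yolo] parameters omitted ...
# resize the image to x0.4 .. x1.6 of its size (1 +/- 2*jitter) and change its aspect ratio
jitter=0.3
# every 10 iterations resize the network randomly between /1.4 and x1.4
random=1
```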
- `adversarial_lr=1.0` - changes all detected objects to make them unlike themselves from the neural network's point of view (the neural network performs an adversarial attack on itself)
- `attention=1` - shows points of attention during training
- `gaussian_noise=1` - add gaussian noise
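A possible `[net]` snippet enabling these training-time options (purely illustrative):

```
# self-adversarial training: the network attacks its own detections
adversarial_lr=1.0
# visualize points of attention during training
attention=1
# add gaussian noise during training
gaussian_noise=1
```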
### Optimizer
- `momentum=0.9` - accumulation of movement: how much the history affects the further change of weights (optimizer)
- `decay=0.0005` - a weaker updating of the weights for typical features; it reduces imbalance in the dataset (optimizer) http://cs231n.github.io/neural-networks-3/
- `learning_rate=0.001` - initial learning rate for training
- `burn_in=1000` - initial burn_in will be processed for the first 1000 iterations, `current_learning_rate = learning_rate * pow(iterations / burn_in, power) = 0.001 * pow(iterations/1000, 4)`, where `power=4` by default
- `max_batches = 500200` - the training will be processed for this number of iterations (batches)
- `policy=steps` - policy for changing the learning rate: `constant` (by default), `sgdr`, `steps`, `step`, `sig`, `exp`, `poly`, `random` (e.g., if `policy=random`, then the current learning rate will be changed as `= learning_rate * pow(rand_uniform(0,1), power)`)
- `power=4` - if `policy=poly`, the learning rate will be `= learning_rate * pow(1 - current_iteration / max_batches, power)`
- `sgdr_cycle=1000` - if `policy=sgdr`, the initial number of iterations in a cosine-cycle
- `sgdr_mult=2` - if `policy=sgdr`, multiplier for the cosine-cycle https://towardsdatascience.com/https-medium-com-reina-wang-tw-stochastic-gradient-descent-with-restarts-5f511975163
- `steps=8000,9000,12000` - if `policy=steps`, at these numbers of iterations the learning rate will be multiplied by the corresponding `scales` factor
- `scales=.1,.1,.1` - if `policy=steps`, e.g. if `steps=8000,9000,12000`, `scales=.1,.1,.1` and the current iteration number is `10000`, then `current_learning_rate = learning_rate * scales[0] * scales[1] = 0.001 * 0.1 * 0.1 = 0.00001`
- `label_smooth_eps=0.1` - use label smoothing for training the Classifier
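Putting the schedule parameters together, a sketch of a `steps` schedule and the resulting learning rate at a few iterations (the step and scale values match the examples above and are illustrative):

```
momentum=0.9
decay=0.0005
learning_rate=0.001
burn_in=1000
policy=steps
steps=8000,9000,12000
scales=.1,.1,.1

# resulting learning rate:
#   iteration   500: 0.001 * pow(500/1000, 4) = 0.0000625  (burn_in phase)
#   iteration  5000: 0.001                                 (after burn_in, before the first step)
#   iteration 10000: 0.001 * 0.1 * 0.1        = 0.00001    (past steps 8000 and 9000)
#   iteration 13000: 0.001 * 0.1 * 0.1 * 0.1  = 0.000001   (past all three steps)
```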
## For training Recurrent networks
Object Detection/Tracking on Video - if `[conv_lstm]` or `[crnn]` layers are used in addition to `[connected]` and `[convolutional]` layers.
Text generation - if `[lstm]` or `[rnn]` layers are used in addition to `[connected]` layers.

- `track=1` - if set to `1`, the training will be performed in Recurrent-style for image sequences
- `time_steps=16` - training will be performed for a random image sequence that contains 16 images from the `train.txt` file:
  - for `[convolutional]`-layers: `mini_batch = time_steps*batch/subdivisions`
  - for `[conv_lstm]`-recurrent-layers: `mini_batch = batch/subdivisions` and `sequence=16`
- `augment_speed=3` - if set to `3`, then every 1st, 2nd or 3rd image can be used randomly, i.e. 16 images with indexes `0, 1, 2, ... 15` or `110, 113, 116, ... 155` from the `train.txt` file can be used
- `sequential_subdivisions=8` - a lower value increases the sequence of images: if `time_steps=16 batch=16 sequential_subdivisions=8`, then `time_steps*batch/sequential_subdivisions = 16*16/8 = 32` sequential images will be loaded with the same data-augmentation, so the model will be trained on a sequence of 32 video-frames
- `seq_scales=0.5, 0.5` - increases the sequence of images at certain steps, i.e. the coefficients by which the original `sequential_subdivisions` value will be multiplied (and `batch` will be divided, so the weights will be updated more rarely) at the corresponding `steps` if `policy=steps` or `policy=sgdr` is used
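As a worked sketch of the sequence arithmetic above (the values are illustrative and follow the example numbers in this section):

```
[net]
batch=16
subdivisions=8
# train in Recurrent-style on random sequences of 16 frames from train.txt
track=1
time_steps=16
# frame stride of 1, 2 or 3 is chosen randomly
augment_speed=3
# time_steps*batch/sequential_subdivisions = 16*16/8 = 32 sequential frames
# are loaded with the same data augmentation
sequential_subdivisions=8
# with policy=steps: sequential_subdivisions is multiplied by 0.5 at each step,
# so the trained sequence grows while batch is divided accordingly
seq_scales=0.5, 0.5
```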