- Install CUDA (make sure your CUDA version is compatible with your GPU)
- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```

Note: Use Python 3.6 or newer.
```console
> python train.py -h
usage: train.py [-h] [--epochs E] [--batch-size B] [--learning-rate LR]
                [--load LOAD] [--scale SCALE] [--size SIZE]
                [--interval INTERVAL] [--validation VAL] [--amp]

Train the UNet on images and target masks

optional arguments:
  -h, --help            show this help message and exit
  --epochs E, -e E      Number of epochs
  --batch-size B, -b B  Batch size
  --learning-rate LR, -l LR
                        Learning rate
  --load LOAD, -f LOAD  Load model from a .pth file
  --scale SCALE, -s SCALE
                        Downscaling factor of the images
  --size SIZE, -sz SIZE
                        Width and height of the images
  --interval INTERVAL   Set training skipping steps
  --validation VAL, -v VAL
                        Percent of the data that is used as validation (0-100)
  --amp                 Use mixed precision
```
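The options above can be reproduced with a small `argparse` setup. The sketch below is only an illustration of how such a parser might be defined; the function name and the default values are assumptions, not the repository's actual code:

```python
import argparse

def get_train_args(argv=None):
    # Hypothetical parser mirroring the train.py options shown above;
    # defaults are illustrative guesses, not the repository's real values.
    parser = argparse.ArgumentParser(description='Train the UNet on images and target masks')
    parser.add_argument('--epochs', '-e', metavar='E', type=int, default=5, help='Number of epochs')
    parser.add_argument('--batch-size', '-b', metavar='B', type=int, default=1, help='Batch size')
    parser.add_argument('--learning-rate', '-l', metavar='LR', type=float, default=1e-5, help='Learning rate')
    parser.add_argument('--load', '-f', metavar='LOAD', default=None, help='Load model from a .pth file')
    parser.add_argument('--scale', '-s', metavar='SCALE', type=float, default=None, help='Downscaling factor of the images')
    parser.add_argument('--size', '-sz', metavar='SIZE', type=int, nargs=2, default=[224, 224], help='Width and height of the images')
    parser.add_argument('--validation', '-v', metavar='VAL', type=float, default=10.0, help='Percent of the data used as validation (0-100)')
    parser.add_argument('--amp', action='store_true', help='Use mixed precision')
    return parser.parse_args(argv)

args = get_train_args(['--epochs', '10', '--batch-size', '4', '--amp'])
print(args.epochs, args.batch_size, args.amp)  # → 10 4 True
```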
By default, the image resizing (preprocessing) option is `--size`, which resizes images to a width and height of [224, 224]; you can change these values as you wish. To resize by a factor instead, use `--scale` (1.0 is the maximum and gives the best results; set it to 0.5 to use less memory).

Automatic mixed precision is also available with the `--amp` flag. Mixed precision lets the model use less memory and run faster on recent GPUs by using FP16 arithmetic. Enabling AMP is recommended.
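Mixed precision training in PyTorch is typically built on `torch.cuda.amp`. The following is a minimal, self-contained sketch of one AMP training step; the tiny stand-in model and random data are placeholders (not the repository's UNet or dataset), and `enabled` falls back to full precision so the snippet also runs on CPU:

```python
import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'
use_amp = device == 'cuda'  # AMP only pays off on a GPU; disabled on CPU

model = nn.Conv2d(3, 1, kernel_size=3, padding=1).to(device)  # stand-in for the UNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

images = torch.randn(2, 3, 224, 224, device=device)                    # dummy batch
masks = torch.randint(0, 2, (2, 1, 224, 224), device=device).float()   # dummy targets

optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=use_amp):  # forward pass runs in FP16 where safe
    preds = model(images)
    loss = nn.functional.binary_cross_entropy_with_logits(preds, masks)
scaler.scale(loss).backward()  # loss is scaled to avoid FP16 gradient underflow
scaler.step(optimizer)         # unscales gradients, then takes the optimizer step
scaler.update()                # adjusts the scale factor for the next iteration
print(float(loss))
```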
After training your model and saving it to `MODEL.pth`, you can easily test the output masks on your images via the CLI.
To predict a single image and save it:

```bash
python predict.py -i image.jpg -o output.jpg
```

To predict multiple images and show them without saving them:

```bash
python predict.py -i image1.jpg image2.jpg --viz --no-save
```
```console
> python predict.py -h
usage: predict.py [-h] [--model FILE] --input INPUT [INPUT ...]
                  [--output INPUT [INPUT ...]] [--viz] [--no-save]
                  [--mask-threshold MASK_THRESHOLD] [--scale SCALE]
                  [--size SIZE]

Predict masks from input images

optional arguments:
  -h, --help            show this help message and exit
  --model FILE, -m FILE
                        Specify the file in which the model is stored
  --input INPUT [INPUT ...], -i INPUT [INPUT ...]
                        Filenames of input images
  --output INPUT [INPUT ...], -o INPUT [INPUT ...]
                        Filenames of output images
  --viz, -v             Visualize the images as they are processed
  --no-save, -n         Do not save the output masks
  --mask-threshold MASK_THRESHOLD, -t MASK_THRESHOLD
                        Minimum probability value to consider a mask pixel white
  --scale SCALE, -s SCALE
                        Scale factor for the input images
  --size SIZE, -sz SIZE
                        Width and height of the images
```
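For intuition, `--mask-threshold` is a simple binarization step: every pixel whose predicted probability exceeds the threshold becomes white. A minimal NumPy sketch (the probability array here is made-up data, not actual model output):

```python
import numpy as np

def probs_to_mask(probs, threshold=0.5):
    """Binarize a probability map: pixels above threshold become white (255)."""
    return (probs > threshold).astype(np.uint8) * 255

probs = np.array([[0.1, 0.6],
                  [0.9, 0.4]])
print(probs_to_mask(probs, threshold=0.5))
# [[  0 255]
#  [255   0]]
```

Raising the threshold makes the predicted mask stricter (fewer white pixels); lowering it makes it more permissive.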
You can specify which model file to use with `--model MODEL.pth`.
As with training, the default image resizing (preprocessing) option is `--size`, which resizes images to a width and height of [224, 224]; you can change these values as you wish. To resize by a factor instead, use `--scale` (1.0 is the maximum and gives the best results; set it to 0.5 to use less memory).
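The effect of `--scale` is just downscaling arithmetic, which can be sketched with a small helper (an illustration only, not the repository's actual preprocessing code):

```python
def scaled_size(width, height, scale):
    """Compute target dimensions for a downscaling factor (0 < scale <= 1)."""
    assert 0 < scale <= 1, 'Scale must be between 0 and 1'
    # Round down, but never below 1 pixel
    return max(1, int(width * scale)), max(1, int(height * scale))

print(scaled_size(640, 480, 1.0))  # → (640, 480): full resolution, best quality
print(scaled_size(640, 480, 0.5))  # → (320, 240): a quarter of the pixels, less memory
```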
Note: Don't forget to change the directory paths in `train.py` to match where you stored the images. Those paths are on lines 23-25.