[GSoC22] Data Augmentation Module in OpenCV (imgaug) #3335

Open: ZhaoChuyang wants to merge 23 commits into base: 4.x

@ZhaoChuyang ZhaoChuyang commented Aug 24, 2022

PR for GSoC'22 project on Efficient Data Augmentation Module in OpenCV for DL Training

I implemented data augmentation methods based on basic image processing and tested their performance in a Python environment. The following tables compare the running time (in seconds) of the OpenCV-Aug module and torchvision transforms on a subset of ImageNet:

single method:

| method | config | dataset | opencv | torchvision |
|---|---|---|---|---|
| resize | size: (200, 200) | imagenet-320 | 0.388 | 1.745 |
| center crop | size: (200, 200) | imagenet-320 | 0.001 | 0.017 |
| pad | padding: (100, 100, 100, 100) | imagenet-320 | 0.141 | 0.553 |
| random crop | size: (200, 200) | imagenet-320 | 0.001 | 0.028 |
| random resized crop | size: (500, 500) | imagenet-320 | 0.368 | 3.806 |
| random flip | default | imagenet-320 | 0.046 | 0.519 |

compose multiple methods:

| method | config | num_augs | opencv | pytorch |
|---|---|---|---|---|
| RandomCrop + RandomFlip + Pad | RandomCrop: size(300, 300); Pad: padding(100, 100, 100, 100) | 3 | 0.481 | 1.240 |
| Resize + Pad + RandomFlip + CenterCrop | Resize: size(400, 400); Pad: padding(100, 100, 100, 100); CenterCrop: size(200, 200) | 5 | 0.636 | 5.486 |

Besides augmentation methods for plain images, augmentation methods for detection and segmentation tasks are also added, which require processing the target labels of the corresponding tasks.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@ZhaoChuyang ZhaoChuyang changed the title add imgaug module [GSoC22] Data Augmentation Module in OpenCV (imgaug) Aug 24, 2022

@kaingwade kaingwade left a comment

Please add the short-form license and the documentation for the functionalities.


image = compose(image)

plt.imshow(image)

Can you call cv2.imshow() here to show the result?

Author

OK, I will change the Python sample in the next commit.

Comment on lines 48 to 71
// CV_EXPORTS_W void randomCropV1(InputOutputArray _src, const Size& sz, const Vec4i& padding, bool pad_if_need, int fill, int padding_mode){
// Mat src = _src.getMat();
//
// if(padding != Vec4i()){
// copyMakeBorder(src, src, padding[0], padding[1], padding[2], padding[3], padding_mode, fill);
// }
//
// // NOTE: make sure src.rows == src.size().height and src.cols = src.size().width
// // pad the height if needed
// if(pad_if_need && src.rows < sz.height){
// Vec4i _padding = {sz.height - src.rows, sz.height - src.rows, 0, 0};
// copyMakeBorder(src, src, _padding[0], _padding[1], _padding[2], _padding[3], padding_mode, fill);
// }
// // pad the width if needed
// if(pad_if_need && src.cols < sz.width){
// Vec4i _padding = {0, 0, sz.width - src.cols, sz.width - src.cols};
// copyMakeBorder(src, src, _padding[0], _padding[1], _padding[2], _padding[3], padding_mode, fill);
// }
//
// int x, y;
// getRandomCropParams(src.rows, src.cols, sz.height, sz.width, &x, &y);
// Mat cropped(src, Rect(x, y, sz.width, sz.height));
// (*(Mat*)_src.getObj()) = cropped;
// }

Please remove code that is unfinished, experimental, for debugging, and the like.

@ZhaoChuyang
Author

Hi @kaingwade, thanks for the advice. I have added the license and the documentation for the imgaug module. Junk code in the header files has also been cleaned up.

@ZhaoChuyang ZhaoChuyang marked this pull request as draft August 29, 2022 12:27
@ZhaoChuyang ZhaoChuyang marked this pull request as ready for review August 29, 2022 13:47
@asenyaev
Contributor

Hello @ZhaoChuyang!

I have added support for opencv_test_imgaug in CI. Could you push the latest commit again to re-run CI for this PR?

@ZhaoChuyang
Author

Hi @asenyaev, I have committed the latest changes.

Comment on lines +17 to +20
extern uint64 state;

//! Random number generator for data augmentation module
extern cv::RNG rng;
Contributor

Global variables with direct access are a very bad idea:

  • C++ does not define the initialization order of globals across translation units. rng and state may be initialized in any order, as may global variables in the user's code. It gets even worse if the library is linked statically.
  • It's not clear how to use this in a multi-threaded environment.

I propose 2 options:

  • Use the standard OpenCV theRNG. It uses a thread-local seed.
  • Provide your own theRNG-like function. It returns a reference to an RNG object that can be re-initialized.
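
A minimal sketch of the second option, assuming std::mt19937_64 as a stand-in for cv::RNG so that it builds without OpenCV. The names imgaug_sketch, theAugRNG, setAugSeed and randomOffset are hypothetical and do not come from the PR:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

namespace imgaug_sketch {  // hypothetical namespace, not the PR's

// Assumption: std::mt19937_64 stands in for cv::RNG in this sketch.
using Rng = std::mt19937_64;

// theRNG-like accessor. The generator is a function-local thread_local,
// so it is constructed on first use in each thread instead of at
// static-initialization time, and threads never share generator state.
inline Rng& theAugRNG() {
    thread_local Rng rng(0x12345678ULL);  // arbitrary default seed
    return rng;
}

// Re-initialize the calling thread's generator with a new seed.
inline void setAugSeed(std::uint64_t seed) {
    theAugRNG().seed(seed);
}

// Example consumer: draw a crop offset in the closed range [0, limit].
inline int randomOffset(int limit) {
    assert(limit >= 0);
    std::uniform_int_distribution<int> dist(0, limit);
    return dist(theAugRNG());
}

}  // namespace imgaug_sketch
```

Because the state is reachable only through the accessor, there is no cross-translation-unit initialization order to worry about, and reseeding in one thread does not affect the others. With the real cv::RNG the same shape would apply, with the accessor returning cv::RNG& instead.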

*/
CV_WRAP explicit Convert(int code);

/** @brief Apply data augmentation method on source image and its annotation.
Contributor

Please specify the method.

};

//! Convert the color space of the given image
class CV_EXPORTS_W Convert: public Transform{
Contributor

I propose to rename to ColorConvert. Convert is too generic.

Comment on lines +13 to +29
template<> struct pyopencvVecConverter<Ptr<cv::imgaug::Transform> >
{
static bool to(PyObject* obj, std::vector<cv::Ptr<cv::imgaug::Transform> >& value, const ArgInfo& info)
{
return pyopencv_to_generic_vec(obj, value, info);
}

};

template<> struct pyopencvVecConverter<Ptr<cv::imgaug::det::Transform> >
{
static bool to(PyObject* obj, std::vector<cv::Ptr<cv::imgaug::det::Transform> >& value, const ArgInfo& info)
{
return pyopencv_to_generic_vec(obj, value, info);
}

};
Contributor

Why do you need custom bindings for it?

* Brightness factor should be >= 0. When brightness factor is larger than 1, the output image will be brighter than original.
* When brightness factor is less than 1, the output image will be darker than original.
*/
void adjustBrightness(Mat& img, double brightness_factor);
Contributor

It makes sense to use InputArray and OutputArray for all functions in this header. Rationale:

  • Generic OpenCV interface.
  • UMat support, if the InputArray is passed through from the Transform classes directly, without a getMat() call.

Comment on lines +31 to +43
template<> struct PyOpenCV_Converter<unsigned long long>
{
static bool to(PyObject* obj, unsigned long long& value, const ArgInfo& info){
if(!obj || obj == Py_None)
return true;
if(PyLong_Check(obj)){
value = PyLong_AsUnsignedLongLong(obj);
}else{
return false;
}
return value != (unsigned int)-1 || !PyErr_Occurred();
}
};
Contributor

I have not found any long long or unsigned long long usage in the interface. Most probably the manual binding is not required.

Comment on lines +46 to +47
std::vector<Mat> gray_arrays = {gray, gray, gray};
merge(gray_arrays, gray);
Contributor

It could be cvtColor with COLOR_GRAY2BGR.

std::vector<Mat> new_channels;
for(int i=0; i < num_channels; i++){
Mat& channel = channels[i];
Scalar avg = mean(channel);
Contributor

The function cv::mean calculates the mean value M of array elements, independently for each channel, and returns it. There is no need to allocate an array and iterate over channels; the next arithmetic steps can be done for all channels together without a loop.

@asmorkalov
Contributor

asmorkalov commented Oct 18, 2022

General notes on test code:

  • There is no need to store cropped/flipped/converted images in opencv_extra if the test checks a single transformation. It's better to call the reference function from OpenCV itself. The test should check the new logic, not the behavior of OpenCV primitives.
  • Do not use ts->set_failed_test_info. It is an old API. Plain GTest EXPECT_XXX and ASSERT_XXX are simpler, and they make it very clear which condition fails.

@asmorkalov
Contributor

@ZhaoChuyang Friendly reminder.

@ZhaoChuyang
Author

Hi, sorry for the delay, I have been caught up with a deadline. I'll fix them ASAP.

@LaurentBerger
Contributor

What's new about this module?

5 participants