diff --git a/modules/imgaug/CMakeLists.txt b/modules/imgaug/CMakeLists.txt new file mode 100644 index 00000000000..7f3e19b6690 --- /dev/null +++ b/modules/imgaug/CMakeLists.txt @@ -0,0 +1,2 @@ +set(the_description "Data Augmentation Module") +ocv_define_module(imgaug opencv_imgproc opencv_core opencv_imgcodecs opencv_highgui WRAP python) diff --git a/modules/imgaug/LICENSE b/modules/imgaug/LICENSE new file mode 100644 index 00000000000..d6456956733 --- /dev/null +++ b/modules/imgaug/LICENSE @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. 
The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/modules/imgaug/include/opencv2/imgaug.hpp b/modules/imgaug/include/opencv2/imgaug.hpp new file mode 100644 index 00000000000..0781e57aa46 --- /dev/null +++ b/modules/imgaug/include/opencv2/imgaug.hpp @@ -0,0 +1,19 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_IMGAUG_HPP +#define OPENCV_IMGAUG_HPP + +#include "opencv2/imgaug/transforms.hpp" +#include "opencv2/imgaug/transforms_det.hpp" +#include "opencv2/imgaug/functional.hpp" +#include "opencv2/imgaug/rng.hpp" + +/** @defgroup imgaug Data Augmentation Module for Efficient Data Preprocessing + * @{ + * @defgroup det Data Augmentation for Object Detection + * @} +*/ + + +#endif \ No newline at end of file diff --git a/modules/imgaug/include/opencv2/imgaug/functional.hpp b/modules/imgaug/include/opencv2/imgaug/functional.hpp new file mode 100644 index 00000000000..8902b5e0c5a --- /dev/null +++ b/modules/imgaug/include/opencv2/imgaug/functional.hpp @@ -0,0 +1,46 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_AUG_FUNCTIONAL_HPP +#define OPENCV_AUG_FUNCTIONAL_HPP +#include +#include + +namespace cv { + //! @addtogroup imgaug + //! @{ + + /** @brief Adjust the brightness of the given image. + * + * @param img Source image. This operation is inplace. + * @param brightness_factor brightness factor which controls the brightness of the adjusted image. + * Brightness factor should be >= 0. When brightness factor is larger than 1, the output image will be brighter than original. + * When brightness factor is less than 1, the output image will be darker than original. + */ + void adjustBrightness(Mat& img, double brightness_factor); + + /** @brief Adjust the contrast of the given image. + * + * @param img Source image. This operation is inplace. + * @param contrast_factor contrast factor should be larger than 1. It controls the contrast of the adjusted image. + */ + void adjustContrast(Mat& img, double contrast_factor); + + /** @brief Adjust the saturation of the given image. + * + * @param img Source image. This operation is inplace. 
+ * @param saturation_factor saturation factor should be larger than 1. It controls the saturation of the adjusted image. + */ + void adjustSaturation(Mat& img, double saturation_factor); + + /** @brief Adjust the hue of the given image. + * + * @param img Source image. This operation is inplace. + * @param hue_factor hue factor should be in range [-1, 1]. It controls the hue of the adjusted image. + */ + void adjustHue(Mat& img, double hue_factor); + + //! @} +}; + +#endif \ No newline at end of file diff --git a/modules/imgaug/include/opencv2/imgaug/rng.hpp b/modules/imgaug/include/opencv2/imgaug/rng.hpp new file mode 100644 index 00000000000..7f283298218 --- /dev/null +++ b/modules/imgaug/include/opencv2/imgaug/rng.hpp @@ -0,0 +1,35 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_AUG_RNG_HPP +#define OPENCV_AUG_RNG_HPP + + + +namespace cv{ + + namespace imgaug{ + //! @addtogroup imgaug + //! @{ + + //! Initial state of the random number generator cv::imgaug::rng. If you don't manually set it using cv::imgaug::setSeed, + //! it will be set to the current tick count returned by cv::getTickCount. + extern uint64 state; + + //! Random number generator for data augmentation module + extern cv::RNG rng; + + /** @brief Manually set the initial state of the random number generator cv::imgaug::rng. + * + * @param seed The seed value needed to generate a random number. + */ + CV_EXPORTS_W void setSeed(uint64 seed); + + //! @} + } +} + + + + +#endif //OPENCV_AUG_RNG_HPP diff --git a/modules/imgaug/include/opencv2/imgaug/transforms.hpp b/modules/imgaug/include/opencv2/imgaug/transforms.hpp new file mode 100644 index 00000000000..269e2f2bf51 --- /dev/null +++ b/modules/imgaug/include/opencv2/imgaug/transforms.hpp @@ -0,0 +1,432 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_AUG_TRANSFORMS_HPP +#define OPENCV_AUG_TRANSFORMS_HPP + +#include +#include +#include + + +namespace cv{ + //! Data augmentation module + namespace imgaug{ + + //! @addtogroup imgaug + //! @{ + + //! Base class for all data augmentation classes. + class CV_EXPORTS_W Transform{ + public: + CV_WRAP virtual void call(InputArray src, OutputArray dst) const = 0; + CV_WRAP virtual ~Transform() = default; + }; + + //! Combine a series of data augmentation methods into one and apply them sequentially. + class CV_EXPORTS_W Compose{ + public: + /** @brief Initialize the Compose class by passing a series of data augmentation you want to apply. + * + * @param transforms Series of data augmentation methods. All data augmentation classes should inherited from cv::imgaug::Transform. + */ + CV_WRAP explicit Compose(std::vector >& transforms); + /** @brief Call composed data augmentation methods, apply them to the input image sequentially. + * + * @param src Source image. + * @param dst Destination image. + * + * @note Some data augmentation methods only support images in certain formats. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const; + + //! vector of the pointers to the data augmentation instances. + std::vector > transforms; + }; + + //! Crop the given image at a random location + class CV_EXPORTS_W RandomCrop: public Transform{ + public: + /** @brief Initialize the RandomCrop class. 
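+ *
+ * A minimal usage sketch (the file name and crop size below are illustrative only, not part of the API):
+ * @code
+ * cv::Mat src = cv::imread("input.jpg"), dst;
+ * cv::imgaug::RandomCrop crop(cv::Size(100, 100));
+ * crop.call(src, dst);
+ * @endcode
+ *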
+ * + * @param sz Size of the cropped image. + * @param padding Padding on the borders of the source image. Four element tuple needs to be provided, + * which is the padding for the top, bottom, left and right respectively. By default no padding is added. + * @param pad_if_need When the cropped size is smaller than the source image (with padding), exception will raise. + * Set this value to true to automatically pad the image to avoid this exception. + * @param fill Fill value of the padded pixels. By default is 0. + * @param padding_mode Type of padding. Default is #BORDER_CONSTANT, see #BorderTypes for details. + */ + CV_WRAP explicit RandomCrop(const Size& sz, const Vec4i& padding=Vec4i(0,0,0,0), bool pad_if_need=false, int fill=0, int padding_mode=BORDER_CONSTANT); + + CV_WRAP ~RandomCrop() override = default; + + /** @brief Apply augmentation method on source image, this operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Size sz; + Vec4i padding; + bool pad_if_need; + int fill; + int padding_mode; + }; + + //! Flip the image randomly along specified axes. + class CV_EXPORTS_W RandomFlip: public Transform{ + public: + /** Initialize the RandomFlip class. + * + * @param flipCode flipCode to specify the axis along which image is flipped. Set + * 0 for vertical axis, positive for horizontal axis, negative for both axes. + * \f[\texttt{dst} _{ij} = + \left\{ + \begin{array}{l l} + \texttt{src} _{\texttt{src.rows}-i-1,j} & if\; \texttt{flipCode} = 0 \\ + \texttt{src} _{i, \texttt{src.cols} -j-1} & if\; \texttt{flipCode} > 0 \\ + \texttt{src} _{ \texttt{src.rows} -i-1, \texttt{src.cols} -j-1} & if\; \texttt{flipCode} < 0 \\ + \end{array} + \right.\f] + * @param p Probability to apply this method. p should be in range 0 to 1, larger p denotes higher probability. + */ + CV_WRAP explicit RandomFlip(int flipCode=0, double p=0.5); + + CV_WRAP ~RandomFlip() override = default; + + /** @brief Apply augmentation method on source image, this operation is inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + int flipCode; + double p; + }; + + //! Resize the image to specified size + class CV_EXPORTS_W Resize: public Transform{ + public: + /** @brief Initialize the Resize class. + * + * @param sz Size of the resized image. + * @param interpolation Interpolation mode. Refer to #InterpolationFlags for more details. + */ + CV_WRAP explicit Resize(const Size& sz, int interpolation=INTER_LINEAR); + + CV_WRAP ~Resize() override = default; + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Size sz; + int interpolation; + }; + + //! Crop the given image at the center + class CV_EXPORTS_W CenterCrop : public Transform { + public: + /** @brief Initialize the CenterCrop class. + * + * @param size Size of the cropped image. + */ + CV_WRAP explicit CenterCrop(const Size& size); + + CV_WRAP ~CenterCrop() override = default; + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Size size; + }; + + //! Pad the given image on the borders. 
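+ //! A minimal usage sketch (padding and fill values are illustrative; cv::Mat src and dst are assumed to exist):
+ //! @code
+ //! cv::imgaug::Pad pad(cv::Vec4i(10, 10, 10, 10), cv::Scalar(0, 0, 0));
+ //! pad.call(src, dst);
+ //! @endcode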
+ class CV_EXPORTS_W Pad : public Transform{ + public: + /** Initialize the Pad class. + * + * @param padding Padding on the borders of the source image. Four-elements tuple needs to be provided, + * which is the padding for the top, bottom, left and right respectively. + * @param fill Fill value of the padded pixels. By default fill value is 0 for all channels. + * @param padding_mode Type of padding. Default is #BORDER_CONSTANT, see #BorderTypes for details. + */ + CV_WRAP explicit Pad(const Vec4i& padding, const Scalar& fill = Scalar(), int padding_mode = BORDER_CONSTANT); + + CV_WRAP ~Pad() override = default; + + /** @brief Apply augmentation method on source image. This operation is inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Vec4i padding; + const Scalar fill; + int padding_mode; + }; + + //! Crop a random portion of image and resize it to a given size. + class CV_EXPORTS_W RandomResizedCrop : public Transform { + public: + /** @brief Initialize the RandomResizedCrop class. + * + * @param size Expected output size of the destination image. + * @param scale Specify the the lower and upper bounds for the random area of the crop, + before resizing. The scale is defined with respect to the area of the original image. + * @param ratio lower and upper bounds for the random aspect ratio of the crop, before + resizing. + * @param interpolation Interpolation mode. Refer to #InterpolationFlags for more details. + */ + CV_WRAP explicit RandomResizedCrop(const Size& size, const Vec2d& scale = Vec2d(0.08, 1.0), const Vec2d& ratio = Vec2d(3.0 / 4.0, 4.0 / 3.0), int interpolation = INTER_LINEAR); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Size size; + Vec2d scale; + Vec2d ratio; + int interpolation; + }; + + //! Change the brightness, contrast, saturation and hue of the given image randomly. The activated functions are applied in random order. + class CV_EXPORTS_W ColorJitter : public Transform { + public: + /** Initialize the ColorJitter class. + * + * @param brightness Specify the lower and upper bounds for the brightness factor. + * Brightness factor is >= 0. When brightness factor is 1, the brightness of the augmented image will not be changed. + * When brightness factor is larger, the augmented image is brighter. + * By default this function is disabled. + * You can also pass cv::Vec2d() to disable this function manually. + * @param contrast Specify the lower and upper bounds for the contrast factor. + * Contrast factor is >= 0. When contrast factor is 1, the contrast of the augmented image will not be changed. + * When contrast factor is larger, the contrast of the destination image is larger. + * By default this function is disabled. You can also pass cv::Vec2d() to disable this function manually. + * @param saturation Specify the lower and upper bounds for the saturation factor. + * Saturation factor is >= 0. When saturation factor is 1, the saturation of the augmented image will not be changed. + * When saturation factor is larger, the saturation of the destination image is larger. + * By default this function is disabled. You can also pass cv::Vec2d() to disable this function manually. + * @param hue Specify the lower and upper bounds for the hue factor. 
+ * Hue factor should be in range of -1 to 1. When hue factor is 0, the hue of the augmented image will not be changed. + * By default this function is disabled. You can also pass cv::Vec2d() to disable this function manually. + */ + CV_WRAP explicit ColorJitter(const Vec2d& brightness=Vec2d(), const Vec2d& contrast=Vec2d(), const Vec2d& saturation=Vec2d(), const Vec2d& hue=Vec2d()); + + /** Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Vec2d brightness; + Vec2d contrast; + Vec2d saturation; + Vec2d hue; + }; + + //! Rotate the given image by a random degree. + class CV_EXPORTS_W RandomRotation : public Transform { + public: + /** @brief Initialize the RandomRotation class. + * + * @param degrees Specify the lower and upper bounds for the rotation degree. + * @param interpolation Interpolation mode. Refer to #InterpolationFlags for more details. + * @param center Rotation center, origin is the left corner of the image. By default it is set to the center of the image. + * @param fill Fill value for the area outside the rotated image. Default is 0 for all channels. + */ + CV_WRAP explicit RandomRotation(const Vec2d& degrees, int interpolation=INTER_LINEAR, const Point2f& center=Point2f(), const Scalar& fill=Scalar()); + + /** Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Vec2d degrees; + int interpolation; + Point2f center; + Scalar fill; + }; + + //! Convert the image into grayscale image of specified channels. + class CV_EXPORTS_W GrayScale : public Transform { + public: + /** @brief Initialize the GrayScale class. + * + * @param num_channels number of the channels of the destination image. All channels are same. + */ + CV_WRAP explicit GrayScale(int num_channels=1); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + int num_channels; + }; + + //! Convert the given image into grayscale given a certain probability. + class CV_EXPORTS_W RandomGrayScale : public Transform { + public: + /** @brief Initialize the RandomGrayScale class. + * + * @param p Probability of turning a image into grayscale. p should be in range 0 to 1. A larger p means a higher probability. + */ + CV_WRAP explicit RandomGrayScale(double p=0.1); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + double p; + }; + + //! Randomly erase a area of the given image. + class CV_EXPORTS_W RandomErasing : public Transform { + public: + /** Initialize the RandomErasing class. + * + * @param p Probability to apply the random erasing operation. + * @param scale Range of proportion of erased area against input image. + * @param ratio Range of aspect ratio of erased area. + * @param value Fill value of the erased area. + * @param inplace If true, erase the area on the source image. + * If false, erase the area on the destination image, which will not affect the source image. 
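+ *
+ * A minimal usage sketch (the probability value is illustrative; cv::Mat src and dst are assumed to exist):
+ * @code
+ * cv::imgaug::RandomErasing erase(0.8);
+ * erase.call(src, dst);
+ * @endcode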
+ */ + CV_WRAP explicit RandomErasing(double p=0.5, const Vec2d& scale=Vec2d(0.02, 0.33), const Vec2d& ratio=Vec2d(0.3, 0.33), const Scalar& value=Scalar(0, 100, 100), bool inplace=false); + + /** @brief Apply augmentation method on source image. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + double p; + Vec2d scale; + Vec2d ratio; + Scalar value; + bool inplace; + }; + + + //! Normalize given image with mean and standard deviation. + //! The destination image will be normalized into range 0 to 1 first, + //! then the normalization operation will be applied to each channel of the image. + class CV_EXPORTS_W Normalize : public Transform { + public: + /** @brief Initialize the Normalize class. + * + * @param mean Sequence of means for each channels. + * @param std Sequence of standard deviations for each channels. + * + * @note The image read in OpenCV is of type BGR by default, you should provide the mean and std in order of [B,G,R] if the type of source image is BGR. + */ + CV_WRAP explicit Normalize(const Scalar& mean=Scalar(0,0,0,0), const Scalar& std=Scalar(1,1,1,1)); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Scalar mean; + Scalar std; + }; + + //! Blurs image with randomly chosen Gaussian blur. + class CV_EXPORTS_W GaussianBlur : public Transform { + public: + /** @brief Initialize the GaussianBlur class. + * + * @param kernel_size Size of the gaussian kernel. + * @param sigma Specify the lower and upper bounds of the standard deviation to be used for creating kernel to perform blurring. + */ + CV_WRAP explicit GaussianBlur(const Size& kernel_size, const Vec2f& sigma=Vec2f(0.1, 2.0)); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Size kernel_size; + Vec2f sigma; + }; + + //! Apply random affine transformation to the image. + class CV_EXPORTS_W RandomAffine: public Transform{ + public: + /** Initialize the RandomAffine class. + * + * @param degrees Range of rotation degrees to select from. + * @param translations Tuple of maximum absolute fraction for horizontal and vertical translations. By default translation is 0 in both directions. + * @param scales Scaling factor interval. The scale factor is sampled uniformly from the interval. By default scale factor is 1. + * @param shears Range of degrees to select from. Degree along x axis shear_x is sampled from range [shears[0], shear[1]]. Degree along y axis shear_y is sampled from range [shears[2], shear[3]]. By default, shear_x and shear_y are all 0. + * @param interpolation Interpolation mode. Refer to #InterpolationFlags for more details. + * @param fill Fill value of the area outside the transformed image. + * @param center Rotation center. Origin is the left corner of the image. By default it is set to the center of the image. 
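+ *
+ * A minimal usage sketch (the degree range is illustrative; cv::Mat src and dst are assumed to exist):
+ * @code
+ * cv::imgaug::RandomAffine affine(cv::Vec2f(-15.f, 15.f));
+ * affine.call(src, dst);
+ * @endcode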
+ */ + CV_WRAP explicit RandomAffine(const Vec2f& degrees=Vec2f(0., 0.), const Vec2f& translations=Vec2f(0., 0.), const Vec2f& scales=Vec2f(1., 1.), const Vec4f& shears=Vec4f(0., 0., 0., 0.), int interpolation=INTER_NEAREST, const Scalar& fill=Scalar(), const Point2i& center=Point2i(-1, -1)); + + /** @brief Apply augmentation method on source image. This operation is not inplace. + * + * @param src Source image. + * @param dst Destination image. + */ + CV_WRAP void call(InputArray src, OutputArray dst) const override; + + Vec2f degrees; + Vec2f translations; + Vec2f scales; + Vec4f shears; + int interpolation; + Scalar fill; + Point2i center; + + }; + + //! @cond IGNORED + void grayScale(InputArray _src, OutputArray _dst, int num_channels); + void randomCrop(InputArray src, OutputArray dst, const Size& sz, const Vec4i& padding=Vec4i() , bool pad_if_need=false, int fill=0, int padding_mode=BORDER_CONSTANT);CV_EXPORTS_W void randomFlip(InputArray src, OutputArray dst, int flipCode=0, double p=0.5); + void centerCrop(InputArray src, OutputArray dst, const Size& size); + void randomResizedCrop(InputArray src, OutputArray dst, const Size& size, const Vec2d& scale = Vec2d(0.08, 1.0), const Vec2d& ratio = Vec2d(3.0 / 4.0, 4.0 / 3.0), int interpolation = INTER_LINEAR); + void colorJitter(InputArray src, OutputArray dst, const Vec2d& brightness=Vec2d(), const Vec2d& contrast=Vec2d(), const Vec2d& saturation=Vec2d(), const Vec2d& hue=Vec2d()); + void randomRotation(InputArray src, OutputArray dst, const Vec2d& degrees, int interpolation=INTER_LINEAR, const Point2f& center=Point2f(), const Scalar& fill=Scalar(0)); + void randomGrayScale(InputArray src, OutputArray dst, double p=0.1); + void randomErasing(InputArray src, OutputArray dst, double p=0.5, const Vec2d& scale=Vec2d(0.02, 0.33), const Vec2d& ratio=Vec2d(0.3, 0.33), const Scalar& value=Scalar(0, 100, 100), bool inplace=false); + void gaussianBlur(InputArray src, OutputArray dst, const Size& kernel_size, const Vec2f& sigma=Vec2f(0.1, 2.0)); + void randomAffine(InputArray src, OutputArray dst, const Vec2f& degrees=Vec2f(0., 0.), const Vec2f& translations=Vec2f(0., 0.), const Vec2f& scales=Vec2f(1., 1.), const Vec4f& shears=Vec4f(0., 0., 0., 0.), int interpolation=INTER_NEAREST, const Scalar& fill=Scalar(), const Point2i& center=Point2i(-1, -1)); + //! @endcond + + //! @} + + } +} + +#endif diff --git a/modules/imgaug/include/opencv2/imgaug/transforms_det.hpp b/modules/imgaug/include/opencv2/imgaug/transforms_det.hpp new file mode 100644 index 00000000000..b3c8b6bd047 --- /dev/null +++ b/modules/imgaug/include/opencv2/imgaug/transforms_det.hpp @@ -0,0 +1,237 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_TRANSFORMS_DET_HPP +#define OPENCV_TRANSFORMS_DET_HPP + + +namespace cv{ + namespace imgaug{ + namespace det{ + + //! @addtogroup det + //! @{ + + //! Base class for all data augmentation classes for detection task + class CV_EXPORTS_W Transform{ + public: + CV_WRAP virtual void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, CV_IN_OUT std::vector& labels) const = 0; + CV_WRAP virtual ~Transform() = default; + }; + + //! Combine data augmentation methods into one and apply them sequentially to source image and annotation + //! 
All combined data augmentation class must inherited from cv::imgaug::det::Transform + class CV_EXPORTS_W Compose : public Transform{ + public: + /** @brief Initialize Compose class. + * + * @param transforms data augmentation methods used to compose + */ + CV_WRAP explicit Compose(std::vector >& transforms); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. + * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, CV_IN_OUT std::vector& labels) const override; + + std::vector > transforms; + }; + + class CV_EXPORTS_W RandomFlip: public Transform{ + public: + /** @brief Initialize the RandomFlip class. + * + * @param flipCode flipCode to specify the axis along which image is flipped. Set + * 0 for vertical axis, positive for horizontal axis, negative for both axes. + * \f[\texttt{dst} _{ij} = + \left\{ + \begin{array}{l l} + \texttt{src} _{\texttt{src.rows}-i-1,j} & if\; \texttt{flipCode} = 0 \\ + \texttt{src} _{i, \texttt{src.cols} -j-1} & if\; \texttt{flipCode} > 0 \\ + \texttt{src} _{ \texttt{src.rows} -i-1, \texttt{src.cols} -j-1} & if\; \texttt{flipCode} < 0 \\ + \end{array} + \right.\f] + * @param p Probability to apply this method. p should be in range 0 to 1, larger p denotes higher probability. + */ + CV_WRAP explicit RandomFlip(int flipCode=0, float p=0.5); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. + * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, std::vector& labels) const override; + + /** @brief Flip the annotated bounding boxes. + * + * @param bboxes Bounding box annotations. + * @param size The size of the source image. + */ + void flipBoundingBox(std::vector& bboxes, const Size& size) const; + + int flipCode; + float p; + }; + +// class CV_EXPORTS_W RandomCrop: cv::det::Transform{ +// public: +// CV_WRAP explicit RandomCrop(const Size& sz, const Vec4i& padding=Vec4i() , bool pad_if_need=false, const Scalar& fill=Scalar(), int padding_mode=BORDER_CONSTANT); +// CV_WRAP void call(InputArray src, OutputArray dst, std::vector& target) const; +// +// const Size sz; +// Vec4i padding; +// bool pad_if_need; +// Scalar fill; +// int padding_mode; +// }; + + + //! Resize the source image and its annotations into specified size. 
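+ //! A minimal usage sketch (the target size is illustrative; bboxes and labels are the caller's existing annotation vectors):
+ //! @code
+ //! cv::imgaug::det::Resize resize(cv::Size(224, 224));
+ //! resize.call(src, dst, bboxes, labels);
+ //! @endcode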
+ class CV_EXPORTS_W Resize: public Transform{ + public: + /** @brief Initialize the Resize class + * + * @param size Size of the resized image. + * @param interpolation Interpolation mode when resize image, see #InterpolationFlags for details. + */ + CV_WRAP explicit Resize(const Size& size, int interpolation=INTER_NEAREST); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. + * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, std::vector& labels) const override; + + /** @brief Resize the bounding boxes of the detected objects in the source image. + * + * @param bboxes Bounding box annotations. + * @param imgSize The size of the source image. + */ + void resizeBoundingBox(std::vector& bboxes, const Size& imgSize) const; + + const Size size; + int interpolation; + }; + + //! Convert the color space of the given image + class CV_EXPORTS_W Convert: public Transform{ + public: + /** @brief Initialize the Convert class + * + * @param code color space conversion code (see #ColorConversionCodes). + */ + CV_WRAP explicit Convert(int code); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. + * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, std::vector& labels) const override; + + int code; + }; + + //! Randomly translate the given image. + //! Bounding boxes which has an area of less than the threshold in the remaining in the transformed image + //! will be filtered. + //! The resolution of the image is not changed after the transformation. The remaining area after shift is filled with 0. + class CV_EXPORTS_W RandomTranslation: public Transform{ + public: + /** @brief Initialize the RandomTranslation class + * + * @param translations Contains two elements tx and ty, representing tha maximum translation distances + * along x axis and y axis in pixels. tx and ty must be >= 0. The actual translation distances along x and y axes + * are sampled uniformly from [-tx, tx] and [-ty, ty]. + * @param threshold Bounding boxes with area in the remaining image less than threshold will be dropped. + */ + CV_WRAP explicit RandomTranslation(const Vec2i& translations, float threshold=0.25); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. 
+ * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, std::vector& labels) const override; + + /** @brief Translate bounding boxes and filter invalid bounding boxes after translation. + * + * @param bboxes Bounding box annotations. + * @param labels Class labels of the detected objects in source image. + * @param imgSize Size of the source image. + * @param tx Translation in x axis in pixel. + * @param ty Translation in y axis in pixel. + */ + CV_WRAP void translateBoundingBox(std::vector& bboxes, std::vector &labels, const Size& imgSize, int tx, int ty) const; + + Vec2i translations; + float threshold; + }; + + //! Rotate the given image and its bounding boxes by a random angle. + //! Filter invalid bounding boxes if its remaining area in the destination image is less than threshold. + //! The size of the destination image is not changed. The remaining area in the destination image is filled with 0. + class CV_EXPORTS_W RandomRotation: public Transform{ + public: + /** @brief Initialize the RandomRotation class. + * + * @param angles Intervals in which the rotation angle is uniformly sampled from. + * @param threshold Bounding boxes with area in the remaining image less than threshold will be dropped. + */ + explicit RandomRotation(const Vec2d& angles, double threshold=0.25); + + /** @brief Apply data augmentation method on source image and its annotation. + * + * @param src Source image. + * @param dst Destination image. + * @param bboxes Annotation of source image, which consists of several bounding boxes of the detected objects in the source image. + * In Python, the bounding box is represented as a four-elements tuple (x, y, w, h), + * in which x, y is the coordinates of the left top corner of the bounding box and w, h is the width and height of the bounding box. + * @param labels Class labels of the detected objects in source image. The order of the labels should correspond to the order of the bboxes. + */ + CV_WRAP void call(InputArray src, OutputArray dst, CV_IN_OUT std::vector& bboxes, std::vector& labels) const override; + + /** @brief Rotate bounding boxes and filter out invalid bounding boxes after rotation. + * + * @param bboxes Bounding box annotations. + * @param labels Class labels of the detected objects in source image. + * @param angle Rotation angle in degree. + * @param cx x coordinate of the rotation center. + * @param cy y coordinate of the rotation center. + * @param imgSize Size of the destination image, used for clamping the coordinates of bounding boxes. + */ + CV_WRAP void rotateBoundingBoxes(std::vector& bboxes, std::vector &labels, double angle, int cx, int cy, const Size& imgSize) const; + + Vec2d angles; + double threshold; + }; + + //! 
@} + } + } +} + +#endif //OPENCV_TRANSFORMS_DET_HPP diff --git a/modules/imgaug/misc/python/pyopencv_imgaug.hpp b/modules/imgaug/misc/python/pyopencv_imgaug.hpp new file mode 100644 index 00000000000..a829fbe3cf9 --- /dev/null +++ b/modules/imgaug/misc/python/pyopencv_imgaug.hpp @@ -0,0 +1,45 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_AUG_MISC_PYTHON_HPP +#define OPENCV_AUG_MISC_PYTHON_HPP +typedef std::vector > vector_Ptr_Transform; +typedef std::vector > vector_Ptr_imgaug_det_Transform; + +//template<> +//bool pyopencv_to(PyObject *o, std::vector > &value, const ArgInfo& info){ +// return pyopencv_to_generic_vec(o, value, info); +//} +template<> struct pyopencvVecConverter > +{ + static bool to(PyObject* obj, std::vector >& value, const ArgInfo& info) + { + return pyopencv_to_generic_vec(obj, value, info); + } + +}; + +template<> struct pyopencvVecConverter > +{ + static bool to(PyObject* obj, std::vector >& value, const ArgInfo& info) + { + return pyopencv_to_generic_vec(obj, value, info); + } + +}; + +template<> struct PyOpenCV_Converter +{ + static bool to(PyObject* obj, unsigned long long& value, const ArgInfo& info){ + if(!obj || obj == Py_None) + return true; + if(PyLong_Check(obj)){ + value = PyLong_AsUnsignedLongLong(obj); + }else{ + return false; + } + return value != (unsigned int)-1 || !PyErr_Occurred(); + } +}; + +#endif \ No newline at end of file diff --git a/modules/imgaug/samples/det_compose_sample.cpp b/modules/imgaug/samples/det_compose_sample.cpp new file mode 100644 index 00000000000..d5a3656879a --- /dev/null +++ b/modules/imgaug/samples/det_compose_sample.cpp @@ -0,0 +1,50 @@ +#include +#include +#include +#include +#include + +using namespace cv; + + +static void drawBoundingBoxes(Mat& img, std::vector& bboxes){ + for(cv::Rect bbox: bboxes){ + cv::Point tl {bbox.x, bbox.y}; + cv::Point br {bbox.x + bbox.width, bbox.y + bbox.height}; + cv::rectangle(img, tl, br, cv::Scalar(0, 255, 0), 2); + } +} + + +int main(){ + Mat src = imread(samples::findFile("lena.jpg"), IMREAD_COLOR); + Mat dst; + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels {1, 2}; + + Mat ori_src; + src.copyTo(ori_src); + drawBoundingBoxes(ori_src, bboxes); + + imgaug::det::RandomRotation randomRotation(Vec2d(-30, 30)); + imgaug::det::RandomFlip randomFlip(1); + imgaug::det::Resize resize(Size(224, 224)); + + std::vector > transforms {&randomRotation, &randomFlip, &resize}; + imgaug::det::Compose aug(transforms); + + aug.call(src, dst, bboxes, labels); + + drawBoundingBoxes(dst, bboxes); + + imshow("src", ori_src); + imshow("dst", dst); + waitKey(0); + + return 0; +} \ No newline at end of file diff --git a/modules/imgaug/samples/det_sample.cpp b/modules/imgaug/samples/det_sample.cpp new file mode 100644 index 00000000000..2e553e44127 --- /dev/null +++ b/modules/imgaug/samples/det_sample.cpp @@ -0,0 +1,43 @@ +#include +#include +#include +#include + +using namespace cv; + + +static void drawBoundingBoxes(Mat& img, std::vector& bboxes){ + for(cv::Rect bbox: bboxes){ + cv::Point tl {bbox.x, bbox.y}; + cv::Point br {bbox.x + bbox.width, bbox.y + bbox.height}; + cv::rectangle(img, tl, br, cv::Scalar(0, 255, 0), 2); + } +} + + +int main(){ + Mat src = imread(samples::findFile("lena.jpg"), IMREAD_COLOR); + Mat dst; + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 
228} + }; + + std::vector labels {1, 2}; + + Mat ori_src; + src.copyTo(ori_src); + drawBoundingBoxes(ori_src, bboxes); + + imgaug::det::RandomRotation aug(Vec2d(-30, 30)); + aug.call(src, dst, bboxes, labels); + + drawBoundingBoxes(dst, bboxes); + + imshow("src", ori_src); + imshow("dst", dst); + waitKey(0); + + return 0; +} \ No newline at end of file diff --git a/modules/imgaug/samples/opencv_aug_demo.py b/modules/imgaug/samples/opencv_aug_demo.py new file mode 100644 index 00000000000..a6dd8f45170 --- /dev/null +++ b/modules/imgaug/samples/opencv_aug_demo.py @@ -0,0 +1,55 @@ +import cv2 +import copy + + +def random_crop(image): + transform = cv2.imgaug.RandomCrop((300, 300)) + return transform.call(image) + + +def random_flip(image): + transform = cv2.imgaug.RandomFlip(flipCode=1, p=0.8) + return transform.call(image) + + +def center_crop(image): + transform = cv2.imgaug.CenterCrop(size=(100, 100)) + return transform.call(image) + + +def pad(image): + transform = cv2.imgaug.Pad(padding=(10, 10, 10, 10)) + return transform.call(image) + + +def random_resized_crop(image): + transform = cv2.imgaug.RandomResizedCrop(size=(100, 100)) + return transform.call(image) + + +def compose(image): + transform = cv2.imgaug.Compose([ + cv2.imgaug.Resize((1024, 1024)), + cv2.imgaug.RandomCrop((800, 800)), + cv2.imgaug.RandomFlip(), + cv2.imgaug.CenterCrop((512, 512)), + ]) + return transform.call(image) + + +def main(): + # read image + input_path = "../../../samples/data/corridor.jpg" + src = cv2.imread(input_path) + + while True: + image = copy.copy(src) + image = compose(image) + cv2.imshow("dst", image) + ch = cv2.waitKey(1000) + if ch == 27: + break + + +if __name__ == '__main__': + main() diff --git a/modules/imgaug/samples/train_cls_net.py b/modules/imgaug/samples/train_cls_net.py new file mode 100644 index 00000000000..d5944e18df6 --- /dev/null +++ b/modules/imgaug/samples/train_cls_net.py @@ -0,0 +1,99 @@ +import os +import pandas as pd +import argparse +import torch +import cv2 +from torchvision import transforms +from torchvision.models import resnet18 +from torch.utils import data +import numpy as np +import time +import tqdm + + +def get_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--root", type=str, default="imagenette2-320") + parser.add_argument("--lr", type=float, default=3e-4) + + return parser.parse_args() + + +class ImagenetteDataset(torch.utils.data.Dataset): + def __init__(self, root, df_data, mode='train', transform=None): + super(ImagenetteDataset, self).__init__() + assert mode in ['train', 'valid'] + + self.root = root + self.transform = transform + labels = ['n01440764', 'n02102040', 'n02979186', 'n03000684', 'n03028079', 'n03394916', 'n03417042', 'n03425413', 'n03445777', 'n03888257'] + self.label_to_num = {v: k for k, v in enumerate(labels)} + + if mode == 'train': + self.df_data = df_data[df_data['is_valid'] == False][:256] + else: + self.df_data = df_data[df_data['is_valid'] == True] + + def __len__(self): + return len(self.df_data) + + def __getitem__(self, idx): + path = self.df_data.iloc[idx]['path'] + path = os.path.join(self.root, path) + image = self.get_image(path) + label = path.split('/')[-2] + label = self.label_to_num[label] + return image, label + + def get_image(self, path): + image = cv2.imread(path) + if self.transform: + image = self.transform.call(image) + image = np.transpose(image, (2, 0, 1)) + return torch.tensor(image, dtype=torch.float) + + +def train(dataloader, model, num_epochs, criterion, optimizer): + start = time.time() + 
for epoch in range(num_epochs): + model.train() + + for inputs, targets in tqdm.tqdm(dataloader, total=len(dataloader)): + optimizer.zero_grad() + preds = model(inputs) + loss = criterion(preds, targets) + loss.backward() + optimizer.step() + + end = time.time() + print(end-start) + + +def main(): + args = get_args() + root_dir = args.root + lr = args.lr + + df_train = pd.read_csv(os.path.join(root_dir, "noisy_imagenette.csv")) + print('load %d records' % len(df_train)) + + transforms = cv2.Compose([ + cv2.RandomCrop((300, 300), (0,0,0,0)), + cv2.RandomFlip(), + cv2.Resize((500, 500)), + cv2.Normalize(mean=(0.406, 0.456, 0.485), std=(0.225, 0.224, 0.229)) + ]) + + train_set = ImagenetteDataset(root_dir, df_train, 'train', transforms) + + train_loader = data.DataLoader(train_set, num_workers=0, batch_size=16, drop_last=True, shuffle=True) + model = resnet18(pretrained=True) + model.fc = torch.nn.Linear(in_features=512, out_features=10) + optimizer = torch.optim.Adam(model.parameters(), lr=lr) + criterion = torch.nn.CrossEntropyLoss() + + train(train_loader, model, 1, criterion, optimizer) + + +if __name__ == '__main__': + main() diff --git a/modules/imgaug/samples/train_det_net.py b/modules/imgaug/samples/train_det_net.py new file mode 100644 index 00000000000..2230445e582 --- /dev/null +++ b/modules/imgaug/samples/train_det_net.py @@ -0,0 +1,151 @@ +import os +import time + +import numpy as np +import torch +import cv2 +import argparse +import torchvision +from tqdm import tqdm + + +def get_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--root", type=str, default="PennFudanPed") + parser.add_argument("--lr", type=float, default=3e-4) + + return parser.parse_args() + + +class PennFudanDataset(torch.utils.data.Dataset): + def __init__(self, root, transforms=None): + self.root = root + self.transforms = transforms + # load all image files, sorting them to + # ensure that they are aligned + self.imgs = list(sorted(os.listdir(os.path.join(root, "PNGImages")))) + self.masks = list(sorted(os.listdir(os.path.join(root, "PedMasks")))) + + def _get_boxes(self, mask): + obj_ids = np.unique(mask) + # first id is the background, so remove it + obj_ids = obj_ids[1:] + + # split the color-encoded mask into a set + # of binary masks + masks = mask == obj_ids[:, None, None] + + # get bounding box coordinates for each mask + num_objs = len(obj_ids) + for i in range(num_objs): + pos = np.where(masks[i]) + xmin = np.min(pos[1]) + xmax = np.max(pos[1]) + ymin = np.min(pos[0]) + ymax = np.max(pos[0]) + yield xmin, ymin, xmax, ymax + + def __getitem__(self, idx): + # load images and masks + img_path = os.path.join(self.root, "PNGImages", self.imgs[idx]) + mask_path = os.path.join(self.root, "PedMasks", self.masks[idx]) + img = cv2.imread(img_path) + img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) + # mask is array of size (H, W), all elements of array are integers + # background is 0, and each distinct person is represented as a distinct integer starting from 1 + # you can treat mask as grayscale image + mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) + boxes = [] + for x1, y1, x2, y2 in self._get_boxes(mask): + # NOTE: in opencv, box is represented as (x, y, width, height) + boxes.append([x1, y1, x2-x1, y2-y1]) + num_objs = len(boxes) + labels = torch.ones((num_objs,), dtype=torch.int64) + + if self.transforms is not None: + img, boxes = self.transforms.call(img, boxes) + + # 1. transpose from (h, w, c) to (c, h, w) + # 2. normalize data into range 0-1 + # 3. 
convert from np.array to torch.tensor + img = torch.tensor(np.transpose(img, (2, 0, 1)), dtype=torch.float32) + boxes = [[x1, y1, x1+width, y1+height] for x1, y1, width, height in boxes] + boxes = torch.as_tensor(boxes, dtype=torch.float32) + + return img, boxes, labels + + def __len__(self): + return len(self.imgs) + + @staticmethod + def collate_fn(batch): + images = list() + boxes = list() + labels = list() + targets = list() + + for item in batch: + images.append(item[0]) + # boxes.append(item[1]) + # labels.append(item[2]) + target = {"boxes": item[1], "labels": item[2]} + targets.append(target) + + images = torch.stack(images, dim=0) + + return images, targets + + +def get_transforms(): + + transforms = cv2.det.Compose([ + cv2.det.RandomFlip(), + cv2.det.Resize((500, 500)), + ]) + + return transforms + + +def train(num_epochs, device, model, dataloader, optimizer): + for epoch in range(num_epochs): + model.train() + for batch in tqdm(dataloader, total=len(dataloader)): + optimizer.zero_grad() + + images, targets = batch + images = images.to(device) + + outputs = model(images, targets) + losses = sum(outputs.values()) + + losses.backward() + optimizer.step() + + +def main(): + args = get_args() + + device = torch.device("cuda" if torch.cuda.is_available() else "cpu") + + transforms = get_transforms() + dataset = PennFudanDataset(args.root, transforms=transforms) + + indices = torch.randperm(len(dataset)).tolist() + train_set = torch.utils.data.Subset(dataset, indices[:-50]) + test_set = torch.utils.data.Subset(dataset, indices[-50:]) + + train_loader = torch.utils.data.DataLoader(train_set, batch_size=4, shuffle=True, num_workers=0, collate_fn=PennFudanDataset.collate_fn) + test_loader = torch.utils.data.DataLoader(test_set, batch_size=4, shuffle=False, num_workers=0, collate_fn=PennFudanDataset.collate_fn) + + model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").to(device) + + parameters = model.parameters() + optimizer = torch.optim.AdamW(parameters, lr=args.lr) + start = time.time() + train(2, device, model, train_loader, optimizer) + end = time.time() + print(end-start) + + +if __name__ == '__main__': + main() diff --git a/modules/imgaug/src/functional.cpp b/modules/imgaug/src/functional.cpp new file mode 100644 index 00000000000..73c0851b022 --- /dev/null +++ b/modules/imgaug/src/functional.cpp @@ -0,0 +1,76 @@ +#include "precomp.hpp" + +namespace cv{ + + void adjustBrightness(Mat& img, double brightness_factor){ + CV_Assert(brightness_factor >= 0); + + int channels = img.channels(); + if(channels != 1 && channels != 3){ + CV_Error(Error::BadNumChannels, "Only support images with 1 or 3 channels"); + } + img = img * brightness_factor; + } + + void adjustContrast(Mat& img, double contrast_factor){ + CV_Assert(contrast_factor >= 0); + + int num_channels = img.channels(); + if(num_channels != 1 && num_channels != 3){ + CV_Error(Error::BadNumChannels, "Only support images with 1 or 3 channels"); + } + Mat* channels = new Mat[num_channels]; + split(img, channels); + std::vector new_channels; + for(int i=0; i < num_channels; i++){ + Mat& channel = channels[i]; + Scalar avg = mean(channel); + Mat avg_mat(channel.size(), channel.type(), avg); + Mat new_channel = contrast_factor * channel + (1-contrast_factor) * avg_mat; + new_channels.push_back(new_channel); + } + merge(new_channels, img); + delete[] channels; + } + + void adjustSaturation(Mat& img, double saturation_factor){ + CV_Assert(saturation_factor >= 0); + + int num_channels = img.channels(); + 
if(num_channels != 1 && num_channels != 3){ + CV_Error(Error::BadNumChannels, "Only support images with 1 or 3 channels"); + } + if(img.channels() == 1) return; + Mat gray; + cvtColor(img, gray, COLOR_BGR2GRAY); + std::vector gray_arrays = {gray, gray, gray}; + merge(gray_arrays, gray); + img = saturation_factor * img + (1-saturation_factor) * gray; + } + + void adjustHue(Mat& img, double hue_factor) { + // FIXME: the range of hue_factor needs to be modified + CV_Assert(hue_factor >= -1 && hue_factor <= 1); + + int num_channels = img.channels(); + if (num_channels != 1 && num_channels != 3) { + CV_Error(Error::BadNumChannels, "Only support images with 1 or 3 channels"); + } + + if (num_channels == 1) return; + int hue_shift = saturate_cast (hue_factor * 180); + Mat hsv; + cvtColor(img, hsv, COLOR_BGR2HSV); + for (int j=0; j(j, i)[0]; + if(h + hue_shift > 180) + h = h + hue_shift - 180; + else + h = h + hue_shift; + hsv.at(j, i)[0] = h; + } + } + cvtColor(hsv, img, COLOR_HSV2BGR); + } +} diff --git a/modules/imgaug/src/precomp.hpp b/modules/imgaug/src/precomp.hpp new file mode 100644 index 00000000000..d8bc71cdcd9 --- /dev/null +++ b/modules/imgaug/src/precomp.hpp @@ -0,0 +1,13 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#ifndef OPENCV_AUG_PRECOMP_H +#define OPENCV_AUG_PRECOMP_H + +#include "opencv2/imgaug.hpp" +#include +#include +#include +#include + +#endif diff --git a/modules/imgaug/src/rng.cpp b/modules/imgaug/src/rng.cpp new file mode 100644 index 00000000000..262ef28275d --- /dev/null +++ b/modules/imgaug/src/rng.cpp @@ -0,0 +1,15 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#include "precomp.hpp" + +namespace cv{ + namespace imgaug{ + uint64 state = getTickCount(); + RNG rng(state); + + void setSeed(uint64 seed){ + rng.state = seed; + } + } +} \ No newline at end of file diff --git a/modules/imgaug/src/transforms.cpp b/modules/imgaug/src/transforms.cpp new file mode 100644 index 00000000000..89225cfe4e3 --- /dev/null +++ b/modules/imgaug/src/transforms.cpp @@ -0,0 +1,531 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
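+// Implementations of the single-image augmentation transforms (RandomCrop, RandomFlip, Resize,
+// ColorJitter, Normalize, RandomAffine, ...) declared in opencv2/imgaug.hpp; the bounding-box-aware
+// variants live in transforms_det.cpp.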
+#include "precomp.hpp" +#include +#include + +namespace cv{ + namespace imgaug{ + extern RNG rng; + + static void getRandomCropParams(int h, int w, int th, int tw, int* x, int* y); + static void getRandomResizedCropParams(int height, int width, const Vec2d& scale, const Vec2d& ratio, Rect& rect); + static void getRandomErasingCropParams(int height, int width, const Vec2d& scale, const Vec2d& ratio, Rect& rect); + static void getRandomAffineParams(const Size& size, const Vec2f& degrees, const Vec2f& translations, const Vec2f& scales, const Vec4f& shears, float* angle, float* translation_x, float* translation_y, float* scale, float* shear_x, float* shear_y); + static void getAffineMatrix(Mat mat, float angle, float tx, float ty, float scale, float shear_x, float shear_y, int cx, int cy); + + void randomCrop(InputArray _src, OutputArray _dst, const Size& sz, const Vec4i& padding, bool pad_if_need, int fill, int padding_mode){ + Mat src = _src.getMat(); + + if(padding != Vec4i()){ + copyMakeBorder(src, src, padding[0], padding[1], padding[2], padding[3], padding_mode, fill); + } + + // pad the height if needed + if(pad_if_need && src.rows < sz.height){ + Vec4i _padding = {sz.height - src.rows, sz.height - src.rows, 0, 0}; + copyMakeBorder(src, src, _padding[0], _padding[1], _padding[2], _padding[3], padding_mode, fill); + } + // pad the width if needed + if(pad_if_need && src.cols < sz.width){ + Vec4i _padding = {0, 0, sz.width - src.cols, sz.width - src.cols}; + copyMakeBorder(src, src, _padding[0], _padding[1], _padding[2], _padding[3], padding_mode, fill); + } + + int x, y; + getRandomCropParams(src.rows, src.cols, sz.height, sz.width, &x, &y); + + Mat RoI(src, Rect(x, y, sz.width, sz.height)); + RoI.copyTo(_dst); + + // NOTE: inplace operation not works in converting from python to numpy + // _dst.move(RoI); + } + + + static void getRandomCropParams(int h, int w, int th, int tw, int* x, int* y){ + if(h+1 < th || w+1 < tw){ + CV_Error( Error::StsBadSize, "The cropped size is larger than the image size" ); + } + if(h == th && w == tw){ + (*x) = 0; + (*y) = 0; + return; + } + + (*x) = rng.uniform(0, w-tw+1); + (*y) = rng.uniform(0, h-th+1); + + } + + RandomCrop::RandomCrop(const Size& _sz, const Vec4i& _padding, bool _pad_if_need, int _fill, int _padding_mode): + sz (_sz), + padding (_padding), + pad_if_need (_pad_if_need), + fill (_fill), + padding_mode (_padding_mode){}; + + void RandomCrop::call(InputArray src, OutputArray dst) const{ + randomCrop(src, dst, sz, padding, pad_if_need, fill, padding_mode); + } + + void randomFlip(InputArray _src, OutputArray _dst, int flipCode, double p){ + + bool flag = rng.uniform(0., 1.) 
< p; + + Mat src = _src.getMat(); + + if(!flag){ + _dst.move(src); + return; + } + flip(src, src, flipCode); + _dst.move(src); + } + + RandomFlip::RandomFlip(int _flipCode, double _p): + flipCode(_flipCode), + p(_p){}; + + void RandomFlip::call(InputArray src, OutputArray dst) const{ + randomFlip(src, dst); + } + + Compose::Compose(std::vector >& _transforms): + transforms(_transforms){}; + + void Compose::call(InputArray _src, OutputArray _dst) const{ + Mat src = _src.getMat(); + + for(auto it = transforms.begin(); it != transforms.end(); ++it){ + (*it)->call(src, src); + } + src.copyTo(_dst); + } + + Resize::Resize(const Size& _sz, int _interpolation): + sz(_sz), + interpolation(_interpolation){}; + + void Resize::call(InputArray src, OutputArray dst) const{ + resize(src, dst, sz, 0, 0, interpolation); + } + + void centerCrop(InputArray _src, OutputArray _dst, const Size& size) { + Mat src = _src.getMat(); + Mat padded(src); + // pad the input image if needed + if (size.width > src.cols || size.height > src.rows) { + int top = size.height - src.rows > 0 ? static_cast((size.height - src.rows) / 2) : 0; + int bottom = size.height - src.rows > 0 ? static_cast((size.height - src.rows) / 2) : 0; + int left = size.width - src.cols > 0 ? static_cast((size.width - src.cols) / 2) : 0; + int right = size.width - src.cols > 0 ? static_cast((size.width - src.cols) / 2) : 0; + + // fill with value 0 + copyMakeBorder(src, padded, top, bottom, left, right, BORDER_CONSTANT, 0); + } + + int x = static_cast((padded.cols - size.width) / 2); + int y = static_cast((padded.rows - size.height) / 2); + + Mat cropped(padded, Rect(x, y, size.width, size.height)); + _dst.move(cropped); + } + + CenterCrop::CenterCrop(const Size& _size) : + size(_size) {}; + + void CenterCrop::call(InputArray src, OutputArray dst) const { + centerCrop(src, dst, size); + } + + Pad::Pad(const Vec4i& _padding, const Scalar& _fill, int _padding_mode) : + padding(_padding), + fill(_fill), + padding_mode(_padding_mode) {}; + + void Pad::call(InputArray src, OutputArray dst) const { + copyMakeBorder(src, dst, padding[0], padding[1], padding[2], padding[3], padding_mode, fill); + } + + void randomResizedCrop(InputArray _src, OutputArray _dst, const Size& size, const Vec2d& scale, const Vec2d& ratio, int interpolation) { + // Ensure scale range and ratio range are valid + CV_Assert(scale[0] <= scale[1] && ratio[0] <= ratio[1]); + + Mat src = _src.getMat(); + + Rect crop_rect; + getRandomResizedCropParams(src.rows, src.cols, scale, ratio, crop_rect); + Mat cropped(src, Rect(crop_rect)); + resize(cropped, _dst, size, 0.0, 0.0, interpolation); + } + + static void getRandomResizedCropParams(int height, int width, const Vec2d& scale, const Vec2d& ratio, Rect& rect) { + // This implementation is inspired from the implementation in torchvision + // https://github.com/pytorch/vision/blob/main/torchvision/transforms/transforms.py + + int area = height * width; + + for (int i = 0; i < 10; i++) { + double target_area = rng.uniform(scale[0], scale[1]) * area; + double aspect_ratio = rng.uniform(ratio[0], ratio[1]); + + int w = static_cast(round(sqrt(target_area * aspect_ratio))); + int h = static_cast(round(sqrt(target_area / aspect_ratio))); + + if (w > 0 && w <= width && h > 0 && h <= height) { + rect.x = rng.uniform(0, width - w + 1); + rect.y = rng.uniform(0, height - h + 1); + rect.width = w; + rect.height = h; + return; + } + } + + // Center Crop + double in_ratio = static_cast(width) / height; + if (in_ratio < ratio[0]) { + rect.width = width; + 
rect.height = static_cast (round(width / ratio[0])); + } + else if (in_ratio > ratio[1]) { + rect.height = height; + rect.width = static_cast (round(height * ratio[1])); + } + else { + rect.width = width; + rect.height = height; + } + rect.x = (width - rect.width) / 2; + rect.y = (height - rect.height) / 2; + + } + + RandomResizedCrop::RandomResizedCrop(const Size& _size, const Vec2d& _scale, const Vec2d& _ratio, int _interpolation) : + size(_size), + scale(_scale), + ratio(_ratio), + interpolation(_interpolation) {}; + + void RandomResizedCrop::call(InputArray src, OutputArray dst) const{ + randomResizedCrop(src, dst, size, scale, ratio, interpolation); + } + + void colorJitter(InputArray _src, OutputArray _dst, const Vec2d& brightness, const Vec2d& contrast, const Vec2d& saturation, const Vec2d& hue){ + // TODO: check input values + Mat src = _src.getMat(); + + double brightness_factor = 1, contrast_factor = 1, saturation_factor = 1, hue_factor = 0; + + if(brightness != Vec2d()) + brightness_factor = rng.uniform(brightness[0], brightness[1]); + if(contrast != Vec2d()) + contrast_factor = rng.uniform(contrast[0], contrast[1]); + if(saturation != Vec2d()) + saturation_factor = rng.uniform(saturation[0], saturation[1]); + if(hue != Vec2d()) + hue_factor = rng.uniform(hue[0], hue[1]); + + int order[4] = {1,2,3,4}; + std::random_shuffle(order, order+4); + + for(int i : order){ + if(i == 1 && brightness_factor != 1) + cv::adjustBrightness(src, brightness_factor); + if(i == 2 && contrast_factor != 1) + cv::adjustContrast(src, contrast_factor); + if(i == 3 && saturation_factor != 1) + cv::adjustSaturation(src, saturation_factor); + if(i == 4 && hue_factor != 0) + cv::adjustHue(src, hue_factor); + } + + _dst.move(src); + } + + ColorJitter::ColorJitter(const Vec2d& _brightness, const Vec2d& _contrast, const Vec2d& _saturation, + const Vec2d& _hue): + brightness(_brightness), + contrast(_contrast), + saturation(_saturation), + hue(_hue){}; + + void ColorJitter::call(InputArray src, OutputArray dst) const{ + colorJitter(src, dst, brightness, contrast, saturation, hue); + } + + void randomRotation(InputArray _src, OutputArray _dst, const Vec2d& degrees, int interpolation, const Point2f& center, const Scalar& fill){ + Mat src = _src.getMat(); + // TODO: check the validation of degrees + double angle = rng.uniform(degrees[0], degrees[1]); + + Point2f pt(src.cols/2., src.rows/2.); + if(center != Point2f()) pt = center; + + Mat r = getRotationMatrix2D(pt, angle, 1.0); + + // TODO: auto expand dst size to fit the rotated image + warpAffine(src, _dst, r, src.size(), interpolation, BORDER_CONSTANT, fill); + } + + RandomRotation::RandomRotation(const Vec2d& _degrees, int _interpolation, const Point2f& _center, const Scalar& _fill): + degrees(_degrees), + interpolation(_interpolation), + center(_center), + fill(_fill){}; + + void RandomRotation::call(InputArray src, OutputArray dst) const{ + randomRotation(src, dst, degrees, interpolation, center, fill); + } + + void grayScale(InputArray _src, OutputArray _dst, int num_channels){ + Mat src = _src.getMat(); + cvtColor(src, src, COLOR_BGR2GRAY); + + if(num_channels == 1){ + _dst.move(src); + return; + } + Mat channels[3] = {src, src, src}; + merge(channels, 3, _dst); + } + + GrayScale::GrayScale(int _num_channels): + num_channels(_num_channels){}; + + void GrayScale::call(InputArray _src, OutputArray _dst) const{ + grayScale(_src, _dst, num_channels); + } + + void randomGrayScale(InputArray _src, OutputArray _dst, double p){ + if(rng.uniform(0.0, 1.0) < p){ + 
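+            // with probability p, convert the image to grayscale while keeping its original number of channels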
grayScale(_src, _dst, _src.channels()); + return; + } + Mat src = _src.getMat(); + _dst.move(src); + } + + RandomGrayScale::RandomGrayScale(double _p): + p(_p){}; + + void RandomGrayScale::call(InputArray src, OutputArray dst) const{ + randomGrayScale(src, dst); + } + + void randomErasing(InputArray _src, OutputArray _dst, double p, const Vec2d& scale, const Vec2d& ratio, const Scalar& value, bool inplace){ + // TODO: check the range of input values + Mat src = _src.getMat(); + if(rng.uniform(0., 1.) >= p){ + _dst.move(src); + return; + } + + Rect roi; + getRandomErasingCropParams(src.rows, src.cols, scale, ratio, roi); + + Mat erased(src, roi); + + int rows = erased.rows; + int cols = erased.cols; + int cn = erased.channels(); + for(int j=0; j(j); + for(int i=0; i(round(sqrt(target_area * aspect_ratio))); + int h = static_cast(round(sqrt(target_area / aspect_ratio))); + + if (w > 0 && w <= width && h > 0 && h <= height) { + rect.x = rng.uniform(0, width - w + 1); + rect.y = rng.uniform(0, height - h + 1); + rect.width = w; + rect.height = h; + return; + } + } + + // Center Crop + double in_ratio = static_cast(width) / height; + if (in_ratio < ratio[0]) { + rect.width = width; + rect.height = static_cast (round(width / ratio[0])); + } + else if (in_ratio > ratio[1]) { + rect.height = height; + rect.width = static_cast (round(height * ratio[1])); + } + else { + rect.width = width; + rect.height = height; + } + rect.x = (width - rect.width) / 2; + rect.y = (height - rect.height) / 2; + } + + RandomErasing::RandomErasing(double _p, const Vec2d& _scale, const Vec2d& _ratio, const Scalar& _value, bool _inplace): + p(_p), + scale(_scale), + ratio(_ratio), + value(_value), + inplace(_inplace){}; + + void RandomErasing::call(InputArray src, OutputArray dst) const{ + randomErasing(src, dst, p, scale, ratio, value, inplace); + } + + // NOTE: because Scalar contains 4 elements at most, normalize can only apply to image with channels no more than 4. 
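+        // Normalize rescales each channel to the [0, 1] range first and then standardizes it
+        // with the given per-channel mean and std (see call() below)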
+ Normalize::Normalize(const Scalar& _mean, const Scalar& _std): + mean(_mean), + std(_std){}; + + void Normalize::call(InputArray _src, OutputArray _dst) const{ + Mat src = _src.getMat(); + + _dst.create(src.size(), CV_32FC3); + Mat dst = _dst.getMat(); + + int cn = src.channels(); + std::vector channels; + split(src, channels); + + // normalize each channel to 0-1 first + for(int i=0; i(src.cols / 2); + center.y = static_cast(src.rows / 2); + }else{ + center = _center; + } + + float angle, translation_x, translation_y, scale, shear_x, shear_y; + getRandomAffineParams(src.size(), degrees, translations, scales, shears, &angle, &translation_x, &translation_y, &scale, &shear_x, &shear_y); + + Mat affine_matrix = Mat::eye(2, 3, CV_32F); + + // TODO: check whether equations are right + getAffineMatrix(affine_matrix, angle, translation_x, translation_y, scale, shear_x, shear_y, center.x, center.y); + warpAffine(src, src, affine_matrix, src.size(), interpolation, BORDER_CONSTANT, fill); + _dst.move(src); + } + + static void getAffineMatrix(Mat mat, float angle, float tx, float ty, float scale, float shear_x, float shear_y, int cx, int cy){ + float* data = mat.ptr(0); + + // convert from degrees to radians + angle = (float)(CV_PI * angle) / 180; + shear_x = (float)(CV_PI * shear_x) / 180; + shear_y = (float)(CV_PI * shear_y) / 180; + + data[0] = scale * cos(angle - shear_y) / cos(shear_y); + data[1] = scale * (-cos(angle - shear_y) * tan(shear_x) / cos(shear_y) - sin(angle)); + data[3] = scale * sin(angle - shear_y) / cos(shear_y); + data[4] = scale * (-sin(angle - shear_y) * tan(shear_x) / cos(shear_y) + cos(angle)); + data[2] = cx * (1-data[0]) + data[1] * (-cy) + tx; + data[5] = cy * (1-data[4]) + data[3] * (-cx) + ty; + } + + static void getRandomAffineParams(const Size& size, const Vec2f& degrees, const Vec2f& translations, const Vec2f& scales, const Vec4f& shears, float* angle, float* translation_x, float* translation_y, float* scale, float* shear_x, float* shear_y){ + + if(degrees == Vec2f(0, 0)) { + *angle = 0; + } + else{ + *angle = rng.uniform(degrees[0], degrees[1]); + } + + if(translations == Vec2f(0, 0)) { + *translation_x = 0; + *translation_y = 0; + } + else{ + *translation_x = rng.uniform(-translations[0], translations[0]) * size.width; + *translation_y = rng.uniform(-translations[1], translations[1]) * size.height; + } + + if(scales == Vec2f(1, 1)) { + *scale = 1; + } + else{ + *scale = rng.uniform(scales[0], scales[1]); + } + + if(shears == Vec4f(0, 0, 0, 0)) { + *shear_x = 0; + *shear_y = 0; + } + else{ + *shear_x = rng.uniform(shears[0], shears[1]); + *shear_y = rng.uniform(shears[2], shears[3]); + } + + } + + RandomAffine::RandomAffine(const Vec2f& _degrees, const Vec2f& _translations, const Vec2f& _scales, const Vec4f& _shears, int _interpolation, const Scalar& _fill, const Point2i& _center): + degrees(_degrees), + translations(_translations), + scales(_scales), + shears(_shears), + interpolation(_interpolation), + fill(_fill), + center(_center){}; + + void RandomAffine::call(InputArray src, OutputArray dst) const{ + randomAffine(src, dst, degrees, translations, scales, shears, interpolation, fill, center); + } + } +} diff --git a/modules/imgaug/src/transforms_det.cpp b/modules/imgaug/src/transforms_det.cpp new file mode 100644 index 00000000000..b49e721d7be --- /dev/null +++ b/modules/imgaug/src/transforms_det.cpp @@ -0,0 +1,220 @@ +// This file is part of OpenCV project. 
+// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#include "precomp.hpp" +#include +#include + +namespace cv{ + namespace imgaug{ + extern RNG rng; + + namespace det{ + int clamp(int v, int lo, int hi); + void rotate(int* x, int* y, int cx, int cy, double angle); + + Compose::Compose(std::vector >& _transforms): + transforms(_transforms){}; + + void Compose::call(InputArray _src, OutputArray _dst, std::vector& target, std::vector& labels) const{ + Mat src = _src.getMat(); + for(cv::imgaug::det::Transform* transform:transforms){ + transform->call(src, src, target, labels); + } + src.copyTo(_dst); + } + + RandomFlip::RandomFlip(int _flipCode, float _p): + flipCode(_flipCode), p(_p) + { + if(p < 0 || p > 1){ + CV_Error(Error::Code::StsBadArg, "probability p must be between range 0 and 1"); + } + }; + + void RandomFlip::call(InputArray _src, OutputArray _dst, std::vector& target, std::vector& labels) const{ + CV_Assert(target.size() == labels.size()); + bool flag = rng.uniform(0., 1.) < p; + + Mat src = _src.getMat(); + if(!flag){ + _dst.move(src); + return; + } + + flipBoundingBox(target, src.size()); + flip(src, src, flipCode); + _dst.move(src); + } + + void RandomFlip::flipBoundingBox(std::vector& target, const Size& size) const{ + /* + * flipCode = 0 (flip vertically): (x', y') = (x, img.height - y - bbox.height) + * flipCode > 0 (flip horizontally): (x', y') = (img.width - x - bbox.width, y) + * flipCode < 0 (flip diagonally): (x', y') = (img.width - x - bbox.width, img.height - y - bbox.height) + */ + for(unsigned i = 0; i < target.size(); i++){ + if(flipCode == 0){ + target[i].y = size.height - target[i].y - target[i].height; + }else if(flipCode > 0){ + target[i].x = size.width - target[i].x - target[i].width; + }else{ + target[i].x = size.width - target[i].x - target[i].width; + target[i].y = size.height - target[i].y - target[i].height; + } + } + } + + Resize::Resize(const Size& _size, int _interpolation): + size(_size), interpolation(_interpolation){}; + + void Resize::call(InputArray _src, OutputArray dst, std::vector& target, std::vector& labels) const{ + CV_Assert(target.size() == labels.size()); + Mat src = _src.getMat(); + resize(src, dst, size, 0, 0, interpolation); + resizeBoundingBox(target, src.size()); + } + + void Resize::resizeBoundingBox(std::vector& target, const Size& imgSize) const{ + for(unsigned i=0; i(size.width) / imgSize.width * target[i].x; + target[i].y = static_cast(size.height) / imgSize.height * target[i].y; + target[i].width = static_cast(size.width) / imgSize.width * target[i].width; + target[i].height = static_cast(size.height) / imgSize.height * target[i].height; + } + } + + Convert::Convert(int _code): + code(_code){}; + + void Convert::call(InputArray src, OutputArray dst, std::vector& target, std::vector& labels) const{ + CV_Assert(target.size() == labels.size()); + cvtColor(src, dst, code); + } + + RandomTranslation::RandomTranslation(const cv::Vec2i& _translations, float _threshold): + translations(_translations), + threshold(_threshold){}; + + + void RandomTranslation::call(cv::InputArray _src, cv::OutputArray _dst, std::vector &bboxes, std::vector& labels) const { + CV_Assert(bboxes.size() == labels.size()); + int tx = rng.uniform(-translations[0], translations[0]); + int ty = rng.uniform(-translations[1], translations[1]); + + Mat translation_matrix = Mat::eye(2, 3, CV_32F); + float* data = translation_matrix.ptr(); + data[0] = 1; + data[1] 
= 0; + data[2] = tx; + data[3] = 0; + data[4] = 1; + data[5] = ty; + + cv::warpAffine(_src, _dst, translation_matrix, _src.size()); + translateBoundingBox(bboxes, labels, _src.size(), tx, ty); + } + + + void RandomTranslation::translateBoundingBox(std::vector &bboxes, std::vector &labels, const cv::Size &imgSize, int tx, int ty) const { + for(unsigned i=0; i < bboxes.size(); i++){ + int x1 = clamp(bboxes[i].x + tx, 0, imgSize.width); + int y1 = clamp(bboxes[i].y + ty, 0, imgSize.height); + int x2 = clamp(bboxes[i].x + bboxes[i].width + tx, 0, imgSize.width); + int y2 = clamp(bboxes[i].y + bboxes[i].height + ty, 0, imgSize.height); + int w = x2 - x1; + int h = y2 - y1; + if((float)(w * h) / (bboxes[i].width * bboxes[i].height) < threshold){ + bboxes.erase(bboxes.begin() + i); + labels.erase(labels.begin() + i); + }else{ + bboxes[i].x = x1; + bboxes[i].y = y1; + bboxes[i].width = x2 - x1; + bboxes[i].height = y2 - y1; + } + } + } + + RandomRotation::RandomRotation(const cv::Vec2d &_angles, double _threshold): + angles(_angles), + threshold(_threshold){}; + + void RandomRotation::call(cv::InputArray _src, cv::OutputArray _dst, std::vector &bboxes, + std::vector &labels) const { + CV_Assert(bboxes.size() == labels.size()); + Mat src = _src.getMat(); + double angle = rng.uniform(angles[0], angles[1]); + Mat rotation_matrix = getRotationMatrix2D(cv::Point2f(src.cols/2., src.rows/2.), angle, 1); + warpAffine(src, _dst, rotation_matrix, src.size()); + + Mat dst = _dst.getMat(); + rotateBoundingBoxes(bboxes, labels, angle, src.cols / 2, src.rows / 2, dst.size()); + } + + void RandomRotation::rotateBoundingBoxes(std::vector &bboxes, std::vector &labels, + double angle, int cx, int cy, const Size& imgSize) const { + angle = -angle * CV_PI / 180; + + for(unsigned i=0; i < bboxes.size(); i++){ + int x1 = bboxes[i].x; + int y1 = bboxes[i].y; + int x2 = bboxes[i].x + bboxes[i].width; + int y2 = bboxes[i].y; + int x3 = bboxes[i].x; + int y3 = bboxes[i].y + bboxes[i].height; + int x4 = bboxes[i].x + bboxes[i].width; + int y4 = bboxes[i].y + bboxes[i].height; + + // convert unit from degree to radius + // rotate the corners + rotate(&x1, &y1, cx, cy, angle); + rotate(&x2, &y2, cx, cy, angle); + rotate(&x3, &y3, cx, cy, angle); + rotate(&x4, &y4, cx, cy, angle); + + // shrink the rotated corners to get an enclosing box + int x_min = min({x1, x2, x3, x4}); + int y_min = min({y1, y2, y3, y4}); + int x_max = max({x1, x2, x3, x4}); + int y_max = max({y1, y2, y3, y4}); + + x_min = clamp(x_min, 0, imgSize.width); + y_min = clamp(y_min, 0, imgSize.height); + x_max = clamp(x_max, 0, imgSize.width); + y_max = clamp(y_max, 0, imgSize.height); + + int w = x_max - x_min; + int h = y_max - y_min; + + if((float)(w * h) / (bboxes[i].width * bboxes[i].height) < threshold){ + bboxes.erase(bboxes.begin() + i); + labels.erase(labels.begin() + i); + }else{ + bboxes[i].x = x_min; + bboxes[i].y = y_min; + bboxes[i].width = w; + bboxes[i].height = h; + } + + } + } + + inline int clamp(int v, int lo, int hi){ + if(v < lo){ + return lo; + } + if(v > hi){ + return hi; + } + return v; + } + + inline void rotate(int* x, int* y, int cx, int cy, double angle){ + // NOTE: when the unit of angle is degree instead of radius, the result may be incorrect. 
+ (*x) = (int)round(((*x) - cx) * cos(angle) - ((*y) - cy) * sin(angle) + cx); + (*y) = (int)round(((*x) - cx) * sin(angle) + ((*y) - cy) * cos(angle) + cy); + } + } + } +} \ No newline at end of file diff --git a/modules/imgaug/test/test_imgaug.cpp b/modules/imgaug/test/test_imgaug.cpp new file mode 100644 index 00000000000..9f41fa83b8b --- /dev/null +++ b/modules/imgaug/test/test_imgaug.cpp @@ -0,0 +1,331 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#include "test_precomp.hpp" + +namespace opencv_test{ namespace{ + + +TEST(Aug_RandomCrop, no_padding){ + cout << "run test: no_padding" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + + int th = 200; + int tw = 200; + + string ref_path = findDataFile("imgaug/random_crop_test_0.jpg"); + Mat ref = imread(ref_path); + + int seed = 0; + + cv::imgaug::setSeed(seed); + cv::imgaug::RandomCrop aug(Size(tw, th)); + Mat out; + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_RandomCrop, padding){ + cout << "run test: padding" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + + int seed = 0; + + int th = 200; + int tw = 200; + Vec4d padding {10, 20, 30, 40}; + + string ref_path = findDataFile("imgaug/random_crop_test_1.jpg"); + Mat ref = imread(ref_path); + + imgaug::setSeed(seed); + cv::imgaug::RandomCrop aug(Size(tw, th), padding); + Mat out; + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_RandomFlip, diagonal){ + cout << "run test: random flip (diagonal)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/random_flip_test_2.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::RandomFlip aug(0, 1); + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. 
+ double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_Resize, basic){ + cout << "run test: resize (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/resize_test_3.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::Resize aug(cv::Size(256, 128)); + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_CenterCrop, basic){ + cout << "run test: center crop (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/center_crop_test_4.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::CenterCrop aug(cv::Size(400, 300)); + aug.call(input, out); + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_Pad, basic){ + cout << "run test: pad (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/pad_test_5.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::Pad aug(Vec4i(10, 20, 30, 40), Scalar(0)); + aug.call(input, out); + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_RandomResizedCrop, basic){ + cout << "run test: random resized crop (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + cv::Size size(1024, 512); + uint64 seed = 10; + cv::imgaug::setSeed(seed); + + string ref_path = findDataFile("imgaug/random_resized_crop_test_6.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::RandomResizedCrop aug(size); + + aug.call(input, out); + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. 
+ double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_RandomRotation, not_expand){ + cout << "run test: random rotation (not_expand)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + cv::Vec2d degrees(-10, 10); + uint64 seed = 5; + cv::imgaug::setSeed(seed); + + string ref_path = findDataFile("imgaug/random_rotation_test_7.jpg"); + Mat ref = imread(ref_path); + + cv::imgaug::RandomRotation aug(degrees); + + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_GrayScale, basic){ + cout << "run test: gray scale (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/gray_scale_test_8.jpg"); + Mat ref = imread(ref_path, IMREAD_GRAYSCALE); + + cv::imgaug::GrayScale aug; + + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_GaussianBlur, basic){ + cout << "run test: gaussian blur (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/gaussian_blur_test_9.jpg"); + Mat ref = imread(ref_path); + cv::imgaug::setSeed(15); + cv::imgaug::GaussianBlur aug(Size(5, 5)); + + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_Normalize, basic){ + cout << "run test: gaussian blur (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/normalize_test_10.jpg"); + Mat ref = imread(ref_path); + cv::imgaug::setSeed(15); + // Mean and std for ImageNet is [0.485, 0.456, 0.406], [0.229, 0.224, 0.225] in order of RGB. 
+ // For order of BGR, they should be (0.406, 0.456, 0.485), (0.225, 0.224, 0.229) + cv::imgaug::Normalize aug(Scalar(0.406, 0.456, 0.485), Scalar(0.225, 0.224, 0.229)); + aug.call(input, out); + out.convertTo(out, CV_8UC3, 255); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_ColorJitter, basic){ + cout << "run test: color jitter (basic)" << endl; + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat input = imread(img_path); + Mat out; + + string ref_path = findDataFile("imgaug/color_jitter_test_11.jpg"); + Mat ref = imread(ref_path); + cv::imgaug::setSeed(15); + // Mean and std for ImageNet is [0.485, 0.456, 0.406], [0.229, 0.224, 0.225] in order of RGB. + // For order of BGR, they should be (0.406, 0.456, 0.485), (0.225, 0.224, 0.229) + cv::imgaug::ColorJitter aug(cv::Vec2d(0, 2), cv::Vec2d(0, 2), cv::Vec2d(0, 2), cv::Vec2d(-0.5, 0.5)); + aug.call(input, out); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols ) { + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +}} diff --git a/modules/imgaug/test/test_imgaug_det.cpp b/modules/imgaug/test/test_imgaug_det.cpp new file mode 100644 index 00000000000..4f4edaa8179 --- /dev/null +++ b/modules/imgaug/test/test_imgaug_det.cpp @@ -0,0 +1,254 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+#include "test_precomp.hpp" + +namespace opencv_test{ namespace{ + +void read_annotation(const String& path, std::vector& bboxes, std::vector& labels){ + FILE* fp; + fp = fopen(path.c_str(), "rt"); + + int n; + int sig; + sig = fscanf(fp, "%d", &n); + CV_Assert(sig != EOF); + + for(int i=0; i < n; i++){ + int x, y, w, h, l; + sig = fscanf(fp, "%d %d %d %d %d\n", &x, &y, &w, &h, &l); + CV_Assert(sig != EOF); + bboxes.push_back(Rect(x, y, w, h)); + labels.push_back(l); + } + + fclose(fp); +} + + +TEST(Aug_Det_RandomFlip, vertical){ + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat src = imread(img_path); + Mat out; + + int seed = 0; + cv::imgaug::setSeed(seed); + + + string ref_path = findDataFile("imgaug/det_random_flip_test_0.jpg"); + Mat ref = imread(ref_path); + + std::vector ref_bboxes; + std::vector ref_labels; + + String ref_data = findDataFile("imgaug/det_random_flip_test_0.dat"); + read_annotation(ref_data, ref_bboxes, ref_labels); + + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels{1, 2}; + + int flipCode = 0; + cv::imgaug::det::RandomFlip aug(flipCode); + aug.call(src, out, bboxes, labels); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols && ref_bboxes.size() == bboxes.size() && ref_labels.size() == labels.size()) { + EXPECT_EQ(bboxes, ref_bboxes); + EXPECT_EQ(labels, ref_labels); + + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + +TEST(Aug_Det_Resize, small){ + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat src = imread(img_path); + Mat out; + + int seed = 0; + cv::imgaug::setSeed(seed); + + + string ref_path = findDataFile("imgaug/det_resize_test_0.jpg"); + Mat ref = imread(ref_path); + + std::vector ref_bboxes; + std::vector ref_labels; + + String ref_data = findDataFile("imgaug/det_resize_test_0.dat"); + read_annotation(ref_data, ref_bboxes, ref_labels); + + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels{1, 2}; + + Size size(224, 224); + cv::imgaug::det::Resize aug(size); + aug.call(src, out, bboxes, labels); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols && ref_bboxes.size() == bboxes.size() && ref_labels.size() == labels.size()) { + EXPECT_EQ(bboxes, ref_bboxes); + EXPECT_EQ(labels, ref_labels); + + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. 
+ double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_Det_Convert, BGR2GRAY){ + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat src = imread(img_path); + Mat out; + + int seed = 0; + cv::imgaug::setSeed(seed); + + + string ref_path = findDataFile("imgaug/det_convert_test_0.jpg"); + Mat ref = imread(ref_path, IMREAD_GRAYSCALE); + + std::vector ref_bboxes; + std::vector ref_labels; + + String ref_data = findDataFile("imgaug/det_convert_test_0.dat"); + read_annotation(ref_data, ref_bboxes, ref_labels); + + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels{1, 2}; + + int code = COLOR_BGR2GRAY; + cv::imgaug::det::Convert aug(code); + aug.call(src, out, bboxes, labels); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols && ref_bboxes.size() == bboxes.size() && ref_labels.size() == labels.size()) { + EXPECT_EQ(bboxes, ref_bboxes); + EXPECT_EQ(labels, ref_labels); + + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_Det_RandomTranslation, no_drop){ + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat src = imread(img_path); + Mat out; + + int seed = 0; + cv::imgaug::setSeed(seed); + + string ref_path = findDataFile("imgaug/det_random_translation_test_0.jpg"); + Mat ref = imread(ref_path, IMREAD_COLOR); + + std::vector ref_bboxes; + std::vector ref_labels; + + String ref_data = findDataFile("imgaug/det_random_translation_test_0.dat"); + read_annotation(ref_data, ref_bboxes, ref_labels); + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels{1, 2}; + + Vec2d trans(20, 20); + cv::imgaug::det::RandomTranslation aug(trans); + aug.call(src, out, bboxes, labels); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols && ref_bboxes.size() == bboxes.size() && ref_labels.size() == labels.size()) { + EXPECT_EQ(bboxes, ref_bboxes); + EXPECT_EQ(labels, ref_labels); + + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. 
+ double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +TEST(Aug_Det_RandomRotation, no_drop){ + cvtest::TS* ts = cvtest::TS::ptr(); + string img_path = findDataFile("imgaug/lena.jpg"); + Mat src = imread(img_path); + Mat out; + + int seed = 0; + cv::imgaug::setSeed(seed); + + string ref_path = findDataFile("imgaug/det_random_rotation_test_0.jpg"); + Mat ref = imread(ref_path, IMREAD_COLOR); + + std::vector ref_bboxes; + std::vector ref_labels; + + String ref_data = findDataFile("imgaug/det_random_rotation_test_0.dat"); + read_annotation(ref_data, ref_bboxes, ref_labels); + + std::vector bboxes{ + Rect{112, 40, 249, 343}, + Rect{61, 273, 113, 228} + }; + + std::vector labels{1, 2}; + + Vec2d degrees(-30, 30); + cv::imgaug::det::RandomRotation aug(degrees); + aug.call(src, out, bboxes, labels); + + if ( out.rows > 0 && out.rows == ref.rows && out.cols > 0 && out.cols == ref.cols && ref_bboxes.size() == bboxes.size() && ref_labels.size() == labels.size()) { + EXPECT_EQ(bboxes, ref_bboxes); + EXPECT_EQ(labels, ref_labels); + + // Calculate the L2 relative error between images. + double errorL2 = cv::norm( out, ref, NORM_L2 ); + // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. + double error = errorL2 / (double)( out.rows * out.cols ); + EXPECT_LE(error, 0.1); + }else{ + ts->set_failed_test_info(TS::FAIL_MISMATCH); + } +} + + +}} diff --git a/modules/imgaug/test/test_main.cpp b/modules/imgaug/test/test_main.cpp new file mode 100644 index 00000000000..0e51ddfd050 --- /dev/null +++ b/modules/imgaug/test/test_main.cpp @@ -0,0 +1,6 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +#include "test_precomp.hpp" + +CV_TEST_MAIN("cv") diff --git a/modules/imgaug/test/test_precomp.hpp b/modules/imgaug/test/test_precomp.hpp new file mode 100644 index 00000000000..d7ffec30338 --- /dev/null +++ b/modules/imgaug/test/test_precomp.hpp @@ -0,0 +1,13 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+#ifndef __OPENCV_TEST_PRECOMP_HPP__ +#define __OPENCV_TEST_PRECOMP_HPP__ + +#include "opencv2/ts.hpp" +#include "opencv2/imgaug.hpp" + +static uint64 seed=0; +static cv::RNG rng(seed); + +#endif \ No newline at end of file diff --git a/modules/imgaug/tutorials/imgaug_basic_usage/images/compose_out.jpg b/modules/imgaug/tutorials/imgaug_basic_usage/images/compose_out.jpg new file mode 100644 index 00000000000..cc590a08748 Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_basic_usage/images/compose_out.jpg differ diff --git a/modules/imgaug/tutorials/imgaug_basic_usage/images/lena.jpg b/modules/imgaug/tutorials/imgaug_basic_usage/images/lena.jpg new file mode 100644 index 00000000000..add8374dfff Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_basic_usage/images/lena.jpg differ diff --git a/modules/imgaug/tutorials/imgaug_basic_usage/images/random_crop_out.jpg b/modules/imgaug/tutorials/imgaug_basic_usage/images/random_crop_out.jpg new file mode 100644 index 00000000000..d657372b05c Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_basic_usage/images/random_crop_out.jpg differ diff --git a/modules/imgaug/tutorials/imgaug_basic_usage/imgaug_basic_usage.markdown b/modules/imgaug/tutorials/imgaug_basic_usage/imgaug_basic_usage.markdown new file mode 100644 index 00000000000..a835678c38c --- /dev/null +++ b/modules/imgaug/tutorials/imgaug_basic_usage/imgaug_basic_usage.markdown @@ -0,0 +1,236 @@ +Data augmentation with imgaug {#tutorial_imgaug_basic_usage} +============================================================ + +@tableofcontents + +@next_tutorial{tutorial_imgaug_object_detection} + +| | | +| -: | :- | +| Author | Chuyang Zhao | +| Compatibility | OpenCV >= 4.0 | + + +Introduction +------ +From [Wikipedia](https://en.wikipedia.org/wiki/Data_augmentation), **data augmentation** are techniques used to increase the amount of data +by adding slightly modified copies of already existing data or newly created synthetic data from existing data. +It acts as a regularizer and helps reduce overfitting when training a machine learning model. + +In a narrow sense, data augmentation is to perform some sort of transforms on given images and generate the modified +images as additional training data, but broadly speaking, data augmentation can perform not only on images. +For computer vision tasks like object detection and semantic segmentation, the inputs contain not only images +but also annotation on the source images. So in these tasks, data augmentation should be able to perform transforms on +all these data. + +The imgaug module implemented in OpenCV takes both these requirements into account. You can use the imgaug module +for a wide range of computer vision tasks. +The imgaug module in OpenCV is implemented in pure C++ and is backend with OpenCV efficient image processing operations, +so it runs much faster and more efficiently than other existing Python-based implementation such as torchvision. Powered with OpenCV, the imgaug module +is cross-platform and can convert to other languages easily. This is especially useful when we want to +deploy our model along with its data preprocessing pipeline to the production environment for better inference speed. +With this feature, we can also use imgaug on other devices such as embedded systems and mobile phones easily. 
+ +Goal +---- +In this tutorial, you will learn: +- How to use **imgaug** to perform data augmentation for images +- How to compose multiple methods into one data augmentation method +- How to change the seed of the random number generator used in **imgaug** + + +Usage +----- +In this section, I will use some methods in imgaug to demonstrate how to use imgaug to perform data augmentation on images. +For the details of all the methods in imgaug, please refer to the documentation @ref cv::imgaug . + +### Apply single data augmentation method +@add_toggle_cpp +In C++ environment, to use imgaug module you should include the header file: + +@code{.cpp} +#include +@endcode + +We call the constructor of the data augmentation class to get its initialized instance. +Here we get the instance of cv::imgaug::RandomCrop to perform random crop on the given images. cv::imgaug::RandomCrop requires parameter `sz` +which is the size of the cropped area on the given image, here we pass cv::Size(300, 300) for this parameter. + +@code{.cpp} +imgaug::RandomCrop randomCrop(cv::Size(300, 300)); +@endcode + +Then we read the source image in format cv::Mat and performs the data augmentation operation on it by calling cv::imgaug::RandomCrop::call function. + +@code{.cpp} +Mat src = imread(samples::findFile("lena.jpg"), IMREAD_COLOR); +Mat dst; +randomCrop.call(src, dst); +@endcode + +The original image is as follows: + +![](images/lena.jpg) + +You can display the augmented image after applying random crop by: + +@code{.cpp} +imshow("result", dst); +waitKey(0); +@endcode + +![](images/random_crop_out.jpg) + +@end_toggle + +@add_toggle_python +In Python, to use imgaug module you should import the following package: + +@code{.py} +from cv2 import imgaug +@endcode + +We call the constructor of the data augmentation class to get its initialized instance. +Here we get an instance of **cv::imgaug::RandomCrop** to perform random crop on the given images. **cv::imgaug::RandomCrop** requires a parameter `sz` +which is the size of the cropped area on the given image, here we pass a two-elements tuple `(300, 300)` for this parameter. + +@code{.py} +randomCrop = imgaug.RandomCrop(sz=(300, 300)) +@endcode + +Then we read the source image with **cv::imread** and performs the data augmentation operation on it by calling **cv::imgaug::RandomCrop::call** function. + +@code{.py} +src = cv2.imread("lena.jpg", cv2.IMREAD_COLOR) +dst = randomCrop.call(src) +@endcode + +The original image is as follows: + +![](images/lena.jpg) + +You can display the augmented image after applying random crop by: + +@code{.py} +cv2.imshow("result", dst) +cv2.waitKey(0) +@endcode + +![](images/random_crop_out.jpg) + +@end_toggle + +### Compose multiple data augmentation methods +@add_toggle_cpp +To compose multiple data augmentation methods into one, firstly you need to +initialize the data augmentation classes you want to use later: + +@code{.cpp} +imgaug::RandomCrop randomCrop(cv::Size(300, 300)); +imgaug::RandomFlip randomFlip(1); +imgaug::Resize resize(cv::Size(224, 224)); +@endcode + +Because in **cv::imgaug::Compose**, we call each data augmentation method by the pointer of their +base class **cv::imgaug::Transform**. We need to use a vector of type **cv::Ptr** to +store the addresses of all data augmentation instances. + +@code{.cpp} +std::vector > transforms {&randomCrop, &randomFlip, &resize}; +@endcode + +Then we construct the **cv::imgaug::Compose** class by passing `transforms` as the required argument. 
+ +@code{.cpp} +imgaug::Compose aug(transforms); +@endcode + +We call the compose method the same way as normal data augmentation methods. The composed +method will call all the methods in `transforms` on the given image sequentially: + +@code{.cpp} +Mat src = imread(samples::findFile("lena.jpg"), IMREAD_COLOR); +Mat dst; +aug.call(src, dst); +@endcode + +Here is the result we get: + +![](images/compose_out.jpg) + +@end_toggle + +@add_toggle_python +To compose multiple data augmentation methods into one, firstly you need to +initialize the data augmentation classes you want to use later: + +@code{.py} +randomCrop = imgaug.RandomCrop((300, 300)) +randomFlip = imgaug.RandomFlip(1) +resize = imgaug.Resize((224, 224)) +@endcode + +We store all data augmentation instances in a list. + +@code{.py} +transforms = [randomCrop, randomFlip, resize] +@endcode + +Then we initialize the cv::imgaug::Compose class by passing the list of all data augmentation instances as the argument. + +@code{.py} +aug = imgaug.Compose(transforms) +@endcode + +We call the compose method the same way as normal data augmentation methods. +The composed method will apply all the data augmentation methods in transforms list to the given image sequentially. + +@code{.py} +src = cv2.imread("lena.jpg", cv2.IMREAD_COLOR) +dst = aug.call(src) +@endcode + +Here is the result we get: + +![](images/compose_out.jpg) + +@end_toggle + +### Change the seed of random number generator +@add_toggle_cpp +In imgaug, we use **cv::imgaug::rng** as our random number generator. The role of rng is to generate +random numbers for some random methods. For example, in cv::imgaug::RandomCrop we need to generate the coordinates +of the upper-left corner of the cropped rectangle randomly, in which we will use `rng` to generate random +numbers in valid range. When a random number is generated by `rng`, the internal state of `rng` will change. +Thus, we probably won't get the same result when we call the same method again. In the above process, the most +important thing is the initial state of `rng`, which determines the subsequent numbers `rng` generated. So in some +cases if you want to replicate other one's results, or if you want to make sure the random values generated will be +different the next time you run the same program. You can manually set the initial state of the `rng` by calling +**cv::imgaug::setSeed**. By default, if you don't manually set the initial state of `rng`, its initial state will be +set to the tick count since it was first initialized. + +@code{.cpp} +int seed = 1234; +imgaug::setSeed(seed); +@endcode + +@end_toggle + +@add_toggle_python +In imgaug, we use **cv::imgaug::rng** as our random number generator. The role of rng is to generate +random numbers for some random methods. For example, in cv::imgaug::RandomCrop we need to generate the coordinates +of the upper-left corner of the cropped rectangle randomly, in which we will use `rng` to generate random +numbers in valid range. When a random number is generated by `rng`, the internal state of `rng` will change. +Thus, we probably won't get the same result when we call the same method again. In the above process, the most +important thing is the initial state of `rng`, which determines the subsequent numbers `rng` generated. So in some +cases if you want to replicate other one's results, or if you want to make sure the random values generated will be +different the next time you run the same program. 
In such cases, you can manually set the initial state of the `rng` by calling
+**cv::imgaug::setSeed**. By default, if you do not set it manually, the initial state of `rng` is
+set to the tick count at the moment it was first initialized.
+
+@code{.py}
+seed = 1234
+imgaug.setSeed(seed)
+@endcode
+
+@end_toggle
\ No newline at end of file
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg
new file mode 100644
index 00000000000..b874c481589
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg
new file mode 100644
index 00000000000..1017ee95331
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg
new file mode 100644
index 00000000000..dbfdc107b85
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown b/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown
new file mode 100644
index 00000000000..b5a1dd72eb6
--- /dev/null
+++ b/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown
@@ -0,0 +1,218 @@
+Data augmentation with imgaug in object detection {#tutorial_imgaug_object_detection}
+==============================
+
+@tableofcontents
+
+@prev_tutorial{tutorial_imgaug_basic_usage}
+@next_tutorial{tutorial_imgaug_pytorch}
+
+| | |
+| -: | :- |
+| Author | Chuyang Zhao |
+| Compatibility | OpenCV >= 4.0 |
+
+
+Introduction
+------
+In the previous tutorial, we demonstrated how to use imgaug to perform transforms on plain images.
+In some tasks, the inputs contain not only images but also their annotations. The imgaug
+module has been extended to support most mainstream computer vision tasks. Here we demonstrate how to use imgaug for
+object detection.
+
+
+Goal
+----
+In this tutorial, you will learn:
+- How to use imgaug to perform data augmentation on the data of an object detection task
+
+
+The inputs of an object detection task contain the source image, the annotated bounding boxes, and the class label
+for each bounding box. In C++, the input image is represented as cv::Mat, the annotated bounding boxes can be represented
+as `std::vector<cv::Rect>` in which each bounding box is a cv::Rect, and the labels of the objects inside the
+bounding boxes can be represented as `std::vector<int>`.
+
+The data augmentation methods for object detection are implemented under the namespace cv::imgaug::det; you can
+find the details of all implemented methods in the documentation of cv::imgaug::det.
+
+
+Usage
+-----
+### Apply single data augmentation method
+@add_toggle_cpp
+
+To use the imgaug module in object detection, we need to include the header file:
+
+@code{.cpp}
+#include <opencv2/imgaug.hpp>
+@endcode
+
+Take random rotation as an example: we first initialize a cv::imgaug::det::RandomRotation instance by:
+
+@code{.cpp}
+imgaug::det::RandomRotation aug(Vec2d(-30, 30));
+@endcode
+
+The first argument cv::Vec2d(-30, 30) is the range from which the rotation angle (in degrees) is uniformly sampled.
+
+Then we read the source image and load its annotation data, which include bounding boxes and class labels.
+
+@end_toggle
\ No newline at end of file
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg
new file mode 100644
index 00000000000..b874c481589
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_compose_out.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg
new file mode 100644
index 00000000000..1017ee95331
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_rotation_out.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg b/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg
new file mode 100644
index 00000000000..dbfdc107b85
Binary files /dev/null and b/modules/imgaug/tutorials/imgaug_obj_det/images/det_src.jpg differ
diff --git a/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown b/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown
new file mode 100644
index 00000000000..b5a1dd72eb6
--- /dev/null
+++ b/modules/imgaug/tutorials/imgaug_obj_det/imgaug_obj_det.markdown
@@ -0,0 +1,218 @@
+Data augmentation with imgaug in object detection {#tutorial_imgaug_object_detection}
+==============================
+
+@tableofcontents
+
+@prev_tutorial{tutorial_imgaug_basic_usage}
+@next_tutorial{tutorial_imgaug_pytorch}
+
+| | |
+| -: | :- |
+| Author | Chuyang Zhao |
+| Compatibility | OpenCV >= 4.0 |
+
+
+Introduction
+------
+In the previous tutorial, we demonstrated how to use imgaug to perform transforms on plain images.
+In some tasks, the inputs contain not only images but also annotations. We extend the imgaug
+module to support most of the mainstream computer vision tasks. Here we demonstrate how to use imgaug for
+object detection.
+
+
+Goal
+----
+In this tutorial, you will learn:
+- How to use imgaug to perform data augmentation on the data of an object detection task
+
+
+The inputs of an object detection task contain the source image, the annotated bounding boxes, and the class label
+for each bounding box. In C++, the input image is represented as a cv::Mat, the annotated bounding boxes can be represented
+as a `std::vector<Rect>` in which each bounding box is a cv::Rect, and the class labels of the objects in the
+bounding boxes can be represented as a `std::vector<int>`.
+
+The data augmentation methods for object detection are implemented under the namespace cv::imgaug::det; you can
+find more details about all the implemented methods in the documentation of cv::imgaug::det.
+
+
+Usage
+-----
+### Apply a single data augmentation method
+@add_toggle_cpp
+
+To use the imgaug module in object detection, we need to include the header file:
+
+@code{.cpp}
+#include <opencv2/imgaug.hpp>
+@endcode
+
+Take random rotation as an example: we first initialize a cv::imgaug::det::RandomRotation instance by:
+
+@code{.cpp}
+imgaug::det::RandomRotation aug(Vec2d(-30, 30));
+@endcode
+
+The first argument cv::Vec2d(-30, 30) is the range from which the rotation angle will be uniformly sampled.
+
+Then we read the source image and load its annotation data, which includes bounding boxes and class labels.
+In the following example, the annotation data contains two bounding boxes and two class labels:
+
+@code{.cpp}
+Mat src = imread(samples::findFile("lena.jpg"), IMREAD_COLOR);
+Mat dst;
+
+std::vector<Rect> bboxes{
+    Rect{112, 40, 249, 343},
+    Rect{61, 273, 113, 228}
+};
+
+std::vector<int> labels {1, 2};
+@endcode
+
+The bounding boxes on the source image are as follows:
+
+![](images/det_src.jpg)
+
+Then we apply the random rotation to the given image and its annotations with imgaug::det::RandomRotation::call:
+
+@code{.cpp}
+aug.call(src, dst, bboxes, labels);
+@endcode
+
+The augmented image and its annotations are as follows:
+
+![](images/det_rotation_out.jpg)
+
+Complete code of this example:
+@include imgaug/samples/det_sample.cpp
+
+@end_toggle
+
+@add_toggle_python
+
+In Python, you should import the following package:
+
+@code{.py}
+from cv2 import imgaug
+@endcode
+
+Be aware that the data augmentation methods for object detection are all in the submodule `cv2.imgaug.det`.
+
+Take random rotation as an example: we first initialize a cv::imgaug::det::RandomRotation instance by:
+
+@code{.py}
+aug = imgaug.det.RandomRotation((-30, 30))
+@endcode
+
+The first argument (-30, 30) is the range from which the rotation angle will be uniformly sampled.
+
+Then we read the source image and load its annotation data, which includes bounding boxes and class labels.
+In the following example, the annotation data contains two bounding boxes and two class labels:
+
+@code{.py}
+src = cv2.imread("lena.jpg", cv2.IMREAD_COLOR)
+
+bboxes = [
+    (112, 40, 249, 343),
+    (61, 273, 113, 228)
+]
+
+labels = [1, 2]
+@endcode
+
+@note We represent each bounding box with a four-element tuple (x, y, w, h),
+in which x and y are the coordinates of the top-left corner of the bounding box,
+and w and h are its width and height. The binding generator will
+convert the tuple into a cv::Rect in C++. Please make sure the elements in the tuple
+are in the right order.
+
+The bounding boxes on the source image are as follows:
+
+![](images/det_src.jpg)
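+
+In case you want to reproduce this visualization yourself, the boxes can be drawn with cv2.rectangle
+(a small sketch; it only relies on the (x, y, w, h) layout described in the note above):
+
+@code{.py}
+vis = src.copy()
+for (x, y, w, h) in bboxes:
+    # cv2.rectangle expects the top-left and bottom-right corners
+    cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 255, 0), 2)
+cv2.imwrite("det_src_vis.jpg", vis)
+@endcode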
+
+Then we apply the random rotation to the given image and its annotations with imgaug::det::RandomRotation::call:
+
+@code{.py}
+dst = aug.call(src, bboxes, labels)
+@endcode
+
+The augmented image and its annotations are as follows:
+
+![](images/det_rotation_out.jpg)
+
+Complete code of this example:
+@include imgaug/samples/det_sample.cpp
+
+@end_toggle
+
+### Compose multiple data augmentation methods
+@add_toggle_cpp
+Composing multiple data augmentation methods into one in the object detection module (cv::imgaug::det) is similar to doing it in the basic imgaug module (cv::imgaug).
+We also need to initialize multiple data augmentation instances from imgaug::det:
+
+@code{.cpp}
+imgaug::det::RandomRotation randomRotation(Vec2d(-30, 30));
+imgaug::det::RandomFlip randomFlip(1);
+imgaug::det::Resize resize(Size(224, 224));
+@endcode
+
+Different from the data augmentation classes in cv::imgaug, the data augmentation classes in cv::imgaug::det inherit from the base class
+cv::imgaug::det::Transform, so we use pointers of type cv::imgaug::det::Transform to store the addresses of the data augmentation
+instances in the det module. We store their pointers in a vector and then initialize the imgaug::det::Compose class with this vector:
+
+@code{.cpp}
+std::vector<imgaug::det::Transform*> transforms {&randomRotation, &randomFlip, &resize};
+imgaug::det::Compose aug(transforms);
+@endcode
+
+@warning You cannot compose data augmentation methods in the cv::imgaug::det module with methods in the cv::imgaug module,
+because they do not inherit from the same base class. You can only compose methods from the same module.
+
+Then we can call the composed method on the given image and its annotations as follows:
+
+@code{.cpp}
+aug.call(src, dst, bboxes, labels);
+@endcode
+
+The augmented image and its annotations are as follows:
+
+![](images/det_compose_out.jpg)
+
+Complete code of this example:
+@include imgaug/samples/det_compose_sample.cpp
+
+@end_toggle
+
+@add_toggle_python
+Composing multiple data augmentation methods into one in the object detection module (cv::imgaug::det) is similar to doing it in the basic imgaug module (cv::imgaug).
+We also need to initialize multiple data augmentation instances from imgaug::det:
+
+@code{.py}
+randomRotation = imgaug.det.RandomRotation((-30, 30))
+randomFlip = imgaug.det.RandomFlip(1)
+resize = imgaug.det.Resize((224, 224))
+@endcode
+
+We store all these instances in a list `transforms` and pass it as the argument to initialize the Compose class.
+
+@code{.py}
+transforms = [randomRotation, randomFlip, resize]
+aug = imgaug.det.Compose(transforms)
+@endcode
+
+@warning You cannot compose data augmentation methods in the cv::imgaug::det module with methods in the cv::imgaug module,
+because they do not inherit from the same base class. You can only compose methods from the same module.
+
+Then we can call the composed method on the given image and its annotations as follows:
+
+@code{.py}
+dst = aug.call(src, bboxes, labels)
+@endcode
+
+The augmented image and its annotations are as follows:
+
+![](images/det_compose_out.jpg)
+
+Complete code of this example:
+@include imgaug/samples/det_compose_sample.cpp
+
+@end_toggle
\ No newline at end of file
diff --git a/modules/imgaug/tutorials/imgaug_pytorch/imgaug_pytorch.markdown b/modules/imgaug/tutorials/imgaug_pytorch/imgaug_pytorch.markdown
new file mode 100644
index 00000000000..1c65c36b13a
--- /dev/null
+++ b/modules/imgaug/tutorials/imgaug_pytorch/imgaug_pytorch.markdown
@@ -0,0 +1,210 @@
+Use imgaug with PyTorch {#tutorial_imgaug_pytorch}
+==============================
+
+@tableofcontents
+
+@prev_tutorial{tutorial_imgaug_object_detection}
+
+| | |
+| -: | :- |
+| Author | Chuyang Zhao |
+| Compatibility | OpenCV >= 4.0 |
+
+Introduction
+------------
+Imgaug is the data augmentation module in OpenCV which allows you to process
+the data before feeding it into the model. Because imgaug is implemented in
+pure C++ and backed by OpenCV's efficient image processing operations,
+it runs faster and more efficiently than other existing Python-based
+implementations. In this tutorial, I will demonstrate how to use imgaug
+with PyTorch; specifically, how to preprocess the data before putting
+it into the PyTorch model for training or inference.
+
+
+Goals
+-----
+In this tutorial, you will learn how to:
+1. Use imgaug to perform data augmentation on your input data
+2. Use imgaug with PyTorch for the image classification task
+3. Use imgaug with PyTorch for the object detection task
+
+
+Usage
+-----
+### Use imgaug with PyTorch in an image classification task
+In this section, we use Imagenette as the training dataset. You can download it [here](https://github.com/fastai/imagenette).
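+
+If you are following along, the snippet below shows one way to load the image list that ships with the
+dataset (a sketch with assumed paths: the archive extracts to an `imagenette2` folder containing
+`noisy_imagenette.csv`; adjust the names if your copy differs). It produces the `root_dir`, `df_data`
+and `df_train` objects used in the rest of this section:
+
+@code{.py}
+import os
+import pandas as pd
+
+root_dir = "imagenette2"   # assumed extraction directory
+df_data = pd.read_csv(os.path.join(root_dir, "noisy_imagenette.csv"))
+df_train = df_data         # the dataset class below splits train/valid using the 'is_valid' column
+@endcode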
+
+First, we define the PyTorch dataset as follows:
+
+@code{.py}
+class ImagenetteDataset(torch.utils.data.Dataset):
+    def __init__(self, root, df_data, mode='train', transform=None):
+        super(ImagenetteDataset, self).__init__()
+        assert mode in ['train', 'valid']
+
+        self.root = root
+        self.transform = transform
+        labels = ['n01440764', 'n02102040', 'n02979186', 'n03000684', 'n03028079', 'n03394916', 'n03417042', 'n03425413', 'n03445777', 'n03888257']
+        self.label_to_num = {v: k for k, v in enumerate(labels)}
+
+        if mode == 'train':
+            self.df_data = df_data[df_data['is_valid'] == False][:256]
+        else:
+            self.df_data = df_data[df_data['is_valid'] == True]
+
+    def __len__(self):
+        return len(self.df_data)
+
+    def __getitem__(self, idx):
+        path = self.df_data.iloc[idx]['path']
+        path = os.path.join(self.root, path)
+        image = self.get_image(path)
+        label = path.split('/')[-2]
+        label = self.label_to_num[label]
+        return image, label
+
+    def get_image(self, path):
+        image = cv2.imread(path)
+        if self.transform:
+            image = self.transform.call(image)
+        image = np.transpose(image, (2, 0, 1))
+        return torch.tensor(image, dtype=torch.float)
+@endcode
+
+In this dataset, the `transform` argument is used to perform data augmentation on each image;
+we pass in the `transforms` object defined below.
+
+The transforms we use contain four data augmentation methods, composed into one
+using the cv::imgaug::Compose class.
+
+@code{.py}
+transforms = cv2.imgaug.Compose([
+    cv2.imgaug.RandomCrop((300, 300), (0,0,0,0)),
+    cv2.imgaug.RandomFlip(),
+    cv2.imgaug.Resize((500, 500)),
+    cv2.imgaug.Normalize(mean=(0.406, 0.456, 0.485), std=(0.225, 0.224, 0.229))
+])
+@endcode
+
+@note The mean and std we pass to cv2.imgaug.Normalize are [0.406, 0.456, 0.485] and [0.225, 0.224, 0.229]
+respectively, which are slightly different from the mean and std of ImageNet (mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225]).
+This is because the ImageNet mean and std are given for images in RGB format, while images read by OpenCV are in BGR format.
+So we need to reverse the order of the original ImageNet mean and std to make them suitable for images read by OpenCV.
+
+After constructing the dataset and building the model, we can start training our model:
+
+@code{.py}
+train_set = ImagenetteDataset(root_dir, df_train, 'train', transforms)
+
+train_loader = data.DataLoader(train_set, num_workers=0, batch_size=16, drop_last=True, shuffle=True)
+model = resnet18(pretrained=True)
+model.fc = torch.nn.Linear(in_features=512, out_features=10)
+optimizer = torch.optim.Adam(model.parameters(), lr=lr)
+criterion = torch.nn.CrossEntropyLoss()
+
+train(train_loader, model, 1, criterion, optimizer)
+@endcode
+
+Complete code of the example is as follows:
+
+@include samples/train_cls_net.py
+
+### Use imgaug with PyTorch in an object detection task
+
+In this section, we use the Penn-Fudan dataset to train the object detection model.
+You can download the dataset from [here](https://www.cis.upenn.edu/~jshi/ped_html/).
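+
+After extracting the archive, the dataset root should contain the `PNGImages` and `PedMasks` folders that
+the dataset class below relies on (a minimal check; the root path is an assumption):
+
+@code{.py}
+import os
+
+root = "PennFudanPed"              # assumed extraction directory
+print(sorted(os.listdir(root)))    # should include 'PNGImages' and 'PedMasks'
+@endcode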
+
+Similarly, we first define the PyTorch dataset:
+@code{.py}
+class PennFudanDataset(torch.utils.data.Dataset):
+    def __init__(self, root, transforms=None):
+        self.root = root
+        self.transforms = transforms
+        # load all image files, sorting them to
+        # ensure that they are aligned
+        self.imgs = list(sorted(os.listdir(os.path.join(root, "PNGImages"))))
+        self.masks = list(sorted(os.listdir(os.path.join(root, "PedMasks"))))
+
+    def _get_boxes(self, mask):
+        obj_ids = np.unique(mask)
+        # first id is the background, so remove it
+        obj_ids = obj_ids[1:]
+
+        # split the color-encoded mask into a set
+        # of binary masks
+        masks = mask == obj_ids[:, None, None]
+
+        # get bounding box coordinates for each mask
+        num_objs = len(obj_ids)
+        for i in range(num_objs):
+            pos = np.where(masks[i])
+            xmin = np.min(pos[1])
+            xmax = np.max(pos[1])
+            ymin = np.min(pos[0])
+            ymax = np.max(pos[0])
+            yield xmin, ymin, xmax, ymax
+
+    def __getitem__(self, idx):
+        # load images and masks
+        img_path = os.path.join(self.root, "PNGImages", self.imgs[idx])
+        mask_path = os.path.join(self.root, "PedMasks", self.masks[idx])
+        img = cv2.imread(img_path)
+        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+        # mask is array of size (H, W), all elements of array are integers
+        # background is 0, and each distinct person is represented as a distinct integer starting from 1
+        # you can treat mask as grayscale image
+        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
+        boxes = []
+        for x1, y1, x2, y2 in self._get_boxes(mask):
+            # NOTE: in opencv, box is represented as (x, y, width, height)
+            boxes.append([x1, y1, x2-x1, y2-y1])
+        num_objs = len(boxes)
+        labels = torch.ones((num_objs,), dtype=torch.int64)
+
+        if self.transforms is not None:
+            img, boxes = self.transforms.call(img, boxes)
+
+        # 1. transpose from (h, w, c) to (c, h, w)
+        # 2. normalize data into range 0-1
+        # 3. convert from np.array to torch.tensor
+        img = torch.tensor(np.transpose(img, (2, 0, 1)), dtype=torch.float32)
+        boxes = [[x1, y1, x1+width, y1+height] for x1, y1, width, height in boxes]
+        boxes = torch.as_tensor(boxes, dtype=torch.float32)
+
+        return img, boxes, labels
+
+    def __len__(self):
+        return len(self.imgs)
+
+    @staticmethod
+    def collate_fn(batch):
+        images = list()
+        targets = list()
+
+        for item in batch:
+            images.append(item[0])
+            target = {"boxes": item[1], "labels": item[2]}
+            targets.append(target)
+
+        images = torch.stack(images, dim=0)
+
+        return images, targets
+@endcode
+
+Then we define the transforms we use for data augmentation as:
+@code{.py}
+def get_transforms():
+    transforms = cv2.imgaug.det.Compose([
+        cv2.imgaug.det.RandomFlip(),
+        cv2.imgaug.det.Resize((500, 500)),
+    ])
+    return transforms
+@endcode
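+
+With the dataset and transforms in place, a typical way to put them together is through a PyTorch
+DataLoader using the collate_fn defined above (a short sketch; the dataset path, batch size and worker
+count are arbitrary assumptions):
+
+@code{.py}
+dataset = PennFudanDataset("PennFudanPed", transforms=get_transforms())
+data_loader = torch.utils.data.DataLoader(
+    dataset, batch_size=2, shuffle=True, num_workers=0,
+    collate_fn=PennFudanDataset.collate_fn)
+
+# images: tensor of shape (2, 3, 500, 500); targets: list of dicts with "boxes" and "labels"
+images, targets = next(iter(data_loader))
+@endcode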
+
+Complete code of the example is as follows:
+
+@include samples/train_det_net.py
\ No newline at end of file
diff --git a/modules/imgaug/tutorials/table_of_content_imgaug.markdown b/modules/imgaug/tutorials/table_of_content_imgaug.markdown
new file mode 100644
index 00000000000..90b087937ee
--- /dev/null
+++ b/modules/imgaug/tutorials/table_of_content_imgaug.markdown
@@ -0,0 +1,36 @@
+Tutorials for data augmentation module {#tutorial_table_of_content_imgaug}
+===============================================================
+
+Data augmentation techniques are widely used in deep learning training to expand
+the training samples and overcome the overfitting problem. The imgaug module in OpenCV is
+implemented in pure C++ and powered by OpenCV's efficient image processing operations,
+so it runs much faster and more efficiently than Python-based implementations.
+
+With the binding generator provided by OpenCV, imgaug can be used not only from C++ but also from
+other languages like Python, Java, etc. Conversely, you can easily port your code
+from these languages to C++, which is especially useful when you want to move
+a model and its data preprocessing pipeline from Python to a C++ production environment.
+
+- @subpage tutorial_imgaug_basic_usage
+
+  *Compatibility:* >= OpenCV 4.0
+
+  *Author:* Chuyang Zhao
+
+  Basic usage of the imgaug module. Perform data augmentation on images.
+
+- @subpage tutorial_imgaug_object_detection
+
+  *Compatibility:* >= OpenCV 4.0
+
+  *Author:* Chuyang Zhao
+
+  Use imgaug to perform data augmentation for the object detection task.
+
+- @subpage tutorial_imgaug_pytorch
+
+  *Compatibility:* >= OpenCV 4.0
+
+  *Author:* Chuyang Zhao
+
+  Use imgaug with PyTorch for different computer vision tasks.