ResNet18-2Plus1DD

A fully serializable 2Plus1D (3D) implementation of ResNet18 that incorporates improvements from the paper "Bag of Tricks for Image Classification with Convolutional Neural Networks", along with additional optimizations and modifications by the author.

2Plus1D processes the spatial and temporal dimensions separately: each 3D convolution is factorized into a 2D spatial convolution followed by a 1D temporal convolution, applied one after the other. This enables efficient handling of high-dimensional data while keeping computational costs relatively low. The approach was introduced in "A Closer Look at Spatiotemporal Convolutions for Action Recognition".
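To make the factorization concrete, the sketch below counts the weights of a full 3D convolution and of its 2Plus1D counterpart. The function names are hypothetical, biases are ignored, and the intermediate channel width follows the rule given in the paper cited above, rounded up rather than down. The rounding is an assumption, chosen only because it reproduces the three stem parameter counts shown in the model summary later in this README (2,706, 27,648 and 55,680); it is not taken from the repository's source.

```python
def conv3d_params(k_t, k_s, c_in, c_out):
    """Weights of a full k_t x k_s x k_s 3D convolution (bias ignored)."""
    return k_t * k_s * k_s * c_in * c_out


def mid_channels(k_t, k_s, c_in, c_out):
    """Intermediate width M that keeps the factorized block near the 3D budget.

    Assumption: the ratio is rounded up (integer ceiling); the paper rounds down.
    """
    full = k_t * k_s * k_s * c_in * c_out
    denom = k_s * k_s * c_in + k_t * c_out
    return -(-full // denom)  # ceiling division on integers


def conv2plus1d_params(k_t, k_s, c_in, c_out):
    """Weights of a 1 x k_s x k_s spatial conv followed by a k_t x 1 x 1 temporal conv."""
    m = mid_channels(k_t, k_s, c_in, c_out)
    spatial = k_s * k_s * c_in * m   # c_in -> m channels
    temporal = k_t * m * c_out       # m -> c_out channels
    return spatial + temporal
```

For example, `conv2plus1d_params(3, 3, 32, 32)` and `conv3d_params(3, 3, 32, 32)` both give 27,648: under this rule the factorized block has roughly the same number of weights as the 3D convolution it replaces.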

When to Use 2+1D Convolutions?

2+1D convolutions excel in video analysis (action recognition, motion detection), where spatial and temporal features are naturally separable. For comparison:

3D Convolutions: Better suited to dense spatiotemporal correlations (e.g., fluid dynamics).
2+1D Convolutions: A good balance of efficiency and accuracy for most video tasks.

This repository also includes implementations of the Hardswish and Mish activation functions.
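For reference, both activations have short closed forms. The scalar sketch below uses only the standard library and is meant to illustrate the formulas; it is not the repository's implementation, which presumably operates on tensors, and the function names are chosen here for illustration.

```python
import math


def hardswish(x):
    """Hardswish: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

Both functions are close to the identity for large positive inputs and close to zero for large negative inputs. Hardswish is piecewise and cheap to evaluate, while Mish is smooth everywhere.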

The codebase integrates directly with TensorFlow and Keras pipelines.

Key Enhancements

Modified Stem: Uses three convolutional layers instead of a single one.
ResNet-B Inspired Strides: Moves the stride in the residual blocks from the first convolution to the second.
ResNet-D Inspired Shortcut: Adds an average pooling layer before the 1x1 convolution in the shortcut connection.
Reduced Downsampling: The temporal dimension is downsampled only twice, both times in the stem block, while the spatial dimension follows the original approach and is downsampled five times.
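The effect of the reduced downsampling can be traced with a few lines of arithmetic. The sketch below assumes "same" padding, so each strided layer maps a length n to ceil(n / stride), and it assumes the stride schedule implied by the description above: stride 2 in the stem convolution and in the max pooling layer on every axis, then stride 2 on the spatial axes only in three later residual stages. The helper names are hypothetical.

```python
import math


def same_padding_output(size, stride):
    """Output length of a strided conv or pooling layer with 'same' padding."""
    return math.ceil(size / stride)


def trace(size, strides):
    """Apply a sequence of strides to one axis and return every intermediate length."""
    sizes = [size]
    for stride in strides:
        sizes.append(same_padding_output(sizes[-1], stride))
    return sizes


# Assumed per-axis schedule: stem conv, max pooling, then three residual stages.
TEMPORAL_STRIDES = [2, 2, 1, 1, 1]  # downsampled twice, in the stem only
SPATIAL_STRIDES = [2, 2, 2, 2, 2]   # downsampled five times

frames = trace(32, TEMPORAL_STRIDES)   # [32, 16, 8, 8, 8, 8]
pixels = trace(256, SPATIAL_STRIDES)   # [256, 128, 64, 32, 16, 8]
```

With 32 input frames, 8 frames survive to the final stage, whereas halving the temporal axis five times would leave only one.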

Note: The images above represent the architectural modifications. They depict 2D convolutional layers, whereas this project is focused on 2Plus1D(3D) convolutions. The ResNet-C image is sourced from the referenced paper, while the shortcut image is created by the author.

Installation & Usage

This code is compatible with Python 3.12.8 and TensorFlow 2.18.0.

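from ResNet182Plus1DD import ResNet182Plus1DD


# Instantiate the model with its default arguments.
model = ResNet182Plus1DD()
# Input shape: (batch, frames, height, width, channels).
model.build((None, 32, 256, 256, 3))
model.summary()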

Model Summary Example

Model: "res_net182_plus1dd"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2_plus1d_layer                   │ (None, 16, 128, 128, 32)    │           2,706 │
│ (Conv2Plus1DLayer)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2_plus1d_layer_1                 │ (None, 16, 128, 128, 32)    │          27,648 │
│ (Conv2Plus1DLayer)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2_plus1d_layer_2                 │ (None, 16, 128, 128, 64)    │          55,680 │
│ (Conv2Plus1DLayer)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling3d (MaxPooling3D)         │ (None, 8, 64, 64, 64)       │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd (Residual2Plus1DD) │ (None, 8, 64, 64, 64)       │         221,184 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_1                  │ (None, 8, 32, 32, 128)      │         672,384 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_2                  │ (None, 8, 32, 32, 128)      │         884,736 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_3                  │ (None, 8, 16, 32, 256)      │       2,687,616 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_4                  │ (None, 8, 16, 32, 256)      │       3,538,944 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_5                  │ (None, 8, 8, 16, 512)       │      10,749,696 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ residual2_plus1dd_6                  │ (None, 8, 8, 16, 512)       │      14,155,776 │
│ (Residual2Plus1DD)                   │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ global_average_pooling3d             │ (None, 512)                 │               0 │
│ (GlobalAveragePooling3D)             │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 256)                 │         131,328 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 33,127,698 (126.37 MB)
 Trainable params: 33,127,698 (126.37 MB)
 Non-trainable params: 0 (0.00 B)
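The totals in the summary can be cross-checked by hand. The sketch below sums the per-layer counts copied from the table and converts the result to a size, assuming 32-bit float weights (4 bytes each) and a size reported in units of 2^20 bytes. Both assumptions are inferred from the figures matching, not from the repository.

```python
# Per-layer parameter counts copied from the summary above.
layer_params = {
    "conv2_plus1d_layer": 2_706,
    "conv2_plus1d_layer_1": 27_648,
    "conv2_plus1d_layer_2": 55_680,
    "residual2_plus1dd": 221_184,
    "residual2_plus1dd_1": 672_384,
    "residual2_plus1dd_2": 884_736,
    "residual2_plus1dd_3": 2_687_616,
    "residual2_plus1dd_4": 3_538_944,
    "residual2_plus1dd_5": 10_749_696,
    "residual2_plus1dd_6": 14_155_776,
    "dense": 131_328,
}

total_params = sum(layer_params.values())

# Assumption: float32 weights, 4 bytes per parameter, size in units of 2**20 bytes.
size_mb = total_params * 4 / 2**20

print(f"{total_params:,} params, {size_mb:.2f} MB")
```

Under those assumptions the script reproduces the reported total of 33,127,698 parameters and 126.37 MB.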

License

This work is released under the MIT License.

Repository: AliKHaliliT/ResNet18-2Plus1DD
