# Keras Image Models

<div align="center">
- <img width="50%" src="docs/banner/kimm.png" alt="KIMM">
+ <img width="50%" src="https://github.com/james77777778/kimm/assets/20734616/b21db8f2-307b-4791-b93d-e913e45fb238" alt="KIMM">

[![PyPI](https://img.shields.io/pypi/v/kimm)](https://pypi.org/project/kimm/)
[![Contributions Welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/james77777778/kimm/issues)
@@ -24,10 +24,15 @@ pip install keras kimm

## Quickstart

- ### Use Pretrained Model
+ ### Image Classification Using the Model Pretrained on ImageNet
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/14WxYgVjlwCIO9MwqPYW-dskbTL2UHsVN?usp=sharing)

```python
- from keras import random
+ import cv2
+ import keras
+ from keras import ops
+ from keras.applications.imagenet_utils import decode_predictions

import kimm

@@ -38,58 +43,81 @@ print(kimm.list_models())
print(kimm.list_models("efficientnet", weights="imagenet"))  # fuzzy search

# Initialize the model with pretrained weights
- model = kimm.models.EfficientNetV2B0(weights="imagenet")
-
- # Predict
- x = random.uniform([1, 192, 192, 3]) * 255.0
- y = model.predict(x)
- print(y.shape)
+ model = kimm.models.EfficientNetV2B0()
+ image_size = model._default_size

- # Initialize the model as a feature extractor with pretrained weights
- model = kimm.models.EfficientNetV2B0(
-     feature_extractor=True, weights="imagenet"
+ # Load an image as the model input
+ image_path = keras.utils.get_file(
+     "african_elephant.jpg", "https://i.imgur.com/Bvro0YD.png"
)
+ image = cv2.imread(image_path)
+ image = cv2.resize(image, (image_size, image_size))
+ x = ops.convert_to_tensor(image)
+ x = ops.expand_dims(x, axis=0)

- # Extract features for downstream tasks
- y = model.predict(x)
- print(y.keys())
- print(y["BLOCK5_S32"].shape)
+ # Predict
+ preds = model.predict(x)
+ print("Predicted:", decode_predictions(preds, top=3)[0])
```

- ### Transfer Learning
+ ```bash
+ ['ConvMixer1024D20', 'ConvMixer1536D20', 'ConvMixer736D32', 'ConvNeXtAtto', ...]
+ ['EfficientNetB0', 'EfficientNetB1', 'EfficientNetB2', 'EfficientNetB3', ...]
+ 1/1 ━━━━━━━━━━━━━━━━━━━━ 11s 11s/step
+ Predicted: [('n02504458', 'African_elephant', 0.90578836), ('n01871265', 'tusker', 0.024864597), ('n02504013', 'Indian_elephant', 0.01161992)]
+ ```
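
A note on the image-loading step in the quickstart: `cv2.imread` returns pixels in BGR channel order. ImageNet weights ported from `timm` are typically trained on RGB input, so a channel swap before inference may be needed; whether kimm expects RGB here is an assumption worth verifying against the library. A minimal NumPy sketch of the swap follows, where `bgr_to_rgb` is a hypothetical helper and the tiny array stands in for a decoded image:

```python
import numpy as np


def bgr_to_rgb(image):
    """Reverse the channel (last) axis of an H x W x 3 array."""
    # ascontiguousarray materializes the reversed view as a regular array.
    return np.ascontiguousarray(image[..., ::-1])


# A 1x2 dummy "image": one pure-blue and one pure-red pixel, stored as BGR.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
print(rgb.tolist())
# [[[0, 0, 255], [255, 0, 0]]]
```

With OpenCV available, `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` performs the same conversion.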

- ```python
- from keras import layers
- from keras import models
- from keras import random
+ ### An end-to-end example: fine-tuning an image classification model on a cats vs. dogs dataset

- import kimm
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1IbqfqG2NKEOKvBOznIPT1kjOdVPfThmd?usp=sharing)

- # Initialize the model as a backbone with pretrained weights
- backbone = kimm.models.EfficientNetV2B0(
-     input_shape=[224, 224, 3],
-     include_top=False,
-     pooling="avg",
-     weights="imagenet",
- )
+ Using `kimm.models.EfficientNetLiteB0`:

- # Freeze the backbone for transfer learning
- backbone.trainable = False
+ <div align="center">
+ <img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/cbfc0773-a3fa-407d-be9a-fba4f19da6d3" alt="kimm_prediction_0">

- # Construct the model with new head
- inputs = layers.Input([224, 224, 3])
- x = backbone(inputs, training=False)
- x = layers.Dropout(0.2)(x)
- outputs = layers.Dense(2)(x)
- model = models.Model(inputs, outputs)
+ <img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/2eac0831-75bb-4790-a3af-412c3e09cf8f" alt="kimm_prediction_1">
+ </div>

- # Train the new model (put your own logic here)
+ Reference: [Transfer learning & fine-tuning (keras.io)](https://keras.io/guides/transfer_learning/#an-endtoend-example-finetuning-an-image-classification-model-on-a-cats-vs-dogs-dataset)

- # Predict
- x = random.uniform([1, 224, 224, 3]) * 255.0
- y = model.predict(x)
- print(y.shape)
- ```
+ ### Grad-CAM
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1h25VmsYDOLL6BNbRPEVOh1arIgcEoHu6?usp=sharing)
+
+ Using `kimm.models.MobileViTS`:
+
+ <div align="center">
+ <img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/cb5022a3-aaea-4324-a9cd-3d2e63a0a6b2" alt="grad_cam">
+ </div>
+
+ Reference: [Grad-CAM class activation visualization (keras.io)](https://keras.io/examples/vision/grad_cam/)
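
For readers unfamiliar with the technique, the core of Grad-CAM is a weighted sum of the last convolutional feature maps, where each channel's weight is the spatial mean of the class-score gradient for that channel. The sketch below shows only that combination step in NumPy. It assumes the activations and gradients have already been obtained from a model (random arrays stand in for them here), and `grad_cam_heatmap` is a hypothetical helper, not part of kimm.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine feature maps and gradients of shape (H, W, C) into an (H, W) heatmap."""
    # Channel importance: average each channel's gradient over the spatial axes.
    weights = gradients.mean(axis=(0, 1))  # shape (C,)
    # Weighted sum over channels, then ReLU to keep only positive evidence.
    heatmap = np.maximum(activations @ weights, 0.0)  # shape (H, W)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap


# Placeholder inputs standing in for real model outputs (an assumption).
rng = np.random.default_rng(0)
activations = rng.random((7, 7, 8))
gradients = rng.standard_normal((7, 7, 8))

heatmap = grad_cam_heatmap(activations, gradients)
print(heatmap.shape)
```

In practice the resulting map is resized to the input resolution and overlaid on the image, as the linked keras.io example does.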
+
+ ## Model Zoo
+
+ |Model|Paper|Weights are ported from|API|
+ |-|-|-|-|
+ |ConvMixer|[ICLR 2022 Submission](https://arxiv.org/abs/2201.09792)|`timm`|`kimm.models.ConvMixer*`|
+ |ConvNeXt|[CVPR 2022](https://arxiv.org/abs/2201.03545)|`timm`|`kimm.models.ConvNeXt*`|
+ |DenseNet|[CVPR 2017](https://arxiv.org/abs/1608.06993)|`timm`|`kimm.models.DenseNet*`|
+ |EfficientNet|[ICML 2019](https://arxiv.org/abs/1905.11946)|`timm`|`kimm.models.EfficientNet*`|
+ |EfficientNetLite|[ICML 2019](https://arxiv.org/abs/1905.11946)|`timm`|`kimm.models.EfficientNetLite*`|
+ |EfficientNetV2|[ICML 2021](https://arxiv.org/abs/2104.00298)|`timm`|`kimm.models.EfficientNetV2*`|
+ |GhostNet|[CVPR 2020](https://arxiv.org/abs/1911.11907)|`timm`|`kimm.models.GhostNet*`|
+ |GhostNetV2|[NeurIPS 2022](https://arxiv.org/abs/2211.12905)|`timm`|`kimm.models.GhostNetV2*`|
+ |InceptionV3|[CVPR 2016](https://arxiv.org/abs/1512.00567)|`timm`|`kimm.models.InceptionV3`|
+ |LCNet|[arXiv 2021](https://arxiv.org/abs/2109.15099)|`timm`|`kimm.models.LCNet*`|
+ |MobileNetV2|[CVPR 2018](https://arxiv.org/abs/1801.04381)|`timm`|`kimm.models.MobileNetV2*`|
+ |MobileNetV3|[ICCV 2019](https://arxiv.org/abs/1905.02244)|`timm`|`kimm.models.MobileNetV3*`|
+ |MobileViT|[ICLR 2022](https://arxiv.org/abs/2110.02178)|`timm`|`kimm.models.MobileViT*`|
+ |RegNet|[CVPR 2020](https://arxiv.org/abs/2003.13678)|`timm`|`kimm.models.RegNet*`|
+ |ResNet|[CVPR 2015](https://arxiv.org/abs/1512.03385)|`timm`|`kimm.models.ResNet*`|
+ |TinyNet|[NeurIPS 2020](https://arxiv.org/abs/2010.14819)|`timm`|`kimm.models.TinyNet*`|
+ |VGG|[ICLR 2015](https://arxiv.org/abs/1409.1556)|`timm`|`kimm.models.VGG*`|
+ |ViT|[ICLR 2021](https://arxiv.org/abs/2010.11929)|`timm`|`kimm.models.VisionTransformer*`|
+ |Xception|[CVPR 2017](https://arxiv.org/abs/1610.02357)|`keras`|`kimm.models.Xception`|
+
+ The export scripts can be found in `tools/convert_*.py`.
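
The trailing `*` in the API column stands for the variant suffixes of each family (for example, `kimm.models.ConvMixer*` covers `ConvMixer1024D20` and `ConvMixer736D32`). As an illustration only, the sketch below applies the same glob-style matching to the names shown in the quickstart output, using the standard-library `fnmatch` module. `models_in_family` is a hypothetical helper; with kimm installed, `kimm.list_models()` would supply the real list.

```python
from fnmatch import fnmatchcase

# Names copied from the sample output in the quickstart (a truncated list).
names = [
    "ConvMixer1024D20",
    "ConvMixer1536D20",
    "ConvMixer736D32",
    "ConvNeXtAtto",
    "EfficientNetB0",
    "EfficientNetB1",
    "EfficientNetB2",
    "EfficientNetB3",
]


def models_in_family(all_names, pattern):
    """Return the names that match a glob-style pattern, case-sensitively."""
    return [name for name in all_names if fnmatchcase(name, pattern)]


print(models_in_family(names, "ConvMixer*"))
# ['ConvMixer1024D20', 'ConvMixer1536D20', 'ConvMixer736D32']
```

Note that a plain prefix pattern such as `EfficientNet*` would also match `EfficientNetLite*` and `EfficientNetV2*` names, which the table lists as separate families.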

## License