Commit 7c7337b

author
helloworld
committed
Added New model Resnet_50_ArcLoss, made api changes to easily add new models in future, fixed ibm_generate
1 parent 15c117f commit 7c7337b

152 files changed

+795
-215
lines changed

CITATION.bib

100644 → 100755
File mode changed.

LICENSE.md

100644 → 100755
File mode changed.

MANIFEST.in

100644 → 100755
File mode changed.

README.md

100644 → 100755
Lines changed: 76 additions & 28 deletions
@@ -107,11 +107,15 @@ The pathname of the generated wakeword needs to passed to the HotwordDetector de
107107
```python
108108
HotwordDetector(
109109
hotword="hello",
110+
model = Resnet50_Arc_loss(),
110111
reference_file = "/full/path/name/of/hello_ref.json",
111112
threshold=0.9, #min confidence required to consider a trigger
112113
relaxation_time = 0.8 #default value ,in seconds
113114
)
114115
```
116+
117+
The `model` argument accepts an instance of `Resnet50_Arc_loss` or `First_Iteration_Siamese`.
118+
115119
The `relaxation_time` parameter sets the minimum time between any two triggers; any potential trigger that occurs before the relaxation time has elapsed is cancelled.
116120

117121
The detector uses a sliding-window approach, which produces multiple triggers for a single utterance of a hotword. The `relaxation_time` parameter can be used to control these repeated triggers; in most cases the default of 0.8 seconds will do.
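
The interaction between overlapping windows and `relaxation_time` can be sketched as follows. This is only an illustration under assumptions: `RelaxedTrigger` and its `update` method are hypothetical names invented here, not part of `eff_word_net`, and the library's internal logic may differ. The sketch accepts a trigger only when the confidence reaches the threshold and at least `relaxation_time` seconds have passed since the last accepted trigger.

```python
from dataclasses import dataclass


@dataclass
class RelaxedTrigger:
    """Hypothetical helper (not the eff_word_net API): suppresses triggers
    that arrive within `relaxation_time` seconds of the last accepted one."""

    threshold: float = 0.9        # min confidence required to consider a trigger
    relaxation_time: float = 0.8  # min gap between two accepted triggers, in seconds
    _last_trigger: float = float("-inf")

    def update(self, confidence: float, timestamp: float) -> bool:
        # Ignore frames whose confidence is below the threshold.
        if confidence < self.threshold:
            return False
        # Cancel triggers that fall inside the relaxation period.
        if timestamp - self._last_trigger < self.relaxation_time:
            return False
        self._last_trigger = timestamp
        return True


# Overlapping windows score the same utterance several times:
# (timestamp in seconds, confidence) pairs, made up for illustration.
frames = [(0.00, 0.95), (0.25, 0.97), (0.50, 0.93), (1.50, 0.96)]

gate = RelaxedTrigger(threshold=0.9, relaxation_time=0.8)
accepted = [t for t, conf in frames if gate.update(conf, t)]
print(accepted)
```

With these made-up frames, the three high-confidence windows around the first utterance collapse into a single trigger, and the later window at 1.5 s is accepted as a new one.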
@@ -133,14 +137,25 @@ from eff_word_net import samples_loc
133137
import os
134138
from eff_word_net.streams import SimpleMicStream
135139
from eff_word_net.engine import HotwordDetector
140+
141+
from eff_word_net.audio_processing import Resnet50_Arc_loss
142+
136143
from eff_word_net import samples_loc
137144

145+
base_model = Resnet50_Arc_loss()
146+
138147
mycroft_hw = HotwordDetector(
139-
hotword="Mycroft",
140-
reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
141-
)
148+
hotword="mycroft",
149+
model = base_model,
150+
reference_file="mycroft_ref.json",
151+
threshold=0.7,
152+
relaxation_time=2
153+
)
142154

143-
mic_stream = SimpleMicStream()
155+
mic_stream = SimpleMicStream(
156+
window_length_secs=1.5,
157+
sliding_window_secs=0.75,
158+
)
144159
mic_stream.start_stream()
145160

146161
print("Say Mycroft ")
@@ -163,51 +178,78 @@ The library provides a computation friendly way
163178
to detect multiple hotwords from a given stream, instead of running `scoreFrame()` of each wakeword individually
164179

165180
```python
181+
166182
import os
167183
from eff_word_net.streams import SimpleMicStream
168184
from eff_word_net import samples_loc
169185
print(samples_loc)
170186

171-
alexa_hw = HotwordDetector(
172-
hotword="Alexa",
173-
reference_file = os.path.join(samples_loc,"alexa_ref.json"),
174-
)
175187

176-
siri_hw = HotwordDetector(
177-
hotword="Siri",
178-
reference_file = os.path.join(samples_loc,"siri_ref.json"),
179-
)
188+
base_model = Resnet50_Arc_loss()
180189

181190
mycroft_hw = HotwordDetector(
182-
hotword="mycroft",
183-
reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
184-
activation_count=3
185-
)
186-
187-
multi_hw_engine = MultiHotwordDetector(
188-
detector_collection = [
189-
alexa_hw,
190-
siri_hw,
191-
mycroft_hw,
192-
],
193-
)
194-
195-
mic_stream = SimpleMicStream()
191+
hotword="mycroft",
192+
model = base_model,
193+
reference_file=os.path.join(samples_loc,"mycroft_ref.json"),
194+
threshold=0.7,
195+
relaxation_time=2
196+
)
197+
198+
alexa_hw = HotwordDetector(
199+
hotword="alexa",
200+
model=base_model,
201+
reference_file=os.path.join(samples_loc,"alexa_ref.json"),
202+
threshold=0.7,
203+
relaxation_time=2,
204+
#verbose=True
205+
)
206+
207+
208+
computer_hw = HotwordDetector(
209+
hotword="computer",
210+
model=base_model,
211+
reference_file=os.path.join(samples_loc,"computer_ref.json"),
212+
threshold=0.7,
213+
relaxation_time=2,
214+
#verbose=True
215+
)
216+
217+
multi_hotword_detector = MultiHotwordDetector(
218+
[mycroft_hw, alexa_hw, computer_hw],
219+
model=base_model,
220+
continuous=True,
221+
)
222+
223+
mic_stream = SimpleMicStream(window_length_secs=1.5, sliding_window_secs=0.75)
196224
mic_stream.start_stream()
197225

198-
print("Say Mycroft / Alexa / Siri")
226+
print("Say ", " / ".join([x.hotword for x in multi_hotword_detector.detector_collection]))
199227

200228
while True :
201229
frame = mic_stream.getFrame()
202-
result = multi_hw_engine.findBestMatch(frame)
230+
result = multi_hotword_detector.findBestMatch(frame)
203231
if(None not in result):
204232
print(result[0],f",Confidence {result[1]:0.4f}")
205233

234+
206235
```
207236
<br>
208237

209238
Access documentation of the library from here : https://ant-brain.github.io/EfficientWord-Net/
210239

240+
## Change notes from 0.2.2 to v1.0.1
241+
### New model: Resnet_50_Arc_loss, with large improvements
242+
We trained a new model from scratch on a modified, distilled dataset from MLCommons, using ArcLoss instead of triplet loss.
243+
244+
The resulting model is stored as resnet_50_arcloss.
245+
246+
The newer model shows much better resilience to background noise and requires fewer samples for good accuracy.
247+
248+
Minor changes to the API flow make it easier to add newer models.
249+
250+
The newer model handles a fixed window length of 1.5 seconds.
251+
252+
The old model can still be accessed through first_iteration_siamese
211253

212254
## Change notes from v0.1.1 to 0.2.2
213255
Major changes replace the complex logic for handling multiple triggers per utterance with simpler logic and a simpler API for programmers
@@ -220,6 +262,7 @@ Introduces breaking changes
220262

221263
## FAQ :
222264
* **Hotword performance is bad** : if you are having an issue like this, feel free to ask about it in [discussions](https://github.com/Ant-Brain/EfficientWord-Net/discussions/4)
265+
* **Can it run on microcontrollers like Arduino?** : No, the new model Resnet_50_Arcloss is too heavy to run on an Arduino (roughly 88 MB in size). We plan to add support for pruned versions of the model so that it becomes light enough to run on tiny devices; for now it should be able to run on Raspberry Pi-like devices.
223266

224267
## CONTRIBUTION:
225268
* If you have ideas to make the project better, feel free to ping us in [discussions](https://github.com/Ant-Brain/EfficientWord-Net/discussions/3)
@@ -230,8 +273,13 @@ Introduces breaking changes
230273
* Add audio file handler in streams. PR's are welcome.
231274
* Remove librosa requirement to encourage generating reference files directly in edge devices
232275
* Add more detailed documentation explaining the sliding window concept
276+
* Add model finetuning support
277+
* Add support for sparse and fine-grained pruning, where the resulting models could be used for finetuning (already in progress)
233278

234279
## SUPPORT US:
280+
235281
Our hotword detector's performance is notably low when compared to Porcupine. We have thought about better NN architectures for the engine and hope to outperform Porcupine. This has been our undergrad project. Hence your support and encouragement will motivate us to develop the engine. If you loved this project recommend this to your peers, give us a 🌟 in Github and a clap 👏 in [medium](https://link.medium.com/yMBmWGM03kb).
236282

283+
Update: Your stars encouraged us to create a new model that is far better. Let's make this community grow.
284+
237285
## LICENSE : [Apache License 2.0](/LICENSE.md)
260 KB
Binary file not shown.
259 KB
Binary file not shown.
259 KB
Binary file not shown.
260 KB
Binary file not shown.

docs/.nojekyll

100644 → 100755
File mode changed.

docs/audio_processing.html

100644 → 100755
File mode changed.

docs/engine.html

100644 → 100755
File mode changed.

docs/generate_reference.html

100644 → 100755
File mode changed.

docs/ibm_generate.html

100644 → 100755
File mode changed.

docs/index.html

100644 → 100755
File mode changed.

docs/package_installation_scripts.html

100644 → 100755
File mode changed.

docs/streams.html

100644 → 100755
File mode changed.

eff_word_net/__init__.py

100644 → 100755
File mode changed.