Commit 62162a6

Major changes in HotwordDetector in engine.py and added Mycroft wakeword
1 parent f7a6867 commit 62162a6

14 files changed

+113 -37 lines changed

README.md

Lines changed: 32 additions & 17 deletions
@@ -95,19 +95,18 @@ The pathname of the generated wakeword needs to passed to the HotwordDetector de
 ```python
 HotwordDetector(
     hotword="hello",
-    reference_file = "/full/path/name/of/hello_ref.json")
+    reference_file = "/full/path/name/of/hello_ref.json"),
+    activation_count = 3 #2 by default
 )
 ```
-
-Few wakewords such as **Google**, **Firefox**, **Alexa**, **Mobile**, **Siri** the library has predefined embeddings readily available in the library installation directory, its path is readily available in the following variable
+Few wakewords such as **Mycroft**, **Google**, **Firefox**, **Alexa**, **Mobile**, **Siri** the library has predefined embeddings readily available in the library installation directory, its path is readily available in the following variable

 ```python
 from eff_word_net import samples_loc
 ```

 <br>

-
 ## Try your first single hotword detection script

 ```python
@@ -116,18 +115,19 @@ from eff_word_net.streams import SimpleMicStream
 from eff_word_net.engine import HotwordDetector
 from eff_word_net import samples_loc

-alexa_hw = HotwordDetector(
-    hotword="Alexa",
-    reference_file = os.path.join(samples_loc,"alexa_ref.json"),
+mycroft_hw = HotwordDetector(
+    hotword="Mycroft",
+    reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
+    activation_count=3
 )

 mic_stream = SimpleMicStream()
 mic_stream.start_stream()

-print("Say Alexa ")
+print("Say Mycroft ")
 while True :
     frame = mic_stream.getFrame()
-    result = alexa_hw.checkFrame(frame)
+    result = mycroft_hw.checkFrame(frame)
     if(result):
         print("Wakeword uttered")

@@ -145,6 +145,7 @@ of running `checkFrame()` of each wakeword individually
 import os
 from eff_word_net.streams import SimpleMicStream
 from eff_word_net import samples_loc
+print(samples_loc)

 alexa_hw = HotwordDetector(
     hotword="Alexa",
@@ -153,31 +154,44 @@ alexa_hw = HotwordDetector(

 siri_hw = HotwordDetector(
     hotword="Siri",
-    reference_file = os.path.join(samples_loc,"siri_ref.json")
-)
+    reference_file = os.path.join(samples_loc,"siri_ref.json"),
+)

-google_hw = HotwordDetector(
-    hotword="Google",
-    reference_file = os.path.join(samples_loc,"google_ref.json")
+mycroft_hw = HotwordDetector(
+    hotword="mycroft",
+    reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
+    activation_count=3
 )

 multi_hw_engine = MultiHotwordDetector(
-    detector_collection = [alexa_hw,siri_hw,google_hw]
-) # Efficient multi hotword detector
+    detector_collection = [
+        alexa_hw,
+        siri_hw,
+        mycroft_hw,
+    ],
+)

 mic_stream = SimpleMicStream()
 mic_stream.start_stream()

-print("Say Google / Alexa / Siri")
+print("Say Mycroft / Alexa / Siri")
+
 while True :
     frame = mic_stream.getFrame()
     result = multi_hw_engine.findBestMatch(frame)
     if(None not in result):
         print(result[0],f",Confidence {result[1]:0.4f}")
+
 ```
 <br>

 Access documentation of the library from here : https://ant-brain.github.io/EfficientWord-Net/
+
+
+## About `activation_count` in `HotwordDetector`
+Documenatation with detailed explanation on the usage of `activation_count` parameter in `HotwordDetector` is in the making , For now understand that for long hotwords 3 is advisable and 2 for smaller hotwords. If the detector gives out multiple triggers for a single utterance, try increasing `activation_count`. To experiment begin with smaller values. Default value for the same is 2
+
+
 ## FAQ :
 * **Hotword Perfomance is bad** : if you are having some issue like this , feel to ask the same in [discussions](https://github.com/Ant-Brain/EfficientWord-Net/discussions/4)

@@ -189,6 +203,7 @@ Access documentation of the library from here : https://ant-brain.github.io/Effi

 * Add audio file handler in streams. PR's are welcome.
 * Remove librosa requirement to encourage generating reference files directly in edge devices
+* Add more detailed documentation explaining slider window concept

 ## SUPPORT US:
 Our hotword detector's performance is notably low when compared to Porcupine. We have thought about better NN architectures for the engine and hope to outperform Porcupine. This has been our undergrad project. Hence your support and encouragement will motivate us to develop the engine. If you loved this project recommend this to your peers, give us a 🌟 in Github and a clap 👏 in [medium](https://link.medium.com/yMBmWGM03kb).
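
Note on the `activation_count` paragraph added to the README above: its advice can be illustrated by replaying a list of per-frame scores through a counter. This is a minimal sketch modelled on the counter logic this commit adds to `getMatchScoreVector` in `engine.py`, not the library's code. The function name `count_triggers` and the score values are hypothetical; the defaults (threshold 0.9, 4 relaxation steps) are copied from the diff.

```python
from typing import List, Sequence


def count_triggers(
    scores: Sequence[float],
    activation_count: int = 2,
    threshold: float = 0.9,
    relaxation_steps: int = 4,
) -> List[int]:
    """Return the frame indices at which a trigger would fire.

    Hypothetical helper: it mirrors the repeat-counter branches shown in the
    engine.py diff, but it is not part of eff_word_net.
    """
    repeat_count = 0
    trigger_frames: List[int] = []
    for frame_index, score in enumerate(scores):
        if repeat_count < 0:
            # Cooling down after a trigger: the score is ignored.
            repeat_count += 1
        elif score > threshold:
            if repeat_count == activation_count - 1:
                # Enough above-threshold frames seen: fire and start cooldown.
                repeat_count = -relaxation_steps
                trigger_frames.append(frame_index)
            else:
                repeat_count += 1
        elif repeat_count > 0:
            # A below-threshold frame decays the counter by one.
            repeat_count -= 1
    return trigger_frames


# Assumed scores for one long utterance: eight frames above the threshold.
long_utterance = [0.95] * 8
print(count_triggers(long_utterance, activation_count=2))  # [1, 7]
print(count_triggers(long_utterance, activation_count=3))  # [2]
```

Under these assumptions the same utterance fires twice with `activation_count=2` and once with `activation_count=3`, which is consistent with the README's advice to raise the value when a single utterance produces multiple triggers.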

dist/EfficientWord-Net-0.0.1.tar.gz

-1.64 MB
Binary file not shown.
-1.64 MB
Binary file not shown.

eff_word_net/engine.py

Lines changed: 79 additions & 18 deletions
@@ -14,7 +14,14 @@ class HotwordDetector :
     EfficientWord based HotwordDetector Engine implementation class
     """

-    def __init__(self,hotword:str,reference_file:str,threshold:float=0.85):
+    def __init__(
+            self,
+            hotword:str,
+            reference_file:str,
+            threshold:float=0.9,
+            activation_count=2,
+            continuous=True,
+            verbose = False):
         """
         Intializes hotword detector instance

@@ -28,6 +35,8 @@ def __init__(self,hotword:str,reference_file:str,threshold:float=0.85):
             threshold: float value between 0 and 1 , min similarity score
             required for a match

+            continuous: bool value to know if a HotwordDetector is operating on a single continuous stream , else false
+
         """
         assert isfile(reference_file), \
             "Reference File Path Invalid"
@@ -43,10 +52,21 @@ def __init__(self,hotword:str,reference_file:str,threshold:float=0.85):

         self.hotword = hotword
         self.threshold = threshold
+        self.continuous = continuous
+
+        self.__repeat_count = 0
+        self.__activation_count = activation_count
+        self.verbose = verbose
+
+        self.__relaxation_time_step = 4 #number of cycles to prevent recall after a trigger
+        self.__is_it_a_trigger = False

     def __repr__(self):
         return f"Hotword: {self.hotword}"

+    def is_it_a_trigger(self):
+        return self.__is_it_a_trigger
+
     def getMatchScoreVector(self,inp_vec:np.array) -> float :
         """
         **Use this directly only if u know what you are doing**
@@ -71,8 +91,24 @@ def getMatchScoreVector(self,inp_vec:np.array) -> float :
         for i in top3 :
             out+= (1-out) * i

-        return out
+        #assert self.redundancy_count>0 , "redundancy_count count can only be greater than 0"
+
+        self.__is_it_a_trigger = False
+
+        if self.__repeat_count < 0 :
+            self.__repeat_count += 1

+        elif out > self.threshold :
+            if self.__repeat_count == self.__activation_count -1 :
+                self.__repeat_count = - self.__relaxation_time_step
+                self.__is_it_a_trigger = True
+            else:
+                self.__repeat_count +=1
+
+        elif self.__repeat_count > 0:
+            self.__repeat_count -= 1
+
+        return out

     def checkVector(self,inp_vec:np.array) -> bool:
         """
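
Note on the context lines at the top of this hunk: the loop `out += (1-out) * i` folds the three best similarity scores into a single value. The sketch below shows that this update is equivalent to a "noisy-OR" combination, `1 - prod(1 - s)`, provided the accumulator starts at 0 and every score lies in [0, 1]. Both conditions are assumptions, since the diff does not show how `out` and `top3` are initialised, and the names `combine_scores` and `noisy_or` are hypothetical.

```python
import math
from typing import Sequence


def combine_scores(top_scores: Sequence[float]) -> float:
    """Fold scores the same way the loop in the diff does.

    Assumes the accumulator starts at 0.0 (not visible in the diff).
    """
    out = 0.0
    for score in top_scores:
        # Each score closes a fraction of the remaining gap to 1.0.
        out += (1.0 - out) * score
    return out


def noisy_or(top_scores: Sequence[float]) -> float:
    """Closed form of the same combination: 1 minus the product of misses."""
    return 1.0 - math.prod(1.0 - score for score in top_scores)


# Assumed example scores; both forms agree up to floating-point error.
example = [0.8, 0.7, 0.6]
print(math.isclose(combine_scores(example), noisy_or(example)))  # True
```

One consequence of this form is that the combined score is never lower than the best individual score, so several moderately good matches can push the result above a threshold that none of them reaches alone.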
@@ -85,7 +121,12 @@ def checkVector(self,inp_vec:np.array) -> bool:
         assert inp_vec.shape == (1,128), \
             "Inp vector should be of shape (1,128)"

-        return self.getMatchScoreVector(inp_vec) > self.threshold
+        score = self.getMatchScoreVector(inp_vec)
+
+        return self.is_it_a_trigger() if self.continuous else score >= self.threshold
+
+    def get_repeat_count(self)-> int :
+        return self.__repeat_count

     def getMatchScoreFrame(
             self,
@@ -110,6 +151,7 @@ def getMatchScoreFrame(

         """

+        """
         if(not unsafe):
             upperPoint = max(
                 (
@@ -118,6 +160,7 @@ def getMatchScoreFrame(
             )
             if(upperPoint > 0.2):
                 return False
+        """

         assert inp_audio_frame.shape == (RATE,), \
             f"Audio frame needs to be a 1 sec {RATE}Hz sampled vector"
@@ -126,7 +169,7 @@ def getMatchScoreFrame(
             audioToVector(
                 inp_audio_frame
             )
-        )
+        )


     def checkFrame(self,inp_audio_frame:np.array,unsafe:bool = False) -> bool :
@@ -152,6 +195,7 @@ def checkFrame(self,inp_audio_frame:np.array,unsafe:bool = False) -> bool :
         assert inp_audio_frame.shape == (RATE,), \
             f"Audio frame needs to be a 1 sec {RATE}Hz sampled vector"

+        """
         if(not unsafe):
             upperPoint = max(
                 (
@@ -160,8 +204,10 @@ def checkFrame(self,inp_audio_frame:np.array,unsafe:bool = False) -> bool :
             )
             if(upperPoint > 0.2):
                 return False
+        """
+        score = self.getMatchScoreFrame(inp_audio_frame)

-        return self.getMatchScoreFrame(inp_audio_frame) > self.threshold
+        return self.is_it_a_trigger() if self.continuous else score >= self.threshold

 HotwordDetectorArray = List[HotwordDetector]
 MatchInfo = Tuple[HotwordDetector,float]
@@ -176,6 +222,7 @@ class MultiHotwordDetector :
     def __init__(
         self,
         detector_collection:HotwordDetectorArray,
+        continuous=True
     ):
         """
         Inp Parameters:
@@ -190,6 +237,7 @@ def __init__(
             "Mixed Array received, send HotwordDetector only array"

         self.detector_collection = detector_collection
+        self.continous = continuous

     def findBestMatch(
             self,
@@ -218,6 +266,7 @@ def findBestMatch(
         assert inp_audio_frame.shape == (RATE,), \
             f"Audio frame needs to be a 1 sec {RATE}Hz sampled vector"

+        """
         if(not unsafe):
             upperPoint = max(
                 (
@@ -226,16 +275,21 @@ def findBestMatch(
             )
             if(upperPoint > 0.2):
                 return None , None
-
+        """
         embedding = audioToVector(inp_audio_frame)

         best_match_detector:str = None
         best_match_score:float = 0.0

         for detector in self.detector_collection :
             score = detector.getMatchScoreVector(embedding)
-            if(score<detector.threshold):
-                continue
+            if(self.continous):
+                if(not detector.is_it_a_trigger()):
+                    continue
+            else:
+                if(score < detector.threshold):
+                    continue
+
             if(score>best_match_score):
                 best_match_score = score
                 best_match_detector = detector
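
Note on the selection loop changed in this hunk: in continuous mode a detector is only eligible when its trigger flag is set, while in non-continuous mode it is eligible when its score reaches its threshold. The sketch below reproduces that filter with stand-in objects so the two modes can be compared without a microphone or model. `StubDetector` and `find_best_match` are hypothetical names, not library classes, and returning `(None, None)` when nothing matches is an assumption based on the README's `None not in result` check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class StubDetector:
    """Hypothetical stand-in for HotwordDetector with precomputed state."""
    hotword: str
    threshold: float
    score: float      # score this detector reports for the current frame
    triggered: bool   # what is_it_a_trigger() would report for this frame


def find_best_match(
    detectors: Sequence[StubDetector],
    continuous: bool = True,
) -> Tuple[Optional[StubDetector], Optional[float]]:
    """Pick the highest-scoring eligible detector, mirroring the diff's filter."""
    best: Optional[StubDetector] = None
    best_score = 0.0
    for detector in detectors:
        if continuous:
            # Continuous mode: only detectors whose counter fired are eligible.
            if not detector.triggered:
                continue
        elif detector.score < detector.threshold:
            # One-shot mode: eligibility is a plain threshold test.
            continue
        if detector.score > best_score:
            best, best_score = detector, detector.score
    if best is None:
        return None, None  # assumed "no match" result
    return best, best_score


alexa = StubDetector("Alexa", threshold=0.9, score=0.95, triggered=False)
mycroft = StubDetector("Mycroft", threshold=0.9, score=0.92, triggered=True)

print(find_best_match([alexa, mycroft], continuous=True)[0].hotword)   # Mycroft
print(find_best_match([alexa, mycroft], continuous=False)[0].hotword)  # Alexa
```

The example shows why the two modes can disagree: a detector with the higher score is skipped in continuous mode until its own activation counter has fired.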
@@ -282,7 +336,7 @@ def findAllMatches(
         embedding = audioToVector(inp_audio_frame)

         matches:MatchInfoArray = []
-
+
         best_match_score = 0.0
         for detector in self.detector_collection :
             score = detector.getMatchScoreVector(embedding)
@@ -301,29 +355,36 @@ def findAllMatches(
     from eff_word_net.streams import SimpleMicStream
     from eff_word_net import samples_loc
     print(samples_loc)
+
     alexa_hw = HotwordDetector(
         hotword="Alexa",
         reference_file = os.path.join(samples_loc,"alexa_ref.json"),
     )

     siri_hw = HotwordDetector(
         hotword="Siri",
-        reference_file = os.path.join(samples_loc,"siri_ref.json")
-    )
+        reference_file = os.path.join(samples_loc,"siri_ref.json"),
+    )

-    google_hw = HotwordDetector(
-        hotword="Google",
-        reference_file = os.path.join(samples_loc,"google_ref.json")
-    )
+    mycroft_hw = HotwordDetector(
+        hotword="mycroft",
+        reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
+        activation_count=3
+    )

     multi_hw_engine = MultiHotwordDetector(
-        detector_collection = [alexa_hw,siri_hw,google_hw]
-    )
+        detector_collection = [
+            alexa_hw,
+            siri_hw,
+            mycroft_hw,
+        ],
+    )

     mic_stream = SimpleMicStream()
     mic_stream.start_stream()

-    print("Say Google / Alexa / Siri")
+    print("Say Mycroft / Alexa / Siri")
+
     while True :
         frame = mic_stream.getFrame()
         result = multi_hw_engine.findBestMatch(frame)

eff_word_net/sample_refs/mycroft_ref.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

setup.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@

 setup(
     name = 'EfficientWord-Net',
-    version = '0.0.1',
+    version = '0.1.1',
     description = 'Few Shot Learning based Hotword Detection Engine',
     long_description = open("./README.md",'r').read(),
     long_description_content_type = 'text/markdown',

wakewords/mobile_ref.json

Lines changed: 0 additions & 1 deletion
This file was deleted.
Binary file not shown.
Binary file not shown.
8.46 KB
Binary file not shown.
Binary file not shown.
7.19 KB
Binary file not shown.
Binary file not shown.
Binary file not shown.
