Commit 6fdd299

Simplified multi trigger minimization logic and simplified api
1 parent 62162a6 commit 6fdd299

7 files changed, +384 -415 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 eff_word_net/__pycache__
 .ipynb_checkpoints
-.venv37
+.venv*
 build
 EfficientWord_Net.egg-info
 dist

README.md

Lines changed: 26 additions & 9 deletions
@@ -84,6 +84,12 @@ One needs to collect few 4 to 10 uniquely sounding pronunciations
 of a given wakeword. Then put them into a seperate folder, which doesnt contain
 anything else.
 
+Or one could use the following command to generate audio files for a given word, uses ibm neural tts demo api, Kindly dont over use it for our sake (lol)
+
+```bash
+python -m eff_word_net.ibm_generate
+```
+
 Finally run this command, it will ask for the input folder's location
 (containing the audio files) and the output folder (where _ref.json file will be stored).
 ```
@@ -96,9 +102,17 @@ The pathname of the generated wakeword needs to passed to the HotwordDetector de
 HotwordDetector(
     hotword="hello",
     reference_file = "/full/path/name/of/hello_ref.json"),
-    activation_count = 3 #2 by default
+    threshold=0.9, #min confidence required to consider a trigger
+    relaxation_time = 0.8 #default value ,in seconds
 )
 ```
+relaxation time parameter is used to determine the min time between any 2 triggers, any potential triggers before the relaxation_time will be cancelled
+
+The detector operates on a sliding widow approach resulting in multiple triggers for single utterance of a hotword, the relaxation_time parameter can used to control the multiple triggers, in most cases 0.8sec(default) will do
+
+<br>
+
+## Out of the box sample hotwords
 Few wakewords such as **Mycroft**, **Google**, **Firefox**, **Alexa**, **Mobile**, **Siri** the library has predefined embeddings readily available in the library installation directory, its path is readily available in the following variable
 
 ```python
@@ -118,7 +132,6 @@ from eff_word_net import samples_loc
 mycroft_hw = HotwordDetector(
     hotword="Mycroft",
     reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
-    activation_count=3
 )
 
 mic_stream = SimpleMicStream()
@@ -127,9 +140,12 @@ mic_stream.start_stream()
 print("Say Mycroft ")
 while True :
     frame = mic_stream.getFrame()
-    result = mycroft_hw.checkFrame(frame)
-    if(result):
-        print("Wakeword uttered")
+    result = mycroft_hw.scoreFrame(frame)
+    if result==None :
+        #no voice activity
+        continue
+    if(result["match"]):
+        print("Wakeword uttered",result["confidence"])
 
 ```
 <br>
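
Note on the README changes above: the new `threshold` and `relaxation_time` parameters replace `activation_count`, and the README text says that potential triggers arriving before `relaxation_time` has elapsed are cancelled. The commit's actual implementation is not shown in these hunks, so the following is only a minimal sketch of that idea under stated assumptions. `RelaxedTrigger`, its `update` method, and the simulated window scores are hypothetical names and data made up for illustration, not part of `eff_word_net`. The sketch also assumes a suppressed trigger does not restart the relaxation timer, which the README does not specify.

```python
from typing import Optional


class RelaxedTrigger:
    """Hypothetical helper (not the library's code): suppress repeat triggers
    that fall inside a relaxation window after an accepted trigger."""

    def __init__(self, threshold: float = 0.9, relaxation_time: float = 0.8):
        self.threshold = threshold              # min confidence to count as a trigger
        self.relaxation_time = relaxation_time  # min seconds between two triggers
        self._last_trigger: Optional[float] = None

    def update(self, confidence: float, now: float) -> bool:
        """Return True only when this window should be reported as a trigger."""
        if confidence < self.threshold:
            return False
        too_soon = (
            self._last_trigger is not None
            and now - self._last_trigger < self.relaxation_time
        )
        if too_soon:
            # Assumption: a cancelled trigger does not reset the timer.
            return False
        self._last_trigger = now
        return True


# Simulated sliding-window scores: (timestamp in seconds, confidence).
# The first three windows stand in for one utterance seen by overlapping windows.
windows = [(0.00, 0.95), (0.25, 0.97), (0.50, 0.93), (1.50, 0.96), (1.60, 0.50)]

gate = RelaxedTrigger()
fired = [t for t, conf in windows if gate.update(conf, t)]
print(fired)  # [0.0, 1.5]
```

With the default 0.8 s window, the three overlapping high-confidence windows collapse into a single trigger at 0.0 s, and the later utterance at 1.5 s fires again.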
@@ -138,8 +154,7 @@ while True :
 ## Detecting Mulitple Hotwords from audio streams
 
 The library provides a computation friendly way
-to detect multiple hotwords from a given stream, installed
-of running `checkFrame()` of each wakeword individually
+to detect multiple hotwords from a given stream, instead of running `scoreFrame()` of each wakeword individually
 
 ```python
 import os
@@ -188,8 +203,10 @@ while True :
 Access documentation of the library from here : https://ant-brain.github.io/EfficientWord-Net/
 
 
-## About `activation_count` in `HotwordDetector`
-Documenatation with detailed explanation on the usage of `activation_count` parameter in `HotwordDetector` is in the making , For now understand that for long hotwords 3 is advisable and 2 for smaller hotwords. If the detector gives out multiple triggers for a single utterance, try increasing `activation_count`. To experiment begin with smaller values. Default value for the same is 2
+## Change notes from v0.1.1 to 0.2.2
+major changes to replace complex friking logic of handling poly triggers per utterance into more simpler logic and more simpler api for programmers
+
+Introduces breaking changes
 
 
 ## FAQ :
