I'm trying to reverse engineer the "OK google" functionality implemented in my phone. 



What do you suppose I do with those feature / data sets? Since "OK google" responds to my voice regardless of how fast I speak, methinks they are using a combination of regression analysis and dynamic time warping. 
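
To make the time-warping idea concrete, here is a minimal sketch of dynamic time warping on a 1-D feature track. It only illustrates why a warped comparison tolerates different speaking rates; I have no idea whether the phone does anything like this. The function name dtw_distance and the toy sequences are made up for the example, and a real front end would compare frames of spectral features rather than single numbers.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local frame-to-frame distance
            # extend the cheapest of: advance a only, advance b only, advance both
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical toy data: a stored template, the same contour spoken at
# half speed, and an unrelated contour.
template = [0, 1, 3, 1, 0]
slow = [0, 0, 1, 1, 3, 3, 1, 1, 0, 0]
other = [3, 0, 3, 0, 3]

print(dtw_distance(template, slow))
print(dtw_distance(template, other))
```

The half-speed copy aligns to the template at zero cost, while the unrelated contour does not, which is the rate independence I'm describing. It does nothing for speaker or pitch independence.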

But it's seemingly both speaker- and pitch-independent too, so there must be something else going on. There's no way they implemented a full Hidden Markov Model inside the phone's DSP (it wouldn't make sense for just one hotword).

Thoughts?



---
Aperture Systems: Redefining Radiography -  http://aperture.systems/
http://adammunich.com/ - Cell: +1-650-452-0554
