I imagine a system that can receive sounds from multiple sources. Each source is detected and treated as an independent source, so the information carried by each one can be obtained. Not only the information: the sources themselves can be identified by their unique characteristics. Even more, the location of each source is detected as well.
I believe all of this can be accomplished with the power of DSP. I'll strive to do my best to combine techniques such as Blind Source Separation, Speech Recognition and Sound Localization. But one piece still seems to be missing: tone color (timbre) detection. So it's not only about translating voices into words, but also about detecting who is speaking. Even more, it's about detecting whether the voice comes from a human being or not.
Maybe I can use Principal Component Analysis for this... or, if that is not enough, maybe Independent Component Analysis. But what should be independent in the tone color? I have no idea yet...
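A minimal sketch of how the PCA idea could be tried out, under several assumptions that are not part of the plan above: the sounds are synthetic tones rather than real voices, the pitch `f0` of each tone is already known, and tone color is reduced to the relative strength of the first four harmonics. The helper names (`harmonic_tone`, `harmonic_profile`, `pca`) are hypothetical, and only NumPy is used.

```python
import numpy as np

SR = 8000  # sample rate in Hz (assumed for this toy example)

def harmonic_tone(f0, harmonic_amps, duration=0.5):
    """Synthesize a tone whose timbre is set by its harmonic amplitudes."""
    t = np.arange(int(SR * duration)) / SR
    return sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(harmonic_amps))

def harmonic_profile(signal, f0, n_harmonics=4):
    """Crude timbre feature: normalized spectral magnitude at each harmonic."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / SR)
    mags = np.array([spectrum[np.argmin(np.abs(freqs - f0 * (k + 1)))]
                     for k in range(n_harmonics)])
    return mags / mags.sum()  # normalizing removes overall loudness

def pca(features, n_components=2):
    """PCA via SVD of the mean-centered feature matrix."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components

# Two made-up "instruments" with different harmonic recipes,
# each played at four different pitches with slight random variation.
rng = np.random.default_rng(0)
recipes = [np.array([1.0, 0.5, 0.25, 0.125]),   # source A
           np.array([1.0, 0.1, 0.8, 0.1])]      # source B
pitches = [200.0, 220.0, 250.0, 300.0]

features, labels = [], []
for label, recipe in enumerate(recipes):
    for f0 in pitches:
        amps = recipe * (1 + 0.05 * rng.standard_normal(recipe.size))
        features.append(harmonic_profile(harmonic_tone(f0, amps), f0))
        labels.append(label)

features = np.array(features)
scores, components = pca(features)

for label, score in zip(labels, scores[:, 0]):
    print(f"source {'AB'[label]}: first principal component = {score:+.3f}")
```

In this toy setup the first principal component takes opposite signs for the two sources regardless of pitch, which hints that harmonic balance carries identity information. Real voices are far messier, and this says nothing yet about what should count as "independent" for an ICA approach.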
But that does not mean I will stop dreaming, or stop working to fulfill that dream.
