
Simon - speech activated user interface for KDE (KDE.News)


Posted Aug 24, 2009 18:47 UTC (Mon) by bedahr (guest, #60420)
In reply to: Simon - speech activated user interface for KDE (KDE.News) by jspaleta
Parent article: Simon - speech activated user interface for KDE (KDE.News)

Speech models are not code. Think of them as documents (in this metaphor simon is a document editor).

Of course there are existing speech models.

You could even use speech models created by SPHINX-Train by converting them to the HTK format (such a converter is available on SourceForge).

BUT: Speech models created by the HTK can be used _freely_ anyway. You can create models using the HTK and then use them for basically whatever you want. This is also why the VoxForge initiative can build their speech model using the HTK and still license the model itself under the GPL.

The HTK plain text HMM format is well documented.

You can check out an example here: http://www.repository.voxforge1.org/downloads/Nightly_Bui...
(The file hmmdefs is the HMM model created by the HTK).
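
To illustrate the "model as a document" point, here is a minimal sketch that reads an HTK-style plain text model with nothing but string handling. The sample text is a hypothetical, heavily shortened excerpt written for this example (not taken from the VoxForge file), and the helper name `list_models` is made up as well. It assumes only the basic layout of the text format: each model starts with a `~h "name"` line and declares its state count with `<NUMSTATES>`.

```python
import re

# Hypothetical, heavily shortened excerpt in the style of an HTK text-format
# hmmdefs file. Real files start with a global options block and contain many
# more models, states, and much longer vectors.
SAMPLE_HMMDEFS = '''~h "sil"
<BEGINHMM>
<NUMSTATES> 3
<STATE> 2
<MEAN> 2
 0.1 -0.2
<VARIANCE> 2
 1.0 1.0
<TRANSP> 3
 0.0 1.0 0.0
 0.0 0.6 0.4
 0.0 0.0 0.0
<ENDHMM>
~h "aa"
<BEGINHMM>
<NUMSTATES> 3
<STATE> 2
<MEAN> 2
 0.3 0.5
<VARIANCE> 2
 1.0 1.0
<TRANSP> 3
 0.0 1.0 0.0
 0.0 0.7 0.3
 0.0 0.0 0.0
<ENDHMM>
'''


def list_models(text):
    """Return {model name: number of states} for each ~h entry in the text."""
    models = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # A line such as: ~h "sil"  starts a new model definition.
        name_match = re.match(r'~h\s+"([^"]+)"', line)
        if name_match:
            current = name_match.group(1)
            models[current] = None
            continue
        # A line such as: <NUMSTATES> 3  gives the state count of that model.
        states_match = re.match(r'<NUMSTATES>\s+(\d+)', line, re.IGNORECASE)
        if states_match and current is not None:
            models[current] = int(states_match.group(1))
    return models


print(list_models(SAMPLE_HMMDEFS))  # {'sil': 3, 'aa': 3}
```

Because the model is plain text, the same approach works for inspecting or hand-editing individual values, which is what makes it behave like a document rather than compiled code.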

I don't know what you mean by "bug in a speech model" but I am going to assume that you mean e.g. wrongly transcribed training samples. Well, fixing that would depend on how you built the model in the first place. In all likelihood you would end up changing the input files and regenerating the whole model from the corrected input (using the HTK, SPHINX or whatever was used in the first place).

For the record: There is an open source initiative called ghmm which tries to create a GPL licensed library for working with HMMs. I contacted them, and they said they were not ready for this kind of usage and generally want to be more general-purpose than the HTK, so I am not sure if they will be ready soon, or ever.

Also, the HTK is very high quality software and a good recognition rate is obviously the main goal for any speech recognition software - GPL or not.



Simon - speech activated user interface for KDE (KDE.News)

Posted Aug 24, 2009 18:56 UTC (Mon) by jspaleta (subscriber, #50639)

great!
...document format..not compiled code.
...open tool to convert other formats into that format.
...other formats creatable by open codebase.

This should be a non-issue if this comes up for discussion in a package review.

-jef


Simon - speech activated user interface for KDE (KDE.News)

Posted Aug 24, 2009 19:01 UTC (Mon) by bedahr (guest, #60420)

Thanks for actually _discussing_ this!

I can't remember how often I have had the exact same issue raised, but it always ended in someone crying out: "Uses non-GPL code! Kill it with fire!" (or similar) and not responding to any of my replies or explanations at all.

So again, thanks for understanding the complicated situation!

Simon - speech activated user interface for KDE (KDE.News)

Posted Aug 24, 2009 19:46 UTC (Mon) by jspaleta (subscriber, #50639)

Make sure you are able to make the "speech model as a document format" argument clear when someone steps up to submit the package. You might want to drop a blurb into a high level readme in the simon codebase which addresses this (if it's not there already). When/if this comes up for submission as a Fedora package, there's no guarantee the reviewers will have read the discussion here..but they will review the material in the simon codebase in discussion with the packager. Dropping a note into a readme will help make reviewers aware that speech models are editable text file content, and it should note at a minimum the existence of sphinx-train and the speech model format converter tool.

-jef

Simon - speech activated user interface for KDE (KDE.News)

Posted Aug 24, 2009 20:31 UTC (Mon) by bedahr (guest, #60420)

Yes I will add this information tomorrow.

Maybe I'll even add it to the FAQ of the project wiki...

But btw.: Has anyone even talked to the Fedora team? Or is this a hypothetical discussion? If so, it is oddly Fedora specific IMHO.

Greetings,
Peter

Simon - speech activated user interface for KDE (KDE.News)

Posted Aug 24, 2009 20:47 UTC (Mon) by jspaleta (subscriber, #50639)

This is somewhat hypothetical.... someone has to do the packaging work and submit it for review... and I'm not aware of anyone working on packaging Simon yet for Fedora. Hell this is the first I heard of it. I'm holding out for direct neural interfaces instead of speech...moving my mouth takes soooo much effort.

I'll bet you dollars to doughnuts members of Fedora's technical leadership will read the discussion here and will be aware of the content argument. But ultimately it comes down to someone taking the responsibility to maintain the Simon package and start the package submission review process. A summary of the situation in the FAQ or a readme will help prevent an unnecessary delay once someone does step forward.

I would think a Debian packaging effort would also benefit from a summary of this discussion...if they aren't already working on packages. I think they'll have similar concerns, but I'm less informed about the details of Debian policy with regard to "content" versus "code" than I am about Fedora's policy.

-jef
