Simon Changelog

What's new in Simon 0.4.0

Oct 25, 2013
  • This new version of the open source speech recognition system Simon features a whole new recognition layer, context-awareness for improved accuracy and performance, a dialog system able to hold whole conversations with the user and more.
  • Revisiting Usability:
  • A lot of work has gone into making Simon easier to use - both for existing and new users.
  • Perhaps most visibly, the main window of Simon has been reorganized to bring the most important options together in one screen.
  • Moreover, the newly introduced Simon base model format (.sbm) and the integration of a GHNS (Get Hot New Stuff) online repository of base models have removed the last big hurdle of the initial configuration.
  • One can now easily go from a fresh installation to a working setup in less than 5 minutes without any preparation. Don't believe me? Check out the quick start below!
  • Many other, smaller changes add up to one simple but important difference: overall, Simon will require less user interaction while achieving more.
  • SPHINX:
  • One of the major internal changes of Simon 0.4 is of course the included support for the BSD-licensed CMU SPHINX. While we still maintain full support for HTK and Julius, new models compiled with Simon will default to the SPHINX backend, and the (proprietary) HTK is no longer required to build user-generated models.
  • Best of all: Simon will select the correct backend for your configuration transparently and automatically.
  • Voxforge:
  • A major problem in open source speech recognition has always been the lack of freely available, high-quality speech models.
  • The Voxforge project has been working for years towards GPL-licensed acoustic models for a variety of languages. While their models are certainly not yet perfect, they offer a promising starting point.
  • The English Voxforge model is of course available as a Simon base model and can be downloaded and imported with Simon.
  • Additionally, starting with Simon 0.4, users will also have the option to contribute their gathered Simon training samples directly to the Voxforge server.
  • These recordings will then be used to train and improve the general acoustic models.
  • By the way: Behind the scenes this upload is based on SSC.
  • Context:
  • There is a simple rule of thumb in speech recognition: The smaller the application domain, the better the recognition accuracy. This was always one of the core principles of Simon.
  • In Simon 0.4, however, we went one step further: Simon can now reconfigure itself on the fly as the current situation changes. Through so-called "context conditions", Simon 0.4 can automatically activate and deactivate selected scenarios, microphones and even parts of your training corpus.
  • For example: Why listen for "Close tab" when your browser isn't even open? Or why listen for anything at all when you're actually in the next room listening to music? Yes, Simon is watching you.
  • Dialog System:
  • Simon 0.4.0 also ships with the new dialog system featuring scripted variables (JavaScript), integration with Plasma data engines, a templating system and - of course - text-to-speech output.
  • Simonoid:
  • For users of KDE's Plasma workspace, we now provide the "Simonoid" plasmoid to start and monitor Simon - including the current recording volume.
  • The screenshot above shows two instances of the plasmoid: One added to the panel and another one to the desktop.
  • And everything else:
  • Please don't be fooled into thinking that the above is a complete list of all improvements. For example, we also have a new sample review tool called Afaras, integration with the Sequitur grapheme-to-phoneme framework, an Akonadi command plugin and many, many other noteworthy changes.
  • You'll have to try out Simon to see for yourself!