vistaus's Avatar
Posts: 423 | Thanked: 478 times | Joined on Sep 2014 @ Netherlands
#771
Awesoooooome! But buggy still. I speak as clearly as possible, but she thinks I'm asking for very different things than I really asked for. Otherwise, great already!
 

The Following User Says Thank You to vistaus For This Useful Post:
Posts: 958 | Thanked: 3,426 times | Joined on Apr 2012
#772
Originally Posted by deiv View Post
good job! but, on my phone, saera doesn't speak and "she" is misspelling almost everything
For speech support you need eSpeak installed. As for misspelling: not really sure how this would happen, could you give me an example?

Originally Posted by vistaus View Post
Awesoooooome! But buggy still. I speak as clearly as possible, but she thinks I'm asking for very different things than I really asked for. Otherwise, great already!
Bear in mind that as yet Saera supports only a limited number of actions and will try to interpret anything you say as one of those. Here are examples of things Saera should recognize correctly:
  • Set alarm for two thirty
  • Flip a coin
  • What emails do I have
  • Hello
  • Play music
  • Pause the music
  • Roll a pair of dice
  • I live in London
  • What time is it
  • What is the weather in New York

Does Saera correctly recognize things from this list?
 

The Following 2 Users Say Thank You to taixzo For This Useful Post:
Posts: 951 | Thanked: 2,344 times | Joined on Jan 2012 @ UK
#773
Where would one find eSpeak? I looked on OpenRepos but found nothing for Jolla.
 

The Following 2 Users Say Thank You to mariusmssj For This Useful Post:
Posts: 752 | Thanked: 2,808 times | Joined on Jan 2011 @ Czech Republic
#774
Thanks for the app and for the compliment - I'm glad you like the icon

It looks very promising, but I have a few comments:
1)
I think that the reason for misspelling, mentioned by vistaus, is the limited recognition dictionary.
Currently, it seems like Saera tries to fit everything you say into the few words/sentences it recognizes, which results in triggering many actions you didn't want to trigger.

One example:
I was trying to say something to Saera, but it recognized one word I said as a name of a city and immediately changed my home location to that city.

If it had a bigger dictionary, it could either recognize it correctly, or recognize it incorrectly as something else (but not necessarily the city name, as there would be more words "between" your pronounced word and the city name) and then say "I'm sorry, I don't know what you mean by..." which doesn't seem to be an option right now.

EDIT:
I hope you know what I mean - more freedom in recognition would bring the possibility to 'fail' - to say something that doesn't trigger anything, but is at least well recognized
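The "possibility to fail" can be sketched roughly like this. It is only an illustration: it compares text strings with Python's difflib, whereas a real recognizer would use a confidence score from the speech engine. The command list, the 0.75 threshold, and the function names are all assumptions, not anything taken from Saera.

```python
from difflib import SequenceMatcher

# Hypothetical command list, borrowed from the example phrases in this thread.
KNOWN_COMMANDS = [
    "set alarm for two thirty",
    "flip a coin",
    "what time is it",
    "play music",
    "i live in london",
]


def similarity(a, b):
    """Crude text similarity in [0, 1]; stands in for an engine confidence."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def interpret(heard, threshold=0.75):
    """Return the best-matching command, or None if nothing is close enough."""
    best = max(KNOWN_COMMANDS, key=lambda cmd: similarity(heard, cmd))
    if similarity(heard, best) < threshold:
        return None  # reject instead of forcing a match
    return best


def respond(heard):
    """Trigger an action only when the match is trusted."""
    command = interpret(heard)
    if command is None:
        return "I'm sorry, I don't know what you mean by '%s'." % heard
    return "OK: " + command
```

With the threshold set to 0, `interpret` always returns some command, which is roughly the behaviour described above: every utterance gets forced into one of the known actions.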


I remember that this was a problem on the N900, but with the computing power of Jolla, it shouldn't be a problem to have more robust recognition. Or am I wrong?

2)
I know that it can't get into the Harbour right now, but in case this situation changes, you should consider renaming the binary/package to 'harbour-saera' to prevent upgrade path breakages in the future.


3)
It would be cool to have some kind of modularity in the future:
Something like:
  • the GHNS API in KDE.
  • the Situations app on SailfishOS, which lets you download more features from inside the app (even paid ones, in the future).

For example, I thought of a silly plugin, 'Truth or Dare'. It is too silly to push upstream, yet it's something I'd do if I had some free time.


Anyways, good luck with the app!

Last edited by nodevel; 2015-04-18 at 21:07.
 

The Following 7 Users Say Thank You to nodevel For This Useful Post:
Posts: 958 | Thanked: 3,426 times | Joined on Apr 2012
#775
Originally Posted by mariusmssj View Post
Where would one find eSpeak? I looked on OpenRepos but found nothing for Jolla.
I installed MartinK's build from here.



Originally Posted by nodevel View Post
Thanks for the app and for the compliment - I'm glad you like the icon

It looks very promising, but I have a few comments:
1)
I think that the reason for misspelling, mentioned by vistaus, is the limited recognition dictionary.
Currently, it seems like Saera tries to fit everything you say into the few words/sentences it recognizes, which results in triggering many actions you didn't want to trigger.

One example:
I was trying to say something to Saera, but it recognized one word I said as a name of a city and immediately changed my home location to that city.

If it had a bigger dictionary, it could either recognize it correctly, or recognize it incorrectly as something else (but not necessarily the city name, as there would be more words "between" your pronounced word and the city name) and then say "I'm sorry, I don't know what you mean by..." which doesn't seem to be an option right now.

I remember that this was a problem on the N900, but with the computing power of Jolla, it shouldn't be a problem to have more robust recognition. Or am I wrong?
It's not really a question of computing power; the input is currently decoded in essentially real-time, and would have no problem with more words. The issue with recognition accuracy is the training data. This is what sets commercial speech recognition systems like Google's apart - they use the same algorithms, but trained with vastly more data. Unfortunately, even if we had access to those data sets, they would be useless on a mobile platform simply due to size - and in that regard, the Jolla has no more storage space than the N900. The acoustic model that I have included is the VoxForge model, which is nearly 7 MB. To get recognition accuracy like Google's, you need tens of gigabytes of training data, which is not feasible on a mobile platform (and why Google Now and Siri send audio off to be processed on a server).

That's for the acoustic model. The other issue is the language model, which tells the recognition engine how words are likely to fit into a sentence. I currently have to hand-assemble the language model, which is partly why it is not so big - it took me about two days to build the current model. I'm working on scripting some bits, so it can load new words and fit them into grammar types (like being able to pronounce contact names), but the fact is that Julius doesn't have a free speech (dictation) grammar for English, and even if it did it would likely have accuracy issues like Pocketsphinx did.
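To illustrate what "fitting new words into grammar types" could look like, here is a minimal sketch. It is not Julius's grammar file format and not Saera's code; the templates, class names, and helper functions are hypothetical. A real setup would also need a pronunciation (phoneme sequence) for every added word, which this sketch leaves out.

```python
import re
from itertools import product

# Hypothetical sentence templates; <NAME> and <CITY> are word classes.
TEMPLATES = ["call <NAME>", "what is the weather in <CITY>"]

# Hypothetical word classes; NAME starts empty until contacts are loaded.
WORD_CLASSES = {"NAME": [], "CITY": ["london", "new york"]}


def add_words(word_class, words):
    """Register new words (e.g. contact names) under an existing class."""
    known = WORD_CLASSES[word_class]
    for word in words:
        word = word.lower()
        if word not in known:
            known.append(word)


def expand(template):
    """List every sentence a template can produce with the current classes."""
    slots = re.findall(r"<(\w+)>", template)
    choices = [WORD_CLASSES[slot] for slot in slots]
    sentences = []
    for combo in product(*choices):
        sentence = template
        for slot, word in zip(slots, combo):
            sentence = sentence.replace("<%s>" % slot, word, 1)
        sentences.append(sentence)
    return sentences
```

Calling `add_words("NAME", ["Alice", "Bob"])` makes `expand("call <NAME>")` produce one sentence per contact, without anyone editing the templates by hand.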

Originally Posted by nodevel View Post
2)
I know that it can't get into the Harbour right now, but in case this situation changes, you should consider renaming the binary/package to 'harbour-saera' to prevent upgrade path breakages in the future.
Good point; I'll have that changed by release.

Originally Posted by nodevel View Post
3)
It would be cool to have some kind of modularity in the future:
Something like:
  • the GHNS API in KDE.
  • the Situations app on SailfishOS, which lets you download more features from inside the app (even paid ones, in the future).

For example, I thought of a silly plugin, 'Truth or Dare'. It is too silly to push upstream, yet it's something I'd do if I had some free time.


Anyways, good luck with the app!
Plugins are a planned feature, but not ready for this release.
 

The Following 10 Users Say Thank You to taixzo For This Useful Post:
Posts: 958 | Thanked: 3,426 times | Joined on Apr 2012
#776
I have created a website for the project.
 

The Following 10 Users Say Thank You to taixzo For This Useful Post:
Posts: 951 | Thanked: 2,344 times | Joined on Jan 2012 @ UK
#777
Can't seem to install MartinK's build of espeak, says "failed to install" every time. I guess I am missing other libraries that it requires.
 

The Following User Says Thank You to mariusmssj For This Useful Post:
Posts: 142 | Thanked: 120 times | Joined on Jul 2010
#778
Sorry if this was already asked or answered, but is there a "how to" somewhere? I saw all the features on the website from the previous post, but how do you make them work? What do you have to ask?
 

The Following User Says Thank You to phap For This Useful Post:
Posts: 752 | Thanked: 2,808 times | Joined on Jan 2011 @ Czech Republic
#779
Originally Posted by mariusmssj View Post
Can't seem to install MartinK's build of espeak, says "failed to install" every time. I guess I am missing other libraries that it requires.
Yes, you need to install portaudio first.

EDIT:
Originally Posted by taixzo View Post
To get recognition accuracy like Google's, you need tens of gigabytes of training data, which is not feasible on a mobile platform (and why Google Now and Siri send audio off to be processed on a server).
Thank you for the explanation! Could you maybe, in the future, offer an option to choose between Julius and the Google Speech Recognition API?
I am aware of the advantages of Julius (can work offline, not sending data to a 3rd party), but it shouldn't be too hard to implement.
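A rough sketch of how such a choice could be structured, assuming both engines sit behind one common interface. The class names, the `needs_network` flag, and the placeholder `transcribe` bodies are all hypothetical; no real Julius or Google API calls are shown here.

```python
class Recognizer:
    """Minimal interface shared by both hypothetical backends."""
    name = "base"
    needs_network = False

    def transcribe(self, audio):
        raise NotImplementedError


class OfflineRecognizer(Recognizer):
    name = "julius"

    def transcribe(self, audio):
        # Placeholder: a real backend would pass the audio to the local engine.
        return "<offline result for %d bytes>" % len(audio)


class OnlineRecognizer(Recognizer):
    name = "google"
    needs_network = True

    def transcribe(self, audio):
        # Placeholder: a real backend would upload the audio to a web service.
        return "<online result for %d bytes>" % len(audio)


BACKENDS = {cls.name: cls for cls in (OfflineRecognizer, OnlineRecognizer)}


def pick_backend(preferred, online):
    """Use the preferred backend; fall back to offline without a network."""
    backend = BACKENDS.get(preferred, OfflineRecognizer)()
    if backend.needs_network and not online:
        backend = OfflineRecognizer()
    return backend
```

Falling back to the offline engine keeps the assistant usable without a connection, and nothing is sent to a third party unless the user picked the online backend.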

Last edited by nodevel; 2015-04-19 at 07:14.
 

The Following 3 Users Say Thank You to nodevel For This Useful Post:
Posts: 951 | Thanked: 2,344 times | Joined on Jan 2012 @ UK
#780
Thanks, nodevel. I found that neil made an RPM on OpenRepos.

Also, taixzo, the commands from your list work pretty well.
 

The Following 2 Users Say Thank You to mariusmssj For This Useful Post:
Tags
saera, speech-to-text