Text to Speech Library (Windows only)

Text to Speech Library (Windows only) - SMcNeill - 04-27-2022

I turned the PowerShell stuff into a simple little library for people to use in their projects, and here it is:

Code: (Select All)
_Title "Steve's Powershell Speech Library"

Speech_IoR 'initialize or reset speech options

Speech_SaP "Hello World, This is a normal speed demo of David's voice" 'speak and print

_Delay 2

Speech_Speaker "Ziva"

Speech_Say "Hello again.  This is a normal speed demo of Ziva's voice." 'just speak this one

_Delay 2

Speech_Speaker "David"

Speech_Speed -10

Speech_SaP "And now I'm speaking as David, but I'm speaking veeery slow."

_Delay 2

Speech_Speaker "Ziva"

Speech_Speed 5

Speech_SaP "And now I'm a very hyper Ziva!"

_Delay 2

Speech_Speed 0

Speech_Volume 30

Speech_SaP "And now I'm whispering to you that I'm done with my demo!"

'$INCLUDE:'TextToSpeech.BM'

As you can see, all the commands are preceded by "Speech_", to help keep the sub names unique and associated with one another, and to keep them from clashing with any user variable names and such.

Routines in this little package are:

Speech_IoR -- Init or Reset. Call this first to initialize the settings (and turn volume up to 100, or else you'll be speaking on a MUTE channel)

Speech_Speaker -- Change the default speaker. Currently I only support "David" and "Ziva", but feel free to change or add to this if your system has other voices installed via language/voice packs.

Speech_Speed -- Set a value from -10 to 10 to adjust the speed of the speaker. 0 is the default, -10 is sloooow, and 10 is faaaast.

Speech_Volume -- Set a value from 0 to 100 to adjust how loud you're going to be speaking with the voices.

Speech_OutTo -- Use this to change where you want the speech to go. The only options for now are your speakers or a file. It's not in the demo, as I didn't want to randomly save junk to folks' drives, but an example looks like:

Speech_OutTo "MyTextToFile.wav"
Speech_OutTo "Speaker"
Speech_OutTo "" 'defaults/resets to speaker

Speech_Say -- Just says the text you specify with the settings you gave it previously.

Speech_SaP -- Says and Prints the text you specify to the screen as a quick print and speak shortcut. Uses previous settings.

Speech_ToWav -- Converts text to a wav file and saves it to disk wherever you specify. Since it's not in the short demo above, usage looks like:

Speech_ToWav "Hello World. This is the text I'm saving to a file!", "MyFile.wav"

speak -- This is the master command with all the options built into it. You can skip everything else if you want to use this as a standalone command to do everything at once. Every other routine ends up calling this one, so you can bypass some processing by calling it directly.
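
To see the file-output routines in context, here's a rough sketch that uses only the calls listed above. The file names are made up for illustration, and it assumes Speech_Say honors whatever target Speech_OutTo set last. That matches the description above, but the demo doesn't show it, so treat it as untested:

```basic
' Hypothetical example: file names are placeholders, not part of the library.
Speech_IoR ' initialize or reset speech options (volume to 100)

Speech_OutTo "Greeting.wav" ' ASSUMPTION: later Speech_Say calls go to this file
Speech_Say "This sentence should land in the wav file instead of the speakers."

Speech_OutTo "" ' defaults/resets to speaker, as described above
Speech_Say "And this one should come out of the speakers again."

' One-shot alternative: text and target file in a single call
Speech_ToWav "This text goes straight to a file.", "OneShot.wav"

'$INCLUDE:'TextToSpeech.BM'
```

If the wav file never appears, the one-shot Speech_ToWav call is the safer bet, since it's the routine documented as saving to disk.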

And that's basically it for now. The Windows Speech Synthesizer is quite a powerful little tool with a ton of options we could make use of, but I figure this covers the basics of what someone would need from it in a program. It handles what I need for now.

If you guys need it to do more, feel free to ask and I'll see about adding extra functionality as people need it. Or, feel free to make the changes necessary yourself and then share them here with us so everybody else can enjoy any extra tweaks you guys add into the code.

To make use of this:

1) Download the library from the attachment below.
2) Move it to your QB64 folder.
3) Add '$INCLUDE:'TextToSpeech.BM' at the end of your program.
4) Call Speech_IoR inside your code to initialize everything.
5) Call the other subs as you want to make use of them and alter the settings to your specific needs.

It's that simple! ;D
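
Put together, the steps above boil down to something like this bare-bones skeleton. It's only a sketch: it assumes TextToSpeech.BM is already sitting in your QB64 folder, and the spoken text and the volume value are arbitrary picks for illustration.

```basic
' Minimal skeleton: assumes TextToSpeech.BM has been copied to the QB64 folder.
Speech_IoR ' step 4: initialize or reset speech options

' step 5: adjust settings to taste (these values are just examples)
Speech_Speaker "David"
Speech_Speed 0 ' 0 is the default speed
Speech_Volume 80 ' anywhere from 0 to 100

Speech_Say "The speech library is loaded and working."

' step 3: the include goes at the end of the program
'$INCLUDE:'TextToSpeech.BM'
```

Any SUBs or FUNCTIONs of your own belong at the end of the program as well, alongside the include line.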

RE: Text to Speech Library (Windows only) - Dimster - 10-20-2022

Thanks for this routine Steve. It has added a little punch to my programming. Bonus - LPRINT also works

RE: Text to Speech Library (Windows only) - SMcNeill - 10-20-2022

(10-20-2022, 04:32 PM)Dimster Wrote: Thanks for this routine Steve. It has added a little punch to my programming. Bonus - LPRINT also works

Glad to see someone making use of it. One day, I'd love to just add TextToSpeech directly into QB64 itself. Problem is, this method is Windows-only, and if I was going to add it into the language, I'd want to add something which was cross-platform compatible -- and I don't know of anything at the moment that is.

RE: Text to Speech Library (Windows only) - Dimster - 10-24-2022

Hi Steve - I just wanted to clarify the control over where the printing occurs when using Speech_SaP. From playing around it seems I can place the printed aspect of the speech using Locate but it doesn't seem to work using _PrintString co-ordinates. Is there a Speech_SaP(100,356),"This next display is going to WOW you"?

RE: Text to Speech Library (Windows only) - SpriggsySpriggs - 10-25-2022

Steve, now you need to do Speech to Text (which PowerShell does support).

RE: Text to Speech Library (Windows only) - Pete - 10-25-2022

Steve,

Yar mar th'n welcum ta uze my audeeo tapes ta make yar speeks ta tex appie!

- Sam

RE: Text to Speech Library (Windows only) - Dimster - 10-26-2022

OH Lord ... I didn't intend that question to put any added work on your shoulders Steve. I'm having no problems with speech in graphic screens by simply separating the screen text from the spoken word

(i.e. _PrintString (10,25),"This Next Display is the contents of the Array Catch All"
Speech_Say "This Next Display is the contents of the Array Catch All")

RE: Text to Speech Library (Windows only) - SMcNeill - 10-26-2022

(10-26-2022, 12:25 PM)Dimster Wrote: OH Lord ... I didn't intend that question to put any added work on your shoulders Steve. I'm having no problems with speech in graphic screens by simply separating the screen text from the spoken word

(i.e. _PrintString (10,25),"This Next Display is the contents of the Array Catch All"
Speech_Say "This Next Display is the contents of the Array Catch All")

No extra work for this little addition at all. Just add a quick sub to handle the work as desired:

Code: (Select All)
_Title "Steve's Powershell Speech Library"

Speech_IoR 'initialize or reset speech options

Speech_SaP "Hello World, This is a normal speed demo of David's voice" 'speak and print

_Delay 2

Speech_Speaker "Ziva"

Speech_Say "Hello again.  This is a normal speed demo of Ziva's voice." 'just speak this one

_Delay 2

Speech_Speaker "David"

Speech_Speed -10

Speech_SaP "And now I'm speaking as David, but I'm speaking veeery slow."

_Delay 2

Speech_Speaker "Ziva"

Speech_Speed 5

Speech_SaP "And now I'm a very hyper Ziva!"

_Delay 2

Speech_Speed 0

Speech_Volume 30

Speech_SaP "And now I'm whispering to you that I'm done with my demo!"

Speech_SaPS 10, 15, "And now I'm using PRINTSTRING and SPEAKING!"

'$INCLUDE:'TextToSpeech.BM'

Sub Speech_SaPS (x, y, text$) 'Speak and PrintString

    _PrintString (x, y), text$

    Speech_Say text$

End Sub

RE: Text to Speech Library (Windows only) - Dimster - 10-26-2022

Works like a charm - thanks Mr. Wizzard

RE: Text to Speech Library (Windows only) - SMcNeill - 11-06-2022

(10-25-2022, 01:21 PM)Spriggsy Wrote: Steve, now you need to do Speech to Text (which PowerShell does support).

I was playing around with this concept a little bit this evening. From what I can tell, it's a worthless endeavor. Try this little PowerShell snippet out, for example (save it as something like "Test.ps1").

Code: (Select All)
add-type -assemblyname system.speech

$sp = [System.Speech.Recognition.SpeechRecognitionEngine]::new()

$sp.LoadGrammar([System.Speech.Recognition.DictationGrammar]::new())

$sp.SetInputToDefaultAudioDevice()

$result = $sp.Recognize()

$result

Now, PowerShell didn't like me just running the file as it exists, so I had to bypass a few security settings with:

Code: (Select All)
powershell -executionpolicy ByPass -File .\Test.ps1

Run it, and you get a blank prompt which basically appears to do nothing. (It's waiting a few seconds for you to talk to it...) Say something, and watch the hilarity ensue as it tries to decipher what you just said.

"Do you understand me at all?":

Code: (Select All)
Audio                : System.Speech.Recognition.RecognizedAudio
Alternates           : {0, 0, 0, 0...}
Text                 : Then we now understand me at all
Confidence           : 0.08942806
Words                : {System.Speech.Recognition.RecognizedWordUnit, System.Speech.Recognition.RecognizedWordUnit,
                       System.Speech.Recognition.RecognizedWordUnit, System.Speech.Recognition.RecognizedWordUnit...}
Semantics            : {}
Homophones           : {0, 0, 0, 0...}
Grammar              : System.Speech.Recognition.DictationGrammar
ReplacementWordUnits : {System.Speech.Recognition.ReplacementText}
HomophoneGroupId     : 0

About 9% confidence that "Then we now understand me at all" is what I said...

And so, I tested it multiple times with a simple "HELLO WORLD" speech... My results were:

Code: (Select All)
Text                : A year later in

Text                : Al Oleg

Text                : The Moraga

Text                : The elena's

Text                : That illusion were in

Five tries, and as you can see, five nonsensical transcriptions of what I said.

Now, is this an issue of a poor microphone? Poor speech training on my PC?

Code: (Select All)
I don't know the problem seems to be that it just doesn't understand me because these last few lines here I dictated flawlessly into Microsoft Word with my same setup on my PC

As it looks to me, PowerShell speech-to-text isn't something worth me wasting my time exploring, since it never seems to generate more than about 9% certainty on anything I say.