How do I adjust pronunciations in a VoiceXML script using CDW 2024?
Last reviewed: 7/14/2024
HOW Article ID: H062404
The information in this article applies to:
- Developer Workbench 2024
- LexiconKit 10
- VoiceMarkupKit 10
- VoiceXMLKit 7
Summary
Speech synthesis is used in concert with a variety of technologies, and pronunciation is an issue in all of them. Discover how lexicons, Speech Synthesis Markup Language (SSML), and VoiceXML intersect to produce great-sounding synthesized speech.
More Information
Chant Developer Workbench includes LexiconKit for generating, editing, and speaking the phonemes used to adjust pronunciations, which improves speech recognition accuracy and speech synthesis clarity. It includes VoiceMarkupKit for generating markup for synthesizing speech, which may be used with adjusted pronunciations. It includes VoiceXMLKit for generating, editing, validating, and running VoiceXML documents that synthesize speech annotated with pronunciation markup.
The VoiceXMLKit Specials sample illustrates a VoiceXML script for grocery store inquiries that provides information about daily specials on demand. The example script includes a reference to Hass avocados, which English synthesizers mispronounce. Consider the following process to adjust the pronunciation of proper nouns such as Hass.
The VoiceXMLKit component of Chant Developer Workbench provides a full-featured editor for creating, modifying, validating, and executing VoiceXML documents. VoiceXMLKit includes sample app projects in C++Builder, Visual C++, Delphi, Java and .NET that illustrate running the Specials VoiceXML script. To follow along, open the sample VoiceXML document Specials.vxml (Documents\Chant\VoiceXMLKit 7) in Developer Workbench.
Run the script and select Fresh Produce when prompted to hear how Hass is pronounced.
To change the vowel in Hass from the [a] (cat) sound to the [o] (dog) sound, begin by opening an SSML editor with VoiceMarkupKit.
Select Microsoft SAPI5 Windows Speech Synthesis in the speech API dropdown; LexiconKit can generate default pronunciation phonemes with SAPI. Select any English voice. Select the Speech menu item Generate Phonemes to display the tool window. Enter Hass, select Noun or Proper Noun, en-US, and ipa, and press the Generate Phonemes button, which places the default pronunciation on the clipboard. In the editor, type a phoneme tag with Hass as its content. Set the alphabet attribute to ipa, and set the ph attribute to the default pronunciation by pasting the clipboard contents. Select the Speech menu item Start Speaking or click the Speak Text toolbar button to hear the default pronunciation.
To adjust the pronunciation, first copy and paste a duplicate of the phoneme tag. To edit the copy, select the Speech menu item Edit Phonemes to display the tool window. Scroll through the phonemes and click the open-mid back rounded phoneme. The [o] vowel (dog) sound is played and the phoneme is placed on the clipboard. Select the [a] phoneme in the second phoneme tag and paste the open-mid back rounded phoneme to replace it. Select the Speech menu item Start Speaking or click the Speak Text toolbar button to hear both pronunciations.
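The markup produced by these steps can be sketched outside the editor as well. The following Python sketch builds an SSML document with two phoneme tags for Hass, one with a default pronunciation and one with the adjusted vowel. The IPA strings are assumptions: "hæs" stands in for whatever the Generate Phonemes button actually returns, and "hɔs" substitutes the open-mid back rounded vowel. The function names are illustrative and are not part of any Chant API.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Assumed IPA strings. "hæs" is a stand-in for the generated default;
# "hɔs" swaps in the open-mid back rounded vowel (ɔ).
DEFAULT_PH = "hæs"
ADJUSTED_PH = "hɔs"


def phoneme(word, ph, alphabet="ipa"):
    """Build one SSML <phoneme> element wrapping the given word."""
    element = ET.Element("phoneme", {"alphabet": alphabet, "ph": ph})
    element.text = word
    return element


def build_comparison_ssml(word, default_ph, adjusted_ph):
    """Return SSML that speaks the word twice so the two can be compared."""
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.append(phoneme(word, default_ph))
    speak.append(phoneme(word, adjusted_ph))
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_comparison_ssml("Hass", DEFAULT_PH, ADJUSTED_PH))
```

The second phoneme element is the adjusted pronunciation markup that the next step copies into the VoiceXML script.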
Select and copy the adjusted pronunciation markup, then paste it over the word Hass in the Specials VoiceXML script.
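For scripts with many occurrences of a word, the same replacement could be done programmatically. The Python sketch below wraps each whole-word occurrence of Hass in a phoneme tag. The prompt text is hypothetical and is not copied from Specials.vxml, the IPA string "hɔs" is an assumed adjusted pronunciation, and the function names are illustrative.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def phoneme_markup(word, ph, alphabet="ipa"):
    """Return an SSML phoneme tag for the word as a string."""
    return "<phoneme alphabet={} ph={}>{}</phoneme>".format(
        quoteattr(alphabet), quoteattr(ph), escape(word)
    )


def annotate_word(vxml_text, word, ph):
    """Wrap each whole-word occurrence of word in phoneme markup."""
    pattern = re.compile(r"\b{}\b".format(re.escape(word)))
    markup = phoneme_markup(word, ph)
    # A function replacement avoids backslash handling in the markup string.
    return pattern.sub(lambda match: markup, vxml_text)


# Hypothetical prompt; the real Specials.vxml wording may differ.
PROMPT = "<prompt>The fresh produce special is Hass avocados.</prompt>"

if __name__ == "__main__":
    print(annotate_word(PROMPT, "Hass", "hɔs"))
```

This is a plain text substitution, so it does not distinguish prompt text from attribute values or from words that are already annotated. Run it once on unannotated text and validate the result in the VoiceXML editor.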
Run the document and select the Fresh Produce option when prompted to hear the adjusted pronunciation of Hass.
The previous example illustrates adjusting a pronunciation using W3C SSML and the ipa alphabet. Applications that use SAPI XML markup instead can follow the same steps with the sapi phoneme alphabet. Begin by opening a SAPI XML editor with VoiceMarkupKit.
Select Microsoft SAPI5 Windows Speech Synthesis in the speech API dropdown. Select the Speech menu item Generate Phonemes to display the tool window. Enter Hass, select Noun or Proper Noun, en-US, and sapi, and press the Generate Phonemes button, which places the default pronunciation on the clipboard. In the editor, type a pron tag with Hass as its content. Set the sym attribute to the default pronunciation by pasting the clipboard contents. Select the Speech menu item Start Speaking or click the Speak Text toolbar button to hear the default pronunciation.
To adjust the pronunciation, first copy and paste a duplicate of the pron tag. To edit the copy, select Speech menu item: Edit Phonemes to display the tool window. Scroll through the phonemes. Click on the ao – d [o] g phoneme. The [o] vowel (dog) sound is played and the phoneme is placed on the clipboard. Select the [ae] phoneme in the second pron markup and paste the ao – d [o] g phoneme to replace it. Select the Speech menu item: Start Speaking or click the Speak Text toolbar button to hear the pronunciations.
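The SAPI variant of the edit can be sketched the same way. In the Python sketch below, the default sym value "h ae s" is an assumption standing in for the output of the Generate Phonemes button, which may also include stress markers. The helper names are illustrative. SAPI phoneme symbols are separated by spaces, so the swap works on whole symbols.

```python
from xml.sax.saxutils import escape, quoteattr


def swap_phoneme(sym, old, new):
    """Replace every occurrence of one SAPI phoneme symbol with another."""
    return " ".join(new if symbol == old else symbol for symbol in sym.split())


def pron_markup(word, sym):
    """Return a SAPI XML pron tag for the word as a string."""
    return "<pron sym={}>{}</pron>".format(quoteattr(sym), escape(word))


# Assumed default; the generated value may differ.
DEFAULT_SYM = "h ae s"
# Swap the [ae] (cat) vowel for the ao (dog) vowel.
ADJUSTED_SYM = swap_phoneme(DEFAULT_SYM, "ae", "ao")

if __name__ == "__main__":
    print(pron_markup("Hass", DEFAULT_SYM))
    print(pron_markup("Hass", ADJUSTED_SYM))
```

Because the swap compares whole symbols, any other tokens in the sym value, such as stress markers, pass through unchanged.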
Adjusting speech synthesis pronunciations is easy if you have the right tools. Chant Developer Workbench, with LexiconKit, VoiceMarkupKit, and VoiceXMLKit, provides what you need to make synthesized speech sound good.