KB113: Speech Tags


What are Speech Tags

You can use Speech Tags wherever you specify text to be spoken by Text to Speech (TTS). Speech Tags change the quality of the voice itself or the pronunciation of a word. Some tags, such as [silence 1s], are used alone, but most come as an opening and closing pair that wraps a word or phrase, such as [digits]123[/digits].

You can use the Script panel's Play button to listen to your tagged text and make sure it sounds right. If you hear the voice speak the text of a tag, the tag was not recognized as a tag; check its spelling against the reference below. Make sure that each opening tag has a corresponding closing tag. If tags are not properly matched, you may not hear the voice at all.
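Tag spelling and pairing can also be checked before you press Play. The following is a minimal Python sketch, not part of the product: the function name find_tag_problems is hypothetical, the tag names are copied from the reference below, and it assumes that [silence] is the only tag used alone and that nested tags must close in the reverse order they were opened.

```python
import re

# Tag names copied from the reference list in this article.
KNOWN_TAGS = {
    "alt", "conversational", "digits", "english", "french", "german",
    "ipa", "italian", "news", "past", "silence", "pinyin", "pitch",
    "rate", "sampa", "spell", "spanish", "spoken", "verb", "volume",
    "written",
}

# Assumption: [silence] is the only tag that is used without a closing tag.
STANDALONE_TAGS = {"silence"}

# Matches [name], [name argument], and [/name].
TAG_PATTERN = re.compile(r"\[(/?)([a-z]+)(?:\s+[^\[\]]*)?\]")


def find_tag_problems(text):
    """Return a list of messages describing misspelled or unmatched tags."""
    problems = []
    open_tags = []  # stack of paired tags still waiting for a closing tag
    for match in TAG_PATTERN.finditer(text):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if name not in KNOWN_TAGS:
            problems.append(f"unknown tag {match.group(0)}")
        elif name in STANDALONE_TAGS:
            if is_closing:
                problems.append(f"{match.group(0)} closes a tag that is used alone")
        elif not is_closing:
            open_tags.append(name)
        elif open_tags and open_tags[-1] == name:
            open_tags.pop()
        else:
            problems.append(f"unexpected closing tag {match.group(0)}")
    problems.extend(f"missing closing tag for [{name}]" for name in open_tags)
    return problems


print(find_tag_problems("The id is [digits]123."))
# prints ['missing closing tag for [digits]']
```

An empty list means the sketch found nothing to report. It does not prove that the voice will speak the text as intended, so listening with the Play button remains the real test.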


Speech Tag Reference

Here is a list of available tags.

[alt] Tags a word so that it is spoken in an alternate manner, often associated with an alternate sense of the word. Modern voices use context to determine pronunciation, so this tag is rarely needed. She was fishing for [alt]bass[/alt].
[conversational] With certain voices, such as Matthew and Joanna, this tag alters the intonation to sound more relaxed and low-key, as in a natural conversation. See also [news]. [conversational]Sure. Try this.[/conversational]
[digits] Reads out a number as digits. The id is [digits]123[/digits].
[english] When using a non-English voice, tags a word or phrase that should be spoken using English pronunciation. C'est le [english]Character Builder[/english].
[french] When using an English voice, tags a word or phrase that should be spoken using French pronunciation. As they say, [french]c'est la vie![/french].
[german] When using an English voice, tags a word or phrase that should be spoken using German pronunciation. On the [german]autobahn[/german].
[ipa] Pronounces the enclosed word using IPA phonetic spelling. [ipa pɪˈkɑːn]pecan[/ipa]
[italian] When using an English voice, tags a word or phrase that should be spoken using Italian pronunciation. That's [italian]amore[/italian].
[news] With certain voices, such as Matthew and Joanna, this tag alters the intonation to sound more like a newscaster. See also [conversational]. [news]The rain in Spain fell mainly in the plain.[/news]
[past] Tags a word as being a past-tense verb. Modern voices use context to determine pronunciation, so this tag is rarely needed. I [past]read[/past] it yesterday.
[silence] Specifies a pause in the speech. You can use a value in seconds (s) or in milliseconds (ms). [silence 1.5s] [silence 500ms]
[pinyin] For Mandarin Chinese, pronounces the enclosed word using Pinyin phonetic spelling. [pinyin bao2]薄[/pinyin]
[pitch] Sets the pitch for a word or phrase. As an argument, you can use the word default, x-low, low, medium, high, x-high, or a relative value in %. This feature is not supported on all voices. [pitch -5%]lower[/pitch]
[rate] Sets the rate for a word or phrase. As an argument, you can use the word default, x-slow, slow, medium, fast, x-fast, or a value between 20% and 200%. [rate 80%]slower[/rate]
[sampa] Pronounces the enclosed word using X-SAMPA phonetic spelling. [sampa pI"kA:n]pecan[/sampa]
[spell] Spells a word instead of speaking it normally. This can also be used with the letter 'a', which might otherwise sound like the word "a" in "a dog". The letter [spell]a[/spell].
[spanish] When using an English voice, tags a word or phrase that should be spoken using Spanish pronunciation. [spanish]Buenos días.[/spanish]
[spoken] Use this tag to wrap text as it should be spoken. This text will not appear in a user-facing transcript. Almost always used next to [written]. [spoken]woostershur[/spoken]
[verb] Tags a word as being a verb, to alter the pronunciation. Modern voices use context to determine pronunciation, so this tag is rarely needed. Let me [verb]present[/verb] you with a present.
[volume] Sets the volume for a word or phrase. As an argument, you can use the word default, silent, x-soft, soft, medium, loud, x-loud, or a value such as +ndB, -ndB. [volume -6dB]quiet[/volume]
[written] Use this tag to wrap text as it should be written. This text may appear in a user-facing transcript, but is never spoken. Almost always used next to [spoken]. [written]Worcestershire[/written]
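
The argument rules given above for [rate] and [silence] can be checked in code before text is sent to a voice. The following Python sketch is illustrative only: the function names is_valid_rate and silence_to_ms are hypothetical, the 20% and 200% limits are assumed to be inclusive, and decimal values are assumed to be accepted for both tags.

```python
import re

# Keyword arguments listed for [rate] in the reference above.
RATE_WORDS = {"default", "x-slow", "slow", "medium", "fast", "x-fast"}


def is_valid_rate(argument):
    """True if argument is a rate keyword or a percentage from 20% to 200%."""
    if argument in RATE_WORDS:
        return True
    match = re.fullmatch(r"(\d+(?:\.\d+)?)%", argument)
    # Assumption: the documented range includes both end points.
    return bool(match) and 20 <= float(match.group(1)) <= 200


def silence_to_ms(argument):
    """Convert a [silence] argument such as '1.5s' or '500ms' to milliseconds.

    Returns None when the argument is not a number followed by s or ms.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s)", argument)
    if not match:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value * 1000 if unit == "s" else value


print(is_valid_rate("80%"))     # prints True
print(is_valid_rate("250%"))    # prints False
print(silence_to_ms("1.5s"))    # prints 1500.0
```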




Copyright © 2020 Media Semantics, Inc. All rights reserved.
Comments? Write webmaster@mediasemantics.com.