The Web Speech API does not support SSML input to the speech synthesis engine (WebAudio/web-speech-api#10), or the ability to capture the output of speechSynthesis.speak() as a MediaStreamTrack or raw audio (https://lists.w3.org/Archives/Public/public-speech-api/2017Jun/0000.html).
See Issue 1115640: [FUGU] NativeTransferableStream.
Native Messaging => Deno.Command() => eSpeak NG => fetch() => Transferable Streams => MediaStreamTrack.
Use local espeak-ng with the -m option set in the browser.
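Because -m makes espeak-ng interpret SSML and other markup in its input, plain text containing & or < may need escaping before it is wrapped in markup. A minimal sketch, assuming hypothetical helper names escapeXml and toSsml that are not part of the repository:

```javascript
// Hypothetical helpers (not from the repository): escape plain text and wrap
// it in a minimal SSML document suitable for espeak-ng -m.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// rate is an SSML <prosody> rate value such as 'slow', 'medium', or 'fast'.
function toSsml(text, { lang = 'en-US', rate = 'medium' } = {}) {
  return `<speak version='1.0' xml:lang='${lang}'>` +
    `<prosody rate='${rate}'>${escapeXml(text)}</prosody></speak>`;
}

console.log(toSsml('Tom & Jerry < 3'));
```

Single-quoted attributes are used here only so the result can be embedded in a double-quoted string without further escaping; how the host quotes its input is not confirmed by this document.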
Output speech synthesis audio as a live MediaStreamTrack.
Use Native Messaging and Deno.Command() to pass text and Speech Synthesis Markup Language as STDIN to espeak-ng, and stream STDOUT in "real-time" as a live MediaStreamTrack.
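Native Messaging exchanges JSON messages over the host's STDIN and STDOUT, each preceded by a 32-bit length in native byte order. A minimal sketch of that framing, assuming a little-endian machine; the function names are hypothetical, and the host in the repository may implement this differently:

```javascript
// Hypothetical helpers illustrating Native Messaging framing:
// a 4-byte length (little-endian assumed here) followed by UTF-8 JSON.
function encodeMessage(message) {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const framed = new Uint8Array(4 + body.length);
  new DataView(framed.buffer).setUint32(0, body.length, true);
  framed.set(body, 4);
  return framed;
}

function decodeMessage(framed) {
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength);
  const length = view.getUint32(0, true);
  const body = framed.subarray(4, 4 + length);
  return JSON.parse(new TextDecoder().decode(body));
}

// Round trip a message shaped like the stdin object used later in this README.
const framedExample = encodeMessage({ cmd: 'espeak-ng -m --stdout', input: 'Hello' });
console.log(framedExample.length, decodeMessage(framedExample));
```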
eSpeak NG: see Building eSpeak NG.
Deno is used for serveTls() and Deno.Command(). Substitute your server language of choice. We do not install the deno executable globally; we just use the executable in the unpacked extension directory.
git clone --branch deno-server https://github.com/guest271314/native-messaging-espeak-ng.git
cd native-messaging-espeak-ng
chmod u+x nm_deno.js deno_server.js
wget --show-progress --progress=bar --output-document deno.zip \
https://github.com/denoland/deno/releases/latest/download/deno-x86_64-unknown-linux-gnu.zip && \
unzip deno.zip && \
rm deno.zip
Follow steps 1, 2, and 4 of the instructions below to create a self-signed certificate for Chromium/Chrome to use for HTTPS.
# As an alternative, Chromium can be instructed to trust a self-signed
# certificate using command-line flags. Here are step-by-step instructions on
# how to do that:
#
# 1. Generate a certificate and a private key:
# openssl req -newkey rsa:2048 -nodes -keyout certificate.key \
# -x509 -out certificate.pem -subj '/CN=Test Certificate' \
# -addext "subjectAltName = DNS:localhost"
#
# 2. Compute the fingerprint of the certificate:
# openssl x509 -pubkey -noout -in certificate.pem |
# openssl rsa -pubin -outform der |
# openssl dgst -sha256 -binary | base64
# The result should be a base64-encoded blob that looks like this:
# "Gi/HIwdiMcPZo2KBjnstF5kQdLI5bPrYJ8i3Vi6Ybck="
#
# 3. Pass a flag to Chromium indicating what host and port should be allowed
# to use the self-signed certificate. For instance, if the host is
# localhost, and the port is 4433, the flag would be:
# --origin-to-force-quic-on=localhost:4433
#
# 4. Pass a flag to Chromium indicating which certificate needs to be trusted.
# For the example above, that flag would be:
# --ignore-certificate-errors-spki-list=Gi/HIwdiMcPZo2KBjnstF5kQdLI5bPrYJ8i3Vi6Ybck=
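The fingerprint from step 2 is the value that step 4 passes to the browser. A small sketch that checks the value decodes to a 32-byte SHA-256 digest and builds the flag; spkiFlag is a hypothetical helper name, not part of the repository:

```javascript
// Hypothetical helper: validate a base64 SPKI fingerprint and build the
// Chromium command-line flag from step 4.
function spkiFlag(fingerprint) {
  // atob() throws on input that is not valid base64.
  const bytes = atob(fingerprint);
  if (bytes.length !== 32) {
    throw new RangeError(`Expected 32 bytes, got ${bytes.length}`);
  }
  return `--ignore-certificate-errors-spki-list=${fingerprint}`;
}

console.log(spkiFlag('Gi/HIwdiMcPZo2KBjnstF5kQdLI5bPrYJ8i3Vi6Ybck='));
```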
Pass the generated paths to Deno.serveTls() in local_server.js
await serveTls(async (request) => {
  // ...
}, {
  certFile: 'certificate.pem',
  keyFile: 'certificate.key',
  signal,
});
Navigate to chrome://extensions, set Developer mode to on, click Load unpacked, and select the downloaded git directory.
Note the generated extension ID and substitute that value for <id> in native_messaging_espeakng.json, AudioStream.js, and local_server.js.
Substitute the full local path to local_server.js for /path/to in native_messaging_espeakng.json.
"allowed_origins": [ "chrome-extension://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/" ]
Copy native_messaging_espeakng.json to the NativeMessagingHosts directory in the Chromium or Chrome configuration folder; on Linux, e.g., ~/.config/chromium or ~/.config/google-chrome-unstable.
cp native_messaging_espeakng.json ~/.config/chromium/NativeMessagingHosts
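For orientation, a Native Messaging host manifest generally carries a name, a description, the host path, "type": "stdio", and the allowed_origins list. A sketch that builds such an object, assuming a hypothetical helper buildHostManifest; the name and description values are illustrative assumptions, so check them against the file shipped in the repository:

```javascript
// Hypothetical helper: builds a Native Messaging host manifest object.
// The host name and description are assumptions for illustration only.
function buildHostManifest(extensionId, hostPath) {
  // Chromium extension IDs are 32 characters in the range a-p.
  if (!/^[a-p]{32}$/.test(extensionId)) {
    throw new TypeError(`Invalid extension ID: ${extensionId}`);
  }
  return {
    name: 'native_messaging_espeakng',
    description: 'eSpeak NG Native Messaging host',
    path: hostPath,
    type: 'stdio',
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

// 'a'.repeat(32) stands in for the real extension ID.
const exampleManifest = buildHostManifest('a'.repeat(32), '/path/to/local_server.js');
console.log(JSON.stringify(exampleManifest, null, 2));
```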
Reload extension.
On origins listed in the "matches" array of the "web_accessible_resources" object in manifest.json, run, e.g., at the console:
var { AudioStream } = await import('chrome-extension://<id>/AudioStream.js');
var text = `So we need people to have weird new
ideas ... we need more ideas to break it
and make it better ...
Use it. Break it. File bugs. Request features.
- Real time front-end alchemy, or: capturing, playing,
altering and encoding video and audio streams, without
servers or plugins!
by Soledad Penadés
von Braun believed in testing. I cannot
emphasize that term enough – test, test,
test. Test to the point it breaks.
- Ed Buckbee, NASA Public Affairs Officer, Chasing the Moon
Now watch. ..., this how science works.
One researcher comes up with a result.
And that is not the truth. No, no.
A scientific emergent truth is not the
result of one experiment. What has to
happen is somebody else has to verify
it. Preferably a competitor. Preferably
someone who doesn't want you to be correct.
- Neil deGrasse Tyson, May 3, 2017 at 92nd Street Y
It’s like they say - if the system fails you, you create your own system.
- Michael K. Williams, Black Market`;
var stdin = {cmd:`espeak-ng -m -v Storm --stdout`, input:`"${text}"`};
var espeakng = new AudioStream({ stdin, recorder: true });
espeakng
.start()
.then((ab) => {
console.log(
URL.createObjectURL(new Blob([ab], { type: 'audio/webm;codecs=opus' }))
);
})
.catch(console.error);
Abort the request and audio output.
await espeakng.abort()
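To give a sense of what has to happen between STDOUT and a live track, here is a minimal sketch of converting streamed bytes to audio samples. It assumes espeak-ng --stdout emits WAV data whose payload is 16-bit little-endian PCM and that any header has already been skipped; pcm16ToFloat32 is a hypothetical helper, and AudioStream.js in the repository may handle this differently:

```javascript
// Hypothetical helper: convert 16-bit little-endian PCM bytes to Float32
// samples in [-1, 1). A trailing odd byte is returned as `remainder` so the
// caller can prepend it to the next chunk of the stream.
function pcm16ToFloat32(bytes) {
  const sampleCount = bytes.length >> 1;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  const remainder = bytes.subarray(sampleCount * 2);
  return { samples, remainder };
}

// Three complete samples followed by one leftover byte.
const chunk = new Uint8Array([0x00, 0x00, 0xff, 0x7f, 0x00, 0x80, 0x01]);
const converted = pcm16ToFloat32(chunk);
console.log(Array.from(converted.samples), converted.remainder.length);
```

Returning the leftover byte matters because stream chunks are not guaranteed to end on a sample boundary.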
- Include test for setting an SSML document at SpeechSynthesisUtterance .text property within speech-api
- This is again recording from microphone, not from audiooutput device
- Support SpeechSynthesis to a MediaStreamTrack
- Clarify getUserMedia({audio:{deviceId:{exact:<audiooutput_device>}}}) in this specification mandates capability to capture of audio output device - not exclusively microphone input device
- How to modify existing code or build with -m option set for default SSML parsing?
- Issue 795371: Implement SSML parsing at SpeechSynthesisUtterance
- Implement SSML parsing at SpeechSynthesisUtterance
- How is a complete SSML document expected to be parsed when set once at .text property of SpeechSynthesisUtterance instance?
- How to programmatically send a unix socket command to a system server autospawned by browser or convert JavaScript to C++ souce code for Chromium?
- <script type="shell"> to execute arbitrary shell commands, and import stdout or result written to local file as a JavaScript module
- Add execute() to FileSystemDirectoryHandle
- SpeechSynthesis to a MediaStreamTrack or: How to execute arbitrary shell commands using inotify-tools and DevTools Snippets