Reading aloud with text-to-speech using JavaScript

Table of contents

The purpose
Implementation
Example
Result
Reference

The purpose

JavaScript can apparently do text-to-speech in the browser, so I tried it out.

Implementation

The sample code on the MDN page below did not include the HTML part and had a few questionable spots, so I modified it until it worked.

SpeechSynthesis - Web API | MDN

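The SpeechSynthesis interface of the Web Speech API is the controller interface for the speech service. It can be used to retrieve information about the synthesis voices available on the device, to start and pause speech, and to issue other commands.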

The revised code is shown below. Comments have been added.

<html>
<body>
    <form>
        <input type="text" name="text" class="txt" value="will be read" required><br><br>
        <select name="example"></select><br><br>
        <input type="range" id="pitch" name="pitch" min="0" max="2" step="0.1" />
        <label for="pitch">PITCH</label><br><br>
        <input type="range" id="rate" name="rate" min="0.1" max="10" step="0.1" value="1"/>
        <label for="rate">RATE (speed)</label><br><br>
        <input type="button" value="READ" id="submit"><br><br>
    </form>

    <script>
        const synth = window.speechSynthesis;

        const button = document.querySelector("#submit"); // READ button
        const inputTxt = document.querySelector(".txt"); // text field holding the text to read
        const voiceSelect = document.querySelector("select"); // voice dropdown

        const pitch = document.querySelector("#pitch"); // pitch slider
        const rate = document.querySelector("#rate"); // rate (speed) slider

        let voices = [];
        function populateVoiceList() { // build the voice dropdown
            voices = synth.getVoices();
            voiceSelect.innerHTML = ""; // clear old entries so repeated calls do not add duplicates

            for (let i = 0; i < voices.length; i++) {
                const option = document.createElement("option");
                option.textContent = voices[i].name + " (" + voices[i].lang + ")";

                if (voices[i].default) {
                    option.textContent += " -- DEFAULT";
                }

                option.setAttribute("data-lang", voices[i].lang);
                option.setAttribute("data-name", voices[i].name);
                voiceSelect.appendChild(option);
            }
        }
        populateVoiceList();

        if (speechSynthesis.onvoiceschanged !== undefined) {//update pulldown
            speechSynthesis.onvoiceschanged = populateVoiceList;
        }

        button.onclick = function (event) {//process for button click
            event.preventDefault();

            let utterThis = new SpeechSynthesisUtterance(inputTxt.value);
            let selectedOption = voiceSelect.selectedOptions[0].getAttribute("data-name");//get selected voice
            for (let i = 0; i < voices.length; i++) {
                if (voices[i].name === selectedOption) {
                    utterThis.voice = voices[i]; // use the selected voice
                }
            }
            utterThis.pitch = pitch.value; // set pitch from the slider
            utterThis.rate = rate.value; // set rate (speed) from the slider
            synth.speak(utterThis); // READ

            inputTxt.blur();
        };
    </script>
    
</body>
</html>
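
The listing above calls populateVoiceList() once and again whenever the voice list changes. As an alternative, here is a rough sketch of wrapping that wait in a Promise. The helper name `loadVoices` and the `timeoutMs` parameter are my own assumptions for illustration and are not part of the MDN sample. The `synth` argument is expected to behave like `window.speechSynthesis`.

```javascript
// Hypothetical helper (not from MDN): resolves with the voice list once it is
// non-empty, or with whatever getVoices() returns after timeoutMs.
function loadVoices(synth, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial); // voices were already available
      return;
    }
    const canListen = typeof synth.addEventListener === "function";
    const finish = () => {
      clearTimeout(timer);
      if (canListen) {
        synth.removeEventListener("voiceschanged", finish);
      }
      resolve(synth.getVoices());
    };
    // Fallback in case the voiceschanged event never fires.
    const timer = setTimeout(finish, timeoutMs);
    if (canListen) {
      synth.addEventListener("voiceschanged", finish);
    }
  });
}
```

In the page this would be used roughly as `loadVoices(window.speechSynthesis).then(populate)`, where `populate` is whatever function fills the dropdown. The timeout is only a safety net, and 2000 ms is an arbitrary value.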

Example

The page reads aloud whatever text you type into the text field at the top.

Select a voice from the dropdown menu below it.

Adjust the pitch and speed with the two sliders below that.
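
Slider values arrive as strings, and SpeechSynthesisUtterance accepts a pitch from 0 to 2 and a rate from 0.1 to 10. A minimal sketch of normalizing the two values before assigning them follows. The function name `toSpeechParams` is a hypothetical helper I made up for this example, and falling back to 1 for unparsable input is my own choice.

```javascript
// Hypothetical helper: convert slider strings to numbers inside the ranges
// that SpeechSynthesisUtterance accepts (pitch 0 to 2, rate 0.1 to 10).
function toSpeechParams(pitchValue, rateValue) {
  const clamp = (value, min, max, fallback) => {
    const n = Number.parseFloat(value);
    if (!Number.isFinite(n)) {
      return fallback; // unparsable input: use the default
    }
    return Math.min(max, Math.max(min, n));
  };
  return {
    pitch: clamp(pitchValue, 0, 2, 1),
    rate: clamp(rateValue, 0.1, 10, 1),
  };
}
```

In the click handler this could be used as `const p = toSpeechParams(pitch.value, rate.value);` followed by `utterThis.pitch = p.pitch;` and `utterThis.rate = p.rate;`.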

Result

Speech synthesis worked using nothing but JavaScript in the browser.

By the way

In my test, English voices (and probably voices for other non-Japanese languages) could not read Japanese text. Japanese voices, on the other hand, could read English text.
The available voices vary significantly depending on the browser and operating system. (Edge, for example, appears to fetch many voices over the network, so a large number of voices show up in the dropdown after a short wait.) It's best to avoid hard-coding a specific voice.
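
One way to avoid a fixed voice is to choose by language and fall back when nothing matches. The following is only a sketch under my own assumptions: `pickVoice` is a hypothetical helper, `voices` is an array shaped like the result of `speechSynthesis.getVoices()`, and the underscore handling is a defensive guess for platforms that might report tags like `en_US`.

```javascript
// Hypothetical helper: pick a voice for a language tag such as "ja-JP" or "en".
// Order of preference: exact tag, same primary language, default voice, first voice.
function pickVoice(voices, lang) {
  const normalize = (tag) => tag.toLowerCase().replace("_", "-");
  const wanted = normalize(lang);
  const primary = wanted.split("-")[0];
  return (
    voices.find((v) => normalize(v.lang) === wanted) ||
    voices.find((v) => normalize(v.lang).split("-")[0] === primary) ||
    voices.find((v) => v.default) ||
    voices[0] ||
    null // no voices at all
  );
}
```

For Japanese text this would be called as `pickVoice(voices, "ja-JP")`, and the result, if not null, assigned to `utterThis.voice`.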

Reference

SpeechSynthesis - Web API | MDN