New Snippet: Azure Text to Speech

I made a PopClip snippet that uses Azure’s Text to Speech API to pronounce the selected text. Azure’s speech service is better than the say command: it sounds like a real human being rather than a robot. However, you need your own Azure region and subscription key.


Awesome, thanks for sharing!

The PopClip snippet feature is really great; it makes writing an extension a lot easier. As a non-native English speaker, I often need to know how to pronounce a word or phrase. That is why I made this extension, and I believe there are many people like me who need this feature. Hence, I’d like to kindly request that this extension be put on the official extension store so that more people can benefit from it. I do understand that maintaining and improving it could take more time and energy if it stays in its current state.

This snippet is already shared here, on GitHub, and on Twitter (I saw it today). I did a Google search and replied to this topic, which was the first result, to make the snippet easier to find.

It wouldn’t be suitable for the extension directory in its current form because it does not work “out of the box”, i.e. the code needs editing before it works. It is better as a snippet that people can adapt with their own API key.

I’m not currently accepting new submissions to the extensions directory, but I am working on a new extensions directory that will be easier for people to submit their own extensions to. A modified version of this (with the configuration as options rather than hardcoded) would be great for that when it’s ready.


Thanks. This is my first time making a PopClip extension, and the code basically just works for me. Options are definitely required if others are to use it easily. I’ll learn how to make a configurable PopClip extension in my spare time.


Here is an example of how you could add the options:

#!/bin/zsh
# #popclip
# name: Azure TTS
# icon: symbol:message.and.waveform
# options:
# - { identifier: region, label: Azure Region, type: string, description: "Helpful text here" }
# - { identifier: key, label: Azure Subscription Key, type: string, description: "Helpful text here" }
...

and then access the variable using $POPCLIP_OPTION_REGION and $POPCLIP_OPTION_KEY.
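To show how those two option variables might feed into the request, here is a minimal sketch of a script body that could follow such a header. The original snippet’s body is not shown in this thread, so this is not the author’s code. The voice (en-US-JennyNeural), language (en-US), output format, User-Agent string, temp file name, and the helper names azure_tts_url, xml_escape and build_ssml are all assumptions chosen for illustration.

```shell
#!/bin/zsh
# Sketch only: voice, language, output format and helper names are assumptions.

# Build the regional REST endpoint for Azure Text to Speech ($1 = region).
azure_tts_url() {
  printf 'https://%s.tts.speech.microsoft.com/cognitiveservices/v1' "$1"
}

# Escape the characters that are special in XML so the text can sit inside SSML.
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Wrap text in a minimal SSML document: $1 = language, $2 = voice, $3 = text.
build_ssml() {
  printf "<speak version='1.0' xml:lang='%s'><voice name='%s'>%s</voice></speak>" \
    "$1" "$2" "$(xml_escape "$3")"
}

# Only call the service when PopClip has supplied text and a key.
if [ -n "${POPCLIP_TEXT:-}" ] && [ -n "${POPCLIP_OPTION_KEY:-}" ]; then
  out="${TMPDIR:-/tmp}/popclip_azure_tts_$$.mp3"   # hypothetical file name
  curl -sS "$(azure_tts_url "$POPCLIP_OPTION_REGION")" \
    -H "Ocp-Apim-Subscription-Key: ${POPCLIP_OPTION_KEY}" \
    -H "Content-Type: application/ssml+xml" \
    -H "X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3" \
    -H "User-Agent: popclip-azure-tts" \
    -d "$(build_ssml en-US en-US-JennyNeural "$POPCLIP_TEXT")" \
    --output "$out" && afplay "$out"
  rm -f "$out"
fi
```

The network call is guarded so the helper functions can be tried on their own, for example `build_ssml en-US en-US-JennyNeural 'R&D'`, without a key.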


The Region, Key, Output format, Language and Voice have all been extracted to options. I’ve created a repo for anyone who wants to contribute.


Awesome! Thank you for doing this. The README is invaluable. I was able to successfully create the Azure resource and get my subscription key, and got it working.

(I had attempted to do it a couple of days ago but the Azure site is very hard to use and I gave up…)


You’re welcome! I spent hours finding where the TTS service is and where to get the subscription key, and trying to find and understand the REST API documentation. It was really hard… So I thought putting it in the README could save others the time. Today I tried to delete the current app and create a new one so I could write better README instructions, but Azure says the app still exists even though I’ve deleted it. Maybe I’ll try again in a few days and then improve the README.

I removed the details/summary wrapper around “Fill in the options” to make it easier to find.


A user has created an implementation for OpenAI at the following link: popclip-openai-tts/openai_tts.sh at main · YuChenSSR/popclip-openai-tts · GitHub. Unfortunately, the script contains a syntax error.

I have fixed the script and also added several additional options.

#!/bin/zsh
# #popclip
# name: openai TTS
# icon: symbol:message
# options:
# - { identifier: url, label: BASE_URL, type: string, default value: "https://api.openai.com/v1/audio/speech", description: "OpenAI API URL for Speech Service" }
# - { identifier: api_key, label: OpenAI API Key, type: string, description: "The API key for OpenAI" }
# - { identifier: tts-model, label: TTS-Model, type: multiple, values: [tts-1, gpt-4o-mini-tts], default value: "tts-1", description: "The TTS model for OpenAI" }
# - { identifier: speed, label: Playback Speed, type: string, default value: "1.250", description: "The playback speed for the audio (e.g., 0.75 for 75% speed)" }
# - { identifier: voice, label: Voice, type: multiple, values: [alloy, echo, fable, onyx, nova, shimmer], default value: shimmer, description: "The voice for the TTS service" }
# - { identifier: volume, label: Volume, type: multiple, values: ["10", "20", "30", "40", "50", "60", "70", "80", "90", "100"], default value: "50", description: "Playback volume" }

# Create a temporary variable for input and clean it
input_content=$(echo "$POPCLIP_TEXT" | tr '\n\r' ' ' | tr -s ' ')

# Create a unique temporary file in the system's temp directory
temp_file="/tmp/popclip_tts_$$_$(date +%s).mp3"

# Store current volume
current_volume=$(osascript -e "output volume of (get volume settings)")

# Use curl to download and save the audio data to the temporary file
# (the model is hardcoded to tts-1 below; the TTS-Model option is not wired in)
curl "${POPCLIP_OPTION_URL}" \
  -H "Authorization: Bearer ${POPCLIP_OPTION_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "'"${input_content//\"/\\\"}"'",
    "voice": "'"${POPCLIP_OPTION_VOICE}"'"
  }' \
  --output "${temp_file}"

# Set volume for playback using the PopClip option
osascript -e "set volume output volume ${POPCLIP_OPTION_VOLUME}"

# Play the temporary audio file using afplay at the specified speed
afplay -r ${POPCLIP_OPTION_SPEED} "${temp_file}"

# Restore original volume
osascript -e "set volume output volume ${current_volume}"

# Clean up the temporary audio file when you're done with it
rm "${temp_file}"
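The JSON body in the script above escapes double quotes in the input but not backslashes, and it always sends tts-1. A minimal sketch of one way to build the payload is below. The helper names json_escape and build_payload are hypothetical. Passing the model in as a parameter assumes the option identifier is renamed to something without a hyphen (for example model), since it is not confirmed in this thread how PopClip exposes an identifier like tts-model as an environment variable.

```shell
#!/bin/zsh
# Sketch only: json_escape and build_payload are hypothetical helper names.

# Escape backslashes first, then double quotes, so the text is safe inside a JSON string.
# Control characters are not handled here; the script above already turns newlines into spaces.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Assemble the request body: $1 = model, $2 = voice, $3 = input text.
build_payload() {
  printf '{"model":"%s","input":"%s","voice":"%s"}' \
    "$1" "$(json_escape "$3")" "$2"
}
```

With these in place, the curl call could pass `-d "$(build_payload "$POPCLIP_OPTION_MODEL" "$POPCLIP_OPTION_VOICE" "$input_content")"`, where POPCLIP_OPTION_MODEL is the assumed variable for the renamed option.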

@nick PopClip runs for as long as afplay is active and playing the file. Is there a way to close PopClip earlier while allowing afplay to continue playing? During this time, PopClip isn’t usable - it can only interrupt the playback of the MP3 file.

Yes, in a shell script you can put & at the end of a command to run that command in the background and PopClip will then not wait for it. So:

afplay -r ${POPCLIP_OPTION_SPEED} "${temp_file}" &

I’ve already tried that. PopClip keeps spinning until the sound output has finished. The commands that follow it are likely the source of the problem.

Okay. This line works:
nohup sh -c "afplay -r ${POPCLIP_OPTION_SPEED} '${temp_file}'; osascript -e 'set volume output volume ${current_volume}'; rm '${temp_file}'" >/dev/null 2>&1 &
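The same pattern can be wrapped in a small helper, sketched below. The function name play_detached is hypothetical. The explanation that the caller stops waiting because all three standard streams are redirected is an assumption and is not confirmed in this thread. Unlike the line above, this sketch only removes the temp file afterwards; restoring the volume would need to go into the same detached command.

```shell
#!/bin/zsh
# Sketch only: play_detached is a hypothetical helper name.

# Usage: play_detached FILE COMMAND [ARGS...]
# Runs COMMAND in a detached shell, then deletes FILE once COMMAND has finished.
play_detached() {
  # Inside sh -c, $0 is FILE and "$@" is COMMAND with its arguments.
  # stdin, stdout and stderr are redirected so no pipe to the caller stays open.
  nohup sh -c '"$@"; rm -f "$0"' "$@" </dev/null >/dev/null 2>&1 &
}
```

In the script above this might be called as `play_detached "$temp_file" afplay -r "$POPCLIP_OPTION_SPEED" "$temp_file"`, replacing the separate afplay and rm lines.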