Podcast Visibility Optimization

Podcast Glossary – all the terms and jargon explained

For those new to podcasting and not familiar with all the terms, jargon, lingo, and slang, this podcast glossary is for you. And for those with some podcasting experience, this is always a good refresher.

The world of podcasts.


A type of audio file that is available in digital format and can be downloaded on digital devices.

Baked in Ads or Host-read Ads

Baked-in ads, or host-read ads, are ads that are read and recorded by the podcast hosts themselves. Because baked-in ads are part of the episode audio, everyone who downloads the episode hears the same ad.

Bit Depth

Bit depth is the number of bits of information in each audio sample, and it determines the dynamic range of the audio. The higher the bit depth, the greater the dynamic range. The most common bit depths are 16, 24, and 32; for audio content, 16 is widely used.
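The link between bit depth and dynamic range can be sketched with a common rule of thumb: each extra bit adds roughly 6 dB. The snippet below is a minimal illustration that assumes linear integer PCM audio; the function name is hypothetical, and real 32-bit recordings are often floating point, which behaves differently.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bit_depth)."""
    return 20 * math.log10(2 ** bit_depth)

# Compare the three common bit depths named in the glossary entry.
for bits in (16, 24, 32):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.0f} dB")
```

These are theoretical ceilings; the usable range of an actual recording also depends on the microphone, the preamp, and the room.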

Bit Rate

Bit rate refers to the rate at which bits are transferred or processed, usually measured in kilobits per second (kbps). The higher the bit rate, the bigger the audio file. For podcasters, the most commonly used bit rate is 96 kbps mono.
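To see how bit rate drives file size, the rough estimate below multiplies the bit rate by the episode length. It is a sketch that assumes a constant bit rate and ignores container overhead and metadata; the function name and the 30-minute episode are illustrative.

```python
def file_size_mb(bitrate_kbps: float, duration_minutes: float) -> float:
    """Approximate size in megabytes of constant-bit-rate audio."""
    total_bits = bitrate_kbps * 1000 * duration_minutes * 60
    return total_bits / 8 / 1_000_000  # bits -> bytes -> megabytes

# A 30-minute episode at 96 kbps versus 192 kbps.
print(file_size_mb(96, 30))
print(file_size_mb(192, 30))
```

Doubling the bit rate doubles the estimated file size for the same episode length.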

Picture of a turned-on monitor displaying a frequency graph.


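Clipping

Clipping is a form of distortion that produces unpleasant sounds. It happens when the signal goes above the maximum level the recording chain can handle, so the tops of the waveform are cut off. Clipping damages the quality of the recording and, in extreme cases, can even damage equipment.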

A simple graph demonstrating clipping: a sine wave exceeds its threshold (the red line). Image from Wikimedia Commons.
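The same idea can be shown numerically. The sketch below assumes samples are normalized so that the maximum level is 1.0, and it uses a hypothetical `clip` helper to flatten a sine wave that is 50% too loud.

```python
import math

def clip(samples, limit=1.0):
    """Hard-clip every sample to the range [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in samples]

# One cycle of a sine wave with amplitude 1.5, which exceeds the 1.0 limit.
sine = [1.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
clipped = clip(sine)

print(max(sine), max(clipped))
```

Everything above the limit is flattened to the limit, which is what produces the harsh, distorted sound.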


Compression

When compression is applied to an audio recording, the loudest parts are tamed while the quietest parts are brought up, so that the recording has a more balanced and consistent volume level.
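As a rough illustration, the sketch below applies a very simplified compressor to individual samples. It works on linear amplitudes with a hypothetical threshold, ratio, and makeup gain; real compressors typically work in decibels and add attack and release times, so treat this as a toy model only.

```python
def compress(sample, threshold=0.5, ratio=4.0, makeup_gain=1.5):
    """Reduce the part of the level above the threshold, then boost everything."""
    level = abs(sample)
    if level > threshold:
        # Only the excess above the threshold is divided by the ratio.
        level = threshold + (level - threshold) / ratio
    out = level * makeup_gain
    return out if sample >= 0 else -out

loud, quiet = 1.0, 0.2
print(compress(loud), compress(quiet))
```

With these assumed settings the loud sample is five times the quiet one before compression and a little over three times afterwards, so the overall level is more even.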

Condenser Microphone

Also known as capacitor microphones, condenser microphones use a thin, sensitive diaphragm. They can capture high frequencies with accuracy and record more of the audio in an environment.

Cost Per Action (CPA)

A scheme where the podcast producers get paid per action they trigger and not per audience volume. The actions are typically new users or clients for a given service or product. The difficulty of CPA agreements between advertisers and podcast hosts lies in tracking. To date, the most common option used by podcast hosts is specific coupon codes.

Cost Per Mille (CPM)

A pricing metric: the cost per thousand listens or downloads of an ad. “Mille” means “thousand” in Latin. For example, if the price of an ad on a given podcast is $25 CPM, the advertiser pays the host $25 for every thousand listens or downloads.
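The arithmetic behind a CPM deal can be sketched in a couple of lines. The function name and the download figure below are illustrative assumptions, not data from any real show.

```python
def ad_cost(cpm_usd: float, downloads: int) -> float:
    """Cost of one ad: the CPM price applied to each thousand downloads."""
    return cpm_usd * downloads / 1000

# A $25 CPM ad on an episode with 12,000 downloads.
print(ad_cost(25, 12_000))
```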


DAW

DAW is short for Digital Audio Workstation. It refers to the software you use to record and mix the podcast, for example Adobe Audition, Audacity, or Reaper.

Picture of a Digital Audio Workstation at work.

Dynamic Ad Insertion (DAI)

DAI is a technology that allows advertisers to insert targeted ads directly into audio streams to reach specific podcast audiences. The whole process is automated and there is no negotiation around buying ad space. It also means that podcast producers who choose DAI as a way to monetize their show don’t have control over the ads that will be played to the listeners.

Dynamic Microphones

Dynamic microphones are less sensitive than condenser microphones, but they cope better with louder environments, which makes them the preferred choice for live use.

Picture of a Silver Dynamic Microphone.


EQ

Short for equalizer or equalization. EQ can help balance your audio’s sound quality by cutting or boosting certain frequencies so that the audio sounds clearer.


Encoding

In the podcast world, encoding refers to the process of converting an audio file into an MP3 file for upload and distribution.


Format

The different ways to organize your podcast’s content. Some of the most common podcast formats are narrative, solo, co-host, 1:1 interview, and roundtable.


Gain

Gain sets the input level of the audio signal and is used to adjust the sensitivity of your microphone. It controls the tone and the volume of the audio before it goes through a recording device, so it needs to be adjusted BEFORE recording.

High-Pass Filter

A processor that removes unwanted frequencies that are lower than a determined cutoff frequency from your audio.


Hosting Provider

Hosting providers are services that store your audio files and help you manage and distribute them. Just like there are servers for websites, there are servers for podcasts. The most common hosting providers are Buzzsprout, Libsyn, Simplecast, Acast, Podbean, Anchor (Spotify), Megaphone (Spotify), Captivate, Ausha, and Podigee.


Interface

An interface works like a mixer: it is a bridge between the microphone and the recording platform. An interface can also provide phantom power for condenser microphones.

Photo of a Copper Audio Mixer.


Jingle

A brief intro to your podcast that includes the name, a short description of the content, and the host. A jingle usually lasts 30 seconds or less.

Low-Pass Filter

A processor that removes unwanted frequencies that are higher than a determined cutoff frequency from your audio.
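Both the low-pass filter and the high-pass filter defined earlier in this glossary can be sketched as simple first-order (RC-style) filters. This is a minimal illustration, not how any particular DAW implements its filters; the function names, the 44,100 Hz sample rate, and the cutoff values are assumptions chosen for the demo.

```python
import math

def _rc_and_dt(cutoff_hz, sample_rate):
    """Time constant for the cutoff and time step between samples."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    return rc, dt

def low_pass(samples, cutoff_hz, sample_rate=44100):
    """Attenuate content above cutoff_hz by smoothing the signal."""
    rc, dt = _rc_and_dt(cutoff_hz, sample_rate)
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out

def high_pass(samples, cutoff_hz, sample_rate=44100):
    """Attenuate content below cutoff_hz by keeping only the changes."""
    rc, dt = _rc_and_dt(cutoff_hz, sample_rate)
    alpha = rc / (rc + dt)
    out, prev_y, prev_x = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

steady = [1.0] * 5000                                        # lowest possible frequency
fast = [1.0 if n % 2 == 0 else -1.0 for n in range(5000)]    # highest possible frequency

print(low_pass(steady, 1000)[-1], high_pass(steady, 80)[-1])
print(low_pass(fast, 1000)[-1], high_pass(fast, 80)[-1])
```

The steady signal passes through the low-pass filter almost unchanged and is removed by the high-pass filter, while the rapidly alternating signal behaves the other way round.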


Metadata

All the information about your podcast, including the podcast name, hosts and guests, cover art, and episode name and number. Inserting the right keywords into the podcast’s metadata could greatly increase the audience’s interest in the show.

Mix Down

The process of combining multitrack audio into a single file.
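At its simplest, a mix down adds the tracks together sample by sample. The sketch below assumes each track is a list of normalized samples between -1.0 and 1.0 and uses a hypothetical `mix_down` helper; a real DAW also applies per-track volume, panning, and effects.

```python
def mix_down(tracks):
    """Sum several tracks into one, limiting the result to [-1.0, 1.0]."""
    length = max(len(t) for t in tracks)
    mixed = []
    for i in range(length):
        # Shorter tracks simply contribute nothing once they have ended.
        total = sum(t[i] for t in tracks if i < len(t))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

voice = [0.5, 0.25]
music = [0.25, 0.25, 0.9]
print(mix_down([voice, music]))
```

The hard limit at the end only prevents out-of-range values; in practice you would lower the track levels before mixing rather than rely on it.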


Monetization

Simply put: when you start profiting off your podcast. There are two types of monetization: direct and indirect. With direct monetization, podcasters make money either by selling subscriptions to their listeners (premium content, for example) or by running ads on their show. Indirect monetization is when the podcast helps a business gain new customers or retain existing ones.

Podcast monetization.


Mono

Mono sound means that only one channel is used to convert signals into sound. It creates the effect of sound coming from only one source or one direction, even if there is in fact more than one speaker.


Narrowcast

As opposed to a broadcast, which is aimed at a wide, mass audience, a narrowcast focuses on a specific target audience, such as employees of a company or members of an association. While regular podcasts are available to the general public, narrowcasts (also called private podcasts) are password-protected and can only be accessed with credentials.


Peaking

In an audio waveform, the top is referred to as a peak and the bottom as a trough. “Peaking” is when a peak goes too high, usually because of a loud noise such as a cough or a yell.
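One way to check for peaking is to measure the loudest sample relative to full scale. The sketch below assumes normalized samples where 1.0 is the maximum level (0 dBFS); the `peak_dbfs` name and the sample values are illustrative.

```python
import math

def peak_dbfs(samples):
    """Level of the loudest sample in decibels relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

normal_speech = [0.1, -0.5, 0.25]
with_cough = [0.1, -0.5, 1.0]
print(peak_dbfs(normal_speech), peak_dbfs(with_cough))
```

A result at or very near 0 dBFS means the recording has no headroom left and is likely to clip.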

Podcast Glossary

An alphabetical list of words with brief explanations, focused on the podcast industry. Simply put, a dictionary that contains only podcast-related terms, like the one you are reading right now.

Podcast Visibility Analytics (PVA)

PVA stands for Podcast Visibility Analytics and monitors the visibility of the different shows and episodes across the listening apps. This is a key enabling factor for Podcast Visibility Optimization (PVO). Voxalyze is the leader and pioneer in PVA.


Podcast Visibility Optimization (PVO)

PVO stands for Podcast Visibility Optimization. It is also commonly called “SEO for podcasts” because it shares common attributes with SEO (Search Engine Optimization). PVO is the process of improving the visibility of a given podcast on audio listening platforms such as Apple Podcasts, Spotify, and Google Podcasts. The more visible a podcast is, the more new listeners it will attract. PVO is done via a variety of levers, as explained in this framework.

PVO Framework by Voxalyze: https://www.voxalyze.com/podcast-visibility-optimization-stack-2021/

Room Tone

The sound of an empty room when no dialogue is happening. Even in a silent room, the mic may still pick up some noises, so it is better to test the room tone before you start recording. This could help reduce the noise.

Image from The Criterion Collection.

RSS Feed

RSS stands for Really Simple Syndication. Generally, an RSS Feed is a file that summarizes the updates from a website, with links to a list of articles. Simply put, it means that your audience can access the content outside of your website. In the podcast world, an RSS feed contains information about your show and episodes and passes the information to listening platforms, like Apple Podcasts or Spotify.
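To make this concrete, the sketch below reads the show title and episode audio links from a tiny, made-up feed using Python's standard library. The feed content, show name, and URLs are hypothetical; real podcast feeds carry many more fields, including platform-specific tags.

```python
import xml.etree.ElementTree as ET

# A minimal, invented RSS 2.0 feed with two episodes.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <description>A hypothetical podcast used for illustration.</description>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" length="21600000" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" length="18000000" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)
channel = root.find("channel")
show_title = channel.findtext("title")

# Each <item> is one episode; the <enclosure> tag points to the audio file.
episodes = [
    (item.findtext("title"), item.find("enclosure").get("url"))
    for item in channel.findall("item")
]

print(show_title)
for title, url in episodes:
    print(title, url)
```

Listening platforms do essentially the same thing at scale: they fetch the feed regularly and pick up any new `<item>` entries as new episodes.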


Stereo

As opposed to mono sound, stereo creates a natural, lifelike effect by using multiple channels to convert signals into sound, so that the sound seems to come from various directions.


A piece of music used as an intro, end, or linkage of different sections to the podcast, usually no longer than 5 seconds.


Timeshifting

The act of watching, reading, or listening to content at a later date. Podcasting is an example of timeshifting because it is not live like broadcasting. Instead, you record, edit, and produce the audio, and then upload it for the listeners.


Waveform

A graph that shows how the amplitude of recorded audio changes over time.

Default waveform view. Image from Audacity.

WAV File

Waveform Audio File Format. It was developed by IBM and Microsoft and is the main format for storing uncompressed audio on Windows systems.