<title>Author Summary</title> <p>The sound signal of speech is rich in temporal and frequency patterns. These fluctuations of power in time and frequency are called modulations. Despite their acoustic complexity, spoken words remain intelligible after drastic degradations in either time or frequency. To fully understand the perception of speech and to be able to reduce speech to its most essential components, we need to completely characterize how modulations in amplitude and frequency contribute together to the comprehensibility of speech. Hallmark research distorted speech in either time or frequency but described the arbitrary manipulations in terms limited to one domain or the other, without quantifying the remaining and missing portions of the signal. Here, we use a novel sound filtering technique to systematically investigate the joint features in time and frequency that are crucial for understanding speech. Both the modulation-filtering approach and the resulting characterization of speech have the potential to change the way that speech is compressed in audio engineering and how it is processed in medical applications such as cochlear implants.</p>