Is relative phase audible?

Every continuous tone can be represented as the sum of a bunch of sine waves, with various frequencies, relative volumes, and relative phase. I’ve often heard it asserted by knowledgeable people that we humans cannot hear relative phase relationships – that is, that two tones consisting of sine waves of the same frequencies and relative volumes as each other, but with different phase relationships, sound the same. This is based in part on early experiments by Helmholtz, pioneer of much of what we know about the harmonic structure of sounds. The rationale is that the way we hear has to do with tiny “hair cells” in the inner ear, each tuned to a particular frequency, that send out neural signals in proportion to the amount of energy at that frequency. The brain gets a moment-to-moment graph of energy versus frequency, with all the phase info stripped out, so there is simply no way for us to detect phase relationships.

However, just as often I’ve heard it asserted by audio professionals that relative phase is important. We hear about filters (like graphic EQs) introducing “phase problems”, or about time-alignment of loudspeaker drivers. If either of these is a real problem, it must be because we can hear relative phase. For instance, most filters alter the phase of signals at frequencies near the cutoff frequency: so, if you take a complex signal with a lot of harmonics and pass it through a filter, you’ll end up changing the phase relationships of the parts of the signal that make it through, even though the volume and frequency relationships haven’t changed. Is that a problem?
It turns out this question is surprisingly easy to answer these days, even though it was hard for Helmholtz. I don’t know whether everyone can hear relative phase, but I sure can.

The experiment
I started by making some raw materials: individual sine waves. I used SoundForge 6.0 to generate 15 seconds of 220Hz sine wave, at -12dBFS. (We’re going to be adding more things to it, so if we started at 0dBFS we’d get clipping, and that would skew the results.) Same thing for 440Hz (the second harmonic), 660Hz (the third), and 880Hz (the fourth). Then I combined these to make my stimuli. First, I copied the 220Hz sine into a new file, and mixed in (using the Paste Special command) the 440Hz sine at equal volume, the 660Hz sine at -6dB, and the 880Hz sine at -12dB. I saved that as my “in phase” sample. Then, I again copied the 220Hz sine into yet another new file, and mixed in the other sines at the same proportions; however, for each of the other sines, I inverted the signal as I mixed it in. (This is an option in the Mix dialog in SoundForge, but if it hadn’t been I could have just modified the raw materials.) I saved this as my “out of phase” sample.
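(If you don’t have SoundForge handy, the same stimuli are easy to synthesize directly. Here’s a minimal sketch in Python with numpy and scipy; the frequencies, mix levels, and the inversion of the upper harmonics follow the recipe above, while the sample rate and file names are just my choices.)

```python
import numpy as np
from scipy.io import wavfile

SR = 44100                     # sample rate, Hz
DUR = 15.0                     # seconds, matching the original files
t = np.arange(int(SR * DUR)) / SR

def sine(freq, db):
    """Sine at `freq` Hz, `db` dB relative to the fundamental."""
    return 10 ** (db / 20.0) * np.sin(2 * np.pi * freq * t)

AMP = 10 ** (-12 / 20.0)       # fundamental at -12dBFS, leaving headroom

def stimulus(invert_upper):
    """220Hz fundamental plus 2nd/3rd/4th harmonics at 0/-6/-12dB."""
    sign = -1.0 if invert_upper else 1.0
    out = sine(220, 0.0)
    for freq, db in [(440, 0.0), (660, -6.0), (880, -12.0)]:
        out += sign * sine(freq, db)   # inverting = mixing in with flipped sign
    return AMP * out                   # peak sum is at most ~0.69, so no clipping

wavfile.write("in_phase.wav", SR, (stimulus(False) * 32767).astype(np.int16))
wavfile.write("out_of_phase.wav", SR, (stimulus(True) * 32767).astype(np.int16))
```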

I should pause here to point out that what I’m talking about has very little to do with “in phase” and “out of phase” in the sense of what happens when you wire up one speaker or one mic connector backwards (better described as “polarity” than “phase”); nor does it have much to do with phase relationships between microphones, like when you have a close and a far mic on something and you get comb filtering because of the time lag between them.

Anyway, the question is, do these two samples sound the same or do they sound different? The only difference is the phase relationships: there’s exactly the same amount of signal energy at exactly the same component frequencies. If you plotted it on a spectrum analyzer the two signals would be identical. Do ears behave like spectrum analyzers, or are they fancier than that?
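(In fact, you can check the spectrum-analyzer claim numerically. Reusing the stimulus() sketch above, the magnitude spectra of the two signals should agree to within rounding error, even though the waveforms themselves are plainly different:)

```python
a, b = stimulus(False), stimulus(True)
# Same energy in every frequency bin; only the phases differ.
print(np.allclose(np.abs(np.fft.rfft(a)), np.abs(np.fft.rfft(b)), atol=1e-6))  # True
print(np.allclose(a, b))  # False: the sample values are very different
```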

You can try for yourself. Here’s a 3.3-second excerpt of the in-phase wave, and here’s the same length of the out-of-phase one. They’re in .wav format rather than .mp3, because I don’t trust an MP3 to preserve relative phase – after all, the psychoacousticians say you can’t hear it, so a compression algorithm shouldn’t need to preserve it! (I don’t know, maybe they preserve it anyway; I just don’t trust it.) Play them and listen! Can you hear the difference? My guess is that some people can and some can’t.

Blind trials
But, it’s easy to fool yourself. If you want to be a bit more systematic about it, you might want to try a kinda-blind test. (This is still not perfect; ideally we’d use the computer to do a truly randomized, truly blind test. But it’s better than nothing.) Find a friend, and get them to flip a coin 10 times and write down the outcomes without telling you. Now, put the friend in front of the computer, and instruct them to play first the in-phase .wav, then the inverted one, and then one or the other depending on whether they got a head or a tail. They say nothing during this. You listen, and decide whether the third “mystery” sample was in phase or inverted, and write it down. Same thing, ten times in a row; friend says nothing, throughout. After all ten, compare your judgements with the coin tosses. If you can’t hear the difference, around half your answers will be wrong. If you got nine or ten right, that’s pretty good evidence that you can hear the difference. Have fun! To me they sound quite different; on casual listening my wife thought they sounded the same, but when she paid attention to them she was able to get them right nearly every time.
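(The “truly randomized, truly blind” version is also easy to script: let the computer be the coin-flipper and scorekeeper. A sketch, assuming the two .wav files generated earlier and some command-line audio player; “aplay” is just a placeholder, so substitute whatever plays a .wav on your system.)

```python
import random
import subprocess

FILES = {"in": "in_phase.wav", "out": "out_of_phase.wav"}
PLAYER = "aplay"   # placeholder: any command-line wav player

score, trials = 0, 10
for i in range(trials):
    answer = random.choice(["in", "out"])
    # Play the labeled reference pair, then the mystery sample.
    for name in ("in", "out", answer):
        subprocess.run([PLAYER, FILES[name]])
    guess = input(f"Trial {i + 1}: was the third sample 'in' or 'out'? ").strip()
    score += (guess == answer)
print(f"{score}/{trials} correct; pure guessing averages about 5/10.")
```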

What next?
The next steps would be to try this with lower and higher frequencies; can I hear relative phase down at 40Hz? If so, I’d better be careful about how I design the highpass filters that block DC and subsonic rumble from getting through my audio gear, because they introduce phase shifts in that region. Can I hear relative phase at 5kHz and above? If so, then maybe those claims about phase-aligned speakers aren’t just marketing voodoo. And then, how small a phase difference can I hear? In this experiment it was the biggest possible difference, 180 degrees out of phase from the fundamental tone. Can I hear 90 degrees out of phase? 30 degrees? Most filters don’t skew things as radically as they were skewed in this experiment, so perhaps the effect is nothing to worry about.
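(For the highpass-filter worry, it’s at least easy to compute how much phase shift there would be to hear. Here’s a sketch of the phase response of a first-order DC-blocking highpass using scipy; the 20Hz corner is just an assumed typical value.)

```python
import numpy as np
from scipy import signal

fc = 20.0                                   # assumed corner frequency, Hz
b, a = [1.0, 0.0], [1.0, 2 * np.pi * fc]    # H(s) = s / (s + wc): 1st-order highpass
test_hz = np.array([20.0, 40.0, 80.0, 160.0])
w, h = signal.freqs(b, a, worN=2 * np.pi * test_hz)
for f, ph in zip(test_hz, np.degrees(np.angle(h))):
    print(f"{f:6.0f} Hz: {ph:5.1f} degrees of phase lead")
# A 40Hz fundamental picks up roughly 27 degrees of lead relative to
# its (much less shifted) upper harmonics.
```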

And hey, maybe my methodology is wrong. If anyone sees a flaw in the way I did the experiment, let me know! I can think of one possible issue, which is that because of the way the harmonics stack up, slew rate distortion could be more of a problem in one sample than the other; I don’t think it is, though, because I’m playing this at low volumes on a system with an M-Audio Delta 2496 sound card through Mackie HR824’s, so I’m well within the limits of the system.

Some thoughts on music cables

This started as a response to a question on the email forum “The Bottom Line.” When it looked too long I decided to post it as a web page instead. Still, please be aware that there’s a lot that can be said about cables, and this is just an overgrown email message.

What makes cables different?
There are two kinds of cable differences that can be discussed: the kind that can be measured and is known to make a difference in sound according to commonly accepted electrical theory, and the kind that can’t be measured or isn’t known to matter. Whether the second kind (oxygen-free copper, cryogenics, blah blah) makes a difference, I won’t discuss here. Let’s stick with the objectively measurable differences for this article.

Wires differ in:
• the size of the conductors;
• the number of conductors;
• the spacing between the conductors;
• the “dielectric” material that insulates between the conductors;
• whether there is a shield and how it is constructed;
• the outer insulating jacket.

These differences in turn cause differences in their electrical characteristics:
• how much power they waste as heat;
• how much capacitance there is between the conductors;
• how much noise is generated by movement of static charges in the insulator when you wiggle the wire (this is called “triboelectricity”);
• how much external interference the shield can reject;
• whether the wire can carry a balanced signal (which needs two identical conductors plus, typically, a shield).

In addition, instrument cables have connectors on either end, which may differ in various things like mechanical strength, precise size, contact material, means whereby the wire is connected to the connectors, and means whereby the strain between the connector and wire is relieved. These don’t affect the sound in principle, but they affect how soon the cable fails, which affects the sound in the real world.

The differences in wire are very important, because mics, instruments, and amps all produce different sorts of signal. The relevant aspects of the signal are:
• how much power is being conveyed
• the impedance of the devices at either end
• whether the signal is balanced or unbalanced with respect to ground

I’m not going to explain what those mean; this is already getting long enough!
The job of a cable is to carry a signal between two points, without letting in any interference along the way and without modifying the signal at all. No cable does this perfectly, so there are tradeoffs to be made.

What’s what?
Let’s talk about some specific kinds of cable:

1. Speaker cable.
Speaker signals are high power, low impedance, unbalanced. Because they’re high power, interference and hum are very small by comparison, so the wires can be unshielded, and triboelectricity is not a problem. However, speaker wires are often run for long distances, so capacitance can be a problem, because it can cause the amplifier to oscillate at ultrasonic frequencies and destroy either itself or the speaker.

That, by the way, is why you should never use instrument cable for your speakers: its typically higher capacitance means you are likely to cause problems. Even if you don’t actually damage something, instability will change your sound and can decrease the amount of usable power. Some amps are less prone to this than others.

So, the ideal speaker cable has big, low-resistance conductors, with low capacitance between them. Shields are unnecessary.

2. Mic cable.
A mic signal is very low power, low impedance, balanced. It is quite susceptible to external interference, which is why it’s balanced: the idea is that the same interference will affect both signal conductors identically, and the mic preamp then subtracts one signal from the other, hopefully eliminating the interference.

Because the source and load are low impedance (and mics aren’t as susceptible to instability as amps are), capacitance is not a big problem; but anything you can do to help balance the signal will reduce noise. So some mic cables use “Star Quad” wiring where there are actually four, rather than two, signal conductors; they are intricately braided together and then paired up at the ends so that they behave like two conductors that are very close together physically. Star Quad increases the capacitance, but it reduces noise.

So, the ideal mic cable has at least two and maybe four signal conductors, plus a shield. Capacitance and triboelectricity are not problems.

3. Instrument cable.
An instrument signal is low power (but more than a mic), high-impedance, unbalanced. Like a mic signal it is susceptible to external interference, but different sorts: it is more sensitive to things like fluorescent lights and neon signs, less sensitive to motors and power lines.

Because the source and load are high impedance, any capacitance in the cable creates a low-pass filter. That is, it reduces the high frequencies in the signal. For that reason, you want low capacitance. But also, the high impedance means that triboelectric signals, which would get drained away by low impedance, can be a problem: when you wiggle the instrument cable you can hear noise from your amp. To deal with this, manufacturers add layers of intermediate insulators that are actually somewhat conductive.
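(You can put rough numbers on that: the pickup’s source impedance and the cable’s capacitance form a one-pole RC lowpass with a corner at fc = 1/(2πRC). A sketch with assumed ballpark figures; real pickup impedance varies a lot with frequency and with the volume-pot setting, so treat these as illustrative only.)

```python
import math

R = 100e3             # assumed passive-pickup source impedance, ohms
PF_PER_FOOT = 30.0    # assumed cable capacitance, pF per foot

for feet in (10, 20, 40):
    C = PF_PER_FOOT * feet * 1e-12          # total capacitance, farads
    fc = 1 / (2 * math.pi * R * C)          # -3dB corner of the RC lowpass
    print(f"{feet:3d} ft ({PF_PER_FOOT * feet:4.0f} pF): -3dB at {fc / 1000:4.1f} kHz")
# ~5.3 kHz at 10 ft, ~1.3 kHz at 40 ft: a long, high-capacitance cable
# audibly dulls the treble of a high-impedance source.
```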

Instrument cables are typically connected to musicians, who move around on stage. So they need to be fairly lightweight, which tends to mean physically smaller wires, which increases the capacitance. In principle you could make an instrument cable as beefy and as low-capacitance as a speaker cable, in which case they would be interchangeable (if it were shielded), but who would want that?

There is one problem with using the shield as a signal conductor rather than just as a shield, which is that shields sometimes get frayed or broken over time. To get around this, some instrument cables use two signal conductors, surrounded by a shield (just like a mic cable), with one of the signal conductors connected to the shield. (This is also claimed to produce some benefit with regard to interference, but that is only true in the presence of somewhat unusual kinds of interference, I believe. I have not seen a benefit.) The problem with doing this is that it increases the capacitance, as well as making the cable stiffer physically.

So, the ideal instrument cable has one signal conductor, surrounded by a shield, and is fairly low capacitance. Triboelectric shielding is useful.

The nature of electric bass signals

In designing various electric bass signal processors (some of them on these pages), I’ve had to learn a lot about the characteristics of an electric bass signal. They’re different than I initially expected. Here’s some of what I’ve found:

Signal Strength
To measure peak strength, I played the basses through a single JFET opamp stage with 10M input impedance and unity gain, powered by +/-15V supplies, into an opamp-based peak detector. The following numbers should be a fairly good representation of true peaks; these are much higher than the RMS signal strength because of the unusual waveform (see below).

My Keith Roscoe five-string, with active 18V Bartolini electronics, puts out as much as 6V peak when I bang on all the strings with both hands as loud as I possibly can. I imagine it would put out more if it had bigger batteries! In realistic but aggressive playing, slapping and plucking as hard as I can, it puts out peaks of 2V at most. In more normal (but still aggressive) playing, peaks are about 1V. That’s with the tone controls flat; boosting the tone controls boosts the peaks correspondingly (the Bartolini EQ is +/-18dB).

My Fender acoustic/electric bass, with Lane Poor P-style pickups connected directly to the output jack and no electronics at all (no tone or volume control, no preamp) puts out somewhat less: peaks are 2V making the most noise I can without breaking the instrument, 1V playing aggressively, and about 0.7V peaks playing normally. (In the past I’ve measured the output of the piezo pickup in this bass, and into a 10M impedance it’s comparable to the Keith Roscoe.)

In both instruments, negative peaks tended to be slightly higher than positive. I imagine this could be because I pluck in one direction, but I’m not sure.

Note that the signal from even the passive pickups is almost strong enough to drive a power amp directly: what a preamp needs to do is provide impedance matching, tonal coloration, and only a small amount of voltage gain. It’s different than an electric guitar and very different than a microphone.

Waveform
To measure waveforms, I again played the basses through a unity gain opamp stage with high input impedance and no filtering. This time I observed the results on an oscilloscope. I also recorded some of them to a DAT deck and then transferred them to my computer to generate the WAV files and images below and to do Fourier analysis.

Even when playing fairly gently, the waveform has extremely high harmonic content. The wave shape is generally somewhere between a sawtooth and an almost heartbeat-like shape. Here’s an excerpt of plucking my E string (and here’s the WAV file I excerpted it from):
[Image: waveform of the plucked E string]

I expected plucking a harmonic to give a more pure sine wave. Surprisingly, it wasn’t much different. Here’s an excerpt of a harmonic, and again the WAV file it’s from:
[Image: waveform of the plucked harmonic]

Spectrum analysis
As the waveforms above make obvious, this is far from a pure sine wave. In fact, upper harmonics dominate; here’s the Fourier transform of the plucked E string:

[Image: Fourier transform of the plucked E string]

Note that the second and third harmonics (80Hz and 120Hz) are both slightly higher than the fundamental, the fourth is about the same, and even by the 11th we’re within 18dB of the fundamental. There are harmonics well above the noise floor up to about 1kHz, around the 25th harmonic, and that’s on the unfretted low E string with a fairly smooth plucked sound. A slap or pop would probably have much more high frequency content, although much of it would not be harmonic in nature.
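(If you want to run the same analysis on your own instrument, here’s a minimal sketch of the Fourier step in Python with numpy and scipy. “estring.wav” stands in for whatever pluck you record; the script assumes a mono file and a low E fundamental.)

```python
import numpy as np
from scipy.io import wavfile

sr, x = wavfile.read("estring.wav")        # hypothetical mono recording of one pluck
x = x.astype(float)
spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
freqs = np.fft.rfftfreq(len(x), 1.0 / sr)

def peak(f):
    """Largest spectral magnitude within 3% of frequency f."""
    lo, hi = np.searchsorted(freqs, [f * 0.97, f * 1.03])
    return spectrum[lo:hi].max()

f0 = 41.2                                  # low E fundamental, Hz
ref = peak(f0)
for n in range(1, 12):
    db = 20 * np.log10(peak(n * f0) / ref)
    print(f"harmonic {n:2d} ({n * f0:6.1f} Hz): {db:+6.1f} dB re fundamental")
```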

What’s the significance of that? Partly, it’s just a demonstration that I’m using active pickups, stainless steel strings, and a very good bass. An old Fender P-Bass would have much less high frequency content (for better or worse depending on one’s purpose – no holy wars here, please).

The data lead me to conclude that the easiest way to alter the sound of a bass is by tonal shaping rather than by distortion. Because the waveform is basically a pulse followed by a period of silence, clipping that pulse doesn’t have much effect on the harmonic content: the signal is in a sense already highly distorted. Put differently, what distortion does is add upper harmonics, and those harmonics appear to already be there in spades, just waiting to be colored by filtering. Wah effects, envelope followers, and comb filters (aka flangers and phasers) make sense; distortion doesn’t, based on this one datum. More research is clearly called for.

Another conclusion one could draw is that the sound of an un-EQ’d bass should be more appropriate in some forms of music than others. The harmonics correspond to other musical notes, with the higher harmonics tending to be more dissonant notes. For instance, suppose I plucked that E as the root of an Em chord. The guitarist is also playing an Em barre chord, two octaves up. My fourth harmonic is his root; good. But my fifth harmonic is approximately a G#, in the same octave as the guitarist’s G natural. Depending on what I’m playing, I might want to roll off my highs to avoid that conflict.
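(That’s just harmonic-series arithmetic; here’s a tiny sketch that maps each harmonic of the low E to the nearest equal-tempered note, using the standard 41.2Hz bass low E as the fundamental.)

```python
import math

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
f0 = 41.2                                   # low E on a bass, Hz

for n in range(1, 9):
    f = n * f0
    midi = 69 + 12 * math.log2(f / 440.0)   # distance from A440 in semitones
    nearest = round(midi)
    cents = 100 * (midi - nearest)          # offset from the tempered pitch
    name = NAMES[nearest % 12] + str(nearest // 12 - 1)
    print(f"harmonic {n}: {f:6.1f} Hz -> {name:>3} ({cents:+5.1f} cents)")
# The 4th harmonic lands on E; the 5th comes out about 14 cents flat of G#.
```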

Still to come: Transients
It seems that a lot of the liveness of the sound may come from transient response. Here’s the envelope of the harmonic I was plucking earlier, and a few neighbors. Note the big initial spikes:

[Image: envelope of the harmonic plucks, showing the initial spikes]

In this WAV file I’ve taken one of these harmonic plucks, duplicated it and digitally edited that initial spike in the second copy only, to reduce the volume to the same level as the following peaks. So, what you hear is the harmonic as played, followed by the modified version. To my ears, the modified version is smoother, more processed sounding, and less lively. I’d suggest setting up your media player to autorepeat and listen for a while, paying attention to the initial attack.

The peak is only a single half of a cycle; the actual attack, from zero to top of the peak, lasts 1msec. I conclude that transient response – the ability of an amplification system to go from zero to peak in very short time – is important in capturing the liveness of the bass sound. If you want that liveness, you probably want to make sure your gear is able to deal with those transients, and if you use a compressor you want to set the initial attack time slow enough not to affect them. Or the opposite, if you want the smoothness of the modified wave.
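(As a back-of-the-envelope check on how demanding that attack is, here’s the usual rise-time arithmetic. The 2V figure comes from the peak measurements earlier; the BW ≈ 0.35/tr relation assumes a simple single-pole response.)

```python
t_rise = 1e-3      # attack, zero to peak, seconds (from the envelope above)
v_peak = 2.0       # aggressive-playing peak, volts (from the measurements above)

bandwidth = 0.35 / t_rise        # single-pole rise-time relation
slew = v_peak / t_rise           # average slew during the attack
print(f"bandwidth needed: ~{bandwidth:.0f} Hz")           # ~350 Hz
print(f"slew rate needed: ~{slew:.0f} V/s = {slew / 1e6:.4f} V/us")
```

At instrument level, at least, those raw numbers look modest; the interesting question is what happens downstream, where the same waveform may be swinging tens of volts at a power amp’s output.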

So, is reproducing those transients really a hard thing? How does commercially available gear fare at it? What characteristics of an amp or preamp design are important to preserve transient response? More on that later.