Are You on the Road to Audio Hell?

The quiz – We Audiophiles are always trying to sharpen our skills at evaluating audio components. However, the very methods we use can result in precisely the opposite of the effect desired, namely boredom or frustration with our audio system before we have even paid for it; in other words, AUDIO HELL. Take the following short quiz to help determine if you have traveled this road lately.

1. Do you try to arrange instantaneous A/B comparisons of brief segments of music to maximize your memory retention?
2. Do you bring the same group of “reference” test recordings to each audition in an effort to sort out specific performance capabilities and to prevent any disorientation or confusion which could result from using music with which you are unfamiliar?
3. Do you avoid using music of which you are particularly fond so that you can properly attend to objective analysis rather than be distracted by the music’s pleasures and passions?
4. Do you believe that the true function of an audio system is to recreate music; and that therefore you can only accurately evaluate audio playback if you have an extensive knowledge of live music performance?
5. Do you believe that if your evaluation addresses such matters as frequency range, signal/noise ratio, stage size and depth, instrumental separation and balance, timbre, and textual clarity that whatever other purely musical considerations there may be will take care of themselves?
6. Has it been your experience that some speakers are especially suitable for rock, others for classical and perhaps others for intimate jazz? How do you explain this phenomenon? Is this more or less inevitable?
7. When you ask yourself: “What should be the correct reference, live music or the recorded session?” do you conclude that it is one or the other? Are you comfortable with your answer to this question?

If you have answered “yes” to at least three of these questions, you can feel comfortable knowing that, like many other audiophiles, you are on the train to AUDIO HELL. If you answered “yes” to most of them, you may be beyond redemption; but we are here to help, and there is always hope. If you answered “yes” to question #3, you probably require the services of an audio exorcist; for if the purpose of your music playback system isn’t to involve you emotionally, then why aren’t you shopping at Sears? Before we take a more critical look at the implications of this quiz and your answers, it might be useful to review the past few years to see how we got into this mess in the first place.

A Brief History

As the audio industry grew out of its infancy in the 1950’s and began to aspire to commercialism in the 1960’s, an evaluation and review procedure was adopted which initially attempted to mate the measured superiority of the developing technologies with the goal of better sound quality. It appeared that a conspiracy of purpose was entered into by the press and many companies in the industry based on the thesis that technical perfection – as demonstrated by measurements of particular specifications assumed to be relevant as well as correctly obtained – also led to sonic perfection. This thesis had the advantage that winners in the performance race could easily be decided by the evidence of such measurements. Such “proof” made possible facile marketing strategies which have persisted to the present despite overwhelming evidence to the contrary provided by our own ears in the most casual of listening auditions. By the mid-1970’s the development of this thesis had reached a stage with audio components where technical specifications were making further improvements practically impossible. The race for lower distortion, faster slew rates, better damping factors, wider bandwidths and more power had caught up with itself and ground to a halt.

At about this point, a number of smaller publications appeared which abandoned this thesis of measured performance (a kind of technical perfection) in favour of a more subjective approach in which listening to music through the components was considered the more useful tool, and its approximation to “live music” the most sought-after criterion. The editorial position of some of these new “underground” magazines considered measurements as irrelevant or even damaging to the evaluation process, observing that audio components which measure the same can sound strikingly different. The result was that the method of auditioning equipment became more complicated; magazine reviewers spent hours listening to and comparing different components in order to decide which sounded best. Out of this history was born the “Golden Ear”, to whose judgement many consumers entrusted their available income. Every month a new product would appear which was hailed as the “best sound”, and frequently the opinions of different magazine experts varied widely. Consumers might then choose an expert that they trusted, or become increasingly confused, or give up altogether, returning to the safer criteria of measurements.

By the mid-1980’s the merry-go-round had reached such a pace that most manufacturers resorted to placing their efforts in the tried and true marketplace of seductive advertising slogans and images, and hi-tech cosmetics and gadgetry. It had become too difficult to compete otherwise. The rule was that if the component and its advertised image looked expensive, then it must sound good as well. (Not least of the distractions the audio community has suffered was the switch from analog to digital, which led to such manifestly preposterous notions as “digital ready” speakers and amplifiers, as well as a nearly successful campaign to re-write the definition – as well as the experience – of the term “dynamic”).

As far as we know, there has been no rigorous critique of the critical methodology long in place, a method which we believe has contributed to the audio hell in which most of us find ourselves. None of the methods now in favour – measurements and specifications, blind tests, double-blind tests, boogie factors, or comparisons to “real” music – has been definitive. Nor has there been a serious alternative offered which categorically presents an orderly, reasonably conclusive methodology by which we can evaluate our components and playback systems. This is exactly what we propose in this essay.

We believe that the basic reason why so many consumers are in AUDIO HELL, or on their way, is that they are confused about what should be the objective of their audio system, and therefore have adopted a method for the evaluation of audio components which often turns out to be counter-productive. If you agree that the goal of your audio system should be to involve you emotionally, physiologically and intellectually with a musical performance, then we would like to suggest the following description for its objective:

An ideal audio system should recreate an exact acoustical analog of the recorded program.

If so, then it would be very useful if we had meaningful knowledge of exactly what is encoded on our recordings. Unfortunately, such is not possible. (This assertion may appear casually stated, but on its truth much of the following argument depends; we therefore invite the closest possible scrutiny.) Even if we were present at every recording session, we would have no way of interpreting the electrical information which feeds through the microphones to the master tape – let alone to the resulting CD or LP – into a sensory experience against which we could evaluate a given audio system. Even if we were present at playback sessions through the engineer’s monitoring (read: “presumed reference”) system, we would be unable to transfer that experience to any other system evaluation. And even if we could hold the impression of that monitoring experience in our minds and account for venue variables, such knowledge would turn out to be irrelevant in determining system or component accuracy since the monitoring equipment could not have been accurate in the first place. (More about this shortly.) But if this is true, how can we properly evaluate the relative accuracy of any playback system or component?

The old method: comparison by reference

We should begin by examining the method in current favour. The usual procedure is to use one or more favoured recordings, play slices of them on two different systems (or on the same system alternating two components, which amounts to the same thing), and then decide which system (or component) you like better, or which one more closely matches your belief about some internalized reference, or which one “tells you more” about the music on the recording. It won’t work! … not even if you use a dozen recordings of presumed pedigree … not even if you compare the stage size, frequency range, transient response, tonal correctness, instrument placement, clarity of text, etc. – not even if you compare your memory of your emotional response with one system to that of another – it makes little difference. The practical result will be the same: What you will learn is which system (or component) more closely matches your prejudice about the way a given recording ought to sound. And since neither the recordings nor the components we use are accurate to begin with, then this method cannot tell us which system is more accurate! It is methodological treason to evaluate something for accuracy against a reference with tools which are inaccurate – not least of which is our memory of acoustical data. Therefore, it is very likely to the point of certainty that a positive response to a system using this method is the result of a pleasing complementarity between recording, playback system, experience, memory, and expectation; all of which is very unlikely to be duplicated due to the extraordinarily wide variation which exists in recording method and manufacture. (Ask yourself, when you come across a component or system which plays many of your “reference” recordings well, if it also plays all your recordings well. The answer is probably “no;” and the explanation we usually offer puts the blame on the other recordings, not the playback system. And, no, we’re not going to argue that all recordings are good; but that all recordings are much better than you have let yourself believe).

Recognising that many will consider these statements audiophile heresy, we urge you to keep in mind our mutual objective: to prevent boredom and frustration, and to keep our interest in upgrading our playback system enjoyable and on track. To this end it becomes necessary that we lay aside our need to have verified in our methodology beliefs about the way our recordings and playback systems ought to sound. As we shall see, marriage to such beliefs practically guarantees us passage to AUDIO HELL. It is our contention that, while nothing in the recording or playback chain is accurate, accuracy is the only worthwhile objective; for when playback is as accurate as possible, the chance for maximum recovery of the recorded program is greatest; and when we have as much of that recording to hand – or to ear – then we have the greatest chance for an intimate experience with the recorded performance. It only remains to describe a methodology which improves that likelihood. (This follows shortly).

Listeners claiming an inside track by virtue of having attended the recording session are really responding to other, perhaps unconscious, clues when they report significant similarities between recording session and playback. As previously asserted, no-one can possibly know in any meaningful way what is on the master tape or the resulting software, even if they auditioned the playback through the engineer’s “reference” monitoring system. Anyone who thinks that there exists some “reference” playback system that sounds just like the live event simply isn’t paying attention; or at best doesn’t understand how magic works. After all, if it weren’t for the power of suggestion, hi-fi would have been denounced decades ago as a fraud. Remember those experiments put on by various hi-fi promoters in the fifties in which most of the audience “thought” they were listening to a live performance until the drawing of the curtain revealed the Wizard up to his usual tricks. The truth is the audience “thought” no such thing; they merely went along for the ride without giving what they were hearing any critical thought at all. It is the nature of our psychology to believe what we see and to “hear” what we expect to hear. Only cynics and paranoids point out fallibility when everyone else is having a good time.

Another relevant misunderstanding involves the correct function of “monitoring equipment”. The purpose of such equipment is to get an idea of how whatever is being recorded will play back on a known system and then to make adjustments in recording procedure. It should never be understood by either the recording producer or the buyer that the monitoring system is either definitive or accurate, even though the engineer makes all sorts of placement and equipment decisions based on what their monitoring playback reveals. They have to use something, after all; and the best recording companies go to great lengths to make use of monitoring equipment that tells them as much as possible about what they are doing. But no matter what monitoring components are used, they can never be the last word on the subject; and it is entirely possible to achieve more realistic results with a totally different playback system, for example, a more accurate one. Notice “more accurate,” not “accurate.” It bears repeating that there is no such thing as an accurate system, nor an accurate component, nor an accurate recording. Yet as axiomatic as any audiophile believes these assertions to be, they are instantly forgotten the moment we begin a critical audition.

The proposed method: Comparison by contrast.

When auditioning only two playback systems using the usual method, we will have at least a 50% chance of choosing the one which is more accurate. However, evaluations of single components willy-nilly test the entire playback chain; therefore efforts to choose the more accurate component are compounded by the likelihood that we will be equally uncertain as to the accuracy of each of the system’s associated components, if for no other reason than that they were chosen by a method which only guarantees prejudice. How can we have any confidence, having chosen one component by such a method, that its presence in the system won’t mislead us when evaluating other components in the playback chain, present or future?

The way to sort out which system or component is more accurate is to invert the test. Instead of comparing a handful of recordings – presumed to be definitive – on two different systems to determine which one coincides with our present feeling about the way that music ought to sound, play a larger number of recordings of vastly different styles and recording techniques on two different systems to hear which system reveals more differences between the recordings. This is a procedure which anyone with ears can make use of, but it requires letting go of some of our favoured practices and prejudices.

In more detail, it would go something like this: Line up about two dozen recordings of different kinds of music – pop vocal, orchestral, jazz, chamber music, folk, rock, opera, piano – music you like, but recordings with which you are unfamiliar. (It is very important to avoid your favourite “test” recordings, presuming that they will tell you what you need to know about some performance parameter or other, because doing so will likely only serve to confirm or deny an expectation based on prior “performances” you have heard on other systems or components. More later.) First with one system and then the other, play through complete numbers from all of these in one sitting. (The two systems may be entirely different or have only one variable such as cables, amplifier, or speaker).

The more accurate system is the one which reproduces more differences – more contrast between the various program sources.

To suggest a simplified example, imagine a 1940’s wind-up phonograph playing recordings of Al Jolson singing “Swanee” and The Philadelphia Orchestra playing Beethoven. The playback from these recordings will sound more alike than LP versions of these very recordings played back through a reasonably good modern audio system. Correct? What we’re after is a playback system which maximizes those differences. Some orchestral recordings, for example, will present stages beyond the confines of the speaker borders, others tend to gather between the speakers; some will seem to articulate instruments in space; others present them in a mass as if perceived from a balcony; some will present the winds recessed deep into the orchestra; others up front; some will overwhelm us with a bass drum of tremendous power; others barely distinguish between the character of timpani and bass drum. In respect to our critical evaluation process, it is of absolutely no consequence that these differences may have resulted from performing style or recording methodology and manufacture, or that they may have completely misrepresented the actual live event. Therefore, when comparing two speaker systems, it would be a mistake to assume that the one which always presents a gigantic stage well beyond the confines of the speakers, for example, is more accurate. You might like – even prefer – what the system does to staging, but the other speaker, because it is realizing differences between recordings, is very likely more accurate; and in respect to all the other variables from recording to recording, may turn out to be more revealing of the performance.

Some pop vocal recordings present us with resonant voices, others dry; some as part of the instrumental texture, others envelop us leaving the accompanying instruments and vocals well in the background; some are nasal, some gravelly, some metallic, others warm. The “Comparison by Reference” method would have us respond positively to that playback system, together with the associated “reference” recording, that achieves a pre-conceived notion of how the vocal is presented and how it sounds in relation to the instruments in regard to such parameters as relative size, shape, level, weight, definition, et al. Over time, we find ourselves preferring a particular presentation of pop vocal (or orchestral balance, or rock thwack, or jazz intimacy, or piano percussiveness – you name it) and infer a correctness when approximated by certain recordings. We then compound our mistake by raising these recordings to reference status (pace Prof. Johnson), and then seek this “correct” presentation from every system we later evaluate; and if it isn’t there, we are likely to dismiss that system as incorrect. The problem is that since neither recording nor playback system was accurate to begin with, the expectation that later systems should comply is dangerous. In fact, if their presentations are consistently similar, then they must be inaccurate by definition simply because either by default or intention no two recordings are exactly similar. And while there are other important criteria which any satisfactory audio component or system must satisfy – absence of fatigue being one of the most essential – very little is not subsumed by the new method of comparison offered here.

The Hell of Conformity

The methodology of Comparison by Reference will necessarily result in an audio system which imbues a sameness, a sonic signature of sorts, that ultimately leads to the boredom which illuminates AUDIO HELL. The explanation for this lies in the fact that there are qualitative differences from recording to recording – regardless of the style of music – which have the potential to be realized or not, depending on the capability of the playback system. (This is one of the undisputed areas where the superiority of LP to CD is evident, in that there is an unmeasurable, but clearly audible, sameness – a sonic conformity of sorts – from CD to CD which does not persist to a similar degree with LP).

A significant part of the attraction to CD is its conformity to an amusical sense of perfection and repeatability: no mistakes in performance and a combined recording and playback “noise” lower than the ambient noise existing in any acoustical environment where real music is enjoyed. (This should not be taken as a “sour grapes” apology for LP surface noise.) We all know listeners whose entire attention in the audio system evaluation is directed to the presence of noise or the need for absolute sameness from playback to playback rather than on the playback of music. Their common complaint is “this recording didn’t sound that way the last time I heard it.” Have you ever considered that the search for perfection and the need for conformity are head and tail of the same coin, doubtless minted in the worst part of our human character? It remains only for us to be aware of how these “virtues” operate on us, how we are used by them, and in turn make ourselves into something that much less human. (Star Trek has been addressing these issues since the First Generation.) Perhaps civilization’s greatest enemy is not war, disease, or stress, after all; it’s boredom! This is why we must take the time from our daily routines to relax and reinvigorate ourselves by listening (for those of us not talented enough to play) to music. For this to happen effectively, the playback equipment must ensure the individuality of each recording. Otherwise, boredom – a very close relation to conformity and a direct descendant of colourised, sanitized, sound – will result. This stuff is as subtle as it is insidious; it will always be there for us to grapple with; and we must or we will end up like the tranquilizing acoustic wallpaper much of our music is rapidly becoming … or worse.

Encouragement Required

Qualitative differences are easily ignored if our methodology and goal are to achieve an identity with a reference; and our habit of listening for similarities with a reference will make for some awkward moments as we set out to sort through matters of contrast. The latter requires a much broader attention span and invites every conceivable intellectual and emotional connection we can make with not just one or two recordings but many, and not just with their analogous counterparts in genre but with a range of wildly different styles, venues, and recording methods.

When our attention is directed to similarities [between that which is under evaluation and another system, or our memory of a live music reference, or of the “best-ever” audio], we naturally focus on vertical (frequency domain) or static (staging) determinants. But the sonic signature of sameness is not only to be found in the frequency domain, which is where we usually think of looking for it and wherein we try to sort out tonal correctness, but in the time domain, where dynamic contrast lives. When our attention is directed to contrasts, we are more likely to focus on musical flow, dynamic resolution, and instrumental and vocal interplay. When we compare for what we take to be tonal correctness using the Comparison by Reference method, we will end up with results not likely to have been on the recording, but rather the effect of the complementarity referred to earlier. When a system is found wanting because it does not uniformly reproduce large stages or warm voices, we will end up with a system which will compromise other aspects of accuracy, for not all recordings are capable in themselves of reproducing large stages or warm voices. When a playback system can reproduce gigantic stages or warm voices from some recordings and flat, constrained stages or cool voices from others, it follows that such a system is not getting in the way of those characteristics.

Using this method of evaluation takes some time, and some getting used to; but then we audiophiles have been known to spend hours sorting out the benefits or damage caused by AC conditioners or isolation devices. More to the point, after the 2 or 3 hours it takes to compare any two components by this method, we will have ruled out one of them, permanently! And if we find that neither is the decisive winner, then we can probably conclude that they are both sufficiently inaccurate as to exclude either from further consideration. In other words, we now have a method by which we can guarantee the correct direction of upgrade toward a more accurate system.

Detail and Resolution

We’d like to briefly examine one of the more interesting misperceptions common to audio critique. Many listeners speak of a playback system’s resolving power in terms of its ability to articulate detail, i.e. previously unnoticed phenomena. However, it is more likely that what these listeners are responding to when they say such-and-such has more “detail” is: unconnected micro-events in the frequency and time domains. (These are events that, if they were properly connected, would have realized the correct presentation of harmonic structure, attack, and legato.) Because these events are of incredibly short duration, because there is absolutely no analog to such events in the natural world, and because they are now being revealed to them by the sheer excellence of their audio, these listeners believe that they are hearing something for the first time, which they are! And largely because of this, they are more easily misled into a belief that what they are hearing is relevant and correct. The matter is aided and abetted by the apparentness of the perception. These “details” are undeniably there; it is only their meaning which has become subverted. The truth is that we only perceive such “detail” from an audio playback system, but never in a live musical performance.

“Resolution” on the other hand is the effect produced when these micro-events are connected … in other words, when the events are so small that detail is unperceivable. When these events are correctly connected, we experience a more accurate sense of a musical performance. This is not unlike the way in which we perceive the difference between video and film. Video would seem to have more detail, more apparent individual visual events; but film obviously has greater resolution. If it weren’t for the fact that detail in video is made up of such large particles as compared to the micro-events which exist in audio, we might not have been misled about the term “detail”, and would have called it by its proper name, which is “grain”. Grain creates the perception of more events, particularly in the treble region, because they are made to stand out from the musical texture in an unnaturally highlighted form. In true high-resolution audio systems, grain disappears and is replaced by a seamless flow of connected musical happenings. [cf. “As Time Goes By” Positive Feedback Magazine, Vol. 4, No. 4-5, Fall ’93].

Development

Returning to our suggested methodology – let’s call it “Comparison by Contrast” – we strongly urge resisting the reflex to compare two systems using a single recording. This may require a few practice sessions comparing collections of recordings until you have been purged of the A/B habit, which tends to foster vertical rather than linear attention to the music. If you listen analytically to brief segments of music, switching back and forth, there is no possible way to get a sense of its flow and purpose in purely musical terms. Music and its performance (which are or ought to be inseparable) are very much about the development of expectations which are subsequently prolonged or denied. It is not possible to respond to this aspect of music as an A/B comparison and it may come as a surprise that an ability to convey this very quality of musical drama is the single most important distinguishing characteristic of audio systems or components.

By using the Comparison by Contrast method of evaluating components, we have in place a reliable procedure for sorting out the rest of the playback chain even in a pre-existing system whose components have not yet been put to the same test. Once you have ruled in a component as being more accurate, it will fall out that some aspect of the sound will be less than completely satisfactory, simply because the more accurate the component, the more revealing it is of the entire playback chain, whose errors become more apparent. The next step is to pick a component of a different function in the system – it is usually easier and more revealing to work from the source – and repeat the Comparison by Contrast method for each component in turn. This includes cables, line conditioners, RF filters, isolation devices, etc., as well as amplifiers, speakers, and source components.

The methodology of Comparison by Reference leaves us without a clue as to how to proceed when the inevitable boredom and frustration resulting from its compromises set in. The Comparison by Contrast method, which also results in compromise as any audio system must, will always offer more hints of a live performance – for this is what is usually recorded – since it has enabled us to get closer to the recording. And as more components are substituted using Comparison by Contrast, the result will always be positive in greater proportion to Comparison by Reference. By the way, a delightful outcome of continuing to advance your system by the Contrast method is that you will not only be required to broaden your supply of hitherto unfamiliar recordings to comply with the method, you will also find that your own library is already replete with recordings whose sonics are much better than you had previously given them credit for. In this way, you will not only become better acquainted with a hitherto back-shelved portion of your collection, you will discover how much more exciting music is immediately available to you; and voilà, AUDIO HEAVEN.

The false prophet which diverts many audiophiles from the road to AUDIO HEAVEN is the notion that their audio system ought to portray each type of music in a certain way regardless of the recording methodology. An accurate playback system plays back the music as it was recorded onto the specific disc or LP being played; it does not re-interpret this information to coincide with some prejudice about the way music ought to sound through an audio system. (This explains why many people think that some speakers are especially suitable for rock and others for classical; if so, both are inaccurate.) To put it another way, you can’t turn a toad into a prince without having turned some rabbits into rats.

Only if your audio system is designed to be as accurate as possible – that is, only if it is dedicated to high contrast reproduction – can it hope to recover the uniqueness of any recorded musical performance. Only then can it possibly achieve for the listener an emotional connection with any and every recording – no matter the instrumental or vocal medium and no matter the message. Boredom and frustration are the inevitable alternatives. Think about it.

Leonard Norwitz
THE AUDIO NOTE CO. (USA)
San Jose, California
January – April 1993

Peter Qvortrup
Audio Note (UK) Ltd.
Brighton, England
August – December 1993

(Revised by L. Norwitz for the present edition from the published essays under the same title in Positive Feedback Magazine, December, January and February 1994).