CONFERENCE: SEPT 17th, 2005.
University of Westminster, Marylebone Road, London W1
Recording in the real world
When I’m giving talks to younger people about sound equipment and sound recording, inevitably the subject concentrates on the physical properties of sound and how to capture it.
I am becoming more and more convinced that this is entirely the wrong approach; it is an example of how the ‘scientific method’ can be a hindrance rather than a help; I constantly worry that a recording engineer is still, even today, like a photographer facing a Grand Prix with a box camera.
It’s fine and necessary to learn about the available technology for capturing sound and being able to reproduce it again at a later date, but to get anything like a true understanding of what we are really getting into, one needs to understand much more about how we hear sound: both physically, by knowing about the real mechanisms of hearing, and, even more importantly, how sounds affect us and how we understand what sounds are.
Once the chaps have some sort of grasp of what and how we hear, they can start to apply that knowledge to the various parts of the recording process, starting in the studio at the microphone, then moving on to the microphone preamplifier and other necessary or unnecessary parts of the recording chain, up to the recording medium itself, and the ways of monitoring what’s going on, and how to use it once we’ve got it.
Sound: The Stuff of Recording
We are taught, parrot fashion, that sound level doesn’t behave in a linear way like string or water; our perception of it is logarithmic. Usually we are told some ‘gee whizz’ facts to try to convey the scale, but sadly our brains don’t work that way and it’s a tough concept to grasp. I think the only way to come to terms with sound levels is to think in terms of dB (decibels) and try to remember that a 1000 watt amplifier is not that much louder than a 10 watt amplifier!
I am being intentionally flippant about this, but the same point holds at both ends of the spectrum: to make a loud sound louder, a lot of extra energy is needed; if a sound is already very quiet, you can take most of its energy away and it won’t seem to get much quieter!
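The arithmetic behind that flippancy is worth seeing once. A tiny sketch, in Python purely as an illustration: the decibel is ten times the logarithm of a power ratio, so a hundredfold jump in amplifier power is only 20dB, and halving the power loses you a barely audible 3dB.

```python
import math

def db_ratio(p1, p2):
    """Power ratio p1/p2 expressed in decibels."""
    return 10 * math.log10(p1 / p2)

print(db_ratio(1000, 10))  # 20.0 -- a hundredfold power increase is just 20dB
print(db_ratio(5, 10))     # about -3 -- halving the power is barely a step quieter
```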
And what about frequency? We can ‘hear’ a range of frequencies from about 25Hz up to around 14KHz. That can be measured as the ‘frequency response’ of our ears, but natural and musical sounds extend from as low as 8Hz and up to 40 to 50KHz. Is that relevant? Absolutely!!
Hearing and Perception
From the mid 1930s up to the late 1960s a mass of work was done in audio labs studying the biology and the physical limits of human ears. I’m not quite arrogant enough to dismiss the whole of that research out of hand, but I do insist that all that work needs to be placed in the context of the knowledge that what we think we hear is very much more important than what some figure on a graph tells us we should be hearing… or what some pundit in a magazine tells us we should be hearing! (And there seems to be a host of similarities between those two!)
The mechanism of the ear is reasonably well understood and taught; the path of pressure waves from the outside air causes movement of the eardrum, the bone structures act as an ‘impedance converter’ and transfer the vibrations across the middle ear to the inner ear where the pressure waves act on sensory cells in the ‘cochlea’. But that level of understanding of ‘hearing’ is about on a par with the knowledge that a dog usually has four legs.
Now, I want to scratch the surface just a little…..
Physical hearing tests (I’m sure that we have all seen and participated in those ‘hearing tests’ at audio shows…) show that we can discern frequencies as ‘notes’ down to about 30Hz, yet in our daily lives we are subjected to, and are well aware of, lower frequencies (so-called ‘infrasound’). These can come from mechanical things such as heating and air conditioning systems, trains and motors, as well as naturally (storms and wind).
At the other end of the spectrum we are not only aware of frequencies above 15KHz, but of course, all musical sounds contain harmonic information at high frequencies contributing to musical quality.
And all this has very little to do with ears… it’s all to do with sound hitting our bodies and skin; not necessarily ‘trouser flappingly’ hard. Our brain interprets subtle sounds from all over our bodies and integrates them with the refined signals that it gets from our ears. What we ‘hear’ is a combination of all that.
Noise and Range
Any discussion about the range of volumes that are heard by the human ear is bound to be complicated. Simplistically, we can hear sounds as quiet as a pin dropping onto carpet at a distance of 20 feet (OK, that was a guess!) up to a level where the pressure of sound causes physical pain; in front of the rig at an AC/DC concert. But between those extremes hearing does some amazing things: A trip out into the country on a quiet night can easily show how our hearing (note, I’m using the term ‘hearing’ rather than ‘ears’) changes and becomes very much more sensitive than normal.
Equally, in a noisy environment our hearing ‘desensitises’ as if it is compensating to make things more comfortable; and that’s exactly what it is doing.
Where these effects actually take place is complex and debatable; some of the compression effects take place in the middle and inner ear but I suspect that most of it is in the brain; and there are aspects of this biological compression that have a great bearing on quality and appreciation.
So far we have considered sound and hearing in terms of ranges of perception. It’s like describing a painting as ‘various coloured patches on a flat plane’. But eventually we want to move towards a better understanding of recorded (or created) performance, and so we need to know more about what our hearing considers good and acceptable and if there are any aspects of ‘not so good’ that have to be watched out for.
Distortion; Just a Very Quick Look
The simplest ‘musical’ note is a sine wave: a sound of a single frequency, devoid of harmonics. If a sine wave is distorted by compressing or constricting just the top or bottom of the wave, then harmonics (or you could say ‘other sounds’) appear in the sound. These harmonics are predominantly ‘even order’ harmonics, and they are musically related to the fundamental frequency. The 2nd harmonic is one octave above the fundamental, the 4th is 2 octaves, and so on.
BUT if the sine wave is distorted symmetrically, top and bottom, the resulting harmonics are called ‘odd order’: the 3rd, 5th and 7th harmonics. These frequencies are musically (that is, within our scalar structure) unrelated to the fundamental frequency; they just sound harsh and unnatural. BUT WHY??
I have theories as to why even order distortion sounds acceptable while odd order doesn’t. Part of the answer probably lies in the way the cells respond in the inner ear; they are tiny hairs of different lengths that sway and trigger impulses from their roots. Another possibility is that almost all harmonics that occur in nature are even order; the whistling of the wind, a human voice, the song of birds, all are rich in 2nd-order harmonics, as are the sounds from physical musical instruments like the violin, the trumpet and the piano.
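For the sceptical, the even/odd behaviour is easy to demonstrate in software. A small sketch in Python, using a hand-rolled DFT so it needs nothing beyond the standard library: squash just the top of a sine wave and a strong 2nd harmonic appears (along with some odd-order content); squash top and bottom equally and the even harmonics cancel exactly, leaving only the odd ones.

```python
import math

N = 1024   # samples in the analysis window
F0 = 8     # fundamental: 8 whole cycles across the window

sine = [math.sin(2 * math.pi * F0 * n / N) for n in range(N)]

def clip(signal, top, bottom):
    """Hard-limit a waveform at the given top and bottom levels."""
    return [min(max(s, bottom), top) for s in signal]

def harmonic_level(signal, k):
    """Amplitude of the k-th harmonic, read from a single DFT bin."""
    re = sum(s * math.cos(2 * math.pi * k * F0 * n / N) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * F0 * n / N) for n, s in enumerate(signal))
    return math.hypot(re, im) * 2 / N

asym = clip(sine, 0.5, -1.0)   # only the top squashed
sym  = clip(sine, 0.5, -0.5)   # top and bottom squashed equally

# asym now contains a strong 2nd harmonic; sym has no even harmonics
# at all, only the odd-order ones (3rd, 5th, 7th...).
```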
Another sort of ‘distortion’ is Amplitude Distortion, which goes back to what our ears and brains do to loud and soft sounds… they apply compression, and that is a form of distortion.
And there is Amplitude Frequency Response Distortion, which is just a posh way of saying that some frequencies of sound are not heard as well as others.
And then there is an even more insidious form of distortion, and that is phase distortion… but that involves things like direction information and I shall come to that later.
Listening to all that stuff, you could be forgiven for thinking that I’m just a prophet of doom and that there’s no point in even trying to record sound faithfully… but that’s not so at all. I’m merely trying to overcome any feeling that using such and such a microphone in such and such a way will give a perfect recording; I’m saying that each and every recording is a separate work of art… it is a creative representation of the original. It might be what you think is an accurate copy, or it might be a truly creative version that enhances the performance…
Compression and Fooling the Brain
Now what I really always want to talk about is compression!
Our ears are really not very good at handling the extreme range of sound volumes that we are subjected to…. No, that’s not quite right, it’s more true to say that the range of volume of sounds that we want to be able to appreciate is so vast that there has to be a variety of built-in compression systems just to stop our heads exploding! And there are!
There are purely physical compression systems both short term and long term; from non-linearities in the middle ear bone structures preventing large deviations from loud sounds, to inflammation effects in the middle and inner ear that severely reduce sensitivity when loud sounds are continuous. And there are sort of ‘software’ compression effects where our brains mask off some of the stream of impulses from the ears to reduce ‘volume’, and where I think it’s possible that extra processing power is recruited when conditions are extremely quiet to try to differentiate between meaningful sounds and extraneous noises of the body, like your heart, breathing and gut rumblings!
Because these biological compressors are active most of the time in normal daily living, what the brain ‘hears’ is constantly being altered, and these alterations come and go dependent on the spectrum of the sounds and the pattern of their intensity… that is, there are different natural compression effects for sharp repetitive sounds and for smooth continuous sounds.
What I am eventually getting to is that we can make use of knowledge of these effects; we can apply artificial corrections mimicking the natural ones, and fool the brain into thinking that certain sounds are quieter or louder than they really are, even within the context of other sounds occurring at the same time.
It’s very useful that the brain is amazingly adept at processing this sort of information, and we can go a whole lot further than the simple concept of applying some gutsy optical compression to an overall mix and make it sound louder. There are great subtleties waiting to be exploited… or plundered.
Attack and release shapes and times for these biological compressors are massively variable… very sharp transients, gunfire for example, are attenuated extremely fast, and, if the shock to the system was slight, the recovery time is also quick. But the effect of a rock concert can give you 30dB of ‘biological compression’ for up to 24 hours! (It’s called “I’ve gone deaf!!”) Between these extremes there is a world of rapidly changing ‘gains’, and intelligent use of compressors can enhance depth, height, colour and transparency very much more effectively than altering mixing levels or applying that bane of quality… EQ.
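For anyone who wants to play with the electronic equivalent of that fast-clamp, slow-recover behaviour: a bare-bones feed-forward compressor, sketched in Python. The threshold, ratio and time-constant numbers here are invented for illustration, not taken from any real unit.

```python
def compress(samples, threshold=0.5, ratio=4.0, attack=0.9, release=0.999):
    """A bare-bones feed-forward compressor.

    'attack' and 'release' are per-sample smoothing coefficients
    (closer to 1.0 means slower), mimicking hearing's habit of
    clamping down fast on a loud sound and recovering slowly."""
    env, out = 0.0, []
    for s in samples:
        level = abs(s)
        # Envelope follower: rise quickly towards loud peaks, fall back slowly.
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        # Above the threshold, gain is reduced by the compression ratio.
        if env > threshold:
            gain = (threshold + (env - threshold) / ratio) / env
        else:
            gain = 1.0
        out.append(s * gain)
    return out

loud = compress([1.0] * 2000)   # settles at 0.625: 0.5 + (1.0 - 0.5) / 4
quiet = compress([0.1] * 100)   # below threshold: passes through untouched
```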
Listening in Stereo
But now I would like to change the subject, from talking about how we hear once the sound gets to us, to the monitoring of the sound in the studio.
Years ago, I got involved with the development of monitoring loudspeakers for radio stations under the Independent Broadcasting Authority. This was in the mid 1970s and the standard loudspeaker for speech studios was the Spendor BC1, which I believe, had been developed by the BBC.
It’s true that when listened to at fairly low level the sound seemed to be accurate and convincing, but when I was testing lots of different experimental loudspeakers, I realised how extreme the ‘auto correction’ of our ears is!
You can put up a strange loudspeaker and listen to a known source or record, and in a matter of minutes, if you are not concentrating carefully on exact elements of the sound reproduction, you will start to accept the sound as ‘normal’!
For years subsequently, I was niggled by a number of aspects of stereo listening: particularly the received wisdom that we should listen to an identical pair of loudspeakers placed a distance apart, and the seemingly insoluble problem that if you want to reproduce a ‘natural’ bottom end at a reasonable volume, there are usually inadequacies in the mid ranges. Another terrible anomaly has been the use of the ‘pan pot’ as a means of specifying position; the whole idea is false, and only works because we insist on listening with widely spaced loudspeakers!
(Directional information has almost nothing to do with volume difference; it is determined by time difference, and in nature it is entirely sensed by our ear spacing. ‘Panned’ information only works successfully with wide-spaced loudspeakers.)
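To put a number on those time differences: a quick sketch in Python, assuming a typical ear spacing of about 17cm and a simple straight-line path (diffraction round the head is ignored, so treat the figures as rough).

```python
import math

EAR_SPACING = 0.17      # metres -- a typical figure, not a measurement
SPEED_OF_SOUND = 343.0  # metres per second in room-temperature air

def interaural_delay_us(angle_deg):
    """Extra time, in microseconds, for sound from a source at angle_deg
    (0 = straight ahead, 90 = fully to one side) to reach the far ear.
    Straight-line model; diffraction round the head is ignored."""
    path = EAR_SPACING * math.sin(math.radians(angle_deg))
    return path / SPEED_OF_SOUND * 1e6

print(interaural_delay_us(90))  # roughly 500 microseconds at most
print(interaural_delay_us(0))   # 0 -- a centred source gives no time cue
```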
Individually these problems have been addressed: we learned early on that one could give depth to a mono signal by adding reverb that contains multiple reflections, and the problem of stereo placement of individual signals can be overcome with clever time delays… but the conventional monitoring setup still feels like a compromise.
Single Point Monitoring
It was only when I started experimenting with Sum and Difference recording techniques that I started to suspect that there might be another practical solution to stereo listening. Recordings made with M/S mics (Middle and Side) certainly sound beautiful and ‘solid’ when replayed conventionally, but I got to thinking about the possibility of reversing the process…..
Let me talk for just a moment about M/S recording: I know I’m telling you things that are very simple and obvious, but it’s possible that you may not have heard it quite this way before!
You have a signal source that you want to record… say an acoustic guitar.
So you place a cardioid microphone in front of it to capture the sound at that point.
But, you would also like to record the effect of that guitar in the room in which it is being played, so a way of doing this is to place a second microphone, close to the first one, but picking up sound from the right and the left of the instrument…. This is done by using a ‘figure 8’ response mic set up across the sound field.
So we have a signal with the main ‘sound’ of the guitar, and a second signal containing ‘width’ information.
Now, speaking very simplistically, the first signal contains mono information; that is, information from both the left and the right-hand side.
The second signal also contains information from both the left and the right, but because one side is picked up by the front of the mic and the other by the back, the two signals are out of phase… that is, left minus right.
So we have ‘middle’…. Left plus right, and ‘side’… left minus right.
Another way of saying it is ‘Sum and Difference’.
I don’t need a blackboard to show how if you add together the two signals, the result is Left only…
and if you electronically deduct the second signal from the first, you are left with Right only. And if you carry out that manoeuvre, you get a very effective stereo image.
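That add-and-subtract manoeuvre is just arithmetic, and a few lines of Python (purely illustrative) make it plain. Note the factor of two that appears when you add M and S; in the analogue world it simply disappears into gain settings.

```python
def lr_to_ms(left, right):
    """Encode: Mid is the sum (L + R), Side is the difference (L - R)."""
    return left + right, left - right

def ms_to_lr(mid, side):
    """Decode: adding M and S gives 2L, subtracting gives 2R,
    so halve each result to recover the original channels."""
    return (mid + side) / 2, (mid - side) / 2

m, s = lr_to_ms(0.3, -0.2)
l, r = ms_to_lr(m, s)   # round trip recovers left = 0.3, right = -0.2
```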
Now doesn’t that all sound splendid…. BUT we are fooling ourselves!!
All that manipulation is actually a lie… we have just CALLED the signals LEFT and RIGHT; they are only simple approximations. The mono signal contains the bulk of the information from the guitar direct; the ‘difference’ signal contains reflections from elsewhere in the room, arriving back at the microphone pick-up point.
Yet the system fools our ears nicely, and recordings made that way are very effective, and much more natural than those made with a simple ‘stereo pair’, and ‘of course’ infinitely better than anything that has its stereo positioning determined by pan pots!
So I got to thinking… if you can get realistic stereo recordings from what is effectively a ‘single point’, then perhaps there’s a better way of listening than the accepted conventional idea of two loudspeakers set a distance apart… where the proper stereo image is only audible from a single ‘sweet spot’ at the apex of the listening triangle.
My ‘Single Point Monitor’ is still only a prototype: (MONITOR ONE™)
I have taken the normal Left and Right signals from a stereo signal, and summed them, and using a big chunky amplifier, fed the sum to a wide range monitor loudspeaker with lots of bottom end.
Also from the normal ‘Left and Right’ signals, I derived a ‘Left minus Right’ signal, by inverting the phase of the ‘Right’ signal and adding it to the ‘Left’. I fed this to a second but less powerful power amplifier and this is fed to a pair of loudspeakers wired out-of-phase, and set up on top of the mono monitor in a wide ‘V’ formation facing the listener area.
The bulk, possibly 95%, of the sound is reproduced from the ‘Sum’ loudspeaker… after all, this is just a ‘MONO’ sum of the stereo signal, and it’s exactly the same as listening to a mono radio or television.
When the ‘DIFFERENCE’ signal is created, all the mono information cancels itself out in the ‘Left minus Right’ conversion, so the actual volume from the difference ‘speakers is quite low, hence the lower power amplifier.
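To see why the difference feed needs so little power, here is a toy calculation in Python. The signals are invented for illustration, but any stereo programme with a strong centre behaves the same way: the common material cancels in L minus R, so the Difference pair is left carrying only a few percent of the total power.

```python
import math

def energy(signal):
    """Total power in the signal (sum of squared samples)."""
    return sum(x * x for x in signal)

# A toy stereo signal: a large common (centre) component plus a small
# uncorrelated 'width' component in each channel. All values invented.
left, right = [], []
for n in range(1000):
    centre = math.sin(0.1 * n)
    left.append(centre + 0.1 * math.sin(0.37 * n))
    right.append(centre + 0.1 * math.sin(0.53 * n))

sum_feed  = [l + r for l, r in zip(left, right)]   # to the big Sum speaker
diff_feed = [l - r for l, r in zip(left, right)]   # to the small Difference pair

share = energy(diff_feed) / (energy(sum_feed) + energy(diff_feed))
# share comes out at well under 5% -- the Sum speaker does nearly all the work.
```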
The effect of this odd looking array has some surprises:
The first and most obvious effect is that there seem to be sounds coming from outside the loudspeakers! But then, that’s the whole point of the exercise…
The second, and this is very plain: because the path of the sound from the speaker to the listener is so simple, the detail in the sound is very much clearer; there are none of the small phasing errors that normally occur due to the differing path lengths from a conventional pair of loudspeakers.
The effect of ‘space’ is certainly equal to, or better than a conventional pair, the ‘quality’ is remarkable and I shall carry on using it as a studio monitor.
So, to sum up briefly, I started out talking about the importance of how we think we hear, rather than the narrow applications of physics and biology; and I questioned the suitability of the ‘Scientific Method’ when you are talking about sound. This led on to how we perceive volume, frequency and quality, and then, turning the whole thing on its head, how we reproduce sound to achieve acceptable results.
I have tried to demonstrate just one area… my particular area… where our basic learning and ideas are being questioned; and it’s from these sorts of thinking, theories and experiments that true progress happens, rather than from the barren, emotionless developments of pure physics with their restricting and incorrect assumptions of how the world is. Witness the ‘progress’ in sound recording and reproduction exemplified by MP3 and the like, and the cries of tone-deaf software engineers who believe that they now control the future of sound.
Talking about the ‘single point monitor’: while there are still months of experimenting to do, the results are such that I think it’s worthwhile trying to re-educate the recording fraternity to throw away their ‘nearfields’!
Copyright Ted Fletcher 2005