Frequently Asked Questions (FAQ) file for rec.audio.pro Version 0.9.2 Edited and compiled by Gabe Wiener David Josephson (david@josephson.com) Mark Plancke (mark@soundtechrecording.com) Many thanks to all who have contributed to the FAQ. Individual contributions are credited at the end of their respective sections. The core FAQ writers are listed below. The rubric in brackets will be used to indicate who has written a particular section. Brian Allen (buzallen@teleport.com) [Brian] Scott Dorsey (kludge@netcom.com) [Scott] Harvey Gerst (harvey@itrstudio.com) [Harvey] Christopher Hicks (cmh@eng.cam.ac.uk) [Chris] David Josephson (david@josephson.com) [David] Marco Olivotto (sonica@tqs.it) [Marco] Dick Pierce (dpierce@world.std.com) [Dick] Mark Plancke (mark@soundtechrecording.com) [Mark] David Rick (drick@hach.com) [David R] Joe Schottman (jschottm@E115074.vtacs.com) [Joe] Gabe Wiener [Gabe] PLEASE NOTE Each author maintains and asserts full legal copyright on his contribution to the FAQ. Compilation copyright (C) 1996 by Gabe M. Wiener. All rights reserved. Permission is granted for non-commercial electronic distribution of this document. Distribution in any other form, or distribution as part of any commercial product requires permission. Inquire to (david@josephson.com) --------- TABLE OF CONTENTS FOR THE FAQ: ** - Denotes new additions since last version. [MM/DD] Denotes revision date [Mark] Section I - Netiquette Q1.1 - What is this newsgroup for? What topics are appropriate here and what topics are best saved for another newsgroup? Q1.2 - Do I have to be a "professional" to post here? Q1.3 - I've got a question, what is the first thing I should do? **[9/98] Q1.4 - I need to ask the group for help with selecting a piece of equipment. What information should I provide in my message? Q1.5 - I want to sell something. Is this the place to do it? **[9/98] Q1.6 - What is some general Netiquette to follow? 
**[9/98] Q1.7 - Hey, I posted something here, and now I'm getting a bunch of junk mail! What gives? **[9/98] Q1.8 - Hey, I posted a simple question, and some jerk gave me a rude answer! What gives? **[9/98] Section II - The business of audio Q2.1 - How does one get started as a professional audio engineer? Q2.2 - Are audio schools worth the money? Which schools are best? Q2.3 - What are typical rates for various professional audio services? Section III - Audio Interconnections Q3.1 - How are professional transmission lines and levels different from consumer lines and levels? What is -10 and +4? What's a balanced or differential line? Q3.2 - What is meant by "impedance matching"? How is it done? Why is it necessary? Q3.3 - What is the difference between dBv, dBu, dBV, dBm, dB SPL, and plain old dB? Why not just use regular voltage and power measurements? Q3.4 - Which is it for XLRs? Pin 2 hot? Or pin 3 hot? Q3.5 - What is phantom power? What is T-power? Can I damage my dynamic or ribbon mic by plugging it in to phantom power? **[9/98] Q3.6 - How do I interconnect balanced and unbalanced components? Q3.7 - What are ground loops and how do I avoid them? Q3.8 - What is the "Pin 1 problem" and how do I avoid it? Section IV - Analog tape recording Q4.1 - What does it mean to "align" a tape machine? Q4.2 - What is bias? What is overbias? Q4.3 - What is the difference between Dolby A, B, C, S, and SR? How do each of these systems work? What is the effect on recorded sound of these systems? **[9/98] Q4.4 - What is Dolby HX-Pro? Q4.5 - How does DBX compare to Dolby? Q4.6 - How much better are external microphone preamplifiers than those found in my portable recorder? Q4.7 - What is an MRL? Where do I get one? Section V - Digital recording and interconnection Q5.1 - What is sampling? What is a sampling rate? Q5.2 - What is oversampling? Q5.3 - What is the difference between a "1 bit" and a "multibit" converter? What is MASH? What is Delta/Sigma? Should I really care? 
Q5.4 - On an analog recorder, I was always taught to make sure the signal averages around 0 VU. But on my new DAT machine, 0 is all the way at the top of the scale. What's going on here? Q5.5 - Why doesn't MiniDisc or Digital Compact Cassette sound as good as DAT or CD? After all, they're both digital. Q5.6 - What is S/P-DIF? What is AES/EBU? Q5.7 - What is clock jitter? Q5.8 - How long can I run AES/EBU or S/P-DIF cables? What kind of cable should I use? Q5.9 - What is SCMS? How do I defeat it? Q5.10 - What is PCM-F1 format? Q5.11 - How do digital recorders handle selective synchronization? Q5.12 - How can a 44.1 kHz sampling rate be enough? Q5.13 - Doesn't the 44.1 kHz sampling rate make it impossible to reproduce square waves? Q5.14 - How can a 16-bit word-length be enough? Q5.15 - What's all this about 20- and 24-bit digital audio? Aren't CDs limited to 16 bits? Section VI - Digital editing and mastering Q6.1 - What is a digital audio workstation? Q6.2 - How is digital editing different from analog editing? Q6.3 - What is mastering? Q6.4 - What is normalizing? Q6.5 - I have a fully edited DAT that sounds just like I want it to sound on the CD. Is it okay to send it to the factory? Q6.6 - What is PCM-1630? What is PMCD? Q6.7 - When preparing a tape for CD, how hot should the levels be? Q6.8 - What is CD-R? Why do I care about disk-at-once or track-at-once? Q6.8 - Where can I get CDs manufactured? Q6.9 - How are CD error rates measured, and what do they mean? Section VII - Market survey. 
What are my options if I want -- Q7.1 - A portable DAT machine or rack size DAT machine **[9/98] Q7.2 - A good but inexpensive compressor **[9/98] Q7.3 - An inexpensive stereo microphone **[9/98] Q7.4 - An inexpensive pair of microphones for stereo **[9/98] Q7.5 - A good microphone for recording vocals **[9/98] Q7.6 - A good microphone for recording [insert instrument here] **[9/98] Q7.7 - A small mixer **[9/98] Q7.8 - A portable cassette machine **[9/98] Q7.9 - A computer sound card for my IBM PC or Mac **[9/98] Q7.10 - An eight-track digital recorder? Section VIII - Sound reinforcement Q8.1 - We have a fine church choir, but the congregation can't hear them. How do we mic the choir? Q8.2 - How do I 'ring out' a system? [9/98] Q8.3 - How much power do I need for [insert venue here]? Q8.4 - How good is the Sabine feedback eliminator? Section IX - Sound restoration Q9.1 - How can I play old 78s? Q9.2 - How can I play Edison cylinders? Q9.3 - What are "Hill and Dale" recordings, and how do I play them back? Q9.4 - What exactly are NoNOISE and CEDAR? How are they used? Q9.5 - How do noise suppression systems like NoNOISE and CEDAR work? Q9.6 - What is forensic audio? Section X - Recording technique, Speakers, Acoustics, Sound Q10.1 - What are the various stereo microphone techniques? [9/98] Q10.2 - How do I know which technique to use in a given circumstance? [9/98] Q10.3 - How do I soundproof a room? Q10.4 - What is a near-field monitor? Q10.5 - What are the differences between "studio monitors" and home loudspeakers? Section XI - Industry information Q11.1 - Is there a directory of industry resources? Q11.2 - What are the industry periodicals? Q11.3 - What are the industry trade organizations? Q11.4 - Are there any conventions or trade shows that deal specifically with professional audio? Section XII - Miscellaneous Q12.1 - How do I modify Radio Shack PZMs? Q12.2 - Can I produce good demos at home? [9/98] Q12.3 - How do I remove vocals from a song? 
Section XIII - Bibliography Q13.1 - Fundamentals of Audio Technology Q13.2 - Studio recording techniques Q13.3 - Live recording techniques Q13.4 - Digital audio theory and practice Q13.5 - Acoustics Q13.6 - Practical recording guides Section XIV Q14.1 - Who wrote the FAQ [9/98] Q14.2 - How do you spell and pronounce the FAQ maintainer's surname? -------- THE FAQ: Section I - Netiquette -- Q1.1 - What is this newsgroup for? What topics are appropriate here, and what topics are best saved for another newsgroup? [9/98] This newsgroup exists for the discussion of issues and topics related to professional audio engineering. We generally do not discuss issues relating to home audio reproduction, though they do occasionally come up. The rec.audio.* hierarchy of newsgroups is as follows: rec.audio.pro Issues pertaining to professional audio rec.audio.marketplace Buying and trading of consumer equipment rec.audio.tech Technical discussions about consumer audio rec.audio.opinion Everyone's $0.02 on consumer audio rec.audio.high-end High-end consumer audio rec.audio.misc Everything else alt.music.4-track Deals with smaller home based recording projects rec.music.makers.marketplace A music related for sale forum, with some high end Please be sure to select the right newsgroup before posting. -- Q1.2 - Do I have to be a "professional" to post here? No. Anyone is welcome to post on rec.audio.pro so long as the messages you post are relevant to the group in some way. If you are not an audio professional, we would ask that you read this FAQ in full before posting. You may find that some of your essential questions about our field are answered right here. But if not, feel free to ask us. -- Q1.3 - I've got a question, what is the first thing I should do? [9/98] Start with this FAQ! Many questions have already been asked and answered. In some cases they have been beaten into the ground. 
A good starting point before asking a question is to go to and check the archives there. To do so, click on power search. Enter rec.audio.pro under the forum. Then enter a few keywords about your question, and do a search. [Joe] -- Q1.4 - I need to ask the group for help with selecting a piece of equipment. What information should I provide in my message? If you are going to post a request for advice on buying equipment, please provide the following information. Your application for the equipment What other equipment you will be using it with Your budget for the equipment Any specific requirements the equipment should have There is nothing worse than messages like "Can anyone recommend a DAT machine for me to buy???" Sure we can. But what do you want to _do_ with it? We can recommend DAT machines for $400 or for $14,000. -- Q1.5 - I want to sell something. Is this the place to do it? [9/98] According to the rec.audio.pro charter selling is allowed. The unofficial mood of the group is that individuals wishing to sell studio related gear are welcome, with moderation. There have been negative feelings expressed towards dealers posting their wares. [Joe] The generally accepted way for dealers to post for sale items is to provide a pointer or link to their web page where the detailed information is posted. This also applies to private parties who have many items for sale. [Mark] -- Q1.6 - What is some general Netiquette to follow? [9/98] Trim what you quote. If you want to add to something someone has posted, remove most of the redundancy - if someone is reading the thread they have most likely read the message you are adding to, and don't need to see it again. And those excess lines can be painful to readers who have to pay by the minute for their connection, or who have slow connections. Don't post binaries to the newsgroup. Many people won't be interested, but it will slow down their connection, and irritate them. 
Also, if too many binaries are posted, this may cause some sysadmins to remove rec.audio.pro from the available list of newsgroups for their users. Turn off HTML coding within your newsreader. If you are using Netscape or Internet Explorer, you may be posting in HTML code. To turn it off in Netscape 4, go to the edit menu, to preferences, to Mail & Groups, to Messages, and turn off "By default, send rich text (HTML) messages." I expect Internet Explorer is similar. If you are selling something, post one ad, and let that be it. Most news servers will keep your message for at least seven days. Posting your ad day after day is more likely to turn off potential buyers than to find them. If you are going to advertise your gear on more than one newsgroup, it is recommended that you do so as a cross-post. Many newsreaders will mark the message as read after it is read on any newsgroup, so that when the user reads another group he/she will not see your message again and again. Don't engage in flame wars or respond to trolls. 30 people going back and forth between "ADAT's suck!" and "ADAT's rule!" doesn't add anything to the group. If you wish to discuss something, add facts to back your opinion. And if someone posts an obviously inflammatory message, don't bother responding, no matter how angry or clever you might be. That's exactly what they want. If you ignore them, such people go away. It is not recommended that you post something with "please respond via e-mail" or the like. You are asking a large group of people to take time out of their busy lives to discuss something with you. The least you can do is take the time out of yours to check back on the newsgroup. Beyond that, other people might want to see the reply as well. If you have a real reason why you can't check the newsgroup, include that in your message to avoid irritating people. [Joe] -- Q1.7 - Hey, I posted something here, and now I'm getting a bunch of junk mail! What gives? 
[New 9/98] Unfortunately, junk e-mailers (spammers) have discovered that they can find people's e-mail addresses over Usenet, and hit them with junk mail. The passive approach is to disguise your e-mail address. There are various ways to do this; check your newsreader documentation. Be aware that this may keep you from getting the help you want - most people won't spend hours figuring out your disguised e-mail if it bounces back. The active way is to try to get your authorities to make junk e-mail illegal. A good starting point is and . [Joe] -- Q1.8 - Hey, I posted a simple question, and some jerk gave me a rude answer! What gives? [New 9/98] Some of us do what we do because we don't play well with others. After all, we spend all our time sitting in a room with no natural light. And to quote Harvey Gerst, "Which would you choose: someone with a 50/50 smattering of knowledge and very polite, or 97% accuracy and blunt?" [Joe] ===== Section II - The business of audio -- Q2.1 - How does one get started as a professional audio engineer? There are as many getting-started stories as there are audio engineers. The routes into the industry are highly dependent on what aspect of the industry one wishes to enter. For instance, many engineers who work in the classical-music field have at one time or another been classical performers. Others enter through their work in other musical genres, or through engineering programs at universities or technical schools. Virtually everyone in the industry has learned at least a portion of their craft from watching those with more hands-on experience. Whether this comes from a formal internship or just from sustained observation and long-term question-asking, it holds almost universally. [Gabe] -- Q2.2 - Are audio schools worth the money? Which schools are best? An audio school will teach you the basics of the audio business, but just like any technical school, what they teach you may not be worth what you pay. 
There are several schools of thought: 1. Audio schools are great, you get trained on the gear that is used by top studios and costs millions of dollars, you get taught by pros in the field and you have job placement assistance after you graduate. 2. Going to an audio school is like wanting to learn aviation, and when you start flight school they teach you on a 747. In the real world, you are probably not going to have 96 channel automated consoles on your first job. You are not going to mix your first live gig on a 48-channel 100,000 watt stadium PA rig. Better to start off on real world equipment and work your way up to the top-of-the-line stuff. Most recording studios are 24-track analog or less and most PA systems are 16 channel, 3,000 watts or less. Don't buy education for something you will never get to use after you leave the school. 3. Audio Schools are a waste of money. Instead of spending $18,000 for a course and having nothing to show for it but a technical certificate (which everyone knows is no help at all getting a job), you would be better off spending the 18 grand on books and gear and learning by trial and error, or saving the 18 grand altogether and learning first from reading, and later from apprenticing. [jsaurman@cftnet.com (Jim Saurman)] Jim summarizes the opinions pretty well. Recognize that an altogether different option is to attend a full four-year college program. Many colleges and universities offer such programs. Examples include Peabody Conservatory, Cleveland Institute of Music, McGill University, New York University, University of Miami at Coral Gables, and the University of Massachusetts at Lowell. Without fail, graduates from these sorts of programs earn far more respect than graduates of any technical school. [Gabe] -- Q2.3 - What are typical rates for various professional audio services? Depends on what you want to have done, and where. One can pay upwards of $300/hr for prime studio rental time in New York. 
In a small community, however, one might find a project studio for $25/hr. Generally speaking, the rule is: the rarer the service, the more it will cost. In a community with dozens of small 8-track studios, you won't have to pay much. If you need emergency audio restoration, or mastering by a top-flight pop-music engineer, you can expect to drop many hundreds of dollars an hour. Like so many other things in this industry, there are no rules, and Smith's invisible hand guides the market. [Gabe] ===== Section III - Audio Interconnections -- Q3.1 - How are professional transmission lines and levels different from consumer lines and levels? What is -10 and +4? What's a balanced or differential line? Professional transmission lines differ from consumer lines in two ways. First, consumer lines tend to run about 12 dB lower in level than pro lines: the nominal consumer level is -10 dBV and the nominal professional level is +4 dBu, and because those two figures use different reference voltages the gap is about 12 dB rather than the 14 dB the numbers suggest. Second, professional lines run in differential, or balanced, configuration. In a single-ended line, the signal travels down one conductor and returns along a shield. This is the simplest form of audio transmission, since it is essentially the same AC circuit you learned about in high-school physics. The problem here is that any noise or interference that creeps into the line will simply get added to the signal and you'll be stuck with it. In a differential line, there are three conductors: a shield, a normal "hot" lead, and a third lead called the "cold" or "inverting" lead, which carries a 180-degree inverted copy of the hot lead. Any interference that creeps into the cable thus affects both the hot and cold leads equally. At the receiving end, a differential amplifier takes the difference between the hot and cold leads, and any interference that has entered the circuit (called "common-mode information" since it is common to both the hot and cold leads) gets canceled out. Differential lines are thus better suited for long runs, or for situations where noise or interference may be a factor. 
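The cancellation described above can be sketched numerically. This is only an illustration under idealized assumptions (identical interference on both leads, a perfect differential receiver); the tone frequencies, amplitudes, and sample rate are arbitrary values chosen for the example.

```python
import math

def tone(freq_hz, n, rate=48000, amp=1.0):
    """Return n samples of a sine wave (all values illustrative)."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

n = 480
signal = tone(1000, n)          # the wanted audio
hum = tone(60, n, amp=0.5)      # interference induced equally on both leads

# Hot carries the signal, cold carries the inverted copy; both pick up the hum.
hot = [s + h for s, h in zip(signal, hum)]
cold = [-s + h for s, h in zip(signal, hum)]

# Single-ended receiver: sees the hot lead against ground, hum included.
single_ended = hot

# Differential receiver: hot minus cold. The hum (common to both leads)
# cancels and the signal doubles, so halve the result to compare levels.
balanced = [(a - b) / 2 for a, b in zip(hot, cold)]

def max_error(x, reference):
    return max(abs(a - b) for a, b in zip(x, reference))

print("single-ended error:", max_error(single_ended, signal))
print("balanced error:    ", max_error(balanced, signal))
```

In a real cable the interference on the two leads is never perfectly equal, so the rejection is finite rather than total.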
[Gabe] -- Q3.2 - What is meant by "impedance matching"? How is it done? Why is it necessary? We can talk about the characteristic impedance of an input, which is to say the ratio of voltage to current that it likes to see, or how much it loads down a source. (You can think of this as being an "AC resistance" and you would be mostly right, although it's actually the absolute magnitude of the vector drawn by the resistive and reactive load components. Dealing with line level signals, reactive components are going to be negligible, though). In general, in this modern world, most equipment has a low impedance output, going into relatively high impedance input. This wastes some amount of power, but because electricity is cheap and it's possible to build low-Z outputs easily today, this is not a big deal. With microphones, it _is_ a big deal, because the signal levels are very low, and the drive ability poor. As a result, we try and get the best efficiency possible from microphones to get the lowest noise floor. This is often done by using transformers to step up the voltage or step it down, to go into a higher or lower Z load. Transformers have some major disadvantages in that they can be significant sources of non-linearity, but back in the days of tubes they were the only solution. Tubes have a very high-Z input, and building balanced inputs with tubes requires three devices instead of one. As a result, all mike pre-amps would have a 600 ohm balanced input, with a transformer, driving a pre-amp tube. Today, transistor circuits can be used for impedance matching, although they are often more costly and can be noisier in cases. As a result of the expense, consumer equipment was built with high-Z microphone inputs, and high-Z microphones. This resulted in more noise pickup problems, but was cheaper to make. Unfortunately this still held on into the modern day of the transistor, and a lot of high-Z consumer gear exists. 
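How much an input "loads down" a source can be sketched as a simple voltage divider. The sketch below treats both impedances as purely resistive (reasonable at line level, as noted above); the function name and the example impedance values are assumptions made for illustration, not figures for any particular piece of gear.

```python
import math

def loading_loss_db(z_out, z_in):
    """Level at the input, in dB relative to the unloaded source voltage.

    z_out: source (output) impedance in ohms
    z_in:  load (input) impedance in ohms
    Both are treated as pure resistances (an assumption).
    """
    fraction = z_in / (z_out + z_in)
    return 20 * math.log10(fraction)

# Low-Z output into high-Z input: almost no voltage is lost.
print("100 ohm into 10k ohm: %.2f dB" % loading_loss_db(100, 10_000))

# Classic matched 600 ohm system: half the voltage is dropped in the source.
print("600 ohm into 600 ohm: %.2f dB" % loading_loss_db(600, 600))

# A high-Z source straight into a low-Z input: heavy loss.
print("10k ohm into 600 ohm: %.2f dB" % loading_loss_db(10_000, 600))
```

The last case suggests why a high-Z source is normally run through a transformer or buffer before it meets a low-Z input.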
Guitar pickups are generally high-Z devices, and require a direct box to reduce the impedance so that they can go into a standard 600 ohm mike pre-amp directly. Many years ago, the techniques that were used in audio came originally from telephone company practice. Phone systems operate with 150 or 600 ohm balanced lines, and adoption of this practice into the audio industry caused those standards to be used. In the modern age where lines are relatively short and transformers considered problematic, the tendency has been to have low-Z outputs for all line level devices, driving high-Z inputs. While this is not the most efficient system, it is relatively foolproof, and appears on most consumer equipment. A substantial amount of professional gear, however, still uses internal balancing transformers or resistor networks to match to a perfect 600 ohm impedance. [Scott] [Ed. note: Modern equipment works on principles of voltage transfer rather than power transfer. Thus a standard audio circuit today is essentially a glorified voltage divider. You have a very low output impedance and a very high input impedance such that the most voltage is dropped across the load. This is not an impedance-matched circuit in the classic sense of the word. Rather, it is a "bridged" or "constant voltage" impedance match, and is the paradigm on which nearly all audio circuits operate nowadays. -Gabe] -- Q3.3 - What is the difference between dBv, dBu, dBV, dBm, dB SPL, and plain old dB? Why not just use regular voltage and power measurements? Our ears respond logarithmically to increases in sound pressure level. In order to simplify the calculations of these levels, as well as the electrical equivalents of them in audio systems, the industry uses a logarithmic system to denote the values. Specifically, the decibel is used to denote logarithmic level above a given reference. 
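As a rough sketch of "logarithmic level above a given reference," the two helper functions here (names are illustrative) compute decibels for power ratios and for voltage ratios. The references used are the ones this answer discusses: 1 mW for dBm, 0.775 V for dBu (the voltage across 600 ohms at 1 mW), and 1 V for dBV.

```python
import math

def db_power(power, reference):
    """Decibels for a power (or intensity) ratio: 10 log10(P / Pref)."""
    return 10 * math.log10(power / reference)

def db_voltage(voltage, reference):
    """Decibels for a voltage ratio: 20 log10(V / Vref)."""
    return 20 * math.log10(voltage / reference)

DBM_REF_W = 0.001                          # 1 milliwatt
DBU_REF_V = math.sqrt(DBM_REF_W * 600)     # about 0.775 V across 600 ohms
DBV_REF_V = 1.0                            # 1 volt

print("2 mW   = %+.2f dBm" % db_power(0.002, DBM_REF_W))
print("1.23 V = %+.2f dBu" % db_voltage(1.23, DBU_REF_V))
print("1.23 V = %+.2f dBV" % db_voltage(1.23, DBV_REF_V))

# Converting nominal levels back to volts shows the real gap between
# "+4" (dBu) and "-10" (dBV), which use different references.
pro_volts = DBU_REF_V * 10 ** (4 / 20)
consumer_volts = DBV_REF_V * 10 ** (-10 / 20)
gap_db = db_voltage(pro_volts, consumer_volts)
print("+4 dBu vs -10 dBV: %.2f dB apart" % gap_db)
```

The same voltage reads about 2.2 dB higher in dBu than in dBV, which is why the unit suffix always has to be stated.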
For instance, when measuring sound pressure level, the basic reference against which we take measurements is the threshold of hearing for the average individual, 10^-12 W/m^2. The formula for dB SPL then becomes:

    dB SPL = 10 log10 (X / 10^-12)

where X is the intensity in watts per square meter. The first people concerned with transmitting audio over wires were, of course, at the telephone company. Thanks to Ma Bell we have a bunch of other decibel measurements. We can use the decibel to measure electrical power as well. In this case, the formula is referenced to 1 milliwatt in the denominator, and the unit is dBm. 1 milliwatt was chosen as the canonical reference by Ma Bell. Since P=V^2 / R, we can express not only power gain in dB but also voltage gain. In this case the equation changes a bit, since we have the ^2 exponent. When we take the logarithm, the exponent comes around into the coefficient, making our voltage formula 20 log. In the voltage scenario, the reference value becomes 0.775 V (the voltage drop across 600 ohms that results in 1 mW of power). The voltage measurement unit is dBv. The Europeans, not having any need to abide by Ma Bell's choice for a canonical value, chose 1V as their reference, and this is reflected as dBV instead of dBv. To avoid confusion, the Europeans write the American dBv as dBu. Confused yet? [Gabe] -- Q3.4 - Which is it for XLRs? Pin 2 hot? Or pin 3 hot? Depends on whom you ask! Over the years, different manufacturers have adopted varying standards of pin 2 hot and pin 3 hot (and once in a while, pin *1* hot!). But nowadays most manufacturers have adopted pin 2 hot. Still, it is worth taking the extra minute or two to check the manual. The current AES standard is pin 2 hot. [Gabe] -- Q3.5 - What is phantom power? What is T-power? Can I damage my dynamic or ribbon mic by plugging it in to phantom power? Condenser microphones have internal electronics that need power to operate. 
Early condenser microphones were powered by batteries or by separate power supplies using multi-conductor cables. In the late 1960's, German microphone manufacturers developed two methods of sending power on the same wires that carry the signal from the microphone. The more common of these methods is called "phantom power" and is covered by DIN spec 45596. The positive terminal of a power supply is connected through resistors to both signal leads of a balanced microphone, and the negative terminal is connected to ground. 48 volts is the preferred value, with 6800 ohm resistors in each leg of the circuit, but lower voltages and lower resistor values are also used. The precise value of the resistors is not too critical, but the two resistors must be matched within 0.4%. Phantom power has the advantage that a dynamic or ribbon mic may be plugged in to a phantom powered microphone input and operate without damage, and a phantom powered mic can be plugged in to the same input and receive power. The only hazard is that in case of a shorted microphone cable, or certain old microphones having a grounded center tap output, current can flow through the microphone, damaging it. It's a good idea anyway to check cables regularly to see that there are no shorts between any of the pins, and the few ribbon or dynamic microphones with any circuit connection to ground can be identified and not used with phantom power. There is also a very slim chance with a dynamic or ribbon mic plugged in to phantom power over a very long cable, that one side of the circuit might make contact sooner than the other side, and result in a burst of current charging the capacitance of the cable. If you want to be doubly sure there is no such possibility, turn off all phantom power before plugging in or unplugging any microphone. This also prevents the huge pop that will come out of your speakers if you plug in something phantom powered with the power on. 
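The 48 volt, 6800 ohm arrangement described above can be worked through with Ohm's law. This is a minimal sketch that assumes a microphone drawing equal current through both legs, so the two feed resistors act in parallel; the 2 mA current draw is a made-up example value, not a figure for any real microphone.

```python
SUPPLY_VOLTS = 48.0
LEG_OHMS = 6800.0

# With equal current in each leg, the two feed resistors are effectively
# in parallel as seen by the microphone's power circuit.
effective_ohms = LEG_OHMS / 2

def voltage_at_mic(current_amps):
    """Voltage left at the microphone after the drop in the feed resistors."""
    return SUPPLY_VOLTS - current_amps * effective_ohms

def short_circuit_current(legs_shorted=2):
    """Current into a dead short to ground on one or both signal leads."""
    return SUPPLY_VOLTS / (LEG_OHMS / legs_shorted)

print("Mic drawing 2 mA sees %.1f V" % voltage_at_mic(0.002))
print("Both leads shorted: %.1f mA" % (short_circuit_current(2) * 1000))
print("One lead shorted:   %.1f mA" % (short_circuit_current(1) * 1000))
```

The feed resistors limit the fault current to a few milliamps per leg, which is the current a shorted cable or a grounded center tap could pass through a microphone.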
T-power (short for Tonaderspeisung, also called AB or parallel power, and covered by DIN spec 45595) was developed for portable applications, and is still common in film sound equipment. T-power is usually 12 volts, and the power is connected across the balanced pair through 180 ohm resistors. Only T-power mics may be connected to T-power inputs; dynamic or ribbon mics may be damaged and phantom powered mics will not operate properly. [David] -- Q3.6 - How do I interconnect balanced and unbalanced components? First, let's define what the terms mean. The simplest audio circuit uses a single wire to carry the signal; the return path, which is needed for current to flow in the wire, is provided through a ground connection, usually through a shield around the wire. This system, called unbalanced transmission, is very susceptible to hum pickup and cannot be used for low level signals, like audio, for more than a few feet. Balanced transmission occurs when two separate and symmetrical wires are used to carry the signal. A balanced input is sensitive only to voltage that appears between the two input terminals; the circuit cancels signals from one terminal to ground. The simplest way to connect between balanced and unbalanced equipment is to use a transformer. The signals are magnetically coupled through the core of the transformer and either side may be balanced or unbalanced. Good transformers are expensive, however, and there are cheaper methods that can be used in some instances. An unbalanced output can be connected to a balanced input. For instance, from the unbalanced output of a CD player, connect the center pin to pin 2 of the balanced XLR input connector, and the ground to pins 1 and 3. 
To connect the balanced output of something to an unbalanced input requires different techniques depending on whether the output is active balanced (each side has a signal with respect to ground) or floating balanced (for instance, the secondary of a transformer with no center-tap connection). If it's an active balanced output, you can simply use half of it; connect pin 2 to the unbalanced input, and pin 1 to ground, leaving pin 3 floating. If this doesn't work (no or very weak signal) connect pin 3 of the output to pin 1 and ground and leave pin 2 connected to the unbalanced input center pin. Some active balanced outputs, particularly microphones, use the balanced circuit to cancel distortion, so this hookup may result in higher distortion than if a proper balanced-to-unbalanced converter such as a differential stage or a transformer were used. [David] -- Q3.7 - What are ground loops and how do I avoid them? One of the most difficult troubleshooting tasks for the audio practitioner is finding the source of hum, buzz and other interfering signals in the audio signal. Often these are caused by "ground loops." This unfortunate and inaccurate term (it need not be in the "ground" path, and the "loop" is not what causes the problem) is poorly understood by most users of audio equipment. A better name for this phenomenon is "shared path coupling" because it happens when two signals share the same conductor path and couple to each other as a result. Another semantic problem that should be addressed early on is the idea that "ground" is one place where all currents go. It's not; there's nothing special about calling a signal "ground," and current still flows through any path that's available to it. Referring to the discussion above regarding unbalanced signal paths, recall that there must be a complete circuit from the output of some device, through the input of another device and back to the "return" side of the output if any current is to flow. 
Current doesn't flow by itself; it must have a complete path. If there are multiple paths over which the current might flow, the current will be divided among them with most of the current flowing through the path having the least resistance. Any available path, regardless of the resistance in it, will carry some of the current; it's not a case of all the current following the path that has least resistance. For example, suppose we have two units connected together through a small piece of coaxial cable, and the units are also connected together at the wall outlet through their grounded power cords -- the ground pins are connected to the chassis at each end. The audio signal goes along the center of the coaxial cable, and part of it might come back along the shield of the coax, but part will also go through the ground wire of one unit and back through the ground wire of the other unit. A problem arises when some other signal is also flowing through this same return path. The other signal might be another audio signal, video, data, or power. All of the currents in a wire add together, and the resistance of the wire causes a voltage to appear in proportion to the current flowing. All of these voltages add together, so there is a little bit of the video signal added to the audio, some of the power signal added to the video, some of the power signal added to the audio, etc. In rare instances, the "loop" of wire formed by the intended ground return path and the happenstance lower resistance return path (formed by mounting hardware, power cords, etc.) can form a magnetic pickup as well, so that magnetic fields radiated by transformers, CRT's, etc. can also induce a current in the "loop," which makes yet another source of noise voltage. This shared path coupling is a constant problem with unbalanced audio systems. Lots of different methods have been tried to get around the problem, many of them dangerous. 
Clipping off the ground leads of equipment so there is no common power line path between them simply makes any fault or leakage current follow some other path: back through the signal cable to some equipment that has a ground -- perhaps through the user's body, if all the ground pins have been removed.

The only general solution to "ground loop" coupling with unbalanced equipment is to connect all the chassis together with a very low resistance path (copper strap or braid, for example), on the principle that since the resistance is so low, any leakage current will produce a correspondingly low signal voltage. It may also be effective to interrupt the ground path of shield conductors over signal wires; force the return path to go through the designated common strap while leaving the shield in place only for electrostatic screening.

With balanced equipment, no current should be flowing in the shield conductors, and in fact performance should be identical with the shield left disconnected at one end (preferably the receiver end). Therefore balanced systems should be impervious to shared path coupling or "ground loop" problems. But in fact they aren't, because most signals inside a given piece of equipment are unbalanced, and there are often return paths internal to the equipment that can be shared with return paths between other units of equipment connected to it. Especially with mixed digital, video and audio signals and high gain, high negative feedback amplifier circuitry, this can be a big problem -- small currents can create big effects -- and this brings us to the next question.

[David]

--
Q3.8 - What is the "Pin 1 problem" and how do I avoid it?

This is a special case of "ground loop" or shared path coupling. Recently this has been discussed in great detail and clarity by a group led by the consultant Neil Muncy of Toronto.

Suppose you have a mixer, whose balanced output is connected to an amplifier's balanced input through a correctly wired cable.
Both units are powered from the AC mains and one or both have some small amount of AC leakage current that travels to ground through all available ground paths -- including the shield of the cable that connects the two units. So far so good: no harm is done, because the circuit is balanced and any common mode voltage from current flowing through the shield will be canceled by the amplifier input.

However... a small part of this leakage current also travels through the shield of the wire going from the back panel XLR connector to the PC board, through some "ground" traces on the PC board, and back out through the power line ground cable. No problem so far, except that some gain stage on that same PC board also uses that piece of ground trace in its negative feedback loop, and some part of that leakage signal will be added to the signal in that gain stage. It might be video, or data, or other audio signal(s), or (most commonly) power.

The solution to this variant of shared path coupling is the same sort of approach that applies to other unbalanced signals: give the leakage current a very low resistance path to follow, and remove as many of the shared paths as possible. Within a unit of equipment, all the XLR connectors' pin 1 terminals should be connected to ground with very low resistance (big) wire or traces, and preferably all of the ground connections should be made at one point, the so-called "star ground" system.

A brute force approach is to assume that the back panel is the star ground, and wire every connector's pin 1 solidly to the panel as directly as possible, and lift all the ground wires but one that go from the connectors to the circuitry. In this way, all the external leakage currents (the "fox" to use Neil Muncy's term) will be conducted through the back panel and out of the way, rather than running through the ground traces on the PC board where they will mix with internal low level signals in high gain stages (the "hen house").
Individual wires can be run from points on the circuit board that need to be at "ground" potential to a common point on the back panel, which is designated a "zero signal reference point" (ZSRP). Equipment that has a reputation for being "quiet" and easy to use in many different applications is often found to be wired this way, while equipment that is "temperamental" is often found to be wired in such a way that leakage currents are easily coupled to internal signal lines.

There's a simple test that can be done to check equipment susceptibility to this problem. Connect the output, preferably balanced and floating, of an ordinary audio oscillator to the pin 1 of any two XLR connectors on the equipment. Now operate the equipment through its various modes, gain settings, etc. You may be surprised to find the audio oscillator's signal appearing in many different places in the equipment.

All of these issues are expertly explained in a single issue of the Journal of the AES, June 1995. You can get this as a reprint from the AES office in New York, http://www.aes.org/

[David]

=====
Section IV - Analog tape recording

--
Q4.1 - What does it mean to "align" a tape machine?

There are a number of standard adjustments on any analogue tape machine, which can roughly be broken up into mechanical and electronic adjustments.

The mechanical adjustments include the head position (height, skew, and azimuth), and sometimes tape speed. Incorrect head height will result in poor S/N and leakage between channels, because the tracks on the head do not match up exactly with those on the tape. Incorrect tape skew will result in level differences between channels and uneven head wear, because there is more pressure on the top of the head than the bottom (or vice versa). Incorrect azimuth will result in loss of high frequency response and strange skewing of the stereo image. Tape speed error will result in tonal shifts.
On many machines with capstan speed controlled by crystal or line frequency, however, tape speed is not adjustable.

Electronic adjustments include level and bias adjustments for each channel. Some machines may have bias frequency adjustments, equalization adjustments for playback and record emphasis, pre-distortion adjustments, and a varied bevy of adjustments for noise reduction systems.

Alignment is relatively simple, and the same general method applies from the smallest cassette deck to the largest multi-track machine.

First, put a test tape on the machine. Use a real reference tape, from the manufacturer, from MRL, or a similarly legitimate lab. DO NOT EVER use a homebrew test tape that was recorded on a "known good" machine. You will regret it someday. Spend the money and get a real test tape (and not one of the flaky ones from RCA).

1. Speed adjustment (if necessary). Play back a 1 kHz reference tone and, using a frequency counter, adjust the tape speed for proper frequency output. There are strobe tapes available for this as well, but with cheap frequency counters available, this method is much easier.

2. Head height and skew adjustments. Better see your machine's manual on this one, because I have seen a variety of ways of doing this.

3. Azimuth adjustment. I find the easiest way to do this is to take the left and right outputs and connect them to the X and Y inputs of an oscilloscope, and play back a 1 kHz reference tone, while adjusting the azimuth until a perfectly-diagonal line appears. You can do this by ear if you are desperate, but I strongly recommend the Lissajous method, which is faster and more accurate. On multi-track decks, use the two tracks as close as possible to the edge of the tape. Now you have the playback head azimuth set... put a 1 kHz source into the record input, with a blank tape on the machine, and adjust the azimuth of the record head for the proper diagonal line.

4. Playback eq adjustment (if necessary).
This is a case of playing back various test tones at different frequencies, and adjusting the response curve of the deck to produce a flat output. You can also do this by playing back white noise and using a third-octave spectrum analyzer of great accuracy to adjust for flat response. Again, this is one to check your deck's manual for, because the actual adjustments vary from one machine to another, and you will want to use the test tape once again.

5. Record eq adjustment (if necessary). How this is done (and whether you want to do it after biasing the tape) depends a lot on your deck.

6. Bias adjustment. There are a lot of ways to do this. My favorite method is to use a white noise source, and adjust the bias until the source and tape output sound identical. Some people prefer to use a signal generator and set the bias so that the levels of recorded tones at 1 kHz and 20 kHz are identical. I find I can get within 0.5 dB by ear, though your mileage may differ.

[Ed. note: Many tapes have recommended overbias settings, and many decks will also provide a chart that correlates the amount of overbias against available tape formulations. -Gabe]

7. Record level adjustment. I use a distortion analyzer, and set the level so that at +3 dB, I get 3% distortion on the output. Some folks who are using very hot tape set the machines so that a certain magnetic flux is produced at the heads given a certain input, but I find setting for a given distortion point does well for me. If you don't have a distortion analyzer, use a 1 kHz tone source and set the level so that you have the onset of audible distortion at +3 dB, and you will be extremely close.

[Ed. note: The traditional way to do this is to align the repro side of the machine using a calibration tape, and then to put the machine into record. Monitoring off the repro head, the operator then aligns the record electronics until the output is flat. -Gabe]

At this point, you will be pretty much set.
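
For the speed adjustment in step 1, the frequency counter reading converts directly into a speed error. This is just a small sketch in Python, on the assumption that the played-back frequency scales in direct proportion to tape speed; the function names and the 1005 Hz reading are made up for the example.

```python
import math


def speed_error_percent(measured_hz, reference_hz=1000.0):
    """Tape speed error implied by playing back a reference tone.

    Positive means the machine runs fast, negative means slow.
    """
    return (measured_hz / reference_hz - 1.0) * 100.0


def pitch_shift_cents(measured_hz, reference_hz=1000.0):
    """The same error expressed as a pitch shift (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(measured_hz / reference_hz)


# Hypothetical reading: the 1 kHz tone on the test tape reads 1005 Hz.
error_percent = speed_error_percent(1005.0)
error_cents = pitch_shift_cents(1005.0)
```

With that hypothetical reading the machine is running about half a percent fast, and you would trim the speed until the counter shows the reference frequency.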
Whether you want to do this all on a regular basis is a good question. You should definitely go through the complete procedure if you ever change brands of tape. Checking the mechanical parameters on a regular basis is a good idea with some decks (like the Ampex 350), which tend to drift. Clean your heads every time you put a new reel on, and demagnetize regularly.

[Scott]

There are two other references for aligning a tape deck available on the web at the following URLs.

Larry Seyer - How To Align An Analog Tape Recorder
http://www.larryseyer.com/align.htm

Bill Vermillion - Magnetic Tape Recorder Alignment
http://www.arachnaut.org/service/bilver-align.html

[Mark]

--
Q4.2 - What is bias? What is overbias?

With just the audio signal applied to a tape, the frequency response is very poor. High frequency response is much better than low frequency, and the low frequency distortion is very high. In 1906, the Poulsen Telegraphone managed to record an intelligible voice on a magnetic medium, but it was not until the 1930s that this problem was solved by German engineers.

To compensate for the tape characteristic, a very high frequency signal is applied to the tape in addition to the audio. This is typically in the 100 kHz range, far above the audio range. With the bias adjusted properly, the frequency response should be flat across the audible range. With too low bias, bass distortion will be the first audible sign, but with too much bias, the high frequency response will drop off.

Incidentally, digital recording equipment takes advantage of the very non-linearity that is a problem with analogue methods. It records a square wave on the tape, driving the tape into saturation at all times, and extracts the signal from the waveform edges. As a result, no bias is required. (For a good example of the various digital recording methods, check out NASA SP 5038, Magnetic Tape Recording.)

[Scott]

[Ed. note: For those looking for an understanding of why we need bias in the first place, here is one way to think about it. Tape consists of lots of small magnetic particles called domains. These domains are exposed to a magnetic field from the record head and oscillate in polarity as the AC signal voltage changes. Domains, being physical objects, have inertia. Every time the analog signal crosses from positive to negative and back again, the voltage passes the zero point for an instant. At this moment, the domain is at rest, and like any other physical object, there is a short period of inertia before it gets moving again. The result is the bizarre high-frequency performance characteristic that Scott described. The high frequency of a bias signal simply ensures that the domains are always kept in motion, negating the effect of inertia at audio frequencies. -Gabe]

--
Q4.3 - What is the difference between Dolby A, B, C, S, and SR? How do each of these systems work? How do they affect the sound?

The Dolby A, B, C, SR, and S noise reduction (NR) systems are non-linear level-dependent companders (compressors/expanders). They offer various amounts of noise reduction, as shown in the table below.

Dolby    HF NR    LF NR    Number Of Active                Target
System   Effect   Effect   Frequency Bands                 Market      Year
------   ------   ------   ----------------------------    ---------   ----
A        10 dB    10 dB    4 fixed                         Pro audio   1967
B        10 dB    --       1 sliding (HF)                  Domestic    1970
C        20 dB    --       1 sliding (HF)                  Domestic    1981
SR       24 dB    10 dB    1 sliding (HF), 1 fixed (LF)    Pro audio   1986
S        24 dB    10 dB    1 sliding (HF), 1 fixed (LF)    Domestic    1990
------   ------   ------   ----------------------------    ---------   ----

The band-splitting system used with Dolby A NR is a relatively costly technique, although it can deal with noise at all frequencies.
The single sliding band techniques used in Dolby B and C systems are less costly, making them more suitable for consumer tape recording applications where the dominant noise contribution occurs at high frequencies.

The typical on-record frequency response curves for the Dolby B NR system look something like those depicted below. The curves for Dolby C, SR, and S are similar, but the actual response levels and behavior at high frequencies are modified to extract better performance from these more advanced systems.

           |
      0dB -|----------------------------------------------------
           |
           |
    -10dB -|
           |                    /-------------------------------
           |                   /
    -20dB -|------------------/
           |
           |
    -30dB -|                   /-----------------------------
           |                  /
           |                 /
    -40dB -|----------------/
           |______________________________________________________
           |                      |                             |
          20Hz                   1kHz                         20kHz

The above picture attempts to show that the encoding process provides selective boost to high frequency signals (decoding is the exact reciprocal), and the curves correspond to the results achieved when no musical signal is applied. The amount of boost during the compansion depends on the signal level and its spectral content. For a tone at -40dB at 3 kHz, the boost applied to signals with frequencies above this would probably be the full 10dB allowed by the system. If the same tone were at a level of -20dB, then the boost would be less, maybe about 5dB. If the tone were at 0dB, then no boost would be supplied, as tape saturation would be increased (beyond its normal amount).

The single band of compansion utilized with Dolby B NR reaches sufficiently low in frequency to provide useful noise reduction when no signal is present. Its width changes dynamically in response to the spectral content of music signals. As an example, when used with a solo drum note the companding system will slide up in frequency so that the low frequency content of the drum will be passed through at its full level.
On replay, the playback of the bass drum is allowed to pass through without modification to its level, while the expander lowers the volume at high frequencies above those of the bass drum, thus providing a reduction in tape hiss where there is no musical signal. If a guitar is now added to the music signal, the companding band slides further up in frequency allowing the bass drum and guitar signals through without any compansion, while still producing a worthwhile noise reduction effect at frequencies above those of the guitar.

The Dolby B NR system is designed to start taking effect from 300Hz, and its action increases until it reaches a maximum of 10dB upwards of 4kHz. Dolby C improves on this by taking effect from 100Hz and providing about 15dB of NR at 400Hz, increasing to a maximum of 20dB in the critical hiss region from 2kHz to 10kHz.

Dolby C also includes spectral skewing networks which introduce a roll off above 10kHz prior to the compander when in encoding mode. This helps to reduce compander errors caused by unpredictable cassette response above 10kHz, and an inverse boost is added after the expander to compensate. Although this reduces the noise reduction effect above 10kHz, the ear's sensitivity to noise in that region is diminished, and the improved encode/decode tracking provides important improvements in overall system performance. An anti-saturation shelving network, beginning at about 2kHz, also acts on the high frequencies but it only affects the high-level signals that would cause tape saturation. A complementary network is provided in the decode chain to provide overall flat response.

When the tape is played back, the inverse of the above process takes place. For an accurate decoding to occur, it is necessary that playback takes place with no offsets in levels between record and replay; i.e., if a 400 Hz tone is recorded at 0dB (or -20dB), then it must play back at 0dB (or -20dB). This will help ensure correct Dolby "tracking".
Just think about it: if a -40dB tone at 8kHz were recorded with Dolby B on, then it would actually have a level of -30dB on tape. The same tone, if it were at a -20dB level, would have a level of about -15dB on tape. If the sensitivity of the tape were such that anything recorded at 0dB actually went on tape as -10dB, then you can see that the Dolby encoded tones would actually be at a lower level, and the system would have no way of determining this. It assumes 0dB in = 0dB out. Hence the signal would be decoded with the incorrect amount of de-boost.

The Dolby SR and S NR systems provide slightly more NR than Dolby C at high frequencies, 24dB vs. 20dB, but they also achieve a 10dB NR effect at low frequencies below 200Hz as well. This is obtained using a two-band approach, the low frequencies being handled by a fixed-band processor, while a sliding band processor tackles the high frequencies. This reduces the potential for problems such as "noise pumping", caused by high-level low frequency transient signals (bass notes from drums, double basses, organs), raising the sound level in a cyclic fashion.

Dolby SR and S also contain the spectral skewing and anti-saturation circuits for high-level high-frequency signals that are implemented with Dolby C. The performance of the sliding band is improved over that obtained with Dolby B and C NR systems by reducing the degree of sliding that occurs in the presence of high-frequency signals. This increases the noise reduction effect available at frequencies below those occurring in the music signal.

An additional benefit of the Dolby S NR system for consumers is that the manufacturers of cassette decks who are licensed to use the system must adhere to a range of strict performance standards. These include an extended high frequency response, tighter overall response tolerances, a new standard ensuring head height accuracy, increased overload margin in the electronics, lower wow and flutter, and a head azimuth standard.
These benefit users by enhancing the performance of cassette recorders as well as helping to ensure that tapes recorded on one deck will play back accurately on any other.

[Witold Waldman - witold@aed.dsto.gov.au]

Improvement in signal-to-noise ratio, or any other parameter for that matter, doesn't come without a price. In the case of Dolby noise reduction, the calibration of record and playback levels is critical. Without the right setup, the wrong part of the playback transfer curve will be overlaid on the record transfer curve, with the result that there's a strange bump in the overall linearity of the recording. So for any of these methods, it is essential to read and understand the Dolby setup procedure and make sure that the calibration tone (which also uniquely identifies the type of Dolby being used) is recorded at the correct level, and then the playback unit can be matched to that level.

Once the levels are set correctly, the remaining sonic artifacts have to do with the tape being pushed closer to its limits in the extremes of the frequency range. The high frequency information thus sometimes seems a bit more compressed (besides being accompanied by less noise). And some would argue that running the audio through another dozen or more op-amps per channel must create sonic artifacts too.

[David]

--
Q4.4 - What is Dolby HX-Pro?

HX-Pro is a scheme to reduce the level of the bias signal when high frequency information is present in the recorded signal. Sufficient high frequency information will act to bias the tape itself, and by reducing the AC bias signal somewhat, additional signal can be applied without saturating the tape.

This is a single-ended system; it requires no decoding on playback, because it merely permits more signal to be recorded on the tape. In theory it is an excellent idea, and some implementations have lived up to the promise of the method, although some other implementations have produced unpleasant artifacts.
[Scott]

--
Q4.5 - How does DBX compare to Dolby?

[Anyone?]

[Gabe]

--
Q4.6 - How much better are external microphone preamplifiers than those found in my portable recorder?

Going by the rule that "external is better than internal," the external pre-amps are likely to sound better. Besides the issue of electrical shielding and interaction, it is simply the case that a designer who is spending *all* his time on a project designing only a pre-amp is likely to do a better job of it than a tape machine design team that has to worry about how they're going to fit the pre-amp into the box and still have enough room for the rest of the tape machine.

[Gabe]

--
Q4.7 - What is an MRL? Where do I get one?

An MRL is a reference alignment tape from Magnetic Reference Laboratory. These tapes, available in every conceivable tape speed, tape width, equalization, and field strength, contain alignment tones useful in calibrating the electronics of analog tape machines. These tapes can be ordered from many pro audio dealers. If your dealer does not carry them, you can contact MRL directly at:

Magnetic Reference Laboratory, Inc.
229 Polaris Avenue, Suite 4
Mountain View, CA 94043
Tel: (650) 965-8187
Fax: (650) 965-8548
http://www.flash.net/~mrltapes

=====
Section V - Digital recording and interconnection

--
Q5.1 - What is sampling? What is a sampling rate?

Sampling can be (roughly) defined as the capture of a continuously varying quantity at a precisely defined instant in time. Most usually, signals are sampled at a set of sample-points spaced regularly in time. Note that sampling in itself implies nothing about the representation of sample magnitude by a number. That process is called quantisation.

The Nyquist theorem states that in order to faithfully capture all of the information in a signal of one-sided bandwidth B, it must be sampled at a rate greater than 2B.
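
One way to see why the rate has to exceed 2B is to sample two tones, one below half the sample rate and one above it, and compare the resulting numbers. The following is only an illustrative sketch in Python; the function name and the choice of a 4.1 kHz tone are assumptions made for the example.

```python
import math


def sample_tone(freq_hz, sample_rate_hz, count):
    """Return `count` samples of a unit-amplitude cosine tone."""
    return [math.cos(2.0 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]


sample_rate = 44100.0
low_tone = sample_tone(4100.0, sample_rate, 32)                 # 4.1 kHz
high_tone = sample_tone(sample_rate - 4100.0, sample_rate, 32)  # 40 kHz

# Largest difference between the two sample sets; rounding error only.
max_difference = max(abs(a - b) for a, b in zip(low_tone, high_tone))
```

The two sample sets agree to within rounding error, so once the samples are taken nothing can tell the 40 kHz tone from the 4.1 kHz one. Anything above half the sample rate therefore has to be removed before sampling.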
A direct corollary of this is that if we wish to sample at a rate of 2B then we must pre-filter the signal to a one-sided bandwidth of B, otherwise it will not be possible to accurately reconstruct the original signal from the samples. The frequency 2B that is the minimum sample rate to retain all of the signal information is called the Nyquist rate.

The spectrum of the sampled signal is the same as the spectrum of the continuous signal except that copies (known as aliases) of the original now appear centered on all integer multiples of the sample rate. As an example, if a signal of 20 kHz bandwidth is sampled at 50 kHz then alias spectra appear from 30 - 70 kHz, 80 - 120 kHz, and so on. It is because the alias spectra must not overlap that a sample rate of greater than 2B is required.

In digital audio we are concerned with the base-band - that is to say the signal components which extend from 0 to B. Therefore, to sample at the standard digital audio rate of 44.1 kHz requires the input signal to be band-limited to the range 0 Hz to 22.05 kHz.

[Chris]

--
Q5.2 - What is oversampling?

To take distortion-less samples at 44.1kHz requires that the analogue signal be band limited to 22.05kHz. Since the audio band is reckoned to extend to 20kHz we require an analogue filter that cuts off very sharply between 20kHz and 22kHz to accomplish this. This is expensive, and suffers from all the ailments associated with analogue electronics.

Oversampling is a technique whereby some of this filtering may be done (relatively cheaply and easily) in the digital domain. By sampling at a high rate (for example 4 times 44.1kHz, or 176.4kHz) the analogue filter can have a much lower slope since its transition band is now 20kHz to 88kHz (i.e. half of 176kHz). The samples are then passed through a digital filter with a sharp cutoff at 20kHz, after which three of every four are discarded, resulting in the sample stream at 44.1kHz that we require.
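
The arithmetic behind the relaxed analogue filter can be sketched as follows. This is only a rough illustration in Python; it takes the transition band to run from the top of the audio band to half the sample rate, as described above, and assumes the digital low-pass filter has already been applied before samples are discarded. The function names are invented for the example.

```python
import math


def transition_band(audio_bandwidth_hz, sample_rate_hz):
    """Return (start_hz, stop_hz, width_in_octaves) of the transition band.

    The band runs from the top of the audio band up to half the sample rate.
    """
    stop_hz = sample_rate_hz / 2.0
    octaves = math.log2(stop_hz / audio_bandwidth_hz)
    return audio_bandwidth_hz, stop_hz, octaves


def decimate(samples, factor):
    """Keep one sample in every `factor`, discarding the rest.

    Assumes the samples were already digitally low-pass filtered.
    """
    return samples[::factor]


plain = transition_band(20000.0, 44100.0)
oversampled = transition_band(20000.0, 4 * 44100.0)
kept = decimate(list(range(8)), 4)
```

At 44.1kHz the analogue filter has only a small fraction of an octave in which to do its work; at four times that rate it has exactly two octaves more, which is why a much gentler (and cheaper) filter will do.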
[Chris]

--
Q5.3 - What is the difference between a "1 bit" and a "multibit" converter? What is MASH? What is Delta/Sigma? Should I really care?

Audio data is stored on CD as 16-bit words. It is the job of the digital to analogue converter (DAC) to convert these numbers to a varying voltage. Many DAC chips do this by storing electric charge in capacitors (like water in buckets) and selectively emptying these buckets to the analogue output, thereby adding their contents. Others sum the outputs of current or voltage sources, but the operating principles are otherwise similar.

A multi-bit converter has sixteen buckets corresponding to the sixteen bits of the input word, and sized 1, 2, 4, 8 ... 32768 charge units. Each word (i.e. sample) decoded from the disc is passed directly to the DAC, and those buckets corresponding to 1's in the input word are emptied to the output. To perform well the bucket sizes have to be accurate to within +/- half a charge unit; for the larger buckets this represents a tolerance tighter than 0.01%, which is difficult. Furthermore the image spectrum from 24kHz to 64kHz must be filtered out, requiring a complicated, expensive filter.

Alternatively, by using some digital signal processing, the stream of 16-bit words at 44.1kHz can be transformed to a stream of shorter words at a higher rate. The two data streams represent the same signal in the audio band, but the new data stream has a lot of extra noise in it resulting from the word length reduction. This extra noise is made to appear mostly above 20kHz through the use of noise shaping, and the oversampling ensures that the first image spectrum occurs at a much higher frequency than in the multi-bit case. This new data stream is now converted to an analogue voltage by a DAC of short word length; subsequently, a simple analogue filter can filter out most of the noise above 20kHz without affecting the audio signal.
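
To make the idea of trading word length for sample rate more concrete, here is a minimal sketch in Python of a first-order delta-sigma modulator. It is a generic textbook structure chosen for illustration, not the circuit used in Bitstream, MASH, or any other commercial converter, and the function name and test value are assumptions.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator (textbook form, for illustration).

    Takes input values in the range -1.0 to +1.0, already repeated at the
    oversampled rate, and returns a stream of +1.0 / -1.0 output values.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output.
        integrator += x - feedback
        # One-bit quantiser: only the sign of the accumulated error is kept.
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits


# Hypothetical input: a constant level of 0.25, held for 4096 output clocks.
bits = delta_sigma_1bit([0.25] * 4096)
average = sum(bits) / len(bits)
```

Every output value is one of just two levels, yet the average over many clocks follows the input closely. The error has not vanished; the feedback loop pushes most of it up to high frequencies, where the simple analogue filter can remove it.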
Typical configurations use 1-bit words at 11.3MHz (256 times oversampled), and 4-bit words at 2.8MHz (64 times oversampled). The former requires one bucket of arbitrary size (very simple); it is the basis of the Philips Bitstream range of converters. The latter requires four buckets of sizes 1, 2, 4 and 8 charge units, but the tolerance on these is relaxed to about 5%.

MASH and other PWM systems are similar to Bitstream, but they vary the pulse width at the output of the digital signal processor. This can be likened to using a single bucket but with the provision to part fill it. For example, MASH allows the bucket to be filled to eleven different depths (this is where they get 3.5 bits from, as 2^(3.5) is approximately eleven).

Lastly it is important to note that these are all simply different ways of performing the same function. It is easy to make a lousy CD player based around any of these technologies; it is rather more difficult to make an excellent one, regardless of the DAC technology employed. Each of the conversion methods has its advantages and disadvantages, and as ever it is the job of the engineer to balance a multitude of parameters to design a product that represents value for money to the consumer.

[Chris]

--
Q5.4 - On an analog recorder, I was always taught to make sure the signal averages around 0 VU. But on my new DAT machine, 0 is all the way at the top of the scale. What's going on here?

Analog recorders are operated such that the signal maintains a nominal level that strikes a good balance between signal-to-noise ratio and headroom. Further, since analog distorts very gently, you often can exceed your headroom in little bits and not really notice it.

Digital is not nearly as forgiving. Since digital represents audio as numerical values, higher levels will eventually force you to run out of numbers. As a result, there is an absolute ceiling as to how hot you can record.
If you record analog and have a nominal 12 dB of headroom, you'll probably be okay if you have one 15 dB transient that lasts for 1/10th of a second. The record amps _might_ overload, the tape _might_ saturate, but you'll probably be fine. In a digital system, that same 3 dB of overshoot would cause you to clip hard. It would not be subtle or forgiving. You would hear a definite snap as you ran out of room and chopped the top of your waveform off.

The reality is that digital has NO HEADROOM, because there is no margin for overshoot. You simply must make sure that the entire dynamic range of the signal fits within the limits of the dynamic range of your recorder, without exception.

The only meaningful absolute on a digital recorder, therefore, is the point at which you will go into overload. The result is the metering system we now have. 0 dB represents digital ceiling, or full-scale. The negative numbers on the scale represent your current level relative to the ceiling. Thus, to return to our example, if you have a transient with 15dB of overshoot past your nominal level, you must then place your nominal level at a maximum of -15 dB. 0 dB on the meters is the absolute limit of what you can record.

[Gabe]

--
Q5.5 - Why doesn't MiniDisc or Digital Compact Cassette sound as good as DAT or CD? After all, they're both digital.

Both MD and DCC use lossy compression algorithms (called ATRAC and PASC respectively); crudely, this means that the numbers coming out of the machine are not the same as those that went in. The algorithms use complex models of the way the ear works to discard the information that they predict would not be heard anyway.

For example, if a pin dropped simultaneously with a gunshot, it may be reasonable to suggest that it isn't worth bothering to record the sound of the pin! In fact it turns out that around 75 to 80 per cent of the data for typical music can be discarded with surprisingly little quality loss.
However, nobody denies that there is a quality loss, particularly after a few generations of copying. This fact and others make both MD and DCC useful only as a consumer-delivery format. They have very little use in the studio as a recording or (heaven forbid!) mastering format.

[Chris]

Recent advances have made the MD much closer to the uncompressed signal. They have also been integrated into a number of lower cost 4 and 8 track recording devices. MD is now also found in a number of radio stations.

[Joe] [9/98]

--
Q5.6 - What is S/P-DIF? What is AES/EBU?

AES/EBU and S/P-DIF describe two similar protocols for communicating two-channel digital audio information over a serial link. Their basic format is almost identical, but there are enough differences in the details that the two are, for all intents and purposes, electrically incompatible.

Both of these digital protocols are described fully in an international standard, IEC 958, available from the International Electrotechnical Commission.

AES/EBU (which stands for the joint Audio Engineering Society/European Broadcasting Union standard) is the so-called "professional" protocol. It uses standard 3-pin XLR connectors and 110-ohm balanced differential cables for connection (no, standard microphone cables, not even good quality cables, won't work, even though it seems they might) and a 5 volt, differential signal.

S/P-DIF (which stands for Sony/Philips Digital InterFace, a now obsolete standard superseded by IEC 958) is the so-called "consumer" format. It uses what appear to be standard RCA connectors and cables, but in fact requires 75-ohm connectors and cables. Good quality video "patch" cables have proven adequate (no, standard "audio" patch cords, even excellent quality versions, have been shown not to work). The signals are 0.5 volts unbalanced.

The actual data streams are very similar. Each sample period a "frame" is transmitted.
Each frame consists of two "subframes", one each for the left and right channels; each subframe is 32 bits wide. In that subframe, 4 bits are used for synchronization, then up to 24 bits are usable for audio (the "consumer mode" format is limited to 16 bits). The remaining four bits are used for parity (the first level of error detection), validity, user status and channel status. For each channel, 192 subframes are collected, and their 192 user bits and 192 channel status bits are assembled into two separate blocks of 24 8-bit bytes.

The channel status bytes are interesting, because they contain the important control information and the major differences between the two protocol formats. One bit tells whether the data stream is professional or consumer format. There are bits that specify (optionally) the sample rate, de-emphasis standards, channel usage, and other information. The consumer format has several bits allocated to copy protection and control: the SCMS bits.

Now, the notion that all of this is encoded in a standard may be reassuring, but a standard is nothing but a voluntary statement of common industry practice. There is a lot of incompatibility between equipment out there caused directly by subtle differences between interpretations and implementations. The result is that some pieces of equipment simply refuse to talk to each other. Even THAT possibility is stated in the standard! [Dick]

--

Q5.7 - What is clock jitter?

Clock jitter is a colloquialism for what engineers would readily call time-domain distortion. Clock jitter does not actually change the physical content of the information being transmitted, only the time at which it is delivered. Depending on circumstance, this may or may not affect the ultimate decoded output.

Let's look at this a little more closely. Digital audio is sent as a set of binary digits: 1's and 0's. But that is only a logical construct. In order to transmit binary math electrically, we use square waves.
Realize that although we have two mathematical states, we have to transmit such a construct using control voltages and comparators. All digital audio systems start with a crystal controlled oscillator producing a square wave signal that is used to synchronize the entire digital audio sampling and playback processes.

Now, for a clock, we don't really care about the fact that the clock might be at state 1 or state 0 at any given moment. That doesn't give us any information. As a computer, I can't tell if my clock has just gotten to state 1, or if it's been sitting there for a microsecond. Thus it isn't the states we care about. Instead, we care about the state changes - when the clock shifts from one state to the other.

Now, in a perfect square wave (no such thing exists), the change of state would be instantaneous. BOOM...it's done. But in reality, it doesn't work this way. Square waves contain high orders of harmonics. Fourier teaches us that all complex waveforms are made up of simpler waveforms. Thus, as we run through noisy electronics, long cables, and inadvertent filtering circuits, we begin to lose some of our harmonics. When this happens, our square wave begins to lose form.

The result of this is that our nice sharp corners become rounded. So our state changes are no longer precisely at the edge anymore, because there is no more edge. The pointy edge is now all fuzzy. When the clock state changes now depends on the design of the electronic comparator circuit, because the state change has shifted. The clock is, essentially, jittering.

People love to bark out "Bits is bits. A copy of a computer file works as well as the original." Yes, this is true. But these jittering bits can create audible distortion during the digital to analog conversion, and the industry is working hard to reduce the amount of jitter present in digital systems.
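As a rough illustration of the two ideas above (a rounded edge shifts the moment the comparator fires, and a timing error turns into an amplitude error at conversion), here is a minimal Python sketch. It assumes a linear edge and a full-scale sine wave, and it uses the common small-jitter approximation SNR = -20 log10(2 pi f sigma). The function names and the example numbers are made up for illustration and do not come from any standard or product.

```python
import math

def edge_timing_shift(noise_volts, swing_volts, rise_time_s):
    """Shift (seconds) of a threshold crossing when noise rides on an edge.

    Assumption: the edge is a straight ramp of swing_volts over rise_time_s,
    so a voltage error maps to a time error through the slew rate.
    """
    slew_rate = swing_volts / rise_time_s  # volts per second
    return noise_volts / slew_rate

def jitter_limited_snr_db(freq_hz, jitter_rms_s):
    """Approximate SNR ceiling for a full-scale sine sampled with rms jitter.

    Assumption: small, random jitter, so the error is slope times time error.
    """
    return -20.0 * math.log10(2.0 * math.pi * freq_hz * jitter_rms_s)

# 50 mV of noise on a 0.5 V edge that takes 100 ns to rise
print(edge_timing_shift(0.05, 0.5, 100e-9))   # timing shift in seconds
# 20 kHz tone sampled with 1 ns rms of clock jitter
print(jitter_limited_snr_db(20_000, 1e-9))    # SNR ceiling in dB
```

Under these assumptions the slower the edge, the more a given amount of noise moves the state change in time, which is why rounded edges and jitter go together.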
Furthermore, emerging research is suggesting that certain types of jitter may produce digital copies with eccentricities that result in more jittery output on playback. The jury is still out on the specifics, however. Stay tuned. [Gabe]

--

Q5.8 - What kind of AES/EBU or S/P-DIF cables should I use? How long can I run them?

The best, quick answer is what cables you should NOT use!

Even though AES/EBU cables look like ordinary microphone cables, and S/P-DIF cables look like ordinary RCA interconnects, they are very different.

Unlike microphone and audio frequency interconnect cables, which are designed to handle signals in the normal audio bandwidth (let's say that goes as high as 50 kHz or more to be safe), the cables used for digital interconnects must handle a much wider bandwidth. At 44.1 kHz, the digital protocols are sending data at the rate of 2.8 million bits per second, resulting in a bandwidth (because of the biphase encoding method) of 5.6 MHz.

This is no longer audio, but falls in the realm of bandwidths used by video. Now, considerations such as cable impedance and termination become very important, factors that have little or no effect below 50 kHz.

The interface requirements call for the use of 110 ohm balanced cables for AES/EBU interconnects, and 75 ohm coaxial unbalanced interconnects for S/P-DIF interconnects. The use of the proper cable and the proper terminating connectors cannot be overemphasized. I can personally testify (having, in fact, looked at the interconnections between many different kinds of pro and consumer digital equipment) that ordinary microphone or RCA audio interconnects DO NOT WORK. It's not that the results sound subtly different; it's that, much of the time, the receiving equipment is simply unable to decode the resulting output and shuts down.

Fortunately, there is a ready solution for S/P-DIF cables.
Any store that sells high quality 75 ohm RCA video interconnect (or "dubbing") cables also sells high-quality S/P-DIF interconnects. They may not know it, but they do. This is because the signal and band pass requirements for video and S/P-DIF cables are the same. National chains such as Radio Shack sell such cables, and the data seems to indicate that they are good digital interconnects.

For AES/EBU, there are fewer, less common solutions. Companies such as Canare make excellent cables. Professional audio suppliers and distributors may be good sources for such cables. If you are handy with a soldering iron, then you can purchase 110 ohm balanced shielded cable and make your own (which I have done quite successfully). Cables such as Alpha Twinax, Carol Twin Coaxial, Belden 9207 twin axial, and the like, all work well for this application. Use high-quality XLR connectors (be warned that these cables are 0.330 inches in diameter and are a VERY tight fit in the neoprene strain reliefs of many connectors: warming them in hot water makes them pliable enough to work well).

As to how long these cables can be, it's hard to say. However, a couple of general rules apply.

S/P-DIF was NEVER intended to be a long-haul hardware interconnect. The relevant specifications talk of interconnect lengths less than 10 meters (33 feet). In fact, many pieces of equipment cannot tolerate cables even that long, due to the excessive capacitance and possibly induced common mode interference.

AES/EBU is more tolerant of longer runs because it is balanced (thus more immune to interference) and it's run at a higher signal level (5 volts instead of 0.5 volts). The standards "allow signal transmission up to a few hundred meters in length."

The reality is that much is highly dependent upon the actual conditions at hand. The requirements are that the received signal fit within certain limits of rise time/period and voltage level, the so-called "eye diagram".
In other words, regardless of what kind of cable you use, if it can't move the voltage at the receiver far enough soon enough, it simply isn't going to work.

Another complicating factor is that both protocols allow a degree of multi-drop capability. This means a single transmitter can drive several receivers (the last of which must be terminated with the proper termination impedance). However, implementing multi-drop puts more stringent requirements on impedance matching. [Dick]

--

Q5.9 - What is SCMS? How do I defeat it?

SCMS is the Serial Copy Management System, a form of copy protection that was mandated by Federal law (the Audio Home Recording Act). SCMS consists of a set of subcode flags that indicate to a digital recorder whether or not the source may be copied. Under the AHRA, consumers are permitted to make one digital generation, but no more. Thus when, for instance, the consumer copies a CD onto DAT, the SCMS flag is set on the copy and no further generations can be made.

SCMS is only mandated in consumer machines. Any recorder sold through professional channels, and which is intended for use in professional applications, does not have to implement it.

There are several professional products, such as Digital Domain's FCN-1 format converter, which allow manipulation of the SCMS flags. These units exist so that professional engineers may adjust the subcode bits of the recordings they produce. [Gabe]

SCMS is now also found on some consumer grade CD burners. [Joe] [9/98]

--

Q5.10 - What is PCM-F1 format?

In the 1980s, before the DAT era, Sony produced a set of PCM adapters that enabled one to record digital audio using a video cassette machine. These units had RCA audio connections for input and output, as well as video I/O that could be sent to, and received from, the VCR. At the time, these systems offered performance far in excess of conventional analog recorders available in that price category.
Sony released many models, including the PCM-F1, PCM-501, PCM-601, and PCM-701. Perhaps the most interesting is the PCM-601, which has S/P-DIF digital I/O. These units are highly prized since they are the only units that can be used to make digital transfers of F1 tapes to modern hardware.

There are some engineers who insist that, despite the clunkiness of the format by modern DAT standards, the F1 series was the best digital format ever developed. To this day, it is not surprising to see an F1 encoder on a classical recording session. [Gabe]

--

Q5.11 - How do digital recorders handle selective synchronization?

Selective Synchronization, or "Sel-Sync" as it is often called, is the ability of a recorder to play and record simultaneously, allowing synchronous recording of new material onto specific tracks without erasing everything on tape. This technique is what makes overdubbing possible.

On an analog recorder, audio tracks are discrete entities, and the sync head is really just a stack of individual heads, any one of which is capable of recording or playing back. Thus sel-sync is a relatively simple matter of putting some heads into record and others into repro.

In the digital world, the problem is highly complex. First, A/D and D/A conversion involves an acquisition delay of several milliseconds. Second, and more importantly, digital tracks are not discrete. Rather, they are multiplexed together on a tape, along with subcode and other non-audio information. So how can you replace one track and leave the others untouched?

The answer is a technique called "read before write" (RBW) or "read, modify, write" (RMW) which involves a second set of heads. The data is read from the tape and flushed into a buffer, where it can be modified, and ultimately written back to the tape. Thus when you "punch in" on a digital deck, you are physically re-writing all the tracks, not just the one you're overdubbing.
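The read-modify-write idea can be sketched in a few lines of Python. This is a toy model, not how any real deck lays out its data: the "tape" is assumed to be a list of frames, each frame holding one sample per track, and the name punch_in is hypothetical.

```python
def punch_in(tape, track, start, new_samples):
    """Overdub one track of a multiplexed 'tape' by read-modify-write.

    tape:        list of frames; each frame is a list with one sample per track
    track:       index of the track being replaced
    start:       frame index where the punch-in begins
    new_samples: replacement samples for that track
    """
    for offset, sample in enumerate(new_samples):
        frame = list(tape[start + offset])  # read the whole frame into a buffer
        frame[track] = sample               # modify only the target track
        tape[start + offset] = frame        # write the whole frame back
    return tape

# Two-track example: replace track 1 from frame 1 onward
tape = [[1, 2], [3, 4], [5, 6]]
print(punch_in(tape, track=1, start=1, new_samples=[40, 60]))
```

Every frame in the punched region is rewritten in full, yet the samples belonging to the other track come back exactly as they were read.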
You are not, however, changing the data on any track other than the one you want to replace. [Gabe]

--

Q5.12 - How can a 44.1 kHz sampling rate be enough to record all the harmonics of music? Doesn't that mean that we chop off all the harmonics above 20 kHz? Doesn't this affect the music? After all, analog systems don't filter out all the information above 20 kHz, do they?

This whole question is based on the premise that "analog systems don't filter out all the information above 20 kHz." While there are indeed mixers and power amplifiers and other electronic systems that are capable of stunningly wide bandwidth, often exceeding 100 kHz, the same cannot be said for the entire analog reproduction chain. The mechanical transducers (microphones, speakers, and phono cartridges) seldom have real response far exceeding 20 kHz. In fact, some of the most highly regarded large diaphragm condenser microphones often used in very high quality recordings seldom exceed 18 kHz bandwidth. Analog tape recorders rarely have bandwidths as wide as 25 kHz, and LP reproduction systems have similar limitations in reality.

So while it may be possible to send very high frequency ultrasonic signals through parts of both analog and digital reproduction chains, there are, in both technologies, fundamental and insurmountable limits to the bandwidth that, in reality, lead to very similar actual reproducible bandwidths in each.

Thus, one of the basic premises of the question is flawed. Analog systems DO filter out information above 20 kHz.

Further, even the very best well-maintained analog reproduction systems have frequency response and phase errors far exceeding those of even middle-of-the-line digital equipment. Whether one person may find those errors tolerable or even likeable or not is a matter of personal preference that is beyond the scope of this or any other technical discussion.
There are a variety of anecdotal tales that are advanced to "prove" that the ear can hear far beyond what is conventionally accepted as the 20 kHz upper limit. That upper limit, for the most part, applies to young people only: modern high-SPL music and noise levels have led to a widespread deterioration in the hearing of the adult population at large, and especially amongst young males.

For example, there is an apocryphal story about Rupert Neve that tells of a console channel that sounded particularly "bad". It was later discovered that it was oscillating at some ultrasonic frequency, like 48 kHz. Rupert Neve is rumored to have seized upon this as "proof" that the ear can hear well beyond 20 kHz. However, there exists an entire range of perfectly plausible mechanisms that require NO ultrasonic acuity to detect such a problem. For example, the existence of ANY non-linearity in the system would result in the production of inter-modulation tones that would fall well within the 20 kHz audio band and certainly would make it sound awful. Even the problem that was causing the oscillation itself could lead to massive artifacts at much lower frequencies that would completely account for the alleged sound of the mixer in the complete absence of a 48 kHz "whistle."

Whether 20 kHz is an adequate bandwidth is a debatable subject. However, several important facts have to be remembered. First, BOTH analog AND digital reproduction systems suffer from roughly the same bandwidth limiting. Second, digital systems using properly implemented oversampling techniques have far less severe phase and frequency response errors within the audible band. No analog storage and reproduction system can match the phase and response linearity of a digital system, both at low and high frequencies.
Once those demonstrable facts are acknowledged, then the discussion about supra-20 kHz aural detectability can continue, knowing that, if it is demonstrated to be significant, both systems are provably deficient. [Dick]

--

Q5.13 - Yeah, well what about square waves? I've seen square wave tests of digital systems that show a lot of ringing. Isn't that bad?

Square waves are a mathematically precisely defined signal. One of the ways to describe a perfect square wave is as the sum of an infinite series of sine waves in a precise phase, harmonic and amplitude relationship. The relation is:

    F(t) = sin(wt) + (1/3) sin(3wt) + (1/5) sin(5wt) + (1/7) sin(7wt) + (1/9) sin(9wt) + ...

Where t is time, w is "radian frequency", or 2 pi times frequency.

Remember - we require an infinite number of terms to describe a perfect square wave. If we limit the number of terms to, say, 10 terms (such as the case with a 1 kHz square wave perfectly band limited to 20 kHz), there simply aren't enough terms to describe a perfect square wave. What will result is a square wave with the highest harmonic imposed on top as "ringing." In fact, this appearance indicates that the phase and frequency response is perfect out to 20 kHz, and the bandwidth limiting is limiting the number of terms in the series.

Well, what would a perfect analog system do with square waves? As it turns out, if you take a high quality 15 IPS tape recorder, bias and adjust it for the flattest possible frequency response over the widest possible bandwidth, the result looks remarkably like that of a good digital system for exactly the same reasons.

On the other hand, adjust the analog tape recorder for a square wave response that has no ringing, but the fastest possible rise time. Now listen to it: it sounds remarkably dull and muffled compared to the input. Why?
Because in order to achieve that square wave response, it's necessary to severely roll off the high-end response in order to suppress the high-frequency components needed to achieve fastest rise time. [Dick]

--

Q5.14 - How can a 16-bit word length be enough to record all the detail in music? Doesn't that mean that the sound below -96 dB gets lost in the noise? Since it is commonly understood that humans can perceive audio that IS below the noise floor, aren't we losing something in digital that we don't lose in analog?

You're correct in saying that human hearing is capable of perceiving audio that is well below the noise floor (we won't say what kind of noise floor just yet). The reason it can do this is through a process the ear and brain employ called averaging.

If we look at a single sample in a digital system or an instantaneous snapshot in an analog system, the resulting value that we measure will consist of some part signal and some part ambiguity. Regardless of the real value of the signal, the presence of noise in the analog system or quantisation in the digital system sets a limit on the accuracy to which we can unambiguously know what the original signal value was.

So on an individual sample or instantaneous snapshot, there is no way that either ear or measurement instrument can detect signals that are buried below either the noise or the quantisation level (when properly dithered).

However, if we look at (or listen to) much more than a single sample, through the process of averaging, both instruments and the ear are capable of detecting real signals below the noise floor. Let's look at the simple case of a constant voltage that is 1/10th the value of the noise floor. At the instantaneous or sample point, the noise value overwhelms the signal completely. But, as we collect more consecutive snapshots or samples, an interesting thing begins to happen. The noise (or dither) is random and its long-term average is, in fact, 0.
But the signal has a definite value, 1/10. Average the signal long enough, and the average value due to the noise approaches 0, but the average value of the signal remains constant at 1/10.

A somewhat analogous process happens with high frequency tones. In this case the averaging effect is that of a narrow-band filter. The spectrum of the noise (or simple dither) is broadband, but the spectrum of the tone is very narrow band. Place a filter centered on the tone, and as we make the filter narrower and narrower, the contribution of the noise gets less and less, but the contribution of the signal remains the same.

Both the ear and measurement instruments are capable of averaging and filtering, and together are capable of pulling real signals from deep down within the noise, as long as the signals have one of two properties: either a period that is long compared to the inherent sampling period of the signal in a digital system (or long compared to the reciprocal of the bandwidth in an analog system), or periodicity that persists for a comparably long time.

Special measurement instruments were developed decades ago that were capable of easily detecting real signals that were 60 dB below the broadband noise floor. And these devices are equally capable of detecting signals under similar conditions in properly dithered digital systems as well.

How much the ear is capable of detecting is dependent upon many conditions, such as the frequency and relative strength of the tone, as well as individual factors such as aging, hearing damage and the like. But the same rules apply to both analog systems with noise and digital systems with de-correlated quantisation noise. [Dick]

--

Q5.15 - What's all this about 20- and 24-bit digital audio? Aren't CDs limited to 16 bits?

Yes, CDs are limited to 16 bits, but we can use >16-bit systems to produce 16-bit CDs with higher quality than we could otherwise.
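The averaging argument of Q5.14, and the reason a reduction in word length need not throw away low-level detail, can be tried numerically. The following Python sketch is only an illustration under stated assumptions: one unit equals one step (LSB) of the shorter word, the "signal" is a constant 0.1 LSB, and the dither is triangular (TPDF) spanning plus or minus one LSB, a common textbook choice that the FAQ itself does not specify. All names are hypothetical.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole step (one step = 1 LSB of the short word)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular dither spanning -1..+1 LSB: difference of two uniform values."""
    return rng.random() - rng.random()

def average_after_quantizing(value, n, dithered, seed=1):
    """Quantize the same sub-LSB value n times and return the average result."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0
    for _ in range(n):
        d = tpdf_dither(rng) if dithered else 0.0
        total += quantize(value + d)
    return total / n

# A constant signal one tenth of a quantization step in size
print(average_after_quantizing(0.1, 200_000, dithered=False))
print(average_after_quantizing(0.1, 200_000, dithered=True))
```

Without dither every sample rounds to 0 and the 0.1 LSB signal is simply gone. With dither the individual samples are noisier, but their long-term average settles close to 0.1, which is the behavior the averaging argument predicts.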
We are able to record audio with effective 20-bit resolution nowadays. The finest A/D converter systems have THD+N values around -118 dB with linearity extending far below even that.

When it comes time to reduce our word-length to 16 bits, we can use any one of a variety of noise shaping curves. Their job is to shape the dither that is mixed with our 24-bit audio, shifting the noise spectrum into areas where our ears are less sensitive and thus preserving audio information in the spectral areas where our ears are most sensitive. See Lipschitz's seminal papers for fuller detail on this subject.

Furthermore, we often perform DSP calculations on our audio, and to that end it is worthwhile to carry out the arithmetic with as much precision as we can in order to avoid rounding errors. Most digital mixers carry their math out to 24-bit precision at the I/O, with significantly longer word lengths internally. As a result, two 16-bit signals mixed together can produce a valid 24-bit output word. For that matter, a 16-bit signal subjected to a level change can produce a 24-bit output if desired (except, of course, for a level change that is a multiple of 6 dB, as that's just a shift left or right).

The number of noise shaping curves available today is staggering. Sony SBM, Weiss, Meridian 618, Sonic TBM, Apogee UV-22, Prism SNS, Lexicon, PONS, Waves, and, of course, the classic Lipschitz curve are just a few of the multitudinous options that now exist. [Gabe]

=====

Section VI - Digital editing and mastering

--

Q6.1 - What is a digital audio workstation?

A digital audio workstation (DAW) is one of our newest audio buzzwords, and applies to nearly any computer system that is meant to handle or process digital audio in some way. For the most part however, the term refers to computer-based nonlinear editing systems. These systems can comprise a $500 board that gets thrown into a PC, or can refer to a $150,000 dedicated digital mastering desk.
[Gabe]

--

Q6.2 - How is digital editing different from analog editing?

In the days of analog editing, one edited with a razor blade and a diagonal splicing block. Making a cut meant scrubbing the tape over the head, marking it with a grease pencil, cutting, and then taping the whole thing back together. Analog editing (particularly on music) was as much art as it was craft, and good music editors were worth their weight in gold.

In many circles, analog editing has gone the way of the Edsel, replaced by digital workstation editing. For complex tasks, DAW-based editing offers remarkable speed, the ability to tweak an edit after you make it, a plethora of crossfade parameters that can be optimized for the edit being made, and most importantly, the ability to undo mistakes with a keystroke.

Nearly all commercial releases are being edited digitally nowadays. Since satisfactory editing systems can be had for around $1,000, even home recordists are catching onto the advantages. More elaborate systems can cost tens of thousands of dollars.

There are certain areas where analog editing still predominates, however. Radio is sometimes cited as an example, though this has begun to change thanks to products like the Orban DSE 7000. The needs of radio production are often quite different from those of music editors, and a number of products (the Orban being a fine example) have sprung up to fill the niche. Nonetheless, in spite of the rapid growth of DAWs in the radio market, razor blades are still found in daily use in radio stations. [Gabe]

--

Q6.3 - What is mastering?

Mastering is a multifaceted term that is often misunderstood. Back in the days of vinyl records, mastering involved the actual cutting of the master that would be used for pressing. This often involved a variety of sonic adjustments so that the mixed tape would ultimately be properly rendered on vinyl.

The age of the CD has changed the meaning of the term quite a bit.
There are now two elements often called mastering. The first is the eminently straightforward process of preparing a master for pressing. As most mixdowns now occur on DAT, this often involves the relatively simple task of generating the PQ subcode necessary for CD replication. PQ subcode is the data stream that contains information such as the number of tracks on a disc, the location of the start points of each track, the clock display information, and the like. This information is created during mastering and prepared as a PQ data burst which the pressing plant uses to make the glass pressing master.

Mastering's more common meaning, however, is the art of making a recording sound "commercial." It is the last chance one has to get the recording sounding the way it ought to. Tasks often done in mastering include:

    adjustment of time between pieces
    quality of fade-in/out
    relation of levels between tracks (such that the listener doesn't have to go swinging the volume control all over the place)
    program EQ to achieve a desired consistency
    compression to make one's disc sound LOUDER than others on the market

The list goes on.

A good mastering engineer can often take a poorly-produced recording and make it suitable for the market. A bad one can make a good recording sound terrible. Some recordings are so well produced, mixed, and edited that all they need is to be given PQ subcode and sent right out. Other recordings are made by people on ego trips, who think they know everything about recording, and who make recordings that are, technically speaking, wretched trash.

Good mastering professionals are acquainted with many styles of music, and know what it is that their clients hope to achieve. They then use their tools either lightly or severely to accomplish all the multiple steps involved in preparing a disc for pressing. [Gabe]

--

Q6.4 - What is normalizing?
Normalizing means bringing a digital audio signal up in level such that the highest peak in the recording is at full scale. As we saw in Q5.4, 0 dB represents the highest level that our digital system can produce. If our highest level is, for instance, -6 dB, then the absolute signal level produced by the player will be 6 dB lower than it could have been. Normalizing just maximizes the output so that the signal appears louder.

Contrary to many frequently held opinions, normalizing does NOT improve the dynamic range of the recording in any way, since as you bring up the signal, you also bring up the noise. The signal-to-noise ratio is a function of the original recording level. If you have a peak at -6 dB, that's 6 dB of dynamic range you didn't use, and when you normalize it to 0 dB, your noise floor will rise an equivalent amount.

Normalizing may help optimize the gain structure on playback, however. Since the resultant signal will be hotter, you'll hear less noise from your playback system.

But the most common reason for normalizing is to make one's recordings sound LOUDER, BRIGHTER, and have more PUNCH, since we all know that louder recordings are better, right? :-) [Gabe]

--

Q6.5 - I have a fully edited DAT that sounds just like I want it to sound on the CD. Is it okay to send it to the factory?

This is a highly case-specific question. Some people truly have the experience to produce DATs on mixdown or editing that are ready to go in every conceivable way. Often these people can send their tapes out for pressing without the added expense of a mastering house.

However, if you do not have this sort of expertise, and if your only reason for wanting to send it out immediately is that you know it is technically possible to press from what you have, you would be advised to let a mastering engineer listen to and work on your material. A good mastering engineer will often turn up problems and debatable issues that you didn't even know were there.
Also, any decent mastering house will provide you with a master on a format significantly more robust than DAT. DAT is a fine reference tape format, but it simply is not the sort of thing you want to be sending to a pressing plant. [Gabe]

--

Q6.6 - What is PCM-1630? What is PMCD?

PCM-1630 is a modulation format designed to be recorded to 3/4" videotape. It was, for many years, the only way one could deliver a digital program and the ancillary PQ information to the factory for pressing. The PCM-1630 format is still widely used for CD production. But PCM-1630 is now certainly an obsolete system, as there are many new formats that are superior to it in every way.

One of the most popular formats for pressing now is PMCD (Pre-Master Compact Disc). This format, developed by Sonic Solutions, allows for CD pressing masters to be written out to CD-R's that can be sent to the factory directly. These CD-R's contain a PQ burst written into the lead-out of the discs.

Some plants have gone a step further and now accept regular CD-R's, Exabyte tapes, or even DAT's for pressing. The danger here is that some users may think that they can prepare their own masters without the slightest understanding of what the technical specifications are. For instance, users preparing their own CD-Rs must do so in one complete pass. It is not permitted, for instance, to stop the CD-R deck between songs, as this creates unreadable frames that will cause the disc to be rejected at the plant. [Gabe]

--

Q6.7 - When preparing a tape for CD, how hot should the levels be?

Ideally you should record a digital master such that your highest level is at 0 dB, if only to maximize the dynamic range of your recordings. Many people like their CDs to be loud, and thus they will normalize anyway even if they don't hit 0 dB during recording.

Some classical recordings are deliberately recorded with peaks that are significantly lower than 0 dB.
This is done in order to prevent quiet instruments such as lutes and harpsichords from being played too loud. If you record a quiet instrument such as a harpsichord out to 0 dB, the listener would have to put the volume control all the way at the bottom in order to get a realistic level, and the inclination would be to play it at a "normal" listening level. By dropping the mastering level, the listener is more likely to set the playback level appropriately.

--

Q6.8 - What is a CD-R? Why do I care about disk-at-once vs track-at-once?

CDs were initially intended as a mass produced play-only medium. Recently, recordable CDs (CD-R) have become affordable. The cost of adding a CD burner to a computer is now down to around $300, so for small runs that may be an advisable route. [Joe] [9/98]

CDs are written or "burned" in the CD-R recorder in one of two modes, disk-at-once (DAO) and track-at-once (TAO). With disk-at-once, the entire CD is written in one stream starting with the table of contents. This is the preferred format for sending a CD-R to a duplicating plant, because you can make a CD-R DAO master disk of which the pressed CD will be a bit-for-bit clone. If you want to record additional tracks at different times, this might not be practical, so track-at-once operation is also offered. This has problems too, because a short burst of digital errors is generated at the place on the disk where the laser turned on or off. Nearly all consumer CD players will play these CD-Rs (once they have been "finished" or the table of contents written) but many CD pressing plants will refuse them because of the errors. [David]

--

Q6.9 - Where can I get CDs manufactured?

A large number of CD manufacturers can be found advertising in the back pages of recording magazines. In addition, searching your favorite web search engine should turn up a good number of options. [Joe]

Be sure to know whether you are dealing with an actual manufacturer or a broker.
One isn't necessarily better than the other (there are small-time manufacturers that might not be as careful about quality as the bigger ones, and brokers might be able to get you a better deal with a bigger plant). But ask questions and be sure you know what happens if the pressed CD doesn't sound right. [David]

--
Q6.9 - How are CD error rates measured, and what do they mean?

[Forthcoming. -Gabe]

=====
Section VII - Market survey. What are my options if I want --

--
Q7.1 - A portable DAT machine or a rack sized DAT machine [9/98]

A fairly comprehensive survey of DAT machines has been conducted by the DAT-Heads mailing list. Consult the homepage at: http://www.eklektix.com/dat-heads. [Joe]

--
Q7.2 - A good but inexpensive compressor [9/98]

Two units that have received favorable notice on r.a.p. recently are the RNC (the Really Nice Compressor), available from FMR Audio at http://www.fmraudio.com, and the Empirical Labs EL-8 Distressor at http://www.wavedistribution.com. [Joe, David]

--
Q7.3 - An inexpensive stereo microphone [9/98]

This is a really difficult question, as it implies a universally accepted single-point stereo technique, and there isn't one. Sony http://www.sel.sony.com/ has a $300 unit; Shure's VP-88 http://www.shure.com/ is about three times as much and a bit better. [David, whose company doesn't make one.]

--
Q7.4 - An inexpensive pair of microphones for stereo [9/98]

While large-diaphragm condenser mics are sometimes used for stereo work, small-diaphragm condensers are much more common due to their superior off-axis response and much lower cost. For that reason, only small-diaphragm condensers are discussed here. Here's a good assortment of options at various price levels. (Pricing is approximate!) We strongly encourage buying matched pairs or sequential serial numbers when available, particularly on the less-expensive models. Warning - most of these microphones require phantom power.

Under $300 per pair.
(Frankly, it's worth spending more than this if you possibly can.)

Audio-Technica Pro-37R. Not the flattest thing on earth, but cheap and does the job. Requires phantom power.
Shure 849. Pretty much the same mic as the Shure Beta Green 4.0, for less money. Battery or phantom.

Under $400 per pair. (These mics are still a compromise, but good for non-serious use.)

Audio-Technica ATM 33a. Resistant to overload and reasonably quiet. Battery or phantom.
Octava MK-012. Multiple capsules! High mic-to-mic variability: listen to the ones you're buying. Phantom.
Shure SM-94. Probably a lower-spec version of the SM-81. Battery or phantom.

Under $600 per pair. (If you are taking other people's money, this is the least you should pay.)

AKG "Blue Line" (391, etc.) Versatile microphone family, many interchangeable capsule options. Less top end than some.
Audio-Technica 4041. A clear-sounding, if somewhat bright mic, available in matched pairs. A fine choice.
Crown CM-700. A nice sounding mic at an excellent price. Rather low output: use it for close micing.
Shure SM-81. Common studio utility condenser. Not overly bright, reasonably quiet, very popular.

Under $1400 per pair. (Sometimes not much under!)

AKG C460. Bright-sounding, and impervious to overload. Tremendous on brass sections. Placement-sensitive when close-micing. Alternative-pattern capsules available.
Microtech-Gefell M300. Much more bottom than the Neumann KM-84 and KM-184.
Neumann KM-184. Slightly brighter-sounding successor to the much-loved KM-84. Good choir mic. Frequent close-micing choice, due to good pattern control.
Sennheiser K4 family. A versatile series with interchangeable pieces. More common in broadcast use.

More than $2000 per pair. (The sky's the limit. Listen and compare carefully before choosing.)

Danish Pro Audio 4000 family and other models. Very accurate omnidirectional mics, requiring special techniques. A few cardioids, too.
Josephson Series Six family. Optimized for transient and phase response. Built to order.
Neumann KM-100 family. Generally considered dark-sounding, can be quite lovely.
Schoeps Colette family. Longtime favorites for location recording. Many capsules and accessories.
Sennheiser MKH family. The quietest small diaphragm condensers you can buy. Low distortion, too.

Recording the Grateful Dead? While you were comatose, Jerry Garcia died. The Dead are no longer touring. Get a life. If you want to record some other band, maybe you don't need microphones. If there isn't a multi-output distribution box behind the console, the band probably doesn't want you making tapes. [David R]

--
Q7.5 - A good microphone for recording vocals [9/98]

This is really an unanswerable question. Go to a local studio with a good stock of mics. Rent a few hours of studio time. Ask them to record your voice using the following mics through a pop filter. Sing about one minute of the same song into each, but don't identify which mic you're using on the track. Have the engineer keep track of each mic and which track it is.

Mics to consider:

Rode NT1
AT 4050
AKG 414
AKG C3000
Shure SM-7
Sennheiser 421
Neumann TLM-103
Any Beyer ribbon mic
Coles 4038
Shure SM-57
Any others they have in the below $1,000 price range.

(If they have a U87, or other expensive mics, get recordings of those too, as a reference.)

Have a CD-R made. Go home, hide the track sheet, and listen to all the tracks for a week. Pick the track you think sounds best, have someone write down the track number, then put the CD-R away for a week. Listen to it again and see if you still like the same track. If you choose a different track, put the CD-R away for a week. After a week, listen to it again and see if you still like the same track. When you pick the same track twice in a row, that's the one for you. Get the track sheet and find out which mic it is. No cheating, and you can have other people choose as well.
[Harvey]

If you're buying the mic for a specific vocalist, the best idea is to audition everything in your price range on the voice in question. A microphone that works poorly for one singer may be perfect for the next. Generally speaking, the more broadly applicable microphones cost more, so you may be able to save some money if it only needs to work for you. Some entry-level products are the AKG C3000, Audio-Technica 4033, Beyer MC834, Equitek E-200, Electrovoice RE-1000, Langevin CR-3A and Rode NT-2. A number of these mics can be had new for under $500, and even less if you find a used one. That's money well spent if they do your job.

If you're taking in outside work, consider the following:

AKG 414B-ULS. The 414 is a studio stalwart, good on voices and instruments alike. You can use one of these on a rock vocal one day, an acoustic guitar the next, and a pair of them for a classical concert the following night. It's fairly present sounding, but still reasonably flat. The combination of four patterns, two bass roll-offs, and two pad options makes it highly adaptable; you can expect to have one or two in the mic locker for your entire career. For a bit more money, there's also the 414TL-II, but this has a tailored presence rise, which makes it impressive on (some) vocals, at the expense of being less broadly applicable for other purposes. Used 414s can often be found at prices in the $650-750 range; the current list price is $995.

Neumann TLM-193. First things first: this mic doesn't sound like a U87. If that's the sound you want, find a sympathetic banker, or a rich uncle. (Or audition the Audio-Technica 4033, which sounds closer than you might expect.) Judged on its own merits, the TLM-193 sounds slightly laid back, but very, very natural, even off-axis. When it's important to get something good down on tape without a lot of fuss, the 193 is a fine tool.
There are no switchable patterns, pads, or bass roll-off to mess with; you just put it in a good spot and roll tape. Plus, you can use it on a half dozen different tracks on the same song without creating terrible problems at mixdown. The optional shock mount is worth having since there's no bass roll-off switch. These are still a bit rare on the used market, and they cost $750-850 when you find one. The list price is about $1300. [I'm unsure of this price. -- DR]

Neumann has released the TLM-103, which is also a single-pattern microphone like the TLM-193. This microphone utilizes a capsule derived from half of a U87 capsule and comes pretty close to sounding like a U87; some people even like it better. These retail for under $1000. [Mark, David]

Audio-Technica 4050. If the two mics above are out of the question, this one is worth a listen. For substantially less money, it's got three patterns, pad and roll-off switches, and a nifty shock mount. An Audio-Technica doesn't have the same rate-card appeal as a Neumann or AKG, but your average struggling garage band is going to think it's pretty darn cool, and they'll be right. The main things to watch out for are off-axis coloration and excessive sibilance on some singers. Place it carefully, and you can get good results. These cost $600-700 new, less used.

--
Q7.6 - A good microphone for recording [insert instrument here] [9/98]

Miking a drum kit? [Someone check this! This is not my area of expertise! -- DR]

If you need to economize when miking drums, the best way is to use fewer microphones. With careful placement, you can often get by with four: snare, kick, and a pair of overheads. The canonical mic for snare drum is the Shure SM-57. They're cheap, which is good, because drummers are always breaking them. For overheads, a pair of condenser microphones is almost compulsory. See the discussion above for ideas. That leaves the kick drum mic.
What you choose for kick drum depends on whether you can afford to dedicate a mic to that use, or whether your choice must serve other purposes as well. If miking drum kits is a small part of your studio routine, it may make sense to buy a more broadly applicable microphone, even if it costs a bit more. One excellent choice would be a Sennheiser MD-421, which can also be used for voice-overs, female singers, miking guitar cabinets, and much more. For a bit more money, an Electrovoice RE-20 is another good choice with similar applications. Both of these microphones require more experimentation with placement than a "single-use" kick mic, but the payback is a broader tonal palette.

If you own a large-diaphragm condenser mic, you might be interested to know that these sometimes work great on kick. Use caution: begin by placing it well outside the drum shell, and move it only as close as you need to in order to get the desired sound. You'll probably need to pad the output.

If a dedicated kick drum mic makes sense, one very common choice is the AKG D-112. These cost $200-250 and work with a minimum of fuss. If that's too much money, the Audio-Technica ATM-25 and Pro-25 are serviceable; both use the same diaphragm, but the Pro-25 has a less-rugged case. [David R]

--
Q7.7 - A small mixer [9/98]

The Mackie 1202 (available in the basic 1202 or 1202vlz variations) is commonly cited by readers of RAP as a great small mixer. For a larger format, the Allen & Heath MixWizard series has been recommended by many people. [Joe]

--
Q7.8 - A portable cassette machine [9/98]

About the only company still seriously in the portable cassette recorder business is Sony. The TCD-D5M seems to still be in production, as is the WM-D6. See http://www.sony.sel.com/ for the latest. [David]

--
Q7.9 - A computer sound card for my IBM PC or Mac [9/98]

This answer will become obsolete very quickly.
For years, sound cards were considered an add-on for games and funny noises, and the audio quality was about what would be expected (despite the phenomenal and unattainable specs claimed by the makers). There is a web site of info on conventional sound cards at http://www.larrysworld.com/soundcard.htm. However, nearly all of these are designed for non-critical applications. If you want to do digital sound editing on your PC, you really want to transfer in and out of the computer digitally, without having to convert back and forth to analog. New entries in this product category are coming all the time, with expansion to cover 24 bit samples and/or 96 kHz sampling rates. The RME cards (made in Germany and sold in the US by Sek'd), the Sonorus cards and the Zefiro ZA-2 are all suitable. [David]

--
Q7.10 - An eight-track digital recorder?

=====
Section VIII - Sound reinforcement

--
Q8.1 - We have a fine church choir, but the congregation can't hear them. How do we mic the choir?

--
Q8.2 - How do I 'ring out' a system? [9/98]

Ringing out a 'system' simply means creating an equal opportunity environment for frequency: an environment in which feedback is no more likely at any one frequency than at any other (i.e. flat). In a 'rung out' system the response of the system should be flat versus the response of the microphone that is primarily driving the system, with consideration given to the various problems that a room can contribute. This is primarily true in monitors, where the mix is often centered on a particular microphone, and less true at front of house, where you don't want to base your mix on the response of an SM58. Front of house will want to be truly 'flat' to be rung out. Any undesirable microphone characteristics (common in many live mics) can be dealt with on the channel EQs.

When 'ringing out' a system, whether front of house or a monitor mix, the best place to start is with your voice.
Using a microphone whose sound you are used to, listen to your voice over the system and, using the equalization available (at least a third-octave equalizer), remove any obvious frequencies that sound out of balance. A good way to do this, especially if you are not used to identifying frequencies, is to 'bump' up each band, get a feel for what it sounds like, and then decide whether or not it is in balance with the rest.

When adjusting the EQ, be careful not to remove too much of any frequency. Abuse of the EQ can cause many more feedback problems than it solves, as well as create problems with the overall balance and gain structure. In general, if the speakers and system are of high quality and the room is not too awkward, you probably won't be pulling any more than 4 or 5 dB out of 6 or so bands. If you notice you're pulling 6 dB or more out of more than 10 bands, it might be best to flatten and start over.

Many times an RTA (real-time analyzer) can come in handy for doing this. Often, after starting off with the technique above, I will use an RTA to check for any problem areas. Sometimes your voice alone won't make you aware of certain acoustical problems of a room.

Also keep in mind that the response of a system, and a room, is going to change with the sound pressure level. When you're listening to the system, turn it up a bit and get closer to the volume that the show will be performed at. Spending time adjusting the system prior to show time can prevent many headaches during the show.

One way I often judge whether the system is 'rung out' is to turn it up, with a live microphone hooked up, a bit past the typical maximum level. When the system just starts to feed back, listen to the frequencies that ring. If it is apparent that there is more than one frequency feeding back at the same time, then you are at a good point. This indicates that there is not a single frequency in the bandwidth which will feed back before any others; this is the goal of ringing out the system.
The more full-bandwidth the ring, the less likely that a particular frequency will feed back later on.

Getting used to doing this is one of the keys to being good at it. Try to keep in mind what the various types of speakers sound like as you encounter them, and in what areas of the bandwidth they are problematic. This will help you when time is tight. [Brian]

--
Q8.3 - How much power do I need for [insert venue here]?

--
Q8.4 - How good is the Sabine feedback eliminator?

=====
Section IX - Sound restoration

--
Q9.1 - How can I play old 78s?

First rule of thumb: DO NOT PLAY THEM WITH AN LP STYLUS! The grooves on 78s are gigantic compared to the microgroove LPs. As a result, specialized styli are needed for proper playback of 78s. Also, the RIAA equalization curve normally used for LPs has no relation to the frequencies that were equalized on 78 recordings.

The easiest stylus to obtain is the Shure V15 with the 78 pickup. This is a good, though not great, 78 stylus and will do an okay job at playing most discs. The serious 78 collector will want to obtain not only a collection of 78 styli for the various discs in their collection (groove sizes varied, and most serious collectors own a handful of styli), but also a pre-amp with variable equalization curves.

One supplier of all this apparatus is Audio-78 Archival Supplies at (415) 457-7878. [Gabe]

--
Q9.2 - How can I play Edison cylinders?

Edison cylinders are best played back with Audio 78's adapter. You remove the horn and reproducer element from your cylinder machine, install their electric reproducer, and connect it up to a phono pre-amp, sans RIAA. [Gabe]

--
Q9.3 - What are "Hill and Dale" recordings, and how do I play them back?

Hill & Dale recordings are discs where the grooves move vertically instead of horizontally. Edison, for instance, cut his discs this way. In order to play Edison discs, one needs a special glass ball stylus that is 3.7-4.0 mil wide. This is available from Audio 78, among other places.
Also, since the information is vertical instead of horizontal, one must rewire a stereo phono cartridge to reject the normal horizontal information and reproduce only the normally discarded vertical information. This is easily accomplished by wiring a stereo cartridge in mono, summing the channels, with one channel out of phase. In other words, connect the cartridge as follows:

             ______________
            |               ____  to pre-amp
        + L -       + R -
        |   |       |   |
        |   |_______|   |
        |_______________|

That is, the + lug of one channel is joined to the - lug of the other, and vice versa, and the two leads to the pre-amp are taken from those two junctions.

One caveat: if your cartridge has one lug shorted to ground, make sure that this lug is connected to the ground on your preamplifier. It doesn't actually matter which channel you invert. Some pre-amps like the FM Acoustics 222 or the OWL 1 have a switch that will do this for you without rewiring. [Gabe]

--
Q9.4 - What exactly are NoNOISE and CEDAR? How are they used?

NoNOISE and CEDAR are systems for noise removal. Both of them approach the same sorts of noise, but use different algorithms and have different user interfaces, often with differing effectiveness. Noise can be broken down into several categories:

IMPULSIVE NOISE: Pops, clicks, thumps, snaps.
CRACKLE: The low-level "bacon frying" effect heard on LP's and 78's.
HISS: Tape hiss, surface noise, amplifier hiss, broadband noise.
BUZZ: 60 Hz hum, any other steady-state noise that is relatively narrow-band.

NoNOISE and CEDAR are two (expensive) techniques for removing many of these ailments. It is rarely possible to remove all of the problem, and it is never possible to remove it with no degradation of the program material. [Gabe]

--
Q9.5 - How do noise suppression systems like NoNOISE and CEDAR work?

Digital techniques have been applied to many facets of sound processing and recording and have, on the whole, been found to give results far superior to their analogue counterparts.
Nowhere is this more true than in the field of audio restoration, where excellent processes have been developed for removal of impulsive noise (thumps, clicks and ticks) and attenuation of continuous broadband noise (such as tape hiss). Example techniques for these two are outlined below.

IMPULSIVE NOISE

In this category we include many types of disturbance, from the click generated by a scratch on a 78rpm disc, to the tiny tick created by a single corrupt bit in a digital data stream. Also included is crackly surface noise from 78's (that sounds like a frying pan), though this requires somewhat different treatment; however the outline presented below is fairly similar for both processes. Typically, audible clicks are of a few microseconds to a few milliseconds in duration, and their density can be up to a few thousand clicks per second on poor-quality material.

First the audio is split into short blocks of maybe 10ms duration. A model is fitted to each block; this model can be thought of as a description of the signal in simple mathematical terms. The model is chosen such that musical data is a good fit, but the impulsive noise is a poor fit. For example, a simple model could be a sum of sinewaves, whose number, frequencies, amplitudes and phases are the model parameters. The parameters are calculated such that when these sinewaves are added together they match the musical parts of the signal accurately, but match the impulsive noise badly.

Now the model can be thought of as a prediction of the music. In undamaged sections the prediction is close (since music is known to consist of a sum of sinewaves, at least approximately); during clicks and pops etc. the prediction is poor, because the model has been designed to match the music, and not the noise. Now we can achieve impulsive noise removal by replacing the data that fits the model badly (i.e. the clicks) with data predicted by the model, which is known to be a close approximation to the music.
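
The fit-detect-replace idea described above can be illustrated with a toy sketch. This is not the NoNOISE or CEDAR algorithm, whose details are not given here. It assumes the sinewave frequencies are already known (a real system would have to estimate them), it fits only amplitudes and phases by least squares, and it uses a simple "residual much larger than the median residual" rule to decide what counts as a click. The names fit_sines, solve and declick, the threshold k and the floor value are all made up for this illustration.

```python
import math

def solve(A, b):
    """Solve the small linear system A c = b by Gaussian elimination with pivoting."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, m):
            k = M[r][col] / M[col][col]
            for j in range(col, m + 1):
                M[r][j] -= k * M[col][j]
    c = [0.0] * m
    for i in reversed(range(m)):
        c[i] = (M[i][m] - sum(M[i][j] * c[j] for j in range(i + 1, m))) / M[i][i]
    return c

def fit_sines(x, freqs, fs, keep):
    """Least-squares fit of a*cos + b*sin per frequency, using only samples with keep[n] True.
    Returns the model evaluated at every sample of the block."""
    basis = []
    for f in freqs:
        w = 2 * math.pi * f / fs
        basis.append([math.cos(w * n) for n in range(len(x))])
        basis.append([math.sin(w * n) for n in range(len(x))])
    m = len(basis)
    idx = [n for n in range(len(x)) if keep[n]]
    # Normal equations built from the kept samples only.
    A = [[sum(basis[i][n] * basis[j][n] for n in idx) for j in range(m)] for i in range(m)]
    b = [sum(basis[i][n] * x[n] for n in idx) for i in range(m)]
    c = solve(A, b)
    return [sum(c[i] * basis[i][n] for i in range(m)) for n in range(len(x))]

def declick(x, freqs, fs, k=8.0, floor=1e-6):
    """Flag samples that fit the model badly and replace them with the model's prediction."""
    keep = [True] * len(x)
    model = fit_sines(x, freqs, fs, keep)               # first pass: fit everything
    resid = [abs(a - m) for a, m in zip(x, model)]
    med = sorted(resid)[len(resid) // 2]
    thresh = max(k * med, floor)                         # assumed outlier rule
    keep = [r <= thresh for r in resid]
    model = fit_sines(x, freqs, fs, keep)               # second pass: ignore the clicks
    repaired = [x[n] if keep[n] else model[n] for n in range(len(x))]
    clicks = [n for n in range(len(x)) if not keep[n]]
    return repaired, clicks

# One 25 ms block at 8 kHz: two sinewaves plus a single-sample click at n = 50.
fs = 8000
clean = [math.sin(2 * math.pi * 440 * n / fs) + 0.5 * math.cos(2 * math.pi * 880 * n / fs)
         for n in range(200)]
damaged = clean[:]
damaged[50] += 0.9
repaired, clicks = declick(damaged, [440.0, 880.0], fs)
print(clicks, abs(repaired[50] - clean[50]))
```

Real program material is never an exact sum of two known sinewaves, so a practical system needs a far richer model and a more careful detector; the sketch only shows why a model that fits music well and clicks badly makes both detection and repair possible.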
BROADBAND NOISE

Broadband noise is usually better tackled in the frequency domain. What this entails is taking a block of data (as in the impulsive noise case) but then calculating its spectrum. From the spectrum an estimate can be made of which frequencies contain mostly signal, and which contain mostly noise.

To help in making this discrimination we first take a "fingerprint" of the noise from an unrecorded section, such as the lead-in groove of a record, or a silence between movements of a symphony. The spectrum of this fingerprint is then compared with the spectrum of each block of musical data in order to decide what is noise and what is music.

The de-noising process itself can be thought of as an automatically controlled, cut-only graphic equalizer. For each block, the algorithm adjusts the attenuation of each frequency band so as to let the music through, but not the noise. If the SNR in a particular band is high (i.e. lots of signal, little noise) then the gain is left close to unity. If the SNR is poor in a given band, then that band is heavily attenuated. [Chris]

--
Q9.6 - What is forensic audio?

Forensic audio is audio services for legal applications. Forensics breaks down into four main categories.

TAPE ENHANCEMENT: Digital and analog processing to restore verbal clarity and make tapes easier to understand in a courtroom situation.

AUTHENTICITY: Electronic and physical microscopic examination of a tape to prove that it has not been tampered with, altered, or otherwise changed from its original state. Another common authenticity challenge is to determine whether a given tape was indeed made on a given machine.

VOICE IDENTIFICATION: Voice ID, or voiceprinting, is the science that attempts to determine what was said, and by whom. A variety of analog and digital analysis processes are used to analyze the frequency and amplitude characteristics of a human voice and compare it against known samples.
[More to come on this] [Gabe]

=====
Section X - Recording technique, Speakers, Acoustics, Sound

--
Q10.1 - What are the various stereo microphone techniques? [9/98]

There are basically three classes of stereo microphone techniques:

1. Coincident pair (AKA XY)
2. Spaced pair (AKA AB)
3. Near-coincident pair

The technique known as "binaural" is sometimes treated as a fourth category; in practice, it is closely related to the concept behind the near-coincident pair. The basic ideas behind these techniques are as follows:

COINCIDENT PAIR

Relies on two identical directional microphones (e.g. cardioid), mounted as close as possible to each other, angled apart. The stereo image is produced by the intensity difference of the signals arriving at the two microphones. The stereo spread is wider for larger angles and for narrower polar patterns.

SPACED PAIR

Relies on two identical microphones, spaced by several feet and pointing towards the sound source. The most popular method uses omnis (spaced omnis), but, in principle, any polar pattern can be used. The stereo image is produced by the time difference of the signals arriving at the two microphones. The stereo spread is wider for larger spacing.

NEAR-COINCIDENT PAIR

Relies on two identical directional microphones angled apart, with their capsules usually spaced by a few inches. The stereo image is produced by both the intensity and the time difference of the signals arriving at the two microphones. Larger angles and greater spacing enhance the stereo spread.

BINAURAL (AND RELATED TECHNIQUES)

The baffled omnis / dummy head technique is based on the idea that our ears behave as omnidirectional pickups, but their polar pattern is significantly altered by the head, which acts as a mechanical equalizer (it filters mid and high frequencies in a rather complicated way). So, when two omni microphones are used with a baffle, or even a model head, between them, the whole system tends to behave as our hearing system does.
This actually happens, and these techniques give extremely good results on headphones, if they're used correctly. The result is less satisfactory on loudspeakers, and sometimes a circuit called a "Crosstalk Canceller", which combines delay and equalization of the L and R signals, is used to give proper imaging on loudspeakers as well.

Two more important points:

It is well known that, in general, the spaced pair technique tends to exaggerate the stereo spread and produce the effect known as "hole in the middle": the larger the separation between the microphones, the more evident the hole. In order to minimize this effect, a third mic is often placed between the two to make the central image more solid. An array of three spaced omnis (called a "Decca tree") is possibly the most famous arrangement of its kind, and has been extensively used by Decca for classical music recordings (albeit not exclusively).

The mid-side (MS) technique is actually a special case of the coincident pair technique. It uses two microphones: one with a figure-of-eight polar pattern, facing sideways (S signal), and another microphone of any polar pattern, pointing at the center of the sound source (M signal). The two signals can be easily matrixed into L and R signals. More details about this technique in a special section. [Marco]

--
Q10.2 - How do I know which technique to use in a given circumstance? [9/98]

This largely depends on at least two factors: the kind of sound you're after and the acoustic behavior of the room or hall you're recording in. I personally believe that recording an orchestra or, anyway, an extended set of acoustic instruments is still one of the most challenging, rewarding and at times bewildering experiences that an engineer may have. It may be very frustrating trying to even come close to what is, by your standards and judgement, the ideal sound, not least because there are no strict rules of thumb. Anyway, keep in mind a few basic guidelines - and then experiment!
1. Absolutely basic hint: move them more, equalize them less. That is, if the tonal balance is wrong, try moving the microphones or changing technique before resorting to EQ. A flat recording with a proper choice of microphones and positioning will invariably sound better than a "wrong" recording, tentatively cured by means of equalization.

2. In most cases, you will get more satisfactory results by using a well-thought-out and well-positioned stereo pair with a minimum number of spot microphones, rather than putting up as many as 20 of the latter in a formidable attempt to mix them so as to recreate the original sound. The modern digital formats are very unforgiving when it comes to phase distortions and weird colorations of the sound arising from too many microphones interacting. A band director I worked with was very puzzled when I told him I had the idea of recording their album with no more than eight mics (I actually used six), because their previous work had involved no fewer than twenty-six spot microphones. When I listened to it, it sounded awful to me, the snare being much more prominent than the whole trumpet section... so, if it's not necessary, don't do it!

3. Omnis tend to be significantly better behaved than directional microphones, as far as frequency response goes. In that sense, they are usually more suitable as ambience microphones, unless you have to work in an environment which is acoustically bad (e.g. resonances, etc.). One problem that may arise with omnis is that they tend to pick up a lot of ambience from the back. This may make it necessary to move them closer to the source in order to obtain a proper balance between direct sound and reverberant field, which is not always ideal when one is recording an ensemble. Directional microphones are less sensitive to the ambience, and can therefore be placed farther from the ensemble while retaining a correct balance.

4. Some people are very fond of the spaced omnis technique. The main problem is that the sound may become phase shifted, especially at high frequencies, and this technique is not mono-compatible. Also, it is sometimes difficult to fit the almost universally used third center microphone in the picture (see Q10.1). The really good side of this approach is that you get a very broad and yet clear stereo image, and, because omnis are used, possibly the most faithful tonal balance. A spaced pair technique worth mentioning is the Faulkner set-up, involving two figure-of-eight microphones spaced 20 cm apart and pointing towards the sound source: one might expect to have a narrow stereo image, but that's not the case, and the sonic image is quite focused.

5. The MS technique is quite handy when you are dealing with ensembles which require a relatively close microphone placement (e.g. choirs), and in a good reverberant ambience. Also, I personally like to think more in terms of stereo width than L/R position, which is necessarily the case with the MS technique.

6. With a small ensemble, in a dryish room, the Blumlein technique (two coincident figure-of-eight microphones angled 90° apart) may yield very good results, mostly in terms of stereo imaging.

7. It is my very personal opinion that an average method which will rarely produce awful results is the ORTF (semi-coincident cardioids spaced by 17 cm and angled 110° apart). A few recordings I've done with this technique sound superb to me, others sound decent, very few merely acceptable. None sounds totally wrong, though. I'd personally recommend this technique for a quick recording in a place whose acoustic character is unknown and as a good starting point, in general.

8. Last but not least, if you're dealing with a well-balanced and not-too-big ensemble (or any small group with instruments whose loudness is on the soft side), I recommend you try the Jecklin disc (or OSS technique).
Two omnis separated by a circular baffle of acoustic foam 28 cm in diameter: a first step towards binaural recording which can lead to astonishing results. Be careful, though: positioning is extremely crucial, and can be an excruciatingly gory affair, at times... Also, this is "not so recommended" in strongly reverberant halls, as the omnis tend to pick up more ambience than necessary.

One last small detail: some omnis are more suitable for work in the near field (that is, close to the sound source), whereas others are happier to operate in the so-called diffuse field (relatively far from the sound source). For stereo recording of large ensembles, the latter are recommended. Some manufacturers (e.g. Bruel & Kjaer) can provide caps which act as mechanical equalizers, so that one may adapt the microphones to diverse situations. [Marco]

--
Q10.3 - How do I soundproof a room?

Despite what you may have seen in the movies or elsewhere, egg crates on the wall don't work!

First, understand what "soundproofing" means. Here we mean the methods used to prevent sound from the outside getting in, or sound from the inside getting out. The acoustics within the room are another matter altogether.

There are three very important requirements for soundproofing: mass, absorption, and isolation. Actually, there are also three others: mass, absorption, and isolation. And to finish the job, you should also use: mass, absorption, and isolation.

Sound is mechanical vibration propagating through a material. The level of the sound is directly related to the size of those vibrations. The more massive an object is, the harder it is to move and the smaller the amplitude of the vibration set up in it under the influence of an external sound. That's why well isolated rooms are very massive rooms. A solid concrete wall will transmit much less sound than a standard wood-framed, gypsum board wall.
A thicker concrete wall transmits less than a thinner one: not so much because of the distance, but mostly because it's heavier.

Secondly, sound won't be transmitted between two objects unless they're mechanically coupled. Air is not the best coupling mechanism, but solid objects usually are. That's why well-isolated rooms are often set on springs and rubber isolators. It's also why you may see rooms within rooms: the inner room is isolated from the outer, and there may be a layer of absorptive material in the space between the two. That's also why you'll see two sets of doors into a recording studio: so the sound does not couple directly through the door (and those doors are also very heavy!).

If you are trying to isolate the sound in one room from an adjoining room, one way is to build a second wall, not attached to the first. This can go a long way towards increasing the mechanical isolation. Try using two sheets of drywall instead of one on each wall, and use 5/8" drywall instead of 3/8"; it's heavier.

But remember: make it heavy, and isolate it. Absorptive materials like foam wedges or Sonex and such can only control the acoustics in the room: they will do nothing to prevent sound from getting in or out to begin with. [Dick]

There is a very good reference on studio and control room design, with lots of good information on soundproofing and materials, on the web at the following URL:

The Recording Studio Design Page
http://www.mcs.net/~malcolm/

--

Q10.4 - What is a near-field monitor?

A near-field monitor is one that is designed to be listened to in the near field. Simple, eh? The "near field" of a loudspeaker is the area where the direct, unreflected sound from the speaker dominates significantly over the indirect and reflected sound: sound bouncing off walls, floors, ceilings, the console. Monitoring in the near field can be useful because the influence of the room on the sound is minimized.
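To put rough numbers on "dominates", here is a minimal sketch under idealized assumptions: the speaker behaves like a small source whose direct sound falls by about 6 dB per doubling of distance, while the reverberant level is the same everywhere in the room. The distance at which the two are equal is commonly called the critical distance; the 1.5 m figure used here is a hypothetical value, not a measurement of any real room.

```python
import math


def direct_to_reverberant_db(distance_m, critical_distance_m):
    """Direct-to-reverberant ratio in dB for an idealized small source.

    Assumes the direct sound follows the inverse-square law and the
    reverberant level is constant throughout the room.
    """
    return 20.0 * math.log10(critical_distance_m / distance_m)


CRITICAL_DISTANCE_M = 1.5  # hypothetical value for a small control room

for d in (0.5, 1.0, 1.5, 3.0):
    ratio = direct_to_reverberant_db(d, CRITICAL_DISTANCE_M)
    print(f"{d:.1f} m from the speaker: direct sound {ratio:+.1f} dB re. reverberant")
```

Positive values mean the listener is hearing mostly the speaker; negative values mean mostly the room.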
Near-field monitors have to be physically rather small, because you essentially need a relatively small sound source to listen to (imagine sitting two feet away from an 18" woofer and a large multicellular horn!). The physics of loudspeakers puts severe constraints on the efficiency, power capabilities and low-frequency response of small boxes, so these small near-field monitors can be inefficient, may lack the lowest octave of bass, and will not play ungodly loud. [Dick]

--

Q10.5 - What are the differences between "studio monitors" and home loudspeakers?

It depends upon whom you ask. There are speakers called "monitor" speakers that are found almost exclusively in homes and never in studios.

The purpose of a monitor speaker is to monitor the recording and editing process. If you buy the concept that they are but one tool in the process (and probably the most frequently used single tool at that), and if you buy the concept that your tools should be flawless, then the requirements for a monitor speaker are easy to state (but hard to achieve): they should be as neutral, revealing and unbiased as possible. They are the final link between your work and your ears, and if they hide something, you'll never hear it. If they color something, you might be tempted to uncolor it incorrectly the other way.

There is another camp that suggests that monitor speakers should represent the lowest common denominator in the target audience. The editing and mix process should be done so that the results sound good over the worst car speaker or boom box around. While such an idea has validity as a means of verifying that the mix will sound good over such speakers, using them exclusively for the process invites (as has been thoroughly demonstrated in many examples of absolutely terrible-sounding albums) the possibility of making gross mistakes that simply can't be heard in the mixing process. [Dick]

--

Q10.6 - My near field monitors are affecting the colors on my video monitor.
What can I do to shield the speakers?

Despite a lot of folklore and some very impressive-sounding wisdom here on the net and in showrooms, there is effectively nothing that you can do to the speakers or the monitor, short of moving them away from one another, that will solve this problem.

The problem comes from the magnetic field created by and surrounding the magnets in the loudspeaker. It's possible to design a magnet that has very little external field, but it can be an expensive proposition for a manufacturer.

If the magnets do have large external fields, the only technique that works is solving the problem at the source: the magnet. Special canceling magnets are used, sometimes in conjunction with a "cup" shield directly around the magnet.

You'll hear suggestions from people about placing a sheet of iron or steel between the speakers and the monitor. That might change the field, but it will not eliminate it. As often as not, it will make it worse.

You'll also hear from people about shielding the speaker by lining the enclosure with lead or copper. This method is absolutely guaranteed to fail: lead, copper, aluminum, tin, zinc and other such materials have NO useful magnetic properties at all. They will simply make the speaker heavier and won't solve the problem.

There is but one material that has a shot at working: something called mu-metal, a heavy, very expensive material designed for magnetic shielding that requires extremely sophisticated and difficult fabrication and annealing techniques. Its cost is far greater than buying a new set of speakers that does not have the problem, and it may not even work if the majority of the offending field is radiated from the front of the speaker, which you obviously can't shield.

Try moving the speakers relative to your monitor. Often, moving them an inch or two is enough to cure the problem or at least make it acceptable.
Sometimes, placing the speakers on their sides with the woofers (the major offenders in most cases) farthest away from the monitor works. [Dick]

-----

Section XI - Industry information

Q11.1 - Is there a directory of industry resources?
Q11.2 - What are the industry periodicals?
Q11.3 - What are the industry trade organizations?
Q11.4 - Are there any conventions or trade shows that deal specifically with professional audio?

-----

Section XII - Miscellaneous

Q12.1 - How do I modify Radio Shack PZMs? [Chris?]
Q12.2 - Can I produce good demos at home?

Q12.2 - Can I produce good demos at home? [9/98]

Sure you can. The problem is focusing on what one wants to do, depending on the kind of music to be recorded and the results one aims to obtain. There is a whole world of difference between, say, demoing your latest dance song and recording a tune based on acoustic guitar and vocals. So, in order to avoid an inordinate growth of this section, I will try to give a few hints that might come in handy if the purpose is that of making an acceptable recording.

It is hard to state any universal rules, but I guess that there are some general lines that at least make some sense, no matter whether you're MIDI-based or want to record your own band in the basement. The following ideas are not, of course, carved in stone, and someone could argue about some of them; I guess they at least suggest some basic guidelines, though.

Remember that a good demo's purpose is that of giving a fair idea of what a band sounds like and how a song is written, arranged and performed. It is not that of sounding like a top-flight production.

If you have to record drums, remember that a few decent microphones give you a better sound than 10 crappy ones. Try as follows: two overhead mics above the cymbals (condensers, if possible; otherwise a couple of SM57s will do), one mic on the kick, and one on the snare.
Try to get a reasonable balance, and record everything to 2 tracks on your multi-track, with overheads panned hard left and right and kick and snare panned dead center. Try adding a dash of suitable reverb to the latter, if you wish, and a bit of compression as well, if you have the chance. Even one overhead might work, if you're short of tracks. You'll end up with mono drums, but I'd argue that's less relevant than one might think, at demo level.

On the subject of drum miking, but also in general: remember that EQ can be used to cut as well as to boost! Cutting the LF a bit on the overheads (and, in general, on signals not having predominant bass frequencies) may help a lot in terms of image sharpening.

EQ-wise, try to give each part a definite area of the frequency spectrum to live in, with a pinch of salt. An electric guitar with a serious bottom end may sound marvelous on its own, but it might just as well become a thorn in your side when the bass is added. You may discover that cutting the LF a bit on the guitar helps more than boosting it on the bass.

Try not to over-use reverb. If you listen carefully to almost any pro recording, you will be amazed at how subtly and sparingly reverbs are used. The same goes for delays and other effects, unless you're after a deliberate abuse of 'em, of course.

If you have a limited number of tracks and need more, you can easily bounce several tracks together to free space. On digital machines such as the ADAT this doesn't affect the quality of your recording significantly. But if you have an analogue multi-track, after two bounces you're likely to be in trouble because of layered tape noise. On cheap analogue multi-tracks the HF signals tend to blur as well. To avoid this, you may want to mix the tracks you have onto some digital medium (DAT, Minidisc, or what-have-you), and then record them back onto a fresh section of the tape. The bonus is that you don't actually have to destroy your previous tracks.
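To see why analogue bounces pile up noise, here is a back-of-the-envelope sketch. It assumes every tape generation adds uncorrelated noise of the same level and that the bounce is made at unity gain, so that noise powers simply add; the 60 dB single-pass signal-to-noise figure is an assumed value, not the specification of any particular machine.

```python
import math


def snr_after_generations(single_pass_snr_db, generations):
    """Signal-to-noise ratio after a number of tape generations.

    Simplified model: each generation contributes equal, uncorrelated
    noise, so total noise power grows in proportion to the generation count.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return single_pass_snr_db - 10.0 * math.log10(generations)


SINGLE_PASS_SNR_DB = 60.0  # assumed figure for a modest analogue multi-track

for bounces in range(4):
    generations = bounces + 1  # the original recording counts as generation 1
    snr = snr_after_generations(SINGLE_PASS_SNR_DB, generations)
    print(f"{bounces} bounce(s): roughly {snr:.1f} dB signal-to-noise")
```

In this model the first bounce costs about 3 dB and later ones cost progressively less, but the noise that has been bounced can no longer be separated from the music.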
About compression: in my opinion, if you have a compressor, you'd better use it on the vocals and on other instruments when needed, rather than compress the whole mix, unless you know exactly how to do it. If you have an analogue multi-track you can try recording every signal a bit hotter than suggested in your manual. You will probably get a nice, warm and SLIGHT compression, which usually makes the whole thing sound tighter.

As for mixing:

A: don't mix on headphones
B: keep your loudspeakers at reasonable levels
C: turn them definitely down every once in a while and check if you can still hear everything
D: if you suspect some part is too loud, try listening to the mix from an adjacent room: if the part in question sounds louder than the rest, it is a good idea to turn its level down a bit
E: if you mix onto a digital medium, stay well away from clipping (0 dBFS), and mind those transients in particular
F: try, try, try, try again, then try...

If you record for purely listening purposes, once you have a mix you're satisfied with, don't worry about it anymore. If you want to send your stuff around, though, and you are not completely satisfied with the quality of your demo, you may want to invest in the services of a mastering house. A top-class facility will probably be expensive, but there are places with good gear and trained engineers who won't leave you broke. You shouldn't expect miracles, but some little wonders can be achieved at the mastering stage.

Most of all: try to have what everything in music should be about - Serious Fun. "When you try to drop in / on the head of a pin / that's Recording". Remember! [Marco]

--

Q12.3 - How do I remove vocals from a song?

You probably want a device called the Thompson Vocal Eliminator, made by LT Sound in Atlanta, Georgia. The device will cancel out any vocal that is mono and panned dead center.
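The general idea of center cancellation can be sketched in a few lines. This is only an illustration under stated assumptions, not the circuit or algorithm of any commercial unit: the function names, the 120 Hz crossover and the crude first-order filter are all hypothetical choices. Anything identical in both channels vanishes when one channel is subtracted from the other, and the bass is kept by low-pass filtering it separately and adding it back.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Crude first-order low-pass filter, used here only for illustration."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def remove_center(left, right, cutoff_hz=120.0, sample_rate=44100):
    """Cancel material common to both channels above cutoff_hz; keep the bass."""
    low_l = one_pole_lowpass(left, cutoff_hz, sample_rate)
    low_r = one_pole_lowpass(right, cutoff_hz, sample_rate)
    result = []
    for l, r, ll, lr in zip(left, right, low_l, low_r):
        high_difference = (l - ll) - (r - lr)  # center content cancels here
        mono_bass = 0.5 * (ll + lr)            # filtered bass mixed back in
        result.append(high_difference + mono_bass)
    return result


# Tiny demonstration: a 1 kHz "vocal" in both channels, a 3 kHz "guitar" on the left only.
SAMPLE_RATE = 44100
n = 4410
vocal = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(n)]
guitar = [0.5 * math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE) for i in range(n)]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)
mono_out = remove_center(left, right, sample_rate=SAMPLE_RATE)
print("peak of processed signal:", round(max(abs(x) for x in mono_out), 3))
```

Note that the output is mono, and that anything else panned to the center (kick, snare, bass guitar above the crossover) disappears along with the vocal.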
The unit works by filtering out the low frequencies, phase-canceling the rest of the signal, and then mixing the filtered bass back in. The result is a signal with the center, common information cancelled out. Sometimes it works well, other times it sounds awful. [Gabe]

-----

Section XIII - Bibliography

Q13.1 - Fundamentals of Audio Technology
Q13.2 - Studio recording techniques
Q13.3 - Live recording techniques

--

Q13.4 - Digital audio theory and practice

Ken C. Pohlmann, "Principles of Digital Audio," SAMS/Prentice Hall, 1993 [Excellent introduction and explanation of all aspects of digital audio principles and practices]

John Watkinson, "The Art of Digital Audio," Focal Press, 1989 [ditto]

Francis Rumsey and John Watkinson, "The Digital Interface Handbook," Focal Press, 1993 [deals with interfacing standards and protocols for both audio and video]

IEC Standard Publication 958, "Digital Audio Interface," International Electrotechnical Commission, 1989, 1993 [THE standard!]

Claude E. Shannon and Warren Weaver, "The Mathematical Theory of Communication," U. Chicago Press, 1963. [The seminal work on digital sampling...includes the original 1948 BSTJ sampling paper]

--

Q13.5 - Acoustics

Leo Beranek, "Acoustics," New York, American Institute of Physics, 1986 [The bible, but heavily mathematical, very thick and obtuse, a good book despite the Lincoln Hall disaster.]

F. Alton Everest, "The Master Handbook of Acoustics," Tab Books, 1989 [A good entry-level text on acoustics, studio and listening room design, not heavily mathematical.]

Arthur H. Benade, "Fundamentals of Musical Acoustics," New York, Dover Publications, 1990 [A thorough book, more on the acoustics of PRODUCING music rather than REPRODUCING it.]

Any good physics textbook is useful for keeping the bunk at bay.

Q13.6 - Practical recording guides

-----

Section XIV - Miscellaneous

Q14.1 - Who wrote the FAQ?
[9/98]

GABE WIENER - IN MEMORIAM 1997/04/10

Since his days as a student at Columbia University, Gabe did wonderful work as a recording engineer, producer, restorer, and entrepreneur. The company he founded, Quintessential Sound, is successful and highly regarded; PGM Records, his label devoted to under-recorded Baroque music, has received many excellent reviews.

I knew Gabe well from the Internet, and from telephone conversations. It is a sign of our times that we could be comfortable and friendly with one another even though we stood face to face only on three occasions.

The last of those was on the day that the first PGM release was recorded. I consider it a great privilege to have introduced Gabe to Gavin Black, the harpsichordist who performed the music on PGM 101; Gabe and Gavin worked closely together since then.

That day I watched Gabe work. He had rented the best space he could find--the American Academy of Arts and Letters, on 155th St. in Washington Heights, NY--and had brought in his Nagra-D tape recorder, microphones, a laptop connected to the recorder, and a few other items. With him were an assistant, Gavin, and a harpsichord technician. Having heard Gavin play on many occasions (we were college roommates almost 20 years ago), I was familiar with his playing, and was thus able to concentrate on Gabe's work.

Gabe was amazing. His focus was beyond compare, his musical acumen completely blended with his engineering work. Most of all, his natural ability to work with the people around him carried the project to completion. The CD was recorded in a single day.

In mourning him, I can take comfort only in knowing that Gabe Wiener was given several years of that which most of us dream of having for a day: doing exactly what he wanted to do, excelling at it, gaining the respect of his peers, and succeeding in worldly terms.
However crushing his early death seems, the memory of this time, all too short, when Gabe's enthusiasm, drive, skill, and sheer pleasure led to fine music-making and superb recording for all of us to hear, is something to live by.

Roger Lustig (julierog@ix.netcom.com)

Q14.2 - How do you spell and pronounce the FAQ founder's surname?

Those who paid attention during fourth grade English learned the rule "I before E except after C." There is no C in my name, and therefore the I comes before the E. My last name is spelled "Wiener," not "Weiner." And it rhymes with cleaner, and starts with a W, not a V. [Gabe]

=====

Gabe Wiener
Dir., PGM Early Music Recordings
A Div. of Quintessential Sound, Inc., New York
Mastering-Restoration (212) 586-4200
http://www.pgm.com

"I am terrified at the thought that so much hideous and bad music may be put on records forever."--Sir Arthur Sullivan