Audio Basics: Digitising Audio


There are many ways to digitise audio, that is, to convert analogue audio to digital data. The miniaturisation of electret condenser microphones, combined with the low cost of the electronics required for the process, has resulted in a plethora of devices that can record audio, e.g. notebooks, mobile phones, mp3 players, cameras and web cams.

Computers/Notebooks/Netbooks


cc licensed flickr photo shared by clearf
For educators the most commonly used tool will be their computer or notebook/netbook, which uses a sound card to perform the conversion. Many computers, and especially notebooks, have this fully integrated, so there is no separate sound card. Either way, sound cards conventionally have 3 or 4 sockets.

Microphone In

Typically a 3.5 mm socket, this is used to connect an analogue microphone jack. Most microphones designed for computers are electret condenser microphones, which require 3-9 V of power to operate; this is provided by the computer/notebook via the socket. Apple Mac computers tend not to provide this power and will require a powered microphone.

Audio Out

This is a 3.5 mm socket used to plug in either a set of headphones or powered computer speakers.

Line In

Some sound cards have a third socket labelled Line In; others combine it with the Mic In socket and provide a software option to select which mode is active. Line In refers to the signal level the input socket expects. Microphones have very low signal levels, so they require high amplification by the sound card. Devices such as a CD player, mp3 player or mixing deck output a much higher signal at a standard level referred to as line level, so to record from them you will need to use the Line In socket or setting. Many devices which output audio have a Line Out socket, which you connect via a lead to the Line In on the computer.

Optical digital audio input (S/PDIF)

S/PDIF is used to interconnect high quality audio equipment in the professional audio field. It has begun to appear in consumer devices such as home theatre systems which support Dolby Digital or DTS surround sound. Using TOSLINK cables, S/PDIF also allows the transfer of audio data using optical technologies. Most Windows based machines don't provide support for S/PDIF, but optional high-end sound cards do. Contemporary Apple Mac computers combine the Line In socket with S/PDIF.

Dedicated Audio Recorders



cc licensed flickr photo shared by sridgway

For those who want higher quality recording and require a portable solution, there is a range of dedicated digital audio recorders available on the market. These typically have all or some of the following functionality:

  • Record in uncompressed or compressed formats e.g. mp3
  • Record to flash memory e.g. an SD card, accessible via a USB port
  • Light and portable with long battery life, 5-10 hrs of continuous operation
  • Have high quality internal microphones and provision to connect to a range of external microphones
  • Allow line in recording
  • Live monitoring of audio
  • Gain control on input levels
Some examples on the market today are

Edirol R-09HR

Marantz PMD661

Zoom H4n

Mobile Phones & mp3 Players



cc licensed flickr photo shared by sridgway

Most mobile phones these days will allow you to record audio to the internal memory in a compressed format such as mp3. Smartphones such as the Apple iPhone, RIM's BlackBerry and Android based phones either have recording built in or have numerous downloadable dedicated applications to support the function. While very convenient, this method relies on the small internal microphone of the phone and is only suitable for taking memos where the subjects are in close proximity.

Many portable mp3/music players such as the Apple iPod range and iRiver also support recording to memory. As with phones, poor microphones and sound quality are an issue, but these players are very cheap and highly portable and may be fine for recording your own presentations or an interview with 2 people at close range. The Apple iPod range has a large range of third-party hardware extensions to facilitate audio recording.


cc licensed flickr photo shared by sridgway

Audio sample settings

Irrespective of the device you are using to record audio you will need to decide what settings you wish to use prior to the digitisation process. These set the quality of the sampling process and cannot be altered once the file is captured so it is imperative you make the right decision first off.

When capturing audio we can control many of the settings which affect quality and file size.

When determining the settings for sound files there is a trade-off between file size and sound quality - the higher the sound quality, the larger the file size, and vice versa.

There are three factors which determine the quality and file size of your recording: the sample rate, the sample size, and whether you record in stereo or mono.

Factors Affecting File Quality and Size of a recording


Sample rate

[Figures: Analogue Signal; Sampled File. cc licensed images shared by Wikimedia Commons]

When recording audio we can select the level of sampling, or the sample rate. The sample rate refers to the number of samples per second taken by the sound recorder as it is recording.

Different sample rates give us different sound quality. The more samples of a sound that are taken, the better the sound quality. However, the higher the sample rate, the larger the file size.
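To make the idea of "samples per second" concrete, here is a minimal Python sketch that takes the samples a recorder would take of a pure tone. The function name sample_sine and the 440 Hz test tone are illustrative assumptions, not part of any recorder's software.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s):
    """Return the samples a recorder would take of a pure tone.

    One sample is taken every 1 / sample_rate seconds.
    """
    n_samples = round(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

# One hundredth of a second of a 440 Hz tone at two different sample rates
for rate in (11025, 44100):
    samples = sample_sine(440, rate, 0.01)
    print(rate, "samples/s ->", len(samples), "samples in 0.01 s")
```

The higher rate produces four times as many numbers for the same stretch of sound, which is why both the detail captured and the file size go up together.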

Some common sample rates found on recording hardware are listed below

Sample Rate                           Quality                                     Frequency range
11,025 samples per second (11 kHz)    Telephone quality                           0-5,512 Hz
22,050 samples per second (22 kHz)    AM radio quality, often used for podcasts   0-11,025 Hz
44,100 samples per second (44.1 kHz)  CD music quality                            0-22,050 Hz
48,000 samples per second (48 kHz)    DVD, Digital Audio Tape (DAT)               0-24,000 Hz
96,000 samples per second (96 kHz)    DVD/Blu-ray quality surround sound          0-48,000 Hz

For recordings of mainly voice, any sample rate between 22,050 Hz (here Hertz = samples per second) and 44,100 Hz is acceptable.

22,050 samples per second (22 kHz), AM radio quality, gives a small audio file while still retaining good sound quality, and is often used for podcasting.

The higher the frequency of the sounds in the analogue source, the higher the digital sample rate will need to be to faithfully reproduce the signal. Musical instruments and singing require a higher sample rate than pure talking voices.

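The Nyquist–Shannon sampling theorem states that perfect reconstruction of a signal is possible when the sampling frequency is greater than twice the maximum frequency of the signal being sampled. (Source: Sampling Rate, Wikipedia, the free encyclopedia)

In practice this means the highest frequency a recording can capture is just under half its sample rate, which is where the frequency range quoted for each sample rate comes from.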
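The relationship between sample rate and frequency range can be sketched in a few lines of Python. The function names are made up for illustration; the arithmetic is simply the "twice the maximum frequency" rule from the theorem.

```python
def max_frequency_hz(sample_rate):
    """Upper limit of the frequency range for a given sample rate (half the rate)."""
    return sample_rate / 2

def min_sample_rate(max_freq_hz):
    """The sample rate must be greater than this value (twice the top frequency)."""
    return 2 * max_freq_hz

for rate in (11025, 22050, 44100, 48000, 96000):
    print(f"{rate} samples/s -> up to {max_frequency_hz(rate):,.1f} Hz")
```

For example, assuming the upper limit of human hearing is around 20,000 Hz, min_sample_rate(20000) gives 40,000, so a rate above 40,000 samples per second is needed to cover it. The CD rate of 44,100 meets that requirement.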

Sample Size

When a sample is taken by an analogue to digital converter it is converted to a digital number. Numbers in computers are expressed as a series of bits; the more bits, the larger the number which can be expressed. For example, a 16 bit "word" of information can store a value between 0 and 65,535, a 24 bit number between 0 and 16,777,215, and a 32 bit number between 0 and 4,294,967,295. What this means in practical terms is that the larger the bit size, the greater the resolution and accuracy of the number used to express the level of the sample at a moment in time.
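As a rough illustration of how bit size affects accuracy, the Python sketch below counts the levels available at each sample size and rounds one signal level to the nearest storable step. The helper names and the example level of 0.3 are assumptions chosen for illustration; real converters work on signed values, but the principle is the same.

```python
def levels(bits):
    """Number of distinct values an unsigned integer of this size can hold."""
    return 2 ** bits

def quantise(value, bits):
    """Round a level in the range 0.0-1.0 to the nearest step available."""
    top = levels(bits) - 1          # largest storable number
    return round(value * top) / top

for bits in (16, 24, 32):
    stored = quantise(0.3, bits)
    print(f"{bits} bit: {levels(bits):,} levels, "
          f"rounding error {abs(stored - 0.3):.2e}")
```

Each extra bit doubles the number of levels, so the gap between the true level and the stored level shrinks as the sample size grows.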

Most recorders and software programs will have the following options

16 bit
24 bit
32 bit
32 bit float (the recorded samples are converted to floating-point numbers to provide greater mathematical accuracy in post production)

The most common is 16 bit, which is perfectly sufficient for most purposes, certainly for podcasts. Once again, the larger the bit size, the larger the sampled file will become. For high fidelity recording such as music, 24 bit is the preferred option.

Space required for an uncompressed recorded stereo digital file (compressed mp3 is shown as comparative example)

Sample Size   Sample Rate   Bit Rate       File size for 1 min   File size for 3 min
16 bit        44.1 kHz      1.35 Mbit/s    10.1 MB               30.3 MB
16 bit        48 kHz        1.46 Mbit/s    11.0 MB               33 MB
24 bit        96 kHz        4.39 Mbit/s    33 MB                 99 MB
mp3 file      128 kbit/s    0.13 Mbit/s    0.94 MB               2.82 MB
Reference: TweakHeadz Lab, 16 Bit vs. 24 Bit Audio: Discussion of the mysteries behind bit-depth, sample rates and sound quality
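The figures for uncompressed stereo audio can be worked out from the settings alone: bit rate = sample rate x sample size x number of channels. The Python sketch below does this. The function names are illustrative, and the use of binary units (1 MB = 1,048,576 bytes) is an inference from the published figures rather than something the reference states.

```python
def bit_rate(sample_rate, bits, channels=2):
    """Bits per second of uncompressed audio."""
    return sample_rate * bits * channels

def file_size_bytes(sample_rate, bits, seconds, channels=2):
    """Bytes needed for the audio data alone (file headers are ignored)."""
    return bit_rate(sample_rate, bits, channels) * seconds // 8

MIB = 1024 * 1024  # assumption: the quoted figures use binary megabits/megabytes

for bits, rate in ((16, 44100), (16, 48000), (24, 96000)):
    mbit = bit_rate(rate, bits) / MIB
    one_min = file_size_bytes(rate, bits, 60) / MIB
    print(f"{bits} bit, {rate} Hz: {mbit:.2f} Mbit/s, {one_min:.1f} MB per minute")
```

The same two functions can be used to estimate the size of any recording before you make it, by changing the number of seconds or channels.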

Most dedicated audio recorders will provide you with a range of preset combinations for uncompressed recording.

For example on a Marantz PMD661 you can choose from the following for uncompressed audio in the *.wav format

Sample Rate

44.1 kHz
48 kHz
96 kHz

Rec Format (Sample Size)

PCM-16
PCM-24
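The same choices appear when writing a .wav file in software. As a hedged sketch only, the Python below uses the standard library wave module to build a one second PCM-16 file in memory at a chosen sample rate. The function make_tone_wav and the 440 Hz test tone are assumptions for illustration and have nothing to do with the recorder's own firmware.

```python
import io
import math
import struct
import wave

def make_tone_wav(sample_rate=44100, channels=1, seconds=1.0, freq_hz=440):
    """Build a PCM-16 .wav file in memory containing a pure tone."""
    n_frames = round(sample_rate * seconds)
    frames = bytearray()
    for n in range(n_frames):
        level = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        sample = struct.pack("<h", int(level * 32767))  # signed 16 bit, little endian
        frames += sample * channels                     # same signal on every channel
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes per sample = PCM-16
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()

data = make_tone_wav(sample_rate=48000)
print(len(data), "bytes for one second of 48 kHz PCM-16 mono")
```

The returned bytes can be saved to a file with a .wav extension; changing sample_rate or channels changes the size of the result in direct proportion.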

Stereophonic (Stereo) or Monaural (mono)


Stereophonic sound, more commonly called stereo, is the process of simultaneously recording 2 independent tracks of audio. Used in combination with strategic placement of multiple microphones during recording, the technique was designed to simulate being present at the recording when listening to it through headphones. Monaural (mono) sound records a single track.

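As you would expect, a mono recording takes half the space of a stereo recording at the same settings, so the same amount of storage holds twice as much audio:
24 minutes of 16 bit 44.1 kHz stereo = 256 MB
48 minutes of 16 bit 44.1 kHz mono = 256 MB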
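These recording times can be checked with a short calculation. The Python sketch below assumes a 256 MB card holds 256,000,000 bytes and ignores file headers; the function name is illustrative.

```python
def minutes_of_audio(capacity_bytes, sample_rate, bits, channels):
    """Minutes of uncompressed audio that fit in the given space."""
    bytes_per_second = sample_rate * (bits // 8) * channels
    return capacity_bytes / bytes_per_second / 60

CARD = 256 * 1000 * 1000  # assumption: 256 MB counted as 256,000,000 bytes

print(f"stereo: {minutes_of_audio(CARD, 44100, 16, 2):.1f} min")
print(f"mono:   {minutes_of_audio(CARD, 44100, 16, 1):.1f} min")
```

Halving the number of channels halves the bytes used per second, so the mono time is exactly double the stereo time.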

Audio file formats


There are many different audio file formats, depending on operating system preferences, software choice and the encoding codecs employed. The three most common in our work are:

Wave file (.wav) - Default uncompressed audio format on Windows. Supported on almost all computer systems. Most recording programs and many recording MP3 players can record audio data in this format.

MPEG-4 (.m4a) - A container format used to store compressed audio encoded in an MPEG format. It is common in the Apple ecosystem, and the latest iPod nanos and iPhones record to this format.

MP3 (.mp3) - Compressed audio format that can considerably reduce file size while still retaining acceptable sound quality. MP3 compression works by discarding the parts of the sound that the human ear is least able to hear.

Because we want to store the files on portable devices and distribute them over the Internet, our goal is to achieve the smallest file size possible while still retaining acceptable sound quality. The compressed MP3 audio file format is best for this.
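To put a number on the saving, the Python sketch below compares the bit rate of uncompressed CD quality stereo with a 128 kbit/s MP3. It assumes 128 kbit/s means 128,000 bits per second; the function name is illustrative.

```python
def compression_ratio(sample_rate, bits, channels, compressed_bits_per_second):
    """How many times smaller the compressed stream is than the uncompressed one."""
    uncompressed = sample_rate * bits * channels
    return uncompressed / compressed_bits_per_second

# 16 bit, 44.1 kHz stereo against an MP3 at 128,000 bits per second (assumed)
ratio = compression_ratio(44100, 16, 2, 128_000)
print(f"about {ratio:.1f} : 1")
```

Under these assumptions the MP3 is roughly one eleventh the size of the uncompressed recording.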