
Challenges and issues

The main issue with binaural sound in music is its compatibility with playback systems other than headphones. ‘Binaural recordings can sound amazing on headphones, but don't work very well on conventional stereo speakers.’ (Marshall, 2004). The spatial localisation of sound does not translate well to loudspeaker systems because of crosstalk between the speakers. When listening through headphones, each ear hears only what is coming from its own earpiece, whereas when listening on speakers each ear hears both speakers. However, crosstalk cancellation on loudspeakers is possible if the listener is positioned correctly (Rumsey, 2011).

Figure 5 Cross-talk Diagram

Transaural processing is an emerging technology that aims to resolve the crosstalk issue. The audio signal from one speaker is phase inverted and fed into the other speaker. A time delay is applied so that the inverted signal cancels the sound from the original speaker when it reaches the ear on the opposite side, thus removing the crosstalk. This idea can be used in both binaural and surround sound audio playback. However, transaural processing is only effective in the ‘sweet spot’ of the speaker system setup (Jost and Jot, 2000).
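The inverted, delayed feed described above can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not a description of any real transaural product: the crosstalk path is modelled as a plain attenuation (`gain`) and delay (`delay`, in samples), the setup is perfectly symmetric, and the head's own filtering is ignored. The function names and parameter values are hypothetical.

```python
def crosstalk_cancel(left, right, gain=0.7, delay=3):
    """Pre-process a binaural pair for loudspeaker playback (toy model).

    gain and delay are assumed values describing how much quieter and
    later each speaker arrives at the opposite ear.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    spk_l, spk_r = [], []
    for i in range(len(left)):
        # What the opposite speaker emitted `delay` samples ago is
        # leaking into this ear right now, so subtract it in advance.
        past_r = spk_r[i - delay] if i >= delay else 0.0
        past_l = spk_l[i - delay] if i >= delay else 0.0
        spk_l.append(left[i] - gain * past_r)
        spk_r.append(right[i] - gain * past_l)
    return spk_l, spk_r


def simulate_ears(spk_l, spk_r, gain=0.7, delay=3):
    """Model what each ear receives: its own speaker plus the
    attenuated, delayed signal from the opposite speaker."""
    ear_l, ear_r = [], []
    for i in range(len(spk_l)):
        leak_r = spk_r[i - delay] if i >= delay else 0.0
        leak_l = spk_l[i - delay] if i >= delay else 0.0
        ear_l.append(spk_l[i] + gain * leak_r)
        ear_r.append(spk_r[i] + gain * leak_l)
    return ear_l, ear_r


if __name__ == "__main__":
    # A single click intended for the left ear only.
    left = [1.0] + [0.0] * 7
    right = [0.0] * 8
    print("without cancelling:", simulate_ears(left, right))
    print("with cancelling:   ", simulate_ears(*crosstalk_cancel(left, right)))
```

In this model the values of `gain` and `delay` follow from the distances between the speakers and the listener's ears. If the listener moves, those values no longer match the real acoustic path and the cancellation breaks down, which is one way of seeing why the effect is confined to the sweet spot.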

 

Another problem is that a large amount of music is consumed in mono, meaning no spatial effect is created on playback. For example, many portable radios play back sound in mono, as do many clubs and venues. Any stereo or binaural information is therefore collapsed down to a single mono track (Robjohns and Senior, 2011).
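The collapse to mono can be illustrated with a short Python sketch. It assumes the simplest possible downmix, an equal average of the two channels, and uses a made-up test tone that is louder in the left channel. Real devices may sum the channels differently, so the helper names and values here are illustrative only.

```python
import math


def to_mono(left, right):
    """Average the two channels into one (an assumed, simple downmix)."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(left, right):
    """Level difference between the two ear signals, in decibels."""
    return 20.0 * math.log10(rms(left) / rms(right))


if __name__ == "__main__":
    # Hypothetical source placed to the left: the right ear signal
    # is the same tone at half the amplitude.
    tone = [math.sin(2 * math.pi * i / 32) for i in range(256)]
    left, right = tone, [0.5 * s for s in tone]
    mono = to_mono(left, right)
    print("binaural level difference (dB):", level_difference_db(left, right))
    # In mono both ears receive the same signal, so the cue disappears.
    print("mono level difference (dB):", level_difference_db(mono, mono))
```

Once both ears receive the identical signal there is no level difference left between them, and the same applies to any time difference, so the cues that place a sound to one side are gone.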

 

The brain itself also causes a problem for binaural sound. Vision is the dominant sense in humans, meaning we rely on it more than any other sense, including hearing (Brainline.org, n.d.). We therefore naturally expect to see the source of the sounds we hear. This makes it particularly difficult to portray sounds that are in front of a person through binaural audio, as there is no visual source to match the sound (Pike, 2013).

 

Given that no two human heads are the same, binaural sound will never be perfectly matched to every listener. However, it can still create a close representation, even with a simple mass between two microphones rather than a dummy head (Pike, 2013). Results from binaural recording are often mostly realistic but not entirely convincing. Again, this is due to variation in the shape of the pinna (the part of the ear outside of the head), the size and shape of the head, and other bodily features. It has not yet been possible to easily and quickly provide a personalised head-related transfer function (HRTF) to each individual listener (Rumsey, 2014).

‘Put simply, HRTFs are the time, level, and spectral characteristics of the ear signals associated with different source locations.’ (Rumsey, 2014).
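As a rough illustration of how an HRTF is applied, the sketch below convolves a mono source with a pair of head-related impulse responses (HRIRs, the time-domain form of an HRTF), one per ear. The impulse responses used here are invented toy values that capture only a time delay and a level drop at the far ear. Measured HRIRs are much longer and also carry the spectral shaping caused by the pinna and head, which this sketch does not attempt to model.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source through a left and a right HRIR."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


if __name__ == "__main__":
    # Toy HRIRs for a source on the listener's left (assumed values):
    # the left ear hears it immediately at full level, the right ear
    # hears it three samples later at half level.
    hrir_left = [1.0, 0.0, 0.0, 0.0]
    hrir_right = [0.0, 0.0, 0.0, 0.5]
    click = [1.0, 0.0]
    ear_left, ear_right = render_binaural(click, hrir_left, hrir_right)
    print("left ear: ", ear_left)
    print("right ear:", ear_right)
```

Swapping in a different pair of impulse responses changes where the source appears to be, which is why a mismatch between the HRTF used and the listener's own affects how convincing the result is.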

It has been shown that using an HRTF that does not match the consumer's own can, in some cases, provide better localisation accuracy. If a person's head and ear shape is better suited to localising sound, using it for a binaural recording creates a more realistic experience for the listener (Rumsey, 2011).
