Sound Transform collage by Iuri Kothe
Although our repositioning of augmented reality as aura recognition (aurec) has brought us closer to the perspective necessary to envision new aurec applications, there remains a major obstacle facing widespread use of aurec: the user interface. Many opinion leaders are vocal advocates of visual interfaces for aurec, whether they take the form of smart phone aurec “windows” or high-tech sunglasses/contact lenses that display visual overlays directly in front of our eyes.
The shortcomings in these visions are threefold. 1) They neglect our eyes' naturally narrow perceptual “bandwidth” – our eyes function much like a stereoscopic laser scanner; we focus on every word when we read, not the whole page. Displays that cram our visual field with metadata are therefore bound to be highly distracting. And even if the overlays subtly follow our eyes to display information only about the things upon which we hold our gaze, there is a second fundamental flaw that will hold back this method of aurec for years to come: 2) economics. At present, aurec optical gear is very expensive, is not being mass-produced, and is likely to be unwieldy, never mind a major fashion blunder. Aside from some very enthusiastic science fiction fans, few regular consumers are prepared to line up this holiday season to buy aurec goggles. 3) Lastly, while we will most likely give up much privacy in order to benefit from aurec, we will still be more inclined to use aurec if we can do so discreetly. Holding our aurec devices out in front of us in order to see overlays on a person is not exactly subtle.
So, what to do? How can aurec progress now if the technologies available for visualization are presently limited in such fundamental ways as to make their widespread use a fantasy better suited to at least ten years hence? My sincere belief is that the answer lies soundly … in sound!
Humans have always used sound to carry metadata. With the wider “bandwidth” of our ears as a receptor, all manner of technologies – from church bells to alarm clocks, washing machine buzzers to AOL's “you've got mail” notification – have used sound as the medium of choice to transmit information that is specific to a particular place and moment. Furthermore, we can pick out this sonic information amongst the myriad background sounds with ease; our ears are made for it. Just as we can hear the voice of a friend in a noisy crowd, distinguish our own cellphone ringing in a busy train terminal, or listen just for the solo violin in an entire symphony, our sense of hearing can filter a vast volume of sonic information down to an incredibly granular level.
The economic advantages of sound are, in comparison to the visual options, tremendous. Everyone with a smart phone already has a pair of ear buds in their pocket, and we've already witnessed business people all over the globe become prototypical sonic cyborgs with their Bluetooth earpieces. The costs in bandwidth, storage, and processing power of delivering sound are far lower than those of visuals. The likelihood of early adoption of sonic aurec is therefore much higher, as far more people are likely to be early adopters if they don't need to buy new hardware.
Thanks to the prevalence of ear buds, sound is also a completely discreet carrier of information. By blending in with the background created by widespread use of personal mp3 players, aurec ear buds will not identify the wearer as unusual in any way. This covert quality will be critical for future models of aurec as well, as we come to expect ever more seamless aurec experiences and streamline the technology to make it more integrated and less distracting.
Relevant Further Reading
The motivations for using non-speech sound in human-computer interaction are manifold:
- Sound represents frequency responses in an instant (as timbral characteristics)
- Sound represents changes over time, naturally
- Sound allows microstructure to be perceived
- Sound rapidly portrays large amounts of data
- Sound alerts listeners to events outside their current visual focus
- Sound holistically brings together many channels of information
These distinct perceptual characteristics make sound an ideal complement to visually displayed information.
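The list above describes what is often called parameter-mapping sonification: data values are mapped onto a sound parameter such as pitch, so that changes over time become audible as a rising and falling melody. A minimal sketch (the data, function name, and frequency range here are all hypothetical illustrations):

```python
# Parameter-mapping sonification sketch: map a data series linearly
# into an audible frequency band so its shape can be heard over time.

def sonify(values, low_hz=220.0, high_hz=880.0):
    """Map each value linearly into an audible frequency range (Hz)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]

temperatures = [12.0, 14.5, 19.0, 23.5, 21.0]  # e.g. hourly readings
pitches = sonify(temperatures)
# The lowest reading maps to 220 Hz, the highest to 880 Hz;
# a synthesizer could then play one short tone per reading.
```

Fed to any tone generator, a listener would hear the afternoon warming and evening cooling as a melodic contour, without ever looking at a chart.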
Multi-touch designer and developer Richard Monson-Haefel considers sound an important part of our user interfaces. As an application of “Calm Technology” – a concept introduced by ubiquitous computing pioneers Mark Weiser and John Seely Brown, which revolves around giving feedback about the running state of a system in the ‘periphery’ of our consciousness – he proposes attaching a sound to every process running on your computer: a unique croak, chirp, or trill, like the frogs, crickets, and cicadas of a small pond at dusk. The result would be an ambient environmental murmur that people could learn to interpret.
“If every process had a unique croak, chirp, or trill – a sound that is the same every time the process is run – our computers would have a kind of natural ambient pond-like sound when it ran. At first we would take notice but after a short time the sound would settle into the periphery of our awareness so that we would only take notice when a new, and unexpected sound, was introduced. If we just installed some new software a new sound would register when the software was installed and become a part of the natural and healthy ambient audio rhythm of the computer. If, however, some new process – one we did not intentionally install – was introduced such as a virus, the new pond-sound (i.e. croak, chirp or trill) would be out of place and stand out. We might take notice and wonder, what new process is running?”
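The core of this pond metaphor – the same process always producing the same sound – can be sketched in a few lines. This is a hypothetical illustration, not Monson-Haefel's implementation; the function name and frequency band are my own assumptions:

```python
import hashlib

# Sketch of the "pond" idea: derive a stable tone from each process
# name, so the same process always "croaks" at the same pitch and a
# new, unexpected process stands out as an unfamiliar sound.

LOW_HZ, HIGH_HZ = 220.0, 880.0  # a comfortable audible band (assumed)

def process_tone(process_name: str) -> float:
    """Return a deterministic frequency (Hz) for a process name."""
    digest = hashlib.sha256(process_name.encode("utf-8")).digest()
    # First four digest bytes give a stable fraction in [0, 1].
    fraction = int.from_bytes(digest[:4], "big") / 0xFFFFFFFF
    return LOW_HZ + fraction * (HIGH_HZ - LOW_HZ)

# "backupd" always hashes to the same pitch; a virus would hash to a
# pitch the listener has never heard in this machine's "pond".
```

Because the mapping is deterministic, the ambient mix settles into the periphery of awareness exactly as the quote describes, and only a genuinely new process introduces a new sound.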
Writing about the human experience of night before electricity, A. Roger Ekirch points out that almost all interior architectural environments took on a murky, otherworldly lack of detail after the sun had gone down. It was not uncommon to find oneself in a room that was both spatially unfamiliar and possibly dangerous; to avoid damage to property as well as injury to oneself, several easy techniques of architectural self-location were required.
Citing Jean-Jacques Rousseau's book Émile, Ekirch suggests that echolocation was one of the best methods: a portable, sonic tool for finding your way through unfamiliar towns or buildings. And it could all be as simple as clapping. From Émile: "You will perceive by the resonance of the place whether the area is large or small, whether you are in the middle or in a corner." You could then move about that space with a knowledge, however vague, of your surroundings, avoiding the painful edge where space gives way to object. And if you get lost, you can simply clap again.
Ekirch goes on to say, however, that "a number of ingenious techniques" were developed in a pre-electrified world for finding one's way through darkness (even across natural landscapes by night). These techniques were "no doubt passed from one generation to another," he adds, implying that there might yet be assembled a catalog of vernacular techniques for navigating darkness. It would be a fascinating thing to read.
Some of these techniques, beyond Rousseau and his clapping hands, were material; they included small signs and markers such as "a handmade notch in the wood railing leading to the second floor," allowing you to calculate how many steps lay ahead, as well as backing all furniture up against the walls at night to open clear paths of movement through the household.
The history of independent cinema is one of the development of a visual language of increasing subtlety and expression. Locative or mobile media are in their infancy and are only just starting to explore work with a comparable range and depth. The idea that a real space could become the diegetic extension of narrative is as relevant to architects as it is to cultural theorists, filmmakers, or media artists. We are witnessing the birth of a medium for which sound is the most appropriate tool – one in which, for obvious reasons, the auditory is finally on an equal footing with the visual. To quote Sean Cubitt:
“In the evolving audiovisual arts, sound can no longer afford to subordinate itself to vision, nor can it demand of audiences that they inhabit only ideal and interchangeable space. Any relation to screen will require that the audience be mobilised. … Sound enters space not to imitate sculpture or architecture, but, through electronic webs, to weave a geographic art that understands too that the passage of time is the matter of history: a diasporan art.”