Maximum IT
Features

White Paper: Audio Fingerprinting


You’re twiddling your thumbs while waiting in the check-out line at your favorite retailer and you hear a great new song over the PA system. You could turn to the next person in line and ask if they know it—engaging in an impromptu but probably fruitless game of Name That Tune—or you could whip out your smartphone, record a snippet of it, and send it to a music-discovery service. It will report back with the name of the song and that of the artist who recorded it, which album it appears on, what year it was released—heck, with a couple of button presses, you can buy the song right then and there.

What technological magic makes such a thing possible? It’s called audio fingerprinting, and it’s gaining significant traction with both music lovers and rights holders looking to protect their assets. There are two basic components to an audio-fingerprinting system: a database containing the unique audio fingerprints of millions of songs, and a tool that can analyze a song and search that database for a match.

Creating an audio fingerprint is a lot trickier than it sounds. The human ear will perceive the CD version of the Beatles’ “Eleanor Rigby,” for instance, to be identical to the version that’s ripped and encoded as an MP3 at a bit rate of 128Kb/sec; audio quality aside, they’re both “Eleanor Rigby.” A computer comparing the bytes used to store those two files, on the other hand, will find them completely different. And the same goes for a third version encoded using FLAC and a fourth encoded using AAC.
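As a rough illustration of that point, the sketch below stores the same short tone in two different ways and compares the stored bytes with a cryptographic hash. The two encoders are stand-ins chosen for simplicity (plain 16-bit and 8-bit PCM rather than MP3, FLAC, or AAC), and all function names are made up for this example.

```python
import hashlib
import math
import struct

def sine_samples(freq_hz=440.0, rate=8000, n=800):
    """Return n samples of a sine tone as floats in [-1, 1]."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def encode_pcm16(samples):
    """Pack samples as signed 16-bit little-endian PCM."""
    return struct.pack(f"<{len(samples)}h", *(round(s * 32767) for s in samples))

def encode_pcm8(samples):
    """Pack samples as unsigned 8-bit PCM (a coarser stand-in for a lossy codec)."""
    return bytes(round((s + 1) * 127.5) for s in samples)

tone = sine_samples()
digest16 = hashlib.sha256(encode_pcm16(tone)).hexdigest()
digest8 = hashlib.sha256(encode_pcm8(tone)).hexdigest()

# Same sound, different bytes: a byte-level comparison sees two unrelated files.
print(digest16 == digest8)  # False
```

Both byte strings play back as the same note, yet nothing in their digests hints at that, which is why a fingerprint has to be derived from the sound itself.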

To get around this problem, a software algorithm (a precise sequence of instructions adhering to a specific set of rules) must analyze the actual sound of a song to determine its perceptual characteristics instead of simply relying on the way bits are arranged to store it. Most audio-fingerprinting systems rely on two types of perceptual characteristics in music: semantic features, such as beats per minute, genre, and mood; and non-semantic features, such as pitch, amplitude, and spectral flatness (a measure of how evenly power is spread across a signal’s frequency bands, which separates noise-like sounds from tonal ones). Semantic features are inherently more difficult to compute than non-semantic features, because they don’t always have clear and unambiguous meanings and they can evolve over time. “Hard rock,” for instance, classified a completely different type of music 30 years ago than it does today, while “prog rock” and “hip-hop” didn’t exist at all back then.
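To make one of those non-semantic features concrete, here is a minimal sketch of spectral flatness, computed in the usual way as the geometric mean of the power spectrum divided by its arithmetic mean. The naive transform, the frame length of 256 samples, and the small `eps` guard are choices made for this example, not details taken from any particular product.

```python
import cmath
import math
import random

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 1..N/2-1; O(N^2), fine for a short frame."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append(acc.real * acc.real + acc.imag * acc.imag)
    return spectrum

def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum, in (0, 1]."""
    power = [p + eps for p in power_spectrum(frame)]  # eps avoids log(0)
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic

n = 256
tone = [math.sin(2 * math.pi * 16 * i / n) for i in range(n)]  # one pure pitch
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]                # hiss-like signal

print(f"tone flatness:  {spectral_flatness(tone):.3g}")
print(f"noise flatness: {spectral_flatness(noise):.3g}")
```

The pure tone puts nearly all of its power into a single frequency bin, so its flatness comes out close to zero, while the noise spreads power across every bin and scores far higher.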

Divide and Conquer

Most audio-fingerprinting systems divide an audio signal into a series of frames. They then use some form of fast Fourier transform (FFT) algorithm to track changes in the semantic and non-semantic features described above. Finally, a classification algorithm examines the frames and organizes them into sub-fingerprints. The basic unit that contains enough data to identify an audio clip consists of a series of sub-fingerprints and is known as a fingerprint block. These audio fingerprints are then stored in a database.
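The frame, transform, and sub-fingerprint steps can be sketched in a few lines. The example below is loosely modeled on the band-energy-difference scheme published by Haitsma and Kalker, in which each bit records whether the energy gap between two neighboring frequency bands grew or shrank from one frame to the next. Which scheme any given product uses is not stated here, and the frame length, hop size, linear band layout, and 16-bit width are all assumptions chosen to keep the example small.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def band_energies(frame, n_bands):
    """Sum spectral power inside n_bands equal-width bands (real systems often use log spacing)."""
    half = len(frame) // 2
    power = [c.real * c.real + c.imag * c.imag for c in fft(frame)[:half]]
    edges = [i * half // n_bands for i in range(n_bands + 1)]
    return [sum(power[lo:hi]) for lo, hi in zip(edges, edges[1:])]

def sub_fingerprints(samples, frame_len=256, hop=64, n_bands=17):
    """Return one (n_bands - 1)-bit integer per pair of consecutive overlapping frames."""
    starts = range(0, len(samples) - frame_len + 1, hop)
    energies = [band_energies(samples[s:s + frame_len], n_bands) for s in starts]
    block = []
    for prev, cur in zip(energies, energies[1:]):
        bits = 0
        for b in range(n_bands - 1):
            # Did the gap between band b and band b+1 grow since the previous frame?
            diff = (cur[b] - cur[b + 1]) - (prev[b] - prev[b + 1])
            bits = (bits << 1) | int(diff > 0)
        block.append(bits)
    return block

rate = 8000
samples = [math.sin(2 * math.pi * (300 * t + 400 * t * t))   # a rising test tone
           for t in (i / rate for i in range(2048))]
block = sub_fingerprints(samples)  # this list plays the role of a fingerprint block
print(len(block), [f"{p:04x}" for p in block[:4]])
```

Because each bit depends only on the sign of an energy difference, playing the same clip at a uniformly lower volume yields the same bits, which is the kind of robustness a byte-level comparison cannot offer.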

An unidentified song is analyzed by software running on a PC, smartphone, or similar device in the same way that the songs in the database were. The software relies on a hash table that serves as an index to the fingerprint database, compares the unidentified song’s fingerprint to those in the database, and searches for a match. The software doesn’t need to analyze the entire song to derive a fingerprint; typically, just three or four seconds are enough. The algorithm needs only to find points of similarity between the unknown song and an entry in the database to make a match. In this respect, identifying songs based on their musical fingerprint is very similar to the way that a forensic expert matches a suspect’s fingerprint to one found at a crime scene.
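A minimal sketch of that lookup step follows, assuming sub-fingerprints are plain integers and that a match means several of them line up at a consistent position in one song. The song titles, fingerprint values, and the `min_votes` threshold are invented for illustration, and a production system would also tolerate bit errors rather than requiring exact hits.

```python
from collections import Counter, defaultdict

def build_index(database):
    """Map each sub-fingerprint value to every (title, position) where it occurs."""
    index = defaultdict(list)
    for title, prints in database.items():
        for pos, value in enumerate(prints):
            index[value].append((title, pos))
    return index

def identify(index, query, min_votes=3):
    """Vote for (title, start position) alignments; return the winner or None."""
    votes = Counter()
    for offset, value in enumerate(query):
        for title, pos in index.get(value, ()):
            votes[(title, pos - offset)] += 1
    if not votes:
        return None
    (title, start), count = votes.most_common(1)[0]
    return (title, start) if count >= min_votes else None

# Hypothetical database: two songs, six made-up sub-fingerprints each.
database = {
    "Song A": [0x1A2B, 0x3C4D, 0x5E6F, 0x7081, 0x92A3, 0xB4C5],
    "Song B": [0x0F0F, 0x3C4D, 0xF0F0, 0x1234, 0x5678, 0x9ABC],
}
index = build_index(database)

# A short snippet from the middle of Song A, with one value corrupted by noise.
query = [0x5E6F, 0xFFFF, 0x92A3, 0xB4C5]
print(identify(index, query))  # ('Song A', 2)
```

Only three of the four query values hit the index, but all three agree on the same song and the same starting position, which is enough points of similarity to call a match without scanning the whole database.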

Real-world Applications

Consumers can already choose from a number of software applications that make good use of audio fingerprinting. The free, open-source Picard program, for instance, can help identify mystery tracks in your music library by analyzing songs and comparing their characteristics to audio fingerprints stored in the free, user-maintained MusicBrainz metadata database. When a match is found, Picard can update the tracks’ ID3 tags with the correct song title, artist name, album title, genre, and more.

Shazam Entertainment offers software that you can install on an iPod Touch, iPhone, or Android smartphone to record a snippet of music (creating a file about 20KB in size). The software sends this file over the Internet to Shazam (the recording is not retained on your phone), which attempts to match the audio fingerprint with one in its database. If it’s successful, it sends back a message informing you of the track title, artist name, album title, and other information. From there, you can search for related videos on YouTube or buy the track from iTunes (if you’re using the iPod Touch or iPhone) or the Amazon MP3 store (if you’re using an Android phone).

Several companies offer commercial software that uses audio fingerprinting to help identify and track copyrighted music and video. Audible Magic, for instance, operates a fingerprint database containing more than five million works. Its customers use this data to identify copyrighted content not only as it moves over the Internet, but also in radio and television broadcasts.

COMMENTS:7

I wonder if this is how they verify the authenticity of an Obama Bin Laden message.

?

Isn't this like the 3rd repeat article of this?

Cool tech

Picard seems to work really well in most cases...in fact, better than Winamp's auto-tagger, which I believe uses this very same technology.  I wonder how long it'll be before voice recognition using this technology is available to the masses.

Poor Picture

Could someone please explain to me what the picture of the memristor has to do with audio fingerprinting?

Human Error?

Hahahahaha. Maybe audio fingerprints are processed by memristors? Doubt it SEVERELY, but it's funny to see a mistake once in a while.

I've been dreaming about this technology for years. Glad to see that it is ready for consumers to use.
