Forró Fingerprinting

Audio fingerprinting is a building block of Automatic Content Recognition (ACR), whose job is to identify content such as television shows, advertisements, movies, and music. Forró is designed for the monitoring use case and scales to hundreds of millions of consumer media devices (e.g., TVs and set-top boxes). In the monitoring use case, each participating device generates fingerprints every second from the audio passing through its audio pipeline. Doing this cost-effectively requires careful design not only of the fingerprint generation algorithms but also of the ACR infrastructure.
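To give a feel for the scale this implies, the following back-of-the-envelope sketch estimates the aggregate fingerprint load. The numbers are assumptions chosen for illustration, not Forró specifications: 100 million devices (the low end of "hundreds of millions"), one fingerprint per device per second, and a hypothetical 64-byte fingerprint. The function name `aggregate_load` is likewise invented for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def aggregate_load(devices: int,
                   fingerprints_per_device_per_sec: float = 1.0,
                   bytes_per_fingerprint: int = 64) -> dict:
    """Estimate fleet-wide fingerprint volume.

    All defaults are illustrative assumptions, not measured Forró values.
    """
    per_sec = devices * fingerprints_per_device_per_sec
    return {
        "fingerprints_per_sec": per_sec,
        "fingerprints_per_day": per_sec * SECONDS_PER_DAY,
        "ingest_bytes_per_sec": per_sec * bytes_per_fingerprint,
    }


# Assumed fleet size: 100 million devices.
load = aggregate_load(devices=100_000_000)
for name, value in load.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the backend would see on the order of 10^8 lookups per second and several trillion per day, which is why per-fingerprint cost dominates the design.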

Forró ACR generates viewership data for non-real-time and near-real-time applications. For non-real-time applications, the key metric is not identification time but the shortest span of content that the ACR can accurately identify. For near-real-time applications such as trending, a delay of 5 to 10 seconds in identifying content is acceptable.

The monitoring use case is very different from that of Shazam, which 1) is user-initiated and thus generates far less frequent queries per consumer device, 2) disambiguates music rather than the broader set of media, and 3) is real-time, requiring responses preferably in under 1 second.

The non-real-time nature of most of the intended applications allows optimizations not possible for Shazam and similar systems.

Forró exploits advances in machine learning and uses hardware acceleration that was not available in older consumer devices but is now widely present.