WIT Press


Normalized Auditory Attention Levels For Automatic Audio Surveillance

Price

Free (open access)

Paper DOI

10.2495/RISK080441

Volume

39

Pages

10

Page Range

453 - 462

Published

2008

Size

631 kb

Author(s)

L. Couvreur, F. Bettens, J. Hancq & M. Mancas

Abstract

In this paper, we define features that can be computed along audio signals in order to assess the level of auditory attention on a normalized scale, i.e., between 0 and 1. The proposed features are derived from a time-frequency representation of audio signals and highlight salient regions, such as regions with high loudness or high temporal and frequency contrast. Normalized auditory attention levels can be used to detect sudden and unexpected changes of audio textures and to focus the attention of a surveillance operator on sound segments of interest in monitored audio streams. The proposed algorithms have been tested on audio material consisting of security-relevant audio events (e.g., gunshot, glass breaking, woman’s scream, siren sound, etc.) embedded in sound ambiences in public places (e.g., airport hall, metro station, subway train, sport stadium, etc.).

1 Introduction

Nowadays, public security represents a major challenge for public authorities and a profitable market for private companies. More and more surveillance equipment is deployed and human resources are enlisted in order to monitor and secure public places (e.g., urban zones, mass transportation hotspots, wide commercial malls, large sporting or cultural events, massive community demonstrations, etc.). Such public security is classically achieved by remotely operating numerous video sensors at key locations in the places to be secured and conveying images via network equipment to screen walls in surveillance rooms. In order to enhance the awareness of the surveillance operators, this security system is more and more often complemented with sensors of different

Keywords

public security, audio surveillance, normalized auditory attention levels, audio-based saliency levels, audio-based rarity levels.
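
The abstract describes attention levels that are derived from a time-frequency representation and normalized between 0 and 1, but it does not give the actual feature definitions. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm: it uses a log-energy proxy for loudness, spectral flux as a stand-in for temporal contrast, min-max normalization over the analyzed excerpt, and a simple maximum to combine the two. All function names and parameters (`frame_spectra`, `attention_levels`, `demo_signal`, `frame_len`, `hop`) are hypothetical.

```python
import cmath
import math
import random


def frame_spectra(signal, frame_len=64, hop=32):
    """Magnitude spectrum of each Hann-windowed frame (naive DFT, for illustration)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        spectra.append(spectrum)
    return spectra


def normalize(values):
    """Min-max scaling to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def attention_levels(signal, frame_len=64, hop=32):
    """One level in [0, 1] per frame (assumed features, not the paper's)."""
    spectra = frame_spectra(signal, frame_len, hop)
    # Assumed loudness proxy: log of the spectral energy of each frame.
    loudness = [math.log10(sum(m * m for m in s) + 1e-12) for s in spectra]
    # Assumed temporal contrast: spectral flux between consecutive frames.
    contrast = [0.0] + [sum(abs(a - b) for a, b in zip(cur, prev))
                        for prev, cur in zip(spectra, spectra[1:])]
    # Assumed combination rule: a frame is as salient as its strongest feature.
    return [max(l, c) for l, c in zip(normalize(loudness), normalize(contrast))]


def demo_signal(length=1024, burst_start=512, burst_len=64, seed=0):
    """Quiet synthetic background noise with one short, loud tone burst."""
    rng = random.Random(seed)
    signal = [0.01 * rng.gauss(0.0, 1.0) for _ in range(length)]
    for n in range(burst_start, burst_start + burst_len):
        signal[n] += math.sin(2 * math.pi * 8 * (n - burst_start) / burst_len)
    return signal
```

Calling `attention_levels(demo_signal())` returns one value per frame. Frames that overlap the burst receive levels close to 1, while frames containing only background noise stay low. A real system would likely replace the naive DFT with an optimized transform and normalize against a running background model rather than a single excerpt.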