Speech enhancement using the binary mask method and its application to law enforcement

18 January 2011

Toby Davies

This work presents an analysis of the recently proposed 'binary mask' technique for enhancing the intelligibility of speech in noise, with a view to its deployment as a forensic tool within law enforcement, an area for which machine-learning algorithms are feasible. Computational analysis of the algorithm and its variants is carried out with the aim of improving understanding and optimisation of key parameters, and this is followed by intelligibility testing with human listeners.
In both cases the performance reported in previous work is not achieved; a significant deterioration of intelligibility is shown instead. Other factors, however, such as alternative mask definitions and the use of data smoothing, are seen to have intriguing effects on both pattern classification performance and intelligibility. Analysis of the technique from a legal perspective is also provided, focussing on the potential admissibility of such a technique in court and the requirements for legal validation as a forensic tool.