Date of Original Version



Technical Report

Abstract or Table of Contents

In this technical report, we present a new machine learning-based audio processing system that enables synchronization of audio-rich video and discovery of specific sounds at the frame level within a video. This tool is particularly useful when analyzing large volumes of video obtained from social media and other open source Internet platforms that strip technical metadata during the uploading process. The tool creates a unique sound signature at the frame level for each video in a collection and synchronizes videos that were recorded at the same time and location. Synchronization ultimately provides a multi-perspectival view of a specific event, enabling efficient event reconstruction and analysis by investigators. The tool can also be used to search for specific sounds (such as gunshots) within a video collection. Both of these tasks are labor-intensive when carried out manually by human investigators. We demonstrate the utility of this system by analyzing video from Ukraine and Nigeria, two countries currently relevant to the work of the Center for Human Rights Science collaborators.