Cesar starts with raw video footage recorded at a specific event by different people, at different moments and from different perspectives. Next, software called a “narrative engine” uses what it knows of the relationships between people to create dynamic stories. These are tailored to an individual’s preferences, interests and social connections by automatically stitching together parts of clips into a seamless video stream.
“The stories are highly personal depending on the recipient of the story,” says Cesar.
The system works by synchronising all the video clips with a master audio track recorded at the event. The audio of each clip gives it a digital fingerprint, allowing similar footage in different clips to be matched up. The software analyses the video content - applying facial recognition techniques, for example - along with contextual information added by the film-maker. It then puts together clips, or parts of clips, producing a bespoke video edit for every user.
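The core synchronisation step - finding where each clip sits within the master audio track - can be sketched with a simple cross-correlation, a common stand-in for the fingerprint matching the article describes. This is a minimal illustration, not the team's actual software; the function name and toy signals are assumptions.

```python
import numpy as np

def estimate_offset(master, clip, sample_rate):
    """Estimate where `clip` starts within `master` by cross-correlating
    the two audio signals; the correlation peak marks the best alignment.
    (A simplified stand-in for audio-fingerprint matching.)"""
    corr = np.correlate(master, clip, mode="valid")
    offset_samples = int(np.argmax(corr))
    return offset_samples / sample_rate

# Toy example: a 1-second "clip" cut from a 5-second synthetic "master" track.
rate = 2000
rng = np.random.default_rng(0)
master = rng.standard_normal(5 * rate)
clip = master[2 * rate : 3 * rate]  # clip really starts 2 s into the master

print(estimate_offset(master, clip, rate))  # recovers ~2.0 seconds
```

In practice systems use compact spectral fingerprints rather than raw waveforms, so that clips recorded on different devices, with different noise, can still be matched efficiently.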
The system was tested at a school concert that was filmed with 12 cameras - some fixed, some belonging to parents in the audience - generating more than 300 raw video clips. These clips were pooled and annotated with personal details, including who was in the clip, or which musical instrument was shown. Most parents agreed that the tailored films made the viewing experience more personal.
The team presented the work at the 2012 Symposium on Document Engineering in Paris earlier this month. “We’re living in a world of abundant content,” says Mor Naaman of Mahaya, a start-up that is developing software to find and organise social media shared from real-world events. “The real technical challenge is to do this at scale.”