About TalkMiner

Lecture webcasts are readily available on the Internet, including class lectures, research seminars, and product demonstrations. Webcasts routinely combine presentation slides with either a synchronized audio stream (i.e., a podcast) or an audio/video stream. Conventional web search engines retrieve this content if you include "webcast" or "lecture" among your search terms, or if you search a website that specifically organizes lecture content. But users, particularly students, want to find the place in a lecture where an instructor covers a specific topic. Answering such queries requires a search engine that can search within the webcast itself to identify important keywords.

TalkMiner aggregates and indexes lecture videos available across the Internet. It collects lecture videos by processing RSS feeds from a variety of sites, then automatically analyzes each video to generate metadata describing the talk, including the video frames that contain slides, their time offsets, and the text recovered from those frames by optical character recognition (OCR). TalkMiner does not maintain a copy of the original videos: when a user plays a lecture, the video is streamed from the original website on which it is hosted. As a result, TalkMiner's storage requirements are modest. For more information on our use of video distributed on the Internet, refer to our Privacy Policy.
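To make the metadata concrete, the sketch below shows one plausible shape for a per-talk record, assuming only what the text above states: slide frames, their time offsets, OCR text, and a pointer back to the hosting site rather than a stored copy of the video. The class and field names are illustrative, not TalkMiner's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class SlideEntry:
    time_offset_s: float  # offset of the slide frame from the start of the video
    ocr_text: str         # text recovered from that frame by OCR

@dataclass
class TalkMetadata:
    title: str
    source_url: str       # the video stays hosted on the original site
    slides: list[SlideEntry] = field(default_factory=list)

# Hypothetical example record: only metadata is stored, never the video.
talk = TalkMetadata(
    title="Intro to Information Retrieval",
    source_url="https://example.edu/lectures/ir-intro.mp4",
    slides=[
        SlideEntry(time_offset_s=12.0, ocr_text="Inverted Indexes"),
        SlideEntry(time_offset_s=95.5, ocr_text="Boolean Retrieval"),
    ],
)
```

A record like this is tiny compared to the video it describes, which is why storage costs stay low.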

TalkMiner's automatic video analysis identifies slide images within each lecture video. Because slide images underlie both the search index and the browsing interface, automatic detection is a critical component of the system. The text-based search index organizes the videos according to the words extracted from the presentation slides. When a user issues a query, the recovered text from the entire lecture is used to identify relevant lecture videos. If the user selects a specific talk for detailed review, thumbnail images of all slides within that video are shown, with visual cues indicating slides that contain one or more of the search terms. When the user locates a slide of interest, clicking its thumbnail plays the video segment in which it appears. The interface thus enables efficient content-based search and non-linear playback.
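The within-talk step described above can be sketched as a simple term match over the per-slide OCR text: given a query, flag the slides whose text contains any query term and return their time offsets, which the interface would use to mark thumbnails and start playback. This is a minimal illustration of the idea, not TalkMiner's actual retrieval code, which would also rank whole talks against the query.

```python
def find_matching_slides(talk_slides, query):
    """Given (time_offset_s, ocr_text) pairs for one talk, return the
    time offsets of slides whose OCR text contains any query term."""
    terms = {t.lower() for t in query.split()}
    hits = []
    for offset, text in talk_slides:
        words = {w.lower() for w in text.split()}
        if terms & words:  # at least one query term appears on the slide
            hits.append(offset)
    return hits

# Hypothetical slide list for one talk
slides = [(12.0, "Inverted Indexes"), (95.5, "Boolean Retrieval"), (180.0, "Summary")]
find_matching_slides(slides, "indexes")  # → [12.0]
```

Clicking the thumbnail for the matching slide would then seek the remotely hosted video to that offset.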

TalkMiner builds its index and interface from commonly recorded video rather than relying on a custom-designed lecture-capture system or careful post-capture authoring. Moreover, the system does not impose onerous constraints on the captured video. TalkMiner's slide detection algorithm was developed to handle common lecture webcast production techniques, including shooting a slide screen from the back of the room, picture-in-picture compositing, and multiple-camera productions that intersperse full-frame shots of slides with shots of the speaker. Each of these cases presents challenges to assembling a set of distinct and useful slide keyframes for browsing an individual lecture video. Additionally, we developed processing for slides with built-up content that is revealed gradually (e.g., bullet lists). TalkMiner's specialized content analysis allows it to scale to a greater volume and variety of content, while its minimal storage and processing requirements let it do so at a much lower cost than would otherwise be possible.
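The built-up slide problem above admits a simple heuristic sketch: when consecutive detected slides each extend the previous one (bullets appearing one by one), collapse the sequence into a single keyframe that keeps the earliest time offset (when the slide first appeared) and the final, complete content. This toy version operates on OCR text for clarity; the actual system works on video frames and handles many more cases.

```python
def collapse_builds(frames):
    """Collapse gradually built-up slides into one keyframe each.
    frames: list of (time_offset_s, ocr_text) in playback order.
    Heuristic (assumed, not TalkMiner's real test): if a frame's text
    contains the previous kept frame's text, it is a fuller version of
    the same slide; keep the first offset but the fuller text."""
    kept = []
    for offset, text in frames:
        if kept and kept[-1][1] in text:
            kept[-1] = (kept[-1][0], text)  # same slide, more content revealed
        else:
            kept.append((offset, text))     # a genuinely new slide
    return kept

# Hypothetical build-up: two bullets appear, then a new slide
frames = [
    (10.0, "Outline"),
    (20.0, "Outline - Intro"),
    (30.0, "Outline - Intro - Methods"),
    (60.0, "Results"),
]
collapse_builds(frames)  # → [(10.0, "Outline - Intro - Methods"), (60.0, "Results")]
```

Keeping only the final frame of each build sequence yields one distinct, fully populated thumbnail per slide in the browsing view.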

Technical details appear in our 2010 ACM Multimedia paper; see also the FAQ.

TalkMiner press mentions

MIT Technology Review: Video Content Search Gets a Boost

CBCnews: Wanted: Better ways to search online video

Review and description of TalkMiner at MakeUseOf.com