Real-time topic detection with bursty n-grams: RGU's submission to the 2014 SNOW Challenge.
MARTIN, C. and GOKER, A. 2014. Real-time topic detection with bursty n-grams: RGU's submission to the 2014 SNOW challenge. In the Proceedings of the social news on the web 2014 data challenge (SNOW-DC 2014), 8th April 2014, Seoul, Korea. Seoul: CEUR [online], pages 9-16. Available from http://ceur-ws.org/Vol-1150/martin.pdf
Twitter is becoming an ever more popular platform for discovering and sharing information about current events, both personal and global. The scale and diversity of messages make the discovery and analysis of breaking news very challenging. Nonetheless, journalists and other news consumers are increasingly relying on tools to help them make sense of Twitter. Here, we describe a fully automated system capable of detecting trends related to breaking news in real time. It identifies words or phrases that 'burst' with a sudden increase in frequency, and groups these into topics. It then identifies a diverse set of recent tweets related to each topic, and uses these to create a suitable human-readable headline. In addition, images from these tweets are added to the topic. Our system was evaluated using 24 hours of tweets as part of the Social News On the Web (SNOW) 2014 data challenge.
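To make the idea of a 'bursting' word or phrase concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the scoring rule (a ratio of tweet counts between the current time window and the previous one), the function names, and the thresholds `min_count` and `min_ratio` are all assumptions made for illustration. It also omits tokenisation details, stop-word filtering, and the grouping of bursty n-grams into topics.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tweets, max_n=2):
    """Count, for each n-gram, how many tweets in the window contain it."""
    counts = Counter()
    for text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenisation (assumption)
        seen = set()
        for n in range(1, max_n + 1):
            seen.update(ngrams(tokens, n))
        counts.update(seen)  # each n-gram is counted once per tweet
    return counts


def bursty_ngrams(current, previous, min_count=2, min_ratio=3.0):
    """Return (n-gram, score) pairs whose frequency rose sharply.

    The score is a hypothetical burst measure: the tweet count in the
    current window divided by the count in the previous window plus one
    (the +1 avoids division by zero for unseen n-grams).
    """
    cur = count_ngrams(current)
    prev = count_ngrams(previous)
    scores = {}
    for gram, count in cur.items():
        if count < min_count:
            continue  # ignore n-grams that are too rare to be a trend
        ratio = count / (prev.get(gram, 0) + 1)
        if ratio >= min_ratio:
            scores[gram] = ratio
    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `bursty_ngrams(current_window, previous_window)` with two lists of tweet texts returns the n-grams that appear far more often in the current window. A real system would need a more robust burst measure and a later step that clusters the resulting n-grams into topics.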