Querying Big‐graphs – Streaming and Beyond

Monday, 27 February 2017


ISI Red Room

Prof. Arijit Khan

Graphs are a ubiquitous model to represent objects and their relations. However, the complex combinations of structure and content, coupled with massive volume, high streaming rate, and uncertainty inherent in the data, raise several challenges that require new efforts for smarter and faster graph querying.

Many graphs, such as those formed by activity on social networks, communication networks, and telephone networks, are defined dynamically as rapid edge streams over a massive domain of nodes. Efficient processing of rapid and massive streams, often in space‐limited architectures such as FPGAs, network interface cards, routers, and switches, requires approximation through succinct synopses created in a single pass. In the first half of the talk, I shall discuss our novel synopsis structure that efficiently summarizes massive graph streams while retaining information about the structural behavior of the underlying graph dataset. I shall demonstrate how one can use our synopsis to determine important structural properties such as reachability over high‐frequency edges. In the second half of the talk, I shall discuss our latest progress on faster and more accurate stream processing: how adding a pre‐filtering stage that dynamically identifies and aggregates the most frequent items improves both the accuracy and the throughput of stream processing.

Arijit Khan is an Assistant Professor in the School of Computer Science and Engineering at Nanyang Technological University, Singapore. His research interests span big data, big graphs, and graph systems. He earned his PhD from the Department of Computer Science, University of California, Santa Barbara, USA, and did a post-doc in the Systems group at ETH Zurich, Switzerland. Arijit received the prestigious IBM PhD Fellowship in 2012-13. He has published several papers in premier database and data-mining conferences and journals, including SIGMOD, VLDB, TKDE, ICDE, SDM, EDBT, and CIKM. Arijit co-presented tutorials on emerging graph queries, big-graph systems, and uncertain graphs at ICDE 2012, VLDB 2014, and VLDB 2015, and has served on the program committees of KDD, SIGMOD, VLDB, ICDE, ICDM, EDBT, WWW, and CIKM. He served as co-chair of the Big-O(Q) workshop co-located with VLDB 2015, and contributed a chapter on big-graph querying and mining to the Springer Handbook of Big Data Technologies.