Graph Representation Learning in the Presence of Community Outliers

August 30, 9:00 am - 10:30 am (CEST)

Speakers: Sambaran Bandyopadhyay, Saley Vishal Vivek, M. Narasimha Murty

Tutorial website: https://sites.google.com/view/node-embedding-with-outliers/home

Agenda: Graph Representation Learning in the Presence of Community Outliers

Graph representation learning (also known as graph embedding) has received a significant amount of interest from the network analysis and general artificial intelligence community. Here, the goal is to learn vector representation of the nodes of a graph (or the entire graph itself) which can then be used with any standard machine learning algorithms for different graph mining tasks. However, real world networks come with community outliers, which violate the overall community structure of the graph. The presence of the outlier nodes are not known and they are often difficult to detect. As these outlier nodes are connected to other nodes in the graph randomly, state-of-the-art graph representation algorithms suffer heavily in the presence of even a small number of community outlier nodes in a graph. Thus, it is important to minimize the effect of outlier nodes while generating the representation of the other nodes in the network.

In this tutorial, we start with giving a high level overview of the state-of-the-art node representation learning algorithms such as random walk based unsupervised algorithms, graph neural network based semi-supervised algorithms, etc. We introduce different types of community outliers in a graph and show their adverse effect on the node embedding algorithms. We will cover in details about all the state-of-the-art recently proposed node representation algorithms which minimize the adverse effect of the outlier nodes in the representation learning framework. All these algorithms integrate outlier detection and node embedding into a single optimization framework. We also cover detailed experimental analysis and comparisons before concluding the tutorial.

Sambaran Bandyopadhyay (sambband@in.ibm.com) is an Advisory Research Engineer at IBM Research, India. He is also a final year PhD student (Aug, 2016 to July, 2020) at the department of Computer Science and Automation at Indian Institute of Science, Bangalore. Sambaran has published his research works in top tier AI conferences such as AAAI, IJCAI, KDD, WSDM, ICAPS, ECAI etc. and also has 8 patents filed to USPTO. He has been a member of the technical program committee of the top AI conferences such as ICML, AAAI, ECML-PKDD, etc.

Saley Vishal Vivek (vishalsaley@iisc.ac.in) has completed Master of Technology from the department of Computer Science and Automation at Indian Institute of Science, Bangalore. He is currently working as an Applied Scientist at Amazon Machine Learning, India.

M. Narasimha Murty (mnm@iisc.ac.in) received the PhD degree from the Indian Institute of Science, Bangalore, India, in 1982. He is currently a professor in the Department of Computer Science and Automation at the Indian Institute of Science. He has guided about 20 PhD students in the areas of pattern recognition and data mining. He has also published around 125 papers in various journals and conference proceedings in these areas. His coauthored work, “Data Clustering: A Review” (with Anil Jain and P.J. Flynn, ACM Computing Surveys, 1999), is among the most popular data mining and computing survey articles downloaded (more than 16,000 citations according to Google Scholar). He is also an elected fellow of the Indian National Academy of Engineering (FNAE), New Delhi, and the National Academy of Sciences (FNASc), Allahabad. In the past, he has worked on several Indo-US projects and visited Michigan State University, East Lansing, and University of Dauphine, Paris. His current research interests primarily span data mining.