An Overview of Graph Convolutional Networks

Technical Column

Author: Liu Zhongyu

Edited by Luobotu

Today I want to talk about Graph Convolutional Networks. As artificial intelligence has developed, many people have become familiar with concepts such as machine learning, deep learning, and convolutional neural networks, but Graph Convolutional Networks are mentioned far less often. So what are they? Simply put, the object of study is graph data, and the model is a convolutional neural network.

Why Graph Convolutional Networks

Since 2012, deep learning has achieved great success in the fields of computer vision and natural language processing. How does it outperform traditional methods?

Suppose we have an image to classify. Traditional methods require manually extracting features such as texture, color, or other higher-level features. These features are then fed into a classifier such as a random forest, which outputs a label indicating the category. Deep learning, in contrast, feeds the image into a neural network and directly outputs a label. This end-to-end learning avoids manual feature extraction and hand-written rules by extracting features automatically from the raw data. Compared with traditional methods, deep learning can learn more effective features and patterns.

Convolutional neural networks are great, but they are limited to data in Euclidean domains. What is Euclidean data? Its most significant feature is a regular spatial structure: an image is a regular square grid, and speech is a regular one-dimensional sequence. Such data can be represented as one-dimensional or two-dimensional matrices, which convolutional neural networks process very efficiently.

In real life, however, much data has no regular spatial structure; this is called non-Euclidean data. Examples include the graphs abstracted from recommendation systems, electronic transactions, computational geometry, brain signals, and molecular structures. In these graphs, each node is connected differently: some nodes have three connections while others have two, so the data structure is irregular.
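
To make the contrast concrete, here is a minimal Python sketch. The toy sequence, the node names, and the edges are all hypothetical and chosen only for illustration: on a one-dimensional sequence every interior position has the same number of neighbors, while in a graph the neighbor count varies from node to node.

```python
# Regular (Euclidean) data: in a 1-D sequence, every interior position
# has exactly two neighbors, so one fixed-size kernel fits everywhere.
sequence = [0.2, 0.5, 0.1, 0.9, 0.4]
sequence_neighbor_counts = [
    len([j for j in (i - 1, i + 1) if 0 <= j < len(sequence)])
    for i in range(len(sequence))
]

# Irregular (non-Euclidean) data: a graph stored as an adjacency list.
# Node names and edges are made up for illustration.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}
graph_neighbor_counts = {node: len(nbrs) for node, nbrs in graph.items()}

print(sequence_neighbor_counts)  # interior positions all have 2 neighbors
print(graph_neighbor_counts)     # degrees differ from node to node
```

Because the number of neighbors is not fixed, a fixed-size convolution kernel cannot simply be slid over a graph the way it is over an image or a sequence.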

Let’s illustrate what a graph is with two typical business scenarios:

Social networks are well suited to being represented as graphs.

The graph above depicts the nodes in a social network and the relationships between them. Users A and B, as well as posts, are nodes. The relationship between users is following, while the relationship between a user and a post may be publishing or sharing. With such a graph, we can analyze what users are interested in and build a recommendation mechanism on top of it.

Graphs in e-commerce scenarios

In e-commerce, the key nodes that first come to mind are users, transactions, and products. Nodes associated with users may include registered addresses and shipping addresses; transactions may be linked to products, shipping addresses, transaction IPs, and so on; products may be linked to categories. There are also relationships between these nodes: a user can not only purchase a product through a transaction but also rate it. Such graph data can be used for two purposes: recommendation and anti-fraud.

From the above two examples, we can clearly see that graphs have two basic characteristics:

First, each node has its own feature information. In the e-commerce graph above, for example, a risk control rule might check whether the user’s registered address, IP address, and transaction shipping address are consistent. If these features do not match, the system judges that the user carries a certain fraud risk. This is an application of the feature information of graph nodes.
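
As a rough illustration of such a feature-based rule, the sketch below compares a few attributes attached to a transaction. The field names (registered_city, ip_city, shipping_city) and the sample records are hypothetical, and a real risk control rule would be more involved.

```python
def has_mismatch(transaction):
    """Return True when the location-related features disagree."""
    locations = {
        transaction["registered_city"],
        transaction["ip_city"],
        transaction["shipping_city"],
    }
    # More than one distinct value means at least one feature differs.
    return len(locations) > 1


consistent = {"registered_city": "Wuhan", "ip_city": "Wuhan", "shipping_city": "Wuhan"}
suspicious = {"registered_city": "Wuhan", "ip_city": "Beijing", "shipping_city": "Shanghai"}

print(has_mismatch(consistent))  # False
print(has_mismatch(suspicious))  # True
```

Note that this rule looks only at the attributes of a single record and never at how records are connected to each other.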

Second, each node in the graph also has structural information. If, within a certain period, a particular IP node connects to many transaction nodes, that is, many edges extend from that IP node, the risk control system judges that this IP address is risky. This is an application of the structural information of graph nodes.
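
A structural rule of this kind can be sketched by counting the edges that leave each IP node. The edge list and the threshold below are hypothetical; in practice the threshold would have to be tuned for the business.

```python
from collections import defaultdict

# (ip, transaction_id) edges observed within one time window.
# The values and the threshold are hypothetical.
edges = [
    ("10.0.0.1", "t1"),
    ("10.0.0.1", "t2"),
    ("10.0.0.1", "t3"),
    ("10.0.0.1", "t4"),
    ("10.0.0.2", "t5"),
]
MAX_TRANSACTIONS_PER_IP = 3

# The degree of an IP node is the number of distinct transactions it touches.
transactions_by_ip = defaultdict(set)
for ip, transaction_id in edges:
    transactions_by_ip[ip].add(transaction_id)

risky_ips = sorted(
    ip for ip, txs in transactions_by_ip.items()
    if len(txs) > MAX_TRANSACTIONS_PER_IP
)
print(risky_ips)  # ['10.0.0.1']
```

Here the decision depends only on how the node is connected, not on any attribute of the node itself.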

In summary, with graph data we need to consider both the feature information and the structural information of nodes. If we rely on hand-written rules to extract them, we will inevitably miss many hidden and complex patterns. So, is there a way to learn the feature information and the structural information of a graph automatically and at the same time? The answer is Graph Convolutional Networks.

What are Graph Convolutional Networks?

A Graph Convolutional Network (GCN) is a method for deep learning on graph data.

Graph Convolution Operator:

The formula for the graph convolution operator is given above, where the center node is denoted as i.
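
For reference, one widely used form of such an operator can be written as follows. The notation is an assumption: it follows the propagation rule described in the first of the recommended papers (Semi-Supervised Classification with Graph Convolutional Networks) and is not necessarily the exact formula the author used.

```latex
h_i^{(l+1)} = \sigma\left( \sum_{j \in \mathcal{N}(i)} \frac{1}{c_{ij}} \, h_j^{(l)} W^{(l)} \right)
```

Here $h_i^{(l)}$ is the feature vector of node $i$ at layer $l$, $\mathcal{N}(i)$ is the set of neighbors of $i$ (usually including $i$ itself), $c_{ij}$ is a normalization constant such as $\sqrt{d_i d_j}$ computed from the node degrees, $W^{(l)}$ is the weight matrix shared by all nodes, and $\sigma$ is a nonlinear activation function.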

How should we understand the graph convolution algorithm? Let’s break it down into three steps (note that different colors in the figures represent different weights):

Step 1: Each node transforms its feature information and sends it to its neighboring nodes. This step extracts and transforms the node’s feature information.

Step 2: Each node gathers feature information from its neighboring nodes. This step involves merging the local structural information of the node.

Step 3: Apply a nonlinear transformation to the gathered information to increase the model’s expressive power.
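
The three steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it uses a single weight matrix shared by all nodes (rather than different weights per relation), mean aggregation over the node and its neighbors, and ReLU as the nonlinearity. The tiny graph and all numbers are hypothetical.

```python
def gcn_layer(features, neighbors, weight):
    """One graph convolution step on plain Python lists.

    features:  {node: feature vector}
    neighbors: {node: list of neighboring nodes}
    weight:    matrix (list of rows) shared by every node
    """
    # Step 1: each node transforms its own features with the shared weights.
    messages = {
        node: [sum(x * w for x, w in zip(vec, col)) for col in zip(*weight)]
        for node, vec in features.items()
    }

    output = {}
    for node, nbrs in neighbors.items():
        # Step 2: gather the messages from the node itself and its neighbors
        # (mean aggregation is an assumption made for this sketch).
        group = [node] + list(nbrs)
        gathered = [
            sum(messages[n][k] for n in group) / len(group)
            for k in range(len(messages[node]))
        ]
        # Step 3: nonlinear transformation (ReLU).
        output[node] = [max(0.0, value) for value in gathered]
    return output


features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
weight = [[1.0, -1.0], [0.5, 2.0]]

output = gcn_layer(features, neighbors, weight)
print(output)
```

Stacking several such layers, each with its own weight matrix, gives a multi-layer network.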

Graph Convolutional Networks possess the following properties of convolutional neural networks:

1. Local parameter sharing: the same operator (represented by the circles) is applied at every node, so its parameters are shared everywhere.

2. The receptive field grows in proportion to the number of layers. After the first layer, each node contains information from its direct neighbors. When the second layer is computed, information from the neighbors’ neighbors is included as well. The more layers there are, the broader the receptive field and the more information takes part in the computation.
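
The growth of the receptive field can be illustrated with a small breadth-first expansion: after k layers, a node can only have received information from nodes within k hops. The chain graph below is hypothetical.

```python
def receptive_field(neighbors, node, num_layers):
    """Nodes whose features can reach `node` after `num_layers` layers."""
    reached = {node}
    frontier = {node}
    for _ in range(num_layers):
        # Each extra layer pulls in the neighbors of the newest nodes.
        frontier = {n for f in frontier for n in neighbors[f]} - reached
        reached |= frontier
    return reached


# A hypothetical chain graph: a - b - c - d - e
chain = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

for layers in (1, 2, 3):
    print(layers, sorted(receptive_field(chain, "a", layers)))
```

On this chain, each additional layer extends the receptive field of node a by exactly one more node.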

Let’s look at the GCN model framework. The input is a graph, which undergoes layer-by-layer computations and transformations, ultimately outputting another graph.

GCN models also exhibit three properties of deep learning:

1. Hierarchical structure (features are extracted layer by layer, with each layer being more abstract and advanced);

2. Nonlinear transformation (enhancing the model’s expressiveness);

3. End-to-end training (no rules need to be defined; we only label the nodes of the graph and let the model learn by itself, fusing feature information and structural information).

Four characteristics of GCN:

1. GCN is a natural extension of convolutional neural networks in the graph domain.

2. It can simultaneously learn node feature information and structural information in an end-to-end manner, making it the best choice for graph data learning tasks.

3. Graph convolution is widely applicable: it suits nodes and graphs of arbitrary topology.

4. In tasks like node classification and edge prediction, it significantly outperforms other methods on public datasets.

How to use Graph Convolutional Networks

Here, I will share an experiment from our practical application scenario:

The input of the experiment is a validation dataset represented as graphs, whose nodes are validation events and their related attribute nodes, such as IP, DeviceID, and UA. (We used 30 days of validation data in total, with each two-hour interval forming one graph, for 360 graphs in all.)

The output of the experiment is a human-machine classification of the event nodes, indicating whether each one is normal or abnormal.
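
The windowing described above can be sketched as follows. The event tuples and field layout are hypothetical; the only facts taken from the experiment are the 30-day span and the two-hour interval, which together give 30 * 24 / 2 = 360 graphs.

```python
from collections import defaultdict

WINDOW_SECONDS = 2 * 60 * 60  # one graph per two-hour interval
DAYS = 30


def window_index(timestamp, start):
    """Index of the two-hour window that a timestamp (in seconds) falls into."""
    return int((timestamp - start) // WINDOW_SECONDS)


# 30 days of two-hour windows gives 30 * 24 / 2 = 360 graphs.
num_graphs = DAYS * 24 * 60 * 60 // WINDOW_SECONDS

# Hypothetical events: (timestamp in seconds, event id, ip).
start = 0
events = [
    (30, "e1", "10.0.0.1"),
    (7100, "e2", "10.0.0.1"),
    (7300, "e3", "10.0.0.2"),
]

# Each window collects its own event-to-attribute edges.
edges_by_window = defaultdict(list)
for timestamp, event_id, ip in events:
    edges_by_window[window_index(timestamp, start)].append((event_id, ip))

print(num_graphs)  # 360
print(dict(edges_by_window))
```

Each entry of edges_by_window would then be turned into one graph, with event nodes connected to the attribute nodes they share.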

Experiment Details

Network Structure:

GCN(128)->GCN(64)->GCN(64)->Linear(2)
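
To show how data would flow through such a stack, here is a forward-pass sketch in plain Python. Several things are assumptions: the numbers in parentheses are read as output widths, the input width (16) and the four-node ring graph are hypothetical, the adjacency matrix is row-normalized with self-loops, and the weights are random and untrained. It illustrates shapes only, not the trained model from the experiment.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def gcn(adj_norm, h, weight):
    """Graph convolution: transform, aggregate over the graph, apply ReLU."""
    return [[max(0.0, v) for v in row] for row in matmul(adj_norm, matmul(h, weight))]


rng = random.Random(0)
num_nodes, in_dim = 4, 16            # hypothetical graph size and input width
layer_dims = [in_dim, 128, 64, 64]   # GCN(128) -> GCN(64) -> GCN(64)

# Row-normalized adjacency with self-loops for a hypothetical 4-node ring.
adj = [[1, 1, 0, 1], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]
adj_norm = [[v / sum(row) for v in row] for row in adj]

h = random_matrix(num_nodes, in_dim, rng)
for d_in, d_out in zip(layer_dims, layer_dims[1:]):
    h = gcn(adj_norm, h, random_matrix(d_in, d_out, rng))

logits = matmul(h, random_matrix(64, 2, rng))  # Linear(2), bias omitted
print(len(logits), len(logits[0]))             # 4 2
```

Each node ends up with two scores, one per class, which matches the normal/abnormal output of the experiment.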

Training: Adam optimizer, lr=0.001

Reference Benchmark: GBDT, which can only learn feature information, with hyperparameters tuned by grid search. GBDT is currently the most popular shallow classifier.

We trained on the data from the first day, and the results over the next 30 days are as follows:

The accuracy of the GCN model shows minimal decay, while the decay of GBDT is significant. Clearly, the GCN model performs better in human-machine discrimination and exhibits better robustness.

Visualization of the seventh-day evaluation: the model is trained on the first day’s data, and we observe its predictions on the seventh day together with a t-SNE visualization of the last layer’s output. The figure above shows that GCN still maintains a clear boundary between samples on the seventh day, while GBDT’s boundary has become quite blurred. In summary, thanks to the structural information it learns, GCN not only performs well in human-machine discrimination but also exhibits better robustness.

Final Thoughts

Due to time constraints, many issues were only superficially addressed; there is still much interesting content about GCN. We will launch a column titled “Graph Learning,” where authors will share more comprehensive graph learning algorithms with you.

We have always considered ourselves a technology-driven company and claim to be an AI company. Being an AI company is not as glamorous as it sounds; in reality, it can be quite challenging, as many technologies are still immature and not readily available. When conducting company business, we encounter various real-life problems, stumble upon many pitfalls, and of course, gain insights and experiences. We believe these are another form of value created by enterprises and should be well-utilized.

“You have one idea, I have one idea; after exchanging, we both have two ideas” — this is the significance of sharing. Therefore, we will also organize a series of valuable sharing columns later, aiming to share the knowledge summarized, learned, and created in practical applications with everyone. Of course, we also warmly welcome anyone to discuss, communicate, and improve together; this is the feedback we most look forward to from this endeavor.

Outline of the “Graph Learning” Column

Chapter 1 Graphs and Their Application Scenarios

Chapter 2 Graph Propagation Algorithms

Chapter 3 Community Detection and High-Density Subgraphs

Chapter 4 Heterogeneous Information Networks

Chapter 5 Graph Representation Learning

Chapter 6 Graph Convolutional Networks

In total, there will be six chapters, expected to span 25-30 articles. Interested friends are welcome to keep following us!~

Recommended Papers

Semi-Supervised Classification with Graph Convolutional Networks

https://openreview.net/pdf?id=SJU4ayYgl

Modeling Relational Data with Graph Convolutional Networks

https://arxiv.org/abs/1703.06103

Inductive Representation Learning on Large Graphs

https://arxiv.org/abs/1706.02216

Feel free to follow our technical column, and you can follow our WeChat public account and reply “papers” in the background to obtain three papers!~

You can also add the technical assistant’s WeChat “geetest1024” to discuss and progress together!
