ArmGAN: Adversarial Representation Learning for Network Embedding


Network embedding aims to learn low-dimensional representations of the nodes in a network, which can then be used for many downstream network analysis tasks. Recently, many network embedding methods based on Generative Adversarial Networks (GANs) have been proposed. However, GAN-based methods face two main challenges: (1) existing GAN-based methods often use GANs to impose a Gaussian prior on the network embeddings, which makes node representations hard to distinguish from Gaussian noise; (2) existing methods apply adversarial learning only to the representation results, not to the representation mechanism itself. As a result, they do not fully exploit the advantages of GANs, which degrades performance on network analysis tasks. We propose a new adversarial-learning-based network embedding method (ArmGAN) that truly applies the adversarial strategy to the representation mechanism. ArmGAN consists of three components. The first two, the Encoder and the Competitor, each learn a representation mechanism (i.e., a way of projecting data into the latent space), and the two compete with each other to improve their mechanisms. The third component, the Discriminator, aims to distinguish the Encoder's representation mechanism from the Competitor's. In addition, we design a perturbation strategy that generates fake networks from the original network, yielding a competitive "fake" representation mechanism.

By reading this article, you can learn about:

  • How can the generative adversarial mechanism be integrated more effectively with existing graph representation learning methods?

  • How should the adversarial strategy be designed so that the representation learning model approaches an optimal solution?

This article also leaves several directions for further exploration. Our current work explores the paradigm of combining graph representation learning with GANs by using random perturbation to generate data, rather than generating it with a model; and our adversarial mechanism operates at the node level rather than the graph level. We hope to discuss these issues in future work.


Paper Address:

https://ieeexplore.ieee.org/document/9512395

Open Source Address:

https://github.com/wt-tju/ArmGAN/

Research Background

Network data, i.e., graph-structured data, performs well in a growing number of scenarios because the "edges" it adds on top of Euclidean data better model the relationships that are ubiquitous in the real world. However, network data is high-dimensional and sparse, and its elements are topologically coupled, which makes it difficult to process.

Generative Adversarial Networks (GANs) have become powerful deep generative models. Their inspiration comes from two-player games in game theory: the two players are the generator G, which produces data resembling real data, and the discriminator D, which distinguishes real data from generated data. In other words, the generator tries to "fool" the discriminator by generating data as close to real data as possible, while the discriminator tries to "expose" the generator by telling real and generated data apart. Although GANs were originally proposed for image generation, they have recently been extended to network embedding. For example, ARVGA uses adversarial learning to regularize the embeddings of an autoencoder, forcing them to match a Gaussian distribution; this framework was later extended so that the autoencoder reconstructs both the topology and the node attributes, rather than only the network structure as in ARVGA. ANE proposed an inductive variant of DeepWalk to preserve structural properties of the network in the latent space, and uses adversarial learning to match the latent representations to a given prior (such as a uniform or Gaussian distribution). VANE proposed a multi-view adversarial framework built on two adversarial games: the first improves the comprehensiveness of node representations by distinguishing information from different views, and the second improves their robustness by fitting the distribution of node representations to a given noise distribution. GANE performs feature representation learning and link prediction simultaneously by using a GAN to regularize vertex pairs, forcing generated vertex pairs to resemble real ones. Overall, most existing adversarial network embedding methods can be summarized as adversarially forcing the embedding results to follow a given or latent distribution.
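To make the two-player objective above concrete, here is a minimal NumPy sketch of the standard GAN losses. The non-saturating generator loss is a common choice in practice, not something specific to the methods surveyed here, and the scores are toy values:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def discriminator_loss(d_real, d_fake):
    # D maximizes E[log D(x)] + E[log(1 - D(G(z)))]; we minimize the negative.
    return -(np.log(d_real).mean() + np.log(1.0 - d_fake).mean())

def generator_loss(d_fake):
    # Non-saturating form: G maximizes E[log D(G(z))].
    return -np.log(d_fake).mean()

# Toy discriminator outputs: confident on real samples, unsure on fakes.
d_real = sigmoid(np.array([2.0, 3.0]))
d_fake = sigmoid(np.array([0.0, -1.0]))
print(discriminator_loss(d_real, d_fake))
print(generator_loss(d_fake))
```

At the game's equilibrium the discriminator outputs 0.5 everywhere, at which point both losses equal their well-known constant values (2·log 2 and log 2).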

It is worth noting that these existing methods apply adversarial learning to the representation results, typically by matching the distribution of the representations to an arbitrary prior, most often a Gaussian. This strategy makes the representations hard to distinguish from Gaussian noise, since requiring them to follow a Gaussian distribution is roughly equivalent to adding a Gaussian regularization term to the representations. While reasonable to some extent, it does not fully exploit the inherent advantages of adversarial learning.

ArmGAN: Adversarial Representation Learning Method for Graph Representation

The framework proposed in this paper mainly consists of three parts (as shown in the figure below). First, we utilize a graph autoencoder with mutual information regularization as the first component (Encoder, at the top of the figure) to learn the true representation mechanism, capable of capturing both node attributes and network topology information simultaneously. Second, to produce a more competitive fake representation mechanism, we also use a graph autoencoder with mutual information regularization as the second component (Competitor, also known as Negative Sample Generator), where the input to the Competitor is the network after perturbation of node attributes or topology. The third component, the Discriminator (in the middle of the figure), aims to correctly distinguish between the true representation mechanism and the fake representation mechanism. The representation mechanism can be viewed as a mapping from the input space to the latent space, so we use the combination of input features X and output representations Z to represent a representation mechanism. Through adversarial learning strategies, the model can learn effective representations with greater expressive power.

(Figure: the overall ArmGAN framework, with the Encoder at the top, the Discriminator in the middle, and the Competitor below)

1. Graph Autoencoder with Mutual Information Regularization

This can be viewed as a classical graph autoencoder with the introduction of mutual information regularization constraints, which can more effectively combine node features and network topology.

  • Classical Autoencoder GAE

The encoding process utilizes two layers of Graph Convolutional Networks:

Z = Â · ReLU(Â X W0) · W1

Where:

Â = D̃^(-1/2) (A + I) D̃^(-1/2), where A is the adjacency matrix, D̃ is the degree matrix of A + I, X is the node attribute matrix, and W0, W1 are trainable weight matrices.

The decoding phase reconstructs the network topology, with the reconstruction loss defined as follows:

L_GAE = -Σ_{i,j} [ A_ij · log (A_rec)_ij + (1 - A_ij) · log(1 - (A_rec)_ij) ], where A_rec is the reconstructed adjacency matrix.

The reconstructed adjacency matrix can be represented as:

A_rec = σ(Z Zᵀ), i.e. (A_rec)_ij = σ(z_iᵀ z_j).
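As a concrete (unofficial) sketch, the encoder and decoder above can be written in a few lines of NumPy; the weight shapes and the tiny three-node graph are illustrative only:

```python
import numpy as np

def normalize_adj(A):
    # Renormalized adjacency: A_hat = D~^(-1/2) (A + I) D~^(-1/2)
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_encode(X, A, W0, W1):
    # Two-layer GCN encoder: Z = A_hat @ ReLU(A_hat @ X @ W0) @ W1
    A_hat = normalize_adj(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)
    return A_hat @ H @ W1

def gae_decode(Z):
    # Inner-product decoder: A_rec = sigmoid(Z Z^T)
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Tiny 3-node path graph with random 4-dimensional attributes.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = rng.normal(size=(3, 4))
W0 = rng.normal(size=(4, 8))
W1 = rng.normal(size=(8, 2))
Z = gae_encode(X, A, W0, W1)   # node embeddings, shape (3, 2)
A_rec = gae_decode(Z)          # reconstructed adjacency, shape (3, 3)
```

The reconstruction is symmetric with entries in (0, 1), so it can be compared entry-wise against A with binary cross-entropy.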
  • Mutual Information Regularization

Mutual information measures the dependency between two variables. In recent years, neural estimators of the mutual information between two random variables have been proposed, with wide applications in information theory and image processing. Therefore, to ensure that the learned representation vectors fully reflect node features, capturing in the low-dimensional space the characteristics that best represent the original node attributes, we add a mutual information regularization constraint on top of the autoencoder. The mutual information between node attributes and representations is estimated as follows:

(Equation image: mutual information estimate between node attributes and representations)
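The paper's exact estimator is not reproduced here; the sketch below uses a common Deep-InfoMax-style objective with a bilinear critic as an assumed stand-in. Matched (attribute, representation) rows serve as positive pairs and shuffled rows as negatives:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mi_regularizer(X, Z, W):
    # BCE-style neural MI objective with a bilinear critic s(x, z) = x^T W z.
    # Matched (x_i, z_i) rows are positive pairs; pairing x_i with another
    # node's representation gives negatives. Minimizing this loss trains the
    # critic to separate the two, yielding a lower bound on MI(X; Z).
    pos = np.einsum('id,de,ie->i', X, W, Z)                      # matched pairs
    neg = np.einsum('id,de,ie->i', X, W, np.roll(Z, 1, axis=0))  # shuffled pairs
    return -(np.log(sigmoid(pos)) + np.log(1.0 - sigmoid(neg))).mean()

# Representations aligned with the attributes score a lower (better) loss
# than anti-aligned ones.
X_demo = np.eye(3)
print(mi_regularizer(X_demo, 5 * np.eye(3), np.eye(3))
      < mi_regularizer(X_demo, -5 * np.eye(3), np.eye(3)))  # True
```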

2. Negative Sample Generator

To produce a more competitive fake representation mechanism, we also use a graph autoencoder with mutual information regularization as the Competitor but perturb the original network. We adopt three types of perturbations: perturbing attributes, flipping topology, and perturbing both attributes and topology. The reconstruction loss and mutual information regularization for the negative sample generator are defined as follows:

(Equation images: the negative sample generator's reconstruction loss and mutual information regularization)
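The three perturbation types can be sketched as follows, assuming simple uniform corruption; the paper's exact perturbation recipe and ratios may differ:

```python
import numpy as np

def perturb_attributes(X, ratio, rng):
    # Overwrite a random fraction of attribute entries with values drawn
    # from the attribute matrix itself (one plausible corruption scheme).
    X_fake = X.copy()
    mask = rng.random(X.shape) < ratio
    X_fake[mask] = rng.choice(X.ravel(), size=mask.sum())
    return X_fake

def flip_topology(A, ratio, rng):
    # Flip a random fraction of undirected node pairs: existing edges are
    # removed, missing edges are added; symmetry is preserved.
    A_fake = A.copy()
    i, j = np.triu_indices(A.shape[0], k=1)
    flip = rng.random(i.size) < ratio
    A_fake[i[flip], j[flip]] = 1 - A_fake[i[flip], j[flip]]
    A_fake[j[flip], i[flip]] = A_fake[i[flip], j[flip]]
    return A_fake

rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = rng.normal(size=(3, 4))
X_fake = perturb_attributes(X, 0.3, rng)
A_fake = flip_topology(A, 0.3, rng)
# The third type, perturbing both, simply applies the two functions together.
```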

3. Representation Mechanism Discriminator

The core of our model is adversarial learning over the representation mechanism rather than over the representation results. As mentioned earlier, our new framework ArmGAN involves two representation mechanisms: the positive mechanism produced by the autoencoder with mutual information regularization, and the negative mechanism produced by the negative sample generator. The challenge lies in converting these two mechanisms into inputs the discriminator can recognize. A representation mechanism can be seen as a mapping from the input to the low-dimensional representation, and a mapping can be represented by combining its input and output. We therefore use the combination of node attributes (the mapped input) and low-dimensional representations (the mapped output) as the discriminator's input. We represent the negative representation mechanism in two ways.
  • Direct Mapping Representation Mechanism

We directly use the combination of perturbed attributes and the representations learned by the negative sample generator to represent the negative representation mechanism. We use the combination of original attributes and representations generated by the encoder to represent the positive representation mechanism. In this case, the objective function for the discriminator can be represented as:

(Equation image: discriminator objective for the direct-mapping variant)
  • Mutual Information Representation Mechanism

We use the combination of original attributes and representations learned by the negative sample generator to represent the negative representation mechanism. In this case, the objective function for the discriminator can be represented as:

(Equation image: discriminator objective for the mutual-information variant)
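A short sketch of how the two variants build discriminator inputs. The per-node concatenation of attributes and representations follows the text above; the logistic discriminator is an assumed minimal stand-in for the paper's network:

```python
import numpy as np

def mechanism_input(attrs, Z):
    # A representation mechanism (input -> latent mapping) is encoded as the
    # per-node concatenation of the mapped input and the mapped output.
    return np.concatenate([attrs, Z], axis=1)

def discriminator(pair, w, b):
    # Assumed minimal logistic discriminator over (attributes, embedding) rows.
    return 1.0 / (1.0 + np.exp(-(pair @ w + b)))

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))        # original attributes
X_pert = rng.normal(size=(3, 4))   # perturbed attributes
Z_enc = rng.normal(size=(3, 2))    # Encoder embeddings
Z_comp = rng.normal(size=(3, 2))   # Competitor embeddings

real = mechanism_input(X, Z_enc)            # positive mechanism
fake_direct = mechanism_input(X_pert, Z_comp)  # direct-mapping negative
fake_mi = mechanism_input(X, Z_comp)           # mutual-information negative

w, b = rng.normal(size=6), 0.0
scores = discriminator(real, w, b)  # one probability per node, in (0, 1)
```

The two variants differ only in which attributes are paired with the Competitor's embeddings: the perturbed attributes in the first case, the original attributes in the second.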

4. Model Learning

The overall model learning process can be divided into a generation process and a discrimination process, where the objective of the generation process is as follows:

(Equation image: objective of the generation process)

The objective of the discrimination process is as follows:

(Equation image: objective of the discrimination process)

The algorithm pseudocode is as follows:

(Image: pseudocode of the ArmGAN training algorithm)
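The alternating schedule of the two processes can be sketched as below; the actual gradient updates are elided and the number of discriminator steps per generation step is an illustrative choice, not taken from the paper:

```python
# Structure of the alternating optimization: per epoch, run a generation
# step (Encoder + Competitor), then several discrimination steps.
log = []

def generation_step():
    # Would update the Encoder and Competitor against reconstruction,
    # mutual information, and adversarial losses.
    log.append('G')

def discrimination_step():
    # Would update the Discriminator to separate the two mechanisms.
    log.append('D')

EPOCHS, D_STEPS_PER_G = 3, 2
for _ in range(EPOCHS):
    generation_step()
    for _ in range(D_STEPS_PER_G):
        discrimination_step()

schedule = ''.join(log)
print(schedule)  # GDDGDDGDD
```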

Experimental Analysis

We conducted experiments on seven real graph datasets. The statistical information of the datasets is as follows:

(Table image: statistics of the seven datasets)
  • Node Classification Results

(Table image: node classification results)
  • Node Clustering Results

(Table image: node clustering results)
  • Link Prediction Results

(Table image: link prediction results)

Conclusion

In this paper, we propose a new generative adversarial network representation framework ArmGAN, which truly applies adversarial learning strategies to learn representation mechanisms. The new generative adversarial framework includes three roles: an autoencoder with mutual information regularization, a negative sample generator, and a discriminator. The method has been evaluated on seven real datasets for different network analysis tasks. Experimental results show that the method significantly outperforms other representative network representation methods.
References

  1. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
  2. S. Pan, R. Hu, G. Long, J. Jiang, L. Yao, and C. Zhang, Adversarially regularized graph autoencoder for graph embedding, in Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018, pp. 2609–2615.
  3. T. N. Kipf and M. Welling, Variational graph auto-encoders, in NIPS Workshop on Bayesian Deep Learning, 2016.
  4. S. Pan, R. Hu, S. Fung, G. Long, J. Jiang, and C. Zhang, Learning graph embedding with adversarial training methods, IEEE Transactions on Cybernetics, vol. 50, no. 6, pp. 2475–2487, 2019.
  5. Q. Dai, Q. Li, J. Tang, and D. Wang, Adversarial network embedding, in Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 2018, pp. 2167–2174.
  6. B. Perozzi, R. Al-Rfou, and S. Skiena, DeepWalk: Online learning of social representations, in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 701–710.
  7. D. Fu, Z. Xu, B. Li, H. Tong, and J. He, A view-adversarial framework for multi-view network embedding, in Proceedings of the 29th ACM International Conference on Information and Knowledge Management, 2020, pp. 2025–2028.
  8. H. Hong, X. Li, and M. Wang, GANE: A generative adversarial network embedding, IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 7, pp. 2325–2335, 2019.
