First, let's start from the definition of a basic GCN layer:
$f\left(H^{(l)}, A\right)=\sigma\left(A H^{(l)} W^{(l)}\right)$
In this formula, the product $A H^{(l)}$ means that every node in the graph aggregates the features of its neighbours.
Where
$W^{(l)}$ is the weight matrix of the $l$-th neural network layer;
$\sigma(\cdot)$ is a non-linear activation function such as ReLU.
Looking closely, you can see a problem: $A H^{(l)}$ does not take the node itself into account, because the diagonal of $A$ is zero (no self-loops).
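To make this concrete, here is a minimal numpy sketch of the naive rule $\sigma(A H^{(l)} W^{(l)})$ (the graph, features, and weights are made-up toy values for illustration, not from the original):

```python
import numpy as np

# Toy 3-node path graph: adjacency WITHOUT self-loops (hypothetical data)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)  # H^(l): N=3 nodes, D=4 features per node
W = np.random.randn(4, 2)  # W^(l): maps 4 input features to 2 outputs

def relu(x):
    return np.maximum(x, 0)

# f(H, A) = sigma(A H W): row i of A @ H sums the features of i's
# neighbours, but NOT node i's own features, since A_ii = 0
H_next = relu(A @ H @ W)
```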
Improvement 1: Add self-connections
To fix this, we replace $A$ with $\tilde{A}=A+I_{N}$, i.e. we add a self-connection (self-loop) to every node.
The corresponding degree matrix is $\tilde{D}_{ii}=\sum\limits_{j} \tilde{A}_{i j}$.
Summary: we add self-loops because a node's own features are sometimes just as important as its neighbours'.
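Continuing the toy numpy sketch above, adding self-loops is a one-liner:

```python
# A_tilde = A + I_N: every node becomes its own neighbour
A_tilde = A + np.eye(A.shape[0])

# Degree matrix of A_tilde: D_tilde_ii = sum_j A_tilde_ij
D_tilde = np.diag(A_tilde.sum(axis=1))
```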
Improvement 2: Normalize the adjacency matrix
Different nodes have different numbers of neighbours and different edge weights. If a node has many neighbours, then after aggregating their information its feature values become much larger than those of a node with few neighbours, so we need to normalize. $\tilde{A}$ becomes:
$\tilde{D}^{-1} \tilde{A}$
Summary: to keep high-degree (or heavily weighted) nodes from producing overly large feature values that distort message passing, we use normalization to remove this effect.
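In code, this row normalization turns the neighbour sum into a neighbour mean (a sketch, reusing `A_tilde` from above):

```python
# Row normalization: D_tilde^{-1} A_tilde -- each row now sums to 1,
# so aggregation averages over {node itself} + {neighbours}
D_inv = np.diag(1.0 / A_tilde.sum(axis=1))
A_row_norm = D_inv @ A_tilde
```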
Improvement 3: Symmetric normalization
The normalization above only accounts for the degree of the aggregating node $i$; it ignores the situation of the neighbour $j$, i.e. the information that $j$ propagates is never normalized. (Here we assume each node sends out the same total amount of information through its edges, so the more edges a node has, the less information each edge carries, like an even split.) (Understanding this point requires the concepts of left and right matrix multiplication; see 《矩阵的左乘和右乘》.)
$\tilde{A}$ then becomes:
$\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$
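A sketch of the symmetric version (again reusing `A_tilde`):

```python
# Symmetric normalization: entry (i, j) becomes A_tilde_ij / sqrt(d_i * d_j),
# so both the receiver's and the sender's degrees are accounted for
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_sym = np.diag(d_inv_sqrt) @ A_tilde @ np.diag(d_inv_sqrt)
```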
In this article, we use the following layer-wise propagation rule:
$H^{(l+1)}=\sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$
Where
$\tilde{A}=A+I_{N}$;
$\tilde{D}_{i i}=\sum\limits _{j} \tilde{A}_{i j}$;
$\sigma(\cdot)$ is an activation function;
$H^{(l)} \in \mathbb{R}^{N \times D}$ is the matrix of activations in the $l$-th layer, and $H^{(0)}=X$.
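Putting it all together, one layer of this propagation rule can be sketched as follows (`gcn_layer` is our own name, not from the original; it reuses `relu`, `A`, `H`, `W` from the earlier sketch):

```python
def gcn_layer(A, H, W, activation=relu):
    """One GCN step: sigma(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    # Scale rows by d_i^{-1/2} and columns by d_j^{-1/2}
    A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return activation(A_hat @ H @ W)

H1 = gcn_layer(A, H, W)  # H^(1), with H^(0) = X = H
```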
Borrowing the example from 《拉普拉斯矩阵》 (the Laplacian-matrix article):
$\tilde{A}=\left\{\begin{array}{cccccc}1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1\end{array}\right\} \quad \tilde{D}=\left\{\begin{array}{cccccc}3 & 0 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 2\end{array}\right\} \quad \tilde{D}^{-\frac{1}{2}}=\left\{\begin{array}{cccccc}\frac{1}{\sqrt{3}} & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\sqrt{4}} & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\sqrt{3}} & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\sqrt{2}}\end{array}\right\}$
$\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}=\left\{\begin{array}{cccccc}\frac{1}{\sqrt{3}} & 0 & 0 & 0 & 0 & 0 \\0 & \frac{1}{\sqrt{4}} & 0 & 0 & 0 & 0 \\0 & 0 & \frac{1}{\sqrt{3}} & 0 & 0 & 0 \\0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 & 0 \\0 & 0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 \\0 & 0 & 0 & 0 & 0 & \frac{1}{\sqrt{2}}\end{array}\right\}\left\{\begin{array}{cccccc}1 & 1 & 0 & 0 & 1 & 0 \\1 & 1 & 1 & 0 & 1 & 0 \\0 & 1 & 1 & 1 & 0 & 0 \\0 & 0 & 1 & 1 & 1 & 1 \\1 & 1 & 0 & 1 & 1 & 0 \\0 & 0 & 0 & 1 & 0 & 1\end{array}\right\}\left\{\begin{array}{cccccc}\frac{1}{\sqrt{3}} & 0 & 0 & 0 & 0 & 0 \\0 & \frac{1}{\sqrt{4}} & 0 & 0 & 0 & 0 \\0 & 0 & \frac{1}{\sqrt{3}} & 0 & 0 & 0 \\0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 & 0 \\0 & 0 & 0 & 0 & \frac{1}{\sqrt{4}} & 0 \\0 & 0 & 0 & 0 & 0 & \frac{1}{\sqrt{2}}\end{array}\right\}$
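We can verify this product numerically (a quick numpy check of the worked example above):

```python
import numpy as np

A_tilde = np.array([[1, 1, 0, 0, 1, 0],
                    [1, 1, 1, 0, 1, 0],
                    [0, 1, 1, 1, 0, 0],
                    [0, 0, 1, 1, 1, 1],
                    [1, 1, 0, 1, 1, 0],
                    [0, 0, 0, 1, 0, 1]], dtype=float)

d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # diagonal of D_tilde^{-1/2}
A_hat = np.diag(d_inv_sqrt) @ A_tilde @ np.diag(d_inv_sqrt)

print(np.round(A_hat, 3))
# e.g. entry (0, 1) = 1 / sqrt(3 * 4) ~= 0.289, and A_hat stays symmetric
```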