{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Graph Convolutional Prediction of Protein Interactions in Yeast"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*(This demo is a part of [Deep Learning for Network Biology](http://snap.stanford.edu/deepnetbio-ismb/) tutorial.)*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example, we demonstrate the utility of deep learning methods for an important prediction problem on biological graphs. In particular, we consider the problem of predicting [protein-protein interactions](https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction) (PPIs). \n",
    "\n",
    "Protein-protein interactions (PPIs) are essential to almost every process in a cell. Understanding PPIs is crucial for understanding cell physiology in normal and disease states. Furthermore, knowledge of PPIs can be used:\n",
    "* for drug development, since drugs can affect PPIs,\n",
    "* to assign roles (i.e., protein functions) to uncharacterized proteins,\n",
    "* to characterize the relationships between proteins that form multi-molecular complexes, such as the proteasome.\n",
    "\n",
    "We represent the totality of PPIs that happen in a cell, an organism or a specific biological context with a [protein-protein interaction network](https://www.nature.com/subjects/protein-protein-interaction-networks). These networks are mathematical representations of all physical contacts between proteins in the cell.\n",
    "\n",
    "The development of large-scale PPI screening techniques, especially [high-throughput affinity purification combined with mass-spectrometry](https://doi.org/10.1016/j.tibtech.2016.02.014) and the [yeast two-hybrid assay](https://en.wikipedia.org/wiki/Two-hybrid_screening), has caused an explosion in the amount of PPI data and the construction of ever more complex and complete interaction networks. For example, the figure below is a graphical representation of three different types of protein-protein interaction networks in [yeast *S. cerevisiae*](https://en.wikipedia.org/wiki/Saccharomyces_cerevisiae). The structure of the binary interaction network is obviously different from the structure of the co-complex interaction network. The network structure of the literature-curated dataset resembles that of the co-complex dataset, even though the literature-curated datasets are reported to contain mostly binary interactions.\n",
    "\n",
    "However, current knowledge of protein-protein interaction networks is both [*incomplete* and *noisy*](https://doi.org/10.1016/j.cell.2014.10.050), as PPI screening techniques are limited in how many true interactions they can detect. Furthermore, PPI screening techniques often have high false positive and negative rates. These limitations present a great opportunity for computational methods to predict protein-protein interactions.  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "<img src=\"yeast_interactome.png\" alt=\"Yeast protein-protein interaction networks\" style=\"width: 800px\"/>\n",
    "</center>\n",
    "\n",
    "Image taken from: http://interactome.dfci.harvard.edu/S_cerevisiae"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this demo, we take the yeast protein-protein interaction network ([download the preprocessed network](http://snap.stanford.edu/deepnetbio-ismb/ipynb/yeast.edgelist)) and use the network to build a model for predicting new protein-protein interactions. We formulate this prediction task as **a link prediction problem on unweighted and undirected networks** and use a graph convolutional neural network to solve the task.\n",
    "\n",
    "Note that link prediction is a standard prediction task in network biology, and thus this example can be easily extended and adapted to many other biomedical applications. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Graph Convolutional Neural Network (GCN)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In what follows, we give a complete [Tensorflow](https://www.tensorflow.org/) implementation of a two-layer graph convolutional neural network (GCN) for link prediction. We closely follow the GCN formulation as presented in [Kipf et al., ICLR 2017](https://arxiv.org/pdf/1609.02907.pdf). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from __future__ import division\n",
    "from __future__ import print_function\n",
    "\n",
    "import time\n",
    "\n",
    "import numpy as np\n",
    "import scipy.sparse as sp\n",
    "\n",
    "import networkx as nx\n",
    "import tensorflow as tf\n",
    "from sklearn.metrics import roc_auc_score\n",
    "from sklearn.metrics import average_precision_score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Set random seed\n",
    "seed = 123\n",
    "np.random.seed(seed)\n",
    "tf.set_random_seed(seed)\n",
    "\n",
    "# Settings\n",
    "flags = tf.app.flags\n",
    "FLAGS = flags.FLAGS\n",
    "flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')\n",
    "flags.DEFINE_integer('epochs', 20, 'Number of epochs to train.')\n",
    "flags.DEFINE_integer('hidden1', 32, 'Number of units in hidden layer 1.')\n",
    "flags.DEFINE_integer('hidden2', 16, 'Number of units in hidden layer 2.')\n",
    "flags.DEFINE_float('dropout', 0.1, 'Dropout rate (1 - keep probability).')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Various Utility Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def load_data():\n",
    "    g = nx.read_edgelist('yeast.edgelist')\n",
    "    adj = nx.adjacency_matrix(g)\n",
    "    return adj\n",
    "\n",
    "\n",
    "def weight_variable_glorot(input_dim, output_dim, name=\"\"):\n",
    "    init_range = np.sqrt(6.0 / (input_dim + output_dim))\n",
    "    initial = tf.random_uniform(\n",
    "        [input_dim, output_dim], minval=-init_range,\n",
    "        maxval=init_range, dtype=tf.float32)\n",
    "    return tf.Variable(initial, name=name)\n",
    "\n",
    "\n",
    "def dropout_sparse(x, keep_prob, num_nonzero_elems):\n",
    "    noise_shape = [num_nonzero_elems]\n",
    "    random_tensor = keep_prob\n",
    "    random_tensor += tf.random_uniform(noise_shape)\n",
    "    dropout_mask = tf.cast(tf.floor(random_tensor), dtype=tf.bool)\n",
    "    pre_out = tf.sparse_retain(x, dropout_mask)\n",
    "    return pre_out * (1. / keep_prob)\n",
    "\n",
    "\n",
    "def sparse_to_tuple(sparse_mx):\n",
    "    if not sp.isspmatrix_coo(sparse_mx):\n",
    "        sparse_mx = sparse_mx.tocoo()\n",
    "    coords = np.vstack((sparse_mx.row, sparse_mx.col)).transpose()\n",
    "    values = sparse_mx.data\n",
    "    shape = sparse_mx.shape\n",
    "    return coords, values, shape\n",
    "\n",
    "\n",
    "def preprocess_graph(adj):\n",
    "    adj = sp.coo_matrix(adj)\n",
    "    adj_ = adj + sp.eye(adj.shape[0])\n",
    "    rowsum = np.array(adj_.sum(1))\n",
    "    degree_mat_inv_sqrt = sp.diags(np.power(rowsum, -0.5).flatten())\n",
    "    adj_normalized = adj_.dot(degree_mat_inv_sqrt).transpose().dot(degree_mat_inv_sqrt).tocoo()\n",
    "    return sparse_to_tuple(adj_normalized)\n",
    "\n",
    "\n",
    "def construct_feed_dict(adj_normalized, adj, features, placeholders):\n",
    "    feed_dict = dict()\n",
    "    feed_dict.update({placeholders['features']: features})\n",
    "    feed_dict.update({placeholders['adj']: adj_normalized})\n",
    "    feed_dict.update({placeholders['adj_orig']: adj})\n",
    "    return feed_dict\n",
    "\n",
    "\n",
    "def mask_test_edges(adj):\n",
    "    # Function to build test set with 2% positive links\n",
    "    # Remove diagonal elements\n",
    "    adj = adj - sp.dia_matrix((adj.diagonal()[np.newaxis, :], [0]), shape=adj.shape)\n",
    "    adj.eliminate_zeros()\n",
    "\n",
    "    adj_triu = sp.triu(adj)\n",
    "    adj_tuple = sparse_to_tuple(adj_triu)\n",
    "    edges = adj_tuple[0]\n",
    "    edges_all = sparse_to_tuple(adj)[0]\n",
    "    num_test = int(np.floor(edges.shape[0] / 50.))\n",
    "    num_val = int(np.floor(edges.shape[0] / 50.))\n",
    "\n",
    "    all_edge_idx = range(edges.shape[0])\n",
    "    np.random.shuffle(all_edge_idx)\n",
    "    val_edge_idx = all_edge_idx[:num_val]\n",
    "    test_edge_idx = all_edge_idx[num_val:(num_val + num_test)]\n",
    "    test_edges = edges[test_edge_idx]\n",
    "    val_edges = edges[val_edge_idx]\n",
    "    train_edges = np.delete(edges, np.hstack([test_edge_idx, val_edge_idx]), axis=0)\n",
    "\n",
    "    def ismember(a, b):\n",
    "        rows_close = np.all((a - b[:, None]) == 0, axis=-1)\n",
    "        return np.any(rows_close)\n",
    "\n",
    "    test_edges_false = []\n",
    "    while len(test_edges_false) < len(test_edges):\n",
    "        n_rnd = len(test_edges) - len(test_edges_false)\n",
    "        rnd = np.random.randint(0, adj.shape[0], size=2 * n_rnd)\n",
    "        idxs_i = rnd[:n_rnd]                                        \n",
    "        idxs_j = rnd[n_rnd:]\n",
    "        for i in range(n_rnd):\n",
    "            idx_i = idxs_i[i]\n",
    "            idx_j = idxs_j[i]\n",
    "            if idx_i == idx_j:\n",
    "                continue\n",
    "            if ismember([idx_i, idx_j], edges_all):\n",
    "                continue\n",
    "            if test_edges_false:\n",
    "                if ismember([idx_j, idx_i], np.array(test_edges_false)):\n",
    "                    continue\n",
    "                if ismember([idx_i, idx_j], np.array(test_edges_false)):\n",
    "                    continue\n",
    "            test_edges_false.append([idx_i, idx_j])\n",
    "\n",
    "    val_edges_false = []\n",
    "    while len(val_edges_false) < len(val_edges):\n",
    "        n_rnd = len(val_edges) - len(val_edges_false)\n",
    "        rnd = np.random.randint(0, adj.shape[0], size=2 * n_rnd)\n",
    "        idxs_i = rnd[:n_rnd]                                        \n",
    "        idxs_j = rnd[n_rnd:]\n",
    "        for i in range(n_rnd):\n",
    "            idx_i = idxs_i[i]\n",
    "            idx_j = idxs_j[i]\n",
    "            if idx_i == idx_j:\n",
    "                continue\n",
    "            if ismember([idx_i, idx_j], train_edges):\n",
    "                continue\n",
    "            if ismember([idx_j, idx_i], train_edges):\n",
    "                continue\n",
    "            if ismember([idx_i, idx_j], val_edges):\n",
    "                continue\n",
    "            if ismember([idx_j, idx_i], val_edges):\n",
    "                continue\n",
    "            if val_edges_false:\n",
    "                if ismember([idx_j, idx_i], np.array(val_edges_false)):\n",
    "                    continue\n",
    "                if ismember([idx_i, idx_j], np.array(val_edges_false)):\n",
    "                    continue\n",
    "            val_edges_false.append([idx_i, idx_j])\n",
    "\n",
    "    # Re-build adj matrix\n",
    "    data = np.ones(train_edges.shape[0])\n",
    "    adj_train = sp.csr_matrix((data, (train_edges[:, 0], train_edges[:, 1])), shape=adj.shape)\n",
    "    adj_train = adj_train + adj_train.T\n",
    "\n",
    "    return adj_train, train_edges, val_edges, val_edges_false, test_edges, test_edges_false\n",
    "\n",
    "\n",
    "def get_roc_score(edges_pos, edges_neg):\n",
    "    feed_dict.update({placeholders['dropout']: 0})\n",
    "    emb = sess.run(model.embeddings, feed_dict=feed_dict)\n",
    "\n",
    "    def sigmoid(x):\n",
    "        return 1 / (1 + np.exp(-x))\n",
    "\n",
    "    # Predict on test set of edges\n",
    "    adj_rec = np.dot(emb, emb.T)\n",
    "    preds = []\n",
    "    pos = []\n",
    "    for e in edges_pos:\n",
    "        preds.append(sigmoid(adj_rec[e[0], e[1]]))\n",
    "        pos.append(adj_orig[e[0], e[1]])\n",
    "\n",
    "    preds_neg = []\n",
    "    neg = []\n",
    "    for e in edges_neg:\n",
    "        preds_neg.append(sigmoid(adj_rec[e[0], e[1]]))\n",
    "        neg.append(adj_orig[e[0], e[1]])\n",
    "\n",
    "    preds_all = np.hstack([preds, preds_neg])\n",
    "    labels_all = np.hstack([np.ones(len(preds)), np.zeros(len(preds))])\n",
    "    roc_score = roc_auc_score(labels_all, preds_all)\n",
    "    ap_score = average_precision_score(labels_all, preds_all)\n",
    "\n",
    "    return roc_score, ap_score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define Convolutional Layers for our GCN Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class GraphConvolution():\n",
    "    \"\"\"Basic graph convolution layer for undirected graph without edge labels.\"\"\"\n",
    "    def __init__(self, input_dim, output_dim, adj, name, dropout=0., act=tf.nn.relu):\n",
    "        self.name = name\n",
    "        self.vars = {}\n",
    "        self.issparse = False\n",
    "        with tf.variable_scope(self.name + '_vars'):\n",
    "            self.vars['weights'] = weight_variable_glorot(input_dim, output_dim, name='weights')\n",
    "        self.dropout = dropout\n",
    "        self.adj = adj\n",
    "        self.act = act\n",
    "\n",
    "    def __call__(self, inputs):\n",
    "        with tf.name_scope(self.name):        \n",
    "            x = inputs\n",
    "            x = tf.nn.dropout(x, 1-self.dropout)\n",
    "            x = tf.matmul(x, self.vars['weights'])\n",
    "            x = tf.sparse_tensor_dense_matmul(self.adj, x)\n",
    "            outputs = self.act(x)\n",
    "        return outputs\n",
    "\n",
    "\n",
    "class GraphConvolutionSparse():\n",
    "    \"\"\"Graph convolution layer for sparse inputs.\"\"\"\n",
    "    def __init__(self, input_dim, output_dim, adj, features_nonzero, name, dropout=0., act=tf.nn.relu):\n",
    "        self.name = name\n",
    "        self.vars = {}\n",
    "        self.issparse = False\n",
    "        with tf.variable_scope(self.name + '_vars'):\n",
    "            self.vars['weights'] = weight_variable_glorot(input_dim, output_dim, name='weights')\n",
    "        self.dropout = dropout\n",
    "        self.adj = adj\n",
    "        self.act = act\n",
    "        self.issparse = True\n",
    "        self.features_nonzero = features_nonzero\n",
    "\n",
    "    def __call__(self, inputs):\n",
    "        with tf.name_scope(self.name):\n",
    "            x = inputs\n",
    "            x = dropout_sparse(x, 1-self.dropout, self.features_nonzero)\n",
    "            x = tf.sparse_tensor_dense_matmul(x, self.vars['weights'])\n",
    "            x = tf.sparse_tensor_dense_matmul(self.adj, x)\n",
    "            outputs = self.act(x)\n",
    "        return outputs\n",
    "    \n",
    "    \n",
    "class InnerProductDecoder():\n",
    "    \"\"\"Decoder model layer for link prediction.\"\"\"\n",
    "    def __init__(self, input_dim, name, dropout=0., act=tf.nn.sigmoid):\n",
    "        self.name = name\n",
    "        self.issparse = False\n",
    "        self.dropout = dropout\n",
    "        self.act = act\n",
    "\n",
    "    def __call__(self, inputs):\n",
    "        with tf.name_scope(self.name):\n",
    "            inputs = tf.nn.dropout(inputs, 1-self.dropout)\n",
    "            x = tf.transpose(inputs)\n",
    "            x = tf.matmul(inputs, x)\n",
    "            x = tf.reshape(x, [-1])\n",
    "            outputs = self.act(x)\n",
    "        return outputs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specify the Architecture of our GCN Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class GCNModel():\n",
    "    def __init__(self, placeholders, num_features, features_nonzero, name):\n",
    "        self.name = name\n",
    "        self.inputs = placeholders['features']\n",
    "        self.input_dim = num_features\n",
    "        self.features_nonzero = features_nonzero\n",
    "        self.adj = placeholders['adj']\n",
    "        self.dropout = placeholders['dropout']\n",
    "        with tf.variable_scope(self.name):\n",
    "            self.build()\n",
    "        \n",
    "    def build(self):\n",
    "        self.hidden1 = GraphConvolutionSparse(\n",
    "            name='gcn_sparse_layer',\n",
    "            input_dim=self.input_dim,\n",
    "            output_dim=FLAGS.hidden1,\n",
    "            adj=self.adj,\n",
    "            features_nonzero=self.features_nonzero,\n",
    "            act=tf.nn.relu,\n",
    "            dropout=self.dropout)(self.inputs)\n",
    "\n",
    "        self.embeddings = GraphConvolution(\n",
    "            name='gcn_dense_layer',\n",
    "            input_dim=FLAGS.hidden1,\n",
    "            output_dim=FLAGS.hidden2,\n",
    "            adj=self.adj,\n",
    "            act=lambda x: x,\n",
    "            dropout=self.dropout)(self.hidden1)\n",
    "\n",
    "        self.reconstructions = InnerProductDecoder(\n",
    "            name='gcn_decoder',\n",
    "            input_dim=FLAGS.hidden2, \n",
    "            act=lambda x: x)(self.embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specify the GCN Optimizer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class Optimizer():\n",
    "    def __init__(self, preds, labels, num_nodes, num_edges):\n",
    "        pos_weight = float(num_nodes**2 - num_edges) / num_edges\n",
    "        norm = num_nodes**2 / float((num_nodes**2 - num_edges) * 2)\n",
    "        \n",
    "        preds_sub = preds\n",
    "        labels_sub = labels\n",
    "\n",
    "        self.cost = norm * tf.reduce_mean(\n",
    "            tf.nn.weighted_cross_entropy_with_logits(\n",
    "                logits=preds_sub, targets=labels_sub, pos_weight=pos_weight))\n",
    "        self.optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)  # Adam Optimizer\n",
    "\n",
    "        self.opt_op = self.optimizer.minimize(self.cost)\n",
    "        self.grads_vars = self.optimizer.compute_gradients(self.cost)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train the GCN Model and Evaluate its Accuracy on a Test Set of Protein-Protein Interactions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given a training set of protein-protein interactions in yeast *S. cerevisiae*, our goal is to take these interactions and train a GCN model that can predict new protein-protein interactions. That is, we would like to predict new edges in the yeast protein interaction network. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "adj = load_data()\n",
    "num_nodes = adj.shape[0]\n",
    "num_edges = adj.sum()\n",
    "# Featureless\n",
    "features = sparse_to_tuple(sp.identity(num_nodes))\n",
    "num_features = features[2][1]\n",
    "features_nonzero = features[1].shape[0]\n",
    "\n",
    "# Store original adjacency matrix (without diagonal entries) for later\n",
    "adj_orig = adj - sp.dia_matrix((adj.diagonal()[np.newaxis, :], [0]), shape=adj.shape)\n",
    "adj_orig.eliminate_zeros()\n",
    "\n",
    "adj_train, train_edges, val_edges, val_edges_false, test_edges, test_edges_false = mask_test_edges(adj)\n",
    "adj = adj_train\n",
    "\n",
    "adj_norm = preprocess_graph(adj)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define placeholders\n",
    "placeholders = {\n",
    "    'features': tf.sparse_placeholder(tf.float32),\n",
    "    'adj': tf.sparse_placeholder(tf.float32),\n",
    "    'adj_orig': tf.sparse_placeholder(tf.float32),\n",
    "    'dropout': tf.placeholder_with_default(0., shape=())\n",
    "}\n",
    "\n",
    "# Create model\n",
    "model = GCNModel(placeholders, num_features, features_nonzero, name='yeast_gcn')\n",
    "\n",
    "# Create optimizer\n",
    "with tf.name_scope('optimizer'):\n",
    "    opt = Optimizer(\n",
    "        preds=model.reconstructions,\n",
    "        labels=tf.reshape(tf.sparse_tensor_to_dense(placeholders['adj_orig'], validate_indices=False), [-1]),\n",
    "        num_nodes=num_nodes,\n",
    "        num_edges=num_edges)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 0001 train_loss= 0.70988 val_roc= 0.83630 val_ap= 0.81451 time= 3.09125\n",
      "Epoch: 0002 train_loss= 0.70988 val_roc= 0.87255 val_ap= 0.86465 time= 3.01855\n",
      "Epoch: 0003 train_loss= 0.70984 val_roc= 0.87762 val_ap= 0.86818 time= 3.08140\n",
      "Epoch: 0004 train_loss= 0.70964 val_roc= 0.87790 val_ap= 0.86833 time= 3.06789\n",
      "Epoch: 0005 train_loss= 0.70679 val_roc= 0.87796 val_ap= 0.86837 time= 2.99488\n",
      "Epoch: 0006 train_loss= 0.69930 val_roc= 0.87799 val_ap= 0.86838 time= 3.22557\n",
      "Epoch: 0007 train_loss= 0.68876 val_roc= 0.87799 val_ap= 0.86838 time= 3.18693\n",
      "Epoch: 0008 train_loss= 0.68346 val_roc= 0.87800 val_ap= 0.86838 time= 3.67348\n",
      "Epoch: 0009 train_loss= 0.68193 val_roc= 0.87800 val_ap= 0.86838 time= 3.71131\n",
      "Epoch: 0010 train_loss= 0.67285 val_roc= 0.87800 val_ap= 0.86838 time= 3.22863\n",
      "Epoch: 0011 train_loss= 0.65272 val_roc= 0.87800 val_ap= 0.86838 time= 3.03830\n",
      "Epoch: 0012 train_loss= 0.63608 val_roc= 0.87797 val_ap= 0.86837 time= 3.09419\n",
      "Epoch: 0013 train_loss= 0.63756 val_roc= 0.87793 val_ap= 0.86836 time= 3.98356\n",
      "Epoch: 0014 train_loss= 0.64147 val_roc= 0.87788 val_ap= 0.86834 time= 4.87875\n",
      "Epoch: 0015 train_loss= 0.63750 val_roc= 0.87781 val_ap= 0.86830 time= 3.77994\n",
      "Epoch: 0016 train_loss= 0.63342 val_roc= 0.87770 val_ap= 0.86826 time= 2.95455\n",
      "Epoch: 0017 train_loss= 0.63338 val_roc= 0.87753 val_ap= 0.86817 time= 3.45614\n",
      "Epoch: 0018 train_loss= 0.63927 val_roc= 0.87732 val_ap= 0.86806 time= 3.69183\n",
      "Epoch: 0019 train_loss= 0.64602 val_roc= 0.87717 val_ap= 0.86798 time= 4.33306\n",
      "Epoch: 0020 train_loss= 0.64856 val_roc= 0.87706 val_ap= 0.86792 time= 3.81857\n",
      "Optimization Finished!\n",
      "Test ROC score: 0.87905\n",
      "Test AP score: 0.86948\n"
     ]
    }
   ],
   "source": [
    "# Initialize session\n",
    "sess = tf.Session()\n",
    "sess.run(tf.global_variables_initializer())\n",
    "\n",
    "adj_label = adj_train + sp.eye(adj_train.shape[0])\n",
    "adj_label = sparse_to_tuple(adj_label)\n",
    "\n",
    "# Train model\n",
    "for epoch in range(FLAGS.epochs):\n",
    "    t = time.time()\n",
    "    # Construct feed dictionary\n",
    "    feed_dict = construct_feed_dict(adj_norm, adj_label, features, placeholders)\n",
    "    feed_dict.update({placeholders['dropout']: FLAGS.dropout})\n",
    "    # One update of parameter matrices\n",
    "    _, avg_cost = sess.run([opt.opt_op, opt.cost], feed_dict=feed_dict)\n",
    "    # Performance on validation set\n",
    "    roc_curr, ap_curr = get_roc_score(val_edges, val_edges_false)\n",
    "\n",
    "    print(\"Epoch:\", '%04d' % (epoch + 1), \n",
    "          \"train_loss=\", \"{:.5f}\".format(avg_cost),\n",
    "          \"val_roc=\", \"{:.5f}\".format(roc_curr),\n",
    "          \"val_ap=\", \"{:.5f}\".format(ap_curr),\n",
    "          \"time=\", \"{:.5f}\".format(time.time() - t))\n",
    "\n",
    "print('Optimization Finished!')\n",
    "\n",
    "roc_score, ap_score = get_roc_score(test_edges, test_edges_false)\n",
    "print('Test ROC score: {:.5f}'.format(roc_score))\n",
    "print('Test AP score: {:.5f}'.format(ap_score))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}