Although TensorFlow was the most popular deep learning framework in 2016, PyTorch, a smaller and newer framework developed by FAIR (Facebook AI Research), became a dark horse this year. PyTorch supports dynamic graph computation, which means you can freely add or remove layers in your model at runtime. This lets developers and researchers build new models more rapidly.
To fight back against PyTorch, the TensorFlow team added a new mechanism named “Eager Mode”, which also enables dynamic graph computation. A minimal example of “Eager Mode” looks like this:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()  # Enable eager execution mode
x = [[2.]]
m = tf.matmul(x, x)
print(m)

As shown above, unlike a traditional TensorFlow application that uses “Session.run()” to execute the whole graph, developers can inspect the values and gradients of variables in any layer at any step.
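For example, gradients can also be computed and printed immediately. Here is a minimal sketch using tfe.gradients_function from tf.contrib.eager, assuming eager execution has been enabled as in the snippet above:

def square(x):
  return tf.multiply(x, x)

# With eager execution enabled, both the value and the gradient
# are concrete tensors that can be printed right away.
grad_fn = tfe.gradients_function(square)
print(square(3.))   # => 9.0
print(grad_fn(3.))  # => [6.0], i.e. d(x*x)/dx at x = 3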
How does TensorFlow do this? Actually, the trick behind the API is not difficult. Take the most common operation, ‘matmul’, as an example:

# file: tensorflow/python/ops/math_ops.py
def matmul(a,
           b,
           transpose_a=False,
           transpose_b=False,
           adjoint_a=False,
           adjoint_b=False,
           a_is_sparse=False,
           b_is_sparse=False,
           name=None):
......
    if use_sparse_matmul:
      return sparse_matmul(
          a,
          b,
          transpose_a=transpose_a,
          transpose_b=transpose_b,
          a_is_sparse=a_is_sparse,
          b_is_sparse=b_is_sparse,
          name=name)
    else:
      return gen_math_ops._mat_mul(
          a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
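The fallback path calls gen_math_ops._mat_mul(), a Python wrapper generated at build time from the registered MatMul op, which is why it lives under bazel-genfiles. If you want to locate the generated file in your own installation, a quick sketch using only the standard library’s inspect module (assuming your TensorFlow version still names the wrapper _mat_mul, as below):

import inspect
from tensorflow.python.ops import gen_math_ops

# Print the path of the generated module and the wrapper's source,
# so you can follow along in your own TensorFlow installation.
print(inspect.getsourcefile(gen_math_ops))
print(inspect.getsource(gen_math_ops._mat_mul))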

Let’s look into “gen_math_ops._mat_mul()”:

# file: bazel-genfiles/tensorflow/python/ops/gen_math_ops.py
def _mat_mul(a, b, transpose_a=False, transpose_b=False, name=None):
......
  if _ctx.in_graph_mode():
    _, _, _op = _op_def_lib._apply_op_helper(
        "MatMul", a=a, b=b, transpose_a=transpose_a, transpose_b=transpose_b,
        name=name)
    _result = _op.outputs[:]
    _inputs_flat = _op.inputs
    _attrs = ("transpose_a", _op.get_attr("transpose_a"), "transpose_b",
              _op.get_attr("transpose_b"), "T", _op.get_attr("T"))
  else:
    _attr_T, _inputs_T = _execute.args_to_matching_eager([a, b], _ctx)
    (a, b) = _inputs_T
    _inputs_flat = [a, b]
    _attrs = ("transpose_a", transpose_a, "transpose_b", transpose_b, "T",
              _attr_T)
    _result = _execute.execute(b"MatMul", 1, inputs=_inputs_flat,
                               attrs=_attrs, ctx=_ctx, name=name)
  _execute.record_gradient(
      "MatMul", _inputs_flat, _attrs, _result, name)
  _result, = _result
  return _result

As we can see, in graph mode the wrapper goes to “_apply_op_helper()” to build the graph (without running it), while in eager mode it executes the operation directly.
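To see the two branches from the caller’s point of view, here is a small sketch of the default graph-mode path; with tfe.enable_eager_execution() at program start (as in the first example), the very same tf.matmul call takes the eager branch instead and prints a concrete value:

import tensorflow as tf

# Default graph mode: tf.matmul -> _mat_mul() -> _apply_op_helper()
# only adds a MatMul node to the graph and returns a symbolic Tensor.
x = tf.constant([[2.]])
m = tf.matmul(x, x)
print(m)  # Tensor("MatMul:0", shape=(1, 1), dtype=float32)

# The value only exists after the graph is executed by a Session.
with tf.Session() as sess:
  print(sess.run(m))  # [[4.]]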