TensorFlow Debugging: A TensorFlow 1 Tutorial

Reposted

小屁孩 2024-05-27 16:33:52


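This tutorial targets TensorFlow 1.x.

Introduction

There are many deep learning frameworks: TensorFlow, Caffe, Theano, Torch, and others. TensorFlow is one of Google's major open-source projects, and its very active community keeps the project moving: new features appear steadily and bug fixes ship in quick iterations. Keras is a high-level API built on top of TensorFlow and is bundled with it as tf.keras.

TensorFlow code can look somewhat different from an ordinary Python program, because TensorFlow has its own framework and conventions and describes a computation in the way that suits them best.

Basic usage

A TensorFlow program is usually organized into two separate phases: building the computation graph (tf.Graph) and running the computation graph (tf.Session).

Graph-building phase: the graph is made of tensors and operations. A tensor (Tensor) mainly carries three attributes: name, shape, and data type (dtype). An operation (op) is a node in the graph.

Graph-running phase: a session executes the operations of the graph that has been built.

A TensorFlow graph describes a computation. To actually compute anything, the graph must be launched in a session. The session places the graph's ops on devices such as CPUs or GPUs and provides the methods that execute them. Those methods return the values of the tensors they produce.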

import tensorflow as tf

# Build the graph for an addition
a = tf.constant(2.0, dtype=tf.float32)
b = tf.constant(3.0)
c = tf.add(a, b)
print("a", a)        # a Tensor("Const:0", shape=(), dtype=float32)
print("b", b)        # b Tensor("Const_1:0", shape=(), dtype=float32)
print("c", c)        # c Tensor("Add:0", shape=(), dtype=float32)

# Open a session and run the graph in it
with tf.Session() as sess:
    c_t = sess.run(c)   # keep the result of the run in c_t
    print(sess.run(b))        # 3.0
    print("result of the addition after running in the session:", c_t)     # 5.0

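The graph above has three nodes: two constant nodes and one add node. To repeat the point: to get the result of the addition, the graph must be run in a session.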
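The build-then-run split can be sketched without TensorFlow at all. The following is a toy illustration only: the Node class and the constant, add, and run helpers are hypothetical names invented here, not TensorFlow APIs. It shows that building a node computes nothing until the node is explicitly run.

```python
class Node:
    """A hypothetical graph node: an operation name plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only constants carry a value at build time


def constant(value):
    return Node("const", value=value)


def add(x, y):
    return Node("add", inputs=(x, y))


def run(node):
    """Evaluate a node by first evaluating its inputs (the 'session' phase)."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        return run(node.inputs[0]) + run(node.inputs[1])
    raise ValueError("unknown op: %s" % node.op)


# Phase 1: build. Nothing is computed yet; c only describes the addition.
a = constant(2.0)
b = constant(3.0)
c = add(a, b)
print(c.op, c.value)   # add None

# Phase 2: run. Only now is the sum actually computed.
result = run(c)
print(result)          # 5.0
```

A real session does far more (device placement, caching, parallel execution), but the separation between describing and executing is the same idea.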

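Computation graph (graph)

A graph contains operations (tf.Operation) and data (tf.Tensor).

Default graph

TensorFlow normally creates a default graph for us.

There are two ways to inspect the default graph:

1. default_g = tf.get_default_graph()    # returns the current default graph and binds it to default_g

2. Ops, tensors, and sessions all have a graph attribute; by default they belong to the same graph.

    a.graph       # <tensorflow.python.framework.ops.Graph object at 0x000001E3D5D12940>

    sess.graph    # <tensorflow.python.framework.ops.Graph object at 0x000001E3D5D12940>

Creating a graph

new_g = tf.Graph()    # creates a new graph instance and binds it to new_g

To define data and operations on this graph, use the context manager new_g.as_default().

To run it, select the custom graph with tf.Session(graph=new_g).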

# A custom graph
new_g = tf.Graph()
# Define data and operations in our own graph
with new_g.as_default():
    a_new = tf.constant([[20]])

# Open a session on new_g
with tf.Session(graph=new_g) as new_sess:
    a_new_value = new_sess.run(a_new)
    print("value of a in the new graph:", a_new_value)

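Session

To evaluate a tensor, instantiate a tf.Session object and call its run method, which runs graph nodes and returns tensor values.

sess.run(a) and a.eval() (the Tensor.eval method) both return a NumPy array with the same contents as the tensor.

Tensor.eval and sess.run() only work while a tf.Session is active.

with tf.Session() as sess:    # launch the default graph
    print(a.eval())       # 2.0, with a = tf.constant(2.0) from the first example
    print(sess.run(a))    # 2.0

There are two ways to open a session:

1. tf.Session()

2. tf.InteractiveSession(), for opening a session in interactive environments such as IPython, a shell, or Jupyter Notebook.

A session must release its resources after use. Besides calling sess.close(), the usual approach is a with block, which closes the session automatically.

tf.Session(graph=None, config=None)

graph: the graph to launch, for example graph=new_graph where new_graph is a custom graph instance

config: session options. To see which device each operation runs on, pass config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)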

The session's run()

run(fetches, feed_dict=None, options=None, run_metadata=None)

fetches: a single operation or tensor, or a tuple or list of them

a = tf.constant(2.)
b = tf.constant(3.)
c = a+b

sess = tf.Session()
s = sess.run(c)
print(s)
print(c.eval(session=sess))


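feed_dict: lets the caller override the values of tensors in the graph, assigning them at run time.

When used together with tf.placeholder(), the shape of each fed value is checked against the shape of its placeholder.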

We can fetch several tensors with a single sess.run call:

input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.multiply(input1, intermed)   # tf.mul was renamed to tf.multiply in TensorFlow 1.0

with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)

# Output:
# [21.0, 7.0]


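Feeding

Use tf.placeholder() to create placeholders for the inputs of these operations, and supply the actual values through feed_dict when calling run.

tf.placeholder(dtype, shape=None, name=None)

dtype: data type, commonly tf.float32 or tf.float64
shape: shape of the tensor to be fed. The default None accepts a value of any shape. A multi-dimensional shape such as [None, 28, 28, 1] leaves the dimensions marked None unconstrained
name: name of the operation

Returns: a Tensor

An example:

a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
c = tf.add(a,b)

with tf.Session() as sess:
    re = sess.run(c, feed_dict={a:1, b:2.5})
    print(re)        # 3.5
    print(sess.run(c, feed_dict={a:[1., 3.], b:[2., 3.]}))    # [3. 6.]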
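The shape check mentioned for feed_dict can be seen by giving a placeholder a partially known shape. This is a minimal sketch under a few assumptions: it imports tensorflow.compat.v1 so that it also runs on newer installations (on older TensorFlow 1.x, plain import tensorflow as tf should be equivalent), and the names graph, x, and row_sum are chosen here for illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep graph/session semantics on newer installs

graph = tf.Graph()
with graph.as_default():
    # Any number of rows, but exactly 2 columns
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
    row_sum = tf.reduce_sum(x, axis=1)

with tf.Session(graph=graph) as sess:
    # A value matching [None, 2] is accepted
    ok = sess.run(row_sum, feed_dict={x: [[1.0, 2.0], [3.0, 4.0]]})
    print(ok)

    # A value with 3 columns is rejected before any op runs
    try:
        sess.run(row_sum, feed_dict={x: [[1.0, 2.0, 3.0]]})
        shape_error = None
    except ValueError as err:
        shape_error = str(err)
    print(shape_error)
```

Leaving shape=None on the placeholder, as in the example above, disables this check, which is convenient for quick experiments but hides shape mistakes until later in the graph.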

Tensor

A Tensor has three attributes: name, shape, and data type (dtype).

Rank 0 tensors

A rank 0 tensor is a scalar, a single value.

mammal = tf.Variable("Elephant", dtype=tf.string)
ignition = tf.Variable(451, dtype=tf.int16)
floating = tf.Variable(3.14159265359, dtype=tf.float64)
its_complicated = tf.Variable(12.3 - 4.85j, dtype=tf.complex64)

Rank 1 tensors

mystr = tf.Variable(["Hello"], dtype=tf.string)
cool_numbers  = tf.Variable([3.14159, 2.71828], dtype=tf.float32)
first_primes = tf.Variable([2, 3, 5, 7, 11], dtype=tf.int32)
its_very_complicated = tf.Variable([12.3 - 4.85j, 7.5 - 6.23j], dtype=tf.complex64)

Rank 2 tensors

mymat = tf.Variable([[7],[11]], dtype=tf.int16)
myxor = tf.Variable([[False, True],[True, False]], dtype=tf.bool)
linear = tf.Variable([[4], [9], [16], [25]], dtype=tf.int32)
squarish = tf.Variable([ [4, 9], [16, 25] ], dtype=tf.int32)
rank = tf.rank(squarish)

Image processing uses many rank 4 tensors, whose dimensions correspond to batch size, image height, image width, and color channels.

my_image = tf.zeros([10, 299, 299, 3])  # batch x height x width x color

Changing shape: tf.reshape(x, shape)

three = tf.ones([3, 4, 5])
matrix = tf.reshape(three, [6, 10])  # reshape to 6x10
matrixB = tf.reshape(matrix, [3, -1])  # reshape to 3x20

In the shape argument, -1 means "infer this dimension": [3, -1] reshapes to 3 rows and however many columns the element count requires.
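The arithmetic behind the -1 entry is simple enough to write out. The helper below is a hypothetical, framework-free sketch (infer_shape is not a TensorFlow function): it divides the total element count by the product of the known dimensions.

```python
def infer_shape(num_elements, shape):
    """Resolve a single -1 entry the way a reshape does."""
    if list(shape).count(-1) > 1:
        raise ValueError("at most one dimension may be -1")

    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if -1 not in shape:
        if known != num_elements:
            raise ValueError("shape does not match the element count")
        return list(shape)

    if known == 0 or num_elements % known != 0:
        raise ValueError("element count is not divisible by the known dimensions")
    return [num_elements // known if dim == -1 else dim for dim in shape]


# tf.ones([3, 4, 5]) holds 3 * 4 * 5 = 60 elements
resolved = infer_shape(3 * 4 * 5, [3, -1])
print(resolved)  # [3, 20]
```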

Constant tensors

Tensor of zeros: tf.zeros(shape, dtype=tf.float32, name=None), for example tf.zeros([2, 2])

Tensor of ones: tf.ones(shape, dtype=tf.float32, name=None), for example tf.ones([2, 2])

Tensor of shape dims with every element set to value: tf.fill(dims, value, name=None)

import tensorflow as tf
t1 = tf.fill([2,3], 3)
sess = tf.Session()
print(sess.run(t1))

# [[3 3 3]
#  [3 3 3]]


Constant: tf.constant(value, dtype=None, shape=None, name="Const")

NumPy data to TensorFlow data: data_tensor = tf.convert_to_tensor(data_numpy)

TensorFlow data to NumPy data: data_numpy = data_tensor.eval(), run inside a session

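Random tensors

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

Generates a random tensor drawn from a normal distribution (the standard normal with the default arguments).

tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

shape is the shape of the generated tensor, mean is the mean, and stddev is the standard deviation. This function draws from a truncated normal distribution: any sampled value that lies more than two standard deviations from the mean is discarded and drawn again. Compared with an ordinary normal sampler, the values from this function therefore never deviate from the mean by more than two standard deviations, whereas an ordinary sampler can produce such values.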
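The redraw rule described above can be sketched with the standard library alone. This illustrates the rule only, not how TensorFlow implements it; the function name truncated_normal and the rng parameter are choices made here.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw from a normal distribution, redrawing any value that lies
    more than two standard deviations from the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [truncated_normal(mean=1.0, stddev=0.5, rng=rng) for _ in range(1000)]

# Every sample stays within mean +/- 2 * stddev, i.e. within [0.0, 2.0]
max_deviation = max(abs(s - 1.0) for s in samples)
print(max_deviation <= 1.0)  # True
```

This bound is why truncated normals are popular for weight initialization: no weight starts out as an extreme outlier.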

tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)

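Returns random values from a uniform distribution: shape is the shape, minval the minimum value, maxval the maximum value.

tf.random_shuffle(value, seed=None, name=None)

Randomly shuffles a tensor along its first dimension. value is the tensor to shuffle.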

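Initializers

tf.constant_initializer(value=0, dtype=tf.float32)

Initializes to a constant. It is also exposed as tf.keras.initializers.Constant. This one is very useful: bias terms are usually initialized with it.

Two initializers derived from it:

tf.zeros_initializer(dtype=tf.float32)
tf.ones_initializer(dtype=tf.float32)
Example: tf.constant_initializer(0)

tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

Generates truncated normal random values. It is also exposed as tf.keras.initializers.TruncatedNormal and appears to be one of the more commonly used initializers in TensorFlow code.

tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

An initializer that produces tensors from a normal distribution: mean is the mean, stddev the standard deviation, seed the random seed, dtype the data type.

tf.random_uniform_initializer(minval=0, maxval=None, seed=None, dtype=tf.float32)

Initializes a tensor from a uniform distribution. The arguments set minval (the minimum value), maxval (the maximum value), seed (the random seed), and dtype (the data type).

tf.cast(x, dtype, name=None)    # casts tensor x to the given dtype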

Variables

A variable is also a kind of tensor.

Creating variables: get_variable and Variable

TensorFlow has two functions for creating variables: tf.get_variable and tf.Variable.

tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=None, collections=None, caching_device=None, partitioner=None, validate_shape=True, use_resource=None, custom_getter=None, constraint=None)

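Gets an existing variable or creates a new one.

name: name of the new or existing variable
shape: shape of the new or existing variable
dtype: type of the new or existing variable (defaults to DT_FLOAT)
initializer: how the variable is initialized
regularizer: a function (tensor --> tensor) applied to the newly created variable; the result is added to the collection tf.GraphKeys.REGULARIZATION_LOSSES and can be used for regularization
trainable: defaults to True, which adds the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES, the default list of variables that optimizers update
collections: list of graph collections to add the variable to; defaults to [GraphKeys.GLOBAL_VARIABLES] (see tf.Variable)
validate_shape: if True (the default), the shape of initial_value must be known; if False, the variable may be initialized with a value of unknown shape
use_resource: if False, creates a regular variable; if True, creates an experimental ResourceVariable with well-defined semantics. Defaults to False (to be changed to True in a later version). In Eager mode this argument is always forced to True.

For example, create a variable named "v_1" as a three-dimensional tensor of shape [1, 2, 3]. Without a dtype argument the default would be tf.float32; here dtype=tf.int32 is set explicitly.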

v_1 = tf.get_variable("v_1", shape=[1, 2, 3], dtype=tf.int32, initializer=tf.zeros_initializer())

We can also pass a Tensor as the initializer. In that case we must not pass shape; the variable takes the shape of the initializing tensor, here tf.constant([23, 42]).

v_2 = tf.get_variable("v_2", dtype=tf.int32, initializer=tf.constant([23, 42]))

tf.Variable(initial_value=None, trainable=None, collections=None, validate_shape=True, caching_device=None, name=None, variable_def=None, dtype=None, expected_shape=None, import_scope=None, constraint=None, use_resource=None, shape=None)

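Creates a variable in the graph.

initial_value: a Tensor that defines the initial value of the Variable
trainable: defaults to True, which adds the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES, the default list of variables that optimizers update
collections: list of graph collection keys; the new variable is added to these collections. Defaults to [GraphKeys.GLOBAL_VARIABLES]
validate_shape: if True (the default), the shape of initial_value must be known; if False, the variable may be initialized with a value of unknown shape
name: variable name, optional; defaults to "Variable" and is made unique automatically
dtype: data type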

v = tf.Variable(5, name="v", dtype=tf.float16)

We can assign to a tf.Variable with the assign and assign_add methods.

import tensorflow as tf
# Create a variable initialized to the scalar 0
state = tf.Variable(0, name="counter")
one = tf.constant(1)
v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())

assignment = v.assign_add(1)    # add 1 to the variable
new_value = tf.add(state, one)  # new_value = state + 1
update = tf.assign(state, new_value)    # state = new_value; this only describes the graph, nothing executes yet

# After the graph is launched, variables must first be initialized
init = tf.global_variables_initializer()

# Launch the graph and run the ops
with tf.Session() as sess:
  sess.run(init)    # run 'init'
  print(sess.run(state))    # print the initial value of 'state'
  print(sess.run(assignment))    # same as assignment.eval()
  # Run the update op and print 'state'
  for _ in range(3):
    sess.run(update)
    print(sess.run(state))

# Output:
# 0
# 1.0
# 1
# 2
# 3


Differences

Difference 1: with tf.Variable, a name clash is resolved automatically by the system. With tf.get_variable(), a clash between identical variable names is not resolved and raises an error instead.

Creating variables with tf.Variable

w_1 = tf.Variable(3, name="w_1")
w_2 = tf.Variable(1, name="w_1")
print(w_1.name)     # w_1:0
print(w_2.name)     # w_1_1:0

Creating variables with tf.get_variable

w_1 = tf.get_variable(name="w_1", initializer=1)
w_2 = tf.get_variable(name="w_1", initializer=2)

# ValueError: Variable w_1 already exists, disallowed. Did you mean to set reuse=True in VarScope?

Difference 2: when a variable scope is opened with reuse=True, tf.Variable() is unaffected because it creates a new object every time. tf.get_variable() instead returns the variable object that was already created under that name. Outside a reusing scope, it creates a new variable.

import tensorflow as tf

with tf.variable_scope("scope1"):
    scope1_1 = tf.get_variable("w1", shape=[])
    scope1_2 = tf.Variable(0.0, name="w2")
with tf.variable_scope("scope2"):
    scope2_1 = tf.get_variable("w1", shape=[])
    scope2_2 = tf.Variable(0.0, name="w2")
with tf.variable_scope("scope1", reuse=True):
    scope1_reuse_1 = tf.get_variable("w1", shape=[])
    scope1_reuse_2 = tf.Variable(1.0, name="w2")

# The same get_variable name inside different variable scopes ends up with different full names
print(scope1_1.name, scope1_2.name)  # scope1/w1:0 and scope1/w2:0
print(scope2_1.name, scope2_2.name)  # scope2/w1:0 and scope2/w2:0
# tf.Variable creates a new name automatically when it meets an existing one
print(scope1_reuse_1.name, scope1_reuse_2.name)  # scope1/w1:0 and scope1_1/w2:0

print(scope1_1 is scope1_reuse_1)   # True:  scope1_1 and scope1_reuse_1 are the same object
print(scope1_2 is scope1_reuse_2)   # False: scope1_2 and scope1_reuse_2 are different objects

Every variable created in the code must be initialized before it can be read or displayed. Running the op below in a session initializes all variables at once.

a = tf.Variable(initial_value=50)
b = tf.Variable(initial_value=50)
c = tf.add(a, b)

init = tf.global_variables_initializer()    # op that initializes the variables

with tf.Session() as sess:      # open a session
    sess.run(init)      # run the initialization
    a_value, b_value, c_value = sess.run([a,b,c])
    print("a_value",a_value)
    print("b_value", b_value)
    print("c_value", c_value)

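tf.global_variables_initializer does not specify the order in which variables are initialized, so if the initial value of one variable depends on the value of another variable, an error is likely. In that case it is best to use variable.initialized_value() rather than variable when defining the dependent variable's initial value.

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
w = tf.get_variable("w", initializer=v.initialized_value() + 1)    # w depends on v, so read v through initialized_value()

We can also initialize a variable ourselves: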

session.run(my_variable.initializer)
my_variable.initializer.run()

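tf.add_to_collection: adds a variable to a named collection, gathering many variables into one list

tf.get_collection: returns everything in a collection, as a list

tf.add_n: adds up all the tensors in a list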

import tensorflow as tf
 
v1 = tf.get_variable(name='v1', shape=[1], initializer=tf.constant_initializer(0))
tf.add_to_collection('loss', v1)
v2 = tf.get_variable(name='v2', shape=[1], initializer=tf.constant_initializer(2))
tf.add_to_collection('loss', v2)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(tf.get_collection("loss"))
    # [<tf.Variable 'v1:0' shape=(1,) dtype=float32_ref>, 
    # <tf.Variable 'v2:0' shape=(1,) dtype=float32_ref>]
    print(sess.run(tf.add_n(tf.get_collection("loss"))))    # [2.]

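Variable namespaces: tf.name_scope and tf.variable_scope

TensorFlow has two functions for creating variable namespaces: tf.name_scope and tf.variable_scope.

Using tf.variable_scope() to put variables in namespaces makes the structure of the code clearer and makes TensorBoard tidier. Giving variables a name also makes TensorBoard tidier.

Let us first look at the simpler tf.name_scope("scope_name").

tf.name_scope is mainly used together with tf.Variable(); its main purpose is to make parameter names easier to manage.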

import tensorflow as tf

with tf.name_scope('conv1'):
    weights1 = tf.Variable([1.0, 2.0], name='weights')
    bias1 = tf.Variable([0.3], name='bias')

# Define variables in a second namespace
with tf.name_scope('conv2'):
    weights2 = tf.Variable([4.0, 2.0], name='weights')
    bias2 = tf.Variable([0.33], name='bias')

# weights1 and weights2 live in different namespaces, so they do not clash
print(weights1.name)        # conv1/weights:0
print(weights2.name)        # conv2/weights:0

# Running the same code again creates further namespaces
with tf.name_scope('conv1'):
    weights1 = tf.Variable([1.0, 2.0], name='weights')
    bias1 = tf.Variable([0.3], name='bias')

with tf.name_scope('conv2'):
    weights2 = tf.Variable([4.0, 2.0], name='weights')
    bias2 = tf.Variable([0.33], name='bias')

print(weights1.name)        # conv1_1/weights:0
print(weights2.name)        # conv2_1/weights:0

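Now let us look at tf.variable_scope("scope_name").

tf.variable_scope() is mainly used together with tf.get_variable() to share variables.

with tf.variable_scope('v_scope'):
    Weights1 = tf.get_variable('Weights', shape=[2,3])
    bias1 = tf.get_variable('bias', shape=[3])

# Share the variables defined above
# With reuse=True, every get_variable() call in the scope must name a variable that was already created with get_variable(), otherwise an error is raised
with tf.variable_scope('v_scope', reuse=True):
    Weights2 = tf.get_variable('Weights')

print(Weights1.name)    # v_scope/Weights:0
print(Weights2.name)    # v_scope/Weights:0

Both variable scopes are named v_scope; reuse=True simply makes the second v_scope reuse all variables of the first. Any variable requested in the second v_scope must already have been created with get_variable in the first, otherwise an error is raised. tf.Variable() creates a new object every time, whereas tf.get_variable() returns the previously created variable inside a reusing scope and creates a new variable otherwise.

The output shows that the two Python names refer to the same variable.

There are two ways to share variables:

# Method 1: switch an open scope to reuse mode
with tf.variable_scope("name") as scope:
    scope.reuse_variables()
    w = tf.get_variable("w")    # assumes "name/w" was created earlier with get_variable
# Method 2: open the scope with reuse=True
with tf.variable_scope("name", reuse=True):
    w = tf.get_variable("w")    # assumes "name/w" was created earlier with get_variable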


Let us use Variable and get_variable together and look at the output:

with tf.variable_scope('v_scope') as scope1:
    Weights1 = tf.get_variable('Weights', shape=[2,3])
    bias1 = tf.Variable([0.52], name='bias')

# Share the variables defined above
with tf.variable_scope('v_scope', reuse=True) as scope2:
    Weights2 = tf.get_variable('Weights')
    bias2 = tf.Variable([0.52], name='bias')

print(Weights1.name)        # v_scope/Weights:0
print(Weights2.name)        # v_scope/Weights:0
print(bias1.name)           # v_scope/bias:0
print(bias2.name)           # v_scope_1/bias:0

If a variable requested in a reuse=True scope was not created with get_variable, an error is raised.

with tf.variable_scope('v_scope') as scope1:
    Weights1 = tf.get_variable('Weights', shape=[2,3])
    bias1 = tf.Variable([0.52], name='bias')        # bias1 is created with tf.Variable

print(Weights1.name)        # v_scope/Weights:0
print(bias1.name)           # v_scope/bias:0

# Share the variables defined above
# With reuse=True, every get_variable() call in the scope must name a variable that was already created with get_variable(), otherwise an error is raised
with tf.variable_scope('v_scope', reuse=True) as scope2:
    Weights2 = tf.get_variable('Weights')
    bias2 = tf.get_variable('bias', [1])  # 'bias' was created with tf.Variable, so this line raises the error below

print(Weights2.name)
print(bias2.name)

# Variable v_scope/bias does not exist, or was not created with tf.get_variable()

Summary: when defining variable namespaces with tf.variable_scope, prefer tf.get_variable for creating variables.
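The sharing rules above can be mimicked with a small dictionary-backed registry. This is a toy model for intuition only: VariableStore and its get_variable method are hypothetical and much simpler than what TensorFlow does internally.

```python
class VariableStore:
    """Toy stand-in for the registry consulted by get_variable (hypothetical)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, reuse=False, initial=0.0):
        full_name = "%s/%s" % (scope, name)
        if reuse:
            # Reuse mode: the variable must already be registered
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        # Creation mode: the name must still be free
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        self._vars[full_name] = [initial]  # a mutable cell standing in for a variable
        return self._vars[full_name]


store = VariableStore()
w1 = store.get_variable("v_scope", "Weights")
w2 = store.get_variable("v_scope", "Weights", reuse=True)
print(w1 is w2)  # True

# A name that was never registered cannot be reused
try:
    store.get_variable("v_scope", "bias", reuse=True)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)  # Variable v_scope/bias does not exist
```

Objects created outside the registry, which is the role tf.Variable plays in the examples above, are simply invisible to the lookup, and that is why mixing the two styles leads to the "does not exist" error.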

A simple linear regression with TensorFlow

import tensorflow as tf

x_data = tf.random_normal(shape=[100,1])
y_data = tf.matmul(x_data, [[0.4]]) + 5

# Build a linear model
b = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
W = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
y = tf.matmul(x_data, W) + b

# Mean squared error loss
loss = tf.reduce_mean(tf.square(y - y_data))
# Gradient descent with learning rate 0.01 to minimize the loss
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
init = tf.global_variables_initializer()    # op that initializes the variables

# Open a session
with tf.Session() as sess:
    sess.run(init)
    print("before training w: %f and b: %f" % (W.eval(), b.eval()))
    for i in range(0, 1000):
        sess.run(optimizer)
        if i % 20 == 0:     # print every 20 steps
            print("after step %d w: %f b: %f loss: %f" % (i, W.eval(), b.eval(), loss.eval()))
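For intuition about what each sess.run(optimizer) step does, the same model can be trained by hand in plain Python. This sketch makes several assumptions that differ from the TensorFlow version: the data is sampled once instead of on every run, both parameters start at 0, and the learning rate is 0.1. The gradients are the derivatives of the mean squared error with respect to w and b.

```python
import random

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)]
ys = [0.4 * x + 5.0 for x in xs]   # same ground truth as above: w = 0.4, b = 5

w, b = 0.0, 0.0
learning_rate = 0.1   # assumption; chosen for quick convergence
n = len(xs)

for step in range(500):
    # loss = mean((w*x + b - y)^2)
    # d(loss)/dw = mean(2 * (w*x + b - y) * x)
    # d(loss)/db = mean(2 * (w*x + b - y))
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    grad_w = sum(2.0 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2.0 * e for e in errors) / n

    # Gradient descent update: move against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))  # 0.4 5.0
```

GradientDescentOptimizer derives these gradients automatically from the graph, so the same training loop works unchanged when the model or the loss becomes more complicated.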


TensorBoard: visualizing learning

TensorBoard is TensorFlow's visualization tool; it visualizes what a program does.

1. Serializing data: events files

Write an events file from within a session; this creates an events file under path.

tf.summary.FileWriter("path", graph=sess.graph)

path: directory the events file is written to
graph: the graph to draw; graph=sess.graph and graph=tf.get_default_graph() both select the default graph

import tensorflow as tf

a = tf.constant(20, name="a")
b = tf.constant(30, name="b")
c = a + b

with tf.Session() as sess:
    c_value = sess.run(c)
    writer = tf.summary.FileWriter("./", sess.graph)    # write the events file
    writer.close()    # remember to close the writer when done

2. Starting TensorBoard

In cmd or Git Bash, run tensorboard --logdir="path"

Then open localhost:6006 in Chrome to see the graph.


Let us look at a more complex TensorBoard graph. The code is as follows:

# A linear regression
import tensorflow as tf

# Prepare the data
X = tf.random_normal(shape=[100, 1])
y_true = tf.matmul(X, [[0.8]]) + 0.7

# Build the variables Weight and bias of the linear model
Weight = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
bias = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
y_predict = tf.matmul(X, Weight) + bias

# Build the loss, the optimizer, and the training op train
loss = tf.reduce_mean(tf.square(y_predict - y_true))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()     # op that initializes the variables (replaces the deprecated tf.initialize_all_variables)

with tf.Session() as sess:
    sess.run(init)
    writer = tf.summary.FileWriter("./", sess.graph)

    for step in range(1000):
        sess.run(train)
        if step % 20 == 0:     # print every 20 steps
            print(step, sess.run(Weight), sess.run(bias))

    writer.close()


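3. Displaying variables

Goal: watch how model parameters, the loss, and other values change in TensorBoard.

Steps:

1. Collect the values to visualize in TensorBoard

tf.summary.scalar(name, tensor)       # collects a scalar, e.g. loss, accuracy, learning_rate
tf.summary.histogram(name, tensor)    # collects a higher-dimensional value, e.g. a variable during training, shown as a histogram of its distribution
tf.summary.image(name, tensor)        # collects an image tensor [image_num, height, width, channels] to show images recorded during training
tf.summary.audio(name, tensor, sample_rate)    # collects an audio tensor [audio_num, frames, channels] to play audio recorded during training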


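2. Merge the summaries and write them to the events file

1. Merge all collected summaries into a single op: merged = tf.summary.merge_all()

2. In the session, create the events file and save the graph: event_file = tf.summary.FileWriter("./", sess.graph)

3. In the session, run the merge op: summary = sess.run(merged)    # needed on every iteration

4. Write the summary produced by each iteration to the events file: event_file.add_summary(summary, step)

5. Finally, remember to close the events file: event_file.close()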

# A linear regression
import tensorflow as tf

# Prepare the data
X = tf.random_normal(shape=[100, 1])
y_true = tf.matmul(X, [[0.8]]) + 0.7

# Build the variables Weight and bias of the linear model
Weight = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
bias = tf.Variable(initial_value=tf.random_normal(shape=[1, 1]))
y_predict = tf.matmul(X, Weight) + bias

# Build the loss, the optimizer, and the training op train
loss = tf.reduce_mean(tf.square(y_predict - y_true))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(loss)

# Collect the values to display
tf.summary.scalar("error", loss)            # collect a scalar
tf.summary.histogram("Weight", Weight)      # collect higher-dimensional values
tf.summary.histogram("bias", bias)
merged = tf.summary.merge_all()             # merge the summaries

init = tf.global_variables_initializer()    # op that initializes the variables

with tf.Session() as sess:
    sess.run(init)
    event_file = tf.summary.FileWriter("./", sess.graph)     # create the events file

    for step in range(100):
        sess.run(train)
        if step % 20 == 0:     # print every 20 steps
            print(step, sess.run(Weight), sess.run(bias))

        summary = sess.run(merged)                  # run the merged summary op
        event_file.add_summary(summary, step)       # write this iteration's values to the events file
    event_file.close()


The menu bar now has three more tabs: SCALARS, DISTRIBUTIONS, and HISTOGRAMS.

The SCALARS tab plots the collected scalar (here "error", the loss) against the training step. The DISTRIBUTIONS and HISTOGRAMS tabs show how the values of Weight and bias are distributed as training proceeds.

The graph structure shown so far is rather cluttered; adding variable namespaces makes the displayed graph tidier.
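One way to group the regression graph is sketched below. The scope names "data", "model", "loss", and "train" are choices made here, not taken from the original figures, and the import uses tensorflow.compat.v1 so the sketch also runs on newer installations (on older TensorFlow 1.x, plain import tensorflow as tf should be equivalent).

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep graph/session semantics on newer installs

graph = tf.Graph()
with graph.as_default():
    with tf.name_scope("data"):
        X = tf.random_normal(shape=[100, 1], name="X")
        y_true = tf.matmul(X, [[0.8]]) + 0.7

    with tf.name_scope("model"):
        Weight = tf.Variable(tf.random_normal(shape=[1, 1]), name="Weight")
        bias = tf.Variable(tf.random_normal(shape=[1, 1]), name="bias")
        y_predict = tf.matmul(X, Weight) + bias

    with tf.name_scope("loss"):
        loss = tf.reduce_mean(tf.square(y_predict - y_true), name="mse")

    with tf.name_scope("train"):
        train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# Every op name now carries its scope as a prefix
print(Weight.name)  # model/Weight:0
print(loss.name)    # loss/mse:0
```

When this graph is written with tf.summary.FileWriter as in the earlier examples, TensorBoard collapses each scope into a single expandable box, which is what makes the picture easier to read.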

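Saving and loading models

Saving a model

saver = tf.train.Saver(var_list=None, max_to_keep=5)    # saves and loads models (file format: checkpoint files)

var_list: the variables to save, passed as a dict or a list; None saves all variables
max_to_keep: number of checkpoint files to keep; old files are deleted as new ones are created

saver.save(sess, save_path="./liner.ckpt", global_step=step)    # saves the model

sess: the session whose variable values are saved
save_path: path and file name for the saved weights
global_step=step: appends the training step to the model file name as a suffix

After one call to saver.save(), four new files appear in the folder:

checkpoint            records the most recent models

***.meta              stores the network structure

***.data/***.index    store the trained parameters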


import tensorflow as tf

v1 = tf.get_variable("v1", shape=[3], initializer=tf.zeros_initializer())
v2 = tf.get_variable("v2", shape=[5], initializer=tf.zeros_initializer())
inc_v1 = v1.assign(v1+1)        # v1 + 1
dec_v2 = v2.assign(v2-1)        # v2 - 1

init = tf.global_variables_initializer()
saver = tf.train.Saver()        # create the Saver object

with tf.Session() as sess:
    sess.run(init)
    inc_v1.op.run()
    dec_v2.op.run()
    for epoch in range(300):
        if epoch % 10 ==0:
            # Save the model from within the session
            save_path = saver.save(sess, "./model/model.ckpt", global_step=epoch)
    print("Model saved in path: %s" % save_path)
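Which checkpoints survive this loop follows from the max_to_keep rule described above. The function below is a hypothetical simulation of that rule with a queue, not the Saver implementation: each save appends a file name, and once more than max_to_keep names exist the oldest is dropped.

```python
from collections import deque


def simulate_saver(steps, max_to_keep=5):
    """Return the checkpoint names that remain after saving at each step."""
    kept = deque()
    for step in steps:
        kept.append("model.ckpt-%d" % step)
        if len(kept) > max_to_keep:
            kept.popleft()  # the oldest checkpoint is deleted
    return list(kept)


# The loop above saves at epochs 0, 10, ..., 290 with the default max_to_keep=5
kept = simulate_saver(range(0, 300, 10))
print(kept)
# ['model.ckpt-250', 'model.ckpt-260', 'model.ckpt-270', 'model.ckpt-280', 'model.ckpt-290']
```

The last entry, model.ckpt-290, is the one tf.train.latest_checkpoint would be expected to report for this run.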

[Figure: list of the model files generated during training]

Loading a model

Load a model: saver.restore(sess, "./liner.ckpt")

Get the latest model: checkpoint = tf.train.latest_checkpoint("./model/")

import tensorflow as tf
import os

def load_model(sess, ckpt):
    # ckpt is either a checkpoint directory or a checkpoint path
    if os.path.isdir(ckpt):
        # Get the latest model in the directory
        checkpoint = tf.train.latest_checkpoint(ckpt)  # ./model/model.ckpt-290
    else:
        checkpoint = ckpt
    print(checkpoint)
    meta = checkpoint + '.meta'  # './model/model.ckpt-290.meta'

    saver = tf.train.import_meta_graph(meta)  # load the graph structure
    saver.restore(sess, checkpoint)  # load the model parameters


with tf.Session() as sess:
    model_dir = "./model/"
    load_model(sess, model_dir)
    # The graph was rebuilt from the .meta file, so look the variables up by name
    graph = tf.get_default_graph()
    v1 = graph.get_tensor_by_name("v1:0")
    v2 = graph.get_tensor_by_name("v2:0")

    print("Model restored.")
    print("v1 : %s" % sess.run(v1))
    print("v2 : %s" % sess.run(v2))

sv = tf.train.Supervisor(logdir="./my_dir", init_op=init)

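Before training a model we usually check whether a previously trained model has been saved locally, which takes an if test. Supervisor saves us that step: sv = tf.train.Supervisor(logdir=log_path, init_op=init) checks whether a model exists and, if it does, loads it automatically without an explicit call to restore. Without Supervisor the code looks like this:

import os

# train, W, and b are assumed to come from a model defined earlier, such as the linear regression above
log_path = "./Source/model/supervisor"
log_name = "linear.ckpt"

saver = tf.train.Saver()            # create the saver object
init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

if len(os.listdir(log_path)) != 0:  # if a model already exists, load it directly
    saver.restore(sess, os.path.join(log_path, log_name))
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))
saver.save(sess, os.path.join(log_path, log_name))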

With Supervisor

log_path = "./Source/model/supervisor"
log_name = "linear.ckpt"

init = tf.global_variables_initializer()

sv = tf.train.Supervisor(logdir=log_path, init_op=init)  # logdir holds checkpoints and summaries
saver = sv.saver                  # the saver created by the Supervisor

# managed_session() is a context manager: it looks for a checkpoint in logdir and runs the init op if none is found
with sv.managed_session() as sess:
    for i in range(201):
        sess.run(train)
        if i % 20 == 0:
            print(i, sess.run(W), sess.run(b))
    saver.save(sess, os.path.join(log_path, log_name))

Finding the model file name through checkpoint

tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)

Arguments:

checkpoint_dir: directory that contains the checkpoint file
latest_filename: name of the checkpoint file

Returns a CheckpointState proto object, which has two useful attributes:

model_checkpoint_path: path of the latest checkpoint
all_model_checkpoint_paths: list of the paths of all checkpoints recorded in the file

import os
import re

# checkpoint_dir, saver, and sess are assumed to be defined already
ckpt = tf.train.get_checkpoint_state(checkpoint_dir=checkpoint_dir)
if ckpt and ckpt.model_checkpoint_path:
    print(ckpt.model_checkpoint_path)  # ./model/model.ckpt-1000 (1000 is the training step)
    print(ckpt.all_model_checkpoint_paths)  # ['./model/model.ckpt-500', './model/model.ckpt-1000']
    # File name of the latest checkpoint
    ckpt_name = os.path.basename(ckpt.model_checkpoint_path)  # model.ckpt-1000
    print("latest checkpoint file name:", ckpt_name)  # model.ckpt-1000
    saver.restore(sess, os.path.join(checkpoint_dir, ckpt_name))
    global_step = int(next(re.finditer(r"(\d+)(?!.*\d)", ckpt_name)).group(0))  # 1000
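The regular expression on the last line picks the final run of digits in the file name: (\d+) matches digits, and the negative lookahead (?!.*\d) rejects any match that still has a digit somewhere after it. A standalone sketch, where the helper name step_from_checkpoint and the sample paths are made up for illustration:

```python
import os
import re


def step_from_checkpoint(path):
    """Return the trailing step number of a checkpoint path, or None."""
    name = os.path.basename(path)
    # Digits that are not followed, anywhere later in the name, by another digit
    match = re.search(r"(\d+)(?!.*\d)", name)
    return int(match.group(0)) if match else None


print(step_from_checkpoint("./model/model.ckpt-1000"))    # 1000
print(step_from_checkpoint("./model/v2_model.ckpt-500"))  # 500
print(step_from_checkpoint("./model/model.ckpt"))         # None
```

Returning None for a name without digits avoids the StopIteration that next(re.finditer(...)) raises when there is no match.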

Using command-line arguments

TensorFlow lets a program accept arguments from the command line through tf.app.flags, which can define arguments of several types.

tf.app.flags.DEFINE_****(flag name, default value, docstring)

Integer argument: tf.app.flags.DEFINE_integer(flag_name, default, docstring)
String argument: tf.app.flags.DEFINE_string(flag_name, default, docstring)
Boolean argument: tf.app.flags.DEFINE_boolean(flag_name, default, docstring)
Float argument: tf.app.flags.DEFINE_float(flag_name, default, docstring)
...

Through tf.app.flags.FLAGS we can read the flags defined above by their flag_name.

import tensorflow as tf 

# Define the command-line arguments
tf.app.flags.DEFINE_integer("step", 100, "number of training steps")
tf.app.flags.DEFINE_string("model_dir", "Unknown", "path where the model is saved")

FLAGS = tf.app.flags.FLAGS                  # shorter name

print("step:", FLAGS.step)                  # step: 100
print("model_dir:", FLAGS.model_dir)        # model_dir: Unknown

tf.app.run() automatically runs the main(argv) function of the script; if the script has no main(argv) function, it raises an error.

The argv argument of main, when printed, shows the path of the script.

import tensorflow as tf 

def main(argv):
    print(argv)     # ['C:\\Users\\Never\\Desktop\\temp\\temp.py']

if __name__ == "__main__":
    tf.app.run()        # automatically calls the script's main(argv) function

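Using multiple GPUs

Normally you do not need to specify explicitly whether TensorFlow uses the CPU or a GPU; TensorFlow detects them automatically. If a GPU is detected, TensorFlow uses the first GPU it finds for as many operations as possible.

If the machine has more than one GPU, the GPUs other than the first are not used by default. To make TensorFlow use them, you must assign ops to them explicitly. The with tf.device(...) statement assigns operations to a specific CPU or GPU:

with tf.Session() as sess:
  with tf.device("/gpu:1"):    # use the second GPU
    matrix1 = tf.constant([[3., 3.]])
    matrix2 = tf.constant([[2.],[2.]])
    product = tf.matmul(matrix1, matrix2)
    ...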

Using TensorFlow in IPython

To make interactive Python environments such as IPython or Jupyter Notebook more convenient, you can use InteractiveSession instead of the Session class, and the Tensor.eval() and Operation.run() methods instead of Session.run(). This avoids having to keep the session in a variable.

# Enter an interactive TensorFlow session
import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' with the run() method of its initializer op
x.initializer.run()

# Add a subtraction op that subtracts 'a' from 'x', run it, and print the result
sub = tf.subtract(x, a)
print(sub.eval())    # [-2. -1.]


When running TensorFlow, the following message may appear in red: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

If you do not want to see it, add the following at the start of the script, before importing tensorflow:

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"    # 1 hides INFO messages, 2 also hides WARNING messages

