Common Basic APIs in TensorFlow 2.0

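After the 2.0 redesign, TensorFlow feels very much like NumPy: operations run eagerly, so you no longer have to push them through a session as before. Some basic operations are shown below. Points to watch out for, along with some sample outputs, are written in the code comments.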

1. tf.constant / string

Constant and string operations. tf.constant creates a constant, i.e. an immutable tensor.

import numpy as np
import tensorflow as tf
from tensorflow import keras  # used by the Keras examples in later sections

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
a = tf.ones([512, 100])
b = tf.random.uniform([512, 100])

# indexing
print(t)
print(t[:, 1:])
print(t[..., 1])  # index 1 along the last dimension, i.e. the second column

# ops
print(t + 10)  # adds 10 to every element; t + t would add element-wise
print(tf.square(t))
print(t @ tf.transpose(t))  # matrix multiplication uses @

# NumPy conversion
print(t.numpy())
print(np.square(t))
np_t = np.array([[1., 2., 3.], [4., 5., 6.]])
print(tf.constant(np_t))

# scalars
t = tf.constant(2.718)
print(t.numpy())
print(t.shape)

# strings
t = tf.constant("cafe")
print(t)
print(tf.strings.length(t))  # length in bytes by default
print(tf.strings.length(t, unit="UTF8_CHAR"))  # length in Unicode characters
print(tf.strings.unicode_decode(t, "UTF8"))  # tf.Tensor([ 99 97 102 101], shape=(4,), dtype=int32)

# string array
t = tf.constant(["cafe", "coffee", "咖啡"])
print(tf.strings.length(t, unit="UTF8_CHAR"))
r = tf.strings.unicode_decode(t, "UTF8")  # yields a RaggedTensor, because the strings differ in length
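
The difference between byte length and character length only shows up with non-ASCII text. The following is a minimal sketch of that difference; the variable names (words, byte_lengths, char_lengths, code_points) are illustrative and not part of the original post.

```python
import tensorflow as tf

# Python str values are stored as UTF-8 encoded bytes inside a tf.string tensor.
words = tf.constant(["cafe", "咖啡"])

byte_lengths = tf.strings.length(words)                    # default unit is "BYTE"
char_lengths = tf.strings.length(words, unit="UTF8_CHAR")  # count Unicode characters instead

print(byte_lengths.numpy())  # each of these CJK characters takes 3 bytes in UTF-8
print(char_lengths.numpy())

# unicode_decode returns code points; the rows differ in length, so the result is ragged.
code_points = tf.strings.unicode_decode(words, "UTF-8")
print(code_points.to_list())
```

The ragged result here is what the next section is about.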

2. tf.RaggedTensor

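A RaggedTensor is a tensor whose rows may have different lengths, a relatively new addition that arrived around the TF 2.0 release. It is used a lot in practice, because real inputs often do not have a fixed size.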

# ragged tensor
r = tf.ragged.constant([[11, 12], [21, 22, 23], [], [41]])
# indexing
print(r)
print(r[1])
print(r[1:3])


# ops on ragged tensors
r2 = tf.ragged.constant([[51, 52], [], [71]])
print(tf.concat([r, r2], axis=0))  # <tf.RaggedTensor [[11, 12], [21, 22, 23], [], [41], [51, 52], [], [71]]>
r3 = tf.ragged.constant([[13, 14], [15], [], [42, 43]])
print(tf.concat([r, r3], axis=1))  # <tf.RaggedTensor [[11, 12, 13, 14], [21, 22, 23, 15], [], [41, 42, 43]]>


# Convert to a regular tensor; missing positions are padded with 0.
# Note that the zeros are always appended after the real values.
print(r.to_tensor())
# tf.Tensor(
# [[11 12  0]
#  [21 22 23]
#  [ 0  0  0]
#  [41  0  0]], shape=(4, 3), dtype=int32)
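
Padding with 0 can be ambiguous when 0 is also a valid data value. As a hedged sketch (the choice of -1 as the padding value and the names lengths, padded and mask are assumptions made for illustration), one way to keep track of which positions are real is:

```python
import tensorflow as tf

r = tf.ragged.constant([[11, 12], [21, 22, 23], [], [41]])

# Number of real values in each row.
lengths = r.row_lengths()
print(lengths.numpy())

# Pad with -1 instead of 0 so that padding cannot be mistaken for data.
padded = r.to_tensor(default_value=-1)
print(padded.numpy())

# Boolean mask marking the real (non-padding) positions.
mask = tf.sequence_mask(lengths, maxlen=padded.shape[1])
print(mask.numpy())
```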

3. tf.SparseTensor

Sparse tensors, for data in which most entries are 0.

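# sparse tensor: unlike a padded ragged tensor, zeros may come before real values
s = tf.SparseTensor(indices = [[0, 1], [1, 0], [2, 3]],  # first argument: the coordinates of each value
                    values = [1., 2., 3.],  # second argument: the value stored at each coordinate
                    dense_shape = [3, 4])  # third argument: the shape of the full tensor
print(s)
print(tf.sparse.to_dense(s))  # convert to a dense tensor; note the name is to_dense, not to_tensor

# ops on sparse tensors: multiplying by a scalar works and is element-wise
s2 = s * 2.0
print(s2)

# Adding a scalar is not supported and raises an error; convert to a dense tensor first if you need it
try:
    s3 = s + 1
except TypeError as ex:
    print(ex)

# Matrix multiplication is supported, but not through @; use the dedicated API, or convert to dense first and then use @
s4 = tf.constant([[10., 20.],
                    [30., 40.],
                    [50., 60.],
                    [70., 80.]])
print(tf.sparse.sparse_dense_matmul(s, s4))  # same result as tf.sparse.to_dense(s) @ s4

# indices need to be in order when calling to_dense
s5 = tf.SparseTensor(indices = [[0, 2], [0, 1], [2, 3]],  # [0, 2] before [0, 1] is out of order, so to_dense would fail
                     values = [1., 2., 3.],
                     dense_shape = [3, 4])
print(s5)  # printing the SparseTensor itself does not fail
s6 = tf.sparse.reorder(s5)  # after reordering, to_dense works
print(tf.sparse.to_dense(s6))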
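
Since adding a scalar to a SparseTensor fails, here is a minimal sketch of two workarounds. It assumes the same s as above; the names dense_plus_one and doubled are illustrative.

```python
import tensorflow as tf

s = tf.SparseTensor(indices=[[0, 1], [1, 0], [2, 3]],
                    values=[1., 2., 3.],
                    dense_shape=[3, 4])

# Option 1: densify first, then use ordinary tensor arithmetic.
dense_plus_one = tf.sparse.to_dense(s) + 1.0

# Option 2: add two sparse tensors of the same shape; the result stays sparse.
doubled = tf.sparse.add(s, s)

print(dense_plus_one.numpy())
print(tf.sparse.to_dense(doubled).numpy())
```

Option 1 gives up the memory savings of the sparse format, so it is only reasonable when the dense shape is small.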

4. tf.Variable

A tf.Variable holds a value that can be changed after creation. To update it, call assign rather than using =.

# Variables
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
print(v)
print(v.value())  # the current value as a regular (immutable) tensor
print(v.numpy())

# Assign new values with assign; a plain "=" would only rebind the Python name
v.assign(2 * v)  # multiply every element by 2
v[0, 1].assign(42)  # assign at a specific position
v[1].assign([7., 8., 9.])  # assign a whole row
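
To see why = is not an update, the sketch below keeps a second Python name for the same variable. The name alias is introduced purely for illustration.

```python
import tensorflow as tf

v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
alias = v                        # a second Python name for the same variable object

v.assign_add(tf.ones([2, 3]))    # in-place +=
v.assign_sub(tf.ones([2, 3]))    # in-place -=
v[0, 1].assign(42.)              # in-place update of a single element
print(alias.numpy())             # the alias sees the in-place updates

v = v * 2                        # "=" rebinds the name v to a new, ordinary tensor
print(isinstance(v, tf.Variable))  # the name v no longer refers to a Variable
print(alias.numpy())               # the original variable was not modified by "="
```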

5. Custom loss

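# Assumes tf, keras and the training features x_train are already defined.
def customized_mse(y_true, y_pred):
    """Mean squared error, written by hand."""
    return tf.reduce_mean(tf.square(y_pred - y_true))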


model = keras.models.Sequential([
    keras.layers.Dense(30, activation='relu',
                            input_shape=x_train.shape[1:]),
    keras.layers.Dense(1),
])
model.summary()
model.compile(loss=customized_mse, optimizer="sgd",
              metrics=["mean_squared_error"])
callbacks = [keras.callbacks.EarlyStopping(
            patience=5, min_delta=1e-2)]

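Where the code now says loss=customized_mse, it previously said loss="mean_squared_error", which selects the built-in loss. To use your own loss, define a function and pass it in. metrics lists additional quantities to report; if you put MSE there, you will see that the hand-written loss and the built-in MSE produce the same values.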
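
The claim that both versions agree can be checked without training a model. This is a small sketch on made-up numbers (y_true and y_pred are invented for the check), using tf.keras.losses.MeanSquaredError as the built-in reference.

```python
import tensorflow as tf

def customized_mse(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

# Made-up targets and predictions, one output per sample.
y_true = tf.constant([[1.0], [2.0], [3.0]])
y_pred = tf.constant([[1.5], [2.0], [2.0]])

custom = customized_mse(y_true, y_pred)
builtin = tf.keras.losses.MeanSquaredError()(y_true, y_pred)

print(float(custom), float(builtin))  # both average the squared errors 0.25, 0 and 1
```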

6. Layer-related APIs

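layer = tf.keras.layers.Dense(100)  # number of units
layer = tf.keras.layers.Dense(100, input_shape=(None, 5))  # input shape; optional, it is inferred on first use if omitted

layer.variables  # all of the layer's variables, trainable and non-trainable (empty until the layer is built)
layer.trainable_variables  # only the trainable ones, e.g. w and b in wx + b

To inspect the values the weights have been trained to, layer.trainable_variables is a good choice.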
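
As a brief sketch of what these attributes contain, the example below builds a Dense layer by calling it once on dummy input. The batch size of 10 and the 5 input features are arbitrary values chosen for illustration.

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(100)
n_before = len(layer.trainable_variables)
print(n_before)  # no weights yet: the layer has not been built

# Calling the layer on a batch of 10 samples with 5 features builds it.
outputs = layer(tf.zeros([10, 5]))

shapes = [tuple(v.shape) for v in layer.trainable_variables]
print(shapes)         # kernel shape first, then bias shape
print(outputs.shape)
```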

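7. Custom layers

To define your own layer, for instance when a Dense layer does not do what you need, subclass keras.layers.Layer. The work is split across three methods: __init__, build and call.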

# customized dense layer.
class CustomizedDenseLayer(keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super(CustomizedDenseLayer, self).__init__(**kwargs)
        self.units = units  # size of the output's last dimension
        self.activation = keras.layers.Activation(activation)

    def build(self, input_shape):
        """Create the weights the layer needs."""
        # x @ w + b. input_shape: [None, a], w: [a, b], output_shape: [None, b]
        self.kernel = self.add_weight(name = 'kernel',  # kernel is the w in wx + b
                                      shape = (input_shape[1], self.units),  # [a, b], i.e. the shape of w
                                      initializer = 'uniform',  # initialize from a uniform distribution
                                      trainable = True)  # w should be trainable
        self.bias = self.add_weight(name = 'bias',
                                    shape = (self.units, ),  # a single dimension: the b in [a, b], i.e. units
                                    initializer = 'zeros',  # bias is usually initialized to 0
                                    trainable = True)
        super(CustomizedDenseLayer, self).build(input_shape)  # call the parent class's build

    def call(self, x):
        """Forward computation."""
        return self.activation(x @ self.kernel + self.bias)  # simply xw + b followed by the activation

For a simple custom layer there is a lighter-weight option: wrap a formula in a Lambda layer. Taking softplus as an example:

# tf.nn.softplus: log(1 + e^x), which can be seen as a smooth version of relu.
# For large positive x it is almost y = x; for x < 0 the output stays between 0 and log(2).
customized_softplus = keras.layers.Lambda(lambda x : tf.nn.softplus(x))

# usage
model = keras.models.Sequential([
    CustomizedDenseLayer(30, activation='relu',
                         input_shape=x_train.shape[1:]),
    CustomizedDenseLayer(1),
    customized_softplus,  # together with the layer above, equivalent to either of the following
    # keras.layers.Dense(1, activation="softplus"),
    # keras.layers.Dense(1), keras.layers.Activation('softplus'),
])
model.summary()
model.compile(loss="mean_squared_error", optimizer="sgd")
callbacks = [keras.callbacks.EarlyStopping(
    patience=5, min_delta=1e-2)]

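8. tf.function and AutoGraph

To turn an ordinary Python function into a TensorFlow graph, use the following. The converted function is called in exactly the same way; the point is that it usually runs faster. This is a new feature in TF 2.0.

Method 1: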

# tf.function and auto-graph.
def scaled_elu(z, scale=1.0, alpha=1.0):  # first define a plain Python function
    # z >= 0 ? scale * z : scale * alpha * tf.nn.elu(z)
    is_positive = tf.greater_equal(z, 0.0)  # element-wise >=
    return scale * tf.where(is_positive, z, alpha * tf.nn.elu(z))  # tf.where plays the role of a ternary expression

scaled_elu_tf = tf.function(scaled_elu)  # convert to a graph function

# Both produce the same result, but the graph version is usually faster
print(scaled_elu(tf.constant([-3., -2.5])))
print(scaled_elu_tf(tf.constant([-3., -2.5])))

print(scaled_elu_tf.python_function is scaled_elu)  # the original Python function can be recovered from the graph function

# timing (%timeit is an IPython/Jupyter magic)
%timeit scaled_elu(tf.random.normal((1000, 1000)))
%timeit scaled_elu_tf(tf.random.normal((1000, 1000)))
# 24.8 ms ± 3.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# 23.1 ms ± 5.67 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

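The timings show the graph version running slightly faster here, although the gap is within the measurement noise; the advantage is typically larger on a GPU.

Method 2:

Use the @tf.function decorator, which I personally find more convenient.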

# 1 + 1/2 + 1/2^2 + ... + 1/2^n

@tf.function
def converge_to_2(n_iters):
    total = tf.constant(0.)  # running sum; a more descriptive name than x
    increment = tf.constant(1.)  # next term to add; a more descriptive name than y
    for _ in range(n_iters):
        total += increment
        increment /= 2.0
    return total

print(converge_to_2(20))
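
To make the conversion step more concrete, here is a small sketch showing that the Python body of a tf.function only runs while a graph is being traced, and that the traced graph is then reused. The function double and the list traces are hypothetical names introduced for this illustration.

```python
import tensorflow as tf

traces = []  # records every time the Python body actually runs

@tf.function
def double(x):
    traces.append(x.dtype.name)  # Python side effect: happens only during tracing
    return x * 2

r1 = double(tf.constant(1))    # first int32 call: traces a graph
r2 = double(tf.constant(2))    # same dtype and shape: reuses that graph, no new trace
r3 = double(tf.constant(1.0))  # float32 input: a second graph is traced

print(traces)
print(r1.numpy(), r2.numpy(), r3.numpy())
```

This is also why ordinary Python side effects such as print or list.append should not be relied on inside a tf.function.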

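Variables must be created outside a tf.function; the function body may not create a new variable each time it is called (this comes up all the time in neural networks).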

var = tf.Variable(1.)  # must be created outside the function

@tf.function
def add_21(var):
    return var.assign_add(21)  # equivalent to +=

print(add_21(var))
# tf.Tensor(22.0, shape=(), dtype=float32)
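
One common way to follow this rule is to create the variable once in a constructor and only update it inside the tf.function. The sketch below uses tf.Module for that; the class name Counter and its attribute total are hypothetical.

```python
import tensorflow as tf

class Counter(tf.Module):
    def __init__(self):
        super().__init__()
        self.total = tf.Variable(0.)  # created once, outside any tf.function

    @tf.function
    def add(self, amount):
        # Only updates the existing variable; nothing is created here.
        return self.total.assign_add(amount)

counter = Counter()
first = counter.add(tf.constant(21.))
second = counter.add(tf.constant(21.))
print(first.numpy(), second.numpy())
```

Keras layers follow the same pattern: weights are created in build and only read or updated in call.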

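9. Function signatures and graph structure

Python is dynamically typed, so a function with no constraints on its inputs is easy to call with the wrong kind of data. A function signature (input_signature) pins down the shape and dtype of the inputs a function accepts.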

@tf.function(input_signature=[tf.TensorSpec([None], tf.int32, name='x')])  # input shape, dtype, and the name x
def cube(z):
    return tf.pow(z, 3)


try:
    print(cube(tf.constant([1., 2., 3.])))  # fails, because the signature requires int32
except (ValueError, TypeError) as ex:  # TF 2.0 raises ValueError; TypeError is caught too in case the type differs across versions
    print(ex)
    # Python inputs incompatible with input_signature: inputs ((<tf.Tensor: id=1933, shape=(3,), dtype=float32, numpy=array([1., 2., 3.], dtype=float32)>,)), input_signature ((TensorSpec(shape=(None,), dtype=tf.int32, name='x'),))

print(cube(tf.constant([1, 2, 3])))  # this call is valid

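The purpose of tf.function in the previous section was to convert an ordinary Python function into a TensorFlow graph in order to speed it up.

get_concrete_function goes one step further: it takes a function that is already wrapped in tf.function, binds it to a specific input signature, and yields a graph structure that can be saved as a SavedModel.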

# @tf.function: Python function -> TF graph
# get_concrete_function: add an input signature -> can be exported as a SavedModel

cube_func_int32 = cube.get_concrete_function(
    tf.TensorSpec([None], tf.int32))

print(cube_func_int32 is cube.get_concrete_function(
    tf.TensorSpec([5], tf.int32)))
# True: both resolve to the same signature
print(cube_func_int32 is cube.get_concrete_function(
    tf.constant([1, 2, 3])))
# True: both resolve to the same signature

Once you have a ConcreteFunction object, you can get its graph and inspect the operations that graph contains.

cube_func_int32.graph  # the underlying graph (a FuncGraph)
# <tensorflow.python.framework.func_graph.FuncGraph at 0x7f8c5563ecd0>

cube_func_int32.graph.get_operations()  # list the operations in the graph
# [<tf.Operation 'x' type=Placeholder>,
# <tf.Operation 'Pow/y' type=Const>,
# <tf.Operation 'Pow' type=Pow>,
# <tf.Operation 'Identity' type=Identity>]

pow_op = cube_func_int32.graph.get_operations()[2]  # take the third item of the list above; printing it shows its definition as text
print(pow_op)
# name: "Pow"
# op: "Pow"
# input: "x"
# input: "Pow/y"
# attr {
#   key: "T"
#   value {
#     type: DT_INT32
#   }
# }

print(list(pow_op.inputs))  # inspect the inputs and outputs of this operation
print(list(pow_op.outputs))
# [<tf.Tensor 'x:0' shape=(None,) dtype=int32>, <tf.Tensor 'Pow/y:0' shape=() dtype=int32>]
# [<tf.Tensor 'Pow:0' shape=(None,) dtype=int32>]

cube_func_int32.graph.get_operation_by_name("x")  # look up an operation by name; Placeholder was dropped from the TF 2.0 API but still exists in graph definitions
# <tf.Operation 'x' type=Placeholder>
cube_func_int32.graph.get_tensor_by_name("x:0")  # a tensor can also be looked up by name
# <tf.Tensor 'x:0' shape=(None,) dtype=int32>

# view the structure of the graph
cube_func_int32.graph.as_graph_def()

These functions are generally useful in two situations:

1. When saving a model.

2. When loading a model and running inference with it.
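
As a hedged sketch of those two situations, the example below saves a function with an input signature as a SavedModel and loads it back for inference. The wrapper class CubeModule and the temporary export directory are assumptions made for illustration; the original post does not show this step.

```python
import tempfile
import tensorflow as tf

class CubeModule(tf.Module):
    # Same signature as the cube function above, attached to a module so it can be saved.
    @tf.function(input_signature=[tf.TensorSpec([None], tf.int32, name='x')])
    def cube(self, z):
        return tf.pow(z, 3)

module = CubeModule()
export_dir = tempfile.mkdtemp()  # throwaway directory for this example

# Saving: the graph traced for the declared signature is exported.
tf.saved_model.save(module, export_dir)

# Loading and inference: the restored object exposes the same function.
restored = tf.saved_model.load(export_dir)
result = restored.cube(tf.constant([1, 2, 3]))
print(result.numpy())
```

Without an input_signature, the function would have to be called at least once before saving so that there is a traced graph to export.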

10. tf.stack and tf.concat

Beginners often confuse these two APIs. Start with the official descriptions:

[tf.stack]: Packs the list of tensors in values into a tensor with rank one higher than each tensor in values, by packing them along the axis dimension. Given a list of length N of tensors of shape (A, B, C): if axis == 0 then the output tensor will have the shape (N, A, B, C). if axis == 1 then the output tensor will have the shape (A, N, B, C). Etc.

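The input is a list of tensors, which are combined into a single tensor whose rank is one higher; the size of the new dimension is the length of the list. Where that new dimension is inserted is decided by axis. (The official description is already quite clear.)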

[tf.concat]: Concatenates the list of tensors values along dimension axis. If values[i].shape = [D0, D1, ... Daxis(i), ...Dn], the concatenated result has shape [D0, D1, ... Raxis, ...Dn], where Raxis = sum(Daxis(i)).

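The input is again a list of tensors. The difference is that the result does not gain a dimension; only the shape changes. For example, concatenating two (2, 3) tensors along axis=0 gives one (4, 3) tensor. (The official description is already quite clear.)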
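
The difference is easiest to see by comparing output shapes. This is a minimal sketch using two (2, 3) tensors; the variable names are illustrative.

```python
import tensorflow as tf

a = tf.zeros([2, 3])
b = tf.ones([2, 3])

# stack adds a new dimension of size len([a, b]) == 2 at position `axis`.
stack0 = tf.stack([a, b], axis=0)
stack2 = tf.stack([a, b], axis=2)
print(stack0.shape)   # (2, 2, 3)
print(stack2.shape)   # (2, 3, 2)

# concat keeps the rank; only the size along `axis` grows.
concat0 = tf.concat([a, b], axis=0)
concat1 = tf.concat([a, b], axis=1)
print(concat0.shape)  # (4, 3)
print(concat1.shape)  # (2, 6)
```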

11. tf.einsum

https://www.tensorflow.org/api_docs/python/tf/einsum

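Einstein summation; see the documentation above for usage. It handles most matrix operations directly and is a real pleasure to use. For example, a row-wise dot product can be written as follows.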

a = tf.ones([512, 100])
b = tf.random.uniform([512, 100])

# row-wise dot product
c = tf.einsum('ij,ij->i', a, b)  # shape (512,)
c = tf.reduce_sum(a * b, axis=1)  # equivalent; with keepdims=True the shape would be (512, 1) instead
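
A few more einsum patterns, as a sketch checked against the ordinary TensorFlow ops. The small shapes and the variable names are chosen only for illustration.

```python
import numpy as np
import tensorflow as tf

a = tf.random.uniform([4, 3])
b = tf.random.uniform([3, 5])

matmul = tf.einsum('ij,jk->ik', a, b)   # sum over the shared index j: matrix multiplication
transposed = tf.einsum('ij->ji', a)     # swap the two indices: transpose
row_sums = tf.einsum('ij->i', a)        # drop j from the output: sum over each row

print(np.allclose(matmul.numpy(), (a @ b).numpy(), atol=1e-5))
print(np.allclose(transposed.numpy(), tf.transpose(a).numpy()))
print(np.allclose(row_sums.numpy(), tf.reduce_sum(a, axis=1).numpy(), atol=1e-5))
```

The rule of thumb is that any index missing from the right-hand side of -> is summed over.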

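Summary

In short, TensorFlow is somewhat more complex than PyTorch, but its official documentation is genuinely useful and the 2.0 release is much friendlier. The official documentation is the best place to start: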

https://www.tensorflow.org/api_docs/python/tf/all_symbols
