Performance problem in a TensorFlow Serving Python client, and how to fix it

Problem description

While writing a TF Serving Python client for a gray-release (canary) algorithm, predicting 1000 samples took a surprising 1200 ms. Profiling showed that the prediction itself was fast, under 5 ms; about 95% of the time was spent in the tf.make_tensor_proto call.
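A minimal way to isolate where the time goes is to wrap each suspect call in a timer. This is a generic sketch; the `timed` helper and the stand-in workload are hypothetical, not part of the original client:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed time in milliseconds)."""
    start = time.time()
    result = fn(*args, **kwargs)
    return result, (time.time() - start) * 1000.0

# Stand-in workload; in the real client you would wrap
# tf.make_tensor_proto and STUB.Predict separately to see
# which one dominates the 1200 ms.
result, ms = timed(sum, range(1000))
```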

BTW: the Java API does not seem to have this problem.

```python
import time

import tensorflow as tf
from tensorflow_serving.apis import predict_pb2

# STUB is a PredictionServiceStub created elsewhere, e.g.
# prediction_service_pb2_grpc.PredictionServiceStub(grpc.insecure_channel(...))

def tf_serving(ids_list, vals_list):
    req = predict_pb2.PredictRequest()
    req.model_spec.name = 'deepfm'
    req.model_spec.signature_name = 'serving_default'
    start = time.time()
    # Almost all of the request-side time is spent in these two calls
    req.inputs['feat_ids'].CopyFrom(
        tf.make_tensor_proto(ids_list, shape=[len(ids_list), 145], dtype=tf.int64))
    req.inputs['feat_vals'].CopyFrom(
        tf.make_tensor_proto(vals_list, shape=[len(vals_list), 145], dtype=tf.float32))
    end = time.time()
    print(end - start)
    response = STUB.Predict(req, 10.0)
    return response.outputs['prob'].float_val
```

Solutions

Method 1:

This helps noticeably: for the same 1000 samples, data-preparation time drops to about 400 ms. (Note that tf.contrib only exists in TensorFlow 1.x; it was removed in 2.x.)

```python
# The original post had `from tf.contrib.util import ...`, which does not
# work; the importable package name is `tensorflow`.
from tensorflow.contrib.util import make_tensor_proto
...
req.inputs['feat_ids'].CopyFrom(
    make_tensor_proto(ids_list, shape=[len(ids_list), 145], dtype=tf.int64))
...
```

Method 2:

Import the underlying protobuf modules directly and build the TensorProto by hand. This is far more effective: data-preparation time drops to roughly 10 ms.

```python
import time

from tensorflow.core.framework import tensor_pb2, tensor_shape_pb2, types_pb2
from tensorflow_serving.apis import predict_pb2

def tf_serving(ids_list, vals_list):
    req = predict_pb2.PredictRequest()
    req.model_spec.name = 'deepfm'
    req.model_spec.signature_name = 'serving_default'

    start = time.time()
    # Build the TensorShapeProto once; both inputs share the [batch, 145] shape
    dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=len(ids_list)),
            tensor_shape_pb2.TensorShapeProto.Dim(size=145)]
    tensor_shape_proto = tensor_shape_pb2.TensorShapeProto(dim=dims)

    # Fill the repeated fields directly, skipping make_tensor_proto's
    # type checking and numpy conversion overhead
    tensor_proto_ids = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_INT64,
        tensor_shape=tensor_shape_proto)
    for ids in ids_list:
        tensor_proto_ids.int64_val.extend(ids)

    tensor_proto_vals = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_FLOAT,
        tensor_shape=tensor_shape_proto)
    for vals in vals_list:
        tensor_proto_vals.float_val.extend(vals)

    req.inputs['feat_ids'].CopyFrom(tensor_proto_ids)
    req.inputs['feat_vals'].CopyFrom(tensor_proto_vals)
    end = time.time()
    print(end - start)

    response = STUB.Predict(req, 10.0)
    return response.outputs['prob'].float_val
```
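If NumPy is available, there is an even faster variant (my own suggestion, not from the original post): flatten the whole batch in C and assign the raw bytes to the TensorProto's `tensor_content` field, which TensorFlow Serving accepts in place of the `int64_val` / `float_val` repeated fields. A sketch, using native byte order (little-endian on typical x86 servers):

```python
import numpy as np

def batch_to_tensor_content(batch, dtype):
    """Flatten a [rows, cols] batch into the raw bytes that can be
    assigned to TensorProto.tensor_content, avoiding per-element
    appends to a repeated field entirely."""
    return np.asarray(batch, dtype=dtype).tobytes()

content = batch_to_tensor_content([[1, 2, 3], [4, 5, 6]], np.int64)
# 2 rows * 3 cols * 8 bytes per int64 = 48 bytes
```

The TensorShapeProto is still built exactly as in Method 2; only the value payload changes.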