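Overview
Welcome to the end-to-end example for weight clustering, part of the TensorFlow Model Optimization Toolkit.
Other pages
For an introduction to what weight clustering is and how to decide whether you should use it (including what is supported), see the overview page.
To quickly find the APIs you need for your use case (beyond fully clustering a model with 16 clusters), see the comprehensive guide.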
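Contents
In this tutorial, you will:
- Train a tf.keras model for the MNIST dataset from scratch.
- Fine-tune the model by applying the weight clustering API and check the accuracy.
- Create 6x smaller TF and TFLite models from clustering.
- Create an 8x smaller TFLite model by combining weight clustering and post-training quantization.
- See the persistence of accuracy from TF to TFLite.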
Setup
You can run this Jupyter notebook in a local virtualenv or in Colab. For details on setting up dependencies, see the installation guide.
pip install -q tensorflow-model-optimization
import tensorflow as tf
from tensorflow import keras
import numpy as np
import tempfile
import zipfile
import os
Train a tf.keras model for MNIST without clustering
# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Normalize the input images so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images = test_images / 255.0
# Define the model architecture.
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])
# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
train_images,
train_labels,
validation_split=0.1,
epochs=10
)
Epoch 1/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.3002 - accuracy: 0.9167 - val_loss: 0.1301 - val_accuracy: 0.9625
Epoch 2/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.1251 - accuracy: 0.9639 - val_loss: 0.0870 - val_accuracy: 0.9773
Epoch 3/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0890 - accuracy: 0.9740 - val_loss: 0.0697 - val_accuracy: 0.9812
Epoch 4/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0725 - accuracy: 0.9786 - val_loss: 0.0643 - val_accuracy: 0.9828
Epoch 5/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0621 - accuracy: 0.9809 - val_loss: 0.0574 - val_accuracy: 0.9857
Epoch 6/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0549 - accuracy: 0.9837 - val_loss: 0.0580 - val_accuracy: 0.9852
Epoch 7/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0492 - accuracy: 0.9848 - val_loss: 0.0578 - val_accuracy: 0.9840
Epoch 8/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0440 - accuracy: 0.9869 - val_loss: 0.0614 - val_accuracy: 0.9833
Epoch 9/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0412 - accuracy: 0.9871 - val_loss: 0.0548 - val_accuracy: 0.9857
Epoch 10/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0375 - accuracy: 0.9887 - val_loss: 0.0577 - val_accuracy: 0.9855
<tensorflow.python.keras.callbacks.History at 0x7f3dcbe7d588>
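The `1688/1688` step count in the log above follows from the dataset size and the fit arguments. The sketch below reproduces it, assuming the standard 60,000-image MNIST training set, the Keras default `batch_size=32`, and that the validation fraction is removed before batching. The helper name `steps_per_epoch` is made up for this illustration and is not part of Keras.

```python
import math

def steps_per_epoch(num_samples, validation_split, batch_size):
    """Hypothetical helper: number of training batches per epoch."""
    # Samples held out for validation are not used for training steps.
    num_validation = int(round(num_samples * validation_split))
    num_train = num_samples - num_validation
    # A final, partially filled batch still counts as one step.
    return math.ceil(num_train / batch_size)

# 60,000 MNIST training images, validation_split=0.1, default batch size 32.
print(steps_per_epoch(60000, 0.1, 32))  # 1688
```

The same arithmetic with `batch_size=500` gives 108 steps, which matches the fine-tuning log later in this tutorial.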
Evaluate the baseline model and save it for later use
_, baseline_model_accuracy = model.evaluate(
test_images, test_labels, verbose=0)
print('Baseline test accuracy:', baseline_model_accuracy)
_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
tf.keras.models.save_model(model, keras_file, include_optimizer=False)
Baseline test accuracy: 0.9807999730110168
Saving model to: /tmp/tmpkenu8pu1.h5
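Fine-tune the pre-trained model with clustering
Apply the cluster_weights() API to the whole pre-trained model to show that it reduces the zipped model size effectively while keeping good accuracy. To find the best balance between accuracy and compression rate for your use case, see the per-layer example in the comprehensive guide.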
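To build intuition for what clustering a layer means, here is a NumPy-only sketch. It is not the toolkit's implementation: it runs plain k-means on a random matrix, starting from centroids spaced evenly between the minimum and maximum weight (the idea behind a linear centroid initialization). The function name `cluster_weights_sketch` and the `iterations` parameter are assumptions made for this illustration.

```python
import numpy as np

def cluster_weights_sketch(weights, number_of_clusters=16, iterations=10):
    """Toy weight clustering: linear centroid init plus a few k-means steps."""
    flat = weights.ravel()
    # Linear initialization: centroids evenly spaced over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), number_of_clusters)
    for _ in range(iterations):
        # Assign each weight to its nearest centroid.
        indices = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of the weights assigned to it.
        for k in range(number_of_clusters):
            members = flat[indices == k]
            if members.size:
                centroids[k] = members.mean()
    # Final assignment, then replace every weight by its centroid value.
    indices = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[indices].reshape(weights.shape), centroids

rng = np.random.default_rng(0)
dense_kernel = rng.normal(size=(64, 10)).astype(np.float32)  # stand-in weights
clustered_kernel, centroids = cluster_weights_sketch(dense_kernel)
print(np.unique(clustered_kernel).size)  # at most 16 distinct values remain
```

After clustering, the layer only needs the small centroid table plus one index per weight, which is what makes the stored model compressible.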
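Define the model and apply the clustering API
Before you pass the model to the clustering API, make sure it has been trained and shows acceptable accuracy.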
import tensorflow_model_optimization as tfmot
cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization
clustering_params = {
    'number_of_clusters': 16,
    'cluster_centroids_init': CentroidInitialization.LINEAR
}
# Cluster a whole model
clustered_model = cluster_weights(model, **clustering_params)
# Use smaller learning rate for fine-tuning clustered model
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)
clustered_model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=opt,
    metrics=['accuracy'])
clustered_model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
cluster_reshape (ClusterWeig (None, 28, 28, 1)         0
_________________________________________________________________
cluster_conv2d (ClusterWeigh (None, 26, 26, 12)        136
_________________________________________________________________
cluster_max_pooling2d (Clust (None, 13, 13, 12)        0
_________________________________________________________________
cluster_flatten (ClusterWeig (None, 2028)              0
_________________________________________________________________
cluster_dense (ClusterWeight (None, 10)                20306
=================================================================
Total params: 20,442
Trainable params: 54
Non-trainable params: 20,388
_________________________________________________________________
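The parameter counts in this summary can be reproduced with a little arithmetic. The accounting below is one reading that matches the printed numbers exactly; it is an interpretation rather than something the summary itself states: for each clustered layer, the 16 centroids and the bias are counted as trainable, and the kernel-shaped weights as non-trainable.

```python
number_of_clusters = 16

# Layer shapes taken from the model definition earlier in this tutorial.
conv_kernel = 3 * 3 * 1 * 12      # 3x3 kernel, 1 input channel, 12 filters
conv_bias = 12
dense_kernel = 13 * 13 * 12 * 10  # 2028 flattened features -> 10 classes
dense_bias = 10

# Assumed accounting: centroids and biases are trainable, kernels are not.
trainable = (number_of_clusters + conv_bias) + (number_of_clusters + dense_bias)
non_trainable = conv_kernel + dense_kernel

print(trainable)                  # 54
print(non_trainable)              # 20388
print(trainable + non_trainable)  # 20442
```

Under this reading, fine-tuning adjusts only 54 values, which is consistent with the short, low-learning-rate fine-tuning step that follows.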
Fine-tune the model and evaluate the accuracy against the baseline
Fine-tune the model with clustering for 1 epoch.
# Fine-tune model
clustered_model.fit(
    train_images,
    train_labels,
    batch_size=500,
    epochs=1,
    validation_split=0.1)
108/108 [==============================] - 0s 4ms/step - loss: 0.0547 - accuracy: 0.9807 - val_loss: 0.0804 - val_accuracy: 0.9760
<tensorflow.python.keras.callbacks.History at 0x7f3e4116ab70>
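For this example, the loss in test accuracy after clustering is small compared to the baseline (about 0.5 percentage points in the run shown below).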
_, clustered_model_accuracy = clustered_model.evaluate(
test_images, test_labels, verbose=0)
print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)
Baseline test accuracy: 0.9807999730110168
Clustered test accuracy: 0.9760000109672546
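Create 6x smaller models from clustering
Both strip_clustering and a standard compression algorithm (for example, gzip) are necessary to see the compression benefits of clustering.
First, create a compressible model for TensorFlow. Here, strip_clustering removes all variables that clustering needs only during training (for example, the tf.Variable objects that store the cluster centroids and the indices), which would otherwise add to the model size during inference.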
final_model = tfmot.clustering.keras.strip_clustering(clustered_model)
_, clustered_keras_file = tempfile.mkstemp('.h5')
print('Saving clustered model to: ', clustered_keras_file)
tf.keras.models.save_model(final_model, clustered_keras_file,
include_optimizer=False)
Saving clustered model to: /tmp/tmpsc3jb7v8.h5
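The saved clustered file is the same size on disk as the baseline; the benefit only shows up after compression, because a kernel with 16 distinct values is far more repetitive than a dense one. The sketch below illustrates this with random stand-in weights (not the real MNIST weights) and `zlib`, snapping each weight to the nearest of 16 evenly spaced values. It is an illustration of the principle, not a measurement of the actual model.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a trained float32 kernel with the Dense layer's shape.
kernel = rng.normal(size=(2028, 10)).astype(np.float32)

# Snap every weight to the nearest of 16 evenly spaced centroid values.
centroids = np.linspace(kernel.min(), kernel.max(), 16).astype(np.float32)
nearest = np.argmin(np.abs(kernel[..., None] - centroids), axis=-1)
clustered = centroids[nearest]

# Both arrays occupy the same number of raw bytes before compression.
original_bytes = len(zlib.compress(kernel.tobytes()))
clustered_bytes = len(zlib.compress(clustered.tobytes()))
print(kernel.nbytes, clustered.nbytes)
print(original_bytes, clustered_bytes)
```

The second line of output should show a much smaller compressed size for the clustered array, even though the raw sizes are identical.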
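Then, create a compressible model for TFLite. You can convert the clustered model to a format that runs on your target backend. TensorFlow Lite is one example that you can use to deploy to mobile devices.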
clustered_tflite_file = '/tmp/clustered_mnist.tflite'
converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
tflite_clustered_model = converter.convert()
with open(clustered_tflite_file, 'wb') as f:
    f.write(tflite_clustered_model)
print('Saved clustered TFLite model to:', clustered_tflite_file)
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Layer.updates (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: /tmp/tmp69qei5fh/assets
Saved clustered TFLite model to: /tmp/clustered_mnist.tflite
Define a helper function that actually compresses a model file and measures the compressed size.
def get_gzipped_model_size(file):
    # Returns the size in bytes of the model after ZIP (DEFLATE) compression.
    import os
    import zipfile
    _, zipped_file = tempfile.mkstemp('.zip')
    with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
        f.write(file)
    return os.path.getsize(zipped_file)
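The helper above writes a ZIP archive with DEFLATE compression rather than a literal `.gz` file. If you prefer to measure with the `gzip` module directly, a minimal alternative could look like the sketch below. The name `get_gzip_size` is hypothetical, and the byte counts will differ slightly from the ZIP-based helper because the container formats carry different headers.

```python
import gzip
import os
import tempfile

def get_gzip_size(path):
    """Return the size in bytes of the file's contents after gzip compression."""
    with open(path, 'rb') as f:
        data = f.read()
    # Compress in memory; no archive file is written to disk.
    return len(gzip.compress(data))

# Usage on a throwaway file with highly repetitive contents.
fd, demo_path = tempfile.mkstemp('.bin')
with os.fdopen(fd, 'wb') as f:
    f.write(b'\x00' * 10000)
demo_size = get_gzip_size(demo_path)
os.remove(demo_path)
print(demo_size < 10000)  # True
```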
Compare the sizes to see that clustering makes the models about 6x smaller.
print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered Keras model: %.2f bytes" % (get_gzipped_model_size(clustered_keras_file)))
print("Size of gzipped clustered TFlite model: %.2f bytes" % (get_gzipped_model_size(clustered_tflite_file)))
Size of gzipped baseline Keras model: 78076.00 bytes
Size of gzipped clustered Keras model: 12728.00 bytes
Size of gzipped clustered TFlite model: 12126.00 bytes
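To check the "6x" figure against the sizes printed above, divide the baseline size by each clustered size. The byte counts are copied from this run's output and will vary slightly between runs.

```python
# Sizes in bytes, copied from the output above.
baseline_keras = 78076
clustered_keras = 12728
clustered_tflite = 12126

keras_ratio = baseline_keras / clustered_keras
tflite_ratio = baseline_keras / clustered_tflite

print('Clustered Keras model:  %.2fx smaller' % keras_ratio)   # 6.13x
print('Clustered TFLite model: %.2fx smaller' % tflite_ratio)  # 6.44x
```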
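Create an 8x smaller TFLite model by combining weight clustering and post-training quantization
You can apply post-training quantization to the clustered model for additional benefits.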
converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
_, quantized_and_clustered_tflite_file = tempfile.mkstemp('.tflite')
with open(quantized_and_clustered_tflite_file, 'wb') as f:
    f.write(tflite_quant_model)
print('Saved quantized and clustered TFLite model to:', quantized_and_clustered_tflite_file)
print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered and quantized TFlite model: %.2f bytes" % (get_gzipped_model_size(quantized_and_clustered_tflite_file)))
INFO:tensorflow:Assets written to: /tmp/tmpmzv1zby7/assets
INFO:tensorflow:Assets written to: /tmp/tmpmzv1zby7/assets
Saved quantized and clustered TFLite model to: /tmp/tmp5yu2mobb.tflite
Size of gzipped baseline Keras model: 78076.00 bytes
Size of gzipped clustered and quantized TFlite model: 9237.00 bytes
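Quantization and clustering stack because they reduce different things: quantization shrinks each stored weight from 4 bytes to 1, while clustering keeps the number of distinct values small. The sketch below uses a simplified symmetric, per-tensor int8 scheme on a synthetic clustered kernel. This scheme is an assumption chosen for illustration and may not match what the TFLite converter does internally.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (simplified, assumed scheme)."""
    scale = float(np.abs(weights).max()) / 127.0
    quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return quantized, scale

# Synthetic clustered kernel: every entry is one of 16 centroid values.
rng = np.random.default_rng(0)
centroids = np.linspace(-0.5, 0.5, 16).astype(np.float32)
clustered = centroids[rng.integers(0, 16, size=(2028, 10))]

quantized, scale = quantize_int8(clustered)
print(clustered.nbytes, quantized.nbytes)  # 81120 20280
print(np.unique(quantized).size)           # still at most 16 distinct values

# Dequantize to see how much precision the 8-bit representation loses.
max_error = np.max(np.abs(quantized.astype(np.float32) * scale - clustered))
print(max_error <= scale / 2 + 1e-6)       # True
```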
See the persistence of accuracy from TF to TFLite
Define a helper function to evaluate the TFLite model on the test dataset.
def eval_model(interpreter):
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    # Run predictions on every image in the "test" dataset.
    prediction_digits = []
    for i, test_image in enumerate(test_images):
        if i % 1000 == 0:
            print('Evaluated on {n} results so far.'.format(n=i))
        # Pre-processing: add batch dimension and convert to float32 to match with
        # the model's input data format.
        test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
        interpreter.set_tensor(input_index, test_image)
        # Run inference.
        interpreter.invoke()
        # Post-processing: remove batch dimension and find the digit with the
        # highest score (the model outputs logits, not probabilities).
        output = interpreter.tensor(output_index)
        digit = np.argmax(output()[0])
        prediction_digits.append(digit)
    print('\n')
    # Compare prediction results with ground truth labels to calculate accuracy.
    prediction_digits = np.array(prediction_digits)
    accuracy = (prediction_digits == test_labels).mean()
    return accuracy
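The model in this tutorial ends with `Dense(10)` and is trained with `from_logits=True`, so the interpreter returns raw logits rather than probabilities. Taking `argmax` directly is still fine, because softmax preserves the ordering of the scores. The sketch below checks this on made-up logits for a single image.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical logits for one image over the 10 digit classes.
logits = np.array([[-1.2, 0.3, 4.1, 0.0, -0.7, 2.2, -3.0, 1.5, 0.9, -0.1]])
probabilities = softmax(logits)

# The predicted digit is the same with or without softmax.
print(np.argmax(logits[0]), np.argmax(probabilities[0]))  # 2 2
```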
Evaluate the clustered and quantized model to see that the accuracy from TensorFlow persists to the TFLite backend.
interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
interpreter.allocate_tensors()
test_accuracy = eval_model(interpreter)
print('Clustered and quantized TFLite test_accuracy:', test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)
Evaluated on 0 results so far.
Evaluated on 1000 results so far.
Evaluated on 2000 results so far.
Evaluated on 3000 results so far.
Evaluated on 4000 results so far.
Evaluated on 5000 results so far.
Evaluated on 6000 results so far.
Evaluated on 7000 results so far.
Evaluated on 8000 results so far.
Evaluated on 9000 results so far.
Clustered and quantized TFLite test_accuracy: 0.9759
Clustered TF test accuracy: 0.9760000109672546
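To put these accuracies in concrete terms, you can convert them to counts of correctly classified images, assuming the standard 10,000-image MNIST test set. The accuracy values are copied from the outputs in this tutorial.

```python
num_test_images = 10000  # size of the standard MNIST test set

baseline_accuracy = 0.9807999730110168
clustered_tf_accuracy = 0.9760000109672546
clustered_tflite_accuracy = 0.9759

def correct_count(accuracy):
    # Round to the nearest whole image to undo float32 rounding noise.
    return round(accuracy * num_test_images)

lost_to_clustering = correct_count(baseline_accuracy) - correct_count(clustered_tf_accuracy)
lost_to_tflite = correct_count(clustered_tf_accuracy) - correct_count(clustered_tflite_accuracy)

print(lost_to_clustering)  # 48
print(lost_to_tflite)      # 1
```

In this run, clustering costs 48 test images out of 10,000, and converting to a quantized TFLite model changes the outcome for only one more.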
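Conclusion
In this tutorial, you saw how to create clustered models with the TensorFlow Model Optimization Toolkit API. More specifically, you worked through an end-to-end example that creates an 8x smaller model for MNIST with minimal accuracy difference. We encourage you to try this new capability, which can be particularly important for deployment in resource-constrained environments.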