
Quantization, Projection, and Pruning

Compress a deep neural network by performing quantization, projection, or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

  • Pruning filters from convolution layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA® code from this pruned network.

  • Projecting layers by first performing principal component analysis (PCA) of the layer activations using a data set representative of the training data, and then applying linear projections to the layer's learnable parameters. Forward passes of a projected deep neural network are typically faster when you deploy the network to embedded hardware using library-free C/C++ code generation.

  • Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA, or HDL code from this quantized network.

    For C/C++ and CUDA code generation, the software generates code for a convolutional deep neural network by quantizing the weights, biases, and activations of the convolution layers to 8-bit scaled integer data types. Quantization is performed by providing the calibration result file produced by the calibrate function to the codegen (MATLAB Coder) command.

    Code generation does not support quantized deep neural networks produced by the quantize function.
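To make the scaled-integer scheme concrete, here is a minimal Python sketch, not the toolbox's MATLAB API: it quantizes values to int8 using a power-of-two scale chosen from the observed dynamic range. The scale-selection heuristic is an assumption for illustration only.

```python
import numpy as np

def quantize_int8_scaled(x):
    """Quantize float values to int8 with a power-of-two scale,
    mimicking a scaled (fixed-point) integer representation."""
    max_abs = np.max(np.abs(x))
    # Choose the smallest power-of-two scale so max_abs fits in [-128, 127].
    exponent = int(np.ceil(np.log2(max_abs / 127.0))) if max_abs > 0 else 0
    scale = 2.0 ** exponent
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.25, 0.031, 0.75])
q, scale = quantize_int8_scaled(w)
recon = q.astype(np.float32) * scale  # dequantized approximation of w
```

Each stored value costs 8 bits instead of 32, and the reconstruction error per element is bounded by half the scale, which is why calibrating the dynamic range on representative data matters.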

Functions

taylorPrunableNetwork: Network that can be pruned by using first-order Taylor approximation
forward: Compute deep learning network output for training
predict: Compute deep learning network output for inference
updatePrunables: Remove filters from prunable layers based on importance scores
updateScore: Compute and accumulate Taylor-based importance scores for pruning
dlnetwork: Deep learning network for custom training loops
compressNetworkUsingProjection: Compress neural network using projection
neuronPCA: Principal component analysis of neuron activations
dlquantizer: Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions: Options for quantizing a trained deep neural network
calibrate: Simulate and collect ranges of a deep neural network
quantize: Quantize deep neural network
validate: Quantize and validate a deep neural network
quantizationDetails: Display quantization details for a neural network
estimateNetworkMetrics: Estimate network metrics for specific layers of a neural network
equalizeLayers: Equalize layer parameters of deep neural network

App

Deep Network Quantizer: Quantize a deep neural network to 8-bit scaled integer data types

Topics

Pruning

  • Use parameter pruning and quantization to reduce network size.

  • This example shows how to reduce the size of a deep neural network using Taylor pruning.

  • This example shows how to reduce network size and increase inference speed by pruning convolutional filters in a you only look once (YOLO) v3 object detection network.
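The Taylor pruning approach referenced above can be sketched in a few lines of Python. This is a conceptual illustration rather than the toolbox's MATLAB workflow: the first-order Taylor estimate of a filter's importance is the magnitude of its activation times the loss gradient, accumulated over the data.

```python
import numpy as np

def taylor_importance(activations, gradients):
    """First-order Taylor estimate of the loss change from removing
    each filter: |sum of activation * gradient| per output channel.
    Both arrays have shape (batch, channels, height, width)."""
    return np.abs((activations * gradients).sum(axis=(0, 2, 3)))

rng = np.random.default_rng(0)
acts = rng.normal(size=(2, 4, 3, 3))   # toy activations for 4 filters
grads = rng.normal(size=(2, 4, 3, 3))  # toy loss gradients w.r.t. them
scores = taylor_importance(acts, grads)
least_important = int(np.argmin(scores))  # candidate filter to prune first
```

Filters with the lowest accumulated scores change the loss the least when removed, so pruning proceeds from the bottom of this ranking.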

Projection

  • This example shows how to compress a neural network using projection and principal component analysis.
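As a rough sketch of the projection idea in plain NumPy, again not the toolbox's MATLAB API: run PCA on representative layer inputs, then factor the weight matrix through the top principal directions, so one large matrix multiply becomes two smaller ones.

```python
import numpy as np

def project_layer(W, X, k):
    """Factor a fully connected layer's weights W (out, in) through the
    top-k principal directions of representative inputs X (samples, in).
    Returns A (out, k) and B (k, in) so that A @ (B @ x) approximates
    W @ x for inputs near the principal subspace."""
    Xc = X - X.mean(axis=0)                     # center, as PCA requires
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:k]                                  # top-k principal directions
    A = W @ B.T
    return A, B

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 16))  # inputs on an 8-D subspace
W = rng.normal(size=(8, 16))                              # original layer weights
A, B = project_layer(W, X, k=8)
```

The forward pass then uses the two smaller factors, which reduces multiply-accumulate counts whenever k is much smaller than the layer's input size, the situation that makes library-free embedded deployment faster.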

Deep Learning Quantization

  • Quantization of deep neural networks
    Understand effects of quantization and how to visualize dynamic ranges of network convolution layers.

  • Products required for the quantization of deep learning networks.

  • Supported datastores for quantization workflows.

Quantization for GPU Targets

  • (GPU Coder)
    Quantize and generate code for a pretrained convolutional neural network.

  • This example shows how to quantize the learnable parameters in the convolution layers of a deep learning neural network that has residual connections and has been trained for image classification with CIFAR-10 data.

  • This example shows how to generate CUDA® code for an SSD vehicle detector and a YOLO v2 vehicle detector that perform inference computations in 8-bit integers for the convolutional layers.

  • Quantize a convolutional neural network trained for semantic segmentation and generate CUDA code.

Quantization for FPGA Targets

  • (Deep Learning HDL Toolbox)
    Reduce the memory footprint of a deep neural network by quantizing the weights, biases, and activations of convolution layers to 8-bit scaled integer data types.
  • Classify Images on FPGA Using Quantized Neural Network (Deep Learning HDL Toolbox)
    This example shows how to use Deep Learning HDL Toolbox™ to deploy a quantized deep convolutional neural network (CNN) to an FPGA.
  • (Deep Learning HDL Toolbox)
    This example shows how to use Deep Learning HDL Toolbox™ to deploy a quantized GoogLeNet network to classify an image.

Quantization for CPU Targets

  • (MATLAB Coder)
    Quantize and generate code for a pretrained convolutional neural network.
  • (MATLAB Coder)
    Generate code for a deep learning network that performs inference computations in 8-bit integers.