
Quantization, Projection, and Pruning

Compress a deep neural network by performing quantization, projection, or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

  • Pruning filters from convolution layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA® code from this pruned network.

  • Projecting layers by first performing principal component analysis (PCA) of the layer activations using a data set representative of the training data, and then applying linear projections to the layer's learnable parameters. Forward passes of a projected deep neural network are typically faster when you deploy the network to embedded hardware using library-free C/C++ code generation.

  • Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA, or HDL code from this quantized network.

    For C/C++ and CUDA code generation, the software generates code for a convolutional deep neural network by quantizing the weights, biases, and activations of the convolution layers to 8-bit scaled integer data types. Quantization is performed by providing the calibration result file produced by the calibrate function to the codegen (MATLAB Coder) command.

    Code generation does not support quantized deep neural networks produced by the quantize function.
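To make the scaled-integer scheme concrete, here is a minimal Python sketch, not the toolbox's MATLAB API: it quantizes values to int8 using a power-of-two scale chosen from the observed dynamic range. The scale-selection heuristic is an assumption for illustration only.

```python
import numpy as np

def quantize_int8_scaled(x):
    """Quantize float values to int8 with a power-of-two scale,
    mimicking a scaled (fixed-point) integer representation."""
    max_abs = np.max(np.abs(x))
    # Choose the smallest power-of-two scale so max_abs fits in [-128, 127].
    exponent = int(np.ceil(np.log2(max_abs / 127.0))) if max_abs > 0 else 0
    scale = 2.0 ** exponent
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.25, 0.031, 0.75])
q, scale = quantize_int8_scaled(w)
recon = q.astype(np.float32) * scale  # dequantized approximation of w
```

Each stored value costs 8 bits instead of 32, and the reconstruction error per element is bounded by half the scale, which is why calibrating the dynamic range on representative data matters.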

Functions

taylorPrunableNetwork: Network that can be pruned by using first-order Taylor approximation
forward: Compute deep learning network output for training
predict: Compute deep learning network output for inference
updatePrunables: Remove filters from prunable layers based on importance scores
updateScore: Compute and accumulate Taylor-based importance scores for pruning
dlnetwork: Deep learning network for custom training loops
compressNetworkUsingProjection: Compress neural network using projection
neuronPCA: Principal component analysis of neuron activations
dlquantizer: Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions: Options for quantizing a trained deep neural network
calibrate: Simulate and collect ranges of a deep neural network
quantize: Quantize deep neural network
validate: Quantize and validate a deep neural network
quantizationDetails: Display quantization details for a neural network
estimateNetworkMetrics: Estimate network metrics for specific layers of a neural network
equalizeLayers: Equalize layer parameters of deep neural network

App

Deep Network Quantizer: Quantize a deep neural network to 8-bit scaled integer data types

Topics

Pruning

  • Use parameter pruning and quantization to reduce network size.

  • This example shows how to reduce the size of a deep neural network using Taylor pruning.

  • This example shows how to reduce network size and increase inference speed by pruning convolutional filters in a you only look once (YOLO) v3 object detection network.
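The Taylor pruning approach referenced above can be sketched in a few lines of Python. This is a conceptual illustration rather than the toolbox's MATLAB workflow: the first-order Taylor estimate of a filter's importance is the magnitude of its activation times the loss gradient, accumulated over the data.

```python
import numpy as np

def taylor_importance(activations, gradients):
    """First-order Taylor estimate of the loss change from removing
    each filter: |sum of activation * gradient| per output channel.
    Both arrays have shape (batch, channels, height, width)."""
    return np.abs((activations * gradients).sum(axis=(0, 2, 3)))

rng = np.random.default_rng(0)
acts = rng.normal(size=(2, 4, 3, 3))   # toy activations for 4 filters
grads = rng.normal(size=(2, 4, 3, 3))  # toy loss gradients w.r.t. them
scores = taylor_importance(acts, grads)
least_important = int(np.argmin(scores))  # candidate filter to prune first
```

Filters with the lowest accumulated scores change the loss the least when removed, so pruning proceeds from the bottom of this ranking.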

Projection

  • This example shows how to compress a neural network using projection and principal component analysis.
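As a rough sketch of the projection idea in plain NumPy, again not the toolbox's MATLAB API: run PCA on representative layer inputs, then factor the weight matrix through the top principal directions, so one large matrix multiply becomes two smaller ones.

```python
import numpy as np

def project_layer(W, X, k):
    """Factor a fully connected layer's weights W (out, in) through the
    top-k principal directions of representative inputs X (samples, in).
    Returns A (out, k) and B (k, in) so that A @ (B @ x) approximates
    W @ x for inputs near the principal subspace."""
    Xc = X - X.mean(axis=0)                     # center, as PCA requires
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:k]                                  # top-k principal directions
    A = W @ B.T
    return A, B

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 16))  # inputs on an 8-D subspace
W = rng.normal(size=(8, 16))                              # original layer weights
A, B = project_layer(W, X, k=8)
```

The forward pass then uses the two smaller factors, which reduces multiply-accumulate counts whenever k is much smaller than the layer's input size, the situation that makes library-free embedded deployment faster.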

Deep Learning Quantization

  • Quantization of deep neural networks
    Understand effects of quantization and how to visualize dynamic ranges of network convolution layers.

  • Products required for the quantization of deep learning networks.

  • Supported datastores for quantization workflows.

Quantization for GPU Targets

  • (GPU Coder)
    Quantize and generate code for a pretrained convolutional neural network.

  • This example shows how to quantize the learnable parameters in the convolution layers of a deep learning neural network that has residual connections and has been trained for image classification with CIFAR-10 data.

  • This example shows how to generate CUDA® code for an SSD vehicle detector and a YOLO v2 vehicle detector that perform inference computations in 8-bit integers for the convolutional layers.

  • Quantize a convolutional neural network trained for semantic segmentation and generate CUDA code.

Quantization for FPGA Targets

  • (Deep Learning HDL Toolbox)
    Reduce the memory footprint of a deep neural network by quantizing the weights, biases, and activations of convolution layers to 8-bit scaled integer data types.
  • Classify Images on FPGA Using Quantized Neural Network (Deep Learning HDL Toolbox)
    This example shows how to use Deep Learning HDL Toolbox™ to deploy a quantized deep convolutional neural network (CNN) to an FPGA.
  • (Deep Learning HDL Toolbox)
    This example shows how to use Deep Learning HDL Toolbox™ to deploy a quantized GoogLeNet network to classify an image.

Quantization for CPU Targets

  • (MATLAB Coder)
    Quantize and generate code for a pretrained convolutional neural network.
  • (MATLAB Coder)
    Generate code for a deep learning network that performs inference computations in 8-bit integers.