### Various apps that use TensorFlow
- whatsthis app
- TensorFlowAndroidDemo: building this demo does not require setting up a Bazel build environment; Android Studio alone is enough.
- ID-Card_with_TensorFlow_Opencv_in_Android: ID-card digit recognition.
- PetOrNot: uses a neural network to tell whether something is a pet.
- Dango: this new app uses neural networks to choose the perfect emoji.
- BlindTool: from Google Play; helps blind people see the world by speaking the objects it recognizes, vibrating according to the confidence level.
### Two ways to deploy TensorFlow on mobile
##### From Zhihu: the two deployment approaches
There are two approaches.
One is the online approach:
the mobile device does some preliminary preprocessing and sends the data to a server, which runs the deep learning model. Many apps work this way today. The advantage is that deployment is relatively simple: wrap an off-the-shelf framework (Caffe, Theano, MXNet, Torch) and use it directly, and the server has enough horsepower to handle fairly large models. The drawback is that a network connection is required.
The other is the offline approach:
deploy a model that fits the hardware's capability. The advantage is that it runs offline. The drawbacks are also obvious: 1) constrained by the hardware, you may have to run a cut-down model, which costs some accuracy; 2) porting an existing framework to a mobile platform is troublesome, and stripping out all its dependencies is painful. MXNet has an Android app example (Leliana/WhatsThis · GitHub) and Torch 7 also has an Android port (soumith/torch-android · GitHub) worth referring to; of course, if your programming skills are up to it, you can write the network's forward-pass code yourself.
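If you do write the forward pass yourself, the core operations are small. Below is a minimal NumPy sketch of a 3x3 convolution (stride 1, pad 1) followed by ReLU, the basic building block of VGG-style networks; the layer shapes and the random toy input are illustrative assumptions, not taken from any of the apps above.

```python
import numpy as np

def conv3x3(x, w, b):
    """Naive 3x3 convolution, stride 1, zero padding 1.
    x: (H, W, C_in), w: (3, 3, C_in, C_out), b: (C_out,)."""
    H, W, _ = x.shape
    c_out = w.shape[-1]
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="constant")
    out = np.empty((H, W, c_out), dtype=np.float32)
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + 3, j:j + 3, :]               # (3, 3, C_in) window
            out[i, j, :] = np.tensordot(patch, w, axes=3) + b
    return out

def relu(x):
    return np.maximum(x, 0.0)

# toy forward pass: one conv + ReLU on a random 32x32 RGB "image"
x = np.random.rand(32, 32, 3).astype(np.float32)
w = 0.01 * np.random.randn(3, 3, 3, 16).astype(np.float32)
b = np.zeros(16, dtype=np.float32)
y = relu(conv3x3(x, w, b))
print(y.shape)  # (32, 32, 16)
```

A real deployment would of course add the remaining layer types (pooling, FC, softmax) and load trained weights exported from the training framework, but the structure stays this simple.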
### TensorFlow apps
Deploying a trained network on a phone with these TensorFlow example apps currently seems quite easy to do.
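Whichever app skeleton is used, the trained network first has to be exported as a single frozen GraphDef (.pb) that the mobile runtime can load. A minimal sketch, assuming the TensorFlow 1.x Python API; the two-op toy graph, the node name "output", and the file name are placeholders rather than anything from a specific demo.

```python
import tensorflow as tf
from tensorflow.python.framework import graph_util

# toy graph standing in for the trained network
x = tf.placeholder(tf.float32, [None, 224, 224, 3], name="input")
w = tf.Variable(tf.random_normal([3, 3, 3, 8]), name="w")
y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME", name="output")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())   # in practice: restore a trained checkpoint
    # fold variables into constants so the exported graph is self-contained
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["output"])
    with tf.gfile.GFile("frozen_model.pb", "wb") as f:
        f.write(frozen.SerializeToString())
```

The Android demos typically bundle a frozen .pb like this in the app's assets and run inference against it on the device.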
### VGG memory usage
VGGNet in detail. Let's break down the VGGNet in more detail as a case study. The whole VGGNet is composed of CONV layers that perform 3x3 convolutions with stride 1 and pad 1, and of POOL layers that perform 2x2 max pooling with stride 2 (and no padding). We can write out the size of the representation at each step of the processing and keep track of both the representation size and the total number of weights:
```
INPUT: [224x224x3] memory: 224*224*3=150K weights: 0
CONV3-64: [224x224x64] memory: 224*224*64=3.2M weights: (3*3*3)*64 = 1,728
CONV3-64: [224x224x64] memory: 224*224*64=3.2M weights: (3*3*64)*64 = 36,864
POOL2: [112x112x64] memory: 112*112*64=800K weights: 0
CONV3-128: [112x112x128] memory: 112*112*128=1.6M weights: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128] memory: 112*112*128=1.6M weights: (3*3*128)*128 = 147,456
POOL2: [56x56x128] memory: 56*56*128=400K weights: 0
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*256)*256 = 589,824
POOL2: [28x28x256] memory: 28*28*256=200K weights: 0
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*512)*512 = 2,359,296
POOL2: [14x14x512] memory: 14*14*512=100K weights: 0
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
POOL2: [7x7x512] memory: 7*7*512=25K weights: 0
FC: [1x1x4096] memory: 4096 weights: 7*7*512*4096 = 102,760,448
FC: [1x1x4096] memory: 4096 weights: 4096*4096 = 16,777,216
FC: [1x1x1000] memory: 1000 weights: 4096*1000 = 4,096,000
TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters
```
As is common with Convolutional Networks, notice that most of the memory (and also compute time) is used in the early CONV layers, and that most of the parameters are in the last FC layers. In this particular case, the first FC layer contains 100M weights, out of a total of 140M.
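The parameter totals follow directly from the layer shapes; here is a small Python check of the same arithmetic (weights only, biases ignored, matching the breakdown above).

```python
# VGG-16 parameter count: each 3x3 conv layer carries 3*3*C_in*C_out weights,
# each FC layer carries in_features*out_features.
conv_channels = [3, 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
conv_params = sum(3 * 3 * c_in * c_out
                  for c_in, c_out in zip(conv_channels[:-1], conv_channels[1:]))
fc_params = 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000
total = conv_params + fc_params
print("conv params: %d" % conv_params)        # 14,710,464
print("fc params:   %d" % fc_params)          # 123,633,664
print("total:       %.0fM" % (total / 1e6))   # ~138M, roughly 90% of it in the FC layers
```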
### Removing all of VGG's FC layers removes the need for a fixed input image size
by removing all FC layers, we output the feature maps from the last convolutional layer. Each entity along the feature maps can be considered as a “local” feature, and the length of the feature equals the number of feature maps. As discussed in [44], the requirement of fixed-size images comes only from FC layers, and convolutional layers do not require images to have a fixed size. Due to not involving FC layers in scenario (II), we can freely extract convolutional features for an input image with any size.
The gist: only the FC layers impose a fixed-size requirement on the image; the convolutional layers do not require a fixed-size input.
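This is already visible from the weight shapes: a convolution kernel's shape involves only the kernel size and channel counts, while an FC weight matrix hard-codes the flattened spatial size of its input. A small NumPy illustration with arbitrarily chosen sizes:

```python
import numpy as np

# Conv weights: (kH, kW, C_in, C_out) -- no spatial size baked in, so the
# same kernel applies to an input of any height and width.
w_conv = np.random.randn(3, 3, 3, 64)

def conv_output_hw(h, w, kernel=3, stride=1, pad=1):
    return ((h + 2 * pad - kernel) // stride + 1,
            (w + 2 * pad - kernel) // stride + 1)

for h, w in [(224, 224), (300, 500), (180, 240)]:
    print((h, w), "->", conv_output_hw(h, w), "x 64 feature maps")

# FC weights: (flattened_input, units). The 7*7*512 flattened length only
# holds for 224x224 inputs; any other image size breaks the matrix shape.
w_fc = np.random.randn(7 * 7 * 512, 4096)
print("FC weight matrix is fixed at", w_fc.shape)
```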