1. 什么是ANR?

ANR(Application Not Responding)即应用程序无响应。

2. ANR产生的原因是什么?

ANR产生的根本原因是APP阻塞了UI线程。在android系统中每个App只有一个UI线程,是在App创建时默认生成的,UI线程默认初始化了一个消息循环来处理UI消息,ANR往往就是处理UI消息超时了。那么UI消息来源有哪些呢?主要有两种来源:

2.1 来自于AMS的回调消息

在Android系统中,应用程序是有Android的四大组件组成,AMS负责对应用程序四大组件生命周期的管理,当AMS对应用程序组件的生命周期进行回调超过AMS定义的响应时间时,AMS就会报ANR。出现这种情况,一般是因为在这些组件的回调函数里面进行了耗时操作(如网络操作、SD卡文件操作、数据库操作、大量计算等),AMS对组件常见的回调函数及超时时间如下:

  • Activity: onCreate(), onResume(), onDestroy(), onKeyDown(), onClick()等,超时时间5s
  • Application: onCreate(), onTerminate()等,超时时间5s
  • Service: onCreate(), onStart(), onDestroy()等,超时时间20s
  • BroadcastReceiver:onReceiver(),前台APP广播超时时间是10s,后台App是60s

2.2 App自己的发出的消息

除了AMS对四大组件的回调消息运行在UI线程外,有些操作也是运行在UI线程的:

  • AsyncTask: onPreExecute(), onProgressUpdate(), onPostExecute(), onCancel()等,超时5s
  • Mainthread handler: handleMessage(), post*(runnable r)等,超时5s

3. 如何避免ANR?

知道了ANR产生的原因,规避方法就简单了,即不要在上述的两种情况下做耗时的操作。常见的耗时操作有:网络操作、SD卡文件操作、数据库操作、大量计算等。说起来容易,但做起来有时候会忘,所以Android从2.3开始给我们提供了StrictMode类(android.os.StrictMode),StrictMode提供了两个主要功能:一类是监控UI线程耗时的操作(磁盘读写、网络访问、自定义的运行速度慢的代码等),二是监控内存泄露(如Activity、SQLite、I/O流、大对象未释放等)。如果我们在开发过程中违反了它的监控策略,StrictMode就会在Logcat里面输出警告信息。

如何在App中开启StrictMode呢?一般在MainActiviyt的onCreate函数里面增加如下两行:

StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder().detectAll().penaltyLog().penaltyDialog().build());// 启用UI线程监控
StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder().detectAll().penaltyLog().build());  // 启用虚拟机内存监控

当然,StrictMode适合测试版本,正式发布版本要屏蔽,以免影响App的性能。下面是启用StrictMode后在UI线程写数据库违反策略的一个警告输出例子:

09-04 16:15:34.592: DEBUG/StrictMode(15883): StrictMode policy violation; ~duration=319 ms: android.os.StrictMode$StrictModeDiskWriteViolation: policy=31 violation=1
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.os.StrictMode$AndroidBlockGuardPolicy.onWriteToDisk(StrictMode.java:1041)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.database.sqlite.SQLiteStatement.acquireAndLock(SQLiteStatement.java:219)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.database.sqlite.SQLiteStatement.executeUpdateDelete(SQLiteStatement.java:83)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.database.sqlite.SQLiteDatabase.updateWithOnConflict(SQLiteDatabase.java:1829)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.database.sqlite.SQLiteDatabase.update(SQLiteDatabase.java:1780)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at com.test.tutorial.data.ListProvider.update(TutListProvider.java:188)
09-04 16:15:34.592: DEBUG/StrictMode(15883): at android.content.ContentProvider$Transport.update(ContentProvider.java:233)

有调用堆栈和UI线程的耗时时间。

4. 出现ANR时如何分析?

尽管我们进行了预防,但可能有时候有的同学还是没有注意引起了ANR,那么当出现ANR时,我们如何快速定位问题呢?请记住当ANR发生时,android会在两个地方给出重要提示,一个地方是logcat,另一个地方是\data\anr\traces.txt。当然bugreport的信息最全,两者都有。

4.1 Logcat信息

在发生ANR时,Logcat会输出ANR发生的时间、发生ANR前后CPU的使用情况等,如下一个ANR logcat输出信息的例子:

04-01 13:12:11.572 I/InputDispatcher( 220): Application is not responding:Window{2b263310com.android.email/com.android.email.activity.SplitScreenActivitypaused=false}.  5009.8ms since event, 5009.5ms since waitstarted
04-0113:12:11.572 I/WindowManager( 220): Input event dispatching timedout sending tocom.android.email/com.android.email.activity.SplitScreenActivity
04-01 13:12:14.123 I/Process(  220): Sending signal. PID: 21404 SIG: 3---发生ANR的时间和生成trace.txt的时间 ……
04-0113:12:15.872 E/ActivityManager(  220): ANR in com.android.email(com.android.email/.activity.SplitScreenActivity)
04-0113:12:15.872 E/ActivityManager(  220): Reason:keyDispatchingTimedOut
04-0113:12:15.872 E/ActivityManager(  220): Load: 8.68 / 8.37 / 8.53
04-0113:12:15.872 E/ActivityManager(  220): CPUusage from 4361ms to 699ms ago ----CPU在ANR发生前的使用情况
04-0113:12:15.872 E/ActivityManager(  220):   5.5%21404/com.android.email: 1.3% user + 4.1% kernel / faults: 10 minor
04-0113:12:15.872 E/ActivityManager(  220):   4.3%220/system_server: 2.7% user + 1.5% kernel / faults: 11 minor 2 major
04-0113:12:15.872 E/ActivityManager(  220):   0.9%52/spi_qsd.0: 0% user + 0.9% kernel
04-0113:12:15.872 E/ActivityManager(  220):   0.5%65/irq/170-cyttsp-: 0% user + 0.5% kernel
04-0113:12:15.872 E/ActivityManager(  220):   0.5%296/com.android.systemui: 0.5% user + 0% kernel
04-0113:12:15.872 E/ActivityManager(  220): 100%TOTAL: 4.8% user + 7.6% kernel + 87% iowait 04-0113:12:15.872 E/ActivityManager(  220): CPUusage from 3697ms to 4223ms later:-- ANR后CPU的使用量 04-0113:12:15.872 E/ActivityManager(  220):   25%21404/com.android.email: 25% user + 0% kernel / faults: 191 minor
04-0113:12:15.872 E/ActivityManager(  220):    16% 21603/__eas(par.hakan: 16% user + 0% kernel
04-0113:12:15.872 E/ActivityManager(  220):    7.2% 21406/GC: 7.2% user + 0% kernel
04-0113:12:15.872 E/ActivityManager(  220):    1.8% 21409/Compiler: 1.8% user + 0% kernel
04-0113:12:15.872 E/ActivityManager(  220):   5.5%220/system_server: 0% user + 5.5% kernel / faults: 1 minor
04-0113:12:15.872 E/ActivityManager(  220):    5.5% 263/InputDispatcher: 0% user + 5.5% kernel
04-0113:12:15.872 E/ActivityManager(  220): 32%TOTAL: 28% user + 3.7% kernel

从LOG可以看出ANR的类型及CPU的使用情况:

  • 如果CPU使用量接近100%,说明当前设备很忙,有可能是CPU饥饿导致了ANR
  • 如果CPU使用量很少,说明主线程被BLOCK了
  • 如果IOwait很高,说明可能是主线程在进行I/O操作造成的

4.2 Traces信息

当ANR发生时,android还会记录ANR信息到/data/anr/traces.txt文件,你可以通过多种方式拿到打开。这个文件只会记录最近发生的一次ANR的堆栈信息及各个线程的运行情况,一般通过堆栈信息我们可以方便定位到阻塞的位置,请看如下traces.txt文件的例子:

-----pid 21404 at 2014-04-01 13:12:14 -----                   ANR发生时间 Cmdline: com.android.email                                             发生ANR的APP包名 DALVIK THREADS:                                                         虚拟机线程信息
(mutexes: tll=0tsl=0 tscl=0 ghl=0 hwl=0 hwll=0)                 
"main" prio=5 tid=1 NATIVE                                                       主线程状态:在执行JNI调用
  | group="main" sCount=1 dsCount=0obj=0x2aad2248 self=0xcf70
  | sysTid=21404 nice=0 sched=0/0cgrp=[fopen-error:2] handle=1876218976
at android.os.MessageQueue.nativePollOnce(Native Method)                               调用堆栈
at android.os.MessageQueue.next(MessageQueue.java:119)
at android.os.Looper.loop(Looper.java:110
)
at android.app.ActivityThread.main(ActivityThread.java:3688)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:507)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:866)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:624)
at dalvik.system.NativeStart.main(Native Method)
…其他线程

Thread状态说明: ThreadState (defined at “dalvik/vm/thread.h “)
THREAD_UNDEFINED = -1, /* makes enum compatible with int32_t */
THREAD_ZOMBIE = 0, /* TERMINATED */
THREAD_RUNNING = 1, /* RUNNABLE or running now */
THREAD_TIMED_WAIT = 2, /* TIMED_WAITING in Object.wait() */
THREAD_MONITOR = 3, /* BLOCKED on a monitor */
THREAD_WAIT = 4, /* WAITING in Object.wait() */
THREAD_INITIALIZING= 5, /* allocated, not yet running */
THREAD_STARTING = 6, /* started, not yet on thread list */
THREAD_NATIVE = 7, /* off in a JNI native method */
THREAD_VMWAIT = 8, /* waiting on a VM resource */
THREAD_SUSPENDED = 9, /* suspended, usually by GC or debugger */

5. ANR案例

5.1 UI线程block导致ANR

MainHandler post runnable执行加载所有已安装app的信息在某些手机上ANR,原因是MainHandler post runnable运行在UI线程,而加载所有已安装app的信息会设计到大量解析计算,造成了UI回调超时。

----- pid 2848 at 2014-07-22 17:01:29 -----
Cmd line: com.tencent.qfilemanager

DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 NATIVE
  | group="main" sCount=1 dsCount=0 obj=0x40cc3538 self=0xabdf30
  | sysTid=2848 nice=0 sched=0/0 cgrp=default handle=1074267496
  | schedstat=( 0 0 0 ) utm=64 stm=58 core=1
  at android.content.res.AssetManager.getStringBlockCount(Native Method)
  at android.content.res.AssetManager.makeStringBlocks(AssetManager.java:257)
  at android.content.res.AssetManager.ensureStringBlocks(AssetManager.java:249)
  at android.content.res.Resources.<init>(Resources.java:206)
  at android.app.ActivityThread.getTopLevelResources(ActivityThread.java:1492)
  at android.app.ActivityThread.getTopLevelResources(ActivityThread.java:1519)
  at android.app.ApplicationPackageManager.getResourcesForApplication(ApplicationPackageManager.java:689)
  at android.app.ApplicationPackageManager.getText(ApplicationPackageManager.java:891)
  at android.content.pm.PackageItemInfo.loadLabel(PackageItemInfo.java:115)
  at com.tencent.qfilemanager.fileanalysis.PkgInfo.reloadPkgInfo(PkgInfo.java:50)
  at com.tencent.qfilemanager.fileanalysis.PkgCache.loadAllInstalledPackages(PkgCache.java:29)
  at com.tencent.qfilemanager.activity.FileManagerActivity$3.run(FileManagerActivity.java:209)
  at android.os.Handler.handleCallback(Handler.java:605)
  at android.os.Handler.dispatchMessage(Handler.java:92)
  at android.os.Looper.loop(Looper.java:137)
  at android.app.ActivityThread.main(ActivityThread.java:4456)
  at java.lang.reflect.Method.invokeNative(Native Method)
  at java.lang.reflect.Method.invoke(Method.java:511)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:787)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:554)
  at dalvik.system.NativeStart.main(Native Method)

5.2 UI线程高 iowait导致ANR

在AsyncTask的onPostExecute里面批量数据库操作,导致ANR。原因,AsyncTask的onPostExecute执行在UI线程,批量数据库操作是I/O操作,速度较慢,UI线程回调超时。

Process:com.android.email
Activity:com.android.email/.activity.MessageView
Subject:keyDispatchingTimedOut
CPU usage from 2550ms to -2814ms ago:
5%187/system_server: 3.5% user + 1.4% kernel / faults: 86 minor 20major
4.4% 1134/com.android.email: 0.7% user + 3.7% kernel /faults: 38 minor 19 major
4% 372/com.android.eventstream: 0.7%user + 3.3% kernel / faults: 6 minor
1.1% 272/com.android.phone:0.9% user + 0.1% kernel / faults: 33 minor
0.9%252/com.android.systemui: 0.9% user + 0% kernel
0%409/com.android.eventstream.telephonyplugin: 0% user + 0% kernel /faults: 2 minor
0.1% 632/com.android.devicemonitor: 0.1% user + 0%kernel
100%TOTAL: 6.9% user + 8.2% kernel +84%iowait
-----pid 1134 at 2010-12-17 17:46:51 -----
Cmd line:com.android.email
DALVIK THREADS:
(mutexes: tll=0 tsl=0tscl=0 ghl=0 hwl=0 hwll=0)
"main" prio=5 tid=1 WAIT
|group="main" sCount=1 dsCount=0 obj=0x2aaca180self=0xcf20
| sysTid=1134 nice=0 sched=0/0 cgrp=[fopen-error:2]handle=1876218976
at java.lang.Object.wait(Native Method)
-waiting on <0x2aaca218> (a java.lang.VMThread)
atjava.lang.Thread.parkFor(Thread.java:1424)

atandroid.database.sqlite.SQLiteDatabase.query(SQLiteDatabase.java:1271)
atcom.android.email.provider.EmailProvider.query(EmailProvider.java:1098)
atandroid.content.ContentProvider$Transport.query(ContentProvider.java:187)
atandroid.content.ContentResolver.query(ContentResolver.java:268)
atcom.android.email.provider.EmailContent$Message.restoreMessageWithId(EmailContent.java:648)
atcom.android.email.Controller.setMessageRead(Controller.java:658)
atcom.android.email.activity.MessageView.onMarkAsRead(MessageView.java:700)
atcom.android.email.activity.MessageView.access$2500(MessageView.java:98)
atcom.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1290)
atcom.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1255)
atandroid.os.AsyncTask.finish(AsyncTask.java:417)
atandroid.os.AsyncTask.access$300(AsyncTask.java:127)
atandroid.os.AsyncTask$InternalHandler.handleMessage(AsyncTask.java:429)
atandroid.os.Handler.dispatchMessage(Handler.java:99)
atandroid.os.Looper.loop(Looper.java:123)
atandroid.app.ActivityThread.main(ActivityThread.java:3652)
atjava.lang.reflect.Method.invokeNative(Native Method)
atjava.lang.reflect.Method.invoke(Method.java:507)
atcom.android.internal.os.ZygoteIn