1. Keyword: ecpool_pool_sup

Log:

Supervisor: {<0.18803.107>,ecpool_pool_sup}. Context: start_error. Reason: {shutdown,{failed_to_start_child,{worker,1},{{badmatch,{error,einval}},[{eredis_client,connect_with_tcp,2,[{file,"eredis_client.erl"},{line,369}]},{eredis_client,init,1,[{file,"eredis_client.erl"},{line,98}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,417}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,385}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}}}. Offender: id=worker_sup,pid=undefined.

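Cause: A bug in the Redis driver. eredis_client crashes with a badmatch instead of handling the connection error, for example when DNS resolution fails.

Solution: Fixed in 4.4.3.

2. Keyword: Warning 1265

Log: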

Warning 1265: Data truncated for column 'temp' at row 1, Warning 1265: Data truncated for column 'hum' at row 1, in insert into , temp_hum(up_timestamp, client_id, temp, hum)  VALUES  , (FROM_UNIXTIME(1647761018341/1000), 'diaocheguanli1', 'undefined', 'undefined')

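Cause: Usually a data type mismatch, or a string that is longer than the column allows. In this log the literal string 'undefined' is written to the temp and hum columns, which suggests the message did not carry those fields.

Solution: Use the correct data type or increase the column length, and confirm that the message actually contains temp and hum.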
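
The 'undefined' values above suggest that the fields were missing before the SQL was built. A minimal Python sketch of a check that could run on the publishing side or in a test follows. The function name `numeric_fields_ok` and the sample payloads are hypothetical and are not part of EMQX.

```python
import json

def numeric_fields_ok(payload: str, fields=("temp", "hum")) -> bool:
    """Return True only if every listed field is present and numeric."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    for name in fields:
        value = data.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
    return True

# Hypothetical payloads: the first is complete, the second lacks "hum".
print(numeric_fields_ok('{"temp": 21.5, "hum": 40}'))
print(numeric_fields_ok('{"temp": 21.5}'))
```

A payload that fails this check would produce the 'undefined' placeholders seen in the log.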

3. Keyword: initial call: cowboy_clear

Log:

crasher:    initial call: cowboy_clear:connection_process/4    pid: <0.25409.710>    registered_name: []    exception error: no match of right hand side value {error,closed}      in function  cowboy_clear:connection_process/4 (cowboy_clear.erl, line 38)    ancestors: [<0.2302.0>,<0.2301.0>,ranch_sup,<0.2062.0>]    message_queue_len: 1    messages: [{handshake,'mqtt:ws:9083',ranch_tcp,#Port<0.9487546>,5000}]    links: [<0.2302.0>,#Port<0.9487546>]    dictionary: []    trap_exit: false    status: running    heap_size: 610    stack_size: 27    reductions: 263  neighbours:2022-03-13 13:27:35.816 [error] Ranch listener 'mqtt:ws:9083' had connection process started with cowboy_clear:start_link/4 at <0.25409.710> exit with reason: {{badmatch,{error,closed}},[{cowboy_clear,connection_process,4,[{file,"cowboy_clear.erl"},{line,38}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}

Cause: The WebSocket connection was closed during the handshake.

Solution: Check the network between the client and EMQX.

4. Keyword: Parse failed for frame_too_large

Log:

10.192.198.141:42568 [MQTT] , Parse failed for frame_too_large, [{emqx_frame,parse_remaining_len,5,[{file,"emqx_frame.erl"},{line,159}]},{emqx_connection,parse_incoming,3,[{file,"emqx_connection.erl"},{line,625}]},{emqx_connection,handle_msg,2,[{file,"emqx_connection.erl"},{line,618}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], Frame data:<<0,164,255,83,77,66,114,0,0,0,0,8,1,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,64,6,0,0,1,0,0,129,0,2,80,67,32,78,69,84,87,79,82,75,32,80,82,79,71,82,65,77,32,49,46,48,0,2,77,73,67,82,79,83,79,70,84,32,78,69,84,87,79,82,75,83,32,49,46,48,51,0,2,77,73,67,82,79,83,79,70,84,32,78,69,...>>

Cause: The packet is larger than the maximum packet size EMQX accepts.

Solution: Inspect the packets the client sends.
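
When inspecting the packet, it helps to decode the "Frame data" bytes. The Python sketch below uses bytes copied from the log entry above; the helper name `contains_smb_marker` is hypothetical. In this sample the data contains the SMB1 marker and a readable dialect string, so the sender was probably not an MQTT client at all (for example a scanner probing the port); that conclusion is an inference from the bytes, not something the log states.

```python
# Leading bytes of "Frame data" and one later run, copied from the log entry above.
head = bytes([0, 164, 255, 83, 77, 66, 114, 0])
dialect = bytes([80, 67, 32, 78, 69, 84, 87, 79, 82, 75, 32,
                 80, 82, 79, 71, 82, 65, 77, 32, 49, 46, 48])

def contains_smb_marker(data: bytes) -> bool:
    """SMB1 messages carry the 4-byte marker 0xFF 'S' 'M' 'B'."""
    return b"\xffSMB" in data

print(contains_smb_marker(head))   # True for this sample
print(dialect.decode("ascii"))     # PC NETWORK PROGRAM 1.0
```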

5. Keyword: esockd_acceptor_sup

Log:


Supervisor: {<0.2393.0>,esockd_acceptor_sup}. Context: shutdown_error. Reason: noproc. Offender: id=acceptor,nb_children=2.


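Cause: The process had already died (noproc).

Solution: More context is needed to pinpoint the cause; check the surrounding log.

6. Keyword: PoolWorker

Log: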


[PoolWorker] supervisee <0.12604.2> is force killed


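Cause: This is ecpool behavior: the worker was force-killed by the underlying library. It is expected, because a child process that hangs has to be killed.

Solution: No action needed.

7. Keyword: resource

Log: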


[Alarm Handler] Alarm resource/web_hook/resource:6777aed2/down is deactivated


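Cause: The webhook resource went down. This particular line reports that the down alarm was deactivated, which happens when the alarm is cleared.

Solution: Check the webhook URL.

8. Keyword: More than one channel found

Log: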


<<"zhtbhz_beidou_server_test">>@14.108.214.6:61987 [CM] More than one channel found: [<0.23782.682>,<0.23804.682>]


Cause: More than one session (channel) exists for the same ClientId.

Solution:

9. Keyword: Failed to discard

Log:


<<"android:7706416585007415566:7b6a3110779b89e3">>@112.97.83.197:37017 [CM] Failed to discard <59069.26551.1562>: {'EXIT',{{shutdown,tcp_closed},{gen_server,call,[<59069.26551.1562>,discard,infinity]}}}


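Cause: The client's connection was already closed (tcp_closed) when EMQX tried to discard its old session, so the discard call failed. A network failure or a forced disconnect can both cause this.

Solution: Check the network.

10. Keyword: Warning 1292

Log: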


Warning 1292: Truncated incorrect DOUBLE value: 'undefined', in INSERT INTO monitor_data_H621382_2022 (id, cod, nh3n, turb, ec, ph, dio, temp, vol, gps, monitor_dt, create_dt, state)  VALUES  ('H621382', ROUND(7.0591, 2), abs(ROUND(1.025, 2)), ROUND(10.93, 2), ROUND(0.71, 2), ROUND(8.22, 2), ROUND(4.2, 2), ROUND(13.67, 2),ROUND('undefined', 2), '0,0', FROM_UNIXTIME(1648004224), now(), '1')


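Cause: A value supplied in the MySQL statement does not match the type of its column. Here the string 'undefined' is passed to ROUND() for the vol column, where a number is expected.

Solution: Supply a value of the matching type for each column.

11. Keyword: failed_to_connect_all

Log: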


failed_to_connect_all: [{{"127.0.0.1",9092},{{{kpro_req,#Ref<60403.1985112988.647757825.240140>,api_versions,0,false,[]},closed},[{kpro_lib,send_and_recv_raw,4,[{file,"kpro_lib.erl"},{line,70}]},{kpro_lib,send_and_recv,5,[{file,"kpro_lib.erl"},{line,81}]},{kpro_connection,query_api_versions,4,[{file,"kpro_connection.erl"},{line,251}]},{kpro_connection,init_connection,2,[{file,"kpro_connection.erl"},{line,238}]},{kpro_connection,init,4,[{file,"kpro_connection.erl"},{line,175}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}}]


Cause: EMQX cannot connect to Kafka (here 127.0.0.1:9092).

Solution: Check the Kafka connection.

12. Keyword: http_connectivity failed

Log:


check http_connectivity failed: <<"http://119.3.223.224:7402">>


Cause: The URL of the HTTP request cannot be reached.

Solution: Check whether http://119.3.223.224:7402 is reachable with telnet.
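
Where telnet is not installed, the same reachability test can be sketched in Python with the standard `socket` module. The function name `tcp_reachable` and the 3-second default timeout are assumptions chosen for illustration.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

# Example call for the address in the log above:
# tcp_reachable("119.3.223.224", 7402)
```

Run it from the EMQX node itself, because reachability from another machine does not prove the node can connect.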

13. Keyword: Removing (timedout) connection

Log:


** Node 'emqx@10.13.67.20' not responding **, ** Removing (timedout) connection **


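Cause: The remote node is not responding, typically because EMQX on that node did not start or is not running normally.

Solution: Start EMQX on that node; if it is already running, check the network between the nodes.

14. Keyword: awaiting_rel

Log: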


TEST00-0000-000-C050@47.103.69.133:54588 [Channel] Dropped the qos2 packet 24 due to awaiting_rel is full.


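Cause: The awaiting_rel queue is full. It holds QoS 2 messages that EMQX has received from the client and that are still waiting for the client's PUBREL, so the client is not completing its QoS 2 flows fast enough.

Solution: Make the client answer each PUBREC with PUBREL promptly, lower its QoS 2 publish rate, or raise the max_awaiting_rel limit.

15. Keyword: login failed

Log: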


[27784,38451,22235,38498]@58.220.83.118:57540 [Channel] Client 沕镳啕镢 (Username: 'admin') login failed for password_error


Cause: Wrong password for username admin.

Solution: Use the correct password.

16. Keyword: message=channel_closed

Log:


message=channel_closed driver=tcp socket="#Port<0.741>" action=stopping


Cause: This is logged while the process is stopping.

Solution:

17. Keyword: mysql_conn

Log:


crasher: initial call: mysql_conn:init/1, pid: <0.3807.0>, registered_name: [], exit: {{1044,<<"42000">>,<<"Access denied for user 'MQTTTopci'@'%' to database 'mqttdata'">>},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,401}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [<0.3806.0>,<0.3805.0>,<0.3803.0>,ecpool_sup,<0.2530.0>], message_queue_len: 0, messages: [], links: [<0.3806.0>], dictionary: [], trap_exit: false, status: running, heap_size: 376, stack_size: 28, reductions: 345; neighbours:


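Cause: MySQL denied access to EMQX: user 'MQTTTopci' has no access to database 'mqttdata' (error 1044).

Solution: Check the MySQL user and privilege settings.

18. Keyword: Socket error

Log: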


<<"dbce405abb00eee7">>@218.82.138.78:52242 [MQTT] Socket error: einval


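Cause: Possibly a domain name resolution failure, possibly a network fault.

Solution: This error should not appear in 4.4. Check the client's domain name and network.

19. Keyword: license_quota

Log: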


[Alarm Handler] Alarm license_quota is activated, License: the number of connections exceeds 80%


Cause: The number of connections is too high; it exceeds 80% of the license limit.

Solution: Reduce the number of connections.

20. Keyword: HTTP request failed

Log:


NGROK_TEST00-0000-000-C007@47.103.69.133:33792 HTTP request failed path: <<"/iot/hub/v1alpha1/devices/mqtt-events">> error: {closed,"The connection was lost."}


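Cause: The HTTP authentication request was closed; the connection to the HTTP authentication server failed.

Solution: Check the configuration of the HTTP authentication server.

21. Keyword: high_system_memory_usage

Log: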


[Alarm Handler] Alarm high_system_memory_usage is activated, System memory usage is higher than 70.0%


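Cause: System memory usage exceeded the 70% threshold because EMQX is using too much of the operating system's memory.

Solution: Find out what is consuming the memory. In 4.4.1 and earlier this was a bug; it is fixed in 4.4.2.

22. Keyword: emqx shutdown for join

Log: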


[EMQ X] emqx shutdown for join


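Cause: EMQX shuts down temporarily while it joins a cluster.

Solution: No action needed; the temporary shutdown is part of joining a cluster.

23. Keyword: init_module_failure

Log: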


cluster_call error found, ResL: [{{{init_module_failure,'emqx@10.13.67.122'}, {{emqx_module_auth_redis,on_module_create},badarg}},[{erlang,list_to_integer,["6379 "],[]},{emqx_module_auth_redis,format_server,1,[{file,"emqx_module_auth_redis.erl"},{line,303}]},{emqx_module_auth_redis,format_servers,2,[{file,"emqx_module_auth_redis.erl"},{line,295}]},{emqx_module_auth_redis,on_module_create,2,[{file,"emqx_module_auth_redis.erl"},{line,225}]},{emqx_modules,'-init_module/4-fun-0-',4,[{file,"emqx_modules.erl"},{line,253}]},{emqx_modules,init_module,4,[{file,"emqx_modules.erl"},{line,253}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]},{{{init_module_failure,'emqx@10.13.67.20'},{{emqx_module_auth_redis,on_module_create},badarg}},[{erlang,list_to_integer,["6379 "],[]},{emqx_module_auth_redis,format_server,1,[{file,"emqx_module_auth_redis.erl"},{line,303}]},{emqx_module_auth_redis,format_servers,2,[{file,"emqx_module_auth_redis.erl"},{line,295}]},{emqx_module_auth_redis,on_module_create,2,[{file,"emqx_module_auth_redis.erl"},{line,225}]},{emqx_modules,'-init_module/4-fun-0-',4,[{file,"emqx_modules.erl"},{line,253}]},{emqx_modules,init_module,4,[{file,"emqx_modules.erl"},{line,253}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}]


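Cause: In the Redis authentication module, the port 6379 is followed by a space ("6379 "), so list_to_integer fails with badarg.

Solution: Remove the extra space and restart the module.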
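
A stray space like this is easy to miss in a dashboard field, so it can be worth checking the value before saving it. The Python sketch below shows a strict `host:port` check; the function name `parse_server` is hypothetical. Python's own `int()` silently accepts surrounding whitespace, unlike Erlang's `list_to_integer`, which is why the whitespace test is written out explicitly.

```python
from typing import Tuple

def parse_server(server: str) -> Tuple[str, int]:
    """Parse "host:port" strictly; any whitespace is rejected."""
    if any(ch.isspace() for ch in server):
        raise ValueError(f"whitespace in server address: {server!r}")
    host, sep, port = server.rpartition(":")
    if not sep or not host or not (port.isascii() and port.isdigit()):
        raise ValueError(f"expected host:port, got {server!r}")
    return host, int(port)

print(parse_server("127.0.0.1:6379"))
try:
    parse_server("127.0.0.1:6379 ")  # trailing space, as in the log above
except ValueError as err:
    print("rejected:", err)
```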

24. Keyword: emqx_license

Log:


[emqx_license] emqx_channel_conn size is undefined:{[undefined],['emqx@10.14.237.109']}

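Cause: Several nodes of the cluster were started at the same time. On startup a node fetches information from the other nodes; because another node was not ready yet, it returned undefined.

Solution: This is harmless as long as the message does not keep repeating.

25. Keyword: emqx_acl_mnesia_cli

Log: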


[Ctl] CMD acl is overidden by {emqx_acl_mnesia_cli,cli}


Cause: The acl command was overridden by the one that emqx_acl_mnesia_cli registers.

Solution:

26. Keyword: JT808 Conn

Log:


[JT808 Conn] Parser failed for {invalid_message,                                
[{emqx_jt808_frame,extract_message,3,                                  
[{file,"emqx_jt808_frame.erl"},{line,111}]},                                 
{emqx_jt808_frame,parse_main,3,                                  
[{file,"emqx_jt808_frame.erl"},{line,67}]},                                 
{emqx_jt808_connection,received,2,                                  
[{file,"emqx_jt808_connection.erl"},                                   {line,243}]},                                 
{gen_server,try_dispatch,4,                                  [{file,"gen_server.erl"},
{line,637}]},                                 {gen_server,handle_msg,6,                                  
[{file,"gen_server.erl"},{line,711}]},                                 
{proc_lib,init_p_do_apply,3,                                  [{file,"proc_lib.erl"},
{line,249}]}]}2022-03-13 10:26:16.216 [error] [JT808 Conn] Error data: 
<<126,0,2,0,0,2,0,0,0,0,21,0,3,50,126>>


Cause: The packet received from the device is not valid.

Solution: Check the packets the device sends.
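
One concrete check is the frame checksum. The Python sketch below assumes the standard JT/T 808 framing: frames are delimited by 0x7e, the byte pairs 0x7d 0x02 and 0x7d 0x01 are escapes for 0x7e and 0x7d, and the last byte before the closing delimiter is the XOR of all preceding header and body bytes. The function names are hypothetical. For the "Error data" in the log above the check fails (the bytes XOR to 22 but the frame carries 50); whether that is the exact reason EMQX reported invalid_message is an assumption.

```python
FLAG = 0x7E

def unescape(body: bytes) -> bytes:
    """Undo JT/T 808 escaping: 0x7d 0x02 -> 0x7e, then 0x7d 0x01 -> 0x7d."""
    return body.replace(b"\x7d\x02", b"\x7e").replace(b"\x7d\x01", b"\x7d")

def checksum_ok(frame: bytes) -> bool:
    """Verify the XOR checksum of one 0x7e-delimited frame."""
    if len(frame) < 4 or frame[0] != FLAG or frame[-1] != FLAG:
        return False
    inner = unescape(frame[1:-1])
    expected = 0
    for byte in inner[:-1]:
        expected ^= byte
    return expected == inner[-1]

# "Error data" copied from the log entry above.
error_data = bytes([126, 0, 2, 0, 0, 2, 0, 0, 0, 0, 21, 0, 3, 50, 126])
print(checksum_ok(error_data))  # False
```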

27. Keyword: Cannot subscribe

Log:


<<"20220314192301">>@61.164.57.169:62445 [Channel] Cannot subscribe YD_MICROWAVE_DATA_PRETREAT due to Not authorized.


Cause: The client has no permission to subscribe to this topic.

Solution: Grant the subscribe permission.

28. Keyword: TcpClosed producer

Log:


TcpClosed producer: <0.5964.0>


Cause: The producer's connection was closed. The log is too short to say more.

Solution:

29. Keyword: Redis

Log:


[Redis] Can't connect to Redis server: {{badmatch,{error,einval}},[{eredis_client,connect_with_tcp,2,[{file,"eredis_client.erl"},{line,369}]},{eredis_client,init,1,[{file,"eredis_client.erl"},{line,98}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,417}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,385}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}


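Cause: The Redis client library needs to be upgraded; the stack trace is the same eredis_client badmatch as in entry 1.

Solution:

30. Keyword: lock for too long

Log: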


kill <0.1401.4063> as it has held the lock for too long, resource: <<"dbce405abb00eee7">>


Cause: In ekka, a process held the lock on a resource for too long without releasing it, so the process was forcibly killed.

Solution:

31. Keyword: connect_to_remote_server

Log:


event=connect_to_remote_server peer="emqx@10.13.67.20" result=failure reason="econnrefused"


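Cause: The connection to the remote node was refused (econnrefused).

Solution: Check the connection to the remote node.

32. Keyword: maximum heap size reached

Log: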

Process:          <0.19563.3699> on node 'emqx@10.13.67.23', Context:          maximum heap size reached, 
Max Heap Size:    10485760, 
Total Heap Size:  23943461, 
Kill:             true, 
Error Logger:     true, GC Info:          [{old_heap_block_size,318187},{heap_block_size,12909534},
{mbuf_size,10715776},{recent_size,6388},{stack_size,33},{old_heap_size,22167},{heap_size,75077},{bin_vheap_size,77750},{bin_vheap_block_size,75110},{bin_old_vheap_size,54000},{bin_old_vheap_block_size,98314}]


Cause: The process reached its maximum heap size, usually because too many messages piled up.

Solution:

33. Keyword: init_action_failure

Log:


cluster_call error found, ResL: [{{init_action_failure,'emqx@10.14.69.101'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}},{{init_action_failure,'emqx@10.14.69.7'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}}]


Cause: Writing to Kafka failed because the Kafka topic was not found.

Solution: Create the topic in Kafka, or write to a topic that already exists in Kafka.

34. Keyword: mc_worker

Log:


Generic server <0.2926.84> terminating. Reason: tcp_closed. Last message: {tcp_closed,#Port<0.2554055>}. State: {state,#Port<0.2554055>,#{7 => #Fun<mc_worker_logic.0.117303875>},<<>>,{conn_state,unsafe,master,<<"admin">>,<<"admin">>},undefined,#Fun<mc_worker.0.76890317>,gen_tcp}.


Cause: The set name is not configured; it is a required field.

Solution: Set it, then write to MongoDB.

35. Keyword: metadata

Log:


Failed to get metadata, reason: unknown_topic_or_partition


Cause: Unknown topic or partition.

Solution: Check the Kafka configuration.

36. Keyword: Alarm resource

Log:


[Alarm Handler] Alarm resource/web_hook/resource:6777aed2/down is deactivated


Cause: The webhook resource is unavailable.

Solution: Check the webhook resource.

37. Keyword: action: discard

Log:


mqttdoctor_d4e9afe1@10.0.71.227:30041 action: discard, file: emqx_cm.erl, line: 317, mfa: {emqx_cm,kick_or_kill,3}, msg: session_kick_exception, pid: <0.2083.446>, reason: normal, stacktrace: [{emqx_ws_connection,call,3,[{file,"emqx_ws_connection.erl"},{line,167}]},{emqx_cm,kick_or_kill,3,[{file,"emqx_cm.erl"},{line,299}]},{lists,foreach,2,[{file,"lists.erl"},{line,1342}]},{emqx_cm,'-open_session/3-fun-0-',5,[{file,"emqx_cm.erl"},{line,219}]},{emqx_cm_locker,trans,3,[{file,"emqx_cm_locker.erl"},{line,46}]},{emqx_channel,process_connect,2,[{file,"emqx_channel.erl"},{line,492}]},{emqx_ws_connection,with_channel,3,[{file,"emqx_ws_connection.erl"},{line,575}]},{cowboy_websocket,handler_call,6,[{file,"cowboy_websocket.erl"},{line,487}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], stale_channel: undefined


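Cause: Clients with the same ClientId kicked each other off, or the client was kicked manually.

Solution: Check for repeated logins with the same ClientId.

38. Keyword: Dropped msg

Log: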


rcsReceiver@ba5eb70e-8c03-4563-bd6a-b4737552bb15@101.132.135.166:53926 [Session] Dropped msg due to mqueue is full: Message(Id=•Ú•Z•okªLJ•sЕo, QoS=1, Topic=/maintenance/GS142-0120-T9M-0000/status, From=<<"GS142-0120-T9M-0000">>, Flags=[], Headers=#{peerhost => {101,132,135,166}, properties => #{},proto_ver => 4,protocol => mqtt, username => <<"gaussian">>})


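Cause: The message queue (mqueue) is full because the subscriber cannot consume messages fast enough.

Solution: Increase the subscriber's consumption capacity.

39. Keyword: SYSMON

Log: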


[SYSMON] large_heap warning: pid = <0.15399.1611>, info: [{old_heap_block_size, 10695351}, {heap_block_size,8912793}, {mbuf_size,0}, {stack_size,49}, {old_heap_size,4158305}, {heap_size,4043698}], [{initial_call,{proc_lib,init_p,5}},{current_function,{ets,select_trap,1}},{registered_name,[]},{status,running},{message_queue_len,0},{group_leader,<0.2147.0>},{priority,normal},{trap_exit,false},{reductions,336485},{last_calls,false},{catchlevel,5},{trace,0},{suspending,[]},{sequential_trace_token,[]},{error_handler,error_handler},{memory,156866100},{total_heap_size,19608144},{heap_size,8912793},{stack_size,51},{min_heap_size,233}]


Cause: A warning that the process heap has grown very large, usually because too many messages piled up.

Solution:

40. Keyword: ACL http

Log:


evocationClient@SIM00-0000-000-F049@47.103.69.133:32814 [ACL http] Request ACL path /iot/v1alpha1/devices/authorize, error: {closed,"The connection was lost."}


Cause: The HTTP authentication server cannot be reached.

Solution: Check the connection to the authentication server.

41. Keyword: PUBREL

Log:


TEST00-0000-000-C001@47.103.69.133:56746 [Channel] The PUBREL PacketId 151 is not found.


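Cause: There are two possibilities. Either the client sent the PUBREL packet more than once, so EMQ X had already finished handling that PacketId; or the PacketId in the PUBREL is wrong and the client never sent a PUBLISH packet with it.

Solution: Check the client.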
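
Both cases come down to the broker's bookkeeping of PacketIds for incoming QoS 2 messages. The Python class below is a toy model of that bookkeeping, written only to illustrate the two cases; the class, its method names and its return strings are hypothetical and are not EMQ X's actual implementation.

```python
class Qos2Receiver:
    """Toy model of broker-side QoS 2 bookkeeping; not EMQ X's real code."""

    def __init__(self, max_awaiting_rel: int = 100):
        self.max_awaiting_rel = max_awaiting_rel
        self.awaiting_rel = set()  # PacketIds that still wait for PUBREL

    def on_publish(self, packet_id: int) -> str:
        if packet_id in self.awaiting_rel:
            return "inuse"
        if len(self.awaiting_rel) >= self.max_awaiting_rel:
            return "dropped"
        self.awaiting_rel.add(packet_id)
        return "pubrec"

    def on_pubrel(self, packet_id: int) -> str:
        if packet_id not in self.awaiting_rel:
            return "not found"
        self.awaiting_rel.remove(packet_id)
        return "pubcomp"

broker = Qos2Receiver()
print(broker.on_publish(151))  # pubrec
print(broker.on_pubrel(151))   # pubcomp
print(broker.on_pubrel(151))   # not found: duplicate PUBREL
print(broker.on_pubrel(7))     # not found: PacketId never published
```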

42. Keyword: PUBREC

Log:


69A386BD-62F9-45B6-B582-5A389380061C@10.0.71.228:54410 [Channel] The PUBREC PacketId 1 is inuse.

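Cause: This is a warning. The PacketId of a PUBREC packet received by EMQX is already in use. Either the same PUBREC was sent twice or the wrong Id was used; in most cases it is a duplicate send.

Solution:

43. Keyword: gen_rpc_client_sup

Log: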


crasher: initial call: gen_rpc_client:init/1, pid: <0.3835.0>, registered_name: 'gen_rpc.client.emqx@10.13.67.20/2614250', error: {{badmatch,{error,einval}},[{gen_rpc_driver_tcp,set_send_timeout,2,[{file,"gen_rpc_driver_tcp.erl"},{line,191}]},{gen_rpc_client,send_cast,4,[{file,"gen_rpc_client.erl"},{line,418}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,689}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,765}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [gen_rpc_client_sup,gen_rpc_sup,<0.2131.0>], message_queue_len: 23, messages: [{tcp_error,#Port<0.300>,econnreset},{tcp_closed,#Port<0.300>},{{cast,emqx_broker,dispatch,[<<"thing/product/mockdevicesn/osd">>,{delivery,<0.24144.1>,{message,<<0,5,218,42,74,192,37,28,39,137,1,0,94,80,0,124>>,0,<<"30bc90c0b567_bench_pub_677_2014658720">>,#{dup => false,retain => false},#{peerhost => {111,222,129,229},properties => #{},proto_ver => 5,protocol => mqtt,username => <<"mockuser">>},<<"thing/product/mockdevicesn/osd">>,<<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,1647250061141}}]},undefined},{{cast,emqx_broker,dispatch,[<<"thing/product/mockdevicesn/osd">>,{delivery,<0.24637.1>,{message,<<0,5,218,42,74,192,43,42,39,137,1,0,96,61,0,123>>,0,<<"30bc90c0b567_bench_pub_776_2348236373">>,#{dup => false,retain => false},#{peerhost => {111,222,129,229},properties => #{},proto_ver => 5,protocol => mqtt,username => <<"mockuser">>},<<"thing/product/mockdevicesn/osd">>


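Cause: The RPC connection to another node is no longer available.

Solution: Check what the other node reports: whether it restarted or was upgraded, or whether there is some other network problem.

44. Keyword: Cannot publish message

Log: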


<<"6afc4fb6-23ee-4010-9971-05e3c2aa5d9a">>@39.144.11.81:34937 [Channel] Cannot publish message to device//events due to Not authorized.


Cause: The client has no permission to publish to this topic.

Solution: Grant the permission.

45. Keyword: JT808 Proto

Log:


[JT808 Proto] Unexpected frame #{<<"body">> => #{},                                 <<"header">> =>                                     #{<<"encrypt">> => 0,<<"len">> => 0,                                       <<"msg_id">> => 3,<<"msg_sn">> => 9,                                       <<"phone">> => <<"014161736453">>}}2022-03-14 01:18:05.107 [error] ** Generic server <0.646.655> terminating ** Last message in was {inet_async,#Port<0.6944860>,0,                           {ok,<<126,0,3,0,0,1,65,97,115,100,83,0,9,111,126>>}}** When Server state == {state,esockd_transport,#Port<0.6944860>,                            {{117,132,196,139},22319},                            {not_detect_heading_0x7e,8192},                            {pstate,                                #{conn_mod => emqx_jt808_connection,                                  peercert => nossl,                                  peername => {{117,132,196,139},22319},                                  sendfun =>                                      {fun emqx_jt808_connection:send/3,                                       [esockd_transport,#Port<0.6944860>]},                                  sockname => {{10,65,141,225},8090},                                  socktype => tcp},                                #{clientid => undefined,is_bridge => false,                                  is_superuser => false,                                  mountpoint => undefined,                                  peerhost => {117,132,196,139},                                  protocol => jt808,sockport => 8090,                                  username => undefined,zone => undefined},                                undefined,undefined,undefined,                                {fun emqx_jt808_connection:send/3,                                 [esockd_transport,#Port<0.6944860>]},                                undefined,0,<<"jt808/%c/dn">>,                                <<"jt808/%c/up">>,                                
{auth,undefined,undefined,true},                                [0,                                 <<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                                   0,0>>],                                {inflight,128,{0,nil}},                                undefined,undefined},                            undefined,undefined,8192,true,30000}** Reason for termination ==** unexpected_frame2022-03-14 01:18:05.108 [error]   crasher:    initial call: emqx_jt808_connection:init/1    pid: <0.646.655>    registered_name: []    exception exit: unexpected_frame      in function  gen_server:handle_common_reply/8 (gen_server.erl, line 751)    ancestors: [<0.2670.0>,<0.2669.0>,esockd_sup,<0.2055.0>]    message_queue_len: 1    messages: [{inet_async,#Port<0.6944860>,1,{error,closed}}]    links: [<0.2670.0>]    dictionary: []    trap_exit: false    status: running    heap_size: 6772    stack_size: 27    reductions: 15706  neighbours


Cause: The frame could not be processed (unexpected_frame); the data coming from the device is incorrect.

Solution:

46. Keyword: rocketmq

Log:


State machine <0.5958.0> terminating. Reason: {error,econnrefused}. Stack: [{gen_statem,loop_state_callback_result,11,[{file,"gen_statem.erl"},{line,1360}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]. Last event: {info,connecting}. State: {idle,{state,<<"bridge_rocketmq:resource:d0de7c39_al_device_up">>,<<"al_device_up">>,<<"172.24.1.152:10911">>,undefined,5,17208,[{sndbuf,1048576}],{emqx_bridge_rocket_actions,rocket_callback,[<<"data_to_rocket_1646912587576490567">>]},100,#{},<<>>}}.


Cause: The connection was refused by the RocketMQ server.

Solution: Check the connection to RocketMQ.

47. Keyword: esockd_connection_sup

Log:


supervisor: 'esockd_connection_sup - <0.2390.0>', errorContext: connection_shutdown, reason: {badmatch,<<>>}, offender: [{pid,<0.10594.41>},{name,connection},{mfargs,{emqx_connection,start_link,[[{deflate_options,[]},{max_conn_rate,1000},{active_n,100},{zone,external},{proxy_address_header,<<>>},{proxy_port_header,<<>>},{supported_subprotocols,[]}]]}}]


Cause: The connection was closed before it had been fully established.

Solution: Check the connection.

48. Keyword: get_status_of_pool_workers

Log:


get_status_of_pool_workers failed: {throw,{timeout,<0.8230.420>}}, stacktrace: []


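Cause: While the process pool for testing the resource was being created, the resource did not respond within 1 second.

For example, with MongoDB the resource test runs the simplest possible query, and no reply arrived within one second.

Solution:

49. Keyword: Take action

Log: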


<<"android:7706416585007415566:7b6a3110779b89e3">>@112.97.83.197:36545 Take action <<"data_to_kafka_1646315366030644983">> failed, continue next action, reason: {error,timeout,[{wolff_producer,send_sync,3,[{file,"wolff_producer.erl"},{line,139}]},{emqx_bridge_kafka_actions,produce,6,[{file,"emqx_bridge_kafka_actions.erl"},{line,408}]},{emqx_rule_runtime,take_action,5,[{file,"emqx_rule_runtime.erl"},{line,236}]},{emqx_rule_runtime,'-take_actions/4-lc$^0/1-0-',4,[{file,"emqx_rule_runtime.erl"},{line,227}]},{emqx_rule_runtime,'-take_actions/4-lc$^0/1-0-',4,[{file,"emqx_rule_runtime.erl"},{line,228}]},{emqx_rule_runtime,do_apply_rule,2,[{file,"emqx_rule_runtime.erl"},{line,110}]},{emqx_rule_runtime,apply_rules,2,[{file,"emqx_rule_runtime.erl"},{line,52}]},{emqx_hooks,safe_execute,2,[{file,"emqx_hooks.erl"},{line,164}]}]}


Cause: The request to Kafka timed out.

Solution: Check that the configured parameters are correct and that the Kafka service is running normally.

50. Keyword: epgsql

Log:


Generic server <0.27348.0> terminating. Reason: nxdomain. Last message: {command,epgsql_cmd_connect,#{database => "dwhdb",ecpool_worker_id => 1,host => "matrixdb.c4.srv",password => #Fun<epgsql_cmd_connect.0.29916615>,port => 80,username => "mxadmin"}}. State: {state,undefined,undefined,<<>>,undefined,on_message,undefined,{[],[]},undefined,undefined,undefined,undefined,[],information_redacted,[],undefined,undefined,undefined,undefined,undefined}. Client <0.27347.0> stacktrace: [{logger_config,allow,2,[{file,"logger_config.erl"},{line,64}]},{proc_lib,crash_report,4,[{file,"proc_lib.erl"},{line,525}]},{proc_lib,exit_p,3,[{file,"proc_lib.erl"},{line,246}]}].


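Cause: The reason nxdomain means that the host name matrixdb.c4.srv could not be resolved. In addition, MatrixDB only accepts connections that pg_hba.conf allows.

Solution: Make the host name resolvable from the EMQX node, then configure pg_hba.conf.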
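
Name resolution can be tested on the EMQX node before touching the database settings. A minimal Python sketch using the standard `socket` module follows; the function name `resolves` is hypothetical.

```python
import socket

def resolves(host: str) -> bool:
    """Return True if the local resolver can turn host into an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved (the nxdomain case).
        return False

# Example call for the host in the log above:
# resolves("matrixdb.c4.srv")
```

If this returns False, fix DNS or the hosts file first; pg_hba.conf only matters once a TCP connection can be opened.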

51. Keyword: Quota exceeded

Log:


mqtt-benchmark-227@100.121.120.249:57499 [Channel] Cannot publish messages to thing/product/mockdevicesn/osd due to Quota exceeded.


Cause: The quota limit was exceeded.

Solution: Keep usage within the quota limit.

52. Keyword: WebHook Action

Log:


d3dea20d-f45f-4243-8bc0-ddd556472708@39.144.5.19:16876 [WebHook Action] HTTP request error: timeout


Cause: The request to the URL timed out.

Solution: Check whether the URL can be reached normally.

53. Keyword: cluster_call error found

Log:


cluster_call error found, ResL: [{{init_action_failure,'emqx@10.14.69.101'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}},{{init_action_failure,'emqx@10.14.69.7'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}}]


Cause: The Kafka topic was not found.

Solution: Create the corresponding topic.

54. Keyword: Auth http

Log:


CENSYS@167.94.146.58:45530 [Auth http] Deny connection from path: /emqx/mqtt/auth, response http code: 400


Cause: The HTTP authentication service rejected the connection (HTTP 400).

Solution: Check the HTTP service.

55. Keyword: PUBACK

Log:


76EE2AF1220D@121.69.9.98:50634 [Channel] The PUBACK PacketId 6 is not found.

Cause: PacketId 6 carried by the PUBACK packet was not found.

Solution:

56. Keyword: Parse failed for {badmatch

Log:


172.105.87.91:35516 [MQTT] , Parse failed for {badmatch,<<>>}, [{emqx_frame,parse_packet,3,[{file,"emqx_frame.erl"},{line,237}]},{emqx_frame,parse_frame,4,[{file,"emqx_frame.erl"},{line,201}]},{emqx_connection,parse_incoming,3,[{file,"emqx_connection.erl"},{line,625}]},{emqx_connection,handle_msg,2,[{file,"emqx_connection.erl"},{line,618}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], Frame data:<<22,3,0,0,83,1,0,0,79,3,0,63,71,215,247,186,44,238,234,178,96,126,243,0,253,130,123,185,213,150,200,119,155,230,196,219,60,61,219,111,239,16,110,0,0,40,0,22,0,19,0,10,0,102,0,5,0,4,0,101,0,100,0,99,0,98,0,97,0,96,0,21,0,18,0,9,0,20,0,17,0,8,0,6,0,3,1,0>>


Cause: The packet could not be parsed.

Solution: Check the packets being sent.
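
The first bytes of "Frame data" often show what the peer was really sending. The Python sketch below tests for the start of a TLS/SSL handshake record (content type 22, major version 3, then a handshake message of type 1, ClientHello); the function name is hypothetical. The sample in the log above matches this pattern, which suggests a TLS client or a scanner connected to a plain TCP listener. That reading is an inference from the bytes, not something the log states.

```python
def looks_like_tls_client_hello(data: bytes) -> bool:
    """True if data starts like a TLS/SSL record carrying a ClientHello.

    Record header: content type 22 (handshake), major version 3, minor
    version, 2-byte length. The handshake message then starts with type 1.
    """
    return len(data) >= 6 and data[0] == 22 and data[1] == 3 and data[5] == 1

# Leading bytes of "Frame data" copied from the log entry above.
frame = bytes([22, 3, 0, 0, 83, 1, 0, 0, 79, 3, 0])
print(looks_like_tls_client_hello(frame))  # True
```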

57. Keyword: Connection rejected

Log:


47.92.85.45:47136 Connection rejected due to max clients limitation


Cause: The maximum number of connections was reached.

Solution: Keep the number of connections within the limit.

58. Keyword: ehttpc

Log:


ehttpc: Received 'gun_data' message from unknown stream ref: #Ref<0.1986289746.2927886337.204390>


Cause: An internal logging mistake; the message is logged incorrectly.

Solution: Already fixed.

59. Keyword: JT808 Frame

Log:


[JT808 Frame] unknow message id 256, <<0,44,1,47,55,48,49,49,49,77,71,45,49,49,                                       0,0,0,48,48,48,48,48,48,48,1,212,193,66,                                       56,56,56,56,56>>2022-03-14 01:15:30.139 [error] [JT808 Conn] Parser failed for {invalid_message,                                [{emqx_jt808_frame,parse_message_body,2,                                  [{file,"emqx_jt808_frame.erl"},{line,219}]},                                 {emqx_jt808_frame,parse_message,1,                                  [{file,"emqx_jt808_frame.erl"},{line,121}]},                                 {emqx_jt808_frame,parse_main,3,                                  [{file,"emqx_jt808_frame.erl"},{line,69}]},                                 {emqx_jt808_connection,received,2,                                  [{file,"emqx_jt808_connection.erl"},                                   {line,243}]},                                 {gen_server,try_dispatch,4,                                  [{file,"gen_server.erl"},{line,637}]},                                 {gen_server,handle_msg,6,                                  [{file,"gen_server.erl"},{line,711}]},                                 {proc_lib,init_p_do_apply,3,                                  [{file,"proc_lib.erl"},{line,249}]}]}2022-03-14 01:15:30.139 [error] [JT808 Conn] Error data: <<126,1,0,0,33,1,65,97,115,100,83,0,0,0,44,1,47,55,48,                           49,49,49,77,71,45,49,49,0,0,0,48,48,48,48,48,48,48,                           1,212,193,66,56,56,56,56,56,8,126>>


Cause: The message does not conform to the JT808 protocol specification.

Solution:

60. Keyword: Unexpected sock_closed

Log:


<<"emq_4b74fa73b8454fe2a7bcbe5b2b4e8066">>@183.129.130.2:55474 [Channel] Unexpected sock_closed: tcp_closed


Cause: The socket was closed.

Solution: Investigate transport-level factors such as the network or the OS.