1. Keyword: ecpool_pool_sup
Log:
Supervisor: {<0.18803.107>,ecpool_pool_sup}. Context: start_error. Reason: {shutdown,{failed_to_start_child,{worker,1},{{badmatch,{error,einval}},[{eredis_client,connect_with_tcp,2,[{file,"eredis_client.erl"},{line,369}]},{eredis_client,init,1,[{file,"eredis_client.erl"},{line,98}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,417}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,385}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}}}. Offender: id=worker_sup,pid=undefined.
Cause: A badmatch bug in eredis_client. The Redis driver crashes in situations such as a failed DNS resolution.
Solution: Fixed in 4.4.3.
2. Keyword: Warning 1265
Log:
Warning 1265: Data truncated for column 'temp' at row 1, Warning 1265: Data truncated for column 'hum' at row 1, in insert into , temp_hum(up_timestamp, client_id, temp, hum) VALUES , (FROM_UNIXTIME(1647761018341/1000), 'diaocheguanli1', 'undefined', 'undefined')
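Cause: Usually a data type mismatch, or a string column that is too short for the value. In this log the string 'undefined' is written to the temp and hum columns.
Solution: Change the column to the correct data type, or increase the string length.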
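One way to keep the literal string 'undefined' out of the INSERT is to validate the message payload before it is written. The following is a minimal Python sketch, not part of EMQ X: the helper name `missing_numeric_fields` is hypothetical, and it assumes the payload is a JSON object carrying the `temp` and `hum` fields seen in the log above.

```python
import json
from numbers import Real


def missing_numeric_fields(payload: str, fields=("temp", "hum")):
    """Return the names of fields that are absent or not numeric."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        # Not a JSON object at all: every expected field is missing.
        return list(fields)
    bad = []
    for name in fields:
        value = data.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            bad.append(name)
    return bad
```

A payload for which this returns a non-empty list would produce 'undefined' for those columns and could be dropped or routed elsewhere instead of being inserted.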
3. Keyword: initial call: cowboy_clear
Log:
crasher: initial call: cowboy_clear:connection_process/4 pid: <0.25409.710> registered_name: [] exception error: no match of right hand side value {error,closed} in function cowboy_clear:connection_process/4 (cowboy_clear.erl, line 38) ancestors: [<0.2302.0>,<0.2301.0>,ranch_sup,<0.2062.0>] message_queue_len: 1 messages: [{handshake,'mqtt:ws:9083',ranch_tcp,#Port<0.9487546>,5000}] links: [<0.2302.0>,#Port<0.9487546>] dictionary: [] trap_exit: false status: running heap_size: 610 stack_size: 27 reductions: 263 neighbours:2022-03-13 13:27:35.816 [error] Ranch listener 'mqtt:ws:9083' had connection process started with cowboy_clear:start_link/4 at <0.25409.710> exit with reason: {{badmatch,{error,closed}},[{cowboy_clear,connection_process,4,[{file,"cowboy_clear.erl"},{line,38}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
Cause: The WebSocket connection was interrupted during the handshake.
Solution: Check for network problems.
4. Keyword: Parse failed for frame_too_large
Log:
10.192.198.141:42568 [MQTT] , Parse failed for frame_too_large, [{emqx_frame,parse_remaining_len,5,[{file,"emqx_frame.erl"},{line,159}]},{emqx_connection,parse_incoming,3,[{file,"emqx_connection.erl"},{line,625}]},{emqx_connection,handle_msg,2,[{file,"emqx_connection.erl"},{line,618}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], Frame data:<<0,164,255,83,77,66,114,0,0,0,0,8,1,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,64,6,0,0,1,0,0,129,0,2,80,67,32,78,69,84,87,79,82,75,32,80,82,79,71,82,65,77,32,49,46,48,0,2,77,73,67,82,79,83,79,70,84,32,78,69,84,87,79,82,75,83,32,49,46,48,51,0,2,77,73,67,82,79,83,79,70,84,32,78,69,...>>
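Cause: The packet is too long; its declared length exceeds the allowed maximum packet size.
Solution: Check the packets that are being sent.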
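When checking the sender, it can help to decode the "Frame data" bytes from the log. The Python sketch below uses only the first bytes of the frame shown above. It decodes the MQTT remaining-length field (a variable byte integer) and looks for the 4-byte marker 0xFF 'S' 'M' 'B'. Both helper names are illustrative. For this log the result suggests that the peer was speaking SMB rather than MQTT, for example a port scanner; that reading is an inference from the bytes, not something the log states.

```python
# First bytes of the "Frame data" from the log entry above.
FRAME = bytes([0, 164, 255, 83, 77, 66, 114, 0, 0, 0, 0, 8, 1, 64])


def decode_remaining_length(data: bytes) -> int:
    """Decode an MQTT variable byte integer (at most 4 bytes)."""
    value, multiplier = 0, 1
    for byte in data[:4]:
        value += (byte & 0x7F) * multiplier
        if byte & 0x80 == 0:  # continuation bit clear: last length byte
            return value
        multiplier *= 128
    raise ValueError("malformed remaining length")


def looks_like_smb(frame: bytes) -> bool:
    """True if the marker 0xFF 'S' 'M' 'B' appears near the start."""
    return b"\xffSMB" in frame[:8]


if __name__ == "__main__":
    # Byte 0 is the fixed header; the length field starts at byte 1.
    print(decode_remaining_length(FRAME[1:]))
    print(looks_like_smb(FRAME))
```

Read as MQTT, the length field declares a packet of more than 1.3 million bytes, which is why the parser reports frame_too_large.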
5. Keyword: esockd_acceptor_sup
Log:
Supervisor: {<0.2393.0>,esockd_acceptor_sup}. Context: shutdown_error. Reason: noproc. Offender: id=acceptor,nb_children=2.
Cause: The process died.
Solution: More context is needed to locate the problem.
6. Keyword: PoolWorker
Log:
[PoolWorker] supervisee <0.12604.2> is force killed
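Cause: This is ecpool behavior: the worker was forcibly shut down by the underlying library. It is expected; a child process that hangs is force killed.
Solution:
7. Keyword: resource
Log: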
[Alarm Handler] Alarm resource/web_hook/resource:6777aed2/down is deactivated
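Cause: The webhook resource had gone down. This particular line records the corresponding "down" alarm being deactivated.
Solution: Check the webhook URL.
8. Keyword: More than one channel found
Log: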
<<"zhtbhz_beidou_server_test">>@14.108.214.6:61987 [CM] More than one channel found: [<0.23782.682>,<0.23804.682>]
Cause: More than one session (channel) exists for the same ClientId.
Solution:
9. Keyword: Failed to discard
Log:
<<"android:7706416585007415566:7b6a3110779b89e3">>@112.97.83.197:37017 [CM] Failed to discard <59069.26551.1562>: {'EXIT',{{shutdown,tcp_closed},{gen_server,call,[<59069.26551.1562>,discard,infinity]}}}
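Cause: The client's connection was closed (tcp_closed), so discarding the old channel failed. This may be caused by the network, or by a forced disconnect.
Solution: Check for network problems.
10. Keyword: Warning 1292
Log: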
Warning 1292: Truncated incorrect DOUBLE value: 'undefined', in INSERT INTO monitor_data_H621382_2022 (id, cod, nh3n, turb, ec, ph, dio, temp, vol, gps, monitor_dt, create_dt, state) VALUES ('H621382', ROUND(7.0591, 2), abs(ROUND(1.025, 2)), ROUND(10.93, 2), ROUND(0.71, 2), ROUND(8.22, 2), ROUND(4.2, 2), ROUND(13.67, 2),ROUND('undefined', 2), '0,0', FROM_UNIXTIME(1648004224), now(), '1')
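Cause: A value supplied in the MySQL statement does not match the type of the column in the table. Here the string 'undefined' is passed where a DOUBLE is expected.
Solution: Supply a value that matches the column type.
11. Keyword: failed_to_connect_all
Log: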
failed_to_connect_all: [{{"127.0.0.1",9092},{{{kpro_req,#Ref<60403.1985112988.647757825.240140>,api_versions,0,false,[]},closed},[{kpro_lib,send_and_recv_raw,4,[{file,"kpro_lib.erl"},{line,70}]},{kpro_lib,send_and_recv,5,[{file,"kpro_lib.erl"},{line,81}]},{kpro_connection,query_api_versions,4,[{file,"kpro_connection.erl"},{line,251}]},{kpro_connection,init_connection,2,[{file,"kpro_connection.erl"},{line,238}]},{kpro_connection,init,4,[{file,"kpro_connection.erl"},{line,175}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}}]
Cause: Cannot connect to Kafka.
Solution: Check the Kafka connection.
12. Keyword: http_connectivity failed
Log:
check http_connectivity failed: <<"http://119.3.223.224:7402">>
Cause: The URL of the HTTP request cannot be reached.
Solution: Check whether http://119.3.223.224:7402 can be reached with telnet.
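Where telnet is not installed, the same reachability check can be sketched in Python with the standard socket module. The function name `tcp_reachable` and the 3-second default timeout are illustrative choices, not EMQ X settings.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

For the URL in the log above, the call would be `tcp_reachable("119.3.223.224", 7402)`, run from the host where EMQ X is deployed. A True result only shows that the port accepts connections, not that the HTTP service behind it is healthy.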
13. Keyword: Removing (timedout) connection
Log:
** Node 'emqx@10.13.67.20' not responding **, ** Removing (timedout) connection **
Cause: EMQ on the node named in the log did not start properly, so the node is not responding.
Solution: Start EMQ.
14. Keyword: awaiting_rel
Log:
TEST00-0000-000-C050@47.103.69.133:54588 [Channel] Dropped the qos2 packet 24 due to awaiting_rel is full.
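Cause: The QoS 2 awaiting_rel queue is full: too many QoS 2 messages are still waiting for PUBREL, because the subscriber side cannot receive messages fast enough.
Solution: Increase the consumption capacity of the subscribers.
15. Keyword: login failed
Log: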
[27784,38451,22235,38498]@58.220.83.118:57540 [Channel] Client 沕镳啕镢 (Username: 'admin') login failed for password_error
Cause: Wrong password for the username admin.
Solution: Use the correct password.
16. Keyword: message=channel_closed
Log:
message=channel_closed driver=tcp socket="#Port<0.741>" action=stopping
Cause: This is logged while the process is stopping.
Solution:
17. Keyword: mysql_conn
Log:
crasher: initial call: mysql_conn:init/1, pid: <0.3807.0>, registered_name: [], exit: {{1044,<<"42000">>,<<"Access denied for user 'MQTTTopci'@'%' to database 'mqttdata'">>},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,401}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [<0.3806.0>,<0.3805.0>,<0.3803.0>,ecpool_sup,<0.2530.0>], message_queue_len: 0, messages: [], links: [<0.3806.0>], dictionary: [], trap_exit: false, status: running, heap_size: 376, stack_size: 28, reductions: 345; neighbours:
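Cause: MySQL denied access to emqx (error 1044: the user in the log has no access to database 'mqttdata').
Solution: Check the MySQL settings.
18. Keyword: Socket error
Log: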
<<"dbce405abb00eee7">>@218.82.138.78:52242 [MQTT] Socket error: einval
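Cause: Possibly a failed domain name resolution, or a network fault.
Solution: This error should not occur in version 4.4. Check the client's domain name and network.
19. Keyword: license_quota
Log: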
[Alarm Handler] Alarm license_quota is activated, License: the number of connections exceeds 80%
Cause: The number of connections is too high; it exceeds 80% of the license limit.
Solution: Reduce the number of connections.
20. Keyword: HTTP request failed
Log:
NGROK_TEST00-0000-000-C007@47.103.69.133:33792 HTTP request failed path: <<"/iot/hub/v1alpha1/devices/mqtt-events">> error: {closed,"The connection was lost."}
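Cause: The HTTP authentication request was closed; the connection to the HTTP authentication server failed.
Solution: Check the configuration of the HTTP authentication server.
21. Keyword: high_system_memory_usage
Log: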
[Alarm Handler] Alarm high_system_memory_usage is activated, System memory usage is higher than 70.0%
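Cause: EMQ is using too much of the operating system's memory.
Solution: Find out what is consuming the memory. Before 4.4.1 this was a bug; it is fixed in 4.4.2.
22. Keyword: emqx shutdown for join
Log: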
[EMQ X] emqx shutdown for join
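Cause: EMQ is shut down temporarily while it joins a cluster.
Solution: EMQ is shut down temporarily while it joins a cluster; this is part of the join.
23. Keyword: init_module_failure
Log: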
cluster_call error found, ResL: [{{{init_module_failure,'emqx@10.13.67.122'}, {{emqx_module_auth_redis,on_module_create},badarg}},[{erlang,list_to_integer,["6379 "],[]},{emqx_module_auth_redis,format_server,1,[{file,"emqx_module_auth_redis.erl"},{line,303}]},{emqx_module_auth_redis,format_servers,2,[{file,"emqx_module_auth_redis.erl"},{line,295}]},{emqx_module_auth_redis,on_module_create,2,[{file,"emqx_module_auth_redis.erl"},{line,225}]},{emqx_modules,'-init_module/4-fun-0-',4,[{file,"emqx_modules.erl"},{line,253}]},{emqx_modules,init_module,4,[{file,"emqx_modules.erl"},{line,253}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]},{{{init_module_failure,'emqx@10.13.67.20'},{{emqx_module_auth_redis,on_module_create},badarg}},[{erlang,list_to_integer,["6379 "],[]},{emqx_module_auth_redis,format_server,1,[{file,"emqx_module_auth_redis.erl"},{line,303}]},{emqx_module_auth_redis,format_servers,2,[{file,"emqx_module_auth_redis.erl"},{line,295}]},{emqx_module_auth_redis,on_module_create,2,[{file,"emqx_module_auth_redis.erl"},{line,225}]},{emqx_modules,'-init_module/4-fun-0-',4,[{file,"emqx_modules.erl"},{line,253}]},{emqx_modules,init_module,4,[{file,"emqx_modules.erl"},{line,253}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}]
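Cause: In the Redis authentication module, the port was entered as "6379 " with a space after 6379, so list_to_integer fails with badarg.
Solution: Remove the extra space and restart the module.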
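A stray space like this is easy to miss in a form field. As an illustration only, the Python sketch below validates a "host:port" string before it is submitted. The function name `parse_server` is hypothetical and is not EMQ X code. Python's own int() would accept "6379 ", so the sketch checks the raw text instead.

```python
def parse_server(server: str):
    """Split 'host:port' and reject anything but plain digits in the port."""
    host, sep, port = server.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {server!r}")
    # int() silently strips whitespace, unlike Erlang's list_to_integer,
    # so validate the raw port text before converting it.
    if not (port.isascii() and port.isdigit()):
        raise ValueError(f"invalid port {port!r} in {server!r}")
    return host, int(port)
```

With this check, "127.0.0.1:6379" parses normally, while "127.0.0.1:6379 " is rejected with an error that shows the offending text in quotes.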
24. Keyword: emqx_license
Log:
[emqx_license] emqx_channel_conn size is undefined:{[undefined],['emqx@10.14.237.109']}
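Cause: Several nodes of the cluster were started at the same time. On startup a node fetches information from the other nodes; because another node was not ready yet, it returned undefined.
Solution: This is not a problem as long as it is not reported continuously.
25. Keyword: emqx_acl_mnesia_cli
Log: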
[Ctl] CMD acl is overidden by {emqx_acl_mnesia_cli,cli}
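Cause: The ACL configuration was overridden by a command; the log shows the acl command being overridden by emqx_acl_mnesia_cli.
Solution:
26. Keyword: JT808 Conn
Log: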
[JT808 Conn] Parser failed for {invalid_message,
[{emqx_jt808_frame,extract_message,3,
[{file,"emqx_jt808_frame.erl"},{line,111}]},
{emqx_jt808_frame,parse_main,3,
[{file,"emqx_jt808_frame.erl"},{line,67}]},
{emqx_jt808_connection,received,2,
[{file,"emqx_jt808_connection.erl"}, {line,243}]},
{gen_server,try_dispatch,4, [{file,"gen_server.erl"},
{line,637}]}, {gen_server,handle_msg,6,
[{file,"gen_server.erl"},{line,711}]},
{proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},
{line,249}]}]}2022-03-13 10:26:16.216 [error] [JT808 Conn] Error data:
<<126,0,2,0,0,2,0,0,0,0,21,0,3,50,126>>
Cause: The received packet is invalid.
Solution: Check the packets that are being sent.
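One concrete thing to check is the frame's check byte. The Python sketch below assumes the usual JT/T 808 framing: 0x7e delimiters, 0x7d escaping, and a final check byte that is the XOR of all preceding header and body bytes. The helper names are illustrative, and this is not the parser used by EMQ X. Under these assumptions, the "Error data" frame from the log above carries check byte 50, while the XOR of its header bytes is 22, which would explain the invalid_message result.

```python
# "Error data" from the log entry above.
FRAME = bytes([126, 0, 2, 0, 0, 2, 0, 0, 0, 0, 21, 0, 3, 50, 126])


def unescape(body: bytes) -> bytes:
    """Undo the escaping: 0x7d 0x02 -> 0x7e, then 0x7d 0x01 -> 0x7d."""
    return body.replace(b"\x7d\x02", b"\x7e").replace(b"\x7d\x01", b"\x7d")


def xor_of(data: bytes) -> int:
    """XOR all bytes together."""
    result = 0
    for byte in data:
        result ^= byte
    return result


def checksum_ok(frame: bytes) -> bool:
    """Check the delimiters and compare the check byte with the XOR."""
    if len(frame) < 3 or frame[0] != 0x7E or frame[-1] != 0x7E:
        return False
    payload = unescape(frame[1:-1])
    # The last payload byte is the check byte; the rest is header + body.
    return xor_of(payload[:-1]) == payload[-1]


if __name__ == "__main__":
    print(checksum_ok(FRAME))
```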
27. Keyword: Cannot subscribe
Log:
<<"20220314192301">>@61.164.57.169:62445 [Channel] Cannot subscribe YD_MICROWAVE_DATA_PRETREAT due to Not authorized.
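Cause: The client has no permission to subscribe to this topic, so the subscription is rejected.
Solution: Grant the subscribe permission.
28. Keyword: TcpClosed producer
Log: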
TcpClosed producer: <0.5964.0>
Cause: There is too little log to tell; the producer's connection was closed.
Solution:
29. Keyword: Redis
Log:
[Redis] Can't connect to Redis server: {{badmatch,{error,einval}},[{eredis_client,connect_with_tcp,2,[{file,"eredis_client.erl"},{line,369}]},{eredis_client,init,1,[{file,"eredis_client.erl"},{line,98}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,417}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,385}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}
Cause: The same eredis_client badmatch as in item 1; the Redis library version needs to be upgraded.
Solution:
30. Keyword: lock for too long
Log:
kill <0.1401.4063> as it has held the lock for too long, resource: <<"dbce405abb00eee7">>
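Cause: In ekka, the lock on a resource was held without being released, so the process that held the lock for too long was forcibly killed.
Solution:
31. Keyword: connect_to_remote_server
Log: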
event=connect_to_remote_server peer="emqx@10.13.67.20" result=failure reason="econnrefused"
Cause: The connection was refused (econnrefused).
Solution: Check the connection.
32. Keyword: maximum heap size reached
Log:
Process: <0.19563.3699> on node 'emqx@10.13.67.23', Context: maximum heap size reached,
Max Heap Size: 10485760,
Total Heap Size: 23943461,
Kill: true,
Error Logger: true, GC Info: [{old_heap_block_size,318187},{heap_block_size,12909534},
{mbuf_size,10715776},{recent_size,6388},{stack_size,33},{old_heap_size,22167},{heap_size,75077},{bin_vheap_size,77750},{bin_vheap_block_size,75110},{bin_old_vheap_size,54000},{bin_old_vheap_block_size,98314}]
Cause: The process reached its maximum heap size. The usual reason is a large backlog of messages.
Solution:
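To judge how large the heap in such a report really is, the sizes can be converted to bytes. The Python sketch below assumes that the values are in machine words, as with Erlang's max_heap_size process flag, and that the VM is 64-bit with 8-byte words. Both are assumptions about the environment that produced the log.

```python
WORD_SIZE = 8  # bytes per word; assumes a 64-bit Erlang VM


def words_to_mib(words: int, word_size: int = WORD_SIZE) -> float:
    """Convert a heap size given in words to MiB."""
    return words * word_size / (1024 * 1024)


if __name__ == "__main__":
    # Values taken from the log entry above.
    print(f"max heap:   {words_to_mib(10485760):.2f} MiB")
    print(f"total heap: {words_to_mib(23943461):.2f} MiB")
```

Under these assumptions the limit is 80 MiB and the process had grown to roughly 183 MiB when it was killed.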
33. Keyword: init_action_failure
Log:
cluster_call error found, ResL: [{{init_action_failure,'emqx@10.14.69.101'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}},{{init_action_failure,'emqx@10.14.69.7'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}}]
Cause: Writing to Kafka failed because the Kafka topic was not found.
Solution: Create the topic in Kafka, or write to a topic that already exists in Kafka.
34. Keyword: mc_worker
Log:
Generic server <0.2926.84> terminating. Reason: tcp_closed. Last message: {tcp_closed,#Port<0.2554055>}. State: {state,#Port<0.2554055>,#{7 => #Fun<mc_worker_logic.0.117303875>},<<>>,{conn_state,unsafe,master,<<"admin">>,<<"admin">>},undefined,#Fun<mc_worker.0.76890317>,gen_tcp}.
Cause: The set name was not configured; it is a required field.
Solution: Configure it, then write to MongoDB.
35. Keyword: metadata
Log:
Failed to get metadata, reason: unknown_topic_or_partition
Cause: Unknown topic or partition.
Solution: Check the Kafka configuration.
36. Keyword: Alarm resource
Log:
[Alarm Handler] Alarm resource/web_hook/resource:6777aed2/down is deactivated
Cause: The webhook resource is unavailable.
Solution: Check the webhook resource.
37. Keyword: action: discard
Log:
mqttdoctor_d4e9afe1@10.0.71.227:30041 action: discard, file: emqx_cm.erl, line: 317, mfa: {emqx_cm,kick_or_kill,3}, msg: session_kick_exception, pid: <0.2083.446>, reason: normal, stacktrace: [{emqx_ws_connection,call,3,[{file,"emqx_ws_connection.erl"},{line,167}]},{emqx_cm,kick_or_kill,3,[{file,"emqx_cm.erl"},{line,299}]},{lists,foreach,2,[{file,"lists.erl"},{line,1342}]},{emqx_cm,'-open_session/3-fun-0-',5,[{file,"emqx_cm.erl"},{line,219}]},{emqx_cm_locker,trans,3,[{file,"emqx_cm_locker.erl"},{line,46}]},{emqx_channel,process_connect,2,[{file,"emqx_channel.erl"},{line,492}]},{emqx_ws_connection,with_channel,3,[{file,"emqx_ws_connection.erl"},{line,575}]},{cowboy_websocket,handler_call,6,[{file,"cowboy_websocket.erl"},{line,487}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], stale_channel: undefined
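Cause: Clients with the same clientID kicked each other out, or the client was kicked manually.
Solution: Check for repeated logins with the same clientid.
38. Keyword: Dropped msg
Log: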
rcsReceiver@ba5eb70e-8c03-4563-bd6a-b4737552bb15@101.132.135.166:53926 [Session] Dropped msg due to mqueue is full: Message(Id=•Ú•Z•okªLJ•sЕo, QoS=1, Topic=/maintenance/GS142-0120-T9M-0000/status, From=<<"GS142-0120-T9M-0000">>, Flags=[], Headers=#{peerhost => {101,132,135,166}, properties => #{},proto_ver => 4,protocol => mqtt, username => <<"gaussian">>})
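Cause: The message queue is full because the subscriber cannot consume messages fast enough.
Solution: Increase the consumption capacity of the subscriber.
39. Keyword: SYSMON
Log: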
[SYSMON] large_heap warning: pid = <0.15399.1611>, info: [{old_heap_block_size, 10695351}, {heap_block_size,8912793}, {mbuf_size,0}, {stack_size,49}, {old_heap_size,4158305}, {heap_size,4043698}], [{initial_call,{proc_lib,init_p,5}},{current_function,{ets,select_trap,1}},{registered_name,[]},{status,running},{message_queue_len,0},{group_leader,<0.2147.0>},{priority,normal},{trap_exit,false},{reductions,336485},{last_calls,false},{catchlevel,5},{trace,0},{suspending,[]},{sequential_trace_token,[]},{error_handler,error_handler},{memory,156866100},{total_heap_size,19608144},{heap_size,8912793},{stack_size,51},{min_heap_size,233}]
Cause: A warning that the process heap has grown very large (large_heap); too many messages have accumulated.
Solution:
40. Keyword: ACL http
Log:
evocationClient@SIM00-0000-000-F049@47.103.69.133:32814 [ACL http] Request ACL path /iot/v1alpha1/devices/authorize, error: {closed,"The connection was lost."}
Cause: The HTTP authentication server cannot be reached.
Solution: Check the connection to the authentication server.
41. Keyword: PUBREL
Log:
TEST00-0000-000-C001@47.103.69.133:56746 [Channel] The PUBREL PacketId 151 is not found.
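Cause: There are two possibilities. Either the client sent the PUBREL packet more than once, so EMQ X had already processed the corresponding PacketID; or the PacketID in the PUBREL packet is wrong, and the client never sent a PUBLISH packet with that ID beforehand.
Solution: Check the client.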
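Both possibilities can be seen in a simplified model of the receiving side of the QoS 2 flow. The Python class below is a toy sketch, not the EMQ X implementation: it only remembers which packet IDs have been published and are still waiting for PUBREL. The class name and the returned strings are illustrative.

```python
class Qos2Receiver:
    """Tracks QoS 2 packet IDs between PUBLISH and PUBREL."""

    def __init__(self):
        self.awaiting_rel = set()

    def on_publish(self, packet_id: int) -> str:
        # Remember the ID until the sender releases it with PUBREL.
        self.awaiting_rel.add(packet_id)
        return "PUBREC"

    def on_pubrel(self, packet_id: int) -> str:
        if packet_id not in self.awaiting_rel:
            # Either a repeated PUBREL or an ID that was never published.
            return "not found"
        self.awaiting_rel.remove(packet_id)
        return "PUBCOMP"


if __name__ == "__main__":
    receiver = Qos2Receiver()
    print(receiver.on_publish(151))  # normal flow
    print(receiver.on_pubrel(151))   # first PUBREL completes the flow
    print(receiver.on_pubrel(151))   # repeated PUBREL
    print(receiver.on_pubrel(7))     # ID without a prior PUBLISH
```

In this model the last two calls correspond to the two causes described above.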
42. Keyword: PUBREC
Log:
69A386BD-62F9-45B6-B582-5A389380061C@10.0.71.228:54410 [Channel] The PUBREC PacketId 1 is inuse.
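Cause: This is a warning. The PacketId of the PUBREC packet received by EMQX is a duplicate: either the same PUBREC was sent more than once, or the wrong Id was used. In most cases it is a repeated send.
Solution:
43. Keyword: gen_rpc_client_sup
Log: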
crasher: initial call: gen_rpc_client:init/1, pid: <0.3835.0>, registered_name: 'gen_rpc.client.emqx@10.13.67.20/2614250', error: {{badmatch,{error,einval}},[{gen_rpc_driver_tcp,set_send_timeout,2,[{file,"gen_rpc_driver_tcp.erl"},{line,191}]},{gen_rpc_client,send_cast,4,[{file,"gen_rpc_client.erl"},{line,418}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,689}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,765}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [gen_rpc_client_sup,gen_rpc_sup,<0.2131.0>], message_queue_len: 23, messages: [{tcp_error,#Port<0.300>,econnreset},{tcp_closed,#Port<0.300>},{{cast,emqx_broker,dispatch,[<<"thing/product/mockdevicesn/osd">>,{delivery,<0.24144.1>,{message,<<0,5,218,42,74,192,37,28,39,137,1,0,94,80,0,124>>,0,<<"30bc90c0b567_bench_pub_677_2014658720">>,#{dup => false,retain => false},#{peerhost => {111,222,129,229},properties => #{},proto_ver => 5,protocol => mqtt,username => <<"mockuser">>},<<"thing/product/mockdevicesn/osd">>,<<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,1647250061141}}]},undefined},{{cast,emqx_broker,dispatch,[<<"thing/product/mockdevicesn/osd">>,{delivery,<0.24637.1>,{message,<<0,5,218,42,74,192,43,42,39,137,1,0,96,61,0,123>>,0,<<"30bc90c0b567_bench_pub_776_2348236373">>,#{dup => false,retain => false},#{peerhost => {111,222,129,229},properties => #{},proto_ver => 5,protocol => mqtt,username => <<"mockuser">>},<<"thing/product/mockdevicesn/osd">>
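Cause: The RPC connection to another node is no longer available.
Solution: Look at what the other node reports: whether it was restarted or upgraded, or whether there is some other network cause.
44. Keyword: Cannot publish message
Log: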
<<"6afc4fb6-23ee-4010-9971-05e3c2aa5d9a">>@39.144.11.81:34937 [Channel] Cannot publish message to device//events due to Not authorized.
Cause: The client has no permission to publish to this topic.
Solution: Grant the permission.
45. Keyword: JT808 Proto
Log:
[JT808 Proto] Unexpected frame #{<<"body">> => #{}, <<"header">> => #{<<"encrypt">> => 0,<<"len">> => 0, <<"msg_id">> => 3,<<"msg_sn">> => 9, <<"phone">> => <<"014161736453">>}}2022-03-14 01:18:05.107 [error] ** Generic server <0.646.655> terminating ** Last message in was {inet_async,#Port<0.6944860>,0, {ok,<<126,0,3,0,0,1,65,97,115,100,83,0,9,111,126>>}}** When Server state == {state,esockd_transport,#Port<0.6944860>, {{117,132,196,139},22319}, {not_detect_heading_0x7e,8192}, {pstate, #{conn_mod => emqx_jt808_connection, peercert => nossl, peername => {{117,132,196,139},22319}, sendfun => {fun emqx_jt808_connection:send/3, [esockd_transport,#Port<0.6944860>]}, sockname => {{10,65,141,225},8090}, socktype => tcp}, #{clientid => undefined,is_bridge => false, is_superuser => false, mountpoint => undefined, peerhost => {117,132,196,139}, protocol => jt808,sockport => 8090, username => undefined,zone => undefined}, undefined,undefined,undefined, {fun emqx_jt808_connection:send/3, [esockd_transport,#Port<0.6944860>]}, undefined,0,<<"jt808/%c/dn">>, <<"jt808/%c/up">>, {auth,undefined,undefined,true}, [0, <<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0>>], {inflight,128,{0,nil}}, undefined,undefined}, undefined,undefined,8192,true,30000}** Reason for termination ==** unexpected_frame2022-03-14 01:18:05.108 [error] crasher: initial call: emqx_jt808_connection:init/1 pid: <0.646.655> registered_name: [] exception exit: unexpected_frame in function gen_server:handle_common_reply/8 (gen_server.erl, line 751) ancestors: [<0.2670.0>,<0.2669.0>,esockd_sup,<0.2055.0>] message_queue_len: 1 messages: [{inet_async,#Port<0.6944860>,1,{error,closed}}] links: [<0.2670.0>] dictionary: [] trap_exit: false status: running heap_size: 6772 stack_size: 27 reductions: 15706 
neighbours
Cause: Parsing failed; the data sent by the device is incorrect.
Solution:
46. Keyword: rocketmq
Log:
State machine <0.5958.0> terminating. Reason: {error,econnrefused}. Stack: [{gen_statem,loop_state_callback_result,11,[{file,"gen_statem.erl"},{line,1360}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]. Last event: {info,connecting}. State: {idle,{state,<<"bridge_rocketmq:resource:d0de7c39_al_device_up">>,<<"al_device_up">>,<<"172.24.1.152:10911">>,undefined,5,17208,[{sndbuf,1048576}],{emqx_bridge_rocket_actions,rocket_callback,[<<"data_to_rocket_1646912587576490567">>]},100,#{},<<>>}}.
Cause: The connection was refused by the RocketMQ server.
Solution: Check the RocketMQ connection.
47. Keyword: esockd_connection_sup
Log:
supervisor: 'esockd_connection_sup - <0.2390.0>', errorContext: connection_shutdown, reason: {badmatch,<<>>}, offender: [{pid,<0.10594.41>},{name,connection},{mfargs,{emqx_connection,start_link,[[{deflate_options,[]},{max_conn_rate,1000},{active_n,100},{zone,external},{proxy_address_header,<<>>},{proxy_port_header,<<>>},{supported_subprotocols,[]}]]}}]
Cause: The connection was closed before it had been fully established.
Solution: Check the connection.
48. Keyword: get_status_of_pool_workers
Log:
get_status_of_pool_workers failed: {throw,{timeout,<0.8230.420>}}, stacktrace: []
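Cause: When the process pool for testing a resource was created, the resource under test did not respond normally within 1 second.
For example, with MongoDB: when the test resource is created, the simplest possible MongoDB query is executed, and no reply was received within one second.
Solution: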
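The idea of a health check with a fixed deadline can be sketched in Python as follows. This is a generic illustration, not how EMQ X implements get_status_of_pool_workers: the function names are hypothetical, and `check` stands for any cheap probe, such as the simplest query mentioned above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_within(check, timeout: float = 1.0) -> bool:
    """Run check() in a worker thread; fail if it is slow or raises."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(check)
        try:
            return bool(future.result(timeout=timeout))
        except Exception:
            # Covers the timeout as well as any error raised by check().
            return False
    finally:
        # Do not block on a probe that is still running.
        pool.shutdown(wait=False)


def slow_probe(delay: float = 0.5) -> bool:
    """Stand-in for a resource that answers too slowly."""
    time.sleep(delay)
    return True
```

A resource that answers correctly but only after the deadline is reported as failed, which matches the timeout in the log above. Network latency to the resource is therefore worth checking.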
49. Keyword: Take action
Log:
<<"android:7706416585007415566:7b6a3110779b89e3">>@112.97.83.197:36545 Take action <<"data_to_kafka_1646315366030644983">> failed, continue next action, reason: {error,timeout,[{wolff_producer,send_sync,3,[{file,"wolff_producer.erl"},{line,139}]},{emqx_bridge_kafka_actions,produce,6,[{file,"emqx_bridge_kafka_actions.erl"},{line,408}]},{emqx_rule_runtime,take_action,5,[{file,"emqx_rule_runtime.erl"},{line,236}]},{emqx_rule_runtime,'-take_actions/4-lc$^0/1-0-',4,[{file,"emqx_rule_runtime.erl"},{line,227}]},{emqx_rule_runtime,'-take_actions/4-lc$^0/1-0-',4,[{file,"emqx_rule_runtime.erl"},{line,228}]},{emqx_rule_runtime,do_apply_rule,2,[{file,"emqx_rule_runtime.erl"},{line,110}]},{emqx_rule_runtime,apply_rules,2,[{file,"emqx_rule_runtime.erl"},{line,52}]},{emqx_hooks,safe_execute,2,[{file,"emqx_hooks.erl"},{line,164}]}]}
Cause: The Kafka connection timed out.
Solution: Check that the configured parameters are correct and that the Kafka service is healthy.
50. Keyword: epgsql
Log:
Generic server <0.27348.0> terminating. Reason: nxdomain. Last message: {command,epgsql_cmd_connect,#{database => "dwhdb",ecpool_worker_id => 1,host => "matrixdb.c4.srv",password => #Fun<epgsql_cmd_connect.0.29916615>,port => 80,username => "mxadmin"}}. State: {state,undefined,undefined,<<>>,undefined,on_message,undefined,{[],[]},undefined,undefined,undefined,undefined,[],information_redacted,[],undefined,undefined,undefined,undefined,undefined}. Client <0.27347.0> stacktrace: [{logger_config,allow,2,[{file,"logger_config.erl"},{line,64}]},{proc_lib,crash_report,4,[{file,"proc_lib.erl"},{line,525}]},{proc_lib,exit_p,3,[{file,"proc_lib.erl"},{line,246}]}].
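Cause: MatrixDB requires pg_hba.conf to be configured before the database accepts connections. Note that the reason in this log is nxdomain, which means the host name matrixdb.c4.srv could not be resolved.
Solution: Configure pg_hba.conf accordingly, and make sure the host name resolves from the EMQ X host.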
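Since the reason reported in the log is nxdomain, a quick name-resolution check from the EMQ X host can help rule DNS in or out. The Python sketch below uses only the standard library; the function name `resolves` is illustrative.

```python
import socket


def resolves(host: str) -> bool:
    """Return True if the host name can be resolved to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised when the resolver cannot find the name.
        return False
```

For the entry above, the call would be `resolves("matrixdb.c4.srv")`. If it returns False, the fix belongs in DNS or /etc/hosts rather than in the database settings.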
51. Keyword: Quota exceeded
Log:
mqtt-benchmark-227@100.121.120.249:57499 [Channel] Cannot publish messages to thing/product/mockdevicesn/osd due to Quota exceeded.
Cause: The quota limit was exceeded.
Solution: Stay within the quota limit.
52. Keyword: WebHook Action
Log:
d3dea20d-f45f-4243-8bc0-ddd556472708@39.144.5.19:16876 [WebHook Action] HTTP request error: timeout
Cause: The request to the target address timed out.
Solution: Check whether the URL can be reached normally.
53. Keyword: cluster_call error found
Log:
cluster_call error found, ResL: [{{init_action_failure,'emqx@10.14.69.101'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}},{{init_action_failure,'emqx@10.14.69.7'},{{emqx_bridge_kafka_actions,on_action_create_data_to_kafka},{error,{error,kafka_topic_not_found},[{emqx_bridge_kafka_actions,check_kafka_topic,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,459}]},{emqx_bridge_kafka_actions,on_action_create_data_to_kafka,2,[{file,"emqx_bridge_kafka_actions.erl"},{line,386}]},{emqx_rule_engine,'-init_action/4-fun-0-',4,[{file,"emqx_rule_engine.erl"},{line,551}]},{emqx_rule_engine,init_action,4,[{file,"emqx_rule_engine.erl"},{line,551}]},{erpc,execute_call,4,[{file,"erpc.erl"},{line,416}]}]}}}]
Cause: The Kafka topic was not found.
Solution: Create the corresponding topic.
54. Keyword: Auth http
Log:
CENSYS@167.94.146.58:45530 [Auth http] Deny connection from path: /emqx/mqtt/auth, response http code: 400
Cause: The HTTP authentication service rejected the connection (HTTP response code 400).
Solution: Check the HTTP service.
55. Keyword: PUBACK
Log:
76EE2AF1220D@121.69.9.98:50634 [Channel] The PUBACK PacketId 6 is not found.
Cause: PacketId 6 of the PUBACK packet was not found.
Solution:
56. Keyword: Parse failed for {badmatch
Log:
172.105.87.91:35516 [MQTT] , Parse failed for {badmatch,<<>>}, [{emqx_frame,parse_packet,3,[{file,"emqx_frame.erl"},{line,237}]},{emqx_frame,parse_frame,4,[{file,"emqx_frame.erl"},{line,201}]},{emqx_connection,parse_incoming,3,[{file,"emqx_connection.erl"},{line,625}]},{emqx_connection,handle_msg,2,[{file,"emqx_connection.erl"},{line,618}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}], Frame data:<<22,3,0,0,83,1,0,0,79,3,0,63,71,215,247,186,44,238,234,178,96,126,243,0,253,130,123,185,213,150,200,119,155,230,196,219,60,61,219,111,239,16,110,0,0,40,0,22,0,19,0,10,0,102,0,5,0,4,0,101,0,100,0,99,0,98,0,97,0,96,0,21,0,18,0,9,0,20,0,17,0,8,0,6,0,3,1,0>>
Cause: The packet could not be parsed.
Solution: Check the published packet.
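The "Frame data" in this log begins with the bytes 22, 3, 0, which match the header of an SSL/TLS handshake record (content type 22, version 3.0). The Python sketch below encodes that check; the function name is illustrative. A match suggests that a TLS client, or a scanner, connected to a plain TCP MQTT listener. That reading is an inference from the bytes, not something the log states.

```python
# First bytes of the "Frame data" from the log entry above.
FRAME = bytes([22, 3, 0, 0, 83, 1, 0, 0, 79, 3, 0])


def looks_like_tls_handshake(frame: bytes) -> bool:
    """True if the data starts like an SSL/TLS handshake record."""
    return (
        len(frame) >= 3
        and frame[0] == 0x16   # record content type 22: handshake
        and frame[1] == 0x03   # major version 3 (SSL 3.0 and all TLS)
        and frame[2] <= 0x04   # minor version 0..4
    )


if __name__ == "__main__":
    print(looks_like_tls_handshake(FRAME))
```

If this is the case, the client should either connect to the TLS listener port or disable TLS on its side.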
57. Keyword: Connection rejected
Log:
47.92.85.45:47136 Connection rejected due to max clients limitation
Cause: The maximum connection limit was reached.
Solution: Keep the number of connections within the limit.
58. Keyword: ehttpc
Log:
ehttpc: Received 'gun_data' message from unknown stream ref: #Ref<0.1986289746.2927886337.204390>
Cause: An internal mistake in logging; this message was printed in error.
Solution: Already fixed.
59. Keyword: JT808 Frame
Log:
[JT808 Frame] unknow message id 256, <<0,44,1,47,55,48,49,49,49,77,71,45,49,49, 0,0,0,48,48,48,48,48,48,48,1,212,193,66, 56,56,56,56,56>>2022-03-14 01:15:30.139 [error] [JT808 Conn] Parser failed for {invalid_message, [{emqx_jt808_frame,parse_message_body,2, [{file,"emqx_jt808_frame.erl"},{line,219}]}, {emqx_jt808_frame,parse_message,1, [{file,"emqx_jt808_frame.erl"},{line,121}]}, {emqx_jt808_frame,parse_main,3, [{file,"emqx_jt808_frame.erl"},{line,69}]}, {emqx_jt808_connection,received,2, [{file,"emqx_jt808_connection.erl"}, {line,243}]}, {gen_server,try_dispatch,4, [{file,"gen_server.erl"},{line,637}]}, {gen_server,handle_msg,6, [{file,"gen_server.erl"},{line,711}]}, {proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},{line,249}]}]}2022-03-14 01:15:30.139 [error] [JT808 Conn] Error data: <<126,1,0,0,33,1,65,97,115,100,83,0,0,0,44,1,47,55,48, 49,49,49,77,71,45,49,49,0,0,0,48,48,48,48,48,48,48, 1,212,193,66,56,56,56,56,56,8,126>>
Cause: The message does not conform to the JT808 protocol specification.
Solution:
60. Keyword: Unexpected sock_closed
Log:
<<"emq_4b74fa73b8454fe2a7bcbe5b2b4e8066">>@183.129.130.2:55474 [Channel] Unexpected sock_closed: tcp_closed
Cause: The socket was closed.
Solution: Investigate the transport layer, for example the network or the OS.