IM plays an important role in social apps, yet in many cases it is not the app's core feature, so many projects integrate a third-party solution for it.

  • The core requirement of IM is maintaining a highly reliable persistent connection to the server

Common ways to implement a persistent connection:

  • XMPP
  • a raw socket persistent connection
  • third-party SDKs (RongCloud, Easemob)
  • WebSocket

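WebSocket has a built-in heartbeat mechanism (ping/pong frames, which cost very little traffic), works across platforms (HTML5 supports it natively, so a ready-made web client solution exists, which fits the "big front-end" strategy some companies follow), and is well supported by open-source libraries. For these reasons, WebSocket is the recommended choice.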
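
To make the "little traffic" point concrete, here is a minimal sketch of how a client-to-server ping frame is laid out under RFC 6455. The class and method names (`PingFrameSketch`, `buildClientPing`) are hypothetical and exist only for illustration; a library such as OkHttp builds and sends these frames for you, so this is not code you would normally write yourself.

```java
import java.security.SecureRandom;

/** Illustrative only: lays out a client-to-server WebSocket ping frame per RFC 6455. */
final class PingFrameSketch {
    private static final int OPCODE_PING = 0x9;
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns the raw frame bytes for a ping carrying the given (possibly empty) payload. */
    static byte[] buildClientPing(byte[] payload) {
        // Control frames may carry at most 125 bytes of payload.
        if (payload.length > 125) {
            throw new IllegalArgumentException("control frame payload must be <= 125 bytes");
        }
        // Frames sent by a client must be masked with a fresh 4-byte key.
        byte[] maskKey = new byte[4];
        RANDOM.nextBytes(maskKey);

        byte[] frame = new byte[2 + 4 + payload.length];
        frame[0] = (byte) (0x80 | OPCODE_PING);    // FIN = 1, opcode = ping
        frame[1] = (byte) (0x80 | payload.length); // MASK = 1, 7-bit payload length
        System.arraycopy(maskKey, 0, frame, 2, 4);
        for (int i = 0; i < payload.length; i++) {
            frame[6 + i] = (byte) (payload[i] ^ maskKey[i % 4]);
        }
        return frame;
    }
}
```

An empty ping built this way is 6 bytes on the wire (2 header bytes plus the 4-byte masking key), before TCP/TLS overhead, which is why a protocol-level heartbeat is cheap compared with an application-level heartbeat message.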

WebSocket protocol implementations:

  • java-websocket
  • okhttp
  • a custom implementation of RFC 6455

Given how widely OkHttp is used, the WebSocket client here is built on OkHttp.

private void initialization() {
        if (TextUtils.isEmpty(url)) {
            throw new AssertionError("Please initialize url first");
        }
        if (webSocket == null) {
            OkHttpClient okHttpClient = new OkHttpClient.Builder()
                    .readTimeout(WSConstant.OKHTTP_CLIENT_READ_TIMEOUT, TimeUnit.SECONDS)
                    .writeTimeout(WSConstant.OKHTTP_CLIENT_WRITE_TIMEOUT, TimeUnit.SECONDS)
                    .connectTimeout(WSConstant.OKHTTP_CLIENT_CONNECT_TIMEOUT, TimeUnit.SECONDS)
                    .pingInterval(30 * 1000, TimeUnit.MILLISECONDS) // interval between ping frames
                    .build();

            Request request = new Request.Builder()
                    .url(url)
                    .build();
            webSocket = okHttpClient.newWebSocket(request, webSocketListener);
        }

    }

The listener also handles WebSocket state changes, reconnecting after a disconnect, and NAT timeouts:

mWebSocketListener = new WebSocketListener() {
            @Override
            public void onOpen(WebSocket webSocket, Response response) {
                super.onOpen(webSocket, response);
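                // once connected, start the task that periodically checks the client
                IncomingHandler.postDelayed(new WebSocketCheckScanner(), WebSocketCheckScanner.SCANNER_DELAY);
                // stop the reconnect task
                shutDownReConnectRunner();
                // reset the connection status
                socketStatus = STATUS_OPEN;
                // forward the connected status to the business layer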
                sendWebSocketStatusMessage(SERVICE_SOCKET_OPEN);
            }

            @Override
            public void onMessage(WebSocket webSocket, String text) {
                // receiving data from the server means the connection is alive
                socketStatus = STATUS_OPEN;
                L.i(TAG, "[onMessage]" + text);
                // dispatch the data
                dispatchResponse(text);
                super.onMessage(webSocket, text);
            }

            @Override
            public void onMessage(WebSocket webSocket, ByteString bytes) {
                super.onMessage(webSocket, bytes);
            }

            @Override
            public void onClosing(WebSocket webSocket, int code, String reason) {
                // the channel is closing, so no more heartbeats need to be sent
                L.i(TAG, "onClosing:");
                super.onClosing(webSocket, code, reason);
            }

            @Override
            public void onClosed(WebSocket webSocket, int code, String reason) {
                socketStatus = STATUS_CLOSED;
                L.i(TAG, "onClosed:");
                super.onClosed(webSocket, code, reason);
            }

            @Override
            public void onFailure(WebSocket webSocket, Throwable t, Response response) {
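                // if the channel was closed on purpose, do not start a reconnect task
                if (isShutDown) {
                    // notify that the channel has been closed
                    sendWebSocketStatusMessage(SERVICE_SOCKET_BROKEN);
                    closeConnection();
                    L.i(TAG, "[connection was closed on purpose; skipping reconnect]");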
                    return;
                }
                // other errors: java.io.EOFException; testing shows the channel is gone in this case too. Seen between 10:28:21 and 10:41:51, i.e. it shows up after roughly 15 minutes
                // java.net.SocketException: Connection reset; this error requires going online again
                // cause: Doze and similar phone modes drop ping/pong frames, so the channel gets closed https://github.com/square/okhttp/issues/3722
                // java.net.SocketException: Software caused connection abort; the network was disconnected
                // java.net.ConnectException
                //if (t instanceof SocketException || t instanceof UnknownHostException || t instanceof EOFException) {
                L.i(TAG, "socket network error, network connected: " + IMUtils.isConnected(getApplication()));
                switch (socketStatus) {
                    case STATUS_OPEN:
                        // an error while the channel is open means we need to reconnect
                        sendWebSocketStatusMessage(SERVICE_SOCKET_BROKEN);
                        initReConnectRunner();
                        break;

                    case STATUS_CONNECTTING:
                        // the initial connect was attempted with no network
                        if (scheduledExecutor == null) {
                            initReConnectRunner();
                            break;
                        } else {
                            // other cases
                            if (retryTime >= ImConfig.RECONNECT_TIME) {
                                shutDownReConnectRunner();
                            }
                            break;
                        }
                    case STATUS_RECONNECTTING:
                        // still connecting, i.e. still in the reconnecting state
                        if (retryTime > ImConfig.RECONNECT_TIME) {
                            shutDownReConnectRunner();
                        }
                        break;

                    default:
                        break;
                }
                //} else {
                //if (t instanceof EOFException)
                //L.i(TAG, "send failed; the data will be marked as timed out");
                //}
                t.printStackTrace();
                super.onFailure(webSocket, t, response);
            }
        };
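
The branching inside `onFailure` is easier to follow when pulled out into a pure function. The sketch below restates that decision logic under a few assumptions: the names `FailurePolicySketch`, `Status`, `Action` and `decide` are hypothetical, `reconnectRunning` stands in for the `scheduledExecutor != null` check, and a single `>=` comparison against the retry limit is used for both the connecting and reconnecting states (the listener above uses `>=` in one branch and `>` in the other).

```java
/** Illustrative only: the onFailure branching expressed as a side-effect-free decision. */
final class FailurePolicySketch {
    enum Status { CONNECTING, OPEN, RECONNECTING, CLOSED }

    enum Action { CLOSE_FOR_GOOD, START_RECONNECT, KEEP_RETRYING, STOP_RECONNECT, IGNORE }

    static Action decide(Status status, boolean shuttingDown, boolean reconnectRunning,
                         int retryCount, int maxRetries) {
        // A deliberate shutdown must never trigger a reconnect.
        if (shuttingDown) {
            return Action.CLOSE_FOR_GOOD;
        }
        switch (status) {
            case OPEN:
                // The channel existed and then failed: report it and start reconnecting.
                return Action.START_RECONNECT;
            case CONNECTING:
                // First connect failed (for example, no network) and nothing is retrying yet.
                if (!reconnectRunning) {
                    return Action.START_RECONNECT;
                }
                return retryCount >= maxRetries ? Action.STOP_RECONNECT : Action.KEEP_RETRYING;
            case RECONNECTING:
                // A retry attempt failed: give up only once the retry budget is spent.
                return retryCount >= maxRetries ? Action.STOP_RECONNECT : Action.KEEP_RETRYING;
            default:
                return Action.IGNORE;
        }
    }
}
```

Keeping the decision separate from the side effects (sending status messages, starting or stopping the reconnect task) makes each case testable without a live socket.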

Pitfalls:

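  • NAT timeouts: the right heartbeat interval for ping frames is not a fixed, exact value
  • Whenever a new channel is created, the previous one must be closed (actively or passively) so the server is relieved of it promptly
  • Choose the moment to go back online carefully, so that repeated logins within a short period do not cause an internal DDoS
  • Message loss, and the strategy for keeping data state in sync with the backend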
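
One common way to avoid the self-inflicted DDoS mentioned above is to space reconnect attempts with exponential backoff plus random jitter, so that many clients dropped at the same moment do not all come back at once. This is a swapped-in, general technique rather than the scheme this project uses; the class name `ReconnectBackoffSketch` and the base and cap values in the usage note are assumptions.

```java
import java.util.Random;

/** Illustrative only: exponential backoff with full jitter for reconnect scheduling. */
final class ReconnectBackoffSketch {
    private final long baseMillis;
    private final long maxMillis;
    private final Random random;

    ReconnectBackoffSketch(long baseMillis, long maxMillis, Random random) {
        if (baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("need 0 < baseMillis <= maxMillis");
        }
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
        this.random = random;
    }

    /** Upper bound for a 0-based attempt: baseMillis doubled per attempt, capped at maxMillis. */
    long ceilingMillis(int attempt) {
        long ceiling = baseMillis;
        for (int i = 0; i < attempt && ceiling < maxMillis; i++) {
            // Guard the doubling so it can neither overflow nor exceed the cap.
            ceiling = (ceiling > maxMillis / 2) ? maxMillis : ceiling * 2;
        }
        return ceiling;
    }

    /** Full jitter: a uniformly random delay in [0, ceilingMillis(attempt)). */
    long nextDelayMillis(int attempt) {
        return (long) (random.nextDouble() * ceilingMillis(attempt));
    }
}
```

With a base of 1 second and a cap of 60 seconds, for example, the ceilings grow 1 s, 2 s, 4 s, and so on up to 60 s. The returned delay could be passed to whatever scheduler drives the reconnect task, and the attempt counter reset once `onOpen` fires.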

Coming next:

  • A timeout detection mechanism for sent data
  • Designing automatic resending of data after a reconnect