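maxIdleTime: if a connection is not used for maxIdleTime, it automatically closes its connection to the Server, releasing the system resources that the connection holds on both the server and the client. This maximum idle time can be set in the client's configuration file; the corresponding option is ipc.client.connection.maxidletime. To keep the connection valid, a network timeout is also set on the underlying TCP Socket. When a SocketTimeoutException occurs on the connection, it automatically sends a ping to the server to test whether the link between client and server is still healthy, and to tell the server that the client is still working, so that the result can be sent back once the call has been processed. This timeout is pingInterval. Its default value is 60000 ms, and it can be changed in the client's configuration file through the option ipc.ping.interval.
When a connection request to the server fails, the connection can retry repeatedly. The number of attempts is bounded by maxRetries; once it is exceeded, the connection is treated as failed for good, and the client's RPC fails with it. The default value of maxRetries is 10, and it can be changed in the client's configuration file through the option ipc.client.connect.max.retries. Whether the underlying TCP Socket disables Nagle's algorithm (TCP_NODELAY, i.e. whether small writes are sent immediately rather than delayed) can also be set in the configuration file; the corresponding option is ipc.client.tcpnodelay. All four of these Client parameters can be tuned to the application scenario in order to improve the performance of a Hadoop cluster.
When an RPC returns successfully, the Client still has to parse the result into the data type the user actually needs (after all, what comes back over the network is just a sequence of bits). The Client therefore keeps a parser internally; its type is given by valueClass.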
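As a rough illustration of how these four settings might be read, the sketch below uses java.util.Properties as a stand-in for Hadoop's Configuration class, so that it runs without Hadoop on the classpath. The class name IpcClientSettings is hypothetical. The fallback values used for ipc.client.connection.maxidletime (10000 ms) and ipc.client.tcpnodelay (false) are assumptions; only the defaults of 60000 ms and 10 retries come from the text above.

```java
import java.util.Properties;

// Hypothetical holder for the four client settings described above.
// java.util.Properties stands in for Hadoop's Configuration class.
class IpcClientSettings {
    final int maxIdleTimeMs;
    final int maxRetries;
    final boolean tcpNoDelay;
    final int pingIntervalMs;

    IpcClientSettings(Properties conf) {
        // 10000 ms is an assumed fallback; the text does not state this default.
        maxIdleTimeMs = Integer.parseInt(
            conf.getProperty("ipc.client.connection.maxidletime", "10000"));
        // Default of 10 retries is stated in the text.
        maxRetries = Integer.parseInt(
            conf.getProperty("ipc.client.connect.max.retries", "10"));
        // false is an assumed fallback; the text does not state this default.
        tcpNoDelay = Boolean.parseBoolean(
            conf.getProperty("ipc.client.tcpnodelay", "false"));
        // Default of 60000 ms is stated in the text.
        pingIntervalMs = Integer.parseInt(
            conf.getProperty("ipc.ping.interval", "60000"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("ipc.ping.interval", "30000"); // override one value
        IpcClientSettings s = new IpcClientSettings(conf);
        System.out.println("pingInterval=" + s.pingIntervalMs
            + " maxRetries=" + s.maxRetries);
    }
}
```

In a real client the same keys would go into the Hadoop configuration file instead of a Properties object.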
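Three questions about the Client:
1. How is the connection between client and server established?
2. How does the client send data to the server?
3. How does the client obtain the data returned by the server?
ipc.Client source code analysis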
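private Hashtable<ConnectionId, Connection> connections = new Hashtable<ConnectionId, Connection>(); // cache (pool) of connections to remote servers
private Class<? extends Writable> valueClass; // class used to deserialize the value returned by a remote call
final private int maxIdleTime; // maximum idle time of a connection
final private int maxRetries; // maximum number of retries when connecting the Socket
private boolean tcpNoDelay; // if true, disable Nagle's algorithm on the TCP connection
private int pingInterval; // interval between pings to the server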
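To get a first impression of the Client class, here are the inner classes we are interested in:
Call: wraps the Invocation object that is written to the server, and also stores the data returned by the server
Connection: handles a connection to a remote server. It extends Thread
ConnectionId: uniquely identifies a connection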
2. Call class
int id; // call id
Writable param; // call parameter
Writable value; // value returned by the call
IOException error; // exception information
boolean done; // whether the call has completed
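Everything involved in one client RPC (method name, input arguments, return value) is abstracted into a Call object. Note that the param field of Call carries both the method name and all input arguments of the RPC; its concrete type is org.apache.hadoop.ipc.RPC.Invocation, whose main fields are:
private String methodName; // method name
private Class[] parameterClasses; // parameter types
private Object[] parameters; // parameter values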
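3. ConnectionId class
InetSocketAddress address; // Socket address of the connection
UserGroupInformation ticket; // client user information
Class<?> protocol; // protocol of the connection
As mentioned above, the Client keeps an internal pool of RPC connections so that it does not have to connect to and disconnect from the server over and over, which improves the efficiency of the whole system. It therefore needs a key that identifies a unique RPC connection. Here that key is made up of three pieces of information: the server address, the user information, and the protocol type.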
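The idea of keying a connection pool on (address, user, protocol) can be sketched as follows. This is a simplified illustration, not the Hadoop implementation: ConnectionKey and ConnectionPoolSketch are hypothetical names, a plain String stands in for UserGroupInformation, and a bare Object stands in for a real Connection.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for Client.ConnectionId: three fields identify a connection.
final class ConnectionKey {
    private final InetSocketAddress address;
    private final String user;       // simplified stand-in for UserGroupInformation
    private final Class<?> protocol;

    ConnectionKey(InetSocketAddress address, String user, Class<?> protocol) {
        this.address = address;
        this.user = user;
        this.protocol = protocol;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConnectionKey)) return false;
        ConnectionKey other = (ConnectionKey) o;
        return address.equals(other.address)
            && Objects.equals(user, other.user)
            && protocol == other.protocol;
    }

    @Override
    public int hashCode() {
        return Objects.hash(address, user, protocol);
    }
}

// Hypothetical pool; Object stands in for a real Connection.
final class ConnectionPoolSketch {
    private final Map<ConnectionKey, Object> connections = new HashMap<>();

    synchronized Object getConnection(ConnectionKey key) {
        Object connection = connections.get(key);
        if (connection == null) {    // first request for this key: create and cache
            connection = new Object();
            connections.put(key, connection);
        }
        return connection;
    }

    synchronized int size() {
        return connections.size();
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        InetSocketAddress addr = InetSocketAddress.createUnresolved("namenode.example", 8020);
        Object first = pool.getConnection(new ConnectionKey(addr, "alice", Runnable.class));
        Object second = pool.getConnection(new ConnectionKey(addr, "alice", Runnable.class));
        System.out.println("reused: " + (first == second) + ", pooled: " + pool.size());
    }
}
```

Because equals and hashCode cover all three fields, two requests share a connection only when the server, the user, and the protocol all match.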
4. Connection class
private InetSocketAddress server; // server ip:port
private ConnectionHeader header; // connection header; wraps the protocol and the user information (UserGroupInformation)
private ConnectionId remoteId; // connection id
private Socket socket = null; // the connected client Socket
private DataInputStream in;
private DataOutputStream out;
private Hashtable<Integer, Call> calls = new Hashtable<Integer, Call>(); // pending RPC calls, keyed by call id
private AtomicLong lastActivity = new AtomicLong(); // time of the last I/O activity
private AtomicBoolean shouldCloseConnection = new AtomicBoolean(); // whether the connection should be closed
private IOException closeException; // reason the connection was closed
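Inside the Client, every RPC connection (Connection) is designed as a background thread that holds a table of pending RPC calls. When a connection has been idle for longer than the configured maximum idle time, it closes itself, promptly releasing the system resources it holds on the client and the server. To make sure both ends speak the same low-level protocol, the client sends a header to the server immediately after the network connection is established, so that both sides can confirm they use the same protocol version. If the server finds that its protocol version differs from the client's, it closes the network connection to the client at once, and the client subsequently throws an EOFException. The header looks like this:
hrpc: identifier of Hadoop's RPC implementation;
version: protocol version number;
length: length of the remaining information;
protocol: protocol type;
flag: whether client information is present;
ugi: client information;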
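To make the field order above concrete, here is a minimal sketch that serializes such a header into a byte array. It is an approximation under stated assumptions: the class RpcHeaderSketch is hypothetical, DataOutputStream.writeUTF stands in for whatever string encoding Hadoop actually uses, a plain user name stands in for the ugi, and the version value passed in main is arbitrary.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for the header layout listed above:
// "hrpc" | version | length | protocol | flag | ugi
class RpcHeaderSketch {
    static byte[] buildHeader(byte version, String protocol, String user) {
        try {
            // Serialize the variable-length part first so its length is known.
            ByteArrayOutputStream bodyBytes = new ByteArrayOutputStream();
            DataOutputStream body = new DataOutputStream(bodyBytes);
            body.writeUTF(protocol);          // protocol type (assumed encoding)
            body.writeBoolean(user != null);  // flag: is client information present?
            if (user != null) {
                body.writeUTF(user);          // simplified stand-in for the ugi
            }
            body.flush();

            ByteArrayOutputStream headerBytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(headerBytes);
            out.write("hrpc".getBytes(StandardCharsets.US_ASCII)); // RPC identifier
            out.write(version);                                    // protocol version
            out.writeInt(bodyBytes.size());                        // remaining length
            bodyBytes.writeTo(out);
            out.flush();
            return headerBytes.toByteArray();
        } catch (IOException e) {
            // Not expected for in-memory streams.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] header = buildHeader((byte) 4, "ExampleProtocol", "alice");
        System.out.println("header bytes: " + header.length);
    }
}
```

Writing the length before the variable part lets the server read exactly that many bytes before it starts interpreting calls.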
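The processing of a single client RPC call is shown in the figure below:
Client offers its users two call interfaces, one for a single RPC call and one for a batch of RPC calls: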
public Writable call(Writable param, InetSocketAddress addr, Class<?> protocol, UserGroupInformation ticket) throws InterruptedException, IOException
public Writable[] call(Writable[] params, InetSocketAddress[] addresses, Class<?> protocol, UserGroupInformation ticket) throws IOException
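However, the batch interface is essentially just a loop over the single-call interface, with practically no optimization. In my view the batch call could be improved for efficiency. For example, within one batch, all calls addressed to the same Server could be sent to that Server in one go, and the results could then be returned either all at once or in several batches, depending on the Server's current load. Overall, Hadoop's RPC is a synchronous process.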
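The first step of the improvement suggested above, grouping the calls of a batch by target server, could look roughly like the sketch below. It only shows the grouping, not the sending, and the class and method names are hypothetical.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups the positions of a batch by target server so that
// all calls for one server could be written to its connection together.
class BatchGroupingSketch {
    static Map<InetSocketAddress, List<Integer>> groupByServer(InetSocketAddress[] addresses) {
        // LinkedHashMap keeps servers in the order they first appear in the batch.
        Map<InetSocketAddress, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < addresses.length; i++) {
            groups.computeIfAbsent(addresses[i], a -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        InetSocketAddress a = InetSocketAddress.createUnresolved("server-a.example", 9000);
        InetSocketAddress b = InetSocketAddress.createUnresolved("server-b.example", 9000);
        System.out.println(groupByServer(new InetSocketAddress[] {a, b, a}));
    }
}
```

Keeping the original indices in each group makes it possible to put every result back into the right slot of the returned array.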
Question 1: How is the connection between client and server established?
Code 1
/** Make a call, passing <code>param</code>, to the IPC server running at
 * <code>address</code> which is servicing the <code>protocol</code> protocol,
 * with the <code>ticket</code> credentials, <code>rpcTimeout</code> as timeout
 * and <code>conf</code> as configuration for this connection, returning the
 * value. Throws exceptions if there are network problems or if the remote code
 * threw an exception. */
public Writable call(Writable param, InetSocketAddress addr,
                     Class<?> protocol, UserGroupInformation ticket,
                     int rpcTimeout, Configuration conf)
                     throws InterruptedException, IOException {
  ConnectionId remoteId = ConnectionId.getConnectionId(addr, protocol,
      ticket, rpcTimeout, conf);
  return call(param, remoteId);
}
/** Make a call, passing <code>param</code>, to the IPC server defined by
 * <code>remoteId</code>, returning the value.
 * Throws exceptions if there are network problems or if the remote code
 * threw an exception. */
public Writable call(Writable param, ConnectionId remoteId)
                     throws InterruptedException, IOException {
  Call call = new Call(param); // wrap the parameter in a Call object
  Connection connection = getConnection(remoteId, call); // obtain a connection
  connection.sendParam(call); // send the parameter: write the call to the server
  boolean interrupted = false;
  synchronized (call) {
    while (!call.done) {
      try {
        call.wait(); // wait for the result
      } catch (InterruptedException ie) {
        // save the fact that we were interrupted
        interrupted = true;
      }
    }
    if (interrupted) {
      // set the interrupt flag now that we are done waiting
      Thread.currentThread().interrupt();
    }
    if (call.error != null) {
      if (call.error instanceof RemoteException) {
        call.error.fillInStackTrace();
        throw call.error;
      } else { // local exception
        // use the connection because it will reflect an ip change, unlike
        // the remoteId
        throw wrapException(connection.getRemoteAddress(), call.error);
      }
    } else {
      return call.value; // return the result
    }
  }
}
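The blocking wait in Code 1 is a standard wait/notify pattern: the caller loops on a done flag, remembers any interrupt, and restores the interrupt flag once the result has arrived. A stripped-down sketch of that pattern, with a hypothetical PendingCall class and a String in place of a Writable, might look like this:

```java
// Hypothetical, simplified version of the wait loop in Code 1.
class PendingCall {
    private String value;   // stand-in for the Writable result
    private boolean done;

    // Called by the thread that receives the response.
    synchronized void complete(String result) {
        value = result;
        done = true;
        notify();           // wake the caller blocked in await()
    }

    // Called by the thread that issued the call.
    synchronized String await() {
        boolean interrupted = false;
        while (!done) {     // loop guards against spurious wakeups
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;   // remember it, but keep waiting for the result
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();  // restore the flag for the caller
        }
        return value;
    }

    public static void main(String[] args) {
        PendingCall call = new PendingCall();
        new Thread(() -> call.complete("response")).start();
        System.out.println(call.await());
    }
}
```

Checking the flag in a loop also covers the case where complete() runs before await(): the caller sees done == true and never blocks.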
So how is a connection to the server obtained?
Connection connection = getConnection(remoteId, call); // obtain a connection
connection.sendParam(call); // send the call object to the server
Code 2
private Connection getConnection(ConnectionId remoteId,
                                 Call call)
                                 throws IOException, InterruptedException {
  if (!running.get()) {
    // the client has been stopped
    throw new IOException("The client is stopped");
  }
  Connection connection;
  // If the connections pool already holds a connection object for this id, reuse it;
  // otherwise create a new connection object.
  // Note that this object only stores the remoteId information; it has not
  // actually connected to the server yet.
  do {
    synchronized (connections) {
      connection = connections.get(remoteId);
      if (connection == null) {
        connection = new Connection(remoteId);
        connections.put(remoteId, connection);
      }
    }
  } while (!connection.addCall(call)); // add the call to the connection's calls table (source of addCall not shown)
  // This is the statement that actually establishes the connection to the server.
  // we don't invoke the method below inside "synchronized (connections)"
  connection.setupIOstreams();
  return connection;
}
Continuing the analysis, let's look at how the connection is actually set up. Here is the setupIOstreams() method of the Client.Connection class:
Code 3
/** Connect to the server and set up the I/O streams. It then sends
 * a header to the server and starts
 * the connection thread that waits for responses.
 */
private synchronized void setupIOstreams() throws InterruptedException {
  if (socket != null || shouldCloseConnection.get()) {
    return;
  }
  try {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Connecting to "+server);
    }
    short numRetries = 0;
    final short maxRetries = 15;
    Random rand = null;
    while (true) {
      setupConnection(); // establish the connection
      InputStream inStream = NetUtils.getInputStream(socket); // get the input stream
      OutputStream outStream = NetUtils.getOutputStream(socket); // get the output stream
      writeRpcHeader(outStream);
      if (useSasl) {
        final InputStream in2 = inStream;
        final OutputStream out2 = outStream;
        UserGroupInformation ticket = remoteId.getTicket();
        if (authMethod == AuthMethod.KERBEROS) {
          if (ticket.getRealUser() != null) {
            ticket = ticket.getRealUser();
          }
        }
        boolean continueSasl = false;
        try {
          continueSasl =
            ticket.doAs(new PrivilegedExceptionAction<Boolean>() {
              @Override
              public Boolean run() throws IOException {
                return setupSaslConnection(in2, out2);
              }
            });
        } catch (Exception ex) {
          if (rand == null) {
            rand = new Random();
          }
          handleSaslConnectionFailure(numRetries++, maxRetries, ex, rand,
              ticket);
          continue;
        }
        if (continueSasl) {
          // Sasl connect is successful. Let's set up Sasl i/o streams.
          inStream = saslRpcClient.getInputStream(inStream);
          outStream = saslRpcClient.getOutputStream(outStream);
        } else {
          // fall back to simple auth because server told us so.
          authMethod = AuthMethod.SIMPLE;
          header = new ConnectionHeader(header.getProtocol(),
              header.getUgi(), authMethod);
          useSasl = false;
        }
      }
      this.in = new DataInputStream(new BufferedInputStream
          (new PingInputStream(inStream)));
      this.out = new DataOutputStream
          (new BufferedOutputStream(outStream));
      writeHeader();
      // update last activity time
      touch();
      // start the receiver thread after the socket connection has been set up;
      // it waits for the data sent back by the server. Note: Connection extends Thread.
      start();
      return;
    }
  } catch (Throwable t) {
    if (t instanceof IOException) {
      markClosed((IOException)t);
    } else {
      markClosed(new IOException("Couldn't set up IO streams", t));
    }
    close();
  }
}
One more step and we will know how the client's connection is established. Here is the setupConnection() method of the Client.Connection class:
Code 4
private synchronized void setupConnection() throws IOException {
  short ioFailures = 0;
  short timeoutFailures = 0;
  while (true) {
    try {
      this.socket = socketFactory.createSocket();
      this.socket.setTcpNoDelay(tcpNoDelay);
      /*
       * Bind the socket to the host specified in the principal name of the
       * client, to ensure Server matching address of the client connection
       * to host name in principal passed.
       */
      if (UserGroupInformation.isSecurityEnabled()) {
        KerberosInfo krbInfo =
          remoteId.getProtocol().getAnnotation(KerberosInfo.class);
        if (krbInfo != null && krbInfo.clientPrincipal() != null) {
          String host =
            SecurityUtil.getHostFromPrincipal(remoteId.getTicket().getUserName());
          // If host name is a valid local address then bind socket to it
          InetAddress localAddr = NetUtils.getLocalInetAddress(host);
          if (localAddr != null) {
            this.socket.bind(new InetSocketAddress(localAddr, 0));
          }
        }
      }
      // connection time out is 20s
      NetUtils.connect(this.socket, server, 20000);
      if (rpcTimeout > 0) {
        pingInterval = rpcTimeout; // rpcTimeout overwrites pingInterval
      }
      this.socket.setSoTimeout(pingInterval);
      return;
    } catch (SocketTimeoutException toe) {
      /* Check for an address change and update the local reference.
       * Reset the failure counter if the address was changed
       */
      if (updateAddress()) {
        timeoutFailures = ioFailures = 0;
      }
      /* The max number of retries is 45,
       * which amounts to 20s*45 = 15 minutes retries.
       */
      handleConnectionFailure(timeoutFailures++, 45, toe);
    } catch (IOException ie) {
      if (updateAddress()) {
        timeoutFailures = ioFailures = 0;
      }
      handleConnectionFailure(ioFailures++, maxRetries, ie);
    }
  }
}
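At last we know how the client's connection is established: it simply creates an ordinary socket and communicates over it. So does the server likewise create a ServerSocket to communicate? Not so fast. So far we have answered only the first question about the client, and two more remain. Let's take them one at a time.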
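The retry logic in Code 4 boils down to "try to connect, and on failure retry until a bound is exceeded". The sketch below isolates that loop so it can be run without a network. The Connector interface and the class name are hypothetical, the pause between attempts is a parameter chosen for illustration, and the real method additionally keeps separate counters for timeouts and other I/O failures and checks for address changes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, simplified version of the retry loop in Code 4.
class ConnectRetrySketch {
    // Stand-in for "create a socket and connect it to the server".
    interface Connector {
        void connect() throws IOException;
    }

    /** Returns the number of failed attempts that preceded the successful one. */
    static int connectWithRetries(Connector connector, int maxRetries, long sleepMillis) {
        int failures = 0;
        while (true) {
            try {
                connector.connect();
                return failures;
            } catch (IOException e) {
                if (failures >= maxRetries) {
                    // Retry budget exhausted: give up and report the last error.
                    throw new UncheckedIOException(e);
                }
                failures++;
                try {
                    Thread.sleep(sleepMillis);   // pause before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new UncheckedIOException(new IOException("interrupted while retrying", e));
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        int failures = connectWithRetries(() -> {
            if (attempts[0]++ < 2) {
                throw new IOException("connection refused (simulated)");
            }
        }, 10, 10L);
        System.out.println("connected after " + failures + " failed attempts");
    }
}
```

With maxRetries set to n, the connector is tried at most n + 1 times before the failure is propagated.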
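Question 2: How does the client send data to the server?
Let's look back at Code 1. Its first statement, getConnection(remoteId, call), establishes the connection, and we have finished analyzing it. The second statement, connection.sendParam(call), sends the data, so let's follow it and see whether it answers our question. Here is the sendParam() method of the Client.Connection class:
Code 5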
/** Initiates a call by sending the parameter to the remote server.
 * Note: this is not called from the Connection thread, but by other
 * threads.
 */
public void sendParam(Call call) {
  if (shouldCloseConnection.get()) {
    return;
  }
  DataOutputBuffer d=null;
  try {
    synchronized (this.out) {
      if (LOG.isDebugEnabled())
        LOG.debug(getName() + " sending #" + call.id);
      //for serializing the
      //data to be written
      d = new DataOutputBuffer();
      d.writeInt(call.id);
      call.param.write(d);
      byte[] data = d.getData();
      int dataLength = d.getLength();
      out.writeInt(dataLength); //first put the data length
      out.write(data, 0, dataLength);//write the data
      out.flush();
    }
  } catch(IOException e) {
    markClosed(e);
  } finally {
    //the buffer is just an in-memory buffer, but it is still polite to
    // close early
    IOUtils.closeStream(d);
  }
}
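This is really just the usual way of sending data over a socket with Java I/O, nothing special: the call id and the serialized parameter are first written into an in-memory buffer, and then the buffer's length followed by its contents is written to the socket's output stream. That answers question 2, so let's move on to question 3.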
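The framing used by sendParam() in Code 5 (a 4-byte length, then the call id and the serialized parameter) can be reproduced with plain java.io streams. In this sketch FrameSketch is a hypothetical name and a raw byte array stands in for the serialized Writable parameter.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical encoder that mirrors the layout written by sendParam():
// [int length][int callId][parameter bytes]
class FrameSketch {
    static byte[] encode(int callId, byte[] param) {
        try {
            // Step 1: buffer id + parameter in memory to learn the total length.
            ByteArrayOutputStream bodyBytes = new ByteArrayOutputStream();
            DataOutputStream body = new DataOutputStream(bodyBytes);
            body.writeInt(callId);
            body.write(param);      // stand-in for call.param.write(d)
            body.flush();

            // Step 2: write the length first, then the buffered data.
            ByteArrayOutputStream frameBytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(frameBytes);
            out.writeInt(bodyBytes.size());
            bodyBytes.writeTo(out);
            out.flush();
            return frameBytes.toByteArray();
        } catch (IOException e) {
            // Not expected for in-memory streams.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] frame = encode(42, new byte[] {1, 2, 3});
        System.out.println("frame length: " + frame.length);
    }
}
```

The length prefix tells the receiver how many bytes belong to one call, which is what allows several calls to share a single stream.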
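Question 3: How does the client obtain the data returned by the server?
Let's look back at Code 3 once more. In Code 3, once the connection has been set up, a thread is started to handle the data returned by the server. Let's see how that thread is implemented. Here are the relevant methods of the Client.Connection and Client.Call classes: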
Code 6
Method 1:
public void run() {
  •••
  while (waitForWork()) {
    receiveResponse(); // the actual processing method
  }
  close();
  •••
}
Method 2:
private void receiveResponse() {
  if (shouldCloseConnection.get()) {
    return;
  }
  touch();
  try {
    int id = in.readInt(); // blocking read of the call id
    if (LOG.isDebugEnabled())
      LOG.debug(getName() + " got value #" + id);
    Call call = calls.get(id); // look up the call that was sent, in the calls table
    int state = in.readInt(); // blocking read of the call's status
    if (state == Status.SUCCESS.state) {
      Writable value = ReflectionUtils.newInstance(valueClass, conf);
      value.readFields(in); // read the data
      // store the value in the call and wake the waiting client thread; see setValue() in Method 3
      call.setValue(value);
      calls.remove(id); // remove the call that has been handled
    } else if (state == Status.ERROR.state) {
      •••
    } else if (state == Status.FATAL.state) {
      •••
    }
  } catch (IOException e) {
    markClosed(e);
  }
}
Method 3:
public synchronized void setValue(Writable value) {
  this.value = value;
  callComplete(); // the actual implementation
}
protected synchronized void callComplete() {
  this.done = true;
  notify(); // wake the waiting client thread
}
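The core of receiveResponse() in Code 6 is matching each response to its pending call by id. The sketch below shows only that dispatch step on an in-memory stream, without threads. The names are hypothetical, a UTF string stands in for the Writable value, and the status value 0 for success is an assumption made for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical, single-threaded illustration of dispatching responses by call id.
class ResponseDemuxSketch {
    static final int SUCCESS = 0;   // assumed status code for this sketch

    static final class Call {
        final int id;
        String value;               // stand-in for the Writable result
        boolean done;
        Call(int id) { this.id = id; }
    }

    private final Map<Integer, Call> calls = new Hashtable<>();

    Call register(int id) {
        Call call = new Call(id);
        calls.put(id, call);
        return call;
    }

    int pendingCount() {
        return calls.size();
    }

    // Reads one response: [int id][int status][value if status is SUCCESS].
    void receiveResponse(DataInputStream in) {
        try {
            int id = in.readInt();
            Call call = calls.get(id);      // find the call this response belongs to
            int state = in.readInt();
            if (state == SUCCESS) {
                String value = in.readUTF();
                if (call != null) {
                    call.value = value;
                    call.done = true;       // the real code also notifies the waiting thread
                }
                calls.remove(id);           // the call has been handled
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: encodes successful responses in the given order.
    static byte[] encode(int[] ids, String[] values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < ids.length; i++) {
                out.writeInt(ids[i]);
                out.writeInt(SUCCESS);
                out.writeUTF(values[i]);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResponseDemuxSketch demux = new ResponseDemuxSketch();
        Call first = demux.register(1);
        demux.register(2);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(
            encode(new int[] {2, 1}, new String[] {"two", "one"})));
        demux.receiveResponse(in);
        demux.receiveResponse(in);
        System.out.println("call 1 -> " + first.value);
    }
}
```

Because responses carry the call id, they may arrive in a different order from the requests and still reach the right caller.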
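In short, Code 6 does the following: a receiver thread reads the response for a call from the server, stores the value that was read in the corresponding Call object, and then wakes the client thread waiting on that call. It is as simple as that, and the client now has the data returned by the server. This concludes the source code analysis of the client; next we will analyze the Server side.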