Linux I/O Basics
Tips:
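At the lowest level, Linux implements I/O through files
High-level languages such as Java implement network communication by invoking Linux system calls (syscalls)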
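Background
In Linux, every kind of resource is abstracted as a file: regular files, directories, character devices, block devices, sockets, and so on
Memory is divided into kernel space and user space, and data is copied between the two. Code running in kernel mode can access user-space data, but user-mode code cannot access kernel-space data
Only the kernel can operate hardware resources (network cards, disks, and so on); the kernel exposes them to programs through system calls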
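File descriptors
A file descriptor is an index that the kernel creates to manage open files; it refers to one open file. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process.
Every system call that performs I/O operates on a file descriptor
On a Linux system, after logging in over ssh and looking under /proc, you can see that each process normally starts with three fds: 0, 1, and 2 (standard input, standard output, and standard error)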
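The three standard descriptors are also visible from Java. The following is a minimal sketch, not a recommended way to print: the JDK exposes fds 0, 1, and 2 as FileDescriptor.in, FileDescriptor.out, and FileDescriptor.err, and a stream can be wrapped around one of them directly. The class name StdFds and the method writeToStdout are hypothetical names chosen for this example.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: writes to standard output through its file descriptor.
public class StdFds {

    // Writes one line to fd 1 (standard output) without going through System.out.
    static void writeToStdout(String line) {
        // Deliberately not closed: closing this stream would close fd 1 for the whole process.
        FileOutputStream out = new FileOutputStream(FileDescriptor.out);
        try {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        writeToStdout("written through fd 1");
        // valid() reports whether the descriptor refers to an open file.
        System.out.println("stdin valid:  " + FileDescriptor.in.valid());
        System.out.println("stdout valid: " + FileDescriptor.out.valid());
        System.out.println("stderr valid: " + FileDescriptor.err.valid());
    }
}
```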
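User mode and kernel mode
Memory is divided into kernel space and user space, and data is copied between the two. Code running in kernel mode can access user-space data, but user-mode code cannot access kernel-space data
User-mode code cannot access disks, network cards, or other devices directly; it has to go through the system calls that the kernel provides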
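To make the copying between user space and kernel space more concrete, here is a minimal sketch that uses only the JDK. The class name UserKernelCopy and the method roundTrip are hypothetical. The byte arrays live in the JVM's user-space memory; on Linux the write and read calls are expected to end up as write and read system calls, where the kernel copies data out of and back into those arrays.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: sends bytes to the kernel and reads them back.
public class UserKernelCopy {

    // Writes text to a temporary file, reads it back, and returns what was read.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("copy-demo", ".txt");
            try {
                // User-space buffer handed to the kernel for writing.
                byte[] out = text.getBytes(StandardCharsets.UTF_8);
                Files.write(tmp, out);

                // Separate user-space buffer that the kernel fills on read.
                byte[] in = new byte[out.length];
                int total = 0;
                try (InputStream is = Files.newInputStream(tmp)) {
                    while (total < in.length) {
                        int n = is.read(in, total, in.length - total);
                        if (n < 0) {
                            break; // end of file
                        }
                        total += n;
                    }
                }
                return new String(in, 0, total, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("copied through the kernel"));
    }
}
```

Running the program under strace should show the corresponding system calls, although the exact calls depend on the JDK version.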
System calls
Let's run the following Java code and see what happens in the system:
import java.io.IOException;
import java.net.ServerSocket;

public class BIOServer {
    public static void main(String[] args) throws IOException {
        // Create a server socket listening on port 8080
        ServerSocket server = new ServerSocket(8080);
        // Block until a client connects
        server.accept();
    }
}
Use strace to capture the system calls the process makes:
strace -ff -o out java BIOServer
Look through the generated files and find the key lines:
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 5
bind(5, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(5, 50)
Three system calls appear here: socket, bind, and listen. Let's look at the Linux manual page for each:
socket
NAME
socket - create an endpoint for communication
DESCRIPTION
socket() creates an endpoint for communication and returns a descriptor.
RETURN VALUE
On success, a file descriptor for the new socket is returned. On error, -1 is returned, and errno is set appropriately.
socket() creates a communication endpoint and returns a file descriptor (fd); on failure it returns -1
bind
NAME
bind - bind a name to a socket
SYNOPSIS
#include <sys/types.h> /* See NOTES */
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr,
socklen_t addrlen);
DESCRIPTION
When a socket is created with socket(2), it exists in a name space (address family) but has no address assigned to it.
bind() assigns the address specified by addr to the socket referred to by the file descriptor sockfd. addrlen specifies the size, in bytes, of the address structure pointed to by addr. Traditionally, this operation is called "assigning a name to a socket".
RETURN VALUE
On success, zero is returned. On error, -1 is returned, and errno is set appropriately.
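bind() takes three arguments (the file descriptor returned by socket(), the socket address structure, and the length of that structure); it returns 0 on success and -1 on failure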
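On the Java side, the address that is passed down to bind() is described by an InetSocketAddress. The sketch below is only an illustration, and the class name BindAddressDemo and the method wildcard are hypothetical. It shows that an address built from a port alone is the wildcard address, which corresponds to the 0.0.0.0 in the strace output.

```java
import java.net.InetSocketAddress;

// Hypothetical demo class: inspects the address a server binds to by default.
public class BindAddressDemo {

    // Builds the socket address used when only a port is given.
    static InetSocketAddress wildcard(int port) {
        return new InetSocketAddress(port);
    }

    public static void main(String[] args) {
        InetSocketAddress addr = wildcard(8080);
        System.out.println("port:     " + addr.getPort());
        // true means "any local address", i.e. listen on every interface.
        System.out.println("wildcard: " + addr.getAddress().isAnyLocalAddress());
    }
}
```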
listen
NAME
listen - listen for connections on a socket
SYNOPSIS
#include <sys/types.h> /* See NOTES */
#include <sys/socket.h>
int listen(int sockfd, int backlog);
DESCRIPTION
listen() marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept
incoming connection requests using accept(2).
The sockfd argument is a file descriptor that refers to a socket of type SOCK_STREAM or SOCK_SEQPACKET.
The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.
RETURN VALUE
On success, zero is returned. On error, -1 is returned, and errno is set appropriately.
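listen() takes two arguments (the file descriptor returned by socket(), and the maximum length of the queue of pending connections); it returns 0 on success and -1 on failure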
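Looking again at how the three calls relate, the order is: socket() creates the fd, bind() assigns an address to that fd, and listen() marks it as a passive socket that accepts incoming connections.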
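The same sequence can be spelled out in Java by separating construction from binding. This is a minimal sketch under some assumptions: ServerSocket() and bind(SocketAddress, int) are documented JDK APIs, but how the JDK maps them onto socket, bind, and listen is an implementation detail, so the comments describe the expected behavior rather than a guarantee. The class name ExplicitBind and the method open are hypothetical, and port 0 is used so that the kernel picks a free port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class: creates, binds, and listens in separate steps.
public class ExplicitBind {

    // Opens a listening socket on an ephemeral port with the given backlog.
    static ServerSocket open(int backlog) {
        try {
            ServerSocket server = new ServerSocket(); // unbound server socket
            try {
                // Assigns the address and starts listening; backlog is the
                // requested length of the pending-connection queue.
                server.bind(new InetSocketAddress(0), backlog);
                return server;
            } catch (IOException e) {
                server.close(); // do not leak the socket if bind fails
                throw e;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // 50 is the JDK's default backlog, matching listen(5, 50) in the trace.
        try (ServerSocket server = open(50)) {
            System.out.println("bound: " + server.isBound());
            System.out.println("port:  " + server.getLocalPort());
        }
    }
}
```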
Inspect the file descriptors of the Java process:
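One way to do this from inside the JVM is to read /proc/self/fd, where each entry is a symbolic link to whatever the descriptor refers to. The following is a sketch that assumes a Linux system with /proc mounted; the class name SocketFds and its methods are hypothetical. On Linux, the link target of a socket descriptor has the form socket:[inode].

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical demo class: lists this process's open fds and their targets.
public class SocketFds {

    // Maps each open fd number to the target of its /proc/self/fd symlink.
    static Map<Integer, String> fdTargets() {
        Map<Integer, String> result = new TreeMap<>();
        Path dir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(dir)) {
            return result; // not Linux, or /proc is not mounted
        }
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> {
                try {
                    int fd = Integer.parseInt(p.getFileName().toString());
                    result.put(fd, Files.readSymbolicLink(p).toString());
                } catch (IOException e) {
                    // The fd was closed between listing and reading; skip it.
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Opens a server socket and reports whether a socket fd is then visible.
    static boolean hasSocketFd() {
        try (ServerSocket server = new ServerSocket(0)) {
            return fdTargets().values().stream()
                    .anyMatch(target -> target.startsWith("socket:"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            fdTargets().forEach((fd, target) ->
                    System.out.println(fd + " -> " + target));
        }
    }
}
```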
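From this we can conclude:
1. Java implements network I/O by calling into the operating system
2. Behind the single line of Java code ServerSocket server = new ServerSocket(8080); there are several system calls
3. Network I/O is not a capability of Java itself; it is a capability provided by the operating system kernel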
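Memo
In Linux, everything is handled through file descriptors
User-space programs access the operating system's software and hardware resources by making system calls
Network I/O is provided by the Linux kernel, not by high-level languages such as Java or Python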
Source: InfoQ