The proc filesystem 5

原创

改造人 2010-06-25 09:41:11 博主文章分类：系统管理 ©著作权

文章标签 职场 The 休闲 proc filesystem 文章分类 运维

©著作权归作者所有：来自51CTO博客作者改造人的原创作品，请联系作者获取转载授权，否则将追究法律责任

2.10 IPX
--------

The IPX protocol has no tunable values in proc/sys/net.

The IPX protocol does, however, provide proc/net/ipx. This lists each IPX
socket giving the local and remote addresses in Novell format (that is
network:node:port). In accordance with the strange Novell tradition,
everything but the port is in hex. Not_Connected is displayed for sockets that
are not tied to a specific remote address. The Tx and Rx queue sizes indicate
the number of bytes pending for transmission and reception. The state
indicates the state the socket is in and the uid is the owning uid of the
socket.

The /proc/net/ipx_interface file lists all IPX interfaces. For each interface
it gives the network number, the node number, and indicates if the network is
the primary network. It also indicates which device it is bound to (or
Internal for internal networks) and the Frame Type if appropriate. Linux
supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for
IPX.

The /proc/net/ipx_route table holds a list of IPX routes. For each route it
gives the destination network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.

2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem
----------------------------------------------------------

The "mqueue" filesystem provides the necessary kernel features to enable the
creation of a user space library that implements the POSIX message queues
API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System
Interfaces specification.)

The "mqueue" filesystem contains values for determining/setting the amount of
resources used by the file system.

/proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the
maximum number of message queues allowed on the system.

/proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the
maximum number of messages in a queue value. In fact it is the limiting value
for another (user) limit which is set in mq_open invocation. This attribute of
a queue must be less or equal then msg_max.

/proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the
maximum message size value (it is every message queue's attribute set during
its creation).

2.12 /proc/<pid>/oom_adj - Adjust the oom-killer score
------------------------------------------------------

This file can be used to adjust the score used to select which processes
should be killed in an out-of-memory situation. Giving it a high score will
increase the likelihood of this process being killed by the oom-killer. Valid
values are in the range -16 to +15, plus the special value -17, which disables
oom-killing altogether for this process.

2.13 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------

------------------------------------------------------------------------------
This file can be used to check the current score used by the oom-killer is for
any given <pid>. Use it together with /proc/<pid>/oom_adj to tune which
process should be killed in an out-of-memory situation.

------------------------------------------------------------------------------
Summary
------------------------------------------------------------------------------
Certain aspects of kernel behavior can be modified at runtime, without the
need to recompile the kernel, or even to reboot the system. The files in the
/proc/sys tree can not only be read, but also modified. You can use the echo
command to write value into these files, thereby changing the default settings
of the kernel.
------------------------------------------------------------------------------

2.14 /proc/<pid>/io - Display the IO accounting fields
-------------------------------------------------------

This file contains IO statistics for each running process

Example
-------

test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
[1] 3828

test:/tmp # cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0

Description
-----------

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)

wchar
-----

I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written
to disk. Similar caveats apply here as with rchar.

syscr
-----

I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read()
and pread().

syscw
-----

I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like
write() and pwrite().

read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>

write_bytes
-----------

I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.

cancelled_write_bytes
---------------------

The big inaccuracy here is truncate. If a process writes 1MB to a file and
then deletes the file, it will in fact perform no writeout. But it will have
been accounted as having caused 1MB of write.
In other words: The number of bytes which this process caused to not happen,
by truncating pagecache. A task can cause "negative" IO too. If this task
truncates some dirty pagecache, some IO which another task has been accounted
for (in it's write_bytes) will not be happening. We _could_ just subtract that
from the truncating task's write_bytes, but there is information loss in doing
that.

Note
----

At its current implementation state, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/pid/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.

More information about this can be found within the taskstats documentation in
Documentation/accounting.

2.15 /proc/<pid>/coredump_filter - Core dump filtering settings
---------------------------------------------------------------
When a process is dumped, all anonymous memory is written to a core file as
long as the size of the core file isn't limited. But sometimes we don't want
to dump some memory segments, for example, huge shared memory. Conversely,
sometimes we want to save file-backed memory segments into a core file, not
only the individual files.

/proc/<pid>/coredump_filter allows you to customize which memory segments
will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
of memory types. If a bit of the bitmask is set, memory segments of the
corresponding memory type are dumped, otherwise they are not dumped.

The following 7 memory types are supported:
- (bit 0) anonymous private memory
- (bit 1) anonymous shared memory
- (bit 2) file-backed private memory
- (bit 3) file-backed shared memory
- (bit 4) ELF header pages in file-backed private memory areas (it is
            effective only if the bit 2 is cleared)
- (bit 5) hugetlb private memory
- (bit 6) hugetlb shared memory

Note that MMIO pages such as frame buffer are never dumped and vDSO pages
are always dumped regardless of the bitmask status.

Note bit 0-4 doesn't effect any hugetlb memory. hugetlb memory are only
effected by bit 5-6.

Default value of coredump_filter is 0x23; this means all anonymous memory
segments and hugetlb private memory are dumped.

If you don't want to dump all shared memory segments attached to pid 1234,
write 0x21 to the process's proc file.

$ echo 0x21 > /proc/1234/coredump_filter

When a new process is created, the process inherits the bitmask status from its
parent. It is useful to set up coredump_filter before the program runs.
For example:

$ echo 0x7 > /proc/self/coredump_filter
$ ./some_program

2.16    /proc/<pid>/mountinfo - Information about mounts
--------------------------------------------------------

This file contains lines of the form:

36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)

(1) mount ID: unique identifier of the mount (may be reused after umount)
(2) parent ID: ID of parent (or of self for the top of the mount tree)
(3) major:minor: value of st_dev for files on filesystem
(4) root: root of the mount within the filesystem
(5) mount point: mount point relative to the process's root
(6) mount options: per mount options
(7) optional fields: zero or more fields of the form "tag[:value]"
(8) separator: marks the end of the optional fields
(9) filesystem type: name of filesystem of the form "type[.subtype]"
(10) mount source: filesystem specific information or "none"
(11) super options: per super block options

Parsers should ignore all unrecognised optional fields. Currently the
possible optional fields are:

shared:X mount is shared in peer group X
master:X mount is slave to peer group X
propagate_from:X mount is slave and receives propagation from peer group X (*)
unbindable mount is unbindable

(*) X is the closest dominant peer group under the process's root. If
X is the immediate master of the mount, or if there's no dominant peer
group under the same root, then only the "master:X" field is present
and not the "propagate_from:X" field.

For more information on mount propagation see:

Documentation/filesystems/sharedsubtree.txt

2.17    /proc/sys/fs/epoll - Configuration options for the epoll interface
--------------------------------------------------------

This directory contains configuration options for the epoll(7) interface.

max_user_instances
------------------

This is the maximum number of epoll file descriptors that a single user can
have open at a given time. The default value is 128, and should be enough
for normal users.

max_user_watches
----------------

Every epoll file descriptor can store a number of files to be monitored
for event readiness. Each one of these monitored files constitutes a "watch".
This configuration option sets the maximum number of "watches" that are
allowed for each user.
Each "watch" costs roughly 90 bytes on a 32bit kernel, and roughly 160 bytes
on a 64bit one.
The current default value for max_user_watches is the 1/32 of the available
low memory, divided for the "watch" cost in bytes.

------------------------------------------------------------------------------