2.10 IPX
--------

The IPX protocol has no tunable values in proc/sys/net.

The IPX  protocol  does,  however,  provide  proc/net/ipx. This lists each IPX
socket giving  the  local  and  remote  addresses  in  Novell  format (that is
network:node:port). In  accordance  with  the  strange  Novell  tradition,
everything but the port is in hex. Not_Connected is displayed for sockets that
are not  tied to a specific remote address. The Tx and Rx queue sizes indicate
the number  of  bytes  pending  for  transmission  and  reception.  The  state
indicates the  state  the  socket  is  in and the uid is the owning uid of the
socket.

The /proc/net/ipx_interface  file lists all IPX interfaces. For each interface
it gives  the network number, the node number, and indicates if the network is
the primary  network.  It  also  indicates  which  device  it  is bound to (or
Internal for  internal  networks)  and  the  Frame  Type if appropriate. Linux
supports 802.3,  802.2,  802.2  SNAP  and DIX (Blue Book) ethernet framing for
IPX.

The /proc/net/ipx_route  table  holds  a list of IPX routes. For each route it
gives the  destination  network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.

2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem
----------------------------------------------------------

The "mqueue"  filesystem provides  the necessary kernel features to enable the
creation of a  user space  library that  implements  the  POSIX message queues
API (as noted by the  MSG tag in the  POSIX 1003.1-2001 version  of the System
Interfaces specification.)

The "mqueue" filesystem contains values for determining/setting  the amount of
resources used by the file system.

/proc/sys/fs/mqueue/queues_max is a read/write  file for  setting/getting  the
maximum number of message queues allowed on the system.

/proc/sys/fs/mqueue/msg_max  is  a  read/write file  for  setting/getting  the
maximum number of messages in a queue value.  In fact it is the limiting value
for another (user) limit which is set in mq_open invocation. This attribute of
a queue must be less or equal then msg_max.

/proc/sys/fs/mqueue/msgsize_max is  a read/write  file for setting/getting the
maximum  message size value (it is every  message queue's attribute set during
its creation).

2.12 /proc/<pid>/oom_adj - Adjust the oom-killer score
------------------------------------------------------

This file can be used to adjust the score used to select which processes
should be killed in an  out-of-memory  situation.  Giving it a high score will
increase the likelihood of this process being killed by the oom-killer.  Valid
values are in the range -16 to +15, plus the special value -17, which disables
oom-killing altogether for this process.

2.13 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------

------------------------------------------------------------------------------
This file can be used to check the current score used by the oom-killer is for
any given <pid>. Use it together with /proc/<pid>/oom_adj to tune which
process should be killed in an out-of-memory situation.

------------------------------------------------------------------------------
Summary
------------------------------------------------------------------------------
Certain aspects  of  kernel  behavior  can be modified at runtime, without the
need to  recompile  the kernel, or even to reboot the system. The files in the
/proc/sys tree  can  not only be read, but also modified. You can use the echo
command to write value into these files, thereby changing the default settings
of the kernel.
------------------------------------------------------------------------------

2.14  /proc/<pid>/io - Display the IO accounting fields
-------------------------------------------------------

This file contains IO statistics for each running process

Example
-------

test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
[1] 3828

test:/tmp # cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0


Description
-----------

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)


wchar
-----

I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written
to disk. Similar caveats apply here as with rchar.


syscr
-----

I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read()
and pread().


syscw
-----

I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like
write() and pwrite().


read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>


write_bytes
-----------

I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.


cancelled_write_bytes
---------------------

The big inaccuracy here is truncate. If a process writes 1MB to a file and
then deletes the file, it will in fact perform no writeout. But it will have
been accounted as having caused 1MB of write.
In other words: The number of bytes which this process caused to not happen,
by truncating pagecache. A task can cause "negative" IO too. If this task
truncates some dirty pagecache, some IO which another task has been accounted
for (in it's write_bytes) will not be happening. We _could_ just subtract that
from the truncating task's write_bytes, but there is information loss in doing
that.


Note
----

At its current implementation state, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/pid/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.


More information about this can be found within the taskstats documentation in
Documentation/accounting.

2.15 /proc/<pid>/coredump_filter - Core dump filtering settings
---------------------------------------------------------------
When a process is dumped, all anonymous memory is written to a core file as
long as the size of the core file isn't limited. But sometimes we don't want
to dump some memory segments, for example, huge shared memory. Conversely,
sometimes we want to save file-backed memory segments into a core file, not
only the individual files.

/proc/<pid>/coredump_filter allows you to customize which memory segments
will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
of memory types. If a bit of the bitmask is set, memory segments of the
corresponding memory type are dumped, otherwise they are not dumped.

The following 7 memory types are supported:
  - (bit 0) anonymous private memory
  - (bit 1) anonymous shared memory
  - (bit 2) file-backed private memory
  - (bit 3) file-backed shared memory
  - (bit 4) ELF header pages in file-backed private memory areas (it is
            effective only if the bit 2 is cleared)
  - (bit 5) hugetlb private memory
  - (bit 6) hugetlb shared memory

  Note that MMIO pages such as frame buffer are never dumped and vDSO pages
  are always dumped regardless of the bitmask status.

  Note bit 0-4 doesn't effect any hugetlb memory. hugetlb memory are only
  effected by bit 5-6.

Default value of coredump_filter is 0x23; this means all anonymous memory
segments and hugetlb private memory are dumped.

If you don't want to dump all shared memory segments attached to pid 1234,
write 0x21 to the process's proc file.

  $ echo 0x21 > /proc/1234/coredump_filter

When a new process is created, the process inherits the bitmask status from its
parent. It is useful to set up coredump_filter before the program runs.
For example:

  $ echo 0x7 > /proc/self/coredump_filter
  $ ./some_program

2.16    /proc/<pid>/mountinfo - Information about mounts
--------------------------------------------------------

This file contains lines of the form:

36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)

(1) mount ID:  unique identifier of the mount (may be reused after umount)
(2) parent ID:  ID of parent (or of self for the top of the mount tree)
(3) major:minor:  value of st_dev for files on filesystem
(4) root:  root of the mount within the filesystem
(5) mount point:  mount point relative to the process's root
(6) mount options:  per mount options
(7) optional fields:  zero or more fields of the form "tag[:value]"
(8) separator:  marks the end of the optional fields
(9) filesystem type:  name of filesystem of the form "type[.subtype]"
(10) mount source:  filesystem specific information or "none"
(11) super options:  per super block options

Parsers should ignore all unrecognised optional fields.  Currently the
possible optional fields are:

shared:X  mount is shared in peer group X
master:X  mount is slave to peer group X
propagate_from:X  mount is slave and receives propagation from peer group X (*)
unbindable  mount is unbindable

(*) X is the closest dominant peer group under the process's root.  If
X is the immediate master of the mount, or if there's no dominant peer
group under the same root, then only the "master:X" field is present
and not the "propagate_from:X" field.

For more information on mount propagation see:

  Documentation/filesystems/sharedsubtree.txt

2.17    /proc/sys/fs/epoll - Configuration options for the epoll interface
--------------------------------------------------------

This directory contains configuration options for the epoll(7) interface.

max_user_instances
------------------

This is the maximum number of epoll file descriptors that a single user can
have open at a given time. The default value is 128, and should be enough
for normal users.

max_user_watches
----------------

Every epoll file descriptor can store a number of files to be monitored
for event readiness. Each one of these monitored files constitutes a "watch".
This configuration option sets the maximum number of "watches" that are
allowed for each user.
Each "watch" costs roughly 90 bytes on a 32bit kernel, and roughly 160 bytes
on a 64bit one.
The current default value for  max_user_watches  is the 1/32 of the available
low memory, divided for the "watch" cost in bytes.


------------------------------------------------------------------------------