Difference between revisions of "File descriptors"


Latest revision as of 13:06, 25 February 2008

A file descriptor is a handle in a program that allows data to be read and written. It is assigned a number starting at 0 and going up to the file descriptor limit. A descriptor of -1 indicates an error. Strictly speaking, file descriptors refer to files in filesystems, but other sorts of descriptors (such as sockets) behave similarly, so they are grouped together here.
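
As a rough illustration of the numbering described above, the following Python sketch queries the per-process descriptor limit and opens the null device to see which number comes back. The helper names are made up for this example. Note that Python's os module reports failure by raising OSError rather than returning -1 as the underlying C calls do.

```python
import os
import resource


def descriptor_limit():
    """Return the soft limit on open descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def open_null():
    """Open the null device read-only and return the new descriptor number."""
    return os.open(os.devnull, os.O_RDONLY)


if __name__ == "__main__":
    fd = open_null()
    print("got descriptor", fd, "soft limit is", descriptor_limit())
    os.close(fd)
    # The kernel hands out the lowest unused number, so the same
    # number is normally returned again once it has been closed.
    again = open_null()
    print("reopened as", again)
    os.close(again)
```

In a single-threaded program, closing a descriptor and opening another one gives back the same number, because the lowest free number is always used.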


Descriptors

When a new process is created (by means of the fork(2) system call), it inherits all open descriptors from the parent process. New descriptors are created by the open(2), pipe(2), socket(2), accept(2) and socketpair(2) system calls. A descriptor can be duplicated with the dup(2) system call, which creates a new descriptor with its own number, and destroyed with the close(2) system call. By default, all programs executed from a shell start with three descriptors open and bound to the terminal: stdin, stdout and stderr, with descriptor numbers 0, 1 and 2 respectively. More about descriptors can be found in the fd manual page on OpenBSD; run "man 4 fd" to see it.
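
The inheritance and duplication described above can be sketched in Python, whose os module wraps the same system calls. The function names are hypothetical, and the sketch assumes a POSIX system where os.fork is available.

```python
import os


def child_writes_through_inherited_pipe(message):
    """Fork a child that writes to an inherited pipe; return what the parent reads."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: both pipe ends were inherited from the parent.
        os.close(read_end)
        os.write(write_end, message)
        os._exit(0)
    # Parent: close our copy of the write end so the read sees end-of-file.
    os.close(write_end)
    data = b""
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        data += chunk
    os.close(read_end)
    os.waitpid(pid, 0)
    return data


def duplicate(fd):
    """dup(2): return a second, different number for the same open file."""
    return os.dup(fd)
```

After duplicate(fd), closing the original number leaves the copy usable, since both numbers refer to the same open file.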

A program (such as a user shell) can do careful plumbing with the pipe(2), fork(2), exec(2) and dup2(2) syscalls to connect the stdout of one program to the stdin of another; this is called piping (see pipe). Long pipe chains (properly: pipelines) can be created this way.
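
A minimal sketch of this plumbing in Python follows. It is not how any particular shell is implemented; run_pipeline and _exec_child are names invented here, a second pipe is added only so the parent can capture the result, and the sketch assumes descriptors 0 to 2 are already open in the calling process.

```python
import os


def _exec_child(argv, stdin_fd, stdout_fd, close_fds):
    """In a forked child: rewire stdin/stdout, close the extras, then exec argv."""
    try:
        if stdin_fd is not None:
            os.dup2(stdin_fd, 0)    # descriptor 0 now refers to stdin_fd's file
        if stdout_fd is not None:
            os.dup2(stdout_fd, 1)   # descriptor 1 now refers to stdout_fd's file
        for fd in close_fds:
            os.close(fd)
        os.execvp(argv[0], argv)
    finally:
        os._exit(127)               # only reached if something above failed


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return the consumer's output as bytes."""
    link_r, link_w = os.pipe()      # producer stdout -> consumer stdin
    out_r, out_w = os.pipe()        # consumer stdout -> parent
    extras = (link_r, link_w, out_r, out_w)

    first = os.fork()
    if first == 0:
        _exec_child(producer, None, link_w, extras)
    second = os.fork()
    if second == 0:
        _exec_child(consumer, link_r, out_w, extras)

    # Parent: keep only the end it reads from, or end-of-file never arrives.
    for fd in (link_r, link_w, out_w):
        os.close(fd)
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    os.waitpid(first, 0)
    os.waitpid(second, 0)
    return b"".join(chunks)
```

For example, run_pipeline(["echo", "hello"], ["tr", "a-z", "A-Z"]) behaves like the shell command echo hello | tr a-z A-Z. Closing every unused pipe end matters: a reader only sees end-of-file once all copies of the write end are closed.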

When a program becomes a daemon, stdin, stdout and stderr are usually closed and reattached to the null device, since the daemon no longer has a controlling terminal. All writes to these descriptors are discarded.
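
The redirection step alone might look like the Python sketch below. It is not a complete daemonization routine (it leaves out steps such as detaching from the session), and both function names are made up for illustration. The demonstration runs in a forked child so the caller's own terminal descriptors are left untouched.

```python
import os


def redirect_stdio_to_null():
    """Point descriptors 0, 1 and 2 at the null device."""
    null_fd = os.open(os.devnull, os.O_RDWR)
    for std_fd in (0, 1, 2):
        os.dup2(null_fd, std_fd)    # closes the old std_fd as a side effect
    if null_fd > 2:
        os.close(null_fd)


def demo_in_child():
    """Redirect inside a forked child; return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            redirect_stdio_to_null()
            written = os.write(1, b"discarded")   # accepted, then thrown away
            at_eof = os.read(0, 1) == b""         # reads see end-of-file
            status = 0 if (written == 9 and at_eof) else 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

demo_in_child() returns 0 when the write was accepted and the read saw end-of-file, which is the behaviour of the null device.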

Control operations on a descriptor let a program talk to the kernel about it directly, without reading or writing data through it. Interfaces for this include fcntl(2), ioctl(2) and tcsetattr(3). With fcntl one can manipulate the close-on-exec flag, which is useful in setuid programs.
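
As an illustration of the fcntl interface, this Python sketch reads and changes the close-on-exec flag with F_GETFD and F_SETFD. The two helper names are assumptions of this example, not part of any library.

```python
import fcntl
import os


def set_close_on_exec(fd, enabled=True):
    """Set or clear FD_CLOEXEC on fd using F_GETFD and F_SETFD."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def is_close_on_exec(fd):
    """Return True if fd will be closed automatically across exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
```

Reading the flags first and writing back the modified value keeps any other descriptor flags intact. Python itself marks the descriptors it creates as close-on-exec by default, so clearing the flag is the more common need there.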

Descriptors can be polled, without reading from or writing to them, to find out whether one is ready for I/O. Two interfaces for this are poll(2) and select(2).
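
A small Python sketch of the select(2) approach, using the standard select module; readable_now is a name chosen for this example.

```python
import os
import select


def readable_now(fds):
    """Return the descriptors in fds that can be read without blocking."""
    # A timeout of 0 makes select return immediately instead of waiting.
    ready, _, _ = select.select(fds, [], [], 0)
    return ready


if __name__ == "__main__":
    r, w = os.pipe()
    print(readable_now([r]))    # empty pipe: nothing is ready
    os.write(w, b"x")
    print(readable_now([r]))    # one byte is waiting: r is reported
    os.close(r)
    os.close(w)
```

Passing a positive timeout, or None, would instead make the call wait until one of the descriptors becomes ready.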

Descriptors can be set non-blocking, meaning that a read returns right away when no data is available rather than blocking until data arrives. If a blocking condition exists, the syscall returns immediately with an error and sets errno to EWOULDBLOCK.
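
In Python the same condition surfaces as a BlockingIOError exception carrying the errno value. The sketch below assumes Python 3.5 or later for os.set_blocking; try_read is a hypothetical helper.

```python
import errno
import os


def try_read(fd, size=4096):
    """Read from a non-blocking fd; return None if the read would block."""
    try:
        return os.read(fd, size)
    except BlockingIOError as exc:
        # exc.errno is EWOULDBLOCK, which is the same number as EAGAIN
        # on many systems.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise


if __name__ == "__main__":
    r, w = os.pipe()
    os.set_blocking(r, False)   # sets O_NONBLOCK on the read end
    print(try_read(r))          # nothing written yet, so no data
    os.write(w, b"data")
    print(try_read(r))
    os.close(r)
    os.close(w)
```

Non-blocking descriptors are usually combined with poll(2) or select(2), so the program only retries once the kernel reports the descriptor as ready.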

Instead of being inherited from a parent by means of the fork(2) syscall, a descriptor can be passed from one process to another. This is done through a UNIX domain socket. One program that uses this technique is sshd.
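
A minimal sketch of descriptor passing in Python, assuming version 3.9 or later for socket.send_fds and socket.recv_fds. To stay short it sends the descriptor across a socketpair inside a single process; in real use the two socket ends would belong to different processes. The function name is made up, and this says nothing about how sshd itself does it.

```python
import os
import socket


def pass_descriptor(fd):
    """Send fd over a UNIX domain socket pair and return the received copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # The descriptor travels as ancillary data; one byte of ordinary
        # data is sent along with it, which some systems require.
        socket.send_fds(sender, [b"x"], [fd])
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        return fds[0]
    finally:
        sender.close()
        receiver.close()
```

The receiver gets a new descriptor number that refers to the same open file, much like the result of dup(2), so data written through the received descriptor arrives at the same place as data written through the original.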

How do I see descriptors of a process?

OpenBSD

There are a number of ways to see the open file descriptors of another program. On BSD, the fstat(1) command lists all running programs (processes) on the system and their open descriptors. It also lists what type each descriptor is (file, socket, pipe, etc.) and tries to give a hint of what the descriptor is reading from or writing to, such as the filesystem and the inode number on that filesystem. On OpenBSD, network sockets show which port number they have open, which is useful for finding the program behind that port. See ports.


Below is a list of descriptors of different types, along with examples of what to do with the information given by fstat (these examples use fstat from OpenBSD):


In fstat, a file descriptor on a regular file looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
bituser  python2.3   5233   26 /usr     9050558 -rw-r--r--  rw 48758784
...
$ find /usr/home/bituser -inum 9050558 -print 2>/dev/null
/usr/home/bituser/downloads/trusted-computing/TrustedComputing_LAFKON_HIGH.mov

An Internet socket that is listening looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
root     sshd        3641    5* internet stream tcp 0xfffffe800f551008 *:22
...
$ netstat -naA | head -2
Active Internet connections (including servers)
PCB                Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
$ netstat -naA | awk '$1 == "0xfffffe800f551008" { print }'
0xfffffe800f551008 tcp        0      0  *.22               *.*                LISTEN

A connected Internet socket descriptor looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
bituser  python2.3   5233   27* internet stream tcp 0xfffffe801be04d18 85.75.59.86:6884 <-- 80.177.208.19:12362

A UNIX domain socket looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
pbug     ssh-agent  12530    4* unix stream 0xffff800001d80800

A UNIX domain socketpair looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
root     ntpd       12284    3* unix stream 0xffff800001cd6480 <-> 0xffff800001cd6700

A descriptor on a pipe looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
pbug     fstat      26338    1 pipe 0xfffffe801f525d28 state:

A descriptor on a fifo looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
pbug     sh          3840    3 /tmp          5 prw-r--r--   r        0:0

A revoke(2)'ed descriptor looks like this:
USER     CMD          PID   FD MOUNT      INUM MODE       R/W    DV|SZ
pbug     xterm      15604    5 -         -        none    -
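
The find -inum step from the regular-file example above, which turns the inode number reported by fstat back into a path, can also be sketched in Python. find_by_inode is a name invented here; like find without -xdev, it does not check that a match lives on the filesystem named in the MOUNT column, so the starting directory should be chosen with that in mind.

```python
import os
import tempfile


def find_by_inode(root, inode):
    """Yield paths under root whose inode number matches, like find -inum."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inode:
                    yield path
            except OSError:
                continue    # entry vanished or is unreadable, skip it


if __name__ == "__main__":
    # Self-contained demonstration on a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        open(sample, "w").close()
        print(list(find_by_inode(tmp, os.stat(sample).st_ino)))
```

Inode numbers are only unique within one filesystem, which is why the original example starts the search inside the filesystem that fstat reported.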


Linux

On Linux, lsof digs through the /proc filesystem to show open descriptors and ports. To look for yourself, list the /proc/<PID>/fd directory; each symlink there points to an open file, socket, and so on. An example from xterm:

 jbecker@aubrey /space/winders $ ls -l /proc/9383/fd
 total 6
 lr-x------  1 jbecker users 64 Oct 15 09:53 0 -> /dev/null
 l-wx------  1 jbecker users 64 Oct 15 09:53 1 -> /home/jbecker/.xsession-errors
 l-wx------  1 jbecker users 64 Oct 15 09:53 2 -> /home/jbecker/.xsession-errors
 l-wx------  1 jbecker users 64 Oct 15 09:53 3 -> /home/jbecker/.fluxbox/log
 lrwx------  1 jbecker users 64 Oct 15 09:53 4 -> socket:[2995011]
 lrwx------  1 jbecker users 64 Oct 15 09:53 5 -> /dev/ptmx
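
The same listing can be read programmatically. The Python sketch below assumes a Linux system with /proc mounted and enough permission to read the target process's fd directory; open_descriptors is a hypothetical helper name.

```python
import os


def open_descriptors(pid="self"):
    """Map descriptor number to link target by reading /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue    # descriptor was closed while we were listing
    return result


if __name__ == "__main__":
    for number, target in sorted(open_descriptors().items()):
        print(number, "->", target)
```

Sockets and pipes have no path, so their targets appear in the form socket:[inode] or pipe:[inode], as in the listing above.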