Sockets are an API for IPC or network communication between processes. For local IPC, Unix domain sockets are used; for network communication, INET sockets are preferred. INET sockets work at OSI layer 3 and above. Lower-layer access is provided by bpf, the Berkeley Packet Filter. A socket gives a process a descriptor through which data or control data can be exchanged with the kernel.
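To illustrate the point about descriptors, here is a minimal sketch in Python (the function name descriptor_demo is made up for this example). It assumes a Unix-like system, where socket.socketpair() returns two already-connected sockets, and shows that each one is backed by an ordinary integer descriptor.

```python
import socket


def descriptor_demo(message: bytes = b"ping"):
    """Return the two descriptor numbers and the bytes read from the peer."""
    # socketpair() returns two connected sockets; on Unix-like systems the
    # default address family is AF_UNIX.
    left, right = socket.socketpair()
    with left, right:
        # fileno() exposes the kernel descriptor behind each socket object.
        fds = (left.fileno(), right.fileno())
        left.sendall(message)
        received = right.recv(len(message))
    return fds, received


if __name__ == "__main__":
    fds, received = descriptor_demo()
    print("descriptors:", fds, "received:", received)
```

The descriptor numbers vary from run to run; only the fact that they are small non-negative integers matters here.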
Unix domain sockets
When a Unix domain socket is set up, it is bound to a path in the local system's filesystem. The path it can be bound to is limited to 103 characters (see /usr/include/sys/un.h) instead of the filesystem limit of 1023 characters. This means that a socket should be set up close to the root, perhaps in /tmp (as sshd does). Unix domain sockets are the preferred IPC mechanism in OpenBSD because of the availability of the getpeereid syscall, which allows a daemon to check the credentials of whoever is connecting to the socket. A socket in the filesystem looks like this:
$ ls -l /tmp/ssh*
total 0
srwxr-xr-x  1 pbug  wheel  0 Oct  8 11:27 agent.1327
Notice the leading 's' in the mode field, indicating that this file is a socket.
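The binding step and the path limit can be sketched in Python as follows. This is an illustration under assumptions, not how sshd does it: the helper names bind_unix_socket and demo, the file name demo.sock, and the 200-character test name are all made up for the example. The exact length limit differs between systems, so the sketch only relies on 200 characters being well past it.

```python
import os
import socket
import stat
import tempfile


def bind_unix_socket(path: str) -> bool:
    """Bind a Unix domain socket at path; report whether a socket file appeared."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.bind(path)  # creates the filesystem entry
        mode = os.stat(path).st_mode
    os.unlink(path)  # bind() leaves the file behind, so remove it
    return stat.S_ISSOCK(mode)  # same information as the 's' in ls -l


def demo():
    with tempfile.TemporaryDirectory() as d:
        short_ok = bind_unix_socket(os.path.join(d, "demo.sock"))
        # A 200-character file name is fine for the filesystem but exceeds
        # the much smaller socket address path limit.
        try:
            bind_unix_socket(os.path.join(d, "x" * 200))
            long_rejected = False
        except OSError:
            long_rejected = True
    return short_ok, long_rejected


if __name__ == "__main__":
    print(demo())
```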
To communicate over the Internet, a program goes through the kernel, which has a built-in Internet protocol stack. Common protocols that one can talk via sockets are TCP and UDP, as well as ICMP, which is handled through the raw mode of sockets. When a program is a TCP server, the common sequence of syscalls is socket(2), bind(2), listen(2), and accept(2). When a program is a TCP client, the common sequence of syscalls is socket(2), connect(2). TCP and UDP sockets have ports to identify them. On a system one can use netstat to see which ports are in use. In UNIX only root can bind to ports below 1024; the rest are available to all users. This is shown here as an example:
$ id
uid=1000(pbug) gid=1000(pbug) groups=1000(pbug), 0(wheel), 5(operator)
$ nc -l 1023
nc: Permission denied
$ nc -l 1024
^C
$
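The server and client call sequences described above can be sketched with Python's socket module, whose methods map closely onto the syscalls. This is a minimal sketch under assumptions: the function names run_echo_once and bind_permitted are made up, everything stays on the loopback address, and the server handles exactly one connection. Port 0 asks the kernel to pick a free unprivileged port.

```python
import socket
import threading


def run_echo_once(payload: bytes) -> bytes:
    """Echo payload through a one-shot TCP server on the loopback interface."""
    # Server side: socket(2), bind(2), listen(2) ...
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the kernel choose a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        # ... and accept(2), which returns a new socket for this connection.
        conn, _peer = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: socket(2), connect(2).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect(("127.0.0.1", port))
        cli.sendall(payload)
        reply = cli.recv(1024)

    worker.join()
    srv.close()
    return reply


def bind_permitted(port: int) -> bool:
    """Return False if the kernel refuses bind(2) on this port for lack of privilege."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
        except PermissionError:
            return False
    return True


if __name__ == "__main__":
    print(run_echo_once(b"hello"))
    print("may bind 1023:", bind_permitted(1023))
```

What bind_permitted(1023) returns depends on the privileges of the calling user and on how the operating system is configured, so no particular result is claimed here.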
When a server is listening on a certain port, it is difficult to regulate who connects to this port. Early solutions relied on TCP Wrappers, which allowed one to set up a simple whitelist or blacklist of who can connect. This didn't cover UDP, though, and the program still had to accept the connection before closing it. This means that someone could stealth scan a port and learn that it was listening. Firewalls allow finer control and reveal less about open ports. BSD has ipfw, ipfw2, ipf and pf as firewalls.
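The weakness of checking after accepting can be sketched in Python. This is not how TCP Wrappers is implemented; it is a made-up allowlist check (the names serve_one and try_connect and the greeting string are assumptions) that shows the same ordering: the TCP handshake has already completed by the time the server can look at the peer address, so even a rejected client learns that the port is open.

```python
import socket
import threading


def serve_one(srv: socket.socket, allowed: set) -> None:
    """Accept one connection, then drop it if the peer is not in the allowlist."""
    conn, (peer_ip, _peer_port) = srv.accept()  # handshake is already done here
    with conn:
        if peer_ip in allowed:
            conn.sendall(b"welcome\n")
        # Otherwise the connection is simply closed after having been accepted.


def try_connect(allowed: set):
    """Return (connected, data) as observed by a loopback client."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    worker = threading.Thread(target=serve_one, args=(srv, allowed))
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as cli:
        connected = True      # connect succeeded, so the port is visibly open
        data = cli.recv(64)   # empty bytes if the server closed without sending

    worker.join()
    srv.close()
    return connected, data


if __name__ == "__main__":
    print(try_connect({"127.0.0.1"}))
    print(try_connect(set()))
```

In both cases the client's connect succeeds; only the data that follows differs. A firewall can drop or reject the connection attempt before any handshake takes place.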
$ fstat | grep traceroute
pbug     traceroute 18184   wd /usr     6310091 drwxr-xr-x   r     2048
pbug     traceroute 18184    0 /          84995 crw--w----  rw    ttyp1
pbug     traceroute 18184    1 /          84995 crw--w----  rw    ttyp1
pbug     traceroute 18184    2 /          84995 crw--w----  rw    ttyp1
pbug     traceroute 18184    3* internet raw icmp 0xfffffe800f85d678
pbug     traceroute 18184    4* internet raw reserved 0xfffffe801bc73688
Raw sockets are restricted to the superuser (root). When writing to raw sockets you can manipulate the IP header, which you cannot do with simple TCP or UDP sockets (certain setsockopt(2) options are an exception, though some of them require superuser permissions). The traceroute program uses raw sockets in a UNIX system. It is also possible to read from raw sockets, i.e. if you specified a protocol when creating the socket, you can read any packets that arrive for that protocol. In BSD, if the protocol is ICMP, the ICMP types echo request (8), timestamp request (13) and address mask request (17) are not passed from the kernel to the socket. Similarly, TCP and UDP packets are not passed to a raw socket; these must use stream or dgram sockets or be read from the OSI datalink layer (see bpf).
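The privilege restriction can be probed with a short Python sketch. The function name can_open_raw_icmp is made up for this example, and the sketch only tries to create a raw ICMP socket without sending or receiving anything. Whether the call succeeds depends on the privileges of the process running it.

```python
import socket


def can_open_raw_icmp() -> bool:
    """Try to create a raw ICMP socket; return False if the kernel refuses."""
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:
        # Typically a permission error when the caller is not privileged.
        return False
    s.close()
    return True


if __name__ == "__main__":
    print("raw ICMP socket allowed:", can_open_raw_icmp())
```

Run as an ordinary user this is expected to report False, and as root True, in line with the restriction described above.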
Linux has divert sockets too. They may once have been limited to BSD, but this is no longer the case: they are available on Debian-derived Linux distributions. Referenced here: http://manpages.ubuntu.com/manpages/hardy/man4/divert.4.html