<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<HTML><HEAD><TITLE>Man page of TCP</TITLE>
|
|
</HEAD><BODY>
|
|
<H1>TCP</H1>
|
|
Section: Linux Programmer's Manual (7)<BR>Updated: 2020-02-09<BR><A HREF="#index">Index</A>
|
|
<A HREF="/cgi-bin/man/man2html">Return to Main Contents</A><HR>
|
|
|
|
<A NAME="lbAB"> </A>
|
|
<H2>NAME</H2>
|
|
|
|
tcp - TCP protocol
|
|
<A NAME="lbAC"> </A>
|
|
<H2>SYNOPSIS</H2>
|
|
|
|
<B>#include &lt;<A HREF="file:///usr/include/sys/socket.h">sys/socket.h</A>&gt;</B>
<BR>
<B>#include &lt;<A HREF="file:///usr/include/netinet/in.h">netinet/in.h</A>&gt;</B>
<BR>
<B>#include &lt;<A HREF="file:///usr/include/netinet/tcp.h">netinet/tcp.h</A>&gt;</B>
|
|
|
|
<P>
|
|
|
|
<B>tcp_socket = socket(AF_INET, SOCK_STREAM, 0);</B>
|
|
|
|
<A NAME="lbAD"> </A>
|
|
<H2>DESCRIPTION</H2>
|
|
|
|
This is an implementation of the TCP protocol defined in
|
|
RFC 793, RFC 1122 and RFC 2001 with the NewReno and SACK
|
|
extensions.
|
|
It provides a reliable, stream-oriented,
|
|
full-duplex connection between two sockets on top of
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7),
|
|
|
|
for both v4 and v6 versions.
|
|
TCP guarantees that the data arrives in order and
|
|
retransmits lost packets.
|
|
It generates and checks a per-packet checksum to catch
|
|
transmission errors.
|
|
TCP does not preserve record boundaries.
|
|
<P>
|
|
|
|
A newly created TCP socket has no remote or local address and is not
|
|
fully specified.
|
|
To create an outgoing TCP connection use
|
|
<B><A HREF="/cgi-bin/man/man2html?2+connect">connect</A></B>(2)
|
|
|
|
to establish a connection to another TCP socket.
|
|
To receive new incoming connections, first
|
|
<B><A HREF="/cgi-bin/man/man2html?2+bind">bind</A></B>(2)
|
|
|
|
the socket to a local address and port and then call
|
|
<B><A HREF="/cgi-bin/man/man2html?2+listen">listen</A></B>(2)
|
|
|
|
to put the socket into the listening state.
|
|
After that a new socket for each incoming connection can be accepted using
|
|
<B><A HREF="/cgi-bin/man/man2html?2+accept">accept</A></B>(2).
|
|
|
|
A socket which has had
|
|
<B><A HREF="/cgi-bin/man/man2html?2+accept">accept</A></B>(2)
|
|
|
|
or
|
|
<B><A HREF="/cgi-bin/man/man2html?2+connect">connect</A></B>(2)
|
|
|
|
successfully called on it is fully specified and may transmit data.
|
|
Data cannot be transmitted on listening or not yet connected sockets.
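<P>
The connection setup described above can be sketched in C as follows.
This is a minimal illustration, not a complete program: the helper names
<I>open_loopback_listener</I> and <I>connect_loopback</I> are hypothetical,
the loopback address and the backlog of 16 are arbitrary choices,
and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind() to 127.0.0.1 on a kernel-chosen port, then
 * listen(). Stores the chosen port (host byte order) in *port.
 * Returns the listening descriptor, or -1 on error. */
int open_loopback_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: connect() to 127.0.0.1:port.
 * Returns the connected descriptor, or -1 on error. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pass the port stored by the first helper to the second,
then call accept(2) on the listening descriptor to obtain the fully
specified socket for that connection.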
|
|
<P>
|
|
|
|
Linux supports RFC 1323 TCP high performance
|
|
extensions.
|
|
These include Protection Against Wrapped
|
|
Sequence Numbers (PAWS), Window Scaling and Timestamps.
|
|
Window scaling allows the use
|
|
of large (> 64 kB) TCP windows in order to support links with high
|
|
latency or bandwidth.
|
|
To make use of them, the send and receive buffer sizes must be increased.
|
|
They can be set globally with the
|
|
<I>/proc/sys/net/ipv4/tcp_wmem</I>
|
|
|
|
and
|
|
<I>/proc/sys/net/ipv4/tcp_rmem</I>
|
|
|
|
files, or on individual sockets by using the
|
|
<B>SO_SNDBUF</B>
|
|
|
|
and
|
|
<B>SO_RCVBUF</B>
|
|
|
|
socket options with the
|
|
<B><A HREF="/cgi-bin/man/man2html?2+setsockopt">setsockopt</A></B>(2)
|
|
|
|
call.
|
|
<P>
|
|
|
|
The maximum sizes for socket buffers declared via the
|
|
<B>SO_SNDBUF</B>
|
|
|
|
and
|
|
<B>SO_RCVBUF</B>
|
|
|
|
mechanisms are limited by the values in the
|
|
<I>/proc/sys/net/core/rmem_max</I>
|
|
|
|
and
|
|
<I>/proc/sys/net/core/wmem_max</I>
|
|
|
|
files.
|
|
Note that TCP actually allocates twice the size of
|
|
the buffer requested in the
|
|
<B><A HREF="/cgi-bin/man/man2html?2+setsockopt">setsockopt</A></B>(2)
|
|
|
|
call, and so a succeeding
|
|
<B><A HREF="/cgi-bin/man/man2html?2+getsockopt">getsockopt</A></B>(2)
|
|
|
|
call will not return the same size of buffer as requested in the
|
|
<B><A HREF="/cgi-bin/man/man2html?2+setsockopt">setsockopt</A></B>(2)
|
|
|
|
call.
|
|
TCP uses the extra space for administrative purposes and internal
|
|
kernel structures, and the
|
|
<I>/proc</I>
|
|
|
|
file values reflect the
|
|
larger sizes compared to the actual TCP windows.
|
|
On individual connections, the socket buffer size must be set prior to the
|
|
<B><A HREF="/cgi-bin/man/man2html?2+listen">listen</A></B>(2)
|
|
|
|
or
|
|
<B><A HREF="/cgi-bin/man/man2html?2+connect">connect</A></B>(2)
|
|
|
|
calls in order to have it take effect.
|
|
See
|
|
<B><A HREF="/cgi-bin/man/man2html?7+socket">socket</A></B>(7)
|
|
|
|
for more information.
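<P>
As a hedged illustration of the buffer-size behavior described above,
the following sketch requests a receive buffer size and reads back what
the kernel reports.
The function name <I>request_rcvbuf</I> is hypothetical; the call must be
made before listen(2) or connect(2) for the size to take effect.

```c
#include <sys/socket.h>

/* Request a receive buffer of `bytes` on a not-yet-connected socket and
 * return the size the kernel reports afterwards, or -1 on error.
 * The reported size is normally larger than the request, because the
 * kernel adds space for its own bookkeeping. */
int request_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

Comparing the return value with the requested size shows the extra space
mentioned above; the request itself is still capped by
<I>/proc/sys/net/core/rmem_max</I>.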
|
|
<P>
|
|
|
|
TCP supports urgent data.
|
|
Urgent data is used to signal the
|
|
receiver that some important message is part of the data
|
|
stream and that it should be processed as soon as possible.
|
|
To send urgent data specify the
|
|
<B>MSG_OOB</B>
|
|
|
|
option to
|
|
<B><A HREF="/cgi-bin/man/man2html?2+send">send</A></B>(2).
|
|
|
|
When urgent data is received, the kernel sends a
|
|
<B>SIGURG</B>
|
|
|
|
signal to the process or process group that has been set as the
|
|
socket "owner" using the
|
|
<B>SIOCSPGRP</B>
|
|
|
|
or
|
|
<B>FIOSETOWN</B>
|
|
|
|
ioctls (or the POSIX.1-specified
|
|
<B><A HREF="/cgi-bin/man/man2html?2+fcntl">fcntl</A></B>(2)
|
|
|
|
<B>F_SETOWN</B>
|
|
|
|
operation).
|
|
When the
|
|
<B>SO_OOBINLINE</B>
|
|
|
|
socket option is enabled, urgent data is put into the normal
|
|
data stream (a program can test for its location using the
|
|
<B>SIOCATMARK</B>
|
|
|
|
ioctl described below),
|
|
otherwise it can be received only when the
|
|
<B>MSG_OOB</B>
|
|
|
|
flag is set for
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recv">recv</A></B>(2)
|
|
|
|
or
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recvmsg">recvmsg</A></B>(2).
|
|
|
|
<P>
|
|
|
|
When out-of-band data is present,
|
|
<B><A HREF="/cgi-bin/man/man2html?2+select">select</A></B>(2)
|
|
|
|
indicates the file descriptor as having an exceptional condition and
|
|
<B><A HREF="/cgi-bin/man/man2html?2+poll">poll</A></B>(2)
|
|
|
|
indicates a
|
|
<B>POLLPRI</B>
|
|
|
|
event.
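<P>
The urgent-data handling described above might be used as in the
following sketch.
It assumes that <B>SO_OOBINLINE</B> is left disabled, so the urgent byte
is read with <B>MSG_OOB</B>; the function names are hypothetical.

```c
#include <poll.h>
#include <sys/socket.h>

/* Send one byte of urgent data on a connected TCP socket.
 * Returns 0 on success, -1 on error. */
int send_urgent_byte(int fd, char byte)
{
    return send(fd, &byte, 1, MSG_OOB) == 1 ? 0 : -1;
}

/* Wait up to timeout_ms for a POLLPRI event and read the urgent byte
 * with MSG_OOB. Assumes SO_OOBINLINE is disabled.
 * Returns 1 and stores the byte if urgent data was read, 0 if no
 * POLLPRI event was reported, -1 on error. */
int recv_urgent_byte(int fd, int timeout_ms, char *byte)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0 || !(pfd.revents & POLLPRI))
        return 0;
    return recv(fd, byte, 1, MSG_OOB) == 1 ? 1 : -1;
}
```

A program that prefers signals over polling would instead set the socket
owner and handle <B>SIGURG</B>, as described above.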
|
|
<P>
|
|
|
|
Linux 2.4 introduced a number of changes for improved
|
|
throughput and scaling, as well as enhanced functionality.
|
|
Some of these features include support for zero-copy
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendfile">sendfile</A></B>(2),
|
|
|
|
Explicit Congestion Notification, new
|
|
management of TIME_WAIT sockets, keep-alive socket options
|
|
and support for Duplicate SACK extensions.
|
|
<A NAME="lbAE"> </A>
|
|
<H3>Address formats</H3>
|
|
|
|
TCP is built on top of IP (see
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7)).
|
|
|
|
The address formats defined by
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7)
|
|
|
|
apply to TCP.
|
|
TCP supports point-to-point communication only;
|
|
broadcasting and multicasting are not
|
|
supported.
|
|
<A NAME="lbAF"> </A>
|
|
<H3>/proc interfaces</H3>
|
|
|
|
System-wide TCP parameter settings can be accessed by files in the directory
|
|
<I>/proc/sys/net/ipv4/</I>.
|
|
|
|
In addition, most IP
|
|
<I>/proc</I>
|
|
|
|
interfaces also apply to TCP; see
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7).
|
|
|
|
Variables described as
|
|
<I>Boolean</I>
|
|
|
|
take an integer value, with a nonzero value ("true") meaning that
|
|
the corresponding option is enabled, and a zero value ("false")
|
|
meaning that the option is disabled.
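<P>
A minimal sketch of reading one of the Boolean variables follows.
The helper names are hypothetical, and the example assumes that the file
holds a single integer, which is interpreted as described above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Interpret the text of a Boolean /proc file: nonzero means enabled.
 * Returns 1 (enabled), 0 (disabled), or -1 if no integer was found. */
int parse_proc_boolean(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);

    if (end == text)
        return -1;
    return value != 0;
}

/* Read a Boolean variable such as "tcp_sack" from /proc/sys/net/ipv4/.
 * Returns 1, 0, or -1 if the file is missing or unreadable. */
int read_tcp_boolean(const char *name)
{
    char path[256], line[64];
    FILE *fp;
    int result = -1;

    snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) != NULL)
        result = parse_proc_boolean(line);
    fclose(fp);
    return result;
}
```

A return value of -1 from <I>read_tcp_boolean</I> usually means that the
running kernel does not provide the named variable.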
|
|
<DL COMPACT>
|
|
<DT id="1"><I>tcp_abc</I> (Integer; default: 0; Linux 2.6.15 to Linux 3.8)
|
|
|
|
<DD>
|
|
|
|
|
|
|
|
Control the Appropriate Byte Count (ABC), defined in RFC 3465.
|
|
ABC is a way of increasing the congestion window
|
|
(<I>cwnd</I>)
|
|
|
|
more slowly in response to partial acknowledgments.
|
|
Possible values are:
|
|
<DL COMPACT><DT id="2"><DD>
|
|
<DL COMPACT>
|
|
<DT id="3">0<DD>
|
|
increase
|
|
<I>cwnd</I>
|
|
|
|
once per acknowledgment (no ABC)
|
|
<DT id="4">1<DD>
|
|
increase
|
|
<I>cwnd</I>
|
|
|
|
once per acknowledgment of full sized segment
|
|
<DT id="5">2<DD>
|
|
allow
<I>cwnd</I>
to increase by two if the acknowledgment covers
two segments, to compensate for delayed acknowledgments.
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="6"><I>tcp_abort_on_overflow</I> (Boolean; default: disabled; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Enable resetting connections if the listening service is too
slow to keep up and accept them.
When this option is disabled (the default), a connection attempt that
hit an overflow caused by a burst can still recover.
|
|
Enable this option
|
|
<I>only</I>
|
|
|
|
if you are really sure that the listening daemon
|
|
cannot be tuned to accept connections faster.
|
|
Enabling this option can harm the clients of your server.
|
|
<DT id="7"><I>tcp_adv_win_scale</I> (integer; default: 2; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Count buffering overhead as
|
|
<I>bytes/2^tcp_adv_win_scale</I>,
|
|
|
|
if
|
|
<I>tcp_adv_win_scale</I>
|
|
|
|
is greater than 0; or
|
|
<I>bytes-bytes/2^(-tcp_adv_win_scale)</I>,
|
|
|
|
if
|
|
<I>tcp_adv_win_scale</I>
|
|
|
|
is less than or equal to zero.
|
|
<DT id="8"><DD>
|
|
The socket receive buffer space is shared between the
|
|
application and kernel.
|
|
TCP maintains part of the buffer as
the TCP window; this is the size of the receive window
advertised to the other end.
|
|
The rest of the space is used
|
|
as the "application" buffer, used to isolate the network
|
|
from scheduling and application latencies.
|
|
The
|
|
<I>tcp_adv_win_scale</I>
|
|
|
|
default value of 2 implies that the space
|
|
used for the application buffer is one fourth that of the total.
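<P>
The two formulas above can be worked through with a small sketch.
The function names are hypothetical and the code only restates the
arithmetic given in the text; it is not taken from the kernel.

```c
/* Buffering overhead for a receive buffer of `bytes`, following the two
 * formulas in the text. Assumes bytes >= 0 and -31 < scale < 31. */
int adv_win_overhead(int bytes, int scale)
{
    if (scale > 0)
        return bytes >> scale;              /* bytes / 2^scale */
    return bytes - (bytes >> -scale);       /* bytes - bytes / 2^(-scale) */
}

/* Space left over for the advertised TCP window. */
int adv_win_space(int bytes, int scale)
{
    return bytes - adv_win_overhead(bytes, scale);
}
```

With the default scale of 2 and a buffer of 87380 bytes, the overhead is
21845 bytes (one fourth) and 65535 bytes remain for the window.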
|
|
<DT id="9"><I>tcp_allowed_congestion_control</I> (String; default: see text; since Linux 2.4.20)
|
|
|
|
<DD>
|
|
|
|
Show/set the congestion control algorithm choices available to unprivileged
|
|
processes (see the description of the
|
|
<B>TCP_CONGESTION</B>
|
|
|
|
socket option).
|
|
The items in the list are separated by white space and
|
|
terminated by a newline character.
|
|
The list is a subset of those listed in
|
|
<I>tcp_available_congestion_control</I>.
|
|
|
|
The default value for this list is "reno" plus the default setting of
|
|
<I>tcp_congestion_control</I>.
|
|
|
|
<DT id="10"><I>tcp_autocorking</I> (Boolean; default: enabled; since Linux 3.14)
|
|
|
|
<DD>
|
|
|
|
|
|
If this option is enabled, the kernel tries to coalesce small writes
|
|
(from consecutive
|
|
<B><A HREF="/cgi-bin/man/man2html?2+write">write</A></B>(2)
|
|
|
|
and
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendmsg">sendmsg</A></B>(2)
|
|
|
|
calls) as much as possible,
|
|
in order to decrease the total number of sent packets.
|
|
Coalescing is done if at least one prior packet for the flow
|
|
is waiting in Qdisc queues or device transmit queue.
|
|
Applications can still use the
|
|
<B>TCP_CORK</B>
|
|
|
|
socket option to obtain optimal behavior
|
|
when they know how/when to uncork their sockets.
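<P>
For applications that prefer explicit control, a sketch of corking by
hand follows.
The helper names are hypothetical, and the two-part message (a header
followed by a body) is only an assumed example of consecutive small writes.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Set (on != 0) or clear (on == 0) TCP_CORK. Returns 0 or -1. */
int set_tcp_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Send a header and a body on a connected socket while corked, so that
 * the kernel may coalesce them, then uncork to flush what is queued.
 * Returns 0 if both parts were fully queued, -1 otherwise. */
int send_corked(int fd, const void *head, size_t head_len,
                const void *body, size_t body_len)
{
    int ok;

    if (set_tcp_cork(fd, 1) < 0)
        return -1;
    ok = send(fd, head, head_len, 0) == (ssize_t)head_len &&
         send(fd, body, body_len, 0) == (ssize_t)body_len;
    if (set_tcp_cork(fd, 0) < 0)            /* uncork even after a failure */
        return -1;
    return ok ? 0 : -1;
}
```

Uncorking as soon as the last part has been written is the point at
which the application tells the kernel that nothing more is coming.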
|
|
<DT id="11"><I>tcp_available_congestion_control</I> (String; read-only; since Linux 2.4.20)
|
|
|
|
<DD>
|
|
|
|
Show a list of the congestion-control algorithms
|
|
that are registered.
|
|
The items in the list are separated by white space and
|
|
terminated by a newline character.
|
|
This list is a limiting set for the list in
|
|
<I>tcp_allowed_congestion_control</I>.
|
|
|
|
More congestion-control algorithms may be available as modules,
|
|
but not loaded.
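<P>
Because both lists use the same format, a program can check whether an
algorithm is present with a few lines of C.
The function name <I>cc_list_contains</I> is hypothetical, and the caller
is assumed to have already read the file contents into a string.

```c
#include <string.h>

/* Return 1 if `name` appears as a whole item in a white-space-separated,
 * newline-terminated list such as "reno cubic\n", otherwise 0. */
int cc_list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    if (name_len == 0)
        return 0;
    while (*p != '\0') {
        size_t item_len;

        p += strspn(p, " \t\n");            /* skip separators */
        item_len = strcspn(p, " \t\n");     /* length of the next item */
        if (item_len == name_len && strncmp(p, name, name_len) == 0)
            return 1;
        p += item_len;
    }
    return 0;
}
```

Matching whole items avoids false positives when one algorithm name is a
substring of another.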
|
|
<DT id="12"><I>tcp_app_win</I> (integer; default: 31; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
This variable defines how many
|
|
bytes of the TCP window are reserved for buffering overhead.
|
|
<DT id="13"><DD>
|
|
A maximum of (<I>window/2^tcp_app_win</I>, mss) bytes in the window
|
|
are reserved for the application buffer.
|
|
A value of 0 implies that no amount is reserved.
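<P>
Reading the expression above as a maximum of the two terms, the
reservation can be sketched as follows.
This is an interpretation of the text rather than kernel code; the
function name is hypothetical and a 32-bit <I>unsigned int</I> is assumed.

```c
/* Bytes of the window reserved for the application buffer:
 * max(window / 2^tcp_app_win, mss), or 0 when tcp_app_win is 0. */
unsigned int app_win_reserve(unsigned int window, unsigned int app_win,
                             unsigned int mss)
{
    unsigned int share;

    if (app_win == 0)
        return 0;
    /* Shifting a 32-bit value by 32 or more is undefined in C. */
    share = app_win < 32 ? window >> app_win : 0;
    return share > mss ? share : mss;
}
```

With the default of 31 the shifted term is zero for any realistic
window, so the reservation comes out as one MSS.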
|
|
|
|
|
|
<DT id="14"><I>tcp_base_mss</I> (Integer; default: 512; since Linux 2.6.17)
|
|
|
|
<DD>
|
|
The initial value of
|
|
<I>search_low</I>
|
|
|
|
to be used by the packetization layer Path MTU discovery (MTU probing).
|
|
If MTU probing is enabled,
|
|
this is the initial MSS used by the connection.
|
|
|
|
|
|
<DT id="15"><I>tcp_bic</I> (Boolean; default: disabled; Linux 2.4.27/2.6.6 to 2.6.13)
|
|
|
|
<DD>
|
|
Enable BIC TCP congestion control algorithm.
|
|
BIC-TCP is a sender-side-only change that ensures a linear RTT
|
|
fairness under large windows while offering both scalability and
|
|
bounded TCP-friendliness.
|
|
The protocol combines two schemes
|
|
called additive increase and binary search increase.
|
|
When the congestion window is large, additive increase with a large
|
|
increment ensures linear RTT fairness as well as good scalability.
|
|
Under small congestion windows, binary search
|
|
increase provides TCP friendliness.
|
|
|
|
|
|
<DT id="16"><I>tcp_bic_low_window</I> (integer; default: 14; Linux 2.4.27/2.6.6 to 2.6.13)
|
|
|
|
<DD>
|
|
Set the threshold window (in packets) where BIC TCP starts to
|
|
adjust the congestion window.
|
|
Below this threshold BIC TCP behaves the same as the default TCP Reno.
|
|
|
|
|
|
<DT id="17"><I>tcp_bic_fast_convergence</I> (Boolean; default: enabled; Linux 2.4.27/2.6.6 to 2.6.13)
|
|
|
|
<DD>
|
|
Force BIC TCP to more quickly respond to changes in congestion window.
|
|
Allows two flows sharing the same connection to converge more rapidly.
|
|
<DT id="18"><I>tcp_congestion_control</I> (String; default: see text; since Linux 2.4.13)
|
|
|
|
<DD>
|
|
|
|
Set the default congestion-control algorithm to be used for new connections.
|
|
The algorithm "reno" is always available,
|
|
but additional choices may be available depending on kernel configuration.
|
|
The default value for this file is set as part of kernel configuration.
|
|
<DT id="19"><I>tcp_dma_copybreak</I> (integer; default: 4096; since Linux 2.6.24)
|
|
|
|
<DD>
|
|
Lower limit, in bytes, of the size of socket reads that will be
|
|
offloaded to a DMA copy engine, if one is present in the system
|
|
and the kernel was configured with the
|
|
<B>CONFIG_NET_DMA</B>
|
|
|
|
option.
|
|
<DT id="20"><I>tcp_dsack</I> (Boolean; default: enabled; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Enable RFC 2883 TCP Duplicate SACK support.
|
|
<DT id="21"><I>tcp_ecn</I> (Integer; default: see below; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Enable RFC 3168 Explicit Congestion Notification.
|
|
<DT id="22"><DD>
|
|
This file can have one of the following values:
|
|
<DL COMPACT><DT id="23"><DD>
|
|
<DL COMPACT>
|
|
<DT id="24">0<DD>
|
|
Disable ECN.
|
|
Neither initiate nor accept ECN.
|
|
This was the default up to and including Linux 2.6.30.
|
|
<DT id="25">1<DD>
|
|
Enable ECN when requested by incoming connections and also
|
|
request ECN on outgoing connection attempts.
|
|
<DT id="26">2<DD>
|
|
|
|
Enable ECN when requested by incoming connections,
|
|
but do not request ECN on outgoing connections.
|
|
This value is supported, and is the default, since Linux 2.6.31.
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="27"><DD>
|
|
When enabled, connectivity to some destinations could be affected
|
|
due to older, misbehaving middle boxes along the path, causing
|
|
connections to be dropped.
|
|
However, to facilitate and encourage deployment with option 1, and
|
|
to work around such buggy equipment, the
|
|
<B>tcp_ecn_fallback</B>
|
|
|
|
option has been introduced.
|
|
<DT id="28"><I>tcp_ecn_fallback</I> (Boolean; default: enabled; since Linux 4.1)
|
|
|
|
<DD>
|
|
|
|
Enable RFC 3168, Section 6.1.1.1. fallback.
|
|
When enabled, outgoing ECN-setup SYNs that time out within the
|
|
normal SYN retransmission timeout will be resent with CWR and
|
|
ECE cleared.
|
|
<DT id="29"><I>tcp_fack</I> (Boolean; default: enabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Enable TCP Forward Acknowledgement support.
|
|
<DT id="30"><I>tcp_fin_timeout</I> (integer; default: 60; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
This specifies how many seconds to wait for a final FIN packet before the
|
|
socket is forcibly closed.
|
|
This is strictly a violation of the TCP specification,
|
|
but required to prevent denial-of-service attacks.
|
|
In Linux 2.2, the default value was 180.
|
|
|
|
|
|
<DT id="31"><I>tcp_frto</I> (integer; default: see below; since Linux 2.4.21/2.6)
|
|
|
|
<DD>
|
|
|
|
Enable F-RTO, an enhanced recovery algorithm for TCP retransmission
|
|
timeouts (RTOs).
|
|
It is particularly beneficial in wireless environments
|
|
where packet loss is typically due to random radio interference
|
|
rather than intermediate router congestion.
|
|
See RFC 4138 for more details.
|
|
<DT id="32"><DD>
|
|
This file can have one of the following values:
|
|
<DL COMPACT><DT id="33"><DD>
|
|
<DL COMPACT>
|
|
<DT id="34">0<DD>
|
|
Disabled.
|
|
This was the default up to and including Linux 2.6.23.
|
|
<DT id="35">1<DD>
|
|
The basic version of the F-RTO algorithm is enabled.
|
|
<DT id="36">2<DD>
|
|
|
|
Enable SACK-enhanced F-RTO if the flow uses SACK.
The basic version can also be used when
SACK is in use, though in that case there are scenarios where F-RTO
interacts badly with the packet counting of the SACK-enabled TCP flow.
|
|
This value is the default since Linux 2.6.24.
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="37"><DD>
|
|
Before Linux 2.6.22, this parameter was a Boolean value,
|
|
supporting just values 0 and 1 above.
|
|
<DT id="38"><I>tcp_frto_response</I> (integer; default: 0; since Linux 2.6.22)
|
|
|
|
<DD>
|
|
When F-RTO has detected that a TCP retransmission timeout was spurious
|
|
(i.e., the timeout would have been avoided had TCP set a
|
|
longer retransmission timeout),
|
|
TCP has several options concerning what to do next.
|
|
Possible values are:
|
|
<DL COMPACT><DT id="39"><DD>
|
|
<DL COMPACT>
|
|
<DT id="40">0<DD>
|
|
Rate halving based; a smooth and conservative response,
|
|
results in halved congestion window
|
|
(<I>cwnd</I>)
|
|
|
|
and slow-start threshold
|
|
(<I>ssthresh</I>)
|
|
|
|
after one RTT.
|
|
<DT id="41">1<DD>
|
|
Very conservative response; not recommended because,
although valid, it interacts poorly with the rest of Linux TCP; halves
|
|
<I>cwnd</I>
|
|
|
|
and
|
|
<I>ssthresh</I>
|
|
|
|
immediately.
|
|
<DT id="42">2<DD>
|
|
Aggressive response; undoes congestion-control measures
|
|
that are now known to be unnecessary
|
|
(ignoring the possibility of a lost retransmission that would require
|
|
TCP to be more cautious);
|
|
<I>cwnd</I>
|
|
|
|
and
|
|
<I>ssthresh</I>
|
|
|
|
are restored to the values prior to timeout.
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="43"><I>tcp_keepalive_intvl</I> (integer; default: 75; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The number of seconds between TCP keep-alive probes.
|
|
<DT id="44"><I>tcp_keepalive_probes</I> (integer; default: 9; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The maximum number of TCP keep-alive probes to send
|
|
before giving up and killing the connection if
|
|
no response is obtained from the other end.
|
|
<DT id="45"><I>tcp_keepalive_time</I> (integer; default: 7200; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The number of seconds a connection needs to be idle
|
|
before TCP begins sending out keep-alive probes.
|
|
Keep-alives are sent only when the
|
|
<B>SO_KEEPALIVE</B>
|
|
|
|
socket option is enabled.
|
|
The default value is 7200 seconds (2 hours).
|
|
An idle connection is terminated after
approximately an additional 11 minutes (9 probes at intervals
of 75 seconds) when keep-alive is enabled.
|
|
<DT id="46"><DD>
|
|
Note that underlying connection tracking mechanisms and
|
|
application timeouts may be much shorter.
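<P>
The timing that results from the three keep-alive variables can be
checked with a short sketch.
The function names are hypothetical; the arithmetic simply adds the idle
time to one interval per unanswered probe.

```c
#include <sys/socket.h>

/* Enable keep-alive probes on a socket. Returns 0 or -1. */
int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Seconds from the last activity until an unresponsive peer is given up
 * on: the idle time plus one interval per unanswered probe. */
long keepalive_give_up_seconds(long idle_time, long interval, long probes)
{
    return idle_time + interval * probes;
}
```

With the defaults (7200, 75 and 9) the result is 7875 seconds, that is,
675 seconds (about 11 minutes) after the first probe is due.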
|
|
|
|
|
|
<DT id="47"><I>tcp_low_latency</I> (Boolean; default: disabled; since Linux 2.4.21/2.6; obsolete since Linux 4.14)
|
|
|
|
<DD>
|
|
|
|
If enabled, the TCP stack makes decisions that prefer lower
|
|
latency as opposed to higher throughput.
|
|
If this option is disabled, then higher throughput is preferred.
|
|
An example of an application where this default should be
|
|
changed would be a Beowulf compute cluster.
|
|
Since Linux 4.14,
|
|
|
|
this file still exists, but its value is ignored.
|
|
<DT id="48"><I>tcp_max_orphans</I> (integer; default: see below; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The maximum number of orphaned (not attached to any user file
|
|
handle) TCP sockets allowed in the system.
|
|
When this number is exceeded,
|
|
the orphaned connection is reset and a warning is printed.
|
|
This limit exists only to prevent simple denial-of-service attacks.
|
|
Lowering this limit is not recommended.
|
|
Network conditions might require you to increase the number of
|
|
orphans allowed, but note that each orphan can eat up to ~64 kB
|
|
of unswappable memory.
|
|
The default initial value is set equal to the kernel parameter NR_FILE.
|
|
This initial default is adjusted depending on the memory in the system.
|
|
<DT id="49"><I>tcp_max_syn_backlog</I> (integer; default: see below; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The maximum number of queued connection requests which have
|
|
still not received an acknowledgement from the connecting client.
|
|
If this number is exceeded, the kernel will begin
|
|
dropping requests.
|
|
The default value of 256 is increased to
|
|
1024 when the memory present in the system is adequate or
|
|
greater (>= 128 MB), and reduced to 128 for those systems with
|
|
very low memory (<= 32 MB).
|
|
<DT id="50"><DD>
|
|
Prior to Linux 2.6.20,
|
|
|
|
it was recommended that if this needed to be increased above 1024,
|
|
the size of the SYNACK hash table
|
|
(<B>TCP_SYNQ_HSIZE</B>)
|
|
|
|
in
|
|
<I>include/net/tcp.h</I>
|
|
|
|
should be modified to keep
|
|
<DT id="51"><DD>
|
|
<BR> TCP_SYNQ_HSIZE * 16 <= tcp_max_syn_backlog
|
|
<DT id="52"><DD>
|
|
and the kernel should be
|
|
recompiled.
|
|
In Linux 2.6.20, the fixed-size
|
|
<B>TCP_SYNQ_HSIZE</B>
|
|
|
|
was removed in favor of dynamic sizing.
|
|
<DT id="53"><I>tcp_max_tw_buckets</I> (integer; default: see below; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The maximum number of sockets in TIME_WAIT state allowed in
|
|
the system.
|
|
This limit exists only to prevent simple denial-of-service attacks.
|
|
The default value of NR_FILE*2 is adjusted
|
|
depending on the memory in the system.
|
|
If this number is
|
|
exceeded, the socket is closed and a warning is printed.
|
|
<DT id="54"><I>tcp_moderate_rcvbuf</I> (Boolean; default: enabled; since Linux 2.4.17/2.6.7)
|
|
|
|
<DD>
|
|
|
|
If enabled, TCP performs receive buffer auto-tuning,
|
|
attempting to automatically size the buffer (no greater than
|
|
<I>tcp_rmem[2]</I>)
|
|
|
|
to match the size required by the path for full throughput.
|
|
<DT id="55"><I>tcp_mem</I> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
This is a vector of 3 integers: [low, pressure, high].
|
|
These bounds, measured in units of the system page size,
|
|
are used by TCP to track its memory usage.
|
|
The defaults are calculated at boot time from the amount of
|
|
available memory.
|
|
(TCP can only use
|
|
<I>low memory</I>
|
|
|
|
for this, which is limited to around 900 megabytes on 32-bit systems.
|
|
64-bit systems do not suffer this limitation.)
|
|
<DL COMPACT><DT id="56"><DD>
|
|
<DL COMPACT>
|
|
<DT id="57"><I>low</I>
|
|
|
|
<DD>
|
|
TCP doesn't regulate its memory allocation when the number
|
|
of pages it has allocated globally is below this number.
|
|
<DT id="58"><I>pressure</I>
|
|
|
|
<DD>
|
|
When the amount of memory allocated by TCP
|
|
exceeds this number of pages, TCP moderates its memory consumption.
|
|
This memory pressure state is exited
|
|
once the number of pages allocated falls below
|
|
the
|
|
<I>low</I>
|
|
|
|
mark.
|
|
<DT id="59"><I>high</I>
|
|
|
|
<DD>
|
|
The maximum number of pages, globally, that TCP will allocate.
|
|
This value overrides any other limits imposed by the kernel.
|
|
</DL>
|
|
</DL>
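<P>
Since the three bounds are expressed in pages, a program that wants
byte counts has to multiply by the page size.
The following sketch shows one way to do this; the structure and
function names are hypothetical, and the caller is assumed to have read
the contents of <I>tcp_mem</I> into a string.

```c
#include <stdio.h>
#include <unistd.h>

/* The three tcp_mem bounds, in units of the system page size. */
struct tcp_mem_pages {
    long low;
    long pressure;
    long high;
};

/* Parse the three integers of tcp_mem ("low pressure high").
 * Returns 0 on success, -1 if the text does not hold three integers. */
int parse_tcp_mem(const char *text, struct tcp_mem_pages *out)
{
    if (sscanf(text, "%ld %ld %ld",
               &out->low, &out->pressure, &out->high) != 3)
        return -1;
    return 0;
}

/* Convert a page count to bytes using the system page size. */
long long pages_to_bytes(long pages)
{
    return (long long)pages * sysconf(_SC_PAGESIZE);
}
```

Using <I>sysconf(_SC_PAGESIZE)</I> rather than a fixed constant keeps
the conversion correct on systems with a different page size.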
|
|
|
|
<DT id="60"><I>tcp_mtu_probing</I> (integer; default: 0; since Linux 2.6.17)
|
|
|
|
<DD>
|
|
|
|
This parameter controls TCP Packetization-Layer Path MTU Discovery.
|
|
The following values may be assigned to the file:
|
|
<DL COMPACT><DT id="61"><DD>
|
|
<DL COMPACT>
|
|
<DT id="62">0<DD>
|
|
Disabled
|
|
<DT id="63">1<DD>
|
|
Disabled by default, enabled when an ICMP black hole detected
|
|
<DT id="64">2<DD>
|
|
Always enabled, use initial MSS of
|
|
<I>tcp_base_mss</I>.
|
|
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="65"><I>tcp_no_metrics_save</I> (Boolean; default: disabled; since Linux 2.6.6)
|
|
|
|
<DD>
|
|
|
|
By default, TCP saves various connection metrics in the route cache
|
|
when the connection closes, so that connections established in the
|
|
near future can use these to set initial conditions.
|
|
Usually, this increases overall performance,
|
|
but it may sometimes cause performance degradation.
|
|
If
|
|
<I>tcp_no_metrics_save</I>
|
|
|
|
is enabled, TCP will not cache metrics on closing connections.
|
|
<DT id="66"><I>tcp_orphan_retries</I> (integer; default: 8; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The maximum number of attempts made to probe the other
|
|
end of a connection which has been closed by our end.
|
|
<DT id="67"><I>tcp_reordering</I> (integer; default: 3; since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The maximum distance by which a packet can be reordered in a TCP packet stream
without TCP assuming packet loss and going into slow start.
|
|
It is not advisable to change this number.
|
|
This is a packet reordering detection metric designed to
|
|
minimize unnecessary back off and retransmits provoked by
|
|
reordering of packets on a connection.
|
|
<DT id="68"><I>tcp_retrans_collapse</I> (Boolean; default: enabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Try to send full-sized packets during retransmit.
|
|
<DT id="69"><I>tcp_retries1</I> (integer; default: 3; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The number of times TCP will attempt to retransmit a
|
|
packet on an established connection normally,
|
|
without the extra effort of getting the network layers involved.
|
|
Once we exceed this number of
|
|
retransmits, we first have the network layer
|
|
update the route if possible before each new retransmit.
|
|
The default is the RFC-specified minimum of 3.
|
|
<DT id="70"><I>tcp_retries2</I> (integer; default: 15; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The maximum number of times a TCP packet is retransmitted
|
|
in established state before giving up.
|
|
The default value is 15, which corresponds to a duration of
|
|
approximately 13 to 30 minutes, depending
|
|
on the retransmission timeout.
|
|
The RFC 1122 specified
|
|
minimum limit of 100 seconds is typically deemed too short.
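<P>
How a retry count translates into a duration can be estimated with the
sketch below, which doubles the timeout after every retransmission up to
a ceiling.
The function name is hypothetical, and the starting timeout and ceiling
are inputs chosen by the caller; the figures of 200 ms and 120 s used in
the note below are assumptions for illustration and are not stated in
the text above.

```c
/* Total time, in milliseconds, until a connection is given up on after
 * `retries` unanswered retransmissions, with the timeout doubling each
 * time. rto_start_ms and rto_max_ms are assumptions supplied by the
 * caller, not values read from the system. */
long long retransmit_give_up_ms(int retries, long long rto_start_ms,
                                long long rto_max_ms)
{
    long long total = 0;
    long long rto = rto_start_ms;
    int i;

    /* One wait before each retransmission, plus a final wait after
     * the last one. */
    for (i = 0; i <= retries; i++) {
        total += rto;
        rto = rto * 2 > rto_max_ms ? rto_max_ms : rto * 2;
    }
    return total;
}
```

With 15 retries, a starting timeout of 200 ms and a ceiling of 120 s,
the sketch gives 924600 ms, or roughly 15 minutes, which falls inside
the range quoted above; a larger starting timeout moves the result
toward the upper end.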
|
|
<DT id="71"><I>tcp_rfc1337</I> (Boolean; default: disabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Enable TCP behavior conformant with RFC 1337.
|
|
When disabled,
|
|
if a RST is received in TIME_WAIT state, we close
|
|
the socket immediately without waiting for the end
|
|
of the TIME_WAIT period.
|
|
<DT id="72"><I>tcp_rmem</I> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
This is a vector of 3 integers: [min, default, max].
|
|
These parameters are used by TCP to regulate receive buffer sizes.
|
|
TCP dynamically adjusts the size of the
|
|
receive buffer from the defaults listed below, in the range
|
|
of these values, depending on memory available in the system.
|
|
<DL COMPACT><DT id="73"><DD>
|
|
<DL COMPACT>
|
|
<DT id="74"><I>min</I>
|
|
|
|
<DD>
|
|
minimum size of the receive buffer used by each TCP socket.
|
|
The default value is the system page size.
|
|
(On Linux 2.4, the default value is 4 kB, lowered to
|
|
<B>PAGE_SIZE</B>
|
|
|
|
bytes in low-memory systems.)
|
|
This value
|
|
is used to ensure that in memory pressure mode,
|
|
allocations below this size will still succeed.
|
|
This is not
|
|
used to bound the size of the receive buffer declared
|
|
using
|
|
<B>SO_RCVBUF</B>
|
|
|
|
on a socket.
|
|
<DT id="75"><I>default</I>
|
|
|
|
<DD>
|
|
the default size of the receive buffer for a TCP socket.
|
|
This value overrides the initial default buffer size from
|
|
the generic global
|
|
<I>net.core.rmem_default</I>
|
|
|
|
defined for all protocols.
|
|
The default value is 87380 bytes.
|
|
(On Linux 2.4, this will be lowered to 43689 in low-memory systems.)
|
|
If larger receive buffer sizes are desired, this value should
|
|
be increased (to affect all sockets).
|
|
To employ large TCP windows,
<I>net.ipv4.tcp_window_scaling</I>
must be enabled (the default).
|
|
<DT id="76"><I>max</I>
|
|
|
|
<DD>
|
|
the maximum size of the receive buffer used by each TCP socket.
|
|
This value does not override the global
|
|
<I>net.core.rmem_max</I>.
|
|
|
|
This is not used to limit the size of the receive buffer declared using
|
|
<B>SO_RCVBUF</B>
|
|
|
|
on a socket.
|
|
The default value is calculated using the formula
|
|
<DT id="77"><DD>
|
|
<BR> max(87380, min(4 MB, <I>tcp_mem</I>[1]*PAGE_SIZE/128))
|
|
<DT id="78"><DD>
|
|
(On Linux 2.4, the default is 87380*2 bytes,
|
|
lowered to 87380 in low-memory systems).
|
|
</DL>
|
|
</DL>
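<P>
The formula for the default of the <I>max</I> value can be evaluated
with the following sketch.
The function and parameter names are hypothetical, and "4 MB" is assumed
to mean 4 * 1024 * 1024 bytes; <I>tcp_mem</I>[1] is the
<I>pressure</I> value of <I>tcp_mem</I>.

```c
/* Default for the max value of tcp_rmem, following the formula
 * max(87380, min(4 MB, tcp_mem[1] * PAGE_SIZE / 128)).
 * Assumes "4 MB" means 4 * 1024 * 1024 bytes. */
long long tcp_rmem_max_default(long long tcp_mem_pressure_pages,
                               long long page_size)
{
    const long long floor_bytes = 87380;
    const long long cap_bytes = 4LL * 1024 * 1024;
    long long scaled = tcp_mem_pressure_pages * page_size / 128;

    if (scaled > cap_bytes)
        scaled = cap_bytes;
    return scaled > floor_bytes ? scaled : floor_bytes;
}
```

For example, with a pressure value of 100000 pages of 4096 bytes the
result is 3200000 bytes; small systems end up at the floor of 87380
and large ones at the 4 MB cap.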
|
|
|
|
<DT id="79"><I>tcp_sack</I> (Boolean; default: enabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Enable RFC 2018 TCP Selective Acknowledgements.
|
|
<DT id="80"><I>tcp_slow_start_after_idle</I> (Boolean; default: enabled; since Linux 2.6.18)
|
|
|
|
<DD>
|
|
|
|
If enabled, provide RFC 2861 behavior and time out the congestion
|
|
window after an idle period.
|
|
An idle period is defined as the current RTO (retransmission timeout).
|
|
If disabled, the congestion window will not
|
|
be timed out after an idle period.
|
|
<DT id="81"><I>tcp_stdurg</I> (Boolean; default: disabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
If this option is enabled, then use the RFC 1122 interpretation
|
|
of the TCP urgent-pointer field.
|
|
|
|
|
|
|
|
According to this interpretation, the urgent pointer points
|
|
to the last byte of urgent data.
|
|
If this option is disabled, then use the BSD-compatible interpretation of
|
|
the urgent pointer:
|
|
the urgent pointer points to the first byte after the urgent data.
|
|
Enabling this option may lead to interoperability problems.
|
|
<DT id="82"><I>tcp_syn_retries</I> (integer; default: 5; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The maximum number of times initial SYNs for an active TCP
|
|
connection attempt will be retransmitted.
|
|
This value should not be higher than 255.
|
|
The default value is 5, which corresponds to approximately 180 seconds.
|
|
<DT id="83"><I>tcp_synack_retries</I> (integer; default: 5; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
The maximum number of times a SYN/ACK segment
|
|
for a passive TCP connection will be retransmitted.
|
|
This number should not be higher than 255.
|
|
<DT id="84"><I>tcp_syncookies</I> (Boolean; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Enable TCP syncookies.
|
|
The kernel must be compiled with
|
|
<B>CONFIG_SYN_COOKIES</B>.
|
|
|
|
Send out syncookies when the SYN backlog queue of a socket overflows.
|
|
The syncookies feature attempts to protect a
|
|
socket from a SYN flood attack.
|
|
This should be used as a last resort, if at all.
|
|
This is a violation of the TCP protocol,
|
|
and conflicts with other areas of TCP such as TCP extensions.
|
|
It can cause problems for clients and relays.
|
|
It is not recommended as a tuning mechanism for heavily
|
|
loaded servers to help with overloaded or misconfigured conditions.
|
|
For recommended alternatives see
|
|
<I>tcp_max_syn_backlog</I>,
|
|
|
|
<I>tcp_synack_retries</I>,
|
|
|
|
and
|
|
<I>tcp_abort_on_overflow</I>.
|
|
|
|
<DT id="85"><I>tcp_timestamps</I> (integer; default: 1; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Set to one of the following values to enable or disable RFC 1323
|
|
TCP timestamps:
|
|
<DL COMPACT><DT id="86"><DD>
|
|
<DL COMPACT>
|
|
<DT id="87">0<DD>
|
|
Disable timestamps.
|
|
<DT id="88">1<DD>
|
|
Enable timestamps as defined in RFC 1323 and use random offset for
|
|
each connection rather than only using the current time.
|
|
<DT id="89">2<DD>
|
|
As for the value 1, but without random offsets.
|
|
|
|
Setting
|
|
<I>tcp_timestamps</I>
|
|
|
|
to this value is meaningful since Linux 4.10.
|
|
</DL>
|
|
</DL>
|
|
|
|
<DT id="90"><I>tcp_tso_win_divisor</I> (integer; default: 3; since Linux 2.6.9)
|
|
|
|
<DD>
|
|
This parameter controls what percentage of the congestion window
|
|
can be consumed by a single TCP Segmentation Offload (TSO) frame.
|
|
The setting of this parameter is a tradeoff between burstiness and
|
|
building larger TSO frames.
|
|
<DT id="91"><I>tcp_tw_recycle</I> (Boolean; default: disabled; Linux 2.4 to 4.11)
|
|
|
|
<DD>
|
|
|
|
|
|
Enable fast recycling of TIME_WAIT sockets.
|
|
Enabling this option is
|
|
not recommended as the remote IP may not use monotonically increasing
|
|
timestamps (devices behind NAT, devices with per-connection timestamp
|
|
offsets).
|
|
See RFC 1323 (PAWS) and RFC 6191.
|
|
|
|
|
|
<DT id="92"><I>tcp_tw_reuse</I> (Boolean; default: disabled; since Linux 2.4.19/2.6)
|
|
|
|
<DD>
|
|
|
|
Allow TIME_WAIT sockets to be reused for new connections when it is
safe from a protocol viewpoint.
|
|
It should not be changed without the advice or request of technical experts.
|
|
|
|
|
|
<DT id="93"><I>tcp_vegas_cong_avoid</I> (Boolean; default: disabled; Linux 2.2 to 2.6.13)
|
|
|
|
<DD>
|
|
|
|
Enable TCP Vegas congestion avoidance algorithm.
|
|
TCP Vegas is a sender-side-only change to TCP that anticipates
|
|
the onset of congestion by estimating the bandwidth.
|
|
TCP Vegas adjusts the sending rate by modifying the congestion window.
|
|
TCP Vegas should provide less packet loss, but it is
|
|
not as aggressive as TCP Reno.
|
|
|
|
|
|
<DT id="94"><I>tcp_westwood</I> (Boolean; default: disabled; Linux 2.4.26/2.6.3 to 2.6.13)
|
|
|
|
<DD>
|
|
Enable TCP Westwood+ congestion control algorithm.
|
|
TCP Westwood+ is a sender-side-only modification of the TCP Reno
|
|
protocol stack that optimizes the performance of TCP congestion control.
|
|
It is based on end-to-end bandwidth estimation to set
|
|
congestion window and slow start threshold after a congestion episode.
|
|
Using this estimation, TCP Westwood+ adaptively sets a
|
|
slow start threshold and a congestion window which takes into
|
|
account the bandwidth used at the time congestion is experienced.
|
|
TCP Westwood+ significantly increases fairness with respect to
|
|
TCP Reno in wired networks and throughput over wireless links.
|
|
<DT id="95"><I>tcp_window_scaling</I> (Boolean; default: enabled; since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
Enable RFC 1323 TCP window scaling.
|
|
This feature allows the use of a large window
|
|
(> 64 kB) on a TCP connection, should the other end support it.
|
|
Normally, the 16-bit window length field in the TCP header
|
|
limits the window size to less than 64 kB.
|
|
If larger windows are desired, applications can increase the size of
|
|
their socket buffers and the window scaling option will be employed.
|
|
If
|
|
<I>tcp_window_scaling</I>
|
|
|
|
is disabled, TCP will not negotiate the use of window
|
|
scaling with the other end during connection setup.
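<P>
The effect of the scaling option on the 16-bit field can be shown as plain arithmetic.
This is a sketch of the RFC 1323 rule (the window field is shifted left by the
negotiated count, which may not exceed 14), not a view into the kernel;
<I>scaled_window</I> and <I>MAX_WINDOW_SHIFT</I> are names chosen for this example.

```c
#include <stdint.h>

/* Largest shift count permitted by RFC 1323. */
#define MAX_WINDOW_SHIFT 14

/*
 * Effective window in bytes for a 16-bit window field and the
 * shift count agreed during connection setup.
 */
static uint32_t scaled_window(uint16_t window_field, unsigned int shift)
{
    if (shift > MAX_WINDOW_SHIFT)
        shift = MAX_WINDOW_SHIFT;
    return (uint32_t)window_field << shift;
}
```

With a shift of 0 the result stays below 64 kB; with the maximum shift of 14
a full window field corresponds to 1073725440 bytes, just under 1 GiB.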
|
|
<DT id="96"><I>tcp_wmem</I> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
This is a vector of 3 integers: [min, default, max].
|
|
These parameters are used by TCP to regulate send buffer sizes.
|
|
TCP dynamically adjusts the size of the send buffer from the
|
|
default values listed below, in the range of these values,
|
|
depending on memory available.
|
|
<DL COMPACT><DT id="97"><DD>
|
|
<DL COMPACT>
|
|
<DT id="98"><I>min</I>
|
|
|
|
<DD>
|
|
Minimum size of the send buffer used by each TCP socket.
|
|
The default value is the system page size.
|
|
(On Linux 2.4, the default value is 4 kB.)
|
|
This value is used to ensure that in memory pressure mode,
|
|
allocations below this size will still succeed.
|
|
This is not used to bound the size of the send buffer declared using
|
|
<B>SO_SNDBUF</B>
|
|
|
|
on a socket.
|
|
<DT id="99"><I>default</I>
|
|
|
|
<DD>
|
|
The default size of the send buffer for a TCP socket.
|
|
This value overwrites the initial default buffer size from
|
|
the generic global
|
|
<I>/proc/sys/net/core/wmem_default</I>
|
|
|
|
defined for all protocols.
|
|
The default value is 16 kB.
|
|
|
|
If larger send buffer sizes are desired, this value
|
|
should be increased (to affect all sockets).
|
|
To employ large TCP windows, the
|
|
<I>/proc/sys/net/ipv4/tcp_window_scaling</I>
|
|
|
|
must be set to a nonzero value (default).
|
|
<DT id="100"><I>max</I>
|
|
|
|
<DD>
|
|
The maximum size of the send buffer used by each TCP socket.
|
|
This value does not override the value in
|
|
<I>/proc/sys/net/core/wmem_max</I>.
|
|
|
|
This is not used to limit the size of the send buffer declared using
|
|
<B>SO_SNDBUF</B>
|
|
|
|
on a socket.
|
|
The default value is calculated using the formula
|
|
<DT id="101"><DD>
|
|
<BR> max(65536, min(4 MB, <I>tcp_mem</I>[1]*PAGE_SIZE/128))
|
|
<DT id="102"><DD>
|
|
(On Linux 2.4, the default value is 128 kB,
lowered to 64 kB on low-memory systems.)
|
|
</DL>
|
|
</DL>
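<P>
The three-integer layout and the formula for the default of <I>max</I>
can be sketched as follows.
The sketch assumes that <I>tcp_mem</I>[1] is expressed in pages and that the caller
supplies the page size in bytes; both helper names are hypothetical.

```c
#include <stdio.h>

/* Parse a "min default max" line as found in /proc/sys/net/ipv4/tcp_wmem. */
static int parse_tcp_wmem(const char *line, long v[3])
{
    return sscanf(line, "%ld %ld %ld", &v[0], &v[1], &v[2]) == 3 ? 0 : -1;
}

/*
 * Default for max, following the formula quoted above:
 *     max(65536, min(4 MB, tcp_mem[1] * PAGE_SIZE / 128))
 * tcp_mem_1 is tcp_mem[1]; page_size is in bytes.
 */
static unsigned long long default_wmem_max(unsigned long long tcp_mem_1,
                                           unsigned long long page_size)
{
    const unsigned long long four_mb = 4ULL * 1024 * 1024;
    unsigned long long v = tcp_mem_1 * page_size / 128;

    if (v > four_mb)
        v = four_mb;    /* upper clamp: min(4 MB, ...) */
    if (v < 65536)
        v = 65536;      /* lower clamp: max(65536, ...) */
    return v;
}
```

For example, with 4096-byte pages a <I>tcp_mem</I>[1] of 100000 gives 3200000 bytes,
while very small or very large values are clamped to 65536 and 4194304 respectively.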
|
|
|
|
<DT id="103"><I>tcp_workaround_signed_windows</I> (Boolean; default: disabled; since Linux 2.6.26)
|
|
|
|
<DD>
|
|
If enabled, assume that no receipt of a window-scaling option means that the
|
|
remote TCP is broken and treats the window as a signed quantity.
|
|
If disabled, assume that the remote TCP is not broken even if we do
|
|
not receive a window scaling option from it.
|
|
</DL>
|
|
<A NAME="lbAG"> </A>
|
|
<H3>Socket options</H3>
|
|
|
|
To set or get a TCP socket option, call
|
|
<B><A HREF="/cgi-bin/man/man2html?2+getsockopt">getsockopt</A></B>(2)
|
|
|
|
to read or
|
|
<B><A HREF="/cgi-bin/man/man2html?2+setsockopt">setsockopt</A></B>(2)
|
|
|
|
to write the option with the option level argument set to
|
|
<B>IPPROTO_TCP</B>.
|
|
|
|
Unless otherwise noted,
|
|
<I>optval</I>
|
|
|
|
is a pointer to an
|
|
<I>int</I>.
|
|
|
|
|
|
In addition,
|
|
most
|
|
<B>IPPROTO_IP</B>
|
|
|
|
socket options are valid on TCP sockets.
|
|
For more information see
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7).
|
|
|
|
<P>
|
|
|
|
Following is a list of TCP-specific socket options.
|
|
For details of some other socket options that are also applicable
|
|
for TCP sockets, see
|
|
<B><A HREF="/cgi-bin/man/man2html?7+socket">socket</A></B>(7).
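<P>
For the common case where <I>optval</I> points to an <I>int</I>,
the two calls can be wrapped as shown below.
This is a minimal sketch; <I>tcp_set_int</I> and <I>tcp_get_int</I> are helper names
invented for this example, not part of any library.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Set an int-valued TCP-level option; returns 0 on success, -1 on error. */
static int tcp_set_int(int fd, int optname, int value)
{
    return setsockopt(fd, IPPROTO_TCP, optname, &value, sizeof(value));
}

/* Read an int-valued TCP-level option into *value. */
static int tcp_get_int(int fd, int optname, int *value)
{
    socklen_t len = sizeof(*value);

    return getsockopt(fd, IPPROTO_TCP, optname, value, &len);
}
```

A call such as tcp_set_int(fd, TCP_NODELAY, 1) then sets a single option;
on failure both helpers return -1 and leave the reason in <I>errno</I>.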
|
|
|
|
<DL COMPACT>
|
|
<DT id="104"><B>TCP_CONGESTION</B> (since Linux 2.6.13)
|
|
|
|
<DD>
|
|
|
|
|
|
The argument for this option is a string.
|
|
This option allows the caller to set the TCP congestion control
|
|
algorithm to be used, on a per-socket basis.
|
|
Unprivileged processes are restricted to choosing one of the algorithms in
|
|
<I>tcp_allowed_congestion_control</I>
|
|
|
|
(described above).
|
|
Privileged processes
|
|
(<B>CAP_NET_ADMIN</B>)
|
|
|
|
can choose from any of the available congestion-control algorithms
|
|
(see the description of
|
|
<I>tcp_available_congestion_control</I>
|
|
|
|
above).
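<P>
Because the argument is a string rather than an <I>int</I>, the calls look slightly
different from those for most other options.
A hedged sketch follows; the helper names are made up for this example, and the
caller is assumed to pass a buffer large enough for the algorithm name.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Select a congestion-control algorithm by name (for example "reno"). */
static int set_congestion(int fd, const char *name)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, strlen(name));
}

/* Fetch the name of the algorithm in use into buf (NUL-terminated). */
static int get_congestion(int fd, char *buf, size_t size)
{
    socklen_t len = size;

    if (size == 0)
        return -1;
    memset(buf, 0, size);
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len) == -1)
        return -1;
    buf[size - 1] = '\0';   /* guard against an unterminated name */
    return 0;
}
```

An unprivileged caller should expect <I>set_congestion</I> to fail for any name
that is not listed in <I>tcp_allowed_congestion_control</I>.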
|
|
<DT id="105"><B>TCP_CORK</B> (since Linux 2.2)
|
|
|
|
<DD>
|
|
|
|
If set, don't send out partial frames.
|
|
All queued partial frames are sent when the option is cleared again.
|
|
This is useful for prepending headers before calling
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendfile">sendfile</A></B>(2),
|
|
|
|
or for throughput optimization.
|
|
As currently implemented, there is a 200 millisecond ceiling on the time
|
|
for which output is corked by
|
|
<B>TCP_CORK</B>.
|
|
|
|
If this ceiling is reached, then queued data is automatically transmitted.
|
|
This option can be combined with
|
|
<B>TCP_NODELAY</B>
|
|
|
|
only since Linux 2.5.71.
|
|
This option should not be used in code intended to be portable.
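<P>
The header-plus-file pattern mentioned above can be sketched as follows.
This is an illustration under simplifying assumptions (a blocking socket, and a
short write treated as an error); <I>send_header_then_file</I> is a hypothetical helper.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a header followed by count bytes from the start of file_fd,
 * letting the kernel coalesce them into full frames where possible.
 * Returns 0 on success, -1 on error.
 */
static int send_header_then_file(int sock, const void *hdr, size_t hdr_len,
                                 int file_fd, size_t count)
{
    int on = 1, off = 0;
    off_t offset = 0;
    int rc = 0;

    if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on)) == -1)
        return -1;

    /* Queue the header; a short write is treated as an error here. */
    if (write(sock, hdr, hdr_len) != (ssize_t)hdr_len)
        rc = -1;

    /* Append the file contents behind the header. */
    while (rc == 0 && count > 0) {
        ssize_t n = sendfile(sock, file_fd, &offset, count);

        if (n <= 0) {
            rc = -1;
            break;
        }
        count -= (size_t)n;
    }

    /* Clearing the option flushes any partial frame still queued. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &off, sizeof(off)) == -1)
        rc = -1;
    return rc;
}
```

The option is cleared even when sending fails, so that the socket is not left corked.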
|
|
<DT id="106"><B>TCP_DEFER_ACCEPT</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
|
|
|
|
|
|
Allow a listener to be awakened only when data arrives on the socket.
|
|
Takes an integer value (seconds); this can
bound the maximum number of attempts TCP will make to
complete the connection.
|
|
This option should not be used in code intended to be portable.
|
|
<DT id="107"><B>TCP_INFO</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
Used to collect information about this socket.
|
|
The kernel returns a <I>struct tcp_info</I> as defined in the file
|
|
<I>/usr/include/linux/tcp.h</I>.
|
|
|
|
This option should not be used in code intended to be portable.
|
|
<DT id="108"><B>TCP_KEEPCNT</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The maximum number of keepalive probes TCP should send
|
|
before dropping the connection.
|
|
This option should not be
|
|
used in code intended to be portable.
|
|
<DT id="109"><B>TCP_KEEPIDLE</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The time (in seconds) the connection needs to remain idle
|
|
before TCP starts sending keepalive probes, if the socket
|
|
option
|
|
<B>SO_KEEPALIVE</B>
|
|
|
|
has been set on this socket.
|
|
This option should not be used in code intended to be portable.
|
|
<DT id="110"><B>TCP_KEEPINTVL</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The time (in seconds) between individual keepalive probes.
|
|
This option should not be used in code intended to be portable.
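<P>
The three keepalive options are normally set together, after enabling
<B>SO_KEEPALIVE</B>.
A minimal sketch, with hypothetical helper names; the second function is only a
rough estimate of how long an unresponsive peer goes unnoticed.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive and override the three per-socket parameters. */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) == -1 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) == -1 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == -1)
        return -1;
    return 0;
}

/* Rough time, in seconds, until an unresponsive peer is given up on. */
static long keepalive_detect_seconds(int idle_s, int intvl_s, int count)
{
    return (long)idle_s + (long)intvl_s * count;
}
```

For instance, an idle time of 60 seconds with 5 probes sent 10 seconds apart
gives an estimate of 110 seconds before the connection is dropped.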
|
|
<DT id="111"><B>TCP_LINGER2</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
The lifetime of orphaned FIN_WAIT2 state sockets.
|
|
This option can be used to override the system-wide setting in the file
|
|
<I>/proc/sys/net/ipv4/tcp_fin_timeout</I>
|
|
|
|
for this socket.
|
|
This is not to be confused with the
|
|
<B><A HREF="/cgi-bin/man/man2html?7+socket">socket</A></B>(7)
|
|
|
|
level option
|
|
<B>SO_LINGER</B>.
|
|
|
|
This option should not be used in code intended to be portable.
|
|
<DT id="112"><B>TCP_MAXSEG</B>
|
|
|
|
<DD>
|
|
|
|
The maximum segment size for outgoing TCP packets.
|
|
In Linux 2.2 and earlier, and in Linux 2.6.28 and later,
|
|
if this option is set before connection establishment, it also
|
|
changes the MSS value announced to the other end in the initial packet.
|
|
Values greater than the (eventual) interface MTU have no effect.
|
|
TCP will also impose
|
|
its minimum and maximum bounds over the value provided.
|
|
<DT id="113"><B>TCP_NODELAY</B>
|
|
|
|
<DD>
|
|
|
|
If set, disable the Nagle algorithm.
|
|
This means that segments
|
|
are always sent as soon as possible, even if there is only a
|
|
small amount of data.
|
|
When not set, data is buffered until there
|
|
is a sufficient amount to send out, thereby avoiding the
|
|
frequent sending of small packets, which results in poor
|
|
utilization of the network.
|
|
This option is overridden by
|
|
<B>TCP_CORK</B>;
|
|
|
|
however, setting this option forces an explicit flush of
|
|
pending output, even if
|
|
<B>TCP_CORK</B>
|
|
|
|
is currently set.
|
|
<DT id="114"><B>TCP_QUICKACK</B> (since Linux 2.4.4)
|
|
|
|
<DD>
|
|
Enable quickack mode if set or disable quickack
|
|
mode if cleared.
|
|
In quickack mode, acks are sent
immediately, rather than delayed if needed in accordance
with normal TCP operation.
This flag is not permanent;
it only enables a switch to or from quickack mode.
|
|
Subsequent operation of the TCP protocol will
|
|
once again enter/leave quickack mode depending on
|
|
internal protocol processing and factors such as
|
|
delayed ack timeouts occurring and data transfer.
|
|
This option should not be used in code intended to be
|
|
portable.
|
|
<DT id="115"><B>TCP_SYNCNT</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Set the number of SYN retransmits that TCP should send before
|
|
aborting the attempt to connect.
|
|
It cannot exceed 255.
|
|
This option should not be used in code intended to be portable.
|
|
<DT id="116"><B>TCP_USER_TIMEOUT</B> (since Linux 2.6.37)
|
|
|
|
<DD>
|
|
|
|
|
|
|
|
|
|
|
|
This option takes an
|
|
<I>unsigned int</I>
|
|
|
|
as an argument.
|
|
When the value is greater than 0,
|
|
it specifies the maximum amount of time in milliseconds that transmitted
|
|
data may remain unacknowledged before TCP will forcibly close the
|
|
corresponding connection and return
|
|
<B>ETIMEDOUT</B>
|
|
|
|
to the application.
|
|
If the option value is specified as 0,
|
|
TCP will use the system default.
|
|
<DT id="117"><DD>
|
|
Increasing user timeouts allows a TCP connection to survive extended
|
|
periods without end-to-end connectivity.
|
|
Decreasing user timeouts
|
|
allows applications to "fail fast", if so desired.
|
|
Otherwise, failure may take up to 20 minutes with
|
|
the current system defaults in a normal WAN environment.
|
|
<DT id="118"><DD>
|
|
This option can be set during any state of a TCP connection,
|
|
but is effective only during the synchronized states of a connection
|
|
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, and LAST-ACK).
|
|
Moreover, when used with the TCP keepalive
|
|
(<B>SO_KEEPALIVE</B>)
|
|
|
|
option,
|
|
<B>TCP_USER_TIMEOUT</B>
|
|
|
|
will override keepalive to determine when to close a
|
|
connection due to keepalive failure.
|
|
<DT id="119"><DD>
|
|
The option has no effect on when TCP retransmits a packet,
|
|
nor when a keepalive probe is sent.
|
|
<DT id="120"><DD>
|
|
This option, like many others, will be inherited by the socket returned by
|
|
<B><A HREF="/cgi-bin/man/man2html?2+accept">accept</A></B>(2),
|
|
|
|
if it was set on the listening socket.
|
|
<DT id="121"><DD>
|
|
Further details on the user timeout feature can be found in
|
|
RFC 793 and RFC 5482 ("TCP User Timeout Option").
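<P>
Setting and reading back the timeout can be sketched as follows.
Note the <I>unsigned int</I> argument and the millisecond unit;
the helper names are made up for this example.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Set the user timeout in milliseconds; 0 selects the system default. */
static int set_user_timeout_ms(int fd, unsigned int ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &ms, sizeof(ms));
}

/* Read the user timeout currently configured on the socket. */
static int get_user_timeout_ms(int fd, unsigned int *ms)
{
    socklen_t len = sizeof(*ms);

    return getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, ms, &len);
}
```

A value of 30000, for example, asks TCP to give up after transmitted data has
remained unacknowledged for 30 seconds.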
|
|
<DT id="122"><B>TCP_WINDOW_CLAMP</B> (since Linux 2.4)
|
|
|
|
<DD>
|
|
|
|
Bound the size of the advertised window to this value.
|
|
The kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.
|
|
This option should not be used in code intended to be
|
|
portable.
|
|
</DL>
|
|
<A NAME="lbAH"> </A>
|
|
<H3>Sockets API</H3>
|
|
|
|
TCP provides limited support for out-of-band data,
|
|
in the form of (a single byte of) urgent data.
|
|
In Linux, this means that if the other end sends newer out-of-band
data, the older urgent data is inserted as normal data into
|
|
the stream (even when
|
|
<B>SO_OOBINLINE</B>
|
|
|
|
is not set).
|
|
This differs from BSD-based stacks.
|
|
<P>
|
|
|
|
Linux uses the BSD-compatible interpretation of the urgent
|
|
pointer field by default.
|
|
This violates RFC 1122, but is
|
|
required for interoperability with other stacks.
|
|
It can be changed via
|
|
<I>/proc/sys/net/ipv4/tcp_stdurg</I>.
|
|
|
|
<P>
|
|
|
|
It is possible to peek at out-of-band data using the
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recv">recv</A></B>(2)
|
|
|
|
<B>MSG_PEEK</B>
|
|
|
|
flag.
|
|
<P>
|
|
|
|
Since version 2.4, Linux supports the use of
|
|
<B>MSG_TRUNC</B>
|
|
|
|
in the
|
|
<I>flags</I>
|
|
|
|
argument of
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recv">recv</A></B>(2)
|
|
|
|
(and
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recvmsg">recvmsg</A></B>(2)).
|
|
|
|
This flag causes the received bytes of data to be discarded,
|
|
rather than passed back in a caller-supplied buffer.
|
|
Since Linux 2.4.4,
|
|
<B>MSG_TRUNC</B>
|
|
|
|
also has this effect when used in conjunction with
|
|
<B>MSG_OOB</B>
|
|
|
|
to receive out-of-band data.
|
|
<A NAME="lbAI"> </A>
|
|
<H3>Ioctls</H3>
|
|
|
|
The following
|
|
<B><A HREF="/cgi-bin/man/man2html?2+ioctl">ioctl</A></B>(2)
|
|
|
|
calls return information in
|
|
<I>value</I>.
|
|
|
|
The correct syntax is:
|
|
<P>
|
|
|
|
<DL COMPACT><DT id="123"><DD>
|
|
<PRE>
|
|
<B>int</B><I> value;</I>
|
|
<I>error</I><B> = ioctl(</B><I>tcp_socket</I><B>, </B><I>ioctl_type</I><B>, &</B><I>value</I><B>);</B>
|
|
</PRE>
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
|
|
<I>ioctl_type</I>
|
|
|
|
is one of the following:
|
|
<DL COMPACT>
|
|
<DT id="124"><B>SIOCINQ</B>
|
|
|
|
<DD>
|
|
Returns the amount of queued unread data in the receive buffer.
|
|
The socket must not be in LISTEN state, otherwise an error
|
|
(<B>EINVAL</B>)
|
|
|
|
is returned.
|
|
<B>SIOCINQ</B>
|
|
|
|
is defined in
|
|
<I><<A HREF="file:///usr/include/linux/sockios.h">linux/sockios.h</A>></I>.
|
|
|
|
|
|
|
|
Alternatively,
|
|
you can use the synonymous
|
|
<B>FIONREAD</B>,
|
|
|
|
defined in
|
|
<I><<A HREF="file:///usr/include/sys/ioctl.h">sys/ioctl.h</A>></I>.
|
|
|
|
<DT id="125"><B>SIOCATMARK</B>
|
|
|
|
<DD>
|
|
Returns true (i.e.,
|
|
<I>value</I>
|
|
|
|
is nonzero) if the inbound data stream is at the urgent mark.
|
|
<DT id="126"><DD>
|
|
If the
|
|
<B>SO_OOBINLINE</B>
|
|
|
|
socket option is set, and
|
|
<B>SIOCATMARK</B>
|
|
|
|
returns true, then the
|
|
next read from the socket will return the urgent data.
|
|
If the
|
|
<B>SO_OOBINLINE</B>
|
|
|
|
socket option is not set, and
|
|
<B>SIOCATMARK</B>
|
|
|
|
returns true, then the
|
|
next read from the socket will return the bytes following
|
|
the urgent data (to actually read the urgent data requires the
|
|
<B>recv(MSG_OOB)</B>
|
|
|
|
flag).
|
|
<DT id="127"><DD>
|
|
Note that a read never reads across the urgent mark.
|
|
If an application is informed of the presence of urgent data via
|
|
<B><A HREF="/cgi-bin/man/man2html?2+select">select</A></B>(2)
|
|
|
|
(using the
|
|
<I>exceptfds</I>
|
|
|
|
argument) or through delivery of a
|
|
<B>SIGURG</B>
|
|
|
|
signal,
|
|
then it can advance up to the mark using a loop which repeatedly tests
|
|
<B>SIOCATMARK</B>
|
|
|
|
and performs a read (requesting any number of bytes) as long as
|
|
<B>SIOCATMARK</B>
|
|
|
|
returns false.
|
|
<DT id="128"><B>SIOCOUTQ</B>
|
|
|
|
<DD>
|
|
Returns the amount of unsent data in the socket send queue.
|
|
The socket must not be in LISTEN state, otherwise an error
|
|
(<B>EINVAL</B>)
|
|
|
|
is returned.
|
|
<B>SIOCOUTQ</B>
|
|
|
|
is defined in
|
|
<I><<A HREF="file:///usr/include/linux/sockios.h">linux/sockios.h</A>></I>.
|
|
|
|
|
|
|
|
Alternatively,
|
|
you can use the synonymous
|
|
<B>TIOCOUTQ</B>,
|
|
|
|
defined in
|
|
<I><<A HREF="file:///usr/include/sys/ioctl.h">sys/ioctl.h</A>></I>.
|
|
|
|
</DL>
|
|
<A NAME="lbAJ"> </A>
|
|
<H3>Error handling</H3>
|
|
|
|
When a network error occurs, TCP tries to resend the packet.
|
|
If it doesn't succeed after some time, either
|
|
<B>ETIMEDOUT</B>
|
|
|
|
or the last received error on this connection is reported.
|
|
<P>
|
|
|
|
Some applications require a quicker error notification.
|
|
This can be enabled with the
|
|
<B>IPPROTO_IP</B>
|
|
|
|
level
|
|
<B>IP_RECVERR</B>
|
|
|
|
socket option.
|
|
When this option is enabled, all incoming
|
|
errors are immediately passed to the user program.
|
|
Use this option with care: it makes TCP less tolerant to routing
|
|
changes and other normal network conditions.
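<P>
Enabling the option is a single
<B>setsockopt</B>(2)
call at the
<B>IPPROTO_IP</B>
level.
A minimal sketch; <I>enable_recverr</I> is a name chosen for this example.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask for incoming errors to be passed to the program immediately. */
static int enable_recverr(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}
```

Given the caveat above, it may be preferable to enable this only on sockets
where a fast failure is worth more than tolerance of transient network problems.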
|
|
<A NAME="lbAK"> </A>
|
|
<H2>ERRORS</H2>
|
|
|
|
<DL COMPACT>
|
|
<DT id="129"><B>EAFNOTSUPPORT</B>
|
|
|
|
<DD>
|
|
Passed socket address type in
|
|
<I>sin_family</I>
|
|
|
|
was not
|
|
<B>AF_INET</B>.
|
|
|
|
<DT id="130"><B>EPIPE</B>
|
|
|
|
<DD>
|
|
The other end closed the socket unexpectedly or a read is
|
|
executed on a shut down socket.
|
|
<DT id="131"><B>ETIMEDOUT</B>
|
|
|
|
<DD>
|
|
The other end didn't acknowledge retransmitted data after some time.
|
|
</DL>
|
|
<P>
|
|
|
|
Any errors defined for
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7)
|
|
|
|
or the generic socket layer may also be returned for TCP.
|
|
<A NAME="lbAL"> </A>
|
|
<H2>VERSIONS</H2>
|
|
|
|
Support for Explicit Congestion Notification, zero-copy
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendfile">sendfile</A></B>(2),
|
|
|
|
reordering support and some SACK extensions
|
|
(DSACK) were introduced in 2.4.
|
|
Support for forward acknowledgement (FACK), TIME_WAIT recycling,
|
|
and per-connection keepalive socket options were introduced in 2.3.
|
|
<A NAME="lbAM"> </A>
|
|
<H2>BUGS</H2>
|
|
|
|
Not all errors are documented.
|
|
<BR>
|
|
|
|
IPv6 is not described.
|
|
<A NAME="lbAN"> </A>
|
|
<H2>SEE ALSO</H2>
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+accept">accept</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+bind">bind</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+connect">connect</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+getsockopt">getsockopt</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+listen">listen</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+recvmsg">recvmsg</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendfile">sendfile</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+sendmsg">sendmsg</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?2+socket">socket</A></B>(2),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?7+ip">ip</A></B>(7),
|
|
|
|
<B><A HREF="/cgi-bin/man/man2html?7+socket">socket</A></B>(7)
|
|
|
|
<P>
|
|
|
|
RFC 793 for the TCP specification.
|
|
<BR>
|
|
|
|
RFC 1122 for the TCP requirements and a description of the Nagle algorithm.
|
|
<BR>
|
|
|
|
RFC 1323 for TCP timestamp and window scaling options.
|
|
<BR>
|
|
|
|
RFC 1337 for a description of TIME_WAIT assassination hazards.
|
|
<BR>
|
|
|
|
RFC 3168 for a description of Explicit Congestion Notification.
|
|
<BR>
|
|
|
|
RFC 2581 for TCP congestion control algorithms.
|
|
<BR>
|
|
|
|
RFC 2018 and RFC 2883 for SACK and extensions to SACK.
|
|
<A NAME="lbAO"> </A>
|
|
<H2>COLOPHON</H2>
|
|
|
|
This page is part of release 5.05 of the Linux
|
|
<I>man-pages</I>
|
|
|
|
project.
|
|
A description of the project,
|
|
information about reporting bugs,
|
|
and the latest version of this page,
|
|
can be found at
|
|
<A HREF="https://www.kernel.org/doc/man-pages/">https://www.kernel.org/doc/man-pages/</A>.
|
|
<P>
|
|
|
|
<HR>
|
|
<A NAME="index"> </A><H2>Index</H2>
|
|
<DL>
|
|
<DT id="132"><A HREF="#lbAB">NAME</A><DD>
|
|
<DT id="133"><A HREF="#lbAC">SYNOPSIS</A><DD>
|
|
<DT id="134"><A HREF="#lbAD">DESCRIPTION</A><DD>
|
|
<DL>
|
|
<DT id="135"><A HREF="#lbAE">Address formats</A><DD>
|
|
<DT id="136"><A HREF="#lbAF">/proc interfaces</A><DD>
|
|
<DT id="137"><A HREF="#lbAG">Socket options</A><DD>
|
|
<DT id="138"><A HREF="#lbAH">Sockets API</A><DD>
|
|
<DT id="139"><A HREF="#lbAI">Ioctls</A><DD>
|
|
<DT id="140"><A HREF="#lbAJ">Error handling</A><DD>
|
|
</DL>
|
|
<DT id="141"><A HREF="#lbAK">ERRORS</A><DD>
|
|
<DT id="142"><A HREF="#lbAL">VERSIONS</A><DD>
|
|
<DT id="143"><A HREF="#lbAM">BUGS</A><DD>
|
|
<DT id="144"><A HREF="#lbAN">SEE ALSO</A><DD>
|
|
<DT id="145"><A HREF="#lbAO">COLOPHON</A><DD>
|
|
</DL>
|
|
<HR>
|
|
This document was created by
|
|
<A HREF="/cgi-bin/man/man2html">man2html</A>,
|
|
using the manual pages.<BR>
|
|
Time: 00:06:10 GMT, March 31, 2021
|
|
</BODY>
|
|
</HTML>
|