Moved transport-related pages into docs
@@ -147,13 +147,13 @@ Selecting peers, requesting tunnels through those peers, and encrypting and rout
|
||||
<h3>Transport Layer</h3>
|
||||
The protocols for direct (point-to-point) router to router communication.
|
||||
<ul><li>
|
||||
<a href="transport.html">Transport layer overview</a>
|
||||
<a href="{{ site_url('docs/transport') }}">Transport layer overview</a>
|
||||
</li><li>
|
||||
<a href="ntcp.html">NTCP</a> TCP-based transport overview and specification
|
||||
<a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a> TCP-based transport overview and specification
|
||||
</li><li>
|
||||
<a href="udp.html">SSU</a> UDP-based transport overview
|
||||
<a href="{{ site_url('docs/transport/ssu') }}">SSU</a> UDP-based transport overview
|
||||
</li><li>
|
||||
<a href="udp_spec.html">SSU specification</a>
|
||||
<a href="{{ site_url('docs/transport/ssu/spec') }}">SSU specification</a>
|
||||
</li><li>
|
||||
<a href="{{ site_url('docs/how/cryptography') }}#tcp">NTCP transport encryption</a>
|
||||
</li><li>
|
||||
|
127
i2p2www/pages/site/docs/transport/index.html
Normal file
@@ -0,0 +1,127 @@
|
||||
{% extends "global/layout.html" %}
|
||||
{% block title %}Transport Overview{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Updated July 2010, current as of router version 0.8
|
||||
|
||||
<h1>Transports in I2P</h1>
|
||||
|
||||
A "transport" in I2P is a method for direct, point-to-point communication
|
||||
between two routers.
|
||||
Transports must provide confidentiality and integrity
|
||||
against external adversaries while authenticating that the router contacted
|
||||
is the one who should receive a given message.
|
||||
|
||||
<p> I2P supports multiple transports simultaneously.
|
||||
There are two transports currently implemented:
|
||||
<ol>
|
||||
<li> <a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a>, a Java New I/O (NIO) TCP transport
|
||||
<li> <a href="{{ site_url('docs/transport/ssu') }}">SSU</a>, or Secure Semireliable UDP
|
||||
</ol>
|
||||
|
||||
Each provides a "connection" paradigm, with authentication,
|
||||
flow control, acknowledgments and retransmission.
|
||||
|
||||
|
||||
<h2>Transport Services</h2>
|
||||
|
||||
The transport subsystem in I2P provides the following services:
|
||||
<ul>
|
||||
<li>Maintain a set of router addresses, one or more for each transport,
|
||||
that the router publishes as its global contact information (the RouterInfo)
|
||||
<li>Selection of the best transport for each outgoing message
|
||||
<li>Queueing of outbound messages by priority
|
||||
<li>Bandwidth limiting, both outbound and inbound, according to router configuration
|
||||
<li>Setup and teardown of transport connections
|
||||
<li>Encryption of point-to-point communications
|
||||
<li>Maintenance of connection limits for each transport, implementation of various thresholds for these limits,
|
||||
and communication of threshold status to the router so it may make operational changes based on the status
|
||||
<li>Firewall port opening using UPnP (Universal Plug and Play)
|
||||
<li>Cooperative NAT/Firewall traversal
|
||||
<li>Local IP detection by various methods, including UPnP, inspection of incoming connections, and enumeration of network devices
|
||||
<li>Coordination of firewall status and local IP, and changes to either, among the transports
|
||||
<li>Communication of firewall status and local IP, and changes to either, to the router and the user interface
|
||||
<li>Determination of a consensus clock, which is used to periodically update the router's clock, as a backup for NTP
|
||||
<li>Maintenance of status for each peer, including whether it is connected, whether it was recently connected,
|
||||
and whether it was reachable in the last attempt
|
||||
<li>Qualification of valid IP addresses according to a local rule set
|
||||
<li>Honoring the automated and manual lists of banned peers maintained by the router,
|
||||
and refusing outbound and inbound connections to those peers
|
||||
</ul>
|
||||
|
||||
|
||||
<h2>Transport Addresses</h2>
|
||||
|
||||
The transport subsystem maintains a set of router addresses, each of which lists a transport method, IP, and port.
|
||||
These addresses constitute the advertised contact points, and are published by the router to the network database.
|
||||
<p>
|
||||
Typical scenarios are:
|
||||
<ul>
|
||||
<li>A router has no published addresses, so it is considered "hidden" and cannot receive incoming connections
|
||||
<li>A router is firewalled, and therefore publishes an SSU address which contains a list of cooperating
|
||||
peers or "introducers" who will assist in NAT traversal (see <a href="{{ site_url('docs/transport/ssu') }}">the SSU spec</a> for details)
|
||||
<li>A router is not firewalled or its NAT ports are open; it publishes both NTCP and SSU addresses containing
|
||||
directly-accessible IP and ports.
|
||||
</ul>
|
||||
|
||||
<h2>Transport Selection</h2>
|
||||
|
||||
The transport system delivers <a href="i2np.html">I2NP messages</a>. The transport selected for any message is
|
||||
independent of the application-layer protocol (TCP or UDP).
|
||||
<p>
|
||||
|
||||
For each outgoing message, the transport system solicits "bids" from each transport.
|
||||
The transport bidding the lowest (best) value wins the bid and receives the message for delivery.
|
||||
A transport may refuse to bid.
|
||||
<p>
|
||||
Whether a transport bids, and with what value, depend on numerous factors:
|
||||
<ul>
|
||||
<li>Configuration of transport preferences
|
||||
<li>Whether the transport is already connected to the peer
|
||||
<li>The number of current connections compared to various connection limit thresholds
|
||||
<li>Whether recent connection attempts to the peer have failed
|
||||
<li>The size of the message, as different transports have different size limits
|
||||
<li>Whether the peer can accept incoming connections for that transport, as advertised in its RouterInfo
|
||||
<li>Whether the connection would be indirect (requiring introducers) or direct
|
||||
<li>The peer's transport preference, as advertised in its RouterInfo
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
In general, the bid values are selected so that two routers are only connected by a single transport
|
||||
at any one time. However, this is not a requirement.
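The bidding process described above can be sketched as follows. This is a minimal illustration in Python, not the router's actual Java code: the `Bid` type, the `make_bidder` helper, and all numeric bid values and size limits are hypothetical, and only message size is modeled out of the factors listed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Bid:
    transport: str  # name of the transport that placed the bid
    value: int      # lower is better


# A bidder returns a Bid, or None when the transport refuses to bid.
Bidder = Callable[[str, int], Optional[Bid]]


def make_bidder(name: str, value: int, max_size: int) -> Bidder:
    """Build a toy bidder that refuses messages above max_size (hypothetical limit)."""
    def bidder(peer: str, message_size: int) -> Optional[Bid]:
        if message_size > max_size:
            return None  # refuse to bid
        return Bid(name, value)
    return bidder


def select_transport(bidders: List[Bidder], peer: str, message_size: int) -> Optional[Bid]:
    """Solicit a bid from every transport and return the lowest one, if any."""
    bids = [b(peer, message_size) for b in bidders]
    bids = [b for b in bids if b is not None]
    if not bids:
        return None  # no transport is willing to carry the message
    return min(bids, key=lambda b: b.value)
```

With two such bidders, the transport that returns the lower value receives the message; if every transport refuses, the caller gets `None` and must handle the failure.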
|
||||
|
||||
|
||||
|
||||
<h2>New Transports and Future Work</h2>
|
||||
|
||||
Additional transports may be developed, including:
|
||||
|
||||
<ul>
|
||||
<li>A TLS/SSH look-alike transport
|
||||
<li>An "indirect" transport for routers that are not reachable by all other routers (one form of "restricted routes")
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
Also, the existing transports will be enhanced to support multiple addresses within a single transport,
|
||||
including IPv6 addresses. Currently, a transport may only advertise a single IPv4 address.
|
||||
|
||||
<p>
|
||||
Work continues on adjusting default connection limits for each transport.
|
||||
I2P is designed as a "mesh network", where it is assumed that any router can connect to any other router.
|
||||
This assumption may be broken by routers that have exceeded their connection limits, and by
|
||||
routers that are behind restrictive state firewalls (restricted routes).
|
||||
|
||||
<p>
|
||||
The current connection limits are higher for SSU than for NTCP, based on the assumption that
|
||||
the memory requirements for an NTCP connection are higher than those for an SSU connection.
|
||||
However, as NTCP buffers are partially in the kernel and SSU buffers are on the Java heap,
|
||||
that assumption is difficult to verify.
|
||||
|
||||
</p><p>
|
||||
Analyze
|
||||
<a href="http://www.cse.chalmers.se/%7Ejohnwolf/publications/hjelmvik_breaking.pdf">Breaking and Improving Protocol Obfuscation</a>
|
||||
and see how transport-layer padding may improve things.
|
||||
</p>
|
||||
|
||||
|
||||
{% endblock %}
|
559
i2p2www/pages/site/docs/transport/ntcp/discussion.html
Normal file
@@ -0,0 +1,559 @@
|
||||
{% extends "global/layout.html" %}
|
||||
{% block title %}NTCP Discussion{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Following is a discussion about NTCP that took place in March 2007.
|
||||
It has not been updated to reflect current implementation.
|
||||
For the current NTCP specification see <a href="{{ site_url('docs/transport/ntcp') }}">the main NTCP page</a>.
|
||||
|
||||
<h2>NTCP vs. SSU Discussion, March 2007</h2>
|
||||
<h3>NTCP questions</h3>
|
||||
(adapted from an IRC discussion between zzz and cervantes)
|
||||
<br />
|
||||
Why is NTCP preferred over SSU, doesn't NTCP have higher overhead and latency?
|
||||
It has better reliability.
|
||||
<br />
|
||||
Doesn't streaming lib over NTCP suffer from classic TCP-over-TCP issues?
|
||||
What if we had a really simple UDP transport for streaming-lib-originated traffic?
|
||||
I think SSU was meant to be the so-called really simple UDP transport - but it just proved too unreliable.
|
||||
|
||||
<h3>"NTCP Considered Harmful" Analysis by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-25.
|
||||
This was posted to stimulate discussion; don't take it too seriously.
|
||||
<p>
|
||||
Summary: NTCP has higher latency and overhead than SSU, and is more likely to
|
||||
collapse when used with the streaming lib. However, traffic is routed with a
|
||||
preference for NTCP over SSU and this is currently hardcoded.
|
||||
</p>
|
||||
|
||||
<h4>Discussion</h4>
|
||||
<p>
|
||||
We currently have two transports, NTCP and SSU. As currently implemented, NTCP
|
||||
has lower "bids" than SSU so it is preferred, except for the case where there
|
||||
is an established SSU connection but no established NTCP connection for a peer.
|
||||
</p><p>
|
||||
|
||||
SSU is similar to NTCP in that it implements acknowledgments, timeouts, and
|
||||
retransmissions. However, SSU is I2P code with tight constraints on the
|
||||
timeouts and available statistics on round trip times, retransmissions, etc.
|
||||
NTCP is based on Java NIO TCP, which is a black box and presumably implements
|
||||
RFC standards, including very long maximum timeouts.
|
||||
</p><p>
|
||||
|
||||
The majority of traffic within I2P is streaming-lib originated (HTTP, IRC,
|
||||
BitTorrent); the streaming lib is our implementation of TCP. As the lower-level transport is
|
||||
generally NTCP due to the lower bids, the system is subject to the well-known
|
||||
and dreaded problem of TCP-over-TCP
|
||||
(<a href="http://sites.inka.de/~W1011/devel/tcp-tcp.html">http://sites.inka.de/~W1011/devel/tcp-tcp.html</a>), where both the higher and
|
||||
lower layers of TCP are doing retransmissions at once, leading to collapse.
|
||||
</p><p>
|
||||
|
||||
Unlike in the PPP over SSH scenario described in the link above, we have
|
||||
several hops for the lower layer, each covered by an NTCP link. So each NTCP
|
||||
latency is generally much less than the higher-layer streaming lib latency.
|
||||
This lessens the chance of collapse.
|
||||
</p><p>
|
||||
|
||||
Also, the probabilities of collapse are lessened when the lower-layer TCP is
|
||||
tightly constrained with low timeouts and number of retransmissions compared to
|
||||
the higher layer.
|
||||
</p><p>
|
||||
|
||||
The .28 release increased the maximum streaming lib timeout from 10 sec to 45
|
||||
sec which greatly improved things. The SSU max timeout is 3 sec. The NTCP max
|
||||
timeout is presumably at least 60 sec, which is the RFC recommendation. There
|
||||
is no way to change NTCP parameters or monitor performance. Collapse of the
|
||||
NTCP layer is [editor: text lost]. Perhaps an external tool like tcpdump would help.
|
||||
</p><p>
|
||||
|
||||
However, running .28, the i2psnark reported upstream does not generally stay at
|
||||
a high level. It often goes down to 3-4 KBps before climbing back up. This is a
|
||||
signal that there are still collapses.
|
||||
</p><p>
|
||||
|
||||
SSU is also more efficient. NTCP has higher overhead and probably higher round
|
||||
trip times. When using NTCP, the ratio of (tunnel output) / (i2psnark data
|
||||
output) is at least 3.5 : 1. Running an experiment where the code was modified
|
||||
to prefer SSU (the config option i2np.udp.alwaysPreferred has no effect in the
|
||||
current code), the ratio reduced to about 3 : 1, indicating better efficiency.
|
||||
</p><p>
|
||||
|
||||
As reported by streaming lib stats, things were much improved - lifetime window
|
||||
size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per ack down from
|
||||
1.11 to 1.07.
|
||||
</p><p>
|
||||
|
||||
That this was quite effective was surprising, given that we were only changing
|
||||
the transport for the first of 3 to 5 total hops the outbound messages would
|
||||
take.
|
||||
</p><p>
|
||||
|
||||
The effect on outbound i2psnark speeds wasn't clear due to normal variations.
|
||||
Also for the experiment, inbound NTCP was disabled. The effect on inbound
|
||||
speeds on i2psnark was not clear.
|
||||
</p>
|
||||
<h4>Proposals</h4>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
1A)
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
|
||||
<li>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic and have SSU generate a low bid
|
||||
for that traffic. This tag will have to go with the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
|
||||
|
||||
<li>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
|
||||
<li>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ack-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
|
||||
<li>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
|
||||
<li>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low.
|
||||
The TCP RFC says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout (presumably 60s).
|
||||
</ul>
|
||||
|
||||
<h3>Response by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-27
|
||||
<p>
|
||||
On the whole, I'm open to experimenting with this, though remember why NTCP is
|
||||
there in the first place - SSU failed in a congestion collapse. NTCP "just
|
||||
works", and while 2-10% retransmission rates can be handled in normal
|
||||
single-hop networks, that gives us a 40% retransmission rate with 2 hop
|
||||
tunnels. If you loop in some of the measured SSU retransmission rates we saw
|
||||
back before NTCP was implemented (10-30+%), that gives us an 83% retransmission
|
||||
rate. Perhaps those rates were caused by the low 10 second timeout, but
|
||||
increasing that much would bite us (remember, multiply by 5 and you've got half
|
||||
the journey).
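The figures above appear consistent with compounding a per-link retransmission rate over five links. That reading is an assumption inferred from "multiply by 5"; the post does not state the formula. Under that assumption, and treating the links as independent, the chance that a message needs at least one retransmission somewhere along the path is 1 - (1 - p)^n:

```python
def path_retransmission_rate(per_link_rate: float, links: int) -> float:
    """Probability that at least one of `links` independent links needs a retransmission."""
    return 1.0 - (1.0 - per_link_rate) ** links


# Per-link rates mentioned in the post, compounded over an assumed 5 links.
for p in (0.02, 0.10, 0.30):
    print(f"{p:.0%} per link over 5 links -> {path_retransmission_rate(p, 5):.0%}")
```

A 10% per-link rate gives about 41% and a 30% per-link rate gives about 83%, which is close to the 40% and 83% quoted.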
|
||||
</p><p>
|
||||
|
||||
Unlike TCP, we have no feedback from the tunnel to know whether the message
|
||||
made it - there are no tunnel level acks. We do have end to end ACKs, but only
|
||||
on a small number of messages (whenever we distribute new session tags) - out
|
||||
of the 1,553,591 client messages my router sent, we only attempted to ACK
|
||||
145,207 of them. The others may have failed silently or succeeded perfectly.
|
||||
</p><p>
|
||||
|
||||
I'm not convinced by the TCP-over-TCP argument for us, especially split across
|
||||
the various paths we transfer down. Measurements on I2P can convince me
|
||||
otherwise, of course.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
The NTCP max timeout is presumably at least 60 sec, which is the RFC
|
||||
recommendation. There is no way to change NTCP parameters or monitor
|
||||
performance.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
True, but net connections only get up to that level when something really bad
|
||||
is going on - the retransmission timeout on TCP is often on the order of tens
|
||||
or hundreds of milliseconds. As foofighter points out, they've got 20+ years
|
||||
experience and bugfixing in their TCP stacks, plus a billion dollar industry
|
||||
optimizing hardware and software to perform well according to whatever it is
|
||||
they do.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
NTCP has higher overhead and probably higher round trip times. When using NTCP
|
||||
the ratio of (tunnel output) / (i2psnark data output) is at least 3.5 : 1.
|
||||
Running an experiment where the code was modified to prefer SSU (the config
|
||||
option i2np.udp.alwaysPreferred has no effect in the current code), the ratio
|
||||
reduced to about 3 : 1, indicating better efficiency.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
This is very interesting data, though more as a matter of router congestion
|
||||
than bandwidth efficiency - you'd have to compare 3.5*$n*$NTCPRetransmissionPct
|
||||
./. 3.0*$n*$SSURetransmissionPct. This data point suggests there's something in
|
||||
the router that leads to excess local queuing of messages already being
|
||||
transferred.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
lifetime window size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per
|
||||
ACK down from 1.11 to 1.07.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Remember that the sends-per-ACK is only a sample not a full count (as we don't
|
||||
try to ACK every send). It's not a random sample either, but instead samples
|
||||
more heavily periods of inactivity or the initiation of a burst of activity -
|
||||
sustained load won't require many ACKs.
|
||||
</p><p>
|
||||
|
||||
Window sizes in that range are still woefully low to get the real benefit of
|
||||
AIMD, and still too low to transmit a single 32KB BT chunk (increasing the
|
||||
floor to 10 or 12 would cover that).
|
||||
</p><p>
|
||||
|
||||
Still, the wsize stat looks promising - over how long was that maintained?
|
||||
</p><p>
|
||||
|
||||
Actually, for testing purposes, you may want to look at
|
||||
StreamSinkClient/StreamSinkServer or even TestSwarm in
|
||||
apps/ministreaming/java/src/net/i2p/client/streaming/ - StreamSinkClient is a
|
||||
CLI app that sends a selected file to a selected destination and
|
||||
StreamSinkServer creates a destination and writes out any data sent to it
|
||||
(displaying size and transfer time). TestSwarm combines the two - flooding
|
||||
random data to whomever it connects to. That should give you the tools to
|
||||
measure sustained throughput capacity over the streaming lib, as opposed to BT
|
||||
choke/send.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1A)
|
||||
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
Honoring i2np.udp.alwaysPreferred is a good idea in any case - please feel free
|
||||
to commit that change. Let's gather a bit more data though before switching the
|
||||
preferences, as NTCP was added to deal with an SSU-created congestion collapse.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic
|
||||
and have SSU generate a low bid for that traffic. This tag will have to go with
|
||||
the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
In practice, there are three types of traffic - tunnel building/testing, netDb
|
||||
query/response, and streaming lib traffic. The network has been designed to
|
||||
make differentiating those three very hard.
|
||||
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
</p><p>
|
||||
|
||||
In my view, to really address the core issue we need to address why the router
|
||||
gets too congested to ACK in time (which, from what I've found, is due to CPU
|
||||
contention). Maybe we can juggle some things in the router's processing to make
|
||||
the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ACK-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
There's UDPTransport.PRIORITY_LIMITS and UDPTransport.PRIORITY_WEIGHT (honored
|
||||
by TimedWeightedPriorityMessageQueue), but currently the weights are almost all
|
||||
equal, so there's no effect. That could be adjusted, of course (but as you
|
||||
mention, if there's no queuing, it doesn't matter).
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low. The TCP RFC
|
||||
says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout
|
||||
(presumably 60s).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
That 45s is the max retransmission timeout of the streaming lib though, not the
|
||||
stream timeout. TCP in practice has retransmission timeouts orders of magnitude
|
||||
less, though yes, can get to 60s on links running through exposed wires or
|
||||
satellite transmissions ;) If we increase the streaming lib retransmission
|
||||
timeout to e.g. 75 seconds, we could go get a beer before a web page loads
|
||||
(especially assuming less than a 98% reliable transport). That's one reason we
|
||||
prefer NTCP.
|
||||
</p>
|
||||
|
||||
|
||||
<h3>Response by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
|
||||
<i>
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
<br>
|
||||
In my view, to really address the core issue we need to address why the
|
||||
router gets too congested to ACK in time (which, from what I've found, is due to
|
||||
CPU contention). Maybe we can juggle some things in the router's processing to
|
||||
make the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
One of my main stats-gathering techniques is turning on
|
||||
net.i2p.client.streaming.ConnectionPacketHandler=DEBUG and watching the RTT
|
||||
times and window sizes as they go by. To overgeneralize for a moment, it's
|
||||
common to see 3 types of connections: ~4s RTT, ~10s RTT, and ~30s RTT. Trying
|
||||
to knock down the 30s RTT connections is the goal. If CPU contention is the
|
||||
cause then maybe some juggling will do it.
|
||||
</p><p>
|
||||
|
||||
Reducing the SSU max retrans from 10 is really just a stab in the dark as we
|
||||
don't have good data on whether we are collapsing, having TCP-over-TCP issues,
|
||||
or what, so more data is needed.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
What I don't understand, if you could elaborate, are the benefits of SSU
|
||||
retransmissions for non-streaming-lib traffic. Do we need tunnel messages (for
|
||||
example) to use a semi-reliable transport or can they use an unreliable or
|
||||
kinda-sorta-reliable transport (1 or 2 retransmissions max, for example)? In
|
||||
other words, why semi-reliability?
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
(but as you mention, if there's no queuing, it doesn't matter).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
I implemented priority sending for UDP but it kicked in about 100,000 times
|
||||
less often than the code on the NTCP side. Maybe that's a clue for further
|
||||
investigation or a hint - I don't understand why it would back up that much
|
||||
more often on NTCP, but maybe that's a hint on why NTCP performs worse.
|
||||
|
||||
</p>
|
||||
|
||||
<h3>Question answered by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
measured SSU retransmission rates we saw back before NTCP was implemented
|
||||
(10-30+%)
|
||||
</p><p>
|
||||
|
||||
Can the router itself measure this? If so, could a transport be selected based
|
||||
on measured performance? (i.e. if an SSU connection to a peer is dropping an
|
||||
unreasonable number of messages, prefer NTCP when sending to that peer)
|
||||
</p><p>
|
||||
|
||||
|
||||
|
||||
Yeah, it currently uses that stat right now as a poor-man's MTU detection (if
|
||||
the retransmission rate is high, it uses the small packet size, but if it's low,
|
||||
it uses the large packet size). We tried a few things when first introducing
|
||||
NTCP (and when first moving away from the original TCP transport) that would
|
||||
prefer SSU but fail that transport for a peer easily, causing it to fall back
|
||||
on NTCP. However, there's certainly more that could be done in that regard,
|
||||
though it gets complicated quickly (how/when to adjust/reset the bids, whether
|
||||
to share these preferences across multiple peers or not, whether to share it
|
||||
across multiple sessions with the same peer (and for how long), etc).
|
||||
|
||||
|
||||
<h3>Response by foofighter</h3>
|
||||
Posted to new Syndie, 2007-03-26
|
||||
<p>
|
||||
|
||||
If I've understood things right, the primary reason in favor of TCP (in
|
||||
general, both the old and new variety) was that you needn't worry about coding
|
||||
a good TCP stack. Which ain't impossibly hard to get right... just that
|
||||
existing TCP stacks have a 20 year lead.
|
||||
</p><p>
|
||||
|
||||
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
|
||||
UDP, except the following considerations:
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
A TCP-only network is very dependent on reachable peers (those who can forward
|
||||
incoming connections through their NAT)
|
||||
<li>
|
||||
Still, even if reachable peers are rare, having them be high capacity somewhat
|
||||
alleviates the topological scarcity issues
|
||||
<li>
|
||||
UDP allows for "NAT hole punching" which lets people be "kind of
|
||||
pseudo-reachable" (with the help of introducers) who could otherwise only
|
||||
connect out
|
||||
<li>
|
||||
The "old" TCP transport implementation required lots of threads, which was a
|
||||
performance killer, while the "new" TCP transport does well with few threads
|
||||
<li>
|
||||
Routers of set A crap out when saturated with UDP. Routers of set B crap out
|
||||
when saturated with TCP.
|
||||
<li>
|
||||
It "feels" (as in, there are some indications but no scientific data or
|
||||
quality statistics) that A is more widely deployed than B
|
||||
<li>
|
||||
Some networks carry non-DNS UDP datagrams with an outright shitty quality,
|
||||
while still somewhat bothering to carry TCP streams.
|
||||
</ul>
|
||||
</p><p>
|
||||
|
||||
|
||||
On that background, a small diversity of transports (as many as needed, but not
|
||||
more) appears sensible in either case. Which should be the main transport,
|
||||
depends on their performance. I've seen nasty stuff on my line when I
|
||||
tried to use its full capacity with UDP. Packet losses on the level of 35%.
|
||||
</p><p>
|
||||
|
||||
We could definitely try playing with UDP versus TCP priorities, but I'd urge
|
||||
caution in that. I would urge that they not be changed too radically all at
|
||||
once, or it might break things.
|
||||
|
||||
</p>
|
||||
|
||||
<h3>Response by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-27
|
||||
<p>
|
||||
<i>
|
||||
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
|
||||
UDP, except the following considerations:
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
These are all valid issues. However you are considering the two protocols in
|
||||
isolation, rather than thinking about what transport protocol is best for a
|
||||
particular higher-level protocol (i.e. streaming lib or not).
|
||||
</p><p>
|
||||
|
||||
What I'm saying is you have to take the streaming lib into consideration.
|
||||
|
||||
So either shift the preferences for everybody or treat streaming lib traffic
|
||||
differently.
|
||||
|
||||
That's what my proposal 1B) is talking about - have a different preference for
|
||||
streaming-lib traffic than for non streaming-lib traffic (for example tunnel
|
||||
build messages).
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
|
||||
On that background, a small diversity of transports (as many as needed, but
|
||||
not more) appears sensible in either case. Which should be the main transport,
|
||||
depends on their performance. I've seen nasty stuff on my line when I
|
||||
tried to use its full capacity with UDP. Packet losses on the level of 35%.
|
||||
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
Agreed. The new .28 may have made things better for packet loss over UDP, or
|
||||
maybe not.
|
||||
|
||||
One important point - the transport code does remember failures of a transport.
|
||||
So if UDP is the preferred transport, it will try it first, but if it fails for
|
||||
a particular destination, on the next attempt for that destination it will try
|
||||
NTCP rather than trying UDP again.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
We could definitely try playing with UDP versus TCP priorities, but I'd urge
|
||||
caution in that. I would urge that they not be changed too radically all at
|
||||
once, or it might break things.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
We have four tuning knobs - the four bid values (SSU and NTCP, for
|
||||
already-connected and not-already-connected).
|
||||
We could make SSU be preferred over NTCP only if both are connected, for
|
||||
example, but try NTCP first if neither transport is connected.
|
||||
</p><p>
|
||||
|
||||
The other way to do it gradually is only shifting the streaming lib traffic
|
||||
(the 1B proposal); however, that could be hard and may have anonymity
|
||||
implications, I don't know. Or maybe shift the traffic only for the first
|
||||
outbound hop (i.e. don't propagate the flag to the next router), which gives
|
||||
you only partial benefit but might be more anonymous and easier.
|
||||
</p>
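<p>
The four bid values mentioned above can be pictured as a small lookup table.
The following is only a sketch: the numeric bids are hypothetical, the rule that
the lowest bid wins is an assumption, and <code>pick_transport</code> is an
illustrative name rather than anything in the router code.
The <code>failed</code> argument models the remembered per-destination transport failures described earlier.
</p>

```python
# Hypothetical bid values for illustration only; lower is assumed to win.
EXAMPLE_BIDS = {
    ("SSU", True): 15,    # SSU, already connected
    ("NTCP", True): 25,   # NTCP, already connected
    ("NTCP", False): 70,  # NTCP, not yet connected
    ("SSU", False): 80,   # SSU, not yet connected
}


def pick_transport(bids, connected, failed=frozenset()):
    """Return the transport with the lowest bid, skipping remembered failures.

    bids      -- mapping of (transport, is_connected) to a numeric bid
    connected -- set of transports with an open connection to the peer
    failed    -- set of transports that recently failed for this peer
    """
    candidates = {
        name: bids[(name, name in connected)]
        for name in ("SSU", "NTCP")
        if name not in failed
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)
```

<p>
With these example values, SSU is chosen when both transports are connected and
NTCP is tried first when neither is, which is the arrangement suggested in the response.
</p>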
|
||||
|
||||
<h3>Results of the Discussion</h3>
|
||||
... and other related changes in the same timeframe (2007):
|
||||
<ul>
|
||||
<li>
|
||||
Significant tuning of the streaming lib parameters,
|
||||
greatly increasing outbound performance, was implemented in 0.6.1.28
|
||||
<li>
|
||||
Priority sending for NTCP was implemented in 0.6.1.28
|
||||
<li>
|
||||
Priority sending for SSU was implemented by zzz but was never checked in
|
||||
<li>
|
||||
The advanced transport bid control
|
||||
i2np.udp.preferred was implemented in 0.6.1.29.
|
||||
<li>
|
||||
Pushback for NTCP was implemented in 0.6.1.30, disabled in 0.6.1.31 due to anonymity concerns,
|
||||
and re-enabled with improvements to address those concerns in 0.6.1.32.
|
||||
<li>
|
||||
None of zzz's proposals 1-5 have been implemented.
|
||||
</ul>
|
||||
|
||||
{% endblock %}
|
434
i2p2www/pages/site/docs/transport/ntcp/index.html
Normal file
@@ -0,0 +1,434 @@
|
||||
{% extends "global/layout.html" %}
|
||||
{% block title %}NTCP{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Updated August 2010 for release 0.8
|
||||
|
||||
<h2>NTCP (NIO-based TCP)</h2>
|
||||
|
||||
<p>
|
||||
NTCP
|
||||
is one of two <a href="{{ site_url('docs/transport') }}">transports</a> currently implemented in I2P.
|
||||
The other is <a href="{{ site_url('docs/transport/ssu') }}">SSU</a>.
|
||||
NTCP
|
||||
is a Java NIO-based transport
|
||||
introduced in I2P release 0.6.1.22.
|
||||
Java NIO (new I/O) does not suffer from the 1 thread per connection issues of the old TCP transport.
|
||||
</p><p>
|
||||
|
||||
By default,
|
||||
NTCP uses the IP/Port
|
||||
auto-detected by SSU. When enabled on config.jsp,
|
||||
SSU will notify/restart NTCP when the external address changes
|
||||
or when the firewall status changes.
|
||||
Now you can enable inbound TCP without a static IP or dyndns service.
|
||||
</p><p>
|
||||
|
||||
The NTCP code within I2P is relatively lightweight (1/4 the size of the SSU code)
|
||||
because it uses the underlying Java TCP transport for reliable delivery.
|
||||
</p>
|
||||
|
||||
|
||||
<h2>NTCP Protocol Specification</h2>
|
||||
|
||||
<h3>Standard Message Format</h3>
|
||||
<p>
|
||||
After establishment,
|
||||
the NTCP transport sends individual I2NP messages, with a simple checksum.
|
||||
The unencrypted message is encoded as follows:
|
||||
<pre>
|
||||
* +-------+-------+--//--+---//----+-------+-------+-------+-------+
|
||||
* | sizeof(data) | data | padding | Adler checksum of sz+data+pad |
|
||||
* +-------+-------+--//--+---//----+-------+-------+-------+-------+
|
||||
</pre>
|
||||
The data is then AES/256/CBC encrypted. The session key for the encryption
|
||||
is negotiated during establishment (using Diffie-Hellman 2048 bit).
|
||||
The establishment between two routers is implemented in the EstablishState class
|
||||
and detailed below.
|
||||
The IV for AES/256/CBC encryption is the last 16 bytes of the previous encrypted message.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
0-15 bytes of padding are required to bring the total message length
|
||||
(including the six size and checksum bytes) to a multiple of 16.
|
||||
The maximum message size is currently 16 KB.
|
||||
Therefore the maximum data size is currently 16 KB - 6, or 16378 bytes.
|
||||
The minimum data size is 1.
|
||||
</p>
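<p>
As a rough illustration of the framing rules above, the sketch below builds and
parses an unencrypted standard message. It is not the router's implementation:
the function names are made up for this example, big-endian byte order for the
size and checksum fields is an assumption, and the AES/CBC encryption step is omitted.
</p>

```python
import os
import struct
import zlib

MAX_DATA_SIZE = 16 * 1024 - 6  # 16378 bytes, per the limits stated above


def build_ntcp_frame(data: bytes) -> bytes:
    """Return the unencrypted frame: size + data + padding + Adler-32."""
    if not 1 <= len(data) <= MAX_DATA_SIZE:
        raise ValueError("data must be 1..16378 bytes")
    # 2 size bytes + data + 4 checksum bytes are padded to a multiple of 16.
    pad_len = -(2 + len(data) + 4) % 16
    body = struct.pack(">H", len(data)) + data + os.urandom(pad_len)
    # Assumption: the checksum is stored big-endian; it covers size+data+padding.
    return body + struct.pack(">I", zlib.adler32(body))


def parse_ntcp_frame(frame: bytes) -> bytes:
    """Check the length and checksum, then return the data portion."""
    if len(frame) < 16 or len(frame) % 16:
        raise ValueError("frame length must be a non-zero multiple of 16")
    body = frame[:-4]
    (checksum,) = struct.unpack(">I", frame[-4:])
    if zlib.adler32(body) != checksum:
        raise ValueError("Adler-32 mismatch")
    (size,) = struct.unpack(">H", body[:2])
    if size > len(body) - 2:
        raise ValueError("size field exceeds frame")
    return body[2:2 + size]
```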
|
||||
|
||||
<h3>Time Sync Message Format</h3>
|
||||
<p>
|
||||
One special case is a metadata message where the sizeof(data) is 0. In
|
||||
that case, the unencrypted message is encoded as:
|
||||
<pre>
|
||||
* +-------+-------+-------+-------+-------+-------+-------+-------+
|
||||
* | 0 | timestamp in seconds | uninterpreted
|
||||
* +-------+-------+-------+-------+-------+-------+-------+-------+
|
||||
* uninterpreted | Adler checksum of bytes 0-11 |
|
||||
* +-------+-------+-------+-------+-------+-------+-------+-------+
|
||||
</pre>
|
||||
Total length: 16 bytes. The time sync message is sent at approximately 15 minute intervals.
|
||||
The message is encrypted just as standard messages are.
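<p>
A minimal sketch of the 16-byte time sync layout follows. The helper names are
hypothetical, big-endian field encoding is an assumption, and filling the
uninterpreted bytes with random data is a choice made for this example only.
Encryption is again left out.
</p>

```python
import os
import struct
import zlib


def build_time_sync(now_seconds: int) -> bytes:
    """Return the 16-byte unencrypted time sync message."""
    # Bytes 0-1: zero size; bytes 2-5: timestamp; bytes 6-11: uninterpreted.
    head = struct.pack(">HI", 0, now_seconds) + os.urandom(6)
    # Bytes 12-15: Adler-32 of bytes 0-11 (big-endian is an assumption).
    return head + struct.pack(">I", zlib.adler32(head))


def parse_time_sync(msg: bytes) -> int:
    """Validate a time sync message and return its timestamp in seconds."""
    if len(msg) != 16 or msg[:2] != b"\x00\x00":
        raise ValueError("not a time sync message")
    if zlib.adler32(msg[:12]) != struct.unpack(">I", msg[12:])[0]:
        raise ValueError("Adler-32 mismatch")
    return struct.unpack(">I", msg[2:6])[0]
```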
|
||||
|
||||
|
||||
<h3>Checksums</h3>
|
||||
The standard and time sync messages use the Adler-32 checksum
|
||||
as defined in the <a href="http://tools.ietf.org/html/rfc1950">ZLIB Specification</a>.
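<p>
For reference, Adler-32 itself is short enough to write out. The version below
follows the RFC 1950 definition and is shown only to make the checksum concrete;
in practice a library routine such as Python's <code>zlib.adler32</code> computes the same value.
</p>

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16, as used by RFC 1950


def adler32(data: bytes) -> int:
    """Compute the Adler-32 checksum of data as an unsigned 32-bit integer."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER   # running sum of bytes, starting at 1
        b = (b + a) % MOD_ADLER      # running sum of the a values
    return (b << 16) | a
```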
|
||||
|
||||
|
||||
<h3>Establishment Sequence</h3>
|
||||
In the establish state, there is a 4-phase message sequence to exchange DH keys and signatures.
|
||||
In the first two messages there is a 2048-bit Diffie Hellman exchange.
|
||||
Then, DSA signatures of the critical data are exchanged to confirm the connection.
|
||||
<pre>
|
||||
* Alice contacts Bob
|
||||
* =========================================================
|
||||
* X+(H(X) xor Bob.identHash)----------------------------->
|
||||
* <----------------------------------------Y+E(H(X+Y)+tsB+padding, sk, Y[239:255])
|
||||
* E(sz+Alice.identity+tsA+padding+S(X+Y+Bob.identHash+tsA+tsB), sk, hX_xor_Bob.identHash[16:31])--->
|
||||
* <----------------------E(S(X+Y+Alice.identHash+tsA+tsB)+padding, sk, prev)
|
||||
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
Legend:
|
||||
X, Y: 256 byte DH public keys
|
||||
H(): 32 byte SHA256 Hash
|
||||
E(data, session key, IV): AES256 Encrypt
|
||||
S(): 40 byte DSA Signature
|
||||
tsA, tsB: timestamps (4 bytes, seconds since epoch)
|
||||
sk: 32 byte Session key
|
||||
sz: 2 byte size of Alice identity to follow
|
||||
</pre>
|
||||
|
||||
<h4 id="DH">DH Key Exchange</h4>
|
||||
<p>
|
||||
The initial 2048-bit DH key exchange
|
||||
uses the same shared prime (p) and generator (g) as that used for I2P's
|
||||
<a href="how_cryptography.html#elgamal">ElGamal encryption</a>.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The DH key exchange consists of a number of steps, displayed below.
|
||||
The mapping between these steps and the messages sent between I2P routers
|
||||
is marked in bold.
|
||||
<ol>
|
||||
<li>Alice generates a secret 226-bit integer x.
|
||||
She then calculates X = g^x mod p.
|
||||
</li>
|
||||
<li>Alice sends X to Bob <b>(Message 1)</b>.</li>
|
||||
<li>Bob generates a secret 226-bit integer y.
|
||||
He then calculates Y = g^y mod p.</li>
|
||||
<li>Bob sends Y to Alice <b>(Message 2)</b>.</li>
|
||||
<li>Alice can now compute sessionKey = Y^x mod p.</li>
|
||||
<li>Bob can now compute sessionKey = X^y mod p.</li>
|
||||
<li>Both Alice and Bob now have a shared key sessionKey = g^(x*y) mod p.</li>
|
||||
</ol>
|
||||
The sessionKey is then used to exchange identities in <b>Message 3</b> and <b>Message 4</b>.
|
||||
</p>
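<p>
The arithmetic of the steps above can be sketched in a few lines. Note the
assumptions: <code>TOY_P</code> and <code>TOY_G</code> are small stand-in
parameters chosen so the example runs quickly, not the 2048-bit prime and
generator that I2P uses, and the function names are illustrative.
</p>

```python
import secrets

# Toy stand-ins so the example runs quickly; NOT the real I2P parameters.
TOY_P = 2**521 - 1  # a Mersenne prime, far smaller than the 2048-bit prime
TOY_G = 3


def dh_keypair(p: int, g: int, bits: int = 226):
    """Steps 1 and 3: pick a secret exponent and compute g^secret mod p."""
    secret = 1 + secrets.randbelow(2**bits - 1)  # 1 .. 2^bits - 1
    return secret, pow(g, secret, p)


def dh_shared(their_public: int, my_secret: int, p: int) -> int:
    """Steps 5 and 6: compute the shared value from the peer's public key."""
    return pow(their_public, my_secret, p)


x, X = dh_keypair(TOY_P, TOY_G)   # Alice; X goes into Message 1
y, Y = dh_keypair(TOY_P, TOY_G)   # Bob; Y goes into Message 2
alice_key = dh_shared(Y, x, TOY_P)
bob_key = dh_shared(X, y, TOY_P)
```

<p>
Both sides arrive at the same value because Y^x and X^y are both g^(x*y) mod p.
</p>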
|
||||
|
||||
<h4>Message 1 (Session Request)</h4>
|
||||
This is the DH request.
|
||||
Alice already has Bob's
|
||||
<a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>,
|
||||
IP address, and port, as contained in his
|
||||
<a href="common_structures_spec.html#struct_RouterInfo">Router Info</a>,
|
||||
which was published to the
|
||||
<a href="{{ site_url('docs/how/networkdatabase') }}">network database</a>.
|
||||
Alice sends Bob:
|
||||
<pre>
|
||||
* X+(H(X) xor Bob.identHash)----------------------------->
|
||||
|
||||
Size: 288 bytes
|
||||
</pre>
|
||||
Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| X, as calculated from DH |
|
||||
+ +
|
||||
| |
|
||||
~ . . . ~
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| HXxorHI |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
X: 256 byte X from Diffie Hellman
|
||||
|
||||
HXxorHI: SHA256 Hash(X) xored with SHA256 Hash(Bob's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>)
|
||||
(32 bytes)
|
||||
|
||||
</pre>
|
||||
|
||||
<p><b>Notes:</b>
|
||||
<ul><li>
|
||||
Bob verifies HXxorHI using his own router hash. If it does not verify,
|
||||
Alice has contacted the wrong router, and Bob drops the connection.
|
||||
</li></ul>
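<p>
The construction of Message 1 and Bob's check can be sketched as follows.
The function names are hypothetical, and X is treated as an opaque 256-byte
string; how the DH public key is serialized to bytes is outside this example.
</p>

```python
import hashlib
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def build_session_request(x_pub: bytes, bob_ident_hash: bytes) -> bytes:
    """Alice: return X followed by H(X) xor Bob.identHash (288 bytes)."""
    if len(x_pub) != 256 or len(bob_ident_hash) != 32:
        raise ValueError("X must be 256 bytes and the hash 32 bytes")
    hx = hashlib.sha256(x_pub).digest()
    return x_pub + xor_bytes(hx, bob_ident_hash)


def bob_accepts(message: bytes, own_ident_hash: bytes) -> bool:
    """Bob: recompute HXxorHI with his own router hash and compare."""
    if len(message) != 288:
        return False
    x_pub, hx_xor_hi = message[:256], message[256:]
    expected = xor_bytes(hashlib.sha256(x_pub).digest(), own_ident_hash)
    return expected == hx_xor_hi
```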
|
||||
|
||||
|
||||
<h4>Message 2 (Session Created)</h4>
|
||||
This is the DH reply. Bob sends Alice:
|
||||
<pre>
|
||||
* <----------------------------------------Y+E(H(X+Y)+tsB+padding, sk, Y[239:255])
|
||||
|
||||
Size: 304 bytes
|
||||
</pre>
|
||||
Unencrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Y as calculated from DH |
|
||||
+ +
|
||||
| |
|
||||
~ . . . ~
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| HXY |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| tsB | padding |
|
||||
+----+----+----+----+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
Y: 256 byte Y from Diffie Hellman
|
||||
|
||||
HXY: SHA256 Hash(X concatenated with Y)
|
||||
(32 bytes)
|
||||
|
||||
tsB: 4 byte timestamp (seconds since the epoch)
|
||||
|
||||
padding: 12 bytes random data
|
||||
|
||||
</pre>
|
||||
|
||||
|
||||
Encrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Y as calculated from DH |
|
||||
+ +
|
||||
| |
|
||||
~ . . . ~
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| encrypted data |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
Y: 256 byte Y from Diffie Hellman
|
||||
|
||||
encrypted data: 48 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
|
||||
the last 16 bytes of Y as the IV
|
||||
|
||||
</pre>
|
||||
|
||||
|
||||
<p><b>Notes:</b>
|
||||
<ul><li>
|
||||
Alice may drop the connection if the clock skew with Bob is too high as calculated using tsB.
|
||||
</li></ul>
|
||||
</p>
|
||||
|
||||
|
||||
<h4>Message 3 (Session Confirm A)</h4>
|
||||
This contains Alice's router identity, and a DSA signature of the critical data. Alice sends Bob:
|
||||
<pre>
|
||||
* E(sz+Alice.identity+tsA+padding+S(X+Y+Bob.identHash+tsA+tsB), sk, hX_xor_Bob.identHash[16:31])--->
|
||||
|
||||
Size: 448 bytes (typ. for 387 byte identity)
|
||||
</pre>
|
||||
Unencrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| sz | Alice's Router Identity |
|
||||
+----+----+ +
|
||||
| |
|
||||
~ . . . ~
|
||||
| |
|
||||
+ +----+----+----+
|
||||
| | tsA
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| padding |
|
||||
+----+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| signature |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
sz: 2 byte size of Alice's router identity to follow (should always be 387)
|
||||
|
||||
ident: Alice's 387 byte <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
|
||||
|
||||
tsA: 4 byte timestamp (seconds since the epoch)
|
||||
|
||||
padding: 15 bytes random data
|
||||
|
||||
signature: the 40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the following concatenated data:
|
||||
X, Y, Bob's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>, tsA, tsB.
|
||||
Alice signs it with the <a href="common_structures_spec.html#type_SigningPrivateKey">private signing key</a> associated with the <a href="common_structures_spec.html#type_SigningPublicKey">public signing key</a> in her <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
|
||||
|
||||
</pre>
|
||||
|
||||
Encrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| encrypted data |
|
||||
~ . . . ~
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
encrypted data: 448 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
|
||||
the last 16 bytes of HXxorHI (i.e., the last 16 bytes of message #1) as the IV
|
||||
|
||||
</pre>
|
||||
|
||||
|
||||
<p><b>Notes:</b>
|
||||
<ul><li>
|
||||
Bob verifies the signature, and on failure, drops the connection.
|
||||
</li><li>
|
||||
Bob may drop the connection if the clock skew with Alice is too high as calculated using tsA.
|
||||
</li></ul>
|
||||
</p>
|
||||
|
||||
|
||||
|
||||
<h4>Message 4 (Session Confirm B)</h4>
|
||||
This is a DSA signature of the critical data. Bob sends Alice:
|
||||
<pre>
|
||||
* <----------------------E(S(X+Y+Alice.identHash+tsA+tsB)+padding, sk, prev)
|
||||
|
||||
Size: 48 bytes
|
||||
</pre>
|
||||
Unencrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| signature |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| padding |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
|
||||
signature: the 40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the following concatenated data:
|
||||
X, Y, Alice's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>, tsA, tsB.
|
||||
Bob signs it with the <a href="common_structures_spec.html#type_SigningPrivateKey">private signing key</a> associated with the <a href="common_structures_spec.html#type_SigningPublicKey">public signing key</a> in his <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
|
||||
|
||||
padding: 8 bytes random data
|
||||
|
||||
</pre>
|
||||
|
||||
|
||||
Encrypted Contents:
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+ +
|
||||
| encrypted data |
|
||||
~ . . . ~
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
encrypted data: 48 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
|
||||
the last 16 bytes of the encrypted contents of message #2 as the IV
|
||||
|
||||
</pre>
|
||||
|
||||
<p><b>Notes:</b>
|
||||
<ul><li>
|
||||
Alice verifies the signature, and on failure, drops the connection.
|
||||
</li></ul>
|
||||
</p>
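<p>
The padding lengths quoted for Messages 2 through 4 follow from rounding each
encrypted section up to the 16-byte AES block size. A short check of that
arithmetic, using only the field sizes given above (the helper names are illustrative):
</p>

```python
def pad_to_block(length: int, block: int = 16) -> int:
    """Bytes of padding needed to reach the next multiple of block."""
    return -length % block


def padded_size(fields: int) -> int:
    """Total size of a section after padding to the AES block size."""
    return fields + pad_to_block(fields)


MSG2_FIELDS = 32 + 4            # H(X+Y) + tsB
MSG3_FIELDS = 2 + 387 + 4 + 40  # sz + identity + tsA + signature
MSG4_FIELDS = 40                # signature only

msg2_total = 256 + padded_size(MSG2_FIELDS)  # Y is sent unencrypted
msg3_total = padded_size(MSG3_FIELDS)
msg4_total = padded_size(MSG4_FIELDS)
```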
|
||||
|
||||
|
||||
|
||||
|
||||
<h4>After Establishment</h4>
|
||||
<p>
|
||||
The connection is established, and standard or time sync messages may be exchanged.
|
||||
All subsequent messages are AES encrypted using the negotiated DH session key.
|
||||
Alice will use the last 16 bytes of the encrypted contents of message #3 as the next IV.
|
||||
Bob will use the last 16 bytes of the encrypted contents of message #4 as the next IV.
|
||||
</p>
|
||||
|
||||
|
||||
|
||||
<h3>Check Connection Message</h3>
|
||||
Alternately, when Bob receives a connection, it could be a
|
||||
check connection (perhaps prompted by Bob asking for someone
|
||||
to verify his listener).
|
||||
Check Connection is not currently used.
|
||||
However, for the record, check connections are formatted as follows.
|
||||
A check info connection will receive 256 bytes containing:
|
||||
<ul>
|
||||
<li> 32 bytes of uninterpreted, ignored data
|
||||
<li> 1 byte size
|
||||
<li> that many bytes making up the local router's IP address (as reached by the remote side)
|
||||
<li> 2 byte port number that the local router was reached on
|
||||
<li> 4 byte i2p network time as known by the remote side (seconds since the epoch)
|
||||
<li> uninterpreted padding data, up to byte 223
|
||||
<li> xor of the local router's identity hash and the SHA256 of bytes 32 through bytes 223
|
||||
</ul>
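<p>
Since Check Connection is unused, the following is purely a reading aid for the
layout listed above. It is a sketch under assumptions: big-endian port and time
fields, random bytes for the ignored and padding regions, and hypothetical function names.
</p>

```python
import hashlib
import os
import struct


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def build_check_info(ip: bytes, port: int, now: int, local_hash: bytes) -> bytes:
    """Remote side: build the 256-byte check info block."""
    head = os.urandom(32)                         # bytes 0-31: ignored
    body = bytes([len(ip)]) + ip + struct.pack(">HI", port, now)
    body += os.urandom(192 - len(body))           # padding through byte 223
    tail = xor_bytes(local_hash, hashlib.sha256(body).digest())
    return head + body + tail                     # bytes 224-255: xor value


def parse_check_info(msg: bytes, own_hash: bytes):
    """Local side: verify the trailing xor, then return (ip, port, time)."""
    if len(msg) != 256:
        raise ValueError("check info must be 256 bytes")
    body, tail = msg[32:224], msg[224:]
    if xor_bytes(own_hash, hashlib.sha256(body).digest()) != tail:
        raise ValueError("identity hash check failed")
    size = body[0]
    ip = body[1:1 + size]
    port, now = struct.unpack(">HI", body[1 + size:7 + size])
    return ip, port, now
```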
|
||||
|
||||
|
||||
<h2>Discussion</h2>
|
||||
Now on the <a href="{{ site_url('docs/transport/ntcp/discussion') }}">NTCP Discussion Page</a>.
|
||||
|
||||
<h2><a name="future">Future Work</a></h2>
|
||||
<ul><li>
|
||||
The maximum message size should be increased to approximately 32 KB.
|
||||
</li><li>
|
||||
A set of fixed packet sizes may be appropriate to further hide the data
|
||||
fragmentation to external adversaries, but the tunnel, garlic, and end to
|
||||
end padding should be sufficient for most needs until then.
|
||||
However, there is currently no provision for padding beyond the next 16-byte boundary,
|
||||
to create a limited number of message sizes.
|
||||
</li><li>
|
||||
Memory utilization (including that of the kernel) for NTCP should be compared to that for SSU.
|
||||
</li><li>
|
||||
Can the establishment messages be randomly padded somehow, to frustrate
|
||||
identification of I2P traffic based on initial packet sizes?
|
||||
</li><li>
|
||||
Review and possibly disable 'check connection'
|
||||
</li></ul>
|
||||
</p>
|
||||
|
||||
{% endblock %}
|
450
i2p2www/pages/site/docs/transport/ssu/index.html
Normal file
@@ -0,0 +1,450 @@
|
||||
{% extends "global/layout.html" %}
|
||||
{% block title %}SSU Transport{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Updated October 2012 for release 0.9.2
|
||||
|
||||
<h1>Secure Semireliable UDP (SSU)</h1>
|
||||
<p>
|
||||
SSU (also called "UDP" in much of the I2P documentation and user interfaces)
|
||||
is one of two <a href="{{ site_url('docs/transport') }}">transports</a> currently implemented in I2P.
|
||||
The other is <a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a>.
|
||||
</p><p>
|
||||
SSU is the newer of the two transports,
|
||||
introduced in I2P release 0.6.
|
||||
In a standard I2P installation, the router uses both NTCP and SSU for outbound connections.
|
||||
|
||||
<h2>SSU Services</h2>
|
||||
|
||||
Like the NTCP transport, SSU provides reliable, encrypted, connection-oriented, point-to-point data transport.
|
||||
Unique to SSU, it also provides IP detection and NAT traversal services, including:
|
||||
|
||||
<ul>
|
||||
<li>Cooperative NAT/Firewall traversal using <a href="#introduction">introducers</a>
|
||||
<li>Local IP detection by inspection of incoming packets and <a href="#peerTesting">peer testing</a>
|
||||
<li>Communication of firewall status and local IP, and changes to either, to NTCP
|
||||
<li>Communication of firewall status and local IP, and changes to either, to the router and the user interface
|
||||
</ul>
|
||||
|
||||
|
||||
<h1>Protocol Details</h1>
|
||||
|
||||
<h2><a name="congestioncontrol">Congestion control</a></h2>
|
||||
|
||||
<p>SSU's need for only semireliable delivery, TCP-friendly operation,
|
||||
and the capacity for high throughput allows a great deal of latitude in
|
||||
congestion control. The congestion control algorithm outlined below is
|
||||
meant to be both efficient in bandwidth and simple to implement.</p>
|
||||
|
||||
<p>Packets are scheduled according to the router's policy, taking care
|
||||
not to exceed the router's outbound capacity or to exceed the measured
|
||||
capacity of the remote peer. The measured capacity operates along the
|
||||
lines of TCP's slow start and congestion avoidance, with additive increases
|
||||
to the sending capacity and multiplicative decreases in face of congestion.
|
||||
Unlike for TCP, routers may give up on some messages after
|
||||
a given period or number of retransmissions while continuing to transmit
|
||||
other messages.</p>
|
||||
|
||||
<p>The congestion detection techniques vary from TCP as well, since each
|
||||
message has its own unique and nonsequential identifier, and each message
|
||||
has a limited size - at most, 32KB. To efficiently transmit this feedback
|
||||
to the sender, the receiver periodically includes a list of fully ACKed
|
||||
message identifiers and may also include bitfields for partially received
|
||||
messages, where each bit represents the reception of a fragment. If
|
||||
duplicate fragments arrive, the message should be ACKed again, or if the
|
||||
message has still not been fully received, the bitfield should be
|
||||
retransmitted with any new updates.</p>
|
||||
|
||||
<p>The current implementation does not pad the packets to
|
||||
any particular size, but instead just places a single message fragment into
|
||||
a packet and sends it off (careful not to exceed the MTU).
|
||||
</p>
|
||||
|
||||
<h3><a name="mtu">MTU</a></h3>
|
||||
<p>
|
||||
As of router version 0.8.12,
|
||||
two MTU values are used: 620 and 1484.
|
||||
The MTU value is adjusted based on the percentage of packets that are retransmitted.
|
||||
</p><p>
|
||||
For both MTU values, it is desirable that (MTU % 16) == 12, so that
|
||||
the payload portion after the 28-byte IP/UDP header is a multiple of
|
||||
16 bytes, for encryption purposes.
|
||||
This calculation is for IPv4 only. While the protocol as specified supports IPv6
|
||||
addresses, IPv6 is not yet implemented.
|
||||
</p><p>
|
||||
For the small MTU value, it is desirable to pack a 2646-byte
|
||||
Variable Tunnel Build Message efficiently into multiple packets;
|
||||
with a 620-byte MTU, it fits nicely into 5 packets.
|
||||
</p><p>
|
||||
Based on measurements, 1492 fits nearly all reasonably small I2NP messages
|
||||
(larger I2NP messages may be up to 1900 to 4500 bytes, which isn't going to fit
|
||||
into a live network MTU anyway).
|
||||
</p><p>
|
||||
The MTU values were 608 and 1492 for releases 0.8.9 - 0.8.11.
|
||||
The large MTU was 1350 prior to release 0.8.9.
|
||||
</p><p>
|
||||
The maximum receive packet size
|
||||
is 1571 bytes as of release 0.8.12.
|
||||
For releases 0.8.9 - 0.8.11 it was 1535 bytes.
|
||||
Prior to release 0.8.9 it was 2048 bytes.
|
||||
</p><p>
|
||||
As of release 0.9.2, if a router's network interface MTU is less than 1484,
|
||||
it will publish that in the network database, and other routers should
|
||||
honor that when a connection is established.
|
||||
</p>
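<p>
The alignment rule (MTU % 16) == 12 is equivalent to saying that the payload
left after the 28-byte IPv4/UDP header is a whole number of 16-byte blocks.
A quick check of the MTU values mentioned in this section (the function name is illustrative):
</p>

```python
IP_UDP_HEADER = 28  # 20-byte IPv4 header + 8-byte UDP header


def payload_is_block_aligned(mtu: int, block: int = 16) -> bool:
    """True if the bytes after the IPv4/UDP header fill whole cipher blocks."""
    return (mtu - IP_UDP_HEADER) % block == 0


# Current values (620, 1484) followed by the earlier ones (608, 1492, 1350).
alignment = {mtu: payload_is_block_aligned(mtu)
             for mtu in (620, 1484, 608, 1492, 1350)}
```

<p>
Only 620 and 1484 satisfy the rule; the earlier values 608, 1492, and 1350 do not.
</p>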
|
||||
|
||||
<h3><a name="max">Message Size Limits</a></h3>
|
||||
<p>
|
||||
While the maximum message size is nominally 32KB, the practical
|
||||
limit differs. The protocol limits the number of fragments to 7 bits, or 128.
|
||||
The current implementation, however, limits each message to a maximum of 64 fragments,
|
||||
which is sufficient for 64 * 534 = 33.3 KB when using the 608 MTU.
|
||||
Due to overhead for bundled LeaseSets and session keys, the practical limit
|
||||
at the application level is about 6KB lower, or about 26KB.
|
||||
Further work is necessary to raise the UDP transport limit above 32KB.
|
||||
For connections using the larger MTU, larger messages are possible.
|
||||
</p>
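<p>
The size limit above is simple multiplication, sketched here. The 534-byte
fragment payload is the figure quoted in the text for the 608 MTU; the payload
size for other MTU values is not stated here, so it is left as a parameter.
</p>

```python
MAX_FRAGMENTS = 64  # implementation limit; the protocol field allows 128


def max_message_size(fragment_payload: int) -> int:
    """Largest message that fits in MAX_FRAGMENTS fragments."""
    return MAX_FRAGMENTS * fragment_payload


def fragments_needed(message_size: int, fragment_payload: int) -> int:
    """Number of fragments for a message (ceiling division)."""
    return -(-message_size // fragment_payload)


limit_608 = max_message_size(534)              # bytes, using the 608 MTU figure
nominal_32k = fragments_needed(32 * 1024, 534) # fragments for a 32 KB message
```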
|
||||
|
||||
<h2><a name="keys">Keys</a></h2>
|
||||
|
||||
<p>All encryption used is AES256/CBC with 32 byte keys and 16 byte IVs.
|
||||
The MAC and session keys are negotiated as part of the DH exchange, used
|
||||
for the HMAC and encryption, respectively. Prior to the DH exchange,
|
||||
the publicly knowable introKey is used for the MAC and encryption.</p>
|
||||
|
||||
<p>When using the introKey, both the initial message and any subsequent
|
||||
reply use the introKey of the responder (Bob) - the responder does
|
||||
not need to know the introKey of the requester (Alice). The DSA
|
||||
signing key used by Bob should already be known to Alice when she
|
||||
contacts him, though Alice's DSA key may not already be known by
|
||||
Bob.</p>
|
||||
|
||||
<p>Upon receiving a message, the receiver checks the "from" IP address and port
|
||||
with all established sessions - if there are matches,
|
||||
that session's MAC keys are tested in the HMAC. If none
|
||||
of those verify or if there are no matching IP addresses, the
|
||||
receiver tries their introKey in the MAC. If that does not verify,
|
||||
the packet is dropped. If it does verify, it is interpreted
|
||||
according to the message type, though if the receiver is overloaded,
|
||||
it may be dropped anyway.</p>
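<p>
The key selection order described above can be sketched as below.
This is an illustration only: <code>mac_ok</code> uses HMAC-SHA256 as a
stand-in, since the actual SSU MAC construction is defined in the SSU
specification and not reproduced here, and all names are hypothetical.
</p>

```python
import hashlib
import hmac


def mac_ok(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Stand-in MAC check (HMAC-SHA256); not the real SSU MAC."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def classify_packet(src, payload, tag, sessions, intro_key):
    """Decide which key, if any, authenticates an incoming packet.

    src      -- (ip, port) the packet came from
    sessions -- dict mapping (ip, port) to a list of session MAC keys
    Returns "session", "intro", or "drop".
    """
    # First try the MAC keys of sessions matching the source address.
    for key in sessions.get(src, []):
        if mac_ok(key, payload, tag):
            return "session"
    # Otherwise fall back to our own introKey.
    if mac_ok(intro_key, payload, tag):
        return "intro"
    # Nothing verified: the packet is dropped.
    return "drop"
```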
|
||||
|
||||
<p>If Alice and Bob have an established session, but Alice loses the
|
||||
keys for some reason and she wants to contact Bob, she may at any
|
||||
time simply establish a new session through the SessionRequest and
|
||||
related messages. If Bob has lost the key but Alice does not know
|
||||
that, she will first attempt to prod him to reply, by sending a
|
||||
DataMessage with the wantReply flag set, and if Bob continually
|
||||
fails to reply, she will assume the key is lost and reestablish a
|
||||
new one.</p>
|
||||
|
||||
<p>For the DH key agreement,
|
||||
<a href="http://www.faqs.org/rfcs/rfc3526.html">RFC3526</a> 2048bit
|
||||
MODP group (#14) is used:</p>
|
||||
<pre>
|
||||
p = 2^2048 - 2^1984 - 1 + 2^64 * { [2^1918 pi] + 124476 }
|
||||
g = 2
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
These are the same p and g used for I2P's
|
||||
<a href="{{ site_url('docs/how/cryptography') }}#elgamal">ElGamal encryption</a>.
|
||||
</p>
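<p>
The formula for p can be evaluated directly with integer arithmetic.
The sketch below computes [2^1918 pi] with Machin's formula in fixed point;
the choice of Machin's formula and the 64 guard bits are decisions made for
this example, not something taken from I2P.
</p>

```python
def _arctan_inv(n: int, scale: int) -> int:
    """Fixed-point arctan(1/n) * scale via the alternating power series."""
    term = scale // n
    total = term
    n_squared = n * n
    divisor = 3
    sign = -1
    while term:
        term //= n_squared
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def modp_2048_prime() -> int:
    """Evaluate p = 2^2048 - 2^1984 - 1 + 2^64 * ([2^1918 pi] + 124476)."""
    guard = 64  # extra fractional bits to absorb truncation error
    scale = 1 << (1918 + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * _arctan_inv(5, scale) - 4 * _arctan_inv(239, scale)
    pi_floor = pi_scaled >> guard  # [2^1918 * pi]
    return 2**2048 - 2**1984 - 1 + 2**64 * (pi_floor + 124476)


p = modp_2048_prime()
g = 2
```

<p>
The result is a 2048-bit odd number whose top and bottom 64 bits are all ones,
with the binary expansion of pi filling the bits in between.
</p>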
|
||||
|
||||
<h2><a name="replay">Replay prevention</a></h2>
|
||||
|
||||
<p>Replay prevention at the SSU layer occurs by rejecting packets
|
||||
with exceedingly old timestamps or those which reuse an IV. To
|
||||
detect duplicate IVs, a sequence of Bloom filters is employed, which
|
||||
"decay" periodically so that only recently added IVs are detected.</p>
|
||||
|
||||
<p>The messageIds used in DataMessages are defined at layers above
|
||||
the SSU transport and are passed through transparently. These IDs
|
||||
are not in any particular order - in fact, they are likely to be
|
||||
entirely random. The SSU layer makes no attempt at messageId
|
||||
replay prevention - higher layers should take that into account.</p>
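<p>
One way to picture the "decaying" Bloom filters used for IV replay detection
is a pair of filters that are rotated periodically. The sketch below is an
assumption about the general technique, not a description of the router's
actual class; the filter size, hash count, and names are arbitrary.
</p>

```python
import hashlib


class BloomFilter:
    """A small fixed-size Bloom filter over byte strings."""

    def __init__(self, bits: int = 1 << 16, hashes: int = 4):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits // 8)

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        for i in range(self.hashes):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.array[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.array[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class DecayingBloomFilter:
    """Two filters; decay() forgets everything older than one period."""

    def __init__(self):
        self.current = BloomFilter()
        self.previous = BloomFilter()

    def seen_before(self, iv: bytes) -> bool:
        """Record iv and report whether it was (probably) seen recently."""
        duplicate = iv in self.current or iv in self.previous
        self.current.add(iv)
        return duplicate

    def decay(self) -> None:
        """Call periodically: the oldest filter is discarded."""
        self.previous, self.current = self.current, BloomFilter()
```

<p>
An IV is remembered for between one and two decay periods, so memory use stays
bounded while recent duplicates are still caught.
</p>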
|
||||
|
||||
<h2 id="addressing">Addressing</h2>
|
||||
|
||||
<p>To contact an SSU peer, one of two sets of information is necessary:
|
||||
a direct address, for when the peer is publicly reachable, or an
|
||||
indirect address, for using a third party to introduce the peer.
|
||||
There is no restriction on the number of addresses a peer may have.</p>
|
||||
|
||||
<pre>
|
||||
Direct: host, port, introKey, options
|
||||
Indirect: tag, relayhost, port, relayIntroKey, targetIntroKey, options
|
||||
</pre>
|
||||
|
||||
<p>Each of the addresses may also expose a series of options - special
|
||||
capabilities of that particular peer. For a list of available
|
||||
capabilities, see <a href="#capabilities">below</a>.</p>
|
||||
|
||||
<p>
|
||||
The addresses, options, and capabilities are published in the <a href="{{ site_url('docs/how/networkdatabase') }}">network database</a>.
|
||||
</p>
|
||||
|
||||
|
||||
<h2><a name="direct">Direct Session Establishment</a></h2>
|
||||
<p>
|
||||
Direct session establishment is used when no third party is required for NAT traversal.
|
||||
The message sequence is as follows:
|
||||
</p>
|
||||
|
||||
<h3><a name="establishDirect">Connection establishment (direct)</a></h3>
|
||||
<p>
|
||||
Alice connects directly to Bob.
|
||||
</p>
|
||||
<pre>
|
||||
Alice Bob
|
||||
SessionRequest --------------------->
|
||||
<--------------------- SessionCreated
|
||||
SessionConfirmed ------------------->
|
||||
<--------------------- DeliveryStatusMessage
|
||||
<--------------------- DatabaseStoreMessage
|
||||
DatabaseStoreMessage --------------->
|
||||
Data <---------------------------> Data
|
||||
</pre>
|
||||
<p>
|
||||
After the SessionConfirmed message is received, Bob sends a small
|
||||
<a href="i2np_spec.html#msg_DeliveryStatus">DeliveryStatus message</a>
|
||||
as a confirmation.
|
||||
In this message, the 4-byte message ID is set to a random number, and the
|
||||
8-byte "arrival time" is set to the current network-wide ID, which is 2
|
||||
(i.e. 0x0000000000000002).
|
||||
</p><p>
|
||||
After the status message is sent, the peers exchange
|
||||
<a href="i2np_spec.html#msg_DatabaseStore">DatabaseStore messages</a>
|
||||
containing their
|
||||
<a href="common_structures_spec.html#struct_RouterInfo">RouterInfos</a>.
|
||||
</p><p>
|
||||
It does not appear that the type of the status message or its contents matters.
|
||||
It was originally added because the DatabaseStore message was delayed
|
||||
several seconds; since the store is now sent immediately, perhaps
|
||||
the status message can be eliminated.
|
||||
</p>
|
||||
|
||||
<h2><a name="introduction">Introduction</a></h2>
|
||||
|
||||
<p>Introduction keys are delivered through an external channel
|
||||
(the network database, where they are identical to the router Hash for now)
|
||||
and must be used when establishing a session key. For the indirect
|
||||
address, the peer must first contact the relayhost and ask them for
|
||||
an introduction to the peer known at that relayhost under the given
|
||||
tag. If possible, the relayhost sends a message to the addressed
|
||||
peer telling them to contact the requesting peer, and also gives
|
||||
the requesting peer the IP and port on which the addressed peer is
|
||||
located. In addition, the peer establishing the connection must
|
||||
already know the public keys of the peer they are connecting to (but
|
||||
not necessarily those of any intermediary relay peer).</p>
|
||||
|
||||
<p>Indirect session establishment by means of a third party introduction
|
||||
is necessary for efficient NAT traversal. Charlie, a router behind a
|
||||
NAT or firewall which does not allow unsolicited inbound UDP packets,
|
||||
first contacts a few peers, choosing some to serve as introducers. Each
|
||||
of these peers (Bob, Bill, Betty, etc.) provides Charlie with an introduction
|
||||
tag - a 4 byte random number - which he then makes available to the public
|
||||
as methods of contacting him. Alice, a router who has Charlie's published
|
||||
contact methods, first sends a RelayRequest packet to one or more of the
|
||||
introducers, asking each to introduce her to Charlie (offering the
|
||||
introduction tag to identify Charlie). Bob then forwards a RelayIntro
|
||||
packet to Charlie including Alice's public IP and port number, then sends
|
||||
Alice back a RelayResponse packet containing Charlie's public IP and port
|
||||
number. When Charlie receives the RelayIntro packet, he sends off a small
|
||||
random packet to Alice's IP and port (poking a hole in his NAT/firewall),
|
||||
and when Alice receives Bob's RelayResponse packet, she begins a new
|
||||
full direct session establishment with the specified IP and port.</p>
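<p>
The introducer's bookkeeping in the exchange above amounts to a table from
introduction tags to peer addresses. The sketch below models only that logic:
the class, the dictionary-based "packets", and the example addresses are all
hypothetical, and no real packet formats or encryption are shown.
</p>

```python
import secrets


class Introducer:
    """Bob's side: maps introduction tags to the peers that requested them."""

    def __init__(self):
        self._tags = {}

    def offer_tag(self, peer_addr):
        """Give a firewalled peer (Charlie) a 4-byte random introduction tag."""
        tag = secrets.randbits(32)
        self._tags[tag] = peer_addr
        return tag

    def handle_relay_request(self, tag, alice_addr):
        """Return (RelayIntro for Charlie, RelayResponse for Alice), or None."""
        charlie_addr = self._tags.get(tag)
        if charlie_addr is None:
            return None  # unknown tag: no introduction is possible
        relay_intro = {"to": charlie_addr, "alice": alice_addr}
        relay_response = {"to": alice_addr, "charlie": charlie_addr}
        return relay_intro, relay_response


bob = Introducer()
charlie = ("198.51.100.7", 5000)   # example addresses only
alice = ("192.0.2.1", 4000)
tag = bob.offer_tag(charlie)       # Charlie publishes this tag
intro, response = bob.handle_relay_request(tag, alice)
```

<p>
After this, Charlie would send the hole-punch packet to the address in
<code>intro</code>, and Alice would start a direct session with the address in <code>response</code>.
</p>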
|
||||
|
||||
<!--
|
||||
should Bob wait for Charlie to ack the RelayIntro packet to avoid
|
||||
situations where that packet is lost yet Alice gets Charlie's IP with
|
||||
Charlie not yet punching a hole in his NAT for her to get through?
|
||||
Perhaps Alice should send to multiple Bobs at once, hoping that at
|
||||
least one of them gets through
|
||||
-->
|
||||
|
||||
<h3><a name="establishIndirect">Connection establishment (indirect using an introducer)</a></h3>
|
||||
|
||||
Alice first connects to introducer Bob, who relays the request to Charlie.
|
||||
|
||||
<pre>
|
||||
Alice Bob Charlie
|
||||
RelayRequest ---------------------->
|
||||
<-------------- RelayResponse RelayIntro ----------->
|
||||
<-------------------------------------------- HolePunch (data ignored)
|
||||
SessionRequest -------------------------------------------->
|
||||
<-------------------------------------------- SessionCreated
|
||||
SessionConfirmed ------------------------------------------>
|
||||
<-------------------------------------------- DeliveryStatusMessage
|
||||
<-------------------------------------------- DatabaseStoreMessage
|
||||
DatabaseStoreMessage -------------------------------------->
|
||||
Data <--------------------------------------------------> Data
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
After the hole punch, the session is established between Alice and Charlie as in a direct establishment.
|
||||
</p>
|
||||
|
||||
|
||||
<h2><a name="peerTesting">Peer testing</a></h2>
|
||||
|
||||
<p>Collaborative reachability testing for peers is automated by
a sequence of PeerTest messages. When the sequence runs correctly,
a peer is able to determine its own reachability
and may update its behavior accordingly. The testing process is
quite simple:</p>
|
||||
|
||||
<pre>
|
||||
Alice Bob Charlie
|
||||
PeerTest ------------------->
|
||||
PeerTest-------------------->
|
||||
<-------------------PeerTest
|
||||
<-------------------PeerTest
|
||||
<------------------------------------------PeerTest
|
||||
PeerTest------------------------------------------>
|
||||
<------------------------------------------PeerTest
|
||||
</pre>
|
||||
|
||||
<p>Each of the PeerTest messages carries a nonce identifying the
test series itself, as initialized by Alice. If Alice doesn't
get a particular message that she expects, she will retransmit
accordingly, and based upon the data received or the messages
missing, she will know her reachability. The various end states
that may be reached are as follows:</p>
|
||||
|
||||
<ul>
|
||||
<li>If she doesn't receive a response from Bob, she will retransmit
|
||||
up to a certain number of times, but if no response ever arrives,
|
||||
she will know that her firewall or NAT is somehow misconfigured,
|
||||
rejecting all inbound UDP packets even in direct response to an
|
||||
outbound packet. Alternatively, Bob may be down or unable to get
|
||||
Charlie to reply.</li>
|
||||
|
||||
<li>If Alice doesn't receive a PeerTest message with the
|
||||
expected nonce from a third party (Charlie), she will retransmit
|
||||
her initial request to Bob up to a certain number of times, even
|
||||
if she has received Bob's reply already. If Charlie's first message
|
||||
still doesn't get through but Bob's does, she knows that she is
|
||||
behind a NAT or firewall that is rejecting unsolicited connection
|
||||
attempts and that port forwarding is not operating properly (the
|
||||
IP and port that Bob offered up should be forwarded).</li>
|
||||
|
||||
<li>If Alice receives Bob's PeerTest message and both of Charlie's
|
||||
PeerTest messages but the enclosed IP and port numbers in Bob's
|
||||
and Charlie's second messages don't match, she knows that she is
|
||||
behind a symmetric NAT, rewriting all of her outbound packets with
|
||||
different 'from' ports for each peer contacted. She will need to
|
||||
explicitly forward a port and always have that port exposed for
|
||||
remote connectivity, ignoring further port discovery.</li>
|
||||
|
||||
<li>If Alice receives Charlie's first message but not his second,
|
||||
she will retransmit her PeerTest message to Charlie up to a
|
||||
certain number of times, but if no response is received she knows
|
||||
that Charlie is either confused or no longer online.</li>
|
||||
</ul>
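<p>The end states listed above can be read as a small decision procedure. The following is a minimal sketch only, not the router's logic: the function name, the result labels, and the final "reachable" outcome (which the list implies but does not state) are assumptions made for illustration.</p>

```python
from typing import Optional, Tuple

# (IP, port) as reported inside a PeerTest message; hypothetical type alias.
Endpoint = Tuple[str, int]


def classify_peer_test(
    bob_reply: Optional[Endpoint],
    charlie_first: bool,
    charlie_second: Optional[Endpoint],
) -> str:
    """Map the PeerTest messages Alice received to one of the end states.

    bob_reply      -- address Bob reported for Alice, or None if Bob never replied
    charlie_first  -- True if Charlie's unsolicited first PeerTest arrived
    charlie_second -- address Charlie reported in his second PeerTest, or None
    """
    if bob_reply is None:
        # Inbound UDP is blocked entirely, or Bob is down / could not reach Charlie.
        return "no-reply-from-bob"
    if not charlie_first:
        # Bob got through but an unsolicited packet from Charlie did not.
        return "firewalled"
    if charlie_second is None:
        # Charlie started the exchange but never answered Alice directly.
        return "charlie-unresponsive"
    if bob_reply != charlie_second:
        # Different peers saw different source addresses/ports for Alice.
        return "symmetric-nat"
    # Assumption: every message arrived and the addresses agree.
    return "reachable"
```

<p>Retransmission is left out of the sketch; it only classifies the final set of messages after all retries have been exhausted.</p>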
|
||||
|
||||
<p>Alice should choose Bob arbitrarily from known peers who seem
|
||||
to be capable of participating in peer tests. Bob in turn should
|
||||
choose Charlie arbitrarily from peers that he knows who seem to be
|
||||
capable of participating in peer tests and who are on a different
|
||||
IP from both Bob and Alice. If the first error condition occurs
|
||||
(Alice doesn't get PeerTest messages from Bob), Alice may decide
|
||||
to designate a new peer as Bob and try again with a different nonce.</p>
|
||||
|
||||
<p>Alice's introduction key is included in all of the PeerTest
|
||||
messages so that she doesn't need to already have an established
|
||||
session with Bob and so that Charlie can contact her without knowing
|
||||
any additional information. Alice may go on to establish a session
|
||||
with either Bob or Charlie, but it is not required.</p>
|
||||
|
||||
|
||||
<h2><a name="acks">Transmission window, ACKs and Retransmissions</a></h2>
|
||||
<p>
|
||||
The DATA message may contain ACKs of full messages and
|
||||
partial ACKs of individual fragments of a message. See
|
||||
the data message section of
|
||||
<a href="{{ site_url('docs/transport/ssu/spec') }}">the protocol specification page</a>
|
||||
for details.
|
||||
</p><p>
|
||||
The details of windowing, ACK, and retransmission strategies are not specified
|
||||
here. See the Java code for the current implementation.
|
||||
During the establishment phase, and for peer testing, routers
|
||||
should implement exponential backoff for retransmission.
|
||||
For an established connection, routers should implement
|
||||
an adjustable transmission window, RTT estimate and timeout, similar to TCP
|
||||
or <a href="streaming.html">streaming</a>.
|
||||
See the code for initial, min and max parameters.
|
||||
</p>
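<p>As an illustration of the exponential backoff recommended for the establishment phase and peer testing, the sketch below computes a retransmission schedule. The function name and all parameter values are hypothetical; the actual initial, minimum and maximum values are defined in the Java code and are not reproduced here.</p>

```python
from typing import List


def backoff_delays(initial: float, max_delay: float, attempts: int) -> List[float]:
    """Return the delay before each retransmission, doubling every time.

    The delay is capped at max_delay so a long outage does not push
    retries arbitrarily far apart.
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(min(delay, max_delay))
        delay *= 2
    return delays
```

<p>For example, <code>backoff_delays(1.0, 10.0, 5)</code> yields delays of 1, 2, 4, 8 and 10 seconds.</p>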
|
||||
|
||||
|
||||
<h2><a name="security">Security</a></h2>
|
||||
<p>
|
||||
UDP source addresses may, of course, be spoofed.
|
||||
Additionally, the IPs and ports contained inside specific
|
||||
SSU messages (RelayRequest, RelayResponse, RelayIntro, PeerTest)
|
||||
may not be legitimate.
|
||||
Also, certain actions and responses may need to be rate-limited.
|
||||
</p><p>
|
||||
The details of validation are not specified
|
||||
here. Implementers should add defenses where appropriate.
|
||||
</p>
|
||||
|
||||
|
||||
<h2><a name="capabilities">Peer capabilities</a></h2>
|
||||
|
||||
<dl>
|
||||
<dt>B</dt>
|
||||
<dd>If the peer address contains the 'B' capability, that means
|
||||
they are willing and able to participate in peer tests as
|
||||
a 'Bob' or 'Charlie'.</dd>
|
||||
<dt>C</dt>
|
||||
<dd>If the peer address contains the 'C' capability, that means
|
||||
they are willing and able to serve as an introducer - serving
|
||||
as a Bob for an otherwise unreachable Alice.</dd>
|
||||
</dl>
|
||||
|
||||
<h1><a name="future">Future Work</a></h1>
|
||||
<ul><li>
|
||||
Analysis of current SSU performance, including assessment of window size adjustment
|
||||
and other parameters, and adjustment of the protocol implementation to improve
|
||||
performance, is a topic for future work.
|
||||
</li><li>
|
||||
The current implementation repeatedly sends acknowledgments for the same packets,
|
||||
which unnecessarily increases overhead.
|
||||
</li><li>
|
||||
The default small MTU value of 620 should be analyzed and possibly increased.
|
||||
The current MTU adjustment strategy should be evaluated.
|
||||
Does a streaming lib 1730-byte packet fit in 3 small SSU packets? Probably not.
|
||||
</li><li>
|
||||
The protocol should be extended to exchange MTUs during the setup.
|
||||
</li><li>
|
||||
Rekeying is currently unimplemented and may never be.
|
||||
</li><li>
|
||||
The potential use of the 'challenge' fields in RelayIntro and RelayResponse,
|
||||
and use of the padding field in SessionRequest and SessionCreated, is undocumented.
|
||||
</li><li>
|
||||
Instead of a single fragment per packet, a more efficient
|
||||
strategy may be to bundle multiple message fragments into the same packet,
|
||||
so long as it doesn't exceed the MTU.
|
||||
</li><li>
|
||||
A set of fixed packet sizes may be appropriate to further hide the data
fragmentation from external adversaries, but the tunnel, garlic, and end to
end padding should be sufficient for most needs until then.
|
||||
</li><li>
|
||||
Why are introduction keys the same as the router hash? Should this be changed, and would there be any benefit?
|
||||
</li><li>
|
||||
Capacities appear to be unused.
|
||||
</li><li>
|
||||
Signed-on times in SessionCreated and SessionConfirmed appear to be unused or unverified.
|
||||
</li></ul>
|
||||
|
||||
<h1>Implementation Diagram</h1>
|
||||
This diagram should accurately reflect the current implementation; however, there may be small differences.
|
||||
<p>
|
||||
<img src="{{ url_for('static', filename='images/udp.png') }}">
|
||||
|
||||
<h1><a name="spec">Specification</a></h1>
|
||||
<a href="{{ site_url('docs/transport/ssu/spec') }}">Now on the SSU specification page</a>.
|
||||
|
||||
|
||||
{% endblock %}
|
843
i2p2www/pages/site/docs/transport/ssu/spec.html
Normal file
@@ -0,0 +1,843 @@
|
||||
{% extends "global/layout.html" %}
|
||||
{% block title %}SSU Protocol Specification{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Updated October 2012 for release 0.9.2
|
||||
|
||||
<p>
|
||||
<a href="{{ site_url('docs/transport/ssu') }}">See the SSU page for an overview of the SSU transport</a>.
|
||||
|
||||
<h1>Specification</h1>
|
||||
|
||||
|
||||
<h2 id="DH">DH Key Exchange</h2>
|
||||
<p>
|
||||
The initial 2048-bit DH key exchange is described on the
|
||||
<a href="{{ site_url('docs/transport/ssu') }}#keys">SSU page</a>.
|
||||
This exchange uses the same shared prime as that used for I2P's
|
||||
<a href="{{ site_url('docs/how/cryptography') }}#elgamal">ElGamal encryption</a>.
|
||||
</p>
|
||||
|
||||
|
||||
<h2 id="header">Message Header</h2>
|
||||
|
||||
<p>
|
||||
All UDP datagrams begin with a 16 byte MAC (Message Authentication Code)
|
||||
and a 16 byte IV (Initialization Vector)
|
||||
followed by a variable-size
|
||||
payload encrypted with the appropriate key. The MAC used is
|
||||
HMAC-MD5, truncated to 16 bytes, while the key is a full 32 byte AES256
|
||||
key. The specific construct of the MAC is the first 16 bytes from:</p>
|
||||
<pre>
|
||||
HMAC-MD5(payload || IV || (payloadLength ^ protocolVersion), macKey)
|
||||
</pre>
|
||||
where '||' means append.
|
||||
The payload is the message starting with the flag byte.
|
||||
The macKey is either the introduction key or the
|
||||
session key, as specified for each message below.
|
||||
<b>WARNING</b> - the HMAC-MD5-128 used here is non-standard,
|
||||
see <a href="{{ site_url('docs/how/cryptography') }}#udp">the cryptography page</a> for details.
|
||||
|
||||
|
||||
<p>The payload itself (that is, the message starting with the flag byte)
|
||||
is AES256/CBC encrypted with the IV and the
|
||||
sessionKey, with replay prevention addressed within its body,
|
||||
explained below. The payloadLength in the MAC is a 2 byte unsigned
|
||||
integer.</p>
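<p>The sketch below shows how the MAC input described above is assembled. It is illustrative only: it uses the standard RFC 2104 HMAC from Python's library, whereas the warning above states that I2P's HMAC-MD5-128 is non-standard, so the output will not match a real router. The function name and the big-endian byte order of the 2 byte length field are assumptions; the protocol version default of 0 is the value this specification gives as current.</p>

```python
import hashlib
import hmac
import struct


def ssu_mac_sketch(payload: bytes, iv: bytes, mac_key: bytes,
                   protocol_version: int = 0) -> bytes:
    """First 16 bytes of HMAC-MD5(payload || IV || (payloadLength ^ protocolVersion))."""
    if len(iv) != 16:
        raise ValueError("IV must be 16 bytes")
    # 2 byte unsigned value: payload length XOR protocol version.
    # Big-endian byte order is an assumption of this sketch.
    length_field = struct.pack(">H", (len(payload) ^ protocol_version) & 0xFFFF)
    # Standard HMAC, NOT the non-standard I2P variant.
    digest = hmac.new(mac_key, payload + iv + length_field, hashlib.md5).digest()
    return digest[:16]
```

<p>A receiver would recompute the value with the same key and compare it to the first 16 bytes of the datagram, ideally with a constant-time comparison such as <code>hmac.compare_digest</code>.</p>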
|
||||
|
||||
<p>The protocolVersion is a 2 byte unsigned integer
and is currently set to 0. Peers using a different protocol version will
not be able to communicate with this peer, though peers running earlier
versions that do not use this field can.</p>
|
||||
|
||||
<p>Within the AES encrypted payload, there is a minimal common structure
|
||||
to the various messages - a one byte flag and a four byte sending
|
||||
timestamp (seconds since the unix epoch). The flag byte contains
|
||||
the following bitfields:</p>
|
||||
<pre>
|
||||
Bit order: 76543210 (bit 7 is MSB)
|
||||
|
||||
bits 7-4: payload type
|
||||
bit 3: rekey?
|
||||
bit 2: extended options included
|
||||
bits 1-0: reserved
|
||||
</pre>
|
||||
<pre>
|
||||
Header: 37+ bytes
|
||||
Encryption starts with the flag byte.
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| MAC |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| IV |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|flag| time | (optionally |
|
||||
+----+----+----+----+----+ |
|
||||
| this may have 64 byte keying material |
|
||||
| and/or a one+N byte extended options) |
|
||||
+---------------------------------------|
|
||||
</pre>
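<p>The common structure at the start of the decrypted payload (flag byte plus timestamp) could be parsed as in the following sketch. The class and function names are hypothetical, the input is assumed to be already decrypted, and big-endian byte order for the timestamp is an assumption.</p>

```python
import struct
from typing import NamedTuple


class SsuHeader(NamedTuple):
    payload_type: int       # bits 7-4 of the flag byte
    rekey: bool             # bit 3
    extended_options: bool  # bit 2
    timestamp: int          # seconds since the Unix epoch


def parse_header(decrypted: bytes) -> SsuHeader:
    """Parse the one byte flag and four byte timestamp from a decrypted payload."""
    if len(decrypted) < 5:
        raise ValueError("need at least flag (1) + time (4) bytes")
    flag, timestamp = struct.unpack(">BI", decrypted[:5])
    return SsuHeader(
        payload_type=flag >> 4,
        rekey=bool(flag & 0x08),
        extended_options=bool(flag & 0x04),
        timestamp=timestamp,
    )
```

<p>Together with the 16 byte MAC and 16 byte IV that precede it, this accounts for the 37 byte minimum header size.</p>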
|
||||
|
||||
|
||||
<h3 id="rekey">Rekeying</h3>
|
||||
<p>If the rekey flag is set, 64 bytes of keying material follow the
|
||||
timestamp.
|
||||
|
||||
<p>When rekeying, the first 32 bytes of the keying material are fed
into SHA256 to produce the new MAC key, and the next 32 bytes are
fed into SHA256 to produce the new session key, though the keys are
not immediately used. The other side should also reply with the
rekey flag set and that same keying material. Once both sides have
sent and received those values, the new keys should be used and the
previous keys discarded. It may be useful to keep the old keys
around briefly, to address packet loss and reordering.</p>
|
||||
|
||||
<p>NOTE: Rekeying is currently unimplemented.</p>
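<p>Since rekeying is unimplemented, the following is only a sketch of the key derivation as described above, not code taken from any router. The function name is hypothetical.</p>

```python
import hashlib
from typing import Tuple


def derive_rekeyed_keys(keying_material: bytes) -> Tuple[bytes, bytes]:
    """Derive (new MAC key, new session key) from 64 bytes of keying material."""
    if len(keying_material) != 64:
        raise ValueError("keying material must be exactly 64 bytes")
    # First 32 bytes -> SHA256 -> new MAC key
    new_mac_key = hashlib.sha256(keying_material[:32]).digest()
    # Next 32 bytes -> SHA256 -> new session key
    new_session_key = hashlib.sha256(keying_material[32:]).digest()
    return new_mac_key, new_session_key
```

<p>Both results are 32 bytes, matching the key size used elsewhere in this specification.</p>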
|
||||
|
||||
<h3 id="extend">Extended Options</h3>
|
||||
<p>
|
||||
If the extended options flag is set, a one byte option
|
||||
size value is appended, followed by that many extended option
|
||||
bytes.</p>
|
||||
|
||||
<p>NOTE: Extended options are currently unimplemented.</p>
|
||||
|
||||
<h2 id="padding">Padding</h2>
|
||||
<p>
|
||||
All messages contain 0 or more bytes of padding.
|
||||
Each message must be padded to a 16 byte boundary, as required by the <a href="{{ site_url('docs/how/cryptography') }}#AES">AES256 encryption layer</a>.
|
||||
Currently, messages are not padded beyond the next 16 byte boundary.
|
||||
The fixed-size tunnel messages of 1024 bytes (at a higher layer)
|
||||
provide a significant amount of protection.
|
||||
In the future, additional padding in the transport layer up to
a set of fixed packet sizes may be appropriate to further hide the data
fragmentation from external adversaries.
|
||||
</p>
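<p>As a worked example of padding to the next 16 byte boundary, the sketch below rounds a message size up and applies it to a few of the messages in this specification, assuming 4 byte IP addresses and no uninterpreted trailing data. The function and constant names are hypothetical.</p>

```python
HEADER_SIZE = 37  # 16 byte MAC + 16 byte IV + 1 byte flag + 4 byte time


def padded_size(unpadded: int, block: int = 16) -> int:
    """Round a size up to the next multiple of the AES block size."""
    return -(-unpadded // block) * block


# SessionRequest: 256 byte X + 1 byte IP size + 4 byte IP
session_request = padded_size(HEADER_SIZE + 256 + 1 + 4)
# SessionCreated: Y + IP size + IP + port + relay tag + time + signature + padding
session_created = padded_size(HEADER_SIZE + 256 + 1 + 4 + 2 + 4 + 4 + 40 + 8)
# SessionDestroyed: header only
session_destroyed = padded_size(HEADER_SIZE)
```

<p>These evaluate to 304, 368 and 48 bytes, which agree with the typical sizes quoted for those messages in this specification.</p>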
|
||||
|
||||
|
||||
<h2 id="keys">Keys</h2>
|
||||
<p>
|
||||
DSA signatures in the SessionCreated and SessionConfirmed messages are generated using the
<a href="common_structures_spec.html#type_SigningPrivateKey">signing private key</a>
and verified using the corresponding
<a href="common_structures_spec.html#type_SigningPublicKey">signing public key</a>
from the
<a href="common_structures_spec.html#struct_RouterIdentity">router identity</a>,
which is distributed out-of-band by publishing in the network database.
|
||||
</p><p>
|
||||
Both introduction keys and session keys are 32 bytes,
|
||||
and are defined by the
|
||||
<a href="common_structures_spec.html#type_SessionKey">Common structures specification</a>.
|
||||
The key used for the MAC and encryption is specified for each message below.
|
||||
</p>
|
||||
<p>Introduction keys are delivered through an external channel
|
||||
(the network database, where they are identical to the router Hash for now).
|
||||
</p>
|
||||
|
||||
|
||||
<h2 id="notes">Notes</h2>
|
||||
|
||||
<h3 id="ipv6">IPv6 Notes</h3>
|
||||
While the protocol specification supports 16-byte IPv6 addresses,
|
||||
IPv6 addressing is not currently supported within I2P.
|
||||
All IP addresses are currently 4 bytes.
|
||||
|
||||
<h3 id="time">Timestamps</h3>
|
||||
While most of I2P uses 8-byte <a href="common_structures_spec.html#type_Date">Date</a> timestamps with
|
||||
millisecond resolution, SSU uses a 4-byte timestamp with one-second resolution.
|
||||
|
||||
|
||||
|
||||
|
||||
<h2 id="messages">Messages</h2>
|
||||
<p>
|
||||
There are 10 messages (payload types) defined:
|
||||
</p><p>
|
||||
<table border="1">
|
||||
<tr><th>Type<th>Message<th>Notes
|
||||
<tr><td align="center">0<td>SessionRequest<td>
|
||||
<tr><td align="center">1<td>SessionCreated<td>
|
||||
<tr><td align="center">2<td>SessionConfirmed<td>
|
||||
<tr><td align="center">3<td>RelayRequest<td>
|
||||
<tr><td align="center">4<td>RelayResponse<td>
|
||||
<tr><td align="center">5<td>RelayIntro<td>
|
||||
<tr><td align="center">6<td>Data<td>
|
||||
<tr><td align="center">7<td>PeerTest<td>
|
||||
<tr><td align="center">8<td>SessionDestroyed<td>Implemented as of 0.8.9
|
||||
<tr><td align="center">n/a<td>HolePunch<td>
|
||||
</table>
|
||||
</p>
|
||||
|
||||
<h3 id="sessionRequest">SessionRequest (type 0)</h3>
|
||||
<p>
|
||||
This is the first message sent to establish a session.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Alice to Bob</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>256 byte X, to begin the DH agreement</li>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Bob's IP address</li>
|
||||
<li>N bytes, currently uninterpreted</li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>introKey</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| X, as calculated from DH |
|
||||
| |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|size| that many byte IP address (4-16) |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| arbitrary amount |
|
||||
| of uninterpreted data |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 304 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
<ul><li>
|
||||
IP address is always 4 bytes in the current implementation.
|
||||
</li><li>
|
||||
The uninterpreted data could possibly be used in the future for challenges.
|
||||
</li></ul>
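<p>A minimal sketch of assembling the SessionRequest data listed above (before encryption and without the common header) follows. The function name is hypothetical, and the DH value X is taken as an opaque 256 byte input rather than computed.</p>

```python
import ipaddress


def build_session_request_body(x: bytes, bob_ip: str) -> bytes:
    """Concatenate X, the IP address size, and Bob's IP address."""
    if len(x) != 256:
        raise ValueError("X must be 256 bytes")
    ip = ipaddress.ip_address(bob_ip).packed  # 4 bytes for IPv4
    return x + bytes([len(ip)]) + ip
```

<p>With a 4 byte address the body is 261 bytes; adding the 37 byte header and padding to a 16 byte boundary gives the 304 byte typical size stated above.</p>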
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<h3 id="sessionCreated">SessionCreated (type 1)</h3>
|
||||
<p>
|
||||
This is the response to a Session Request.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Bob to Alice</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>256 byte Y, to complete the DH agreement</li>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Alice's IP address</li>
|
||||
<li>2 byte Alice's port number</li>
|
||||
<li>4 byte relay (introduction) tag which Alice can publish (else 0x00000000)</li>
|
||||
<li>4 byte timestamp (seconds from the epoch) for use in the DSA
|
||||
signature</li>
|
||||
<li>40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the critical exchanged data
|
||||
(X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port + Alice's
|
||||
new relay tag + Bob's signed on time), encrypted with another
|
||||
layer of encryption using the negotiated sessionKey. The IV
|
||||
is reused here.</li>
|
||||
<li>8 bytes padding, encrypted with an additional layer of encryption
|
||||
using the negotiated session key as part of the DSA block</li>
|
||||
<li>N bytes, currently uninterpreted</li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>introKey, with an additional layer of encryption over the 40 byte
|
||||
signature and the following 8 bytes padding.</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Y, as calculated from DH |
|
||||
| |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|size| that many byte IP address (4-16) |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Port (A)| public relay tag | signed
|
||||
+----+----+----+----+----+----+----+----+
|
||||
on time | |
|
||||
+----+----+ |
|
||||
| DSA signature |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +----+----+----+----+----+----+
|
||||
| | (8 bytes of padding)
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| |
|
||||
+----+----+ |
|
||||
| arbitrary amount |
|
||||
| of uninterpreted data |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 368 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
<ul><li>
|
||||
IP address is always 4 bytes in the current implementation.
|
||||
</li><li>
|
||||
If the relay tag is nonzero, Bob is offering to act as an introducer for Alice.
|
||||
Alice may subsequently publish Bob's address and the relay tag in the network database.
|
||||
</li><li>
|
||||
For the signature, Bob must use his external port, as that is what Alice will use to verify.
|
||||
If Bob's NAT/firewall has mapped his internal port to a different external port,
|
||||
and Bob is unaware of it, the verification by Alice will fail.
|
||||
</li><li>
|
||||
See <a href="#keys">the Keys section above</a> for details on DSA signatures.
|
||||
Alice already has Bob's public signing key, from the network database.
|
||||
</li><li>
|
||||
Signed-on time appears to be unused or unverified in the current implementation.
|
||||
</li><li>
|
||||
The uninterpreted data could possibly be used in the future for challenges.
|
||||
</li></ul>
|
||||
|
||||
|
||||
|
||||
<h3 id="sessionConfirmed">SessionConfirmed (type 2)</h3>
|
||||
<p>
|
||||
This is the response to a Session Created message and the last step in establishing a session.
|
||||
There may be multiple Session Confirmed messages required if the Router Identity must be fragmented.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Alice to Bob</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>1 byte identity fragment info:<pre>
|
||||
Bit order: 76543210 (bit 7 is MSB)
|
||||
bits 7-4: current identity fragment # 0-14
|
||||
bits 3-0: total identity fragments (F) 1-15</pre></li>
|
||||
<li>2 byte size of the current identity fragment</li>
|
||||
<li>that many byte fragment of Alice's
|
||||
<a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
|
||||
</li>
|
||||
<li>After the last identity fragment only:
<ul><li>4 byte signed-on time
</li></ul></li>
<li>N bytes padding, currently uninterpreted</li>
<li>After the last identity fragment only:
<ul><li>The last 40
bytes contain the <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the critical exchanged
data (X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port
+ Alice's new relay tag + Alice's signed on time)</li>
</ul></li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>sessionKey</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
<b>Fragment 0 through F-2</b>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|info| cursize | |
|
||||
+----+----+----+ |
|
||||
| fragment of Alice's full |
|
||||
| Router Identity |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| arbitrary amount of uninterpreted |
|
||||
| data |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|
||||
<b>Fragment F-1:</b>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|info| cursize | |
|
||||
+----+----+----+ |
|
||||
| last fragment of Alice's full |
|
||||
| Router Identity |
|
||||
. . .
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| signed on time | |
|
||||
+----+----+----+----+ |
|
||||
| arbitrary amount of uninterpreted |
|
||||
| data, to 40 bytes prior to |
|
||||
| end of the current packet |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| DSA signature |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 480 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
<ul><li>
|
||||
In the current implementation, the maximum fragment size is 512 bytes.
|
||||
</li><li>
|
||||
The typical <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
|
||||
is 387 bytes, so no fragmentation is usually necessary.
|
||||
</li><li>
|
||||
There is no mechanism for requesting or redelivering missing fragments.
|
||||
</li><li>
|
||||
The total fragments field F must be set identically in all fragments.
|
||||
</li><li>
|
||||
See <a href="#keys">the Keys section above</a> for details on DSA signatures.
|
||||
</li><li>
|
||||
Signed-on time appears to be unused or unverified in the current implementation.
|
||||
</li></ul>
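<p>The one byte identity fragment info field described above packs two 4 bit values. The sketch below shows one way to encode and decode it; the function names are hypothetical.</p>

```python
from typing import Tuple


def pack_identity_fragment_info(current: int, total: int) -> int:
    """Pack the current fragment number (0-14) and total fragments F (1-15)."""
    if not 1 <= total <= 15:
        raise ValueError("total fragments must be 1-15")
    if not 0 <= current < total:
        raise ValueError("current fragment must be in 0..total-1")
    return (current << 4) | total  # bits 7-4: current, bits 3-0: total


def unpack_identity_fragment_info(info: int) -> Tuple[int, int]:
    """Return (current fragment number, total fragments)."""
    return info >> 4, info & 0x0F
```

<p>In the usual unfragmented case (fragment 0 of 1) the info byte is 0x01.</p>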
|
||||
|
||||
|
||||
|
||||
|
||||
<h3 id="sessionDestroyed">SessionDestroyed (type 8)</h3>
|
||||
<p>
|
||||
Reception of the Session Destroyed message was implemented in release 0.8.1;
transmission was implemented in release 0.8.9. Earlier releases never send it.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Alice to Bob or Bob to Alice</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td>none
|
||||
</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>sessionKey or introKey</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| no data |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 48 bytes
|
||||
</p>
|
||||
|
||||
<h3 id="relayRequest">RelayRequest (type 3)</h3>
|
||||
<p>
|
||||
This is the first message sent from Alice to Bob to request an introduction to Charlie.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Alice to Bob</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>4 byte relay (introduction) tag, nonzero</li>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Alice's IP address</li>
|
||||
<li>2 byte port number (of Alice)</li>
|
||||
<li>1 byte challenge size</li>
|
||||
<li>that many bytes to be relayed to Charlie in the intro</li>
|
||||
<li>Alice's 32-byte introduction key (so Bob can reply with Charlie's info)</li>
|
||||
<li>4 byte nonce of Alice's relay request</li>
|
||||
<li>N bytes, currently uninterpreted</li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>introKey (or sessionKey, if Alice/Bob is established)</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| relay tag |size| Alice IP addr
|
||||
+----+----+----+----+----+--- +----+----|
|
||||
| Port (A)|size| challenge bytes |
|
||||
+----+----+----+----+ +
|
||||
| to be delivered to Charlie |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Alice's intro key |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+ +
|
||||
| |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| nonce | |
|
||||
+----+----+----+----+ |
|
||||
| arbitrary amount of uninterpreted data|
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 96 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
<ul><li>
|
||||
The IP address is only included if it differs from the
packet's source address and port. In the current implementation, the
IP length is always 0 and the port is always 0, and the receiver should
use the packet's source address and port.
|
||||
</li><li>
|
||||
Challenge is unimplemented; the challenge size is always zero.
|
||||
</li></ul>
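<p>The following sketch assembles the RelayRequest data as the notes above describe the current implementation: IP size 0, port 0 and challenge size 0. The function name is hypothetical, and big-endian byte order for the multi-byte integers is an assumption.</p>

```python
import struct


def build_relay_request_body(relay_tag: int, intro_key: bytes, nonce: int) -> bytes:
    """Build the RelayRequest data with no explicit address and no challenge."""
    if relay_tag == 0:
        raise ValueError("relay tag must be nonzero")
    if len(intro_key) != 32:
        raise ValueError("introduction key must be 32 bytes")
    return b"".join([
        struct.pack(">I", relay_tag),  # 4 byte relay (introduction) tag
        b"\x00",                       # IP address size 0: use packet source address
        struct.pack(">H", 0),          # port 0: use packet source port
        b"\x00",                       # challenge size 0 (unimplemented)
        intro_key,                     # Alice's 32 byte introduction key
        struct.pack(">I", nonce),      # 4 byte nonce
    ])
```

<p>The result is 44 bytes; with the 37 byte header and padding to a 16 byte boundary this gives the 96 byte typical size stated above.</p>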
|
||||
|
||||
|
||||
<h3 id="relayResponse">RelayResponse (type 4)</h3>
|
||||
<p>
|
||||
This is the response to a Relay Request and is sent from Bob to Alice.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Bob to Alice</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Charlie's IP address</li>
|
||||
<li>2 byte Charlie's port number</li>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Alice's IP address</li>
|
||||
<li>2 byte Alice's port number</li>
|
||||
<li>4 byte nonce sent by Alice</li>
|
||||
<li>N bytes, currently uninterpreted</li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>introKey (or sessionKey, if Alice/Bob is established)</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|size| Charlie IP | Port (C)|size|
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| Alice IP | Port (A)| nonce
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| arbitrary amount of |
|
||||
+----+----+ |
|
||||
| uninterpreted data |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 64 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
IP address is always 4 bytes in the current implementation.
|
||||
|
||||
|
||||
|
||||
<h3 id="relayIntro">RelayIntro (type 5)</h3>
|
||||
<p>
|
||||
This is the introduction for Alice, which is sent from Bob to Charlie.
|
||||
</p>
|
||||
|
||||
<table border="1">
|
||||
<tr><td align="right" valign="top"><b>Peer:</b></td>
|
||||
<td>Bob to Charlie</td></tr>
|
||||
<tr><td align="right" valign="top"><b>Data:</b></td>
|
||||
<td><ul>
|
||||
<li>1 byte IP address size</li>
|
||||
<li>that many byte representation of Alice's IP address</li>
|
||||
<li>2 byte port number (of Alice)</li>
|
||||
<li>1 byte challenge size</li>
|
||||
<li>that many bytes relayed from Alice</li>
|
||||
<li>N bytes, currently uninterpreted</li>
|
||||
</ul></td></tr>
|
||||
<tr><td align="right" valign="top"><b>Key used:</b></td>
|
||||
<td>sessionKey</td></tr>
|
||||
</table>
|
||||
|
||||
<pre>
|
||||
+----+----+----+----+----+----+----+----+
|
||||
|size| Alice IP | Port (A)|size|
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| that many bytes of challenge |
|
||||
+ |
|
||||
| data relayed from Alice |
|
||||
+----+----+----+----+----+----+----+----+
|
||||
| arbitrary amount of uninterpreted data|
|
||||
+----+----+----+----+----+----+----+----+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
Typical size including header, in current implementation: 48 bytes
|
||||
</p>
|
||||
|
||||
<h4>Notes</h4>
|
||||
<ul><li>
|
||||
IP address is always 4 bytes in the current implementation.
|
||||
</li><li>
|
||||
Challenge is unimplemented; the challenge size is always zero.
|
||||
</li></ul>


<h3 id="data">Data (type 6)</h3>
<p>
This message is used for data transport and acknowledgment.
</p>

<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Any</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>1 byte flags:<pre>
Bit order: 76543210 (bit 7 is MSB)
bit 7: explicit ACKs included
bit 6: ACK bitfields included
bit 5: reserved
bit 4: explicit congestion notification (ECN)
bit 3: request previous ACKs
bit 2: want reply
bit 1: extended data included (unused, never set)
bit 0: reserved</pre></li>
<li>if explicit ACKs are included:<ul>
<li>a 1 byte number of ACKs</li>
<li>that many 4 byte MessageIds being fully ACKed</li>
</ul></li>
<li>if ACK bitfields are included:<ul>
<li>a 1 byte number of ACK bitfields</li>
<li>that many entries, each a 4 byte MessageId followed by an ACK bitfield of 1 or more bytes.
The bitfield uses the 7 low bits of each byte, with the high
bit specifying whether an additional bitfield byte follows it
(1 = true, 0 = the current bitfield byte is the last). This
sequence of 7-bit arrays represents whether each fragment has been
received: if a bit is 1, the fragment has been received. To
clarify, assuming fragments 0, 2, 5, and 9 have been received,
the bitfield bytes would be as follows:<pre>
byte 0
  Bit order: 76543210 (bit 7 is MSB)
  bit 7: 1 (further bitfield bytes follow)
  bit 6: 1 (fragment 0 received)
  bit 5: 0 (fragment 1 not received)
  bit 4: 1 (fragment 2 received)
  bit 3: 0 (fragment 3 not received)
  bit 2: 0 (fragment 4 not received)
  bit 1: 1 (fragment 5 received)
  bit 0: 0 (fragment 6 not received)
byte 1
  Bit order: 76543210 (bit 7 is MSB)
  bit 7: 0 (no further bitfield bytes)
  bit 6: 0 (fragment 7 not received)
  bit 5: 0 (fragment 8 not received)
  bit 4: 1 (fragment 9 received)
  bit 3: 0 (fragment 10 not received)
  bit 2: 0 (fragment 11 not received)
  bit 1: 0 (fragment 12 not received)
  bit 0: 0 (fragment 13 not received)</pre></li>
</ul></li>
<li>if extended data is included:<ul>
<li>1 byte data size</li>
<li>that many bytes of extended data (currently uninterpreted)</li></ul></li>
<li>1 byte number of fragments (can be zero)</li>
<li>If nonzero, that many message fragments. Each fragment contains:<ul>
<li>4 byte messageId</li>
<li>3 byte fragment info:<pre>
Bit order: 76543210 (bit 7 is MSB)
bits 23-17: fragment # 0 - 127
bit 16: isLast (1 = true)
bits 15-14: unused
bits 13-0: fragment size 0 - 16383</pre></li>
<li>that many bytes</li></ul></li>
<li>N bytes padding, uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>sessionKey</td></tr>
</table>
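
<p>
The ACK bitfield layout can be sketched in a few lines. This is a minimal illustration
that follows the bit assignment in the worked example in the table above (fragment 0 in bit 6 of byte 0);
the function names and the <code>num_fragments</code> parameter are hypothetical and not
taken from the router source.
</p>

```python
def encode_ack_bitfield(received, num_fragments):
    """Encode received fragment numbers as ACK bitfield bytes.

    Mirrors the worked example on this page: each byte carries 7 fragment
    bits, with bit 6 holding the lowest-numbered fragment of that byte.
    """
    num_bytes = max(1, (num_fragments + 6) // 7)
    out = bytearray(num_bytes)
    for frag in received:
        byte_index, slot = divmod(frag, 7)
        out[byte_index] |= 1 << (6 - slot)
    for i in range(num_bytes - 1):
        out[i] |= 0x80  # high bit: another bitfield byte follows
    return bytes(out)


def decode_ack_bitfield(data):
    """Return the fragment numbers marked as received."""
    received = []
    for byte_index, value in enumerate(data):
        for slot in range(7):
            if value & (1 << (6 - slot)):
                received.append(byte_index * 7 + slot)
        if not value & 0x80:  # high bit clear: this was the last byte
            break
    return received
```

<p>
For the example in the table (fragments 0, 2, 5, and 9 of a 14-fragment message),
<code>encode_ack_bitfield([0, 2, 5, 9], 14)</code> yields the two bytes 0xD2 0x10.
</p>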

<pre>
+----+----+----+----+----+----+----+----+
|flag| (additional headers, determined  |
+----+                                  |
| by the flags, such as ACKs or         |
| bitfields                             |
+----+----+----+----+----+----+----+----+
|#frg|     messageId     |   frag info  |
+----+----+----+----+----+----+----+----+
| that many bytes of fragment data      |
                 .  .  .
|                                       |
+----+----+----+----+----+----+----+----+
|     messageId     |   frag info  |    |
+----+----+----+----+----+----+----+    |
| that many bytes of fragment data      |
                 .  .  .
|                                       |
+----+----+----+----+----+----+----+----+
|     messageId     |   frag info  |    |
+----+----+----+----+----+----+----+    |
| that many bytes of fragment data      |
                 .  .  .
|                                       |
+----+----+----+----+----+----+----+----+
| arbitrary amount of uninterpreted data|
+----+----+----+----+----+----+----+----+
</pre>
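
<p>
A rough sketch of how a receiver might walk the Data message body described above.
It assumes multi-byte fields are big-endian and that bit 23 of the fragment info is the
most significant bit of its first byte; the names <code>unpack_fragment_info</code> and
<code>parse_data_body</code> are hypothetical. It is an illustration of the layout, not the router's code.
</p>

```python
def unpack_fragment_info(info: bytes):
    """Split the 3-byte fragment info into (fragment number, isLast, size)."""
    value = int.from_bytes(info, "big")  # assumed byte order
    return value >> 17, bool(value & 0x10000), value & 0x3FFF


def parse_data_body(body: bytes):
    """Walk a decrypted Data message body (after the common SSU header)."""
    flags, off = body[0], 1
    explicit_acks, bitfields, fragments = [], [], []

    if flags & 0x80:  # bit 7: explicit ACKs included
        count, off = body[off], off + 1
        for _ in range(count):
            explicit_acks.append(int.from_bytes(body[off:off + 4], "big"))
            off += 4

    if flags & 0x40:  # bit 6: ACK bitfields included
        count, off = body[off], off + 1
        for _ in range(count):
            message_id = int.from_bytes(body[off:off + 4], "big")
            off += 4
            start = off
            while body[off] & 0x80:  # high bit set: another byte follows
                off += 1
            off += 1  # consume the final bitfield byte
            bitfields.append((message_id, body[start:off]))

    if flags & 0x02:  # bit 1: extended data (never set in practice)
        off += 1 + body[off]

    count, off = body[off], off + 1  # number of fragments, may be zero
    for _ in range(count):
        message_id = int.from_bytes(body[off:off + 4], "big")
        frag_num, is_last, size = unpack_fragment_info(body[off + 4:off + 7])
        off += 7
        fragments.append((message_id, frag_num, is_last, body[off:off + size]))
        off += size

    # Anything after this point is uninterpreted padding
    return flags, explicit_acks, bitfields, fragments
```

<p>
The sketch does no bounds checking; a real receiver would need to reject bodies that
are truncated or whose counts run past the end of the packet.
</p>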

<h4>Notes</h4>
<ul><li>
The current implementation adds a limited number of duplicate ACKs for
messages previously ACKed, if space is available.
</li><li>
If the number of fragments is zero, this is an ACK-only or keepalive message.
</li><li>
The ECN feature is unimplemented, and the bit is never set.
</li><li>
The want reply bit is always set in the current implementation.
</li><li>
Extended data is unimplemented and never present.
</li><li>
The current implementation does not pack multiple fragments into a single packet;
the number of fragments is always 0 or 1.
</li><li>
As currently implemented, the maximum number of fragments is 64
(maximum fragment number = 63).
</li><li>
As currently implemented, the maximum fragment size is necessarily
less than the MTU.
</li><li>
Take care not to exceed the maximum MTU even if there is a large number of
ACKs to send.
</li><li>
The protocol allows zero-length fragments, but there is no reason to send them.
</li><li>
In SSU, the data uses a short 5-byte I2NP header followed by the payload
of the I2NP message instead of the standard 16-byte I2NP header.
The short I2NP header consists only of
the one-byte I2NP type and 4-byte expiration in seconds.
The I2NP message ID is used as the message ID for the fragment.
The I2NP size is assembled from the fragment sizes.
The I2NP checksum is not required as UDP message integrity is ensured by decryption.
</li><li>
Message IDs are not sequence numbers and are not consecutive.
SSU does not guarantee in-order delivery.
While we use the I2NP message ID as the SSU message ID, from the SSU
protocol view, they are random numbers.
In fact, since the router uses a single Bloom filter for all peers,
the message ID must be an actual random number.
</li></ul>
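
<p>
The short I2NP header mentioned in the notes above is small enough to sketch directly.
This assumes network (big-endian) byte order for the 4-byte expiration, which this page
does not state explicitly; the function names are hypothetical.
</p>

```python
import struct


def build_short_i2np_header(msg_type: int, expiration_seconds: int) -> bytes:
    """Build the 5-byte short header: 1-byte type + 4-byte expiration (seconds).

    Assumption: the expiration is an unsigned big-endian integer.
    """
    return struct.pack("!BI", msg_type, expiration_seconds)


def parse_short_i2np_header(data: bytes):
    """Return (type, expiration in seconds, remaining payload)."""
    msg_type, expiration_seconds = struct.unpack_from("!BI", data)
    return msg_type, expiration_seconds, data[5:]
```

<p>
The message ID and size are deliberately absent here: per the notes, the ID travels in
the fragment's messageId field and the size is recovered from the fragment sizes.
</p>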


<h3 id="peerTest">PeerTest (type 7)</h3>
<p>
See <a href="{{ site_url('docs/transport/ssu') }}#peerTesting">the UDP overview page</a> for details.
</p>

<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Any</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>4 byte nonce</li>
<li>1 byte IP address size</li>
<li>that many bytes representing Alice's IP address</li>
<li>Alice's 2 byte port number</li>
<li>Alice's 32-byte introduction key</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey (or sessionKey if the connection has already been established)</td></tr>
</table>

<pre>
+----+----+----+----+----+----+----+----+
|    test nonce     |size| Alice IP addr
+----+----+----+----+----+----+----+----+
| Port (A)|                             |
+----+----+----+                        +
|          Alice or Charlie's           |
+ introduction key (Alice's is sent to  +
| Bob and Charlie, while Charlie's is   |
+ sent to Alice)                        +
|                                       |
|              +----+----+----+----+----+
|              | arbitrary amount of    |
+----+----+----+                        |
| uninterpreted data                    |
+----+----+----+----+----+----+----+----+
</pre>

<p>
Typical size including header, in current implementation: 80 bytes
</p>

<h4>Notes</h4>
<ul><li>
When sent by Alice, IP address size is 0, IP address is not present, and port is 0,
as Bob and Charlie do not need this information.
</li><li>
When sent by Bob or Charlie, IP and port are present, and
IP address is always 4 bytes in the current implementation.
</li></ul>
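
<p>
A minimal sketch of building and reading the PeerTest fields listed above, assuming
big-endian integers. The function names are hypothetical and the sketch covers only the
message body, not the SSU header or encryption.
</p>

```python
import struct

INTRO_KEY_LEN = 32  # from the field list above


def build_peer_test(nonce: int, ip: bytes, port: int, intro_key: bytes) -> bytes:
    """Serialize nonce, IP size, IP, port, and introduction key."""
    if len(intro_key) != INTRO_KEY_LEN:
        raise ValueError("introduction key must be 32 bytes")
    return (struct.pack("!IB", nonce, len(ip)) + ip
            + struct.pack("!H", port) + intro_key)


def parse_peer_test(body: bytes):
    """Return (nonce, ip bytes, port, intro key); trailing bytes are ignored."""
    nonce, ip_size = struct.unpack_from("!IB", body)
    off = 5
    ip = body[off:off + ip_size]
    off += ip_size
    (port,) = struct.unpack_from("!H", body, off)
    off += 2
    intro_key = body[off:off + INTRO_KEY_LEN]
    return nonce, ip, port, intro_key
```

<p>
Per the notes, Alice would call this with an empty address and port 0, giving a 39-byte body.
With a 4-byte address the body is 43 bytes, which together with the 37 bytes of MAC, IV,
flag, and time shown in the sample datagrams matches the 80-byte typical size above.
</p>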

<h3 id="holePunch">HolePunch</h3>
<p>
A HolePunch is simply a UDP packet with no data.
It is unauthenticated and unencrypted.
It does not contain an SSU header, so it does not have a message type number.
It is sent from Charlie to Alice as a part of the Introduction sequence.
</p>


<h2><a name="sampleDatagrams">Sample datagrams</a></h2>

<b>Minimal data message (no fragments, no ACKs, no NACKs, etc)</b><br />
<i>(Size: 39 bytes)</i>

<pre>
+----+----+----+----+----+----+----+----+
|                  MAC                  |
+                                       +
|                                       |
+----+----+----+----+----+----+----+----+
|                   IV                  |
+                                       +
|                                       |
+----+----+----+----+----+----+----+----+
|flag|        time       |flag|#frg|    |
+----+----+----+----+----+----+----+    |
|  padding to fit a full AES256 block   |
+----+----+----+----+----+----+----+----+
</pre>

<b>Minimal data message with payload</b><br />
<i>(Size: 46+fragmentSize bytes)</i>

<pre>
+----+----+----+----+----+----+----+----+
|                  MAC                  |
+                                       +
|                                       |
+----+----+----+----+----+----+----+----+
|                   IV                  |
+                                       +
|                                       |
+----+----+----+----+----+----+----+----+
|flag|        time       |flag|#frg|
+----+----+----+----+----+----+----+----+
  messageId    |   frag info  |         |
+----+----+----+----+----+----+         |
| that many bytes of fragment data      |
                 .  .  .
|                                       |
+----+----+----+----+----+----+----+----+
</pre>
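
<p>
The two sizes quoted above can be checked with a little arithmetic. The sketch below
reads the field widths off the diagrams (16-byte MAC, 16-byte IV, 1-byte flag, 4-byte time)
and counts bytes before any padding; the function name is hypothetical.
</p>

```python
MAC_LEN = 16   # two 8-byte rows in the diagram
IV_LEN = 16    # two 8-byte rows in the diagram
HEADER_LEN = MAC_LEN + IV_LEN + 1 + 4  # MAC, IV, flag byte, 4-byte time


def data_message_size(fragment_sizes=()):
    """Unpadded size of a Data message with no ACKs or extended data."""
    size = HEADER_LEN + 1 + 1  # data flag byte + fragment count byte
    for frag_size in fragment_sizes:
        size += 4 + 3 + frag_size  # messageId + fragment info + fragment data
    return size
```

<p>
With no fragments this gives 39 bytes, and with one fragment it gives 46 plus the fragment size,
matching the figures quoted for the two sample datagrams.
</p>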


{% endblock %}