Moved transport-related pages into docs

str4d
2012-11-09 10:11:48 +00:00
parent 6c131a8117
commit a49baa5e32
6 changed files with 33 additions and 33 deletions


@@ -147,13 +147,13 @@ Selecting peers, requesting tunnels through those peers, and encrypting and rout
<h3>Transport Layer</h3>
The protocols for direct (point-to-point) router to router communication.
<ul><li>
<a href="transport.html">Transport layer overview</a>
<a href="{{ site_url('docs/transport') }}">Transport layer overview</a>
</li><li>
<a href="ntcp.html">NTCP</a> TCP-based transport overview and specification
<a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a> TCP-based transport overview and specification
</li><li>
<a href="udp.html">SSU</a> UDP-based transport overview
<a href="{{ site_url('docs/transport/ssu') }}">SSU</a> UDP-based transport overview
</li><li>
<a href="udp_spec.html">SSU specification</a>
<a href="{{ site_url('docs/transport/ssu/spec') }}">SSU specification</a>
</li><li>
<a href="{{ site_url('docs/how/cryptography') }}#tcp">NTCP transport encryption</a>
</li><li>


@@ -0,0 +1,127 @@
{% extends "global/layout.html" %}
{% block title %}Transport Overview{% endblock %}
{% block content %}
Updated July 2010, current as of router version 0.8
<h1>Transports in I2P</h1>
A "transport" in I2P is a method for direct, point-to-point communication
between two routers.
Transports must provide confidentiality and integrity
against external adversaries while authenticating that the router contacted
is the one who should receive a given message.
<p> I2P supports multiple transports simultaneously.
There are two transports currently implemented:
<ol>
<li> <a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a>, a Java New I/O (NIO) TCP transport
<li> <a href="{{ site_url('docs/transport/ssu') }}">SSU</a>, or Secure Semireliable UDP
</ol>
Each provides a "connection" paradigm, with authentication,
flow control, acknowledgments and retransmission.
<h2>Transport Services</h2>
The transport subsystem in I2P provides the following services:
<ul>
<li>Maintain a set of router addresses, one or more for each transport,
that the router publishes as its global contact information (the RouterInfo)
<li>Selection of the best transport for each outgoing message
<li>Queueing of outbound messages by priority
<li>Bandwidth limiting, both outbound and inbound, according to router configuration
<li>Setup and teardown of transport connections
<li>Encryption of point-to-point communications
<li>Maintenance of connection limits for each transport, implementation of various thresholds for these limits,
and communication of threshold status to the router so it may make operational changes based on the status
<li>Firewall port opening using UPnP (Universal Plug and Play)
<li>Cooperative NAT/Firewall traversal
<li>Local IP detection by various methods, including UPnP, inspection of incoming connections, and enumeration of network devices
<li>Coordination of firewall status and local IP, and changes to either, among the transports
<li>Communication of firewall status and local IP, and changes to either, to the router and the user interface
<li>Determination of a consensus clock, which is used to periodically update the router's clock, as a backup for NTP
<li>Maintenance of status for each peer, including whether it is connected, whether it was recently connected,
and whether it was reachable in the last attempt
<li>Qualification of valid IP addresses according to a local rule set
<li>Honoring the automated and manual lists of banned peers maintained by the router,
and refusing outbound and inbound connections to those peers
</ul>
<h2>Transport Addresses</h2>
The transport subsystem maintains a set of router addresses, each of which lists a transport method, IP, and port.
These addresses constitute the advertised contact points, and are published by the router to the network database.
<p>
Typical scenarios are:
<ul>
<li>A router has no published addresses, so it is considered "hidden" and cannot receive incoming connections
<li>A router is firewalled, and therefore publishes an SSU address which contains a list of cooperating
peers or "introducers" who will assist in NAT traversal (see <a href="{{ site_url('docs/transport/ssu') }}">the SSU spec</a> for details)
<li>A router is not firewalled or its NAT ports are open; it publishes both NTCP and SSU addresses containing
directly-accessible IP and ports.
</ul>
<h2>Transport Selection</h2>
The transport system delivers <a href="i2np.html">I2NP messages</a>. The transport selected for any message is
independent of the application-layer protocol (TCP or UDP).
<p>
For each outgoing message, the transport system solicits "bids" from each transport.
The transport bidding the lowest (best) value wins the bid and receives the message for delivery.
A transport may refuse to bid.
<p>
Whether a transport bids, and with what value, depends on numerous factors:
<ul>
<li>Configuration of transport preferences
<li>Whether the transport is already connected to the peer
<li>The number of current connections compared to various connection limit thresholds
<li>Whether recent connection attempts to the peer have failed
<li>The size of the message, as different transports have different size limits
<li>Whether the peer can accept incoming connections for that transport, as advertised in its RouterInfo
<li>Whether the connection would be indirect (requiring introducers) or direct
<li>The peer's transport preference, as advertised in its RouterInfo
</ul>
<p>
In general, the bid values are selected so that two routers are only connected by a single transport
at any one time. However, this is not a requirement.
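<p>
The bidding process described above can be sketched as follows. This is a minimal illustration, not the router's code: the <code>Transport</code> class, its <code>bid</code> method, the numeric bid values, and the SSU size limit are all hypothetical, chosen only so that the lowest bid wins and a transport can refuse to bid.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Transport:
    """Hypothetical stand-in for a transport; not an actual I2P class."""
    name: str
    connected_bid: int     # bid when already connected to the peer (assumed value)
    unconnected_bid: int   # bid when a new connection is needed (assumed value)
    max_message_size: int  # largest message this transport accepts (assumed value)

    def bid(self, peer: str, size: int, connected: Set[str]) -> Optional[int]:
        """Return a bid (lower is better), or None to refuse to bid."""
        if size > self.max_message_size:
            return None
        return self.connected_bid if peer in connected else self.unconnected_bid


def select_transport(transports: List[Transport], peer: str, size: int,
                     connections: Dict[str, Set[str]]) -> Optional[Transport]:
    """Solicit a bid from every transport; the lowest bid wins the message."""
    best = None
    for transport in transports:
        value = transport.bid(peer, size, connections.get(transport.name, set()))
        if value is None:
            continue  # this transport refused to bid
        if best is None or value < best[0]:
            best = (value, transport)
    return best[1] if best else None


# Illustrative values only; real bids depend on the factors listed above.
ntcp = Transport("NTCP", connected_bid=25, unconnected_bid=70, max_message_size=16378)
ssu = Transport("SSU", connected_bid=50, unconnected_bid=100, max_message_size=32768)
```

With these assumed values, an existing connection on one transport outbids a new connection on the other, which is one way bid values can keep two routers on a single transport at a time.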
<h2>New Transports and Future Work</h2>
Additional transports may be developed, including:
<ul>
<li>A TLS/SSH look-alike transport
<li>An "indirect" transport for routers that are not reachable by all other routers (one form of "restricted routes")
</ul>
<p>
Also, the existing transports will be enhanced to support multiple addresses within a single transport,
including IPv6 addresses. Currently, a transport may only advertise a single IPv4 address.
<p>
Work continues on adjusting default connection limits for each transport.
I2P is designed as a "mesh network", where it is assumed that any router can connect to any other router.
This assumption may be broken by routers that have exceeded their connection limits, and by
routers that are behind restrictive state firewalls (restricted routes).
<p>
The current connection limits are higher for SSU than for NTCP, based on the assumption that
the memory requirements for an NTCP connection are higher than those for SSU.
However, as NTCP buffers are partially in the kernel and SSU buffers are on the Java heap,
that assumption is difficult to verify.
</p><p>
Analyze
<a href="http://www.cse.chalmers.se/%7Ejohnwolf/publications/hjelmvik_breaking.pdf">Breaking and Improving Protocol Obfuscation</a>
and see how transport-layer padding may improve things.
</p>
{% endblock %}


@@ -0,0 +1,559 @@
{% extends "global/layout.html" %}
{% block title %}NTCP Discussion{% endblock %}
{% block content %}
Following is a discussion about NTCP that took place in March 2007.
It has not been updated to reflect the current implementation.
For the current NTCP specification see <a href="{{ site_url('docs/transport/ntcp') }}">the main NTCP page</a>.
<h2>NTCP vs. SSU Discussion, March 2007</h2>
<h3>NTCP questions</h3>
(adapted from an IRC discussion between zzz and cervantes)
<br />
Why is NTCP preferred over SSU? Doesn't NTCP have higher overhead and latency?
It has better reliability.
<br />
Doesn't streaming lib over NTCP suffer from classic TCP-over-TCP issues?
What if we had a really simple UDP transport for streaming-lib-originated traffic?
I think SSU was meant to be the so-called really simple UDP transport - but it just proved too unreliable.
<h3>"NTCP Considered Harmful" Analysis by zzz</h3>
Posted to new Syndie, 2007-03-25.
This was posted to stimulate discussion; don't take it too seriously.
<p>
Summary: NTCP has higher latency and overhead than SSU, and is more likely to
collapse when used with the streaming lib. However, traffic is routed with a
preference for NTCP over SSU and this is currently hardcoded.
</p>
<h4>Discussion</h4>
<p>
We currently have two transports, NTCP and SSU. As currently implemented, NTCP
has lower "bids" than SSU so it is preferred, except for the case where there
is an established SSU connection but no established NTCP connection for a peer.
</p><p>
SSU is similar to NTCP in that it implements acknowledgments, timeouts, and
retransmissions. However SSU is I2P code with tight constraints on the
timeouts and available statistics on round trip times, retransmissions, etc.
NTCP is based on Java NIO TCP, which is a black box and presumably implements
RFC standards, including very long maximum timeouts.
</p><p>
The majority of traffic within I2P is streaming-lib originated (HTTP, IRC,
Bittorrent) which is our implementation of TCP. As the lower-level transport is
generally NTCP due to the lower bids, the system is subject to the well-known
and dreaded problem of TCP-over-TCP
http://sites.inka.de/~W1011/devel/tcp-tcp.html , where both the higher and
lower layers of TCP are doing retransmissions at once, leading to collapse.
</p><p>
Unlike in the PPP over SSH scenario described in the link above, we have
several hops for the lower layer, each covered by an NTCP link. So each NTCP
latency is generally much less than the higher-layer streaming lib latency.
This lessens the chance of collapse.
</p><p>
Also, the probabilities of collapse are lessened when the lower-layer TCP is
tightly constrained with low timeouts and number of retransmissions compared to
the higher layer.
</p><p>
The .28 release increased the maximum streaming lib timeout from 10 sec to 45
sec which greatly improved things. The SSU max timeout is 3 sec. The NTCP max
timeout is presumably at least 60 sec, which is the RFC recommendation. There
is no way to change NTCP parameters or monitor performance. Collapse of the
NTCP layer is [editor: text lost]. Perhaps an external tool like tcpdump would help.
</p><p>
However, running .28, the upstream rate reported by i2psnark does not generally stay at
a high level. It often goes down to 3-4 KBps before climbing back up. This is a
signal that there are still collapses.
</p><p>
SSU is also more efficient. NTCP has higher overhead and probably higher round
trip times. When using NTCP the ratio of (tunnel output) / (i2psnark data
output) is at least 3.5 : 1. Running an experiment where the code was modified
to prefer SSU (the config option i2np.udp.alwaysPreferred has no effect in the
current code), the ratio reduced to about 3 : 1, indicating better efficiency.
</p><p>
As reported by streaming lib stats, things were much improved - lifetime window
size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per ack down from
1.11 to 1.07.
</p><p>
That this was quite effective was surprising, given that we were only changing
the transport for the first of 3 to 5 total hops the outbound messages would
take.
</p><p>
The effect on outbound i2psnark speeds wasn't clear due to normal variations.
Also for the experiment, inbound NTCP was disabled. The effect on inbound
speeds on i2psnark was not clear.
</p>
<h4>Proposals</h4>
<ul>
<li>
1A)
This is easy -
We should flip the bid priorities so that SSU is preferred for all traffic, if
we can do this without causing all sorts of other trouble. This will fix the
i2np.udp.alwaysPreferred configuration option so that it works (either as true
or false).
<li>
1B)
Alternative to 1A), not so easy -
If we can mark traffic without adversely affecting our anonymity goals, we
should identify streaming-lib generated traffic and have SSU generate a low bid
for that traffic. This tag will have to go with the message through each hop
so that the forwarding routers also honor the SSU preference.
<li>
2)
Bounding SSU even further (reducing maximum retransmissions from the current
10) is probably wise to reduce the chance of collapse.
<li>
3)
We need further study on the benefits vs. harm of a semi-reliable protocol
underneath the streaming lib. Are retransmissions over a single hop beneficial
and a big win or are they worse than useless?
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
could perhaps add a no-ack-required message type in SSU if we don't want any
retransmissions at all of streaming-lib traffic. Are tightly bounded
retransmissions desirable?
<li>
4)
The priority sending code in .28 is only for NTCP. So far my testing hasn't
shown much use for SSU priority as the messages don't queue up long enough for
priorities to do any good. But more testing needed.
<li>
5)
The new streaming lib max timeout of 45s is probably still too low.
The TCP RFC says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout (presumably 60s).
</ul>
<h3>Response by jrandom</h3>
Posted to new Syndie, 2007-03-27
<p>
On the whole, I'm open to experimenting with this, though remember why NTCP is
there in the first place - SSU failed in a congestion collapse. NTCP "just
works", and while 2-10% retransmission rates can be handled in normal
single-hop networks, that gives us a 40% retransmission rate with 2 hop
tunnels. If you loop in some of the measured SSU retransmission rates we saw
back before NTCP was implemented (10-30+%), that gives us an 83% retransmission
rate. Perhaps those rates were caused by the low 10 second timeout, but
increasing that much would bite us (remember, multiply by 5 and you've got half
the journey).
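</p><p>
The 40% and 83% figures above are consistent with compounding a per-link retransmission rate over five links, taking the upper ends of the quoted ranges (10% and 30%). The sketch below assumes that each link fails independently, which is a simplifying assumption made here and not something stated in the post; the five-link count comes from the "multiply by 5" remark.

```python
def end_to_end_loss(per_link_loss: float, links: int) -> float:
    """Probability that a message is lost on at least one of `links` independent links."""
    return 1.0 - (1.0 - per_link_loss) ** links


# Upper ends of the quoted ranges, compounded over five links.
normal_network = end_to_end_loss(0.10, 5)   # roughly 0.41
measured_ssu = end_to_end_loss(0.30, 5)     # roughly 0.83
print(f"{normal_network:.2f} {measured_ssu:.2f}")
```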
</p><p>
Unlike TCP, we have no feedback from the tunnel to know whether the message
made it - there are no tunnel level acks. We do have end to end ACKs, but only
on a small number of messages (whenever we distribute new session tags) - out
of the 1,553,591 client messages my router sent, we only attempted to ACK
145,207 of them. The others may have failed silently or succeeded perfectly.
</p><p>
I'm not convinced by the TCP-over-TCP argument for us, especially split across
the various paths we transfer down. Measurements on I2P can convince me
otherwise, of course.
</p><p>
<i>
The NTCP max timeout is presumably at least 60 sec, which is the RFC
recommendation. There is no way to change NTCP parameters or monitor
performance.
</i>
</p><p>
True, but net connections only get up to that level when something really bad
is going on - the retransmission timeout on TCP is often on the order of tens
or hundreds of milliseconds. As foofighter points out, they've got 20+ years
experience and bugfixing in their TCP stacks, plus a billion dollar industry
optimizing hardware and software to perform well according to whatever it is
they do.
</p><p>
<i>
NTCP has higher overhead and probably higher round trip times. When using NTCP
the ratio of (tunnel output) / (i2psnark data output) is at least 3.5 : 1.
Running an experiment where the code was modified to prefer SSU (the config
option i2np.udp.alwaysPreferred has no effect in the current code), the ratio
reduced to about 3 : 1, indicating better efficiency.
</i>
</p><p>
This is very interesting data, though more as a matter of router congestion
than bandwidth efficiency - you'd have to compare 3.5*$n*$NTCPRetransmissionPct
./. 3.0*$n*$SSURetransmissionPct. This data point suggests there's something in
the router that leads to excess local queuing of messages already being
transferred.
</p><p>
<i>
lifetime window size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per
ACK down from 1.11 to 1.07.
</i>
</p><p>
Remember that the sends-per-ACK is only a sample, not a full count (as we don't
try to ACK every send). It's not a random sample either, but instead samples
more heavily periods of inactivity or the initiation of a burst of activity -
sustained load won't require many ACKs.
</p><p>
Window sizes in that range are still woefully low to get the real benefit of
AIMD, and still too low to transmit a single 32KB BT chunk (increasing the
floor to 10 or 12 would cover that).
</p><p>
Still, the wsize stat looks promising - over how long was that maintained?
</p><p>
Actually, for testing purposes, you may want to look at
StreamSinkClient/StreamSinkServer or even TestSwarm in
apps/ministreaming/java/src/net/i2p/client/streaming/ - StreamSinkClient is a
CLI app that sends a selected file to a selected destination and
StreamSinkServer creates a destination and writes out any data sent to it
(displaying size and transfer time). TestSwarm combines the two - flooding
random data to whomever it connects to. That should give you the tools to
measure sustained throughput capacity over the streaming lib, as opposed to BT
choke/send.
</p><p>
<i>
1A)
This is easy -
We should flip the bid priorities so that SSU is preferred for all traffic, if
we can do this without causing all sorts of other trouble. This will fix the
i2np.udp.alwaysPreferred configuration option so that it works (either as true
or false).
</i>
</p><p>
Honoring i2np.udp.alwaysPreferred is a good idea in any case - please feel free
to commit that change. Let's gather a bit more data though before switching the
preferences, as NTCP was added to deal with an SSU-created congestion collapse.
</p><p>
<i>
1B)
Alternative to 1A), not so easy -
If we can mark traffic without adversely affecting our anonymity goals, we
should identify streaming-lib generated traffic
and have SSU generate a low bid for that traffic. This tag will have to go with
the message through each hop
so that the forwarding routers also honor the SSU preference.
</i>
</p><p>
In practice, there are three types of traffic - tunnel building/testing, netDb
query/response, and streaming lib traffic. The network has been designed to
make differentiating those three very hard.
</p><p>
<i>
2)
Bounding SSU even further (reducing maximum retransmissions from the current
10) is probably wise to reduce the chance of collapse.
</i>
</p><p>
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
retransmissions is reasonable, from a transport layer, but if the other side is
too congested to ACK in time (even with the implemented SACK/NACK capability),
there's not much we can do.
</p><p>
In my view, to really address the core issue we need to address why the router
gets too congested to ACK in time (which, from what I've found, is due to CPU
contention). Maybe we can juggle some things in the router's processing to make
the transmission of an already existing tunnel higher CPU priority than
decrypting a new tunnel request? Though we've got to be careful to avoid
starvation.
</p><p>
<i>
3)
We need further study on the benefits vs. harm of a semi-reliable protocol
underneath the streaming lib. Are retransmissions over a single hop beneficial
and a big win or are they worse than useless?
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
could perhaps add a no-ACK-required message type in SSU if we don't want any
retransmissions at all of streaming-lib traffic. Are tightly bounded
retransmissions desirable?
</i>
</p><p>
Worth looking into - what if we just disabled SSU's retransmissions? It'd
probably lead to much higher streaming lib resend rates, but maybe not.
</p><p>
<i>
4)
The priority sending code in .28 is only for NTCP. So far my testing hasn't
shown much use for SSU priority as the messages don't queue up long enough for
priorities to do any good. But more testing needed.
</i>
</p><p>
There's UDPTransport.PRIORITY_LIMITS and UDPTransport.PRIORITY_WEIGHT (honored
by TimedWeightedPriorityMessageQueue), but currently the weights are almost all
equal, so there's no effect. That could be adjusted, of course (but as you
mention, if there's no queuing, it doesn't matter).
</p><p>
<i>
5)
The new streaming lib max timeout of 45s is probably still too low. The TCP RFC
says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout
(presumably 60s).
</i>
</p><p>
That 45s is the max retransmission timeout of the streaming lib though, not the
stream timeout. TCP in practice has retransmission timeouts orders of magnitude
less, though yes, can get to 60s on links running through exposed wires or
satellite transmissions ;) If we increase the streaming lib retransmission
timeout to e.g. 75 seconds, we could go get a beer before a web page loads
(especially assuming less than a 98% reliable transport). That's one reason we
prefer NTCP.
</p>
<h3>Response by zzz</h3>
Posted to new Syndie, 2007-03-31
<p>
<i>
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
retransmissions is reasonable, from a transport layer, but if the other side is
too congested to ACK in time (even with the implemented SACK/NACK capability),
there's not much we can do.
<br>
In my view, to really address the core issue we need to address why the
router gets too congested to ACK in time (which, from what I've found, is due to
CPU contention). Maybe we can juggle some things in the router's processing to
make the transmission of an already existing tunnel higher CPU priority than
decrypting a new tunnel request? Though we've got to be careful to avoid
starvation.
</i>
</p><p>
One of my main stats-gathering techniques is turning on
net.i2p.client.streaming.ConnectionPacketHandler=DEBUG and watching the RTT
times and window sizes as they go by. To overgeneralize for a moment, it's
common to see 3 types of connections: ~4s RTT, ~10s RTT, and ~30s RTT. Trying
to knock down the 30s RTT connections is the goal. If CPU contention is the
cause then maybe some juggling will do it.
</p><p>
Reducing the SSU max retrans from 10 is really just a stab in the dark as we
don't have good data on whether we are collapsing, having TCP-over-TCP issues,
or what, so more data is needed.
</p><p>
<i>
Worth looking into - what if we just disabled SSU's retransmissions? It'd
probably lead to much higher streaming lib resend rates, but maybe not.
</i>
</p><p>
What I don't understand, if you could elaborate, are the benefits of SSU
retransmissions for non-streaming-lib traffic. Do we need tunnel messages (for
example) to use a semi-reliable transport or can they use an unreliable or
kinda-sorta-reliable transport (1 or 2 retransmissions max, for example)? In
other words, why semi-reliability?
</p><p>
<i>
(but as you mention, if there's no queuing, it doesn't matter).
</i>
</p><p>
I implemented priority sending for UDP but it kicked in about 100,000 times
less often than the code on the NTCP side. Maybe that's a clue for further
investigation or a hint - I don't understand why it would back up that much
more often on NTCP, but maybe that's a hint on why NTCP performs worse.
</p>
<h3>Question answered by jrandom</h3>
Posted to new Syndie, 2007-03-31
<p>
measured SSU retransmission rates we saw back before NTCP was implemented
(10-30+%)
</p><p>
Can the router itself measure this? If so, could a transport be selected based
on measured performance? (i.e. if an SSU connection to a peer is dropping an
unreasonable number of messages, prefer NTCP when sending to that peer)
</p><p>
Yeah, it currently uses that stat right now as a poor-man's MTU detection (if
the retransmission rate is high, it uses the small packet size, but if it's low,
it uses the large packet size). We tried a few things when first introducing
NTCP (and when first moving away from the original TCP transport) that would
prefer SSU but fail that transport for a peer easily, causing it to fall back
on NTCP. However, there's certainly more that could be done in that regard,
though it gets complicated quickly (how/when to adjust/reset the bids, whether
to share these preferences across multiple peers or not, whether to share it
across multiple sessions with the same peer (and for how long), etc).
<h3>Response by foofighter</h3>
Posted to new Syndie, 2007-03-26
<p>
If I've understood things right, the primary reason in favor of TCP (in
general, both the old and new variety) was that you needn't worry about coding
a good TCP stack. Which ain't impossibly hard to get right... just that
existing TCP stacks have a 20 year lead.
</p><p>
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
UDP, except the following considerations:
<ul>
<li>
A TCP-only network is very dependent on reachable peers (those who can forward
incoming connections through their NAT)
<li>
Still even if reachable peers are rare, having them be high capacity somewhat
alleviates the topological scarcity issues
<li>
UDP allows for "NAT hole punching" which lets people be "kind of
pseudo-reachable" (with the help of introducers) who could otherwise only
connect out
<li>
The "old" TCP transport implementation required lots of threads, which was a
performance killer, while the "new" TCP transport does well with few threads
<li>
Routers of set A crap out when saturated with UDP. Routers of set B crap out
when saturated with TCP.
<li>
It "feels" (as in, there are some indications but no scientific data or
quality statistics) that A is more widely deployed than B
<li>
Some networks carry non-DNS UDP datagrams with an outright shitty quality,
while still somewhat bothering to carry TCP streams.
</ul>
</p><p>
On that background, a small diversity of transports (as many as needed, but not
more) appears sensible in either case. Which should be the main transport
depends on their performance. I've seen nasty stuff on my line when I
tried to use its full capacity with UDP. Packet losses on the level of 35%.
</p><p>
We could definitely try playing with UDP versus TCP priorities, but I'd urge
caution in that. I would urge that they not be changed too radically all at
once, or it might break things.
</p>
<h3>Response by zzz</h3>
Posted to new Syndie, 2007-03-27
<p>
<i>
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
UDP, except the following considerations:
</i>
</p><p>
These are all valid issues. However you are considering the two protocols in
isolation, rather than thinking about what transport protocol is best for a
particular higher-level protocol (i.e. streaming lib or not).
</p><p>
What I'm saying is you have to take the streaming lib into consideration.
So either shift the preferences for everybody or treat streaming lib traffic
differently.
That's what my proposal 1B) is talking about - have a different preference for
streaming-lib traffic than for non streaming-lib traffic (for example tunnel
build messages).
</p><p>
<i>
On that background, a small diversity of transports (as many as needed, but
not more) appears sensible in either case. Which should be the main transport
depends on their performance. I've seen nasty stuff on my line when I
tried to use its full capacity with UDP. Packet losses on the level of 35%.
</i>
</p><p>
Agreed. The new .28 may have made things better for packet loss over UDP, or
maybe not.
One important point - the transport code does remember failures of a transport.
So if UDP is the preferred transport, it will try it first, but if it fails for
a particular destination, the next attempt for that destination will try
NTCP rather than trying UDP again.
</p><p>
<i>
We could definitely try playing with UDP versus TCP priorities, but I'd urge
caution in that. I would urge that they not be changed too radically all at
once, or it might break things.
</i>
</p><p>
We have four tuning knobs - the four bid values (SSU and NTCP, for
already-connected and not-already-connected).
We could make SSU be preferred over NTCP only if both are connected, for
example, but try NTCP first if neither transport is connected.
</p><p>
The other way to do it gradually is only shifting the streaming lib traffic
(the 1B proposal); however, that could be hard and may have anonymity
implications, I don't know. Or maybe shift the traffic only for the first
outbound hop (i.e. don't propagate the flag to the next router), which gives
you only partial benefit but might be more anonymous and easier.
</p>
<h3>Results of the Discussion</h3>
... and other related changes in the same timeframe (2007):
<ul>
<li>
Significant tuning of the streaming lib parameters,
greatly increasing outbound performance, was implemented in 0.6.1.28
<li>
Priority sending for NTCP was implemented in 0.6.1.28
<li>
Priority sending for SSU was implemented by zzz but was never checked in
<li>
The advanced transport bid control
i2np.udp.preferred was implemented in 0.6.1.29.
<li>
Pushback for NTCP was implemented in 0.6.1.30, disabled in 0.6.1.31 due to anonymity concerns,
and re-enabled with improvements to address those concerns in 0.6.1.32.
<li>
None of zzz's proposals 1-5 have been implemented.
</ul>
{% endblock %}


@@ -0,0 +1,434 @@
{% extends "global/layout.html" %}
{% block title %}NTCP{% endblock %}
{% block content %}
Updated August 2010 for release 0.8
<h2>NTCP (NIO-based TCP)</h2>
<p>
NTCP
is one of two <a href="{{ site_url('docs/transport') }}">transports</a> currently implemented in I2P.
The other is <a href="{{ site_url('docs/transport/ssu') }}">SSU</a>.
NTCP
is a Java NIO-based transport
introduced in I2P release 0.6.1.22.
Java NIO (new I/O) does not suffer from the 1 thread per connection issues of the old TCP transport.
</p><p>
By default,
NTCP uses the IP/Port
auto-detected by SSU. When enabled on config.jsp,
SSU will notify/restart NTCP when the external address changes
or when the firewall status changes.
Now you can enable inbound TCP without a static IP or dyndns service.
</p><p>
The NTCP code within I2P is relatively lightweight (1/4 the size of the SSU code)
because it uses the underlying Java TCP transport for reliable delivery.
</p>
<h2>NTCP Protocol Specification</h2>
<h3>Standard Message Format</h3>
<p>
After establishment,
the NTCP transport sends individual I2NP messages, with a simple checksum.
The unencrypted message is encoded as follows:
<pre>
* +-------+-------+--//--+---//----+-------+-------+-------+-------+
* | sizeof(data) | data | padding | Adler checksum of sz+data+pad |
* +-------+-------+--//--+---//----+-------+-------+-------+-------+
</pre>
The data is then AES/256/CBC encrypted. The session key for the encryption
is negotiated during establishment (using Diffie-Hellman 2048 bit).
The establishment between two routers is implemented in the EstablishState class
and detailed below.
The IV for AES/256/CBC encryption is the last 16 bytes of the previous encrypted message.
</p>
<p>
0-15 bytes of padding are required to bring the total message length
(including the six size and checksum bytes) to a multiple of 16.
The maximum message size is currently 16 KB.
Therefore the maximum data size is currently 16 KB - 6, or 16378 bytes.
The minimum data size is 1.
</p>
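<p>
The framing rules above can be sketched as follows. This is an illustration of the unencrypted layout only, not the router's implementation: the big-endian byte order and the zero-filled padding are assumptions, the function names are hypothetical, and the AES/256/CBC step is omitted.
</p>

```python
import struct
import zlib

MAX_DATA_SIZE = 16 * 1024 - 6  # 16378 bytes, from the limits stated above


def build_frame(data: bytes) -> bytes:
    """Build the unencrypted frame: 2-byte size, data, padding, 4-byte Adler-32."""
    if not 1 <= len(data) <= MAX_DATA_SIZE:
        raise ValueError("data size must be between 1 and 16378 bytes")
    # 0-15 bytes of padding bring size + data + padding + checksum to a multiple of 16.
    pad_len = -(2 + len(data) + 4) % 16
    # Byte order and zero padding are assumptions for this sketch.
    body = struct.pack(">H", len(data)) + data + bytes(pad_len)
    checksum = zlib.adler32(body) & 0xFFFFFFFF  # computed over sz + data + pad
    return body + struct.pack(">I", checksum)


def parse_frame(frame: bytes) -> bytes:
    """Verify the checksum and return the data portion of a frame."""
    body = frame[:-4]
    (checksum,) = struct.unpack(">I", frame[-4:])
    if zlib.adler32(body) & 0xFFFFFFFF != checksum:
        raise ValueError("Adler-32 checksum mismatch")
    (size,) = struct.unpack(">H", body[:2])
    return body[2:2 + size]
```

A maximum-size payload of 16378 bytes needs no padding, since 2 + 16378 + 4 is exactly 16384.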
<h3>Time Sync Message Format</h3>
<p>
One special case is a metadata message where the sizeof(data) is 0. In
that case, the unencrypted message is encoded as:
<pre>
* +-------+-------+-------+-------+-------+-------+-------+-------+
* | 0 | timestamp in seconds | uninterpreted
* +-------+-------+-------+-------+-------+-------+-------+-------+
* uninterpreted | Adler checksum of bytes 0-11 |
* +-------+-------+-------+-------+-------+-------+-------+-------+
</pre>
Total length: 16 bytes. The time sync message is sent at approximately 15 minute intervals.
The message is encrypted just as standard messages are.
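<p>
A minimal sketch of the unencrypted time sync layout described above. The big-endian byte order and the zero fill of the uninterpreted bytes are assumptions, and <code>build_time_sync</code> is a hypothetical name.
</p>

```python
import struct
import zlib


def build_time_sync(now_seconds: int) -> bytes:
    """Build the 16-byte unencrypted time sync message."""
    # Bytes 0-1: size field of 0; bytes 2-5: timestamp; bytes 6-11: uninterpreted.
    head = struct.pack(">HI", 0, now_seconds) + bytes(6)
    # Bytes 12-15: Adler-32 checksum of bytes 0-11.
    return head + struct.pack(">I", zlib.adler32(head) & 0xFFFFFFFF)
```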
<h3>Checksums</h3>
The standard and time sync messages use the Adler-32 checksum
as defined in the <a href="http://tools.ietf.org/html/rfc1950">ZLIB Specification</a>.
<h3>Establishment Sequence</h3>
In the establish state, there is a 4-phase message sequence to exchange DH keys and signatures.
In the first two messages there is a 2048-bit Diffie-Hellman exchange.
Then, DSA signatures of the critical data are exchanged to confirm the connection.
<pre>
* Alice contacts Bob
* =========================================================
* X+(H(X) xor Bob.identHash)-----------------------------&gt;
* &lt;----------------------------------------Y+E(H(X+Y)+tsB+padding, sk, Y[240:255])
* E(sz+Alice.identity+tsA+padding+S(X+Y+Bob.identHash+tsA+tsB), sk, hX_xor_Bob.identHash[16:31])---&gt;
* &lt;----------------------E(S(X+Y+Alice.identHash+tsA+tsB)+padding, sk, prev)
</pre>
<pre>
Legend:
X, Y: 256 byte DH public keys
H(): 32 byte SHA256 Hash
E(data, session key, IV): AES256 Encrypt
S(): 40 byte DSA Signature
tsA, tsB: timestamps (4 bytes, seconds since epoch)
sk: 32 byte Session key
sz: 2 byte size of Alice identity to follow
</pre>
<h4 id="DH">DH Key Exchange</h4>
<p>
The initial 2048-bit DH key exchange
uses the same shared prime (p) and generator (g) as that used for I2P's
<a href="how_cryptography.html#elgamal">ElGamal encryption</a>.
</p>
<p>
The DH key exchange consists of the steps listed below.
The mapping between these steps and the messages sent between I2P routers
is marked in bold.
<ol>
<li>Alice generates a secret 226-bit integer x.
She then calculates X = g^x mod p.
</li>
<li>Alice sends X to Bob <b>(Message 1)</b>.</li>
<li>Bob generates a secret 226-bit integer y.
He then calculates Y = g^y mod p.</li>
<li>Bob sends Y to Alice <b>(Message 2)</b>.</li>
<li>Alice can now compute sessionKey = Y^x mod p.</li>
<li>Bob can now compute sessionKey = X^y mod p.</li>
<li>Both Alice and Bob now have a shared key sessionKey = g^(x*y) mod p.</li>
</ol>
The sessionKey is then used to exchange identities in <b>Message 3</b> and <b>Message 4</b>.
</p>
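<p>
The steps above can be sketched with Python's built-in modular exponentiation.
The prime and generator here are small stand-ins chosen for brevity, not the 2048-bit values I2P uses,
and the function names are hypothetical.
</p>

```python
import secrets

# Toy parameters for illustration only; NOT the 2048-bit prime and generator used by I2P.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Pick a secret 226-bit exponent and return (private, public = G^private mod P)."""
    private = secrets.randbits(226)
    return private, pow(G, private, P)


def dh_shared(their_public: int, my_private: int) -> int:
    """Compute the shared value their_public^my_private mod P."""
    return pow(their_public, my_private, P)


x, X = dh_keypair()            # Alice: steps 1-2, X goes out in Message 1
y, Y = dh_keypair()            # Bob: steps 3-4, Y goes out in Message 2
alice_key = dh_shared(Y, x)    # step 5
bob_key = dh_shared(X, y)      # step 6
```

<p>
Both sides end up with the same value, g^(x*y) mod p, without either secret exponent crossing the wire.
</p>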
<h4>Message 1 (Session Request)</h4>
This is the DH request.
Alice already has Bob's
<a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>,
IP address, and port, as contained in his
<a href="common_structures_spec.html#struct_RouterInfo">Router Info</a>,
which was published to the
<a href="{{ site_url('docs/how/networkdatabase') }}">network database</a>.
Alice sends Bob:
<pre>
* X+(H(X) xor Bob.identHash)-----------------------------&gt;
Size: 288 bytes
</pre>
Contents:
<pre>
+----+----+----+----+----+----+----+----+
| X, as calculated from DH |
+ +
| |
~ . . . ~
| |
+----+----+----+----+----+----+----+----+
| |
+ +
| HXxorHI |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
X: 256 byte X from Diffie Hellman
HXxorHI: SHA256 Hash(X) xored with SHA256 Hash(Bob's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>)
(32 bytes)
</pre>
<p><b>Notes:</b>
<ul><li>
Bob verifies HXxorHI using his own router hash. If it does not verify,
Alice has contacted the wrong router, and Bob drops the connection.
</li></ul>
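<p>
A minimal sketch of how Message 1 could be assembled and checked. The helper names are hypothetical;
only the layout (256-byte X followed by the 32-byte HXxorHI) comes from the description above.
</p>

```python
import hashlib
import hmac


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def build_session_request(x_public: bytes, bob_ident_hash: bytes) -> bytes:
    """Alice: X followed by SHA256(X) xor Bob's router identity hash."""
    if len(x_public) != 256 or len(bob_ident_hash) != 32:
        raise ValueError("X must be 256 bytes and the identity hash 32 bytes")
    return x_public + xor_bytes(hashlib.sha256(x_public).digest(), bob_ident_hash)


def verify_session_request(message: bytes, my_ident_hash: bytes) -> bool:
    """Bob: recompute HXxorHI with his own router hash and compare."""
    if len(message) != 288:
        return False
    x_public, hx_xor_hi = message[:256], message[256:]
    expected = xor_bytes(hashlib.sha256(x_public).digest(), my_ident_hash)
    return hmac.compare_digest(expected, hx_xor_hi)
```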
<h4>Message 2 (Session Created)</h4>
This is the DH reply. Bob sends Alice:
<pre>
* &lt;----------------------------------------Y+E(H(X+Y)+tsB+padding, sk, Y[240:255])
Size: 304 bytes
</pre>
Unencrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| Y as calculated from DH |
+ +
| |
~ . . . ~
| |
+----+----+----+----+----+----+----+----+
| |
+ +
| HXY |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
| tsB | padding |
+----+----+----+----+ +
| |
+----+----+----+----+----+----+----+----+
Y: 256 byte Y from Diffie Hellman
HXY: SHA256 Hash(X concatenated with Y)
(32 bytes)
tsB: 4 byte timestamp (seconds since the epoch)
padding: 12 bytes random data
</pre>
Encrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| Y as calculated from DH |
+ +
| |
~ . . . ~
| |
+----+----+----+----+----+----+----+----+
| |
+ +
| encrypted data |
+ +
| |
+ +
| |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
Y: 256 byte Y from Diffie Hellman
encrypted data: 48 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
the last 16 bytes of Y as the IV
</pre>
<p><b>Notes:</b>
<ul><li>
Alice may drop the connection if the clock skew with Bob is too high as calculated using tsB.
</li></ul>
</p>
<h4>Message 3 (Session Confirm A)</h4>
This contains Alice's router identity, and a DSA signature of the critical data. Alice sends Bob:
<pre>
* E(sz+Alice.identity+tsA+padding+S(X+Y+Bob.identHash+tsA+tsB), sk, hX_xor_Bob.identHash[16:31])---&gt;
Size: 448 bytes (typ. for 387 byte identity)
</pre>
Unencrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| sz | Alice's Router Identity |
+----+----+ +
| |
~ . . . ~
| |
+ +----+----+----+
| | tsA
+----+----+----+----+----+----+----+----+
| padding |
+----+ +
| |
+----+----+----+----+----+----+----+----+
| |
+ +
| signature |
+ +
| |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
sz: 2 byte size of Alice's router identity to follow (should always be 387)
ident: Alice's 387 byte <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
tsA: 4 byte timestamp (seconds since the epoch)
padding: 15 bytes random data
signature: the 40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the following concatenated data:
X, Y, Bob's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>, tsA, tsB.
Alice signs it with the <a href="common_structures_spec.html#type_SigningPrivateKey">private signing key</a> associated with the <a href="common_structures_spec.html#type_SigningPublicKey">public signing key</a> in her <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
</pre>
Encrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| |
+ +
| encrypted data |
~ . . . ~
| |
+----+----+----+----+----+----+----+----+
encrypted data: 448 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
the last 16 bytes of HXxorHI (i.e., the last 16 bytes of message #1) as the IV
</pre>
<p><b>Notes:</b>
<ul><li>
Bob verifies the signature, and on failure, drops the connection.
</li><li>
Bob may drop the connection if the clock skew with Alice is too high as calculated using tsA.
</li></ul>
</p>
<h4>Message 4 (Session Confirm B)</h4>
This is a DSA signature of the critical data. Bob sends Alice:
<pre>
* &lt;----------------------E(S(X+Y+Alice.identHash+tsA+tsB)+padding, sk, prev)
Size: 48 bytes
</pre>
Unencrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| |
+ +
| signature |
+ +
| |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
| padding |
+----+----+----+----+----+----+----+----+
signature: the 40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the following concatenated data:
X, Y, Alice's <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>, tsA, tsB.
Bob signs it with the <a href="common_structures_spec.html#type_SigningPrivateKey">private signing key</a> associated with the <a href="common_structures_spec.html#type_SigningPublicKey">public signing key</a> in his <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
padding: 8 bytes random data
</pre>
Encrypted Contents:
<pre>
+----+----+----+----+----+----+----+----+
| |
+ +
| encrypted data |
~ . . . ~
| |
+----+----+----+----+----+----+----+----+
encrypted data: 48 bytes <a href="{{ site_url('docs/how/cryptography') }}#AES">AES encrypted</a> using the DH session key and
the last 16 bytes of the encrypted contents of message #2 as the IV
</pre>
<p><b>Notes:</b>
<ul><li>
Alice verifies the signature, and on failure, drops the connection.
</li></ul>
</p>
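<p>
The padding lengths quoted for messages 2, 3 and 4 all follow from rounding the plaintext up to the
16-byte AES block size. The short calculation below reproduces them; the function name is hypothetical.
</p>

```python
BLOCK = 16  # AES block size in bytes


def pad_to_block(unpadded_len: int):
    """Return (padding bytes needed, padded total) for a 16-byte block cipher."""
    padding = (-unpadded_len) % BLOCK
    return padding, unpadded_len + padding


MESSAGE_2 = pad_to_block(32 + 4)             # H(X+Y) + tsB (the encrypted part only)
MESSAGE_3 = pad_to_block(2 + 387 + 4 + 40)   # sz + identity + tsA + signature
MESSAGE_4 = pad_to_block(40)                 # signature
```

<p>
This gives 12, 15 and 8 bytes of padding, for encrypted sizes of 48, 448 and 48 bytes respectively.
</p>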
<h4>After Establishment</h4>
<p>
The connection is established, and standard or time sync messages may be exchanged.
All subsequent messages are AES encrypted using the negotiated DH session key.
Alice will use the last 16 bytes of the encrypted contents of message #3 as the next IV.
Bob will use the last 16 bytes of the encrypted contents of message #4 as the next IV.
</p>
<h3>Check Connection Message</h3>
Alternatively, when Bob receives a connection, it could be a
check connection (perhaps prompted by Bob asking for someone
to verify his listener).
Check Connection is not currently used.
However, for the record, check connections are formatted as follows.
A check info connection will receive 256 bytes containing:
<ul>
<li> 32 bytes of uninterpreted, ignored data
<li> 1 byte size
<li> that many bytes making up the local router's IP address (as reached by the remote side)
<li> 2 byte port number that the local router was reached on
<li> 4 byte i2p network time as known by the remote side (seconds since the epoch)
<li> uninterpreted padding data, up to byte 223
<li> xor of the local router's identity hash and the SHA256 of bytes 32 through bytes 223
</ul>
</pre>
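<p>
Since Check Connection is unused, the following is only a sketch of the 256-byte layout listed above.
The function names are hypothetical, big-endian byte order for the port and time fields is an assumption,
and the ignored and padding bytes are zeroed here.
</p>

```python
import hashlib
import struct


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def build_check_connection(ip: bytes, port: int, timestamp: int, local_ident_hash: bytes) -> bytes:
    """Lay out the 256 bytes: 32 ignored, fields + padding (bytes 32-223), 32-byte check."""
    fields = bytes([len(ip)]) + ip + struct.pack(">HI", port, timestamp)
    if len(fields) > 192:
        raise ValueError("fields do not fit before byte 224")
    covered = fields + bytes(192 - len(fields))          # bytes 32 through 223
    check = _xor(local_ident_hash, hashlib.sha256(covered).digest())
    return bytes(32) + covered + check


def parse_check_connection(buf: bytes, local_ident_hash: bytes):
    """Verify the trailing check value and return (ip, port, timestamp)."""
    if len(buf) != 256:
        raise ValueError("expected 256 bytes")
    covered = buf[32:224]
    if buf[224:] != _xor(local_ident_hash, hashlib.sha256(covered).digest()):
        raise ValueError("hash check failed")
    ip_len = covered[0]
    if 1 + ip_len + 6 > len(covered):
        raise ValueError("address size out of range")
    ip = covered[1:1 + ip_len]
    port, timestamp = struct.unpack(">HI", covered[1 + ip_len:1 + ip_len + 6])
    return ip, port, timestamp
```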
<h2>Discussion</h2>
Now on the <a href="{{ site_url('docs/transport/ntcp/discussion') }}">NTCP Discussion Page</a>.
<h2><a name="future">Future Work</a></h2>
<ul><li>
The maximum message size should be increased to approximately 32 KB.
</li><li>
A set of fixed packet sizes may be appropriate to further hide the data
fragmentation from external adversaries, but the tunnel, garlic, and end-to-end
padding should be sufficient for most needs until then.
However, there is currently no provision for padding beyond the next 16-byte boundary,
to create a limited number of message sizes.
</li><li>
Memory utilization (including that of the kernel) for NTCP should be compared to that for SSU.
</li><li>
Can the establishment messages be randomly padded somehow, to frustrate
identification of I2P traffic based on initial packet sizes?
</li><li>
Review and possibly disable 'check connection'
</li></ul>
</p>
{% endblock %}

View File

@@ -0,0 +1,450 @@
{% extends "global/layout.html" %}
{% block title %}SSU Transport{% endblock %}
{% block content %}
Updated October 2012 for release 0.9.2
<h1>Secure Semireliable UDP (SSU)</h1>
<p>
SSU (also called "UDP" in much of the I2P documentation and user interfaces)
is one of two <a href="{{ site_url('docs/transport') }}">transports</a> currently implemented in I2P.
The other is <a href="{{ site_url('docs/transport/ntcp') }}">NTCP</a>.
</p><p>
SSU is the newer of the two transports,
introduced in I2P release 0.6.
In a standard I2P installation, the router uses both NTCP and SSU for outbound connections.
<h2>SSU Services</h2>
Like the NTCP transport, SSU provides reliable, encrypted, connection-oriented, point-to-point data transport.
Unique to SSU, it also provides IP detection and NAT traversal services, including:
<ul>
<li>Cooperative NAT/Firewall traversal using <a href="#introduction">introducers</a>
<li>Local IP detection by inspection of incoming packets and <a href="#peerTesting">peer testing</a>
<li>Communication of firewall status and local IP, and changes to either, to NTCP
<li>Communication of firewall status and local IP, and changes to either, to the router and the user interface
</ul>
<h1>Protocol Details</h1>
<h2><a name="congestioncontrol">Congestion control</a></h2>
<p>SSU's need for only semireliable delivery, TCP-friendly operation,
and the capacity for high throughput allows a great deal of latitude in
congestion control. The congestion control algorithm outlined below is
meant to be both efficient in bandwidth and simple to implement.</p>
<p>Packets are scheduled according to the router's policy, taking care
not to exceed the router's outbound capacity or to exceed the measured
capacity of the remote peer. The measured capacity operates along the
lines of TCP's slow start and congestion avoidance, with additive increases
to the sending capacity and multiplicative decreases in face of congestion.
Unlike for TCP, routers may give up on some messages after
a given period or number of retransmissions while continuing to transmit
other messages.</p>
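<p>
The slow start and additive-increase, multiplicative-decrease behaviour described above can be sketched as follows.
This is not the router's implementation: the class name, the byte-based window, and every constant are
assumptions chosen only to make the example concrete.
</p>

```python
class SendWindow:
    """Hypothetical per-peer send window, in bytes."""

    def __init__(self, window=4096, ssthresh=32768, mss=1024,
                 minimum=1024, maximum=262144):
        self.window = window
        self.ssthresh = ssthresh
        self.mss = mss
        self.minimum = minimum
        self.maximum = maximum

    def on_ack(self, acked_bytes: int) -> None:
        if self.window < self.ssthresh:
            # Slow start: grow by the amount acknowledged.
            self.window += acked_bytes
        else:
            # Congestion avoidance: additive increase, roughly one mss per window.
            self.window += max(1, self.mss * acked_bytes // self.window)
        self.window = min(self.window, self.maximum)

    def on_congestion(self) -> None:
        # Multiplicative decrease: halve the window, never below the minimum.
        self.ssthresh = max(self.window // 2, self.minimum)
        self.window = self.ssthresh
```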
<p>The congestion detection techniques vary from TCP as well, since each
message has its own unique and nonsequential identifier, and each message
has a limited size - at most, 32KB. To efficiently transmit this feedback
to the sender, the receiver periodically includes a list of fully ACKed
message identifiers and may also include bitfields for partially received
messages, where each bit represents the reception of a fragment. If
duplicate fragments arrive, the message should be ACKed again, or if the
message has still not been fully received, the bitfield should be
retransmitted with any new updates.</p>
<p>The current implementation does not pad the packets to
any particular size, but instead just places a single message fragment into
a packet and sends it off (careful not to exceed the MTU).
</p>
<h3><a name="mtu">MTU</a></h3>
<p>
As of router version 0.8.12,
two MTU values are used: 620 and 1484.
The MTU value is adjusted based on the percentage of packets that are retransmitted.
</p><p>
For both MTU values, it is desirable that (MTU % 16) == 12, so that
the payload portion after the 28-byte IP/UDP header is a multiple of
16 bytes, for encryption purposes.
This calculation is for IPv4 only. While the protocol as specified supports IPv6
addresses, IPv6 is not yet implemented.
</p><p>
For the small MTU value, it is desirable to pack a 2646-byte
Variable Tunnel Build Message efficiently into multiple packets;
with a 620-byte MTU, it fits nicely into 5 packets.
</p><p>
Based on measurements, 1492 fits nearly all reasonably small I2NP messages
(larger I2NP messages may be 1900 to 4500 bytes, which will not fit
into a live network MTU anyway).
</p><p>
The MTU values were 608 and 1492 for releases 0.8.9 - 0.8.11.
The large MTU was 1350 prior to release 0.8.9.
</p><p>
The maximum receive packet size
is 1571 bytes as of release 0.8.12.
For releases 0.8.9 - 0.8.11 it was 1535 bytes.
Prior to release 0.8.9 it was 2048 bytes.
</p><p>
As of release 0.9.2, if a router's network interface MTU is less than 1484,
it will publish that in the network database, and other routers should
honor that when a connection is established.
</p>
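<p>
The alignment rule for MTU values given earlier in this section, (MTU % 16) == 12, is equivalent to requiring
that the bytes left after the 28-byte IPv4/UDP header form a whole number of 16-byte AES blocks.
A short check, with a hypothetical function name:
</p>

```python
IP_UDP_HEADER = 28   # 20-byte IPv4 header + 8-byte UDP header
AES_BLOCK = 16


def payload_is_block_aligned(mtu: int) -> bool:
    """True if the payload after the IPv4/UDP header is a multiple of 16 bytes."""
    return (mtu - IP_UDP_HEADER) % AES_BLOCK == 0


for mtu in (620, 1484):
    payload = mtu - IP_UDP_HEADER
    print(mtu, payload, payload // AES_BLOCK, payload_is_block_aligned(mtu))
```

<p>
Both current values satisfy the rule: 620 leaves 592 bytes (37 blocks) and 1484 leaves 1456 bytes (91 blocks).
</p>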
<h3><a name="max">Message Size Limits</a></h3>
<p>
While the maximum message size is nominally 32KB, the practical
limit differs. The protocol limits the number of fragments to 7 bits, or 128.
The current implementation, however, limits each message to a maximum of 64 fragments,
which is sufficient for 64 * 534 = 33.3 KB when using the 608 MTU.
Due to overhead for bundled LeaseSets and session keys, the practical limit
at the application level is about 6KB lower, or about 26KB.
Further work is necessary to raise the UDP transport limit above 32KB.
For connections using the larger MTU, larger messages are possible.
</p>
<h2><a name="keys">Keys</a></h2>
<p>All encryption used is AES256/CBC with 32 byte keys and 16 byte IVs.
The MAC and session keys are negotiated as part of the DH exchange, used
for the HMAC and encryption, respectively. Prior to the DH exchange,
the publicly knowable introKey is used for the MAC and encryption.</p>
<p>When using the introKey, both the initial message and any subsequent
reply use the introKey of the responder (Bob) - the responder does
not need to know the introKey of the requester (Alice). The DSA
signing key used by Bob should already be known to Alice when she
contacts him, though Alice's DSA key may not already be known by
Bob.</p>
<p>Upon receiving a message, the receiver checks the "from" IP address and port
with all established sessions - if there are matches,
that session's MAC keys are tested in the HMAC. If none
of those verify or if there are no matching IP addresses, the
receiver tries their introKey in the MAC. If that does not verify,
the packet is dropped. If it does verify, it is interpreted
according to the message type, though if the receiver is overloaded,
it may be dropped anyway.</p>
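<p>
The key lookup order described above (session MAC keys for the source address first, then the receiver's own introKey)
can be sketched as follows. The data structures and names are hypothetical, and the standard-library HMAC-MD5 is used
purely as a stand-in for the MAC that SSU actually uses.
</p>

```python
import hashlib
import hmac


def standin_mac(key: bytes, data: bytes) -> bytes:
    """Stand-in MAC: standard HMAC-MD5 from the Python library (an assumption)."""
    return hmac.new(key, data, hashlib.md5).digest()


def select_mac_key(source, data: bytes, received_mac: bytes, sessions: dict, intro_key: bytes):
    """Return ("session", key), ("intro", key), or None if the packet should be dropped.

    sessions is a hypothetical dict mapping (ip, port) to a list of session MAC keys.
    """
    # 1. Try the MAC keys of every established session matching the "from" address.
    for key in sessions.get(source, []):
        if hmac.compare_digest(standin_mac(key, data), received_mac):
            return "session", key
    # 2. Fall back to our own introKey.
    if hmac.compare_digest(standin_mac(intro_key, data), received_mac):
        return "intro", intro_key
    # 3. Nothing verified: drop the packet.
    return None
```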
<p>If Alice and Bob have an established session, but Alice loses the
keys for some reason and she wants to contact Bob, she may at any
time simply establish a new session through the SessionRequest and
related messages. If Bob has lost the key but Alice does not know
that, she will first attempt to prod him to reply, by sending a
DataMessage with the wantReply flag set, and if Bob continually
fails to reply, she will assume the key is lost and reestablish a
new one.</p>
<p>For the DH key agreement,
<a href="http://www.faqs.org/rfcs/rfc3526.html">RFC3526</a> 2048bit
MODP group (#14) is used:</p>
<pre>
p = 2^2048 - 2^1984 - 1 + 2^64 * { [2^1918 pi] + 124476 }
g = 2
</pre>
<p>
These are the same p and g used for I2P's
<a href="{{ site_url('docs/how/cryptography') }}#elgamal">ElGamal encryption</a>.
</p>
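<p>
The formula for p can be evaluated directly with integer arithmetic. The sketch below computes pi with
Machin's formula in fixed point; the 64 guard bits are an assumption that comfortably absorbs the truncation error
of the series. The function names are hypothetical.
</p>

```python
def arctan_inverse(x: int, one: int) -> int:
    """Approximate arctan(1/x) * one with the alternating Taylor series, in integers."""
    term = one // x
    total = term
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def modp_2048_prime() -> int:
    """Evaluate 2^2048 - 2^1984 - 1 + 2^64 * ([2^1918 pi] + 124476)."""
    guard = 64                                   # extra fixed-point bits (assumption)
    one = 1 << (1918 + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inverse(5, one) - 4 * arctan_inverse(239, one)
    pi_term = pi_scaled >> guard                 # [2^1918 pi], the integer part
    return 2**2048 - 2**1984 - 1 + 2**64 * (pi_term + 124476)


P = modp_2048_prime()
G = 2
```

<p>
By construction the top 64 bits and the bottom 64 bits of p are all ones; the bits in between come from pi.
</p>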
<h2><a name="replay">Replay prevention</a></h2>
<p>Replay prevention at the SSU layer occurs by rejecting packets
with exceedingly old timestamps or those which reuse an IV. To
detect duplicate IVs, a sequence of Bloom filters is employed; the filters
"decay" periodically so that only recently added IVs are detected.</p>
<p>The messageIds used in DataMessages are defined at layers above
the SSU transport and are passed through transparently. These IDs
are not in any particular order - in fact, they are likely to be
entirely random. The SSU layer makes no attempt at messageId
replay prevention - higher layers should take that into account.</p>
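<p>
The decaying duplicate-IV check described in this section can be illustrated with two rotating generations.
In this sketch plain Python sets stand in for the Bloom filters, so it shows only the decay behaviour,
not the space savings or false-positive properties of a real Bloom filter. The class name is hypothetical.
</p>

```python
class DecayingIVFilter:
    """Remember IVs for between one and two decay periods."""

    def __init__(self):
        self.current = set()     # stand-in for the newest filter
        self.previous = set()    # stand-in for the older filter

    def seen_before(self, iv: bytes) -> bool:
        """Record the IV; return True if it was already known (a likely replay)."""
        if iv in self.current or iv in self.previous:
            return True
        self.current.add(iv)
        return False

    def decay(self) -> None:
        """Called periodically: drop the oldest generation and start a fresh one."""
        self.previous, self.current = self.current, set()
```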
<h2 id="addressing">Addressing</h2>
<p>To contact an SSU peer, one of two sets of information is necessary:
a direct address, for when the peer is publicly reachable, or an
indirect address, for using a third party to introduce the peer.
There is no restriction on the number of addresses a peer may have.</p>
<pre>
Direct: host, port, introKey, options
Indirect: tag, relayhost, port, relayIntroKey, targetIntroKey, options
</pre>
<p>Each of the addresses may also expose a series of options - special
capabilities of that particular peer. For a list of available
capabilities, see <a href="#capabilities">below</a>.</p>
<p>
The addresses, options, and capabilities are published in the <a href="{{ site_url('docs/how/networkdatabase') }}">network database</a>.
</p>
<h2><a name="direct">Direct Session Establishment</a></h2>
<p>
Direct session establishment is used when no third party is required for NAT traversal.
The message sequence is as follows:
</p>
<h3><a name="establishDirect">Connection establishment (direct)</a></h3>
<p>
Alice connects directly to Bob.
</p>
<pre>
Alice Bob
SessionRequest ---------------------&gt;
&lt;--------------------- SessionCreated
SessionConfirmed -------------------&gt;
&lt;--------------------- DeliveryStatusMessage
&lt;--------------------- DatabaseStoreMessage
DatabaseStoreMessage ---------------&gt;
Data &lt;---------------------------&gt; Data
</pre>
<p>
After the SessionConfirmed message is received, Bob sends a small
<a href="i2np_spec.html#msg_DeliveryStatus">DeliveryStatus message</a>
as a confirmation.
In this message, the 4-byte message ID is set to a random number, and the
8-byte "arrival time" is set to the current network-wide ID, which is 2
(i.e. 0x0000000000000002).
</p><p>
After the status message is sent, the peers exchange
<a href="i2np_spec.html#msg_DatabaseStore">DatabaseStore messages</a>
containing their
<a href="common_structures_spec.html#struct_RouterInfo">RouterInfos</a>.
</p><p>
It does not appear that the type of the status message or its contents matters.
It was originally added because the DatabaseStore message was delayed
several seconds; since the store is now sent immediately, perhaps
the status message can be eliminated.
</p>
<h2><a name="introduction">Introduction</a></h2>
<p>Introduction keys are delivered through an external channel
(the network database, where they are identical to the router Hash for now)
and must be used when establishing a session key. For the indirect
address, the peer must first contact the relayhost and ask them for
an introduction to the peer known at that relayhost under the given
tag. If possible, the relayhost sends a message to the addressed
peer telling them to contact the requesting peer, and also gives
the requesting peer the IP and port on which the addressed peer is
located. In addition, the peer establishing the connection must
already know the public keys of the peer they are connecting to (but
not necessarily those of any intermediary relay peer).</p>
<p>Indirect session establishment by means of a third party introduction
is necessary for efficient NAT traversal. Charlie, a router behind a
NAT or firewall which does not allow unsolicited inbound UDP packets,
first contacts a few peers, choosing some to serve as introducers. Each
of these peers (Bob, Bill, Betty, etc.) provides Charlie with an introduction
tag - a 4 byte random number - which he then makes available to the public
as a method of contacting him. Alice, a router who has Charlie's published
contact methods, first sends a RelayRequest packet to one or more of the
introducers, asking each to introduce her to Charlie (offering the
introduction tag to identify Charlie). Bob then forwards a RelayIntro
packet to Charlie including Alice's public IP and port number, then sends
Alice back a RelayResponse packet containing Charlie's public IP and port
number. When Charlie receives the RelayIntro packet, he sends off a small
random packet to Alice's IP and port (poking a hole in his NAT/firewall),
and when Alice receives Bob's RelayResponse packet, she begins a new
full direct session establishment with the specified IP and port.</p>
<!--
should Bob wait for Charlie to ack the RelayIntro packet to avoid
situations where that packet is lost yet Alice gets Charlie's IP with
Charlie not yet punching a hole in his NAT for her to get through?
Perhaps Alice should send to multiple Bobs at once, hoping that at
least one of them gets through
-->
<h3><a name="establishIndirect">Connection establishment (indirect using an introducer)</a></h3>
Alice first connects to introducer Bob, who relays the request to Charlie.
<pre>
Alice Bob Charlie
RelayRequest ----------------------&gt;
&lt;-------------- RelayResponse RelayIntro -----------&gt;
&lt;-------------------------------------------- HolePunch (data ignored)
SessionRequest --------------------------------------------&gt;
&lt;-------------------------------------------- SessionCreated
SessionConfirmed ------------------------------------------&gt;
&lt;-------------------------------------------- DeliveryStatusMessage
&lt;-------------------------------------------- DatabaseStoreMessage
DatabaseStoreMessage --------------------------------------&gt;
Data &lt;--------------------------------------------------&gt; Data
</pre>
<p>
After the hole punch, the session is established between Alice and Charlie as in a direct establishment.
</p>
<h2><a name="peerTesting">Peer testing</a></h2>
<p>The automation of collaborative reachability testing for peers is
enabled by a sequence of PeerTest messages. With its proper
execution, a peer will be able to determine its own reachability
and may update its behavior accordingly. The testing process is
quite simple:</p>
<pre>
Alice Bob Charlie
PeerTest -------------------&gt;
PeerTest--------------------&gt;
&lt;-------------------PeerTest
&lt;-------------------PeerTest
&lt;------------------------------------------PeerTest
PeerTest------------------------------------------&gt;
&lt;------------------------------------------PeerTest
</pre>
<p>Each of the PeerTest messages carries a nonce identifying the
test series itself, as initialized by Alice. If Alice doesn't
get a particular message that she expects, she will retransmit
accordingly, and based upon the data received or the messages
missing, she will know her reachability. The various end states
that may be reached are as follows:</p>
<ul>
<li>If she doesn't receive a response from Bob, she will retransmit
up to a certain number of times, but if no response ever arrives,
she will know that her firewall or NAT is somehow misconfigured,
rejecting all inbound UDP packets even in direct response to an
outbound packet. Alternately, Bob may be down or unable to get
Charlie to reply.</li>
<li>If Alice doesn't receive a PeerTest message with the
expected nonce from a third party (Charlie), she will retransmit
her initial request to Bob up to a certain number of times, even
if she has received Bob's reply already. If Charlie's first message
still doesn't get through but Bob's does, she knows that she is
behind a NAT or firewall that is rejecting unsolicited connection
attempts and that port forwarding is not operating properly (the
IP and port that Bob offered up should be forwarded).</li>
<li>If Alice receives Bob's PeerTest message and both of Charlie's
PeerTest messages but the enclosed IP and port numbers in Bob's
and Charlie's second messages don't match, she knows that she is
behind a symmetric NAT, rewriting all of her outbound packets with
different 'from' ports for each peer contacted. She will need to
explicitly forward a port and always have that port exposed for
remote connectivity, ignoring further port discovery.</li>
<li>If Alice receives Charlie's first message but not his second,
she will retransmit her PeerTest message to Charlie up to a
certain number of times, but if no response is received she knows
that Charlie is either confused or no longer online.</li>
</ul>
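<p>
The end states listed above can be summarised as a small decision function. The function name and the result labels
are hypothetical, and the final "REACHABLE" outcome (all messages received, addresses agree) is an assumption
inferred from the list rather than stated in it.
</p>

```python
from typing import Optional, Tuple

Address = Tuple[str, int]   # (ip, port) as reported back to Alice


def classify_peer_test(bob_replied: bool,
                       charlie_first: bool,
                       charlie_second: bool,
                       addr_from_bob: Optional[Address] = None,
                       addr_from_charlie: Optional[Address] = None) -> str:
    """Map which PeerTest messages Alice received to one of the end states."""
    if not bob_replied:
        # Inbound UDP blocked even for replies, or Bob is down / cannot reach Charlie.
        return "NO_REPLY_FROM_BOB"
    if not charlie_first:
        # Unsolicited inbound packets are rejected; port forwarding is not working.
        return "FIREWALLED"
    if not charlie_second:
        # Charlie is confused or no longer online.
        return "CHARLIE_UNRESPONSIVE"
    if addr_from_bob != addr_from_charlie:
        # Different 'from' ports per peer contacted.
        return "SYMMETRIC_NAT"
    return "REACHABLE"
```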
<p>Alice should choose Bob arbitrarily from known peers who seem
to be capable of participating in peer tests. Bob in turn should
choose Charlie arbitrarily from peers that he knows who seem to be
capable of participating in peer tests and who are on a different
IP from both Bob and Alice. If the first error condition occurs
(Alice doesn't get PeerTest messages from Bob), Alice may decide
to designate a new peer as Bob and try again with a different nonce.</p>
<p>Alice's introduction key is included in all of the PeerTest
messages so that she doesn't need to already have an established
session with Bob and so that Charlie can contact her without knowing
any additional information. Alice may go on to establish a session
with either Bob or Charlie, but it is not required.</p>
<h2><a name="acks">Transmission window, ACKs and Retransmissions</a></h2>
<p>
The DATA message may contain ACKs of full messages and
partial ACKs of individual fragments of a message. See
the data message section of
<a href="{{ site_url('docs/transport/ssu/spec') }}">the protocol specification page</a>
for details.
</p><p>
The details of windowing, ACK, and retransmission strategies are not specified
here. See the Java code for the current implementation.
During the establishment phase, and for peer testing, routers
should implement exponential backoff for retransmission.
For an established connection, routers should implement
an adjustable transmission window, RTT estimate and timeout, similar to TCP
or <a href="streaming.html">streaming</a>.
See the code for initial, min and max parameters.
</p>
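<p>
Because the actual parameters are left to the code, the following is only a generic sketch of the two techniques named above:
exponential backoff for establishment and peer-test retransmissions, and a TCP-style smoothed RTT estimate with a
retransmission timeout. The smoothing factors follow the common TCP convention; all names, bounds and defaults are assumptions.
</p>

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retransmission number `attempt` (0-based), doubling each time."""
    return min(cap, base * 2 ** attempt)


class RttEstimator:
    """Smoothed RTT and retransmission timeout, in seconds."""

    def __init__(self, min_rto: float = 0.1, max_rto: float = 10.0):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto
        self.max_rto = max_rto

    def update(self, sample: float) -> float:
        """Fold in one RTT measurement and return the new timeout."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample)
            self.srtt = 0.875 * self.srtt + 0.125 * sample
        return self.rto

    @property
    def rto(self) -> float:
        return min(self.max_rto, max(self.min_rto, self.srtt + 4 * self.rttvar))
```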
<h2><a name="security">Security</a></h2>
<p>
UDP source addresses may, of course, be spoofed.
Additionally, the IPs and ports contained inside specific
SSU messages (RelayRequest, RelayResponse, RelayIntro, PeerTest)
may not be legitimate.
Also, certain actions and responses may need to be rate-limited.
</p><p>
The details of validation are not specified
here. Implementers should add defenses where appropriate.
</p>
<h2><a name="capabilities">Peer capabilities</a></h2>
<dl>
<dt>B</dt>
<dd>If the peer address contains the 'B' capability, that means
they are willing and able to participate in peer tests as
a 'Bob' or 'Charlie'.</dd>
<dt>C</dt>
<dd>If the peer address contains the 'C' capability, that means
they are willing and able to serve as an introducer - serving
as a Bob for an otherwise unreachable Charlie.</dd>
</dl>
<h1><a name="future">Future Work</a></h1>
<ul><li>
Analysis of current SSU performance, including assessment of window size adjustment
and other parameters, and adjustment of the protocol implementation to improve
performance, is a topic for future work.
</li><li>
The current implementation repeatedly sends acknowledgments for the same packets,
which unnecessarily increases overhead.
</li><li>
The default small MTU value of 620 should be analyzed and possibly increased.
The current MTU adjustment strategy should be evaluated.
Does a streaming lib 1730-byte packet fit in 3 small SSU packets? Probably not.
</li><li>
The protocol should be extended to exchange MTUs during the setup.
</li><li>
Rekeying is currently unimplemented and may never be.
</li><li>
The potential use of the 'challenge' fields in RelayIntro and RelayResponse,
and use of the padding field in SessionRequest and SessionCreated, is undocumented.
</li><li>
Instead of a single fragment per packet, a more efficient
strategy may be to bundle multiple message fragments into the same packet,
so long as it doesn't exceed the MTU.
</li><li>
A set of fixed packet sizes may be appropriate to further hide the data
fragmentation from external adversaries, but the tunnel, garlic, and end-to-end
padding should be sufficient for most needs until then.
</li><li>
Why are introduction keys the same as the router hash? Should this be changed, and would there be any benefit?
</li><li>
Capacities appear to be unused.
</li><li>
Signed-on times in SessionCreated and SessionConfirmed appear to be unused or unverified.
</li></ul>
<h1>Implementation Diagram</h1>
This diagram
should accurately reflect the current implementation; however, there may be small differences.
<p>
<img src="{{ url_for('static', filename='images/udp.png') }}">
<h1><a name="spec">Specification</a></h1>
<a href="{{ site_url('docs/transport/ssu/spec') }}">Now on the SSU specification page</a>.
{% endblock %}

View File

@@ -0,0 +1,843 @@
{% extends "global/layout.html" %}
{% block title %}SSU Protocol Specification{% endblock %}
{% block content %}
Updated October 2012 for release 0.9.2
<p>
<a href="{{ site_url('docs/transport/ssu') }}">See the SSU page for an overview of the SSU transport</a>.
<h1>Specification</h1>
<h2 id="DH">DH Key Exchange</h2>
<p>
The initial 2048-bit DH key exchange is described on the
<a href="{{ site_url('docs/transport/ssu') }}#keys">SSU page</a>.
This exchange uses the same shared prime as that used for I2P's
<a href="{{ site_url('docs/how/cryptography') }}#elgamal">ElGamal encryption</a>.
</p>
<h2 id="header">Message Header</h2>
<p>
All UDP datagrams begin with a 16 byte MAC (Message Authentication Code)
and a 16 byte IV (Initialization Vector)
followed by a variable-size
payload encrypted with the appropriate key. The MAC used is
HMAC-MD5, truncated to 16 bytes, while the key is a full 32 byte AES256
key. The specific construct of the MAC is the first 16 bytes from:</p>
<pre>
HMAC-MD5(payload || IV || (payloadLength ^ protocolVersion), macKey)
</pre>
where '||' means append.
The payload is the message starting with the flag byte.
The macKey is either the introduction key or the
session key, as specified for each message below.
<b>WARNING</b> - the HMAC-MD5-128 used here is non-standard,
see <a href="{{ site_url('docs/how/cryptography') }}#udp">the cryptography page</a> for details.
<p>The payload itself (that is, the message starting with the flag byte)
is AES256/CBC encrypted with the IV and the
sessionKey, with replay prevention addressed within its body,
explained below. The payloadLength in the MAC is a 2 byte unsigned
integer.</p>
<p>The protocolVersion is a 2 byte unsigned integer
and is currently set to 0. Peers using a different protocol version will
not be able to communicate with this peer, though earlier versions not
using this flag are.</p>
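<p>
The shape of the MAC input can be sketched as follows. Note the warning above: the HMAC-MD5 variant used by SSU is non-standard,
so this example, which uses the standard-library HMAC-MD5 as a stand-in, shows only how the input is assembled and will not
produce interoperable values. Big-endian encoding of the 2-byte length field is an assumption, and the function name is hypothetical.
</p>

```python
import hashlib
import hmac
import struct

PROTOCOL_VERSION = 0   # 2-byte unsigned integer, currently 0


def packet_mac(payload: bytes, iv: bytes, mac_key: bytes) -> bytes:
    """First 16 bytes of MAC(payload || IV || (payloadLength ^ protocolVersion), macKey)."""
    if len(iv) != 16 or len(mac_key) != 32:
        raise ValueError("IV must be 16 bytes and the MAC key 32 bytes")
    # payloadLength is a 2-byte unsigned integer, XORed with the protocol version.
    length_field = struct.pack(">H", len(payload) ^ PROTOCOL_VERSION)
    # Stand-in: standard HMAC-MD5, not the non-standard variant used on the wire.
    return hmac.new(mac_key, payload + iv + length_field, hashlib.md5).digest()[:16]
```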
<p>Within the AES encrypted payload, there is a minimal common structure
to the various messages - a one byte flag and a four byte sending
timestamp (seconds since the Unix epoch). The flag byte contains
the following bitfields:</p>
<pre>
Bit order: 76543210 (bit 7 is MSB)
bits 7-4: payload type
bit 3: rekey?
bit 2: extended options included
bits 1-0: reserved
</pre>
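<p>
Decoding the flag byte is a matter of shifting and masking. A minimal sketch, with a hypothetical function name and result keys:
</p>

```python
def parse_flag(flag: int) -> dict:
    """Split the one-byte flag into its bitfields (bit 7 is the MSB)."""
    if not 0 <= flag <= 0xFF:
        raise ValueError("flag must fit in one byte")
    return {
        "payload_type": flag >> 4,              # bits 7-4
        "rekey": bool(flag & 0x08),             # bit 3
        "extended_options": bool(flag & 0x04),  # bit 2
        "reserved": flag & 0x03,                # bits 1-0
    }
```

<p>
For example, the byte 0x6C (binary 0110 1100) decodes to payload type 6 with both the rekey and extended options bits set.
</p>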
<pre>
Header: 37+ bytes
Encryption starts with the flag byte.
+----+----+----+----+----+----+----+----+
| MAC |
+ +
| |
+----+----+----+----+----+----+----+----+
| IV |
+ +
| |
+----+----+----+----+----+----+----+----+
|flag| time | (optionally |
+----+----+----+----+----+ |
| this may have 64 byte keying material |
| and/or a one+N byte extended options) |
+---------------------------------------+
</pre>
<h3 id="rekey">Rekeying</h3>
<p>If the rekey flag is set, 64 bytes of keying material follow the
timestamp.</p>
<p>When rekeying, the first 32 bytes of the keying material are fed
into a SHA256 to produce the new MAC key, and the next 32 bytes are
fed into a SHA256 to produce the new session key, though the keys are
not immediately used. The other side should also reply with the
rekey flag set and that same keying material. Once both sides have
sent and received those values, the new keys should be used and the
previous keys discarded. It may be useful to keep the old keys
around briefly, to address packet loss and reordering.</p>
<p>NOTE: Rekeying is currently unimplemented.</p>
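<p>Since rekeying is unimplemented, the following is only a sketch of the derivation described above, using SHA-256 from the Python standard library. The function name is hypothetical.</p>

```python
import hashlib


def derive_rekeyed_keys(keying_material: bytes) -> tuple:
    """Return (new MAC key, new session key) from 64 bytes of keying material."""
    if len(keying_material) != 64:
        raise ValueError("keying material must be 64 bytes")
    new_mac_key = hashlib.sha256(keying_material[:32]).digest()
    new_session_key = hashlib.sha256(keying_material[32:]).digest()
    return new_mac_key, new_session_key
```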
<h3 id="extend">Extended Options</h3>
<p>
If the extended options flag is set, a one byte option
size value is appended, followed by that many extended option
bytes.</p>
<p>NOTE: Extended options are currently unimplemented.</p>
<h2 id="padding">Padding</h2>
<p>
All messages contain 0 or more bytes of padding.
Each message must be padded to a 16 byte boundary, as required by the <a href="{{ site_url('docs/how/cryptography') }}#AES">AES256 encryption layer</a>.
Currently, messages are not padded beyond the next 16 byte boundary.
The fixed-size tunnel messages of 1024 bytes (at a higher layer)
provide a significant amount of protection.
In the future, additional padding in the transport layer up to
a set of fixed packet sizes may be appropriate to further hide the data
fragmentation to external adversaries.
</p>
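<p>The rounding to the next 16 byte boundary can be sketched as below. The helper names are illustrative, and filling the padding with zero bytes is an assumption made for the example; this section does not say what the padding bytes contain.</p>

```python
def padded_length(length: int) -> int:
    """Round a length up to the next multiple of 16 (the AES block size)."""
    return (length + 15) // 16 * 16


def pad_payload(payload: bytes) -> bytes:
    """Append padding so the payload ends on a 16 byte boundary."""
    # Zero bytes are an assumption; the padding content is not specified here.
    return payload + b"\x00" * (padded_length(len(payload)) - len(payload))
```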
<h2 id="keys">Keys</h2>
<p>
DSA signatures in the SessionCreated and SessionConfirmed messages are generated using
the
<a href="common_structures_spec.html#type_SigningPrivateKey">signing private key</a>
associated with the
<a href="common_structures_spec.html#type_SigningPublicKey">signing public key</a>
from the
<a href="common_structures_spec.html#struct_RouterIdentity">router identity</a>,
which is distributed out-of-band by publishing in the network database.
The recipient verifies the signature with that public key.
</p><p>
Both introduction keys and session keys are 32 bytes,
and are defined by the
<a href="common_structures_spec.html#type_SessionKey">Common structures specification</a>.
The key used for the MAC and encryption is specified for each message below.
</p>
<p>Introduction keys are delivered through an external channel
(the network database, where they are identical to the router Hash for now).
</p>
<h2 id="notes">Notes</h2>
<h3 id="ipv6">IPv6 Notes</h3>
While the protocol specification supports 16-byte IPv6 addresses,
IPv6 addressing is not currently supported within I2P.
All IP addresses are currently 4 bytes.
<h3 id="time">Timestamps</h3>
While most of I2P uses 8-byte <a href="common_structures_spec.html#type_Date">Date</a> timestamps with
millisecond resolution, SSU uses a 4-byte timestamp with one-second resolution.
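<p>A short sketch of the conversion from a millisecond Date value to the 4-byte SSU timestamp. The function names are hypothetical and big-endian byte order is an assumption.</p>

```python
import struct


def ssu_timestamp(date_ms: int) -> bytes:
    """Convert milliseconds since the epoch to a 4-byte seconds value."""
    return struct.pack(">I", date_ms // 1000)  # big-endian is an assumption


def parse_ssu_timestamp(raw: bytes) -> int:
    """Return the seconds since the epoch stored in a 4-byte SSU timestamp."""
    return struct.unpack(">I", raw)[0]
```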
<h2 id="messages">Messages</h2>
<p>
There are 10 messages (payload types) defined:
</p><p>
<table border="1">
<tr><th>Type<th>Message<th>Notes
<tr><td align="center">0<td>SessionRequest<td>
<tr><td align="center">1<td>SessionCreated<td>
<tr><td align="center">2<td>SessionConfirmed<td>
<tr><td align="center">3<td>RelayRequest<td>
<tr><td align="center">4<td>RelayResponse<td>
<tr><td align="center">5<td>RelayIntro<td>
<tr><td align="center">6<td>Data<td>
<tr><td align="center">7<td>PeerTest<td>
<tr><td align="center">8<td>SessionDestroyed<td>Implemented as of 0.8.9
<tr><td align="center">n/a<td>HolePunch<td>
</table>
</p>
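<p>For illustration, the payload types in the table can be written as an enumeration. The names are chosen for this example; HolePunch is omitted because it has no type number.</p>

```python
from enum import IntEnum


class PayloadType(IntEnum):
    SESSION_REQUEST = 0
    SESSION_CREATED = 1
    SESSION_CONFIRMED = 2
    RELAY_REQUEST = 3
    RELAY_RESPONSE = 4
    RELAY_INTRO = 5
    DATA = 6
    PEER_TEST = 7
    SESSION_DESTROYED = 8


def payload_type_of(flag: int) -> PayloadType:
    """Read the payload type from bits 7-4 of the header flag byte."""
    return PayloadType((flag >> 4) & 0x0F)  # raises ValueError for 9-15
```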
<h3 id="sessionRequest">SessionRequest (type 0)</h3>
<p>
This is the first message sent to establish a session.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Alice to Bob</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>256 byte X, to begin the DH agreement</li>
<li>1 byte IP address size</li>
<li>that many byte representation of Bob's IP address</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
| X, as calculated from DH |
| |
. . .
| |
+----+----+----+----+----+----+----+----+
|size| that many byte IP address (4-16) |
+----+----+----+----+----+----+----+----+
| arbitrary amount |
| of uninterpreted data |
. . .
| |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 304 bytes
</p>
<h4>Notes</h4>
<ul><li>
IP address is always 4 bytes in the current implementation.
</li><li>
The uninterpreted data could possibly be used in the future for challenges.
</li></ul>
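<p>The following sketch assembles a SessionRequest body and reproduces the typical size quoted above. The function names are hypothetical, and the size calculation assumes a 32-byte MAC and IV prefix, a 5-byte flag and timestamp, and padding of the encrypted part to a 16 byte boundary, as described earlier on this page.</p>

```python
import ipaddress


def session_request_body(x: bytes, bob_ip: str) -> bytes:
    """Return X || IP size || Bob's IP, with no uninterpreted data."""
    if len(x) != 256:
        raise ValueError("X must be 256 bytes")
    ip = ipaddress.ip_address(bob_ip).packed  # 4 bytes for IPv4
    return x + bytes([len(ip)]) + ip


def datagram_size(body_length: int) -> int:
    """Total datagram size: MAC + IV + padded (flag + time + body)."""
    encrypted = 1 + 4 + body_length
    return 16 + 16 + (encrypted + 15) // 16 * 16
```

<p>With a 4-byte address the body is 261 bytes and the datagram is 304 bytes.</p>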
<h3 id="sessionCreated">SessionCreated (type 1)</h3>
<p>
This is the response to a Session Request.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Bob to Alice</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>256 byte Y, to complete the DH agreement</li>
<li>1 byte IP address size</li>
<li>that many byte representation of Alice's IP address</li>
<li>2 byte Alice's port number</li>
<li>4 byte relay (introduction) tag which Alice can publish (else 0x00000000)</li>
<li>4 byte timestamp (seconds from the epoch) for use in the DSA
signature</li>
<li>40 byte <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the critical exchanged data
(X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port + Alice's
new relay tag + Bob's signed on time), encrypted with another
layer of encryption using the negotiated sessionKey. The IV
is reused here.</li>
<li>8 bytes padding, encrypted with an additional layer of encryption
using the negotiated session key as part of the DSA block</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey, with an additional layer of encryption over the 40 byte
signature and the following 8 bytes padding.</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
| Y, as calculated from DH |
| |
. . .
| |
+----+----+----+----+----+----+----+----+
|size| that many byte IP address (4-16) |
+----+----+----+----+----+----+----+----+
| Port (A)| public relay tag | signed
+----+----+----+----+----+----+----+----+
on time | |
+----+----+ |
| DSA signature |
+ +
| |
+ +
| |
+ +
| |
+ +----+----+----+----+----+----+
| | (8 bytes of padding)
+----+----+----+----+----+----+----+----+
| |
+----+----+ |
| arbitrary amount |
| of uninterpreted data |
. . .
| |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 368 bytes
</p>
<h4>Notes</h4>
<ul><li>
IP address is always 4 bytes in the current implementation.
</li><li>
If the relay tag is nonzero, Bob is offering to act as an introducer for Alice.
Alice may subsequently publish Bob's address and the relay tag in the network database.
</li><li>
For the signature, Bob must use his external port, as that is what Alice will use to verify.
If Bob's NAT/firewall has mapped his internal port to a different external port,
and Bob is unaware of it, the verification by Alice will fail.
</li><li>
See <a href="#keys">the Keys section above</a> for details on DSA signatures.
Alice already has Bob's public signing key, from the network database.
</li><li>
Signed-on time appears to be unused or unverified in the current implementation.
</li><li>
The uninterpreted data could possibly be used in the future for challenges.
</li></ul>
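<p>As a sketch under stated assumptions, the data covered by the DSA signature could be concatenated as shown below. The field order follows the list above; encoding each field exactly as it appears on the wire (packed IP addresses, 2-byte ports, 4-byte tag and time, all big-endian) is an assumption of this example, and the function name is hypothetical.</p>

```python
import ipaddress
import struct


def session_created_signed_data(x: bytes, y: bytes,
                                alice_ip: str, alice_port: int,
                                bob_ip: str, bob_port: int,
                                relay_tag: int, signed_on_time: int) -> bytes:
    """Concatenate the fields Bob signs, in the order given in the text."""
    return b"".join([
        x,
        y,
        ipaddress.ip_address(alice_ip).packed,
        struct.pack(">H", alice_port),
        ipaddress.ip_address(bob_ip).packed,
        struct.pack(">H", bob_port),       # Bob's external port
        struct.pack(">I", relay_tag),
        struct.pack(">I", signed_on_time),
    ])
```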
<h3 id="sessionConfirmed">SessionConfirmed (type 2)</h3>
<p>
This is the response to a Session Created message and the last step in establishing a session.
There may be multiple Session Confirmed messages required if the Router Identity must be fragmented.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Alice to Bob</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>1 byte identity fragment info:<pre>
Bit order: 76543210 (bit 7 is MSB)
bits 7-4: current identity fragment # 0-14
bits 3-0: total identity fragments (F) 1-15</pre></li>
<li>2 byte size of the current identity fragment</li>
<li>that many byte fragment of Alice's
<a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
</li>
<li>After the last identity fragment only:
<ul><li>4 byte signed-on time
</li></ul></li>
<li>N bytes padding, currently uninterpreted</li>
<li>After the last identity fragment only:
<ul><li>The last 40
bytes contain the <a href="common_structures_spec.html#type_Signature">DSA signature</a> of the critical exchanged
data (X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port
+ Alice's new relay tag + Alice's signed on time)</li>
</ul></li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>sessionKey</td></tr>
</table>
<pre>
<b>Fragment 0 through F-2</b>
+----+----+----+----+----+----+----+----+
|info| cursize | |
+----+----+----+ |
| fragment of Alice's full |
| Router Identity |
. . .
| |
+----+----+----+----+----+----+----+----+
| arbitrary amount of uninterpreted |
| data |
+----+----+----+----+----+----+----+----+
<b>Fragment F-1:</b>
+----+----+----+----+----+----+----+----+
|info| cursize | |
+----+----+----+ |
| last fragment of Alice's full |
| Router Identity |
. . .
| |
+----+----+----+----+----+----+----+----+
| signed on time | |
+----+----+----+----+ |
| arbitrary amount of uninterpreted |
| data, to 40 bytes prior to |
| end of the current packet |
+----+----+----+----+----+----+----+----+
| DSA signature |
+ +
| |
+ +
| |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 480 bytes
</p>
<h4>Notes</h4>
<ul><li>
In the current implementation, the maximum fragment size is 512 bytes.
</li><li>
The typical <a href="common_structures_spec.html#struct_RouterIdentity">Router Identity</a>
is 387 bytes, so no fragmentation is usually necessary.
</li><li>
There is no mechanism for requesting or redelivering missing fragments.
</li><li>
The total fragments field F must be set identically in all fragments.
</li><li>
See <a href="#keys">the Keys section above</a> for details on DSA signatures.
</li><li>
Signed-on time appears to be unused or unverified in the current implementation.
</li></ul>
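<p>The one-byte identity fragment info can be packed and unpacked as in this small sketch. The function names are illustrative.</p>

```python
def pack_identity_fragment_info(current: int, total: int) -> int:
    """Pack fragment number (bits 7-4) and total fragments F (bits 3-0)."""
    if not 1 <= total <= 15:
        raise ValueError("total fragments must be 1-15")
    if not 0 <= current < total:
        raise ValueError("fragment number must be 0 to total-1")
    return (current << 4) | total


def unpack_identity_fragment_info(info: int) -> tuple:
    """Return (current fragment number, total fragments)."""
    return (info >> 4) & 0x0F, info & 0x0F
```

<p>An unfragmented Router Identity is fragment 0 of 1, which gives the byte 0x01.</p>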
<h3 id="sessionDestroyed">SessionDestroyed (type 8)</h3>
<p>
The Session Destroyed message was implemented (reception only) in release 0.8.1;
it was never sent until transmission was implemented in release 0.8.9.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Alice to Bob or Bob to Alice</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td>none
</td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>sessionKey or introKey</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
| no data |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 48 bytes
</p>
<h3 id="relayRequest">RelayRequest (type 3)</h3>
<p>
This is the first message sent from Alice to Bob to request an introduction to Charlie.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Alice to Bob</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>4 byte relay (introduction) tag, nonzero</li>
<li>1 byte IP address size</li>
<li>that many byte representation of Alice's IP address</li>
<li>2 byte port number (of Alice)</li>
<li>1 byte challenge size</li>
<li>that many bytes to be relayed to Charlie in the intro</li>
<li>Alice's 32-byte introduction key (so Bob can reply with Charlie's info)</li>
<li>4 byte nonce of Alice's relay request</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey (or sessionKey, if Alice/Bob is established)</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
| relay tag |size| Alice IP addr
+----+----+----+----+----+----+----+----+
| Port (A)|size| challenge bytes |
+----+----+----+----+ +
| to be delivered to Charlie |
+----+----+----+----+----+----+----+----+
| Alice's intro key |
+ +
| |
+ +
| |
+ +
| |
+----+----+----+----+----+----+----+----+
| nonce | |
+----+----+----+----+ |
| arbitrary amount of uninterpreted data|
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 96 bytes
</p>
<h4>Notes</h4>
<ul><li>
The IP address is only included if it is different from the
packet's source address and port. In the current implementation, the
IP length is always 0 and the port is always 0, and the receiver should
use the packet's source address and port.
</li><li>
Challenge is unimplemented; challenge size is always zero.
</li></ul>
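<p>A minimal sketch of a RelayRequest body as the notes describe it for the current implementation: IP size 0, port 0, and challenge size 0. The function name is hypothetical and big-endian integers are an assumption.</p>

```python
import struct


def relay_request_body(relay_tag: int, alice_intro_key: bytes, nonce: int) -> bytes:
    """Build a RelayRequest body with no IP, no port and no challenge."""
    if relay_tag == 0:
        raise ValueError("relay tag must be nonzero")
    if len(alice_intro_key) != 32:
        raise ValueError("introduction key must be 32 bytes")
    return b"".join([
        struct.pack(">I", relay_tag),
        b"\x00",                  # IP address size 0, so no address bytes
        struct.pack(">H", 0),     # port 0: receiver uses the source port
        b"\x00",                  # challenge size 0, so no challenge bytes
        alice_intro_key,
        struct.pack(">I", nonce),
    ])
```

<p>This body is 44 bytes. Adding the 5-byte flag and timestamp, padding to 64 bytes, and prepending the 32-byte MAC and IV gives the 96 bytes quoted above.</p>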
<h3 id="relayResponse">RelayResponse (type 4)</h3>
<p>
This is the response to a Relay Request and is sent from Bob to Alice.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Bob to Alice</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>1 byte IP address size</li>
<li>that many byte representation of Charlie's IP address</li>
<li>2 byte Charlie's port number</li>
<li>1 byte IP address size</li>
<li>that many byte representation of Alice's IP address</li>
<li>2 byte Alice's port number</li>
<li>4 byte nonce sent by Alice</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey (or sessionKey, if Alice/Bob is established)</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
|size| Charlie IP | Port (C)|size|
+----+----+----+----+----+----+----+----+
| Alice IP | Port (A)| nonce
+----+----+----+----+----+----+----+----+
| arbitrary amount of |
+----+----+ |
| uninterpreted data |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 64 bytes
</p>
<h4>Notes</h4>
IP address is always 4 bytes in the current implementation.
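<p>Parsing a decrypted RelayResponse body might look like the sketch below. The function and key names are illustrative, and big-endian ports and nonce are an assumption. Trailing uninterpreted bytes are ignored.</p>

```python
import ipaddress
import struct


def parse_relay_response(body: bytes) -> dict:
    """Extract Charlie's and Alice's endpoints and the nonce."""
    offset = 0

    def read_endpoint() -> tuple:
        nonlocal offset
        size = body[offset]
        offset += 1
        ip = ipaddress.ip_address(body[offset:offset + size])
        offset += size
        (port,) = struct.unpack_from(">H", body, offset)
        offset += 2
        return str(ip), port

    charlie = read_endpoint()
    alice = read_endpoint()
    (nonce,) = struct.unpack_from(">I", body, offset)
    return {"charlie": charlie, "alice": alice, "nonce": nonce}
```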
<h3 id="relayIntro">RelayIntro (type 5)</h3>
<p>
This is the introduction for Alice, which is sent from Bob to Charlie.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Bob to Charlie</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>1 byte IP address size</li>
<li>that many byte representation of Alice's IP address</li>
<li>2 byte port number (of Alice)</li>
<li>1 byte challenge size</li>
<li>that many bytes relayed from Alice</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>sessionKey</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
|size| Alice IP | Port (A)|size|
+----+----+----+----+----+----+----+----+
| that many bytes of challenge |
+ |
| data relayed from Alice |
+----+----+----+----+----+----+----+----+
| arbitrary amount of uninterpreted data|
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 48 bytes
</p>
<h4>Notes</h4>
<ul><li>
IP address is always 4 bytes in the current implementation.
</li><li>
Challenge is unimplemented; challenge size is always zero.
</li></ul>
<h3 id="data">Data (type 6)</h3>
<p>
This message is used for data transport and acknowledgment.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Any</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>1 byte flags:<pre>
Bit order: 76543210 (bit 7 is MSB)
bit 7: explicit ACKs included
bit 6: ACK bitfields included
bit 5: reserved
bit 4: explicit congestion notification (ECN)
bit 3: request previous ACKs
bit 2: want reply
bit 1: extended data included (unused, never set)
bit 0: reserved</pre></li>
<li>if explicit ACKs are included:<ul>
<li>a 1 byte number of ACKs</li>
<li>that many 4 byte MessageIds being fully ACKed</li>
</ul></li>
<li>if ACK bitfields are included:<ul>
<li>a 1 byte number of ACK bitfields</li>
<li>that many 4 byte MessageIds + a 1 or more byte ACK bitfield.
The bitfield uses the 7 low bits of each byte, with the high
bit specifying whether an additional bitfield byte follows it
(1 = true, 0 = the current bitfield byte is the last). This
sequence of 7-bit arrays represents whether a fragment has been
received - if a bit is 1, the fragment has been received. To
clarify, assuming fragments 0, 2, 5, and 9 have been received,
the bitfield bytes would be as follows:<pre>
byte 0
Bit order: 76543210 (bit 7 is MSB)
bit 7: 1 (further bitfield bytes follow)
bit 6: 1 (fragment 0 received)
bit 5: 0 (fragment 1 not received)
bit 4: 1 (fragment 2 received)
bit 3: 0 (fragment 3 not received)
bit 2: 0 (fragment 4 not received)
bit 1: 1 (fragment 5 received)
bit 0: 0 (fragment 6 not received)
byte 1
Bit order: 76543210 (bit 7 is MSB)
bit 7: 0 (no further bitfield bytes)
bit 6: 0 (fragment 7 not received)
bit 5: 0 (fragment 8 not received)
bit 4: 1 (fragment 9 received)
bit 3: 0 (fragment 10 not received)
bit 2: 0 (fragment 11 not received)
bit 1: 0 (fragment 12 not received)
bit 0: 0 (fragment 13 not received)</pre></li>
</ul></li>
<li>If extended data included:<ul>
<li>1 byte data size</li>
<li>that many bytes of extended data (currently uninterpreted)</li></ul></li>
<li>1 byte number of fragments (can be zero)</li>
<li>If nonzero, that many message fragments. Each fragment contains:<ul>
<li>4 byte messageId</li>
<li>3 byte fragment info:<pre>
Bit order: 76543210 (bit 7 is MSB)
bits 23-17: fragment # 0 - 127
bit 16: isLast (1 = true)
bits 15-14: unused
bits 13-0: fragment size 0 - 16383</pre></li>
<li>that many bytes</li></ul>
<li>N bytes padding, uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>sessionKey</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
|flag| (additional headers, determined |
+----+ |
| by the flags, such as ACKs or |
| bitfields |
+----+----+----+----+----+----+----+----+
|#frg| messageId | frag info |
+----+----+----+----+----+----+----+----+
| that many bytes of fragment data |
. . .
| |
+----+----+----+----+----+----+----+----+
| messageId | frag info | |
+----+----+----+----+----+----+----+ |
| that many bytes of fragment data |
. . .
| |
+----+----+----+----+----+----+----+----+
| messageId | frag info | |
+----+----+----+----+----+----+----+ |
| that many bytes of fragment data |
. . .
| |
+----+----+----+----+----+----+----+----+
| arbitrary amount of uninterpreted data|
+----+----+----+----+----+----+----+----+
</pre>
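<p>The ACK bitfield encoding can be sketched as follows. The bit assignment here follows the worked example in the field list above, where fragment 0 occupies bit 6 of the first byte; that mapping, and the function names, are taken as assumptions for this illustration and should be checked against an implementation before use.</p>

```python
def encode_ack_bitfield(received: set, fragment_count: int) -> bytes:
    """Encode received fragment numbers, 7 fragments per byte."""
    nbytes = max(1, (fragment_count + 6) // 7)
    out = bytearray(nbytes)
    for frag in received:
        if not 0 <= frag < fragment_count:
            raise ValueError("fragment number out of range")
        out[frag // 7] |= 1 << (6 - frag % 7)
    for i in range(nbytes - 1):
        out[i] |= 0x80  # high bit: another bitfield byte follows
    return bytes(out)


def decode_ack_bitfield(data: bytes) -> list:
    """Return the fragment numbers marked as received."""
    received = []
    for index, byte in enumerate(data):
        for pos in range(7):
            if byte & (1 << (6 - pos)):
                received.append(index * 7 + pos)
        if not byte & 0x80:
            break  # this was the last bitfield byte
    return received
```

<p>For fragments 0, 2, 5, and 9 out of 14, this yields the two bytes 0xD2 and 0x10, matching the example.</p>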
<h4>Notes</h4>
<ul><li>
The current implementation adds a limited number of duplicate acks for
messages previously acked, if space is available.
</li><li>
If the number of fragments is zero, this is an ack-only or keepalive message.
</li><li>
The ECN feature is unimplemented, and the bit is never set.
</li><li>
The want reply bit is always set in the current implementation.
</li><li>
Extended data is unimplemented and never present.
</li><li>
The current implementation does not pack multiple fragments into a single packet;
the number of fragments is always 0 or 1.
</li><li>
As currently implemented, maximum fragments is 64
(maximum fragment number = 63).
</li><li>
As currently implemented, maximum fragment size is of course
less than the MTU.
</li><li>
Take care not to exceed the maximum MTU even if there is a large number of
ACKs to send.
</li><li>
The protocol allows zero-length fragments but there's no reason to send them.
</li><li>
In SSU, the data uses a short 5-byte I2NP header followed by the payload
of the I2NP message instead of the standard 16-byte I2NP header.
The short I2NP header consists only of
the one-byte I2NP type and 4-byte expiration in seconds.
The I2NP message ID is used as the message ID for the fragment.
The I2NP size is assembled from the fragment sizes.
The I2NP checksum is not required as UDP message integrity is ensured by decryption.
</li><li>
Message IDs are not sequence numbers and are not consecutive.
SSU does not guarantee in-order delivery.
While we use the I2NP message ID as the SSU message ID, from the SSU
protocol view, they are random numbers.
In fact, since the router uses a single Bloom filter for all peers,
the message ID must be an actual random number.
</li></ul>
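<p>The 3-byte fragment info field of a Data message can be packed as in this sketch. The function names are hypothetical, and treating the field as a 24-bit big-endian integer is an assumption consistent with the bit numbering given in the field list.</p>

```python
def pack_data_fragment_info(fragment_num: int, is_last: bool, size: int) -> bytes:
    """Pack fragment # (bits 23-17), isLast (bit 16) and size (bits 13-0)."""
    if not 0 <= fragment_num <= 127:
        raise ValueError("fragment number must be 0-127")
    if not 0 <= size <= 16383:
        raise ValueError("fragment size must be 0-16383")
    value = (fragment_num << 17) | (int(is_last) << 16) | size
    return value.to_bytes(3, "big")


def unpack_data_fragment_info(raw: bytes) -> tuple:
    """Return (fragment number, is_last, fragment size)."""
    value = int.from_bytes(raw, "big")
    return value >> 17, bool((value >> 16) & 1), value & 0x3FFF
```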
<h3 id="peerTest">PeerTest (type 7)</h3>
<p>
See <a href="{{ site_url('docs/transport/ssu') }}#peerTesting">the UDP overview page</a> for details.
</p>
<table border="1">
<tr><td align="right" valign="top"><b>Peer:</b></td>
<td>Any</td></tr>
<tr><td align="right" valign="top"><b>Data:</b></td>
<td><ul>
<li>4 byte nonce</li>
<li>1 byte IP address size</li>
<li>that many byte representation of Alice's IP address</li>
<li>2 byte Alice's port number</li>
<li>Alice's 32-byte introduction key</li>
<li>N bytes, currently uninterpreted</li>
</ul></td></tr>
<tr><td align="right" valign="top"><b>Key used:</b></td>
<td>introKey (or sessionKey if the connection has already been established)</td></tr>
</table>
<pre>
+----+----+----+----+----+----+----+----+
| test nonce |size| Alice IP addr
+----+----+----+----+----+----+----+----+
| Port (A)| |
+----+----+----+ +
| Alice or Charlie's |
+ introduction key (Alice's is sent to +
| Bob and Charlie, while Charlie's is |
+ sent to Alice) +
| |
| +----+----+----+----+----+
| | arbitrary amount of |
+----+----+----+                        |
| uninterpreted data |
+----+----+----+----+----+----+----+----+
</pre>
<p>
Typical size including header, in current implementation: 80 bytes
</p>
<h4>Notes</h4>
<ul><li>
When sent by Alice, IP address size is 0, IP address is not present, and port is 0,
as Bob and Charlie do not need this information.
</li><li>
When sent by Bob or Charlie, IP and port are present, and
IP address is always 4 bytes in the current implementation.
</li></ul>
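<p>A sketch of the PeerTest body as sent by Alice, following the first note above: IP size 0, no address, and port 0. The function name is hypothetical and big-endian integers are an assumption.</p>

```python
import struct


def peer_test_body_from_alice(nonce: int, alice_intro_key: bytes) -> bytes:
    """Build Alice's PeerTest body with no IP address and port 0."""
    if len(alice_intro_key) != 32:
        raise ValueError("introduction key must be 32 bytes")
    return b"".join([
        struct.pack(">I", nonce),
        b"\x00",                 # IP address size 0, so no address bytes
        struct.pack(">H", 0),    # port 0
        alice_intro_key,
    ])
```

<p>The body is 39 bytes. With the 5-byte flag and timestamp padded to 48 bytes, plus the 32-byte MAC and IV, the datagram is 80 bytes, as quoted above.</p>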
<h3 id="holePunch">HolePunch</h3>
<p>
A HolePunch is simply a UDP packet with no data.
It is unauthenticated and unencrypted.
It does not contain an SSU header, so it does not have a message type number.
It is sent from Charlie to Alice as a part of the Introduction sequence.
</p>
<h2><a name="sampleDatagrams">Sample datagrams</a></h2>
<b>Minimal data message (no fragments, no ACKs, no NACKs, etc)</b><br />
<i>(Size: 39 bytes)</i>
<pre>
+----+----+----+----+----+----+----+----+
| MAC |
+ +
| |
+----+----+----+----+----+----+----+----+
| IV |
+ +
| |
+----+----+----+----+----+----+----+----+
|flag| time |flag|#frg| |
+----+----+----+----+----+----+----+ |
| padding to fit a full AES256 block |
+----+----+----+----+----+----+----+----+
</pre>
<b>Minimal data message with payload</b><br />
<i>(Size: 46+fragmentSize bytes)</i>
<pre>
+----+----+----+----+----+----+----+----+
| MAC |
+ +
| |
+----+----+----+----+----+----+----+----+
| IV |
+ +
| |
+----+----+----+----+----+----+----+----+
|flag| time |flag|#frg|
+----+----+----+----+----+----+----+----+
messageId | frag info | |
+----+----+----+----+----+----+ |
| that many bytes of fragment data |
. . .
| |
+----+----+----+----+----+----+----+----+
</pre>
{% endblock %}