From 625a889bbe3d82daa399297ab07c3ccc77a0ab28 Mon Sep 17 00:00:00 2001
From: jrandom <jrandom>
Date: Mon, 19 Jul 2004 20:08:11 +0000
Subject: [PATCH] rewrite, includes new "capacity" concept (replacing the old
 "reliability" concept)

---
 pages/how_peerselection.html | 99 ++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 pages/how_peerselection.html
diff --git a/pages/how_peerselection.html b/pages/how_peerselection.html
new file mode 100644
index 00000000..80596eca
--- /dev/null
+++ b/pages/how_peerselection.html
@@ -0,0 +1,99 @@
+<p>Peer selection within I2P is simply the process of choosing which routers
+on the network we want our messages to go through (which peers will we 
+ask to join our tunnels).  To accomplish this, we keep track of how each
+peer performs (the peer's "profile") and use that data to estimate how 
+fast they are, how often they will be able to accept our requests, and 
+whether they seem to be overloaded or otherwise unable to reliably perform
+what they agree to.</p>
+
+<h2>Peer profiles</h2>
+
+<p>Each peer has a set of data points collected about them, including statistics 
+about how long it takes for them to reply to a network database query, how 
+often their tunnels fail, and how many new peers they are able to introduce 
+us to, as well as simple data points such as when we last heard from them or
+when the last communication error occurred.  The specific data points gathered
+can be found in the <a href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/PeerProfile.java?rev=HEAD&content-type=text/x-cvsweb-markup">code</a>.
+</p>
+
+<p>Currently, there is no 'ejection' strategy to get rid of the profiles for 
+peers that are no longer active (or when the network consists of thousands
+of peers, to get rid of peers that are performing poorly).  However, the size 
+of each profile is fairly small, and is unrelated to how much data is 
+collected about the peer, so that a router can keep a few thousand active
+peer profiles without any significant overhead.  Once it becomes necessary,
+we can simply compact the poorly performing profiles (keeping only the most
+basic data) and maintain hundreds of thousands of profiles in memory.  Beyond
+that size, we can simply eject the peers (e.g. keeping the best 100,000).</p>
+
+<h2>Peer summaries</h2>
+
+<p>While the profiles themselves can be considered a summary of a peer's 
+performance, to allow for effective peer selection we break each summary down 
+into four simple values, representing the peer's speed, its capacity, how well 
+integrated into the network it is, and whether it is failing.</p>
+
+<h3>Speed</h3>
+
+<p>The speed calculation (as implemented 
+<a href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/SpeedCalculator.java?rev=HEAD&content-type=text/x-cvsweb-markup">here</a>)
+simply goes through the profile and estimates how many round-trip messages we can
+send through the peer in a minute.  For this estimate it just looks at past 
+performance, weighting more recent data more heavily, and extrapolates it into the future.</p>
+
+<h3>Capacity</h3>
+
+<p>The capacity calculation (as implemented 
+<a href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/CapacityCalculator.java?rev=HEAD&content-type=text/x-cvsweb-markup">here</a>)
+simply goes through the profile and estimates how many tunnels we think the peer
+would agree to participate in over the next hour.  For this estimate it looks at 
+how many the peer has agreed to lately, how many the peer rejected, and how many
+of the agreed-to tunnels failed, and extrapolates the data.  In addition, it
+includes a 'growth factor' (when the peer isn't failing or rejecting requests) so
+that we will keep trying to push their limits.</p>
+
+<h3>Integration</h3>
+
+<p>The integration calculation (as implemented 
+<a href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/IntegrationCalculator.java?rev=HEAD&content-type=text/x-cvsweb-markup">here</a>)
+is important only for the network database (and, in turn, only when trying to focus
+the 'exploration' of the network to detect and heal splits).  At the moment it is
+not used for anything, as the detection code is not necessary for generally
+well-connected networks.  In any case, the calculation itself simply tracks how many
+times the peer is able to tell us about a peer we didn't know (or give us updated data for
+a peer we did know).</p>
+
+<h3>Failing</h3>
+
+<p>The failing calculation (as implemented 
+<a href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/IsFailingCalculator.java?rev=HEAD&content-type=text/x-cvsweb-markup">here</a>)
+keeps track of a few data points and determines whether the peer is overloaded 
+or is unable to honor its agreements.  When a peer is marked as failing, it will
+be avoided whenever possible.</p>
+
+<h2>Peer organization</h2>
+
+<p>As mentioned above, we drill through each peer's profile to come up with a
+few key calculations and, based upon those, we organize the peers into five
+groups - fast, capable, well integrated, not failing, and failing.  When the
+router wants to build a tunnel, it looks for fast peers, and when it wants
+to test peers, it simply chooses capable ones.  Within each of these groups,
+however, the peer selected is random so that we balance the load (and risk)
+across peers and prevent some simple attacks.</p>
+
+<p>The groupings are not mutually exclusive, nor are they unrelated:</p>
+<ul>
+<li>A peer is active if we have sent a message to or received a message from
+    the peer in the last few minutes.</li>
+<li>A peer is considered "high capacity" if its capacity calculation meets or 
+     exceeds the median of all active peers.</li>
+<li>A peer is considered "fast" if it is already "high capacity" and its
+    speed calculation meets or exceeds the median of all "high capacity" peers.</li>
+<li>A peer is considered "well integrated" if its integration calculation meets 
+    or exceeds the mean value of active peers.</li>
+<li>A peer is considered "failing" if the failing calculation returns true.</li>
+<li>A peer is considered "not failing" if it is neither "high capacity" nor "failing".</li>
+</ul>
+<p>These groupings are implemented in the <a 
+href="http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/java/src/net/i2p/router/peermanager/ProfileOrganizer.java?rev=HEAD&content-type=text/x-cvsweb-markup">ProfileOrganizer</a>'s
+reorganize() method (using the calculateThresholds() and locked_placeProfile() methods in turn).</p>
\ No newline at end of file
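
The grouping rules listed at the end of the page can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the ProfileOrganizer code: the class PeerGroupingSketch, the Summary type, and the Group enum are hypothetical names, the input list is assumed to already contain only active peers, and the median of an even-sized list is assumed to be the average of the two middle values. The rules are applied literally, so a peer may land in several groups.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class PeerGroupingSketch {
    enum Group { FAST, HIGH_CAPACITY, WELL_INTEGRATED, NOT_FAILING, FAILING }

    // Hypothetical flattened view of a peer profile: the four summary values.
    static final class Summary {
        final String name;
        final double speed;
        final double capacity;
        final double integration;
        final boolean failing;

        Summary(String name, double speed, double capacity, double integration, boolean failing) {
            this.name = name;
            this.speed = speed;
            this.capacity = capacity;
            this.integration = integration;
            this.failing = failing;
        }
    }

    // Median of a list; an empty list yields 0 (assumption made for this sketch).
    static double median(List<Double> values) {
        if (values.isEmpty()) return 0.0;
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        if (n % 2 == 1) return sorted.get(n / 2);
        return (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    // Arithmetic mean of a list; an empty list yields 0.
    static double mean(List<Double> values) {
        if (values.isEmpty()) return 0.0;
        double sum = 0.0;
        for (double v : values) sum += v;
        return sum / values.size();
    }

    // Returns every group the peer belongs to, given the list of active peers.
    static Set<Group> classify(Summary peer, List<Summary> activePeers) {
        List<Double> capacities = new ArrayList<>();
        List<Double> integrations = new ArrayList<>();
        for (Summary s : activePeers) {
            capacities.add(s.capacity);
            integrations.add(s.integration);
        }
        double capacityThreshold = median(capacities);
        double integrationThreshold = mean(integrations);

        // The speed threshold is the median over "high capacity" peers only.
        List<Double> highCapacitySpeeds = new ArrayList<>();
        for (Summary s : activePeers) {
            if (s.capacity >= capacityThreshold) highCapacitySpeeds.add(s.speed);
        }
        double speedThreshold = median(highCapacitySpeeds);

        Set<Group> groups = EnumSet.noneOf(Group.class);
        boolean highCapacity = peer.capacity >= capacityThreshold;
        if (highCapacity) groups.add(Group.HIGH_CAPACITY);
        if (highCapacity && peer.speed >= speedThreshold) groups.add(Group.FAST);
        if (peer.integration >= integrationThreshold) groups.add(Group.WELL_INTEGRATED);
        if (peer.failing) groups.add(Group.FAILING);
        if (!highCapacity && !peer.failing) groups.add(Group.NOT_FAILING);
        return groups;
    }
}
```

Calling classify(peer, activePeers) for each active peer gives that peer's set of groups. Recomputing the thresholds on every call keeps the sketch short; a real organizer would presumably compute them once per reorganization pass.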