ANR when getting RouterInfo #47

Open
opened 2025-04-21 14:45:17 -04:00 by idk · 6 comments
Owner

Opened 3 years ago

Last modified 2 years ago

#2096assigneddefect

ANR when getting RouterInfo

Reported by:str4dOwned by:Meeh
Priority:
minor
Milestone:
0.9.42
Component:
apps/android
Version:
0.9.31
Keywords:
I2P Android, hang
Cc:

Parent Tickets:

Sensitive:
no

Description

Seen on various Android devices:

"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 obj=0x7626f000 self=0xb4827800
| sysTid=14023 nice=-11 cgrp=default sched=0/0 handle=0xb6f56bec
| state=S schedstat=( 0 0 0 ) utm=811 stm=143 core=1 HZ=100
| stack=0xbe784000-0xbe786000 stackSize=8MB
| held mutexes=
at net.i2p.router.Router.getRouterInfo (Router.java:519)
- waiting to lock <0x20b029bd> (a java.lang.Object) held by thread 85
at net.i2p.android.router.util.Util.getNetStatus (Util.java:436)
at net.i2p.android.router.MainFragment.updateStatus (MainFragment.java:359)
at net.i2p.android.router.MainFragment.access$600 (MainFragment.java:49)
at net.i2p.android.router.MainFragment$OneShotUpdate.run (MainFragment.java:270)

The thread holding the lock isn't shown in any of the reports, but I'm guessing it is a call to Router.rebuildRouterInfo(blockingRebuild=true).

Would it make sense to move the publishing of the RouterInfo? outside the lock? Publishing happens after a call to Router.setRouterInfo(ri), at which point other callers to Router.getRouterInfo() should be able to proceed fine (and the publisher is one such caller).

Subtickets

Opened [3 years ago](/timeline?from=2017-11-25T20%3A15%3A33Z&precision=second "See timeline at Nov 25, 2017 8:15:33 PM") Last modified [2 years ago](/timeline?from=2019-06-05T18%3A43%3A08Z&precision=second "See timeline at Jun 5, 2019 6:43:08 PM") ## [\#2096](/ticket/2096)[assigned](/query?status=assigned)[defect](/query?status=!closed&type=defect) # ANR when getting RouterInfo Reported by:[str4d](/query?status=!closed&reporter=str4d)Owned by:[Meeh](/query?status=!closed&owner=meeh) Priority: [minor](/query?status=!closed&priority=minor) Milestone: [0.9.42](/milestone/0.9.42 "Completed 20 months ago (Aug 27, 2019 12:00:00 PM)") Component: [apps/android](/query?status=!closed&component=apps%2Fandroid) Version: [0.9.31](/query?status=!closed&version=0.9.31) Keywords: [I2P](/query?status=!closed&keywords=~I2P) [Android](/query?status=!closed&keywords=~Android), [hang](/query?status=!closed&keywords=~hang) Cc: Parent Tickets: Sensitive: [no](/query?status=!closed&sensitive=0) ### Description Seen on various Android devices: ``` "main" prio=5 tid=1 Blocked | group="main" sCount=1 dsCount=0 obj=0x7626f000 self=0xb4827800 | sysTid=14023 nice=-11 cgrp=default sched=0/0 handle=0xb6f56bec | state=S schedstat=( 0 0 0 ) utm=811 stm=143 core=1 HZ=100 | stack=0xbe784000-0xbe786000 stackSize=8MB | held mutexes= at net.i2p.router.Router.getRouterInfo (Router.java:519) - waiting to lock <0x20b029bd> (a java.lang.Object) held by thread 85 at net.i2p.android.router.util.Util.getNetStatus (Util.java:436) at net.i2p.android.router.MainFragment.updateStatus (MainFragment.java:359) at net.i2p.android.router.MainFragment.access$600 (MainFragment.java:49) at net.i2p.android.router.MainFragment$OneShotUpdate.run (MainFragment.java:270) ``` The thread holding the lock isn't shown in any of the reports, but I'm guessing it is a call to `Router.rebuildRouterInfo(blockingRebuild=true)`. Would it make sense to move the publishing of the RouterInfo? outside the lock? Publishing happens after a call to `Router.setRouterInfo(ri)`, at which point other callers to `Router.getRouterInfo()` should be able to proceed fine (and the publisher is one such caller). ### Subtickets
idk added this to the 0.9.42 milestone 2025-04-21 14:45:17 -04:00
idk added the
#2096
0.9.42
apps.android
labels 2025-04-21 14:45:17 -04:00
Author
Owner
Familiarizing myself with this issue, looks like the relevant chunk of code from up zzz's post is: https://i2pgit.org/i2p-hackers/i2p.android.base/-/blob/master/app/src/main/java/net/i2p/android/router/util/Util.java#L418 https://i2pgit.org/i2p-hackers/i2p.i2p/-/blob/master/apps/routerconsole/java/src/net/i2p/router/web/helpers/SummaryHelper.java#L222
Author
Owner

comment:5 Changed 2 years ago by zzz

Milestone:0.9.33 →
0.9.42Owner:
changed from str4d to Meeh

[comment:5](https://trac.i2p2.de/\#comment:5) Changed [2 years ago](https://trac.i2p2.de//timeline?from=2019-06-05T18%3A43%3A08Z&precision=second "See timeline at Jun 5, 2019 6:43:08 PM") by zzz Milestone:0.9.33 → 0.9.42Owner: changed from _str4d_ to _Meeh_
Author
Owner

comment:4 Changed 3 years ago by zzz

@str4d please fix for .34 if wasn't fixed in .33, please update this ticket

[comment:4](https://trac.i2p2.de/\#comment:4) Changed [3 years ago](https://trac.i2p2.de//timeline?from=2018-02-19T15%3A37%3A00Z&precision=second "See timeline at Feb 19, 2018 3:37:00 PM") by zzz @str4d please fix for .34 if wasn't fixed in .33, please update this ticket
Author
Owner

comment:3 Changed 3 years ago by zzz

Component:router/general →
apps/androidOwner:
set to _str4d_Status:open →
assigned

Changed to a r/w lock in 821022ec07265c32461b5106f2a72b0ef00c2c61 to be 0.9.32-15.

Reassigning to OP to look at Android Util.getNetStatus().

[comment:3](https://trac.i2p2.de/\#comment:3) Changed [3 years ago](https://trac.i2p2.de//timeline?from=2017-12-10T13%3A08%3A48Z&precision=second "See timeline at Dec 10, 2017 1:08:48 PM") by zzz Component:router/general → apps/androidOwner: set to _str4d_Status:open → assigned Changed to a r/w lock in 821022ec07265c32461b5106f2a72b0ef00c2c61 to be 0.9.32-15. Reassigning to OP to look at Android Util.getNetStatus().
Author
Owner

comment:2 Changed 3 years ago by zzz

Milestone:undecided →
0.9.33

I searched and could only find two places in the code where we call rebuildRouterInfo(true).

  • FloodfillMonitorJob?, where we switch from non-ff to ff, or vice versa. Will never happen on Android.

  • FloodfillNetworkDatabaseFacade?,shutdown(), when we are floodfill. Will never happen on Android.

However, even when blockingRebuild == false, it may take quite a while, as it calls commSystem.createAddresses() which has locks of its own and may hae a lot of work to do.

In Router, I'll change the locking to a read/write lock, but that won't help if it's stuck in createAddresses().

I think the fix for you is to just remove the call to get the router info, and the detailed checks for what's in the NTCP and SSU addresses in Util.getNetStatus(). This looks like code copied from console SummaryHelper?.reachability(). A lot of the special cases we put into specific error messages in the console aren't really necessary in Android. It's a simpler UI with much less configuration options, a lot less ability for the user to misconfigure something, and less facilities to fix something that's broken.

Now that we have a lot of states that can get returned from commSystem.getStatus(), that should be granular enough without needing to inspect the RouterInfo? contents. If not, we could consider adding yet more states to getStatus().

[comment:2](https://trac.i2p2.de/\#comment:2) Changed [3 years ago](https://trac.i2p2.de//timeline?from=2017-12-10T12%3A47%3A33Z&precision=second "See timeline at Dec 10, 2017 12:47:33 PM") by zzz Milestone:undecided → 0.9.33 I searched and could only find two places in the code where we call rebuildRouterInfo(true). - FloodfillMonitorJob?, where we switch from non-ff to ff, or vice versa. Will never happen on Android. - FloodfillNetworkDatabaseFacade?,shutdown(), when we are floodfill. Will never happen on Android. However, even when blockingRebuild == false, it may take quite a while, as it calls commSystem.createAddresses() which has locks of its own and may hae a lot of work to do. In Router, I'll change the locking to a read/write lock, but that won't help if it's stuck in createAddresses(). I think the fix for you is to just remove the call to get the router info, and the detailed checks for what's in the NTCP and SSU addresses in Util.getNetStatus(). This looks like code copied from console SummaryHelper?.reachability(). A lot of the special cases we put into specific error messages in the console aren't really necessary in Android. It's a simpler UI with much less configuration options, a lot less ability for the user to misconfigure something, and less facilities to fix something that's broken. Now that we have a lot of states that can get returned from commSystem.getStatus(), that should be granular enough without needing to inspect the RouterInfo? contents. If not, we could consider adding yet more states to getStatus().
Author
Owner

comment:1 Changed 3 years ago by zzz

Status:new →
open

Worth researching why it's the way it is. netdb.publish(), when inline, can take a while, especially with 1800+ floodfills to choose from.

[comment:1](https://trac.i2p2.de/\#comment:1) Changed [3 years ago](https://trac.i2p2.de//timeline?from=2017-11-25T21%3A35%3A07Z&precision=second "See timeline at Nov 25, 2017 9:35:07 PM") by zzz Status:new → open Worth researching why it's the way it is. netdb.publish(), when inline, can take a while, especially with 1800+ floodfills to choose from.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: I2P_Developers/i2p.android.base#47
No description provided.