@h3lix1
Contributor

@h3lix1 h3lix1 commented Sep 15, 2025

With the move to faster (but shorter-range) presets, the 7-hop limit can make it hard for some networks to grow within metro regions, which limits adoption of presets like SHORT_FAST or SHORT_TURBO in environments that could otherwise support them.

Allowing users to configure more than 7 hops could cause problems for meshes that are not prepared for it. A compromise is to let routers that have other routers in their "friend" (favorites) list relay those routers' packets without decrementing the hop limit.

Taking some inspiration from CLIENT_BASE, this uses "friended" (favorited) routers to determine whether the hop_limit is decremented. For this purpose, CLIENT_BASE is also considered a router.

This implementation treats p->relay_node as the last byte of the relaying node's ID. It first checks whether that byte matches the last byte of a favorite's node ID, and then whether that node is a router in the node DB. There could be collisions here, since there are only 256 possible values (0x00 to 0xff). There is a much smaller possibility that a client will match both checks.

For a mesh of N nodes with F favorites, the probability of at least one collision is 1 - ((256 - F) / 256)^(N - F)

Practical examples:

10 nodes, 1 favorite - 0.04 colliding nodes expected (3.5% probability of at least one collision)
50 nodes, 3 favorites - 0.55 colliding nodes expected (43% probability of at least one collision)
100 nodes, 5 favorites - 1.86 colliding nodes expected (84.5% probability of at least one collision)
200 nodes, 10 favorites - 7.42 colliding nodes expected (99.9% probability of at least one collision)
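
As a rough check of the figures above, the formula can be evaluated directly. This is a minimal sketch, assuming last bytes are uniformly distributed and the favorites all have distinct last bytes. The function names are illustrative and are not part of the firmware.

```cpp
#include <cassert>
#include <cmath>

// Probability that at least one of the (nodes - favorites) non-favorite nodes
// shares its last node-ID byte with one of the favorites.
// Assumes uniformly distributed last bytes (an idealisation).
double collisionProbability(int nodes, int favorites)
{
    assert(favorites >= 0 && favorites <= nodes && favorites <= 256);
    const double miss = (256.0 - favorites) / 256.0; // one node avoids every favorite byte
    return 1.0 - std::pow(miss, nodes - favorites);
}

// Expected number of non-favorite nodes whose last byte matches a favorite.
double expectedCollisions(int nodes, int favorites)
{
    assert(favorites >= 0 && favorites <= nodes && favorites <= 256);
    return (nodes - favorites) * (favorites / 256.0);
}
```

Under these assumptions, `collisionProbability(100, 5)` comes out at roughly 0.85 and `expectedCollisions(100, 5)` at roughly 1.86, in line with the third example.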

Even with a collision, it would require both nodes to be in close proximity to talk to each other, which makes this an edge case at best, and there is no harm done if there is a collision.

Nodes maintain a PacketHistory with sender/id to avoid sending the same packet twice. This will avoid common routing loops.

Features

  1. Preserves hop_limit when relaying between infrastructure nodes (no decrement)
  2. Only activates for FAVORITE infrastructure nodes (conservative approach)
  3. Handles ambiguity of last-byte node identification
  4. Works for both flooding and next-hop routing
  5. Supports ROUTER, ROUTER_LATE, and CLIENT_BASE roles

Supported Infrastructure Roles:

  • ROUTER - Traditional infrastructure router
  • ROUTER_LATE - Lower priority router for coverage extension
  • CLIENT_BASE - Powerful base station for favorited nodes (attic/roof nodes)

Behavior:

  • When local device is a ROUTER/ROUTER_LATE/CLIENT_BASE role
  • AND it is not the first hop
  • AND previous relay is identified as a FAVORITE infrastructure node
  • AND previous relay is also ROUTER/ROUTER_LATE/CLIENT_BASE
  • THEN hop_limit is preserved (not decremented)
  • OTHERWISE normal hop decrement applies
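
The conditions above can be sketched as a single predicate. This is only an illustration under simplified, hypothetical types (`Role`, `RelayInfo`, `Packet`, and both function names are made up for this example). The actual change is applied in FloodingRouter and NextHopRouter and may differ in detail. The sketch also assumes that a packet with an unknown hop_start (0) is never eligible.

```cpp
#include <cassert>
#include <cstdint>

enum class Role { CLIENT, CLIENT_MUTE, ROUTER, ROUTER_LATE, CLIENT_BASE };

// What the node DB lookup on the relay_node byte produced (hypothetical shape).
struct RelayInfo {
    bool known;      // a node DB entry matched the last byte
    bool isFavorite; // that entry is marked as a favorite
    Role role;       // role recorded for that entry
};

struct Packet {
    uint8_t hop_start;
    uint8_t hop_limit;
};

bool isInfrastructure(Role r)
{
    return r == Role::ROUTER || r == Role::ROUTER_LATE || r == Role::CLIENT_BASE;
}

// True when the hop_limit should be kept as-is for the rebroadcast.
bool shouldPreserveHopLimit(Role localRole, const Packet &p, const RelayInfo &prev)
{
    // The first hop always decrements, so hop_limit must already be below hop_start.
    const bool notFirstHop = p.hop_start != 0 && p.hop_limit < p.hop_start;
    return isInfrastructure(localRole) && notFirstHop && prev.known && prev.isFavorite &&
           isInfrastructure(prev.role);
}

// hop_limit to put in the rebroadcast packet.
uint8_t nextHopLimit(Role localRole, const Packet &p, const RelayInfo &prev)
{
    if (p.hop_limit == 0)
        return 0; // nothing left to spend
    return shouldPreserveHopLimit(localRole, p, prev) ? p.hop_limit
                                                      : static_cast<uint8_t>(p.hop_limit - 1);
}
```

For example, a ROUTER receiving a packet with hop_start 3 and hop_limit 2 from a favorite ROUTER would rebroadcast it with hop_limit 2, while the same packet arriving from a non-favorite relay would go out with hop_limit 1.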

Usage:

To benefit from this feature:

  1. Configure infrastructure nodes as ROUTER, ROUTER_LATE, or CLIENT_BASE role
  2. Mark trusted infrastructure nodes as favorites
  3. Messages between favorite infrastructure nodes won't consume hop counts

Safety:

  • Deduplication prevents routing loops (based on sender ID + packet ID)
  • Conservative favorite-only approach prevents unintended behavior
  • CLIENT_BASE nodes are treated as infrastructure nodes like routers
  • Backward compatible with existing mesh networks

🤝 Attestations

  • I have tested that my proposed changes behave as described.
  • I have tested that my proposed changes do not cause any obvious regressions on the following devices:
    • Heltec (Lora32) V3
    • LilyGo T-Deck
    • LilyGo T-Beam
    • RAK WisBlock 4631
    • Seeed Studio T-1000E tracker card
    • Other (please specify below)
      Station G2

…ation

- Preserve hop_limit when both local device and previous relay are routers/CLIENT_BASE
- Only preserve hops for favorite routers to prevent abuse
- Apply to both FloodingRouter and NextHopRouter
- Update hop counting logic in MeshService for router-to-router communication

This allows routers to communicate over longer distances without
consuming hop limits, improving mesh network efficiency for
infrastructure nodes.
@CLAassistant

CLAassistant commented Sep 15, 2025

CLA assistant check
All committers have signed the CLA.

This reverts the protobufs submodule back to a84657c22 to remove
unintended changes from this branch.
@thebentern thebentern requested a review from GUVWAF September 15, 2025 10:47
@thebentern
Contributor

This is a really neat idea for extending the range of the higher bandwidth presets. I do think it should probably be gated for presets below the LONG_FAST link budget.

Member

@GUVWAF GUVWAF left a comment

I’m not totally against the idea, but there is a problem with this, namely that in several places the firmware relies on the hop limit decrementing at each node for routing decisions. For example:

bool isRepeated = p->hop_start > 0 && p->hop_start == p->hop_limit;

And:
(p->hop_start != 0 && p->hop_start == p->hop_limit &&

When not decrementing the hop limit, it thinks the packet was received directly.
Furthermore, it messes with the “hopsAway” counter and also traceroutes that include “unknown” nodes.
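
To illustrate the concern, here is a minimal sketch, assuming a simplified header struct (`PacketHeader` and `looksDirect` are illustrative names, not firmware symbols), of why a check of this shape cannot tell an original packet from one that was relayed without a decrement.

```cpp
#include <cassert>
#include <cstdint>

struct PacketHeader {
    uint8_t hop_start; // hop limit the sender started with (0 = unknown)
    uint8_t hop_limit; // hops remaining as received
};

// Same shape as the checks quoted above: equal values are read as "no relay touched it".
bool looksDirect(const PacketHeader &p)
{
    return p.hop_start != 0 && p.hop_start == p.hop_limit;
}
```

A packet sent with `{3, 3}` and relayed normally arrives as `{3, 2}`, so `looksDirect` is false. If the relay keeps the hop limit, the packet arrives as `{3, 3}` again and is indistinguishable from the original transmission.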

That said, I believe meshes in metro regions would benefit from #5534 and I would give higher priority in getting that one merged than this more pervasive PR.

@GUVWAF
Member

GUVWAF commented Sep 15, 2025

> Evidence shows in the Bay Area Mesh that the lowest TTL is usually the one that reaches the far nodes (50km+ distance) first anyway with MediumSlow, as logged via MQTT and shown in the logs.

But how do you know that at the point where it is ready to rebroadcast (after scanning the channel and random delays due to CSMA/CA), it hasn’t received any duplicates with a higher hop limit?

> We're also reaching limits to what MediumSlow can provide us, and ShortFast/ShortTurbo would be ideal.

While indeed the impact of a higher hop limit on channel utilization is limited when using a faster preset, I’m not sure the overall reliability will increase. At each transmission, there is a chance it fails, e.g. due to collisions, external interference/noise, or just a marginal link budget. Thus, the chance that more than 8 transmissions (7 hops) in a row succeed gets lower and lower.

> that is possible only if every node in the path is a router, which seems odd on many levels.

I’m also talking about any intermediate node, so only two is enough. For example, node A sends a packet that gets heard by two nodes B and C, where B is a favorite router that does not decrement the hop limit. Once C hears the rebroadcast of B with the original hop limit, it thinks it’s a repeated packet from A and hence tries to rebroadcast again, while it shouldn’t. This can only be fixed by relying on the relay_node, but that’s only present from firmware >=2.6.

> As for the NextHopRouter - it also depends on the original sender being a router (not client).. and again, not sure why a router would retransmit to trip this code.

Yes, for example when you request position/NodeInfo/traceroute/telemetry of a router. A node receiving a rebroadcast from another router without decremented hop limit thinks it’s the original message.

@h3lix1
Contributor Author

h3lix1 commented Sep 16, 2025

> But how do you know that at the point where it is ready to rebroadcast (after scanning the channel and random delays due to CSMA/CA), it hasn’t received any duplicates with a higher hop limit?

We don't. In the bay mesh logger, we log the first packet that comes into the node, and generally it is correct. We can see the hops fan out from each route appropriately via traceroute or MQTT data. The routers are all well connected, and we don't see evidence a lower hop packet arrives after a higher one. I'd be willing to run test code on my router though to validate how often packets like this arrive out of order.

> While indeed the impact of a higher hop limit on channel utilization is limited when using a faster preset, I’m not sure the overall reliability will increase. At each transmission, there is a chance it fails, e.g. due to collisions, external interference/noise, or just a marginal link budget. Thus, the chance that more than 8 transmissions (7 hops) in a row succeed gets lower and lower.

I don't know how other meshes operate when using routers, but in the bay mesh if a router doesn't receive a message (due to noise, collision, etc) chances are it will receive it from another router. Same with clients. The core routers of the environment are highly meshed with each router having 3-4 others it hears. Reliability is solved through redundancy.
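
As a back-of-the-envelope illustration of both arguments, the sketch below compares a single chain of transmissions with several redundant paths. It assumes every transmission succeeds independently with the same probability and that the redundant paths are fully independent. Real mesh paths share nodes and airtime, so this is an optimistic simplification, and the function names are illustrative.

```cpp
#include <cassert>
#include <cmath>

// Probability that every transmission along one path succeeds.
double pathSuccess(double perHopSuccess, int transmissions)
{
    assert(perHopSuccess >= 0.0 && perHopSuccess <= 1.0 && transmissions >= 0);
    return std::pow(perHopSuccess, transmissions);
}

// Probability that at least one of `paths` independent paths delivers the packet.
double redundantSuccess(double perHopSuccess, int transmissions, int paths)
{
    assert(paths >= 1);
    const double singleFailure = 1.0 - pathSuccess(perHopSuccess, transmissions);
    return 1.0 - std::pow(singleFailure, paths);
}
```

With an assumed 90% success rate per transmission, `pathSuccess(0.9, 8)` is about 0.43, while `redundantSuccess(0.9, 8, 3)` is about 0.82.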

> I’m also talking about any intermediate node, so only two is enough. For example, node A sends a packet that gets heard by two nodes B and C, where B is a favorite router that does not decrement the hop limit. Once C hears the rebroadcast of B with the original hop limit, it thinks it’s a repeated packet from A and hence tries to rebroadcast again, while it shouldn’t. This can only be fixed by relying on the relay_node, but that’s only present from firmware >=2.6.

This is not completely correct... If node A sends a packet that gets heard by nodes B and C, both B and C will decrement the hop limit since node A is not a favorite router, and will not fit the criteria. The only way this gets weird is if A, B, and C are all routers, A is liked by B and C, and A is the sender. Your point is still valid and correct. I'll fix it. As long as hop_limit is less than hop_start the rest should be good.

Thank you for this input. I appreciate it.

@GUVWAF
Member

GUVWAF commented Sep 16, 2025

> I'd be willing to run test code on my router though to validate how often packets like this arrive out of order.

That would be nice. It would be a shame if this would get nullified just by packets arriving via other routes first, which is common in dense and active meshes (where routers often have multiple packets in the queue).

> The core routers of the environment are highly meshed with each router having 3-4 others it hears. Reliability is solved through redundancy.

I agree that indeed redundancy helps a lot here, but having 4-5 routers in each other’s range is a rather unique set-up. (And to be honest also not really recommended if they are real routers (not ROUTER_LATE) as they all try to rebroadcast in the same short window).
Like I said, I’m not totally against this idea, and we’ll just have to see how it performs in real-life.

> The only way this gets weird is if A, B, and C are all routers,

A and B yes, but C, no. My point was that any CLIENT hearing a rebroadcast without decremented hop limit will think it’s a repeated original packet and does not refrain from rebroadcasting. But your fix solves this, so that’s good.

What remains are the more cosmetic anomalies like incorrect “hops away” counts, traceroutes and NeighborInfo.

@h3lix1
Contributor Author

h3lix1 commented Sep 16, 2025

> What remains are the more cosmetic anomalies like incorrect “hops away” counts, traceroutes and NeighborInfo.

From my testing (admittedly, limited) traceroutes work as intended and ignore the hop_start and hop_limit, although it can reach ROUTE_SIZE which is limited to 8. It looks like it's a graceful fail (with error) but without completely re-doing the protobufs, I can't find a good solution here. This one we're definitely limited by the payload size as increasing it past 8 will break a lot of things. I don't have a good solve other than it works until 8 hops.

On the other side, it will only show "unknown" hops as long as the number of hopsTaken is less than the hop_start - hop_limit. I guess this is a blessing and a curse: the hops work as intended, but if there is a missing unknown hop somewhere, it will be masked. I personally don't see this as a dealbreaker given our current constraints.

// Only insert unknown hops if hop_start is valid
if (p.hop_start != 0 && p.hop_limit <= p.hop_start) {
    uint8_t hopsTaken = p.hop_start - p.hop_limit;

For NeighborInfo the previous change of decreasing hop_limit will fix all the issues there, as it only cares about local hops.

 else if (mp.hop_start != 0 && mp.hop_start == mp.hop_limit) {
         // If the hopLimit is the same as hopStart, then it is a neighbor

I'm starting to appreciate my first-hop fix a little more now.

Hops away counts seem to be a simple calculation of hop_start - hop_limit, which will treat the "favorite routers" as a single hop in calculating this. I would say this is preferred, even if it is a lie, while working within the confines of the current protobuf implementation.
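
A small sketch of that effect, assuming hops away is exactly hop_start - hop_limit as described. The function name and the path representation (one flag per relay saying whether it preserved the hop limit) are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulates a packet travelling through a chain of relays and returns the
// "hops away" value the final receiver would compute as hop_start - hop_limit.
// relayPreserved[i] is true when relay i rebroadcasts without decrementing.
uint8_t hopsAwayAfter(uint8_t hopStart, const std::vector<bool> &relayPreserved)
{
    uint8_t hopLimit = hopStart;
    for (bool preserved : relayPreserved) {
        if (hopLimit == 0)
            break; // out of hops, the packet would not be relayed further
        if (!preserved)
            --hopLimit;
    }
    return static_cast<uint8_t>(hopStart - hopLimit);
}
```

For a packet that starts with 7 hops and passes through four relays, of which the middle two preserve the hop limit, the receiver sees 2 hops away instead of 4.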

@GUVWAF
Member

GUVWAF commented Sep 16, 2025

> I don't have a good solve other than it works until 8 hops.

> but if there is a missing unknown hop somewhere, it will be masked.

> which will treat the "favorite routers" as a single hop in calculating this.

Yes, I think we'll need to live with the above limitations.

> For NeighborInfo the previous change of decreasing hop_limit will fix all the issues there, as it only cares about local hops.

Indeed, nice.

@h3lix1 h3lix1 force-pushed the feat/router-hop-preservation branch from f9394ff to 5e91335 Compare September 16, 2025 23:18
@h3lix1 h3lix1 requested a review from GUVWAF September 17, 2025 18:03
@h3lix1
Contributor Author

h3lix1 commented Sep 17, 2025

I am currently unable to add labels to fix the remaining failing check.

@GUVWAF GUVWAF added the enhancement New feature or request label Sep 17, 2025
Member

@GUVWAF GUVWAF left a comment

@thebentern do we still want to gate this for faster presets only?

In principle I'm OK with this, but still I would prefer to have it merged after #5534, or at least at the same time to get the most out of this.

h3lix1 added a commit to erayd/meshtastic-firmware that referenced this pull request Sep 18, 2025
- Replace multiple individual role checks with cleaner IS_ONE_OF macro
- Add CLIENT_BASE support as suggested in PR meshtastic#7992
- Include MeshTypes.h for IS_ONE_OF macro
- Makes code more maintainable and consistent with other parts of codebase
@h3lix1
Contributor Author

h3lix1 commented Oct 9, 2025

@shalberd

I thought about this for about 2 minutes before deciding to include CLIENT_BASE, but the thought process was something like this...

  • People with routers (router, router_late, etc) will likely not favorite other ROUTERs to keep them in the phone/nodedb as a normal course of business
  • If they do, that's still fine unless the favorited ROUTER is also a neighbor (less likely)
  • And even then.. the effects are minimal?

The assumption is that if someone is using a CLIENT_BASE node, they're not using the node as a CLIENT since the whole purpose of CLIENT_BASE is to forward messages from inside a structure outwards (please let me know if this is incorrect). The hope is users would just use CLIENT if the node is on the roof and they're connecting to it directly.

The reason for including CLIENT_BASE is to allow CLIENT_BASE nodes to repeat a packet from a favorite ROUTER (one hopefully nearby) before the packet runs out of hops.

I'll need to look at the CLIENT_BASE code to see if it discriminates favorite CLIENTs from ROUTERs when making this determination of sending a packet as priority content and sends it before other ROUTERs have a chance to send it.

Technically, it's adding a 0-cost hop to the path to other routers, and other routers will forward the message anyway however it is received.

But I do see the edge case. The most logical fix in my eyes is to make it so only favorite non-repeating roles (client, client_mute, etc) can make the priority window when using CLIENT_BASE

@shalberd

shalberd commented Oct 9, 2025

> since the whole purpose of CLIENT_BASE is to forward messages from inside a structure outwards

Yes, that is correct. Either from inside a structure or from a mobile node, say a mobile client_mute node, where you want to prevent client nodes in the trenches of city buildings from passing a packet on when the rooftop Client_Base node would be better suited.

> People with routers (router, router_late, etc) will likely not favorite other ROUTER to keep them in the phone/nodedb as a normal course of business

They might now, since this feature (no hop count reduction for packets relayed by trusted / favorite client_base, router, or router_late nodes) is favorite-based, and favoriting of nodes by client_base is also a part of the client_base PR #7873.
We have so many nodes around here, about 400, and a node limit between 80 and 250, depending on the hardware used, that favoriting just to keep some nodes in the nodeDB is actually a valid thing people are doing.

Basically, your changes and the ones from that PR overlap in how they use the idea of favorites.

> I'll need to look at the CLIENT_BASE code to see if it discriminates favorite CLIENTs from ROUTERs when making this determination of sending a packet as priority content and sends it before other ROUTERs have a chance to send it.

Yeah, this is a bit too deep at this moment in time in the evening for me, but I think I get the idea of what you are checking in the code. In our experience, with too many Routers around, problems quickly arise, even if it's just two routers. We decided to put our mountain nodes on ROUTER_LATE here in Switzerland.

> The hope is users would just use CLIENT if the node is on the roof and they're connecting to it directly.

Yes, people in our Swiss network who have that possibility are doing so, depending on BLE / Bluetooth range conditions, the BLE antenna used and so on. Though we have a lot of condo-style, multi-story houses, where being on floor 0 or 1 means one won't be able to connect to a rooftop node via Bluetooth anymore. So people are excited, especially those with access to a roof, about the new CLIENT_BASE role in general.

> But I do see the edge case. The most logical fix in my eyes is to make it so only favorite non-repeating roles (client, client_mute, etc) can make the priority window when using CLIENT_BASE

@h3lix1 yes, that would be great, thank you for having taken my thoughts into account. That is THE added bonus / edge case prevention feature.

@h3lix1
Contributor Author

h3lix1 commented Oct 10, 2025

@shalberd
Working on the MR for this, I'm actually struggling to find the use case that we're protecting against.
Favorited: (f)
CLIENT_BASE (f) -> CLIENT_BASE (f) - The first hop always reduces hop_limit (no 0 hop)
CLIENT (f) -> ROUTER (f) -> CLIENT_BASE - Works as intended if the first client is favorite. 0 hop rule is in effect
CLIENT (f) -> ROUTER -> CLIENT_BASE - Works as intended, no 0 hop rule is in effect
CLIENT -> ROUTER (f) -> CLIENT_BASE - Works as intended - no priority given to packet (CLIENT not favorited). 0 hop rule is in effect

CLIENT_BASE strictly uses p->from and p->to for its decision making. The 0 hop routing focuses on p->relay_node to make its decisions about the packet. Combined with the first-hop rule (the first hop always decrements) there isn't a case where two client_base favorited neighboring nodes will cause an issue.

I think this is working perfectly as-is. I'm unable to determine an edge case where the two will conflict. The only edge cases I can think of are the ones CLIENT_BASE introduces itself (i.e. client (f) -> router -> client_base (f') -> router), where CLIENT_BASE adds an extra hop between routers.

Ping me on discord (h3lix) and let's see if we can find a way this can cause an issue.

@shalberd

shalberd commented Oct 11, 2025

@h3lix1 thinking about this more, I think we will be ok. Definitely make sure to write a blog article and update the documentation on the topic of non-hop-decrement / 0 cost hops for favorited ROUTER/ROUTER_LATE/CLIENT_BASE, because otherwise I can well imagine ROUTER_LATE or CLIENT_BASE operators accidentally favoriting each other when that is not even intended.
This 0 cost hop feature is very useful, but, as always, when people are not aware of the relation to favorited nodes on those device roles, they might accidentally cause 0 cost hops and network problems.
As mentioned ... in a large mesh with >250 nodes, favorites are also used to keep nodes on the phone app, to keep them from disappearing. Especially since we work a lot with nrf52 based devices that only have room for 80 nodes in memory.
The Android or iPhone app serves as a kind of extended node detail memory in that case, if I understand correctly.

@shalberd

shalberd commented Oct 11, 2025

> there isn't a case where two client_base favorited neighboring nodes will cause an issue.

We have very well-visible mountain nodes, ROUTER_LATE, that can see each other well. Those people I will definitely have to tell NOT to favorite their mountain nodes. Same for more than 2 CLIENT_BASE nodes that can see each other well, in different cities, each on a higher mountain ... that could have unintended consequences if they are on each other's favorite lists.
@h3lix1 @GUVWAF @thebentern I'd really prefer some other field than "favorite" for this feature, but I understand the protobuf memory constraints you have at Meshtastic.

@h3lix1
Contributor Author

h3lix1 commented Oct 11, 2025

@shalberd

If you don't mind, can you draw a diagram that explains what drawbacks you see with 0-cost hops?

I want to make sure I'm solving for the right scenario.

@shalberd

shalberd commented Oct 12, 2025

@h3lix1 @GUVWAF @thebentern

here is my fear diagram regarding node favorites and their role in non-decrementing / 0 cost hops.
I fear that, in an area where the geography is very good, inadvertently setting other Router, Router_Late and Client_Base nodes as favorites could result in endless hops and circulating packets:

[Image: 0 hop decrement endless ripple feat]

We had scenarios with 60 - 80 km hops and round trips of Switzerland before and were able to prevent excessive early rebroadcasting by not using Router anymore, only Router_Late.

The new combination of Client_Base re-transmitting in the early contention window, plus the 0 cost hop feature, all based on favorites, worries me. Maybe those worries are unfounded, but it leaves a kind of question mark with me.

https://github.com/orgs/meshtastic/discussions/409#discussioncomment-14523768

@erayd
Contributor

erayd commented Oct 12, 2025

> ...could result in endless hops and circulating packets.

Meshtastic has duplicate packet detection. If the same packet is received again by a node, that node will simply ignore it. This prevents the 'endless loop' scenario you are concerned about.

@shalberd

shalberd commented Oct 13, 2025

> Meshtastic has duplicate packet detection. If the same packet is received again by a node, that node will simply ignore it.

That is not what we observed with role Router and our Router nodes, which are very well-visible in terms of topography. This was on firmware 2.6.11; possibly this has changed for the better very recently, e.g. with #8216 and #8148

@erayd
Contributor

erayd commented Oct 13, 2025

> > Meshtastic has duplicate packet detection. If the same packet is received again by a node, that node will simply ignore it.

> That is not what we observed with role Router and our very well-visible, in terms of topography, Router nodes. Speaking of firmware 2.6.11, possibly this has changed for the better very recently, e.g. with #8216 and #8148

It's not new; the ignore-dupes functionality has existed for ages and was definitely present in v2.6.11. That is one of the things that the shouldFilterReceived() functions do. The dupe normally just gets dropped, and is not processed or rebroadcast a second time. There are a few edge cases where rebroadcasting again is appropriate (e.g. when the original sender is retrying), but these aren't cases which could cause the kind of infinite rebroadcasting you seem to be concerned about.

Can you please clarify exactly what it was that you were observing? I would be extremely surprised if you have actually observed a node repeatedly rebroadcasting the same packet outside of one of the intended exceptions (although if you have, then there's a bug somewhere that needs squashing ASAP).

@shalberd

shalberd commented Oct 13, 2025

> Can you please clarify exactly what it was that you were observing

early rebroadcast / contention window, multiple routers, role ROUTER, all broadcasting at the same time.

https://github.com/orgs/meshtastic/discussions/409#discussioncomment-14523768

One user mentions there:

> This highlights the directional message sending problem that even Client nodes can have with each other. It is just more obvious and likely with nodes that have higher gain antennas and placed in very good locations.

Now me again:

That, together with Client_Base now using the early rebroadcast window / Router-type logic for favorite-based routing, makes me worry about this 0-cost-forwarding feature in our alpine scenarios.

We have switched our alpine nodes to role ROUTER_LATE and don't have that problem anymore, but as mentioned, Client_Base utilizing Router, not Router_Late, type behavior for favorited nodes, along with 0 cost hop forwarding for favorited nodes, has the potential, as an attack vector, to cause massive problems.

Somewhere else, someone from either Canada or NZ mentioned this: technical vs. social engineering and coordination question.
#4060 (comment)

We are all learning, and we appreciate the effort put into Meshtastic. Just please don't put in obvious config traps that could be avoided, or try finding a balance between complete dependency on user good-will (especially problematic in larger networks) and functionality.

@erayd
Contributor

erayd commented Oct 13, 2025

> early rebroadcast / contention window, multiple routers, role ROUTER, all broadcasting at the same time.
>
> https://github.com/orgs/meshtastic/discussions/409#discussioncomment-14523768

That is completely unrelated, and has nothing to do with your infinite-rebroadcast concern. Those are mandatory-rebroadcast roles, they are supposed to rebroadcast packets. But as I pointed out above, as a general rule no node will rebroadcast the same packet a second time. Duplicate detection prevents them from doing so. Once they have received a packet, they remember doing so, and simply ignore it on any subsequent occasion they happen to hear that packet again.

Are you perhaps confusing the concept of duplicate detection (i.e. that a node will only act on the same packet once, regardless of how many times it is received), with the concept of cancelling pending rebroadcasts of packets where another node was heard to rebroadcast it?

As an example of the former, let's say you have 50 ROUTER nodes, and they can all see each other. A packet arrives. Each of those 50 routers will rebroadcast that packet precisely once, and will then proceed to ignore any further instances of that packet which they might subsequently hear. It ends up on the air 50 times as a result (because there are 50 routers), but there is no routing loop to worry about. The hop limit has no impact on a node's duplicate-detection behaviour.
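
A toy model of that 50-router example, for illustration only. `SeenPackets` is a hypothetical stand-in for the firmware's packet history (it is not the real `shouldFilterReceived()` logic), it keys on sender ID plus packet ID, and unlike a real device it never forgets entries.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Remembers every (sender, packet id) pair a node has already handled.
class SeenPackets
{
  public:
    // Returns true only the first time a given (sender, id) pair is reported.
    bool firstSighting(uint32_t sender, uint32_t id) { return seen.emplace(sender, id).second; }

  private:
    std::set<std::pair<uint32_t, uint32_t>> seen;
};

// Counts how often one packet goes on the air when `routers` routers all hear
// each other: every router hears the packet `routers` times in total, but
// rebroadcasts only on the first sighting.
int rebroadcastCount(int routers)
{
    assert(routers >= 0);
    std::vector<SeenPackets> history(routers);
    const uint32_t sender = 0x1234abcd, id = 42; // arbitrary example values
    int transmissions = 0;
    for (int r = 0; r < routers; ++r) {
        for (int heard = 0; heard < routers; ++heard) {
            if (history[r].firstSighting(sender, id))
                ++transmissions; // later copies are dropped as duplicates
        }
    }
    return transmissions;
}
```

In this model `rebroadcastCount(50)` is 50 no matter how many copies each router hears, and nothing in it depends on the hop limit.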

> Client_Base utilizing Router, not Router_Late, type behavior for favorited nodes, along with 0 cost hop forwarding for favorited nodes, has the potential, as an attack vector, to cause massive problems.

This is still offtopic, but yes - you're absolutely correct on that. Meshtastic is inherently a very easy system to initiate denial-of-service attacks against. There's not really much you can do to avoid that unfortunately. Tweaking things to make it harder to accidentally cause problems is quite helpful (e.g. removal of the repeater role), but ultimately an attacker who is actively malicious is always going to find it pretty easy to cause major problems.

I do feel that the CLIENT_BASE role should be using the same timing behaviour as ROUTER_LATE though, not ROUTER, and I agree with you that having it in the early window has the potential to cause a number of frustrating problems. However others seem to feel that the problems arising from running it in the early window can be safely ignored, and it was ultimately implemented in the early window anyway. We'll have to wait and see what happens once it starts seeing significant real-world use.

> Just please don't put in obvious config traps that could be avoided...

What did you have in mind re config traps? That isn't something that anybody here wants, so if you see one, please by all means point it out 🙂

@shalberd

shalberd commented Oct 14, 2025

> Are you perhaps confusing the concept of duplicate detection (i.e. that a node will only act on the same packet once, regardless of how many times it is received), with the concept of cancelling pending rebroadcasts of packets where another node was heard to rebroadcast it

yes, I think so, thanks for clearing that up

> It ends up on the air 50 times as a result (because there are 50 routers), but there is no routing loop to worry about. The hop limit has no impact on a node's duplicate-detection behaviour.

Mhh, when we did traceroutes between our Router mountain nodes, there might not have been a loop, but because of it being transmitted via 4 routers in our case, the hop limit was reached prematurely. I guess it wasn't simultaneously in that case, I see your point.

Since we switched to Router_Late for our mountaintop nodes, the situation has much improved because we do indeed have good paths between Client role nodes as well.

> tweaking things to make it harder to accidentally cause problems is quite helpful (e.g. removal of the repeater role),

> What did you have in mind re config traps? That isn't something that anybody here wants

Config trap 1) the use of favorites in the two features we are talking about

@compumike A user mentioned, in ticket "CLIENT_BASE role unwanted behavior" (#8338), the issue of a Client_Base operator marking another user's node as a favorite, instead of only the apartment node intended for the rooftop client_base node. That is a classic case of misconfiguration, and I don't think using "favorites" is something we can fault users for.
Scenario a) Let's say the Client_Base operator wanted to maliciously direct away traffic from the client he favorited but that is not his: then it is just bad luck.
Scenario b) But if the Client_Base operator just saw favorites as a way of marking favorite nodes and keeping them from being rolled off the node DB memory ... which is what it was before ... then you have a classic case of a config trap here. Either way, the aggressive Router type behavior of client_base for favorites makes the situation worse.

> agree with you that having it in the early window has the potential to cause a number of frustrating problems. However others seem to feel that the problems arising from running it in the early window can be safely ignored, and it was ultimately implemented in the early window anyway

We are already seeing problems in the wild with this, issue 8338, but also when I discuss with early adopters.

  • Timing should definitely be changed to Router_Late for CLIENT_BASE favorited nodes
  • another field than "favorite" should be found for both 0 cost hop and client_base preferred nodes. Favorite is just that, a favorite.

Just to summarize and be clear: I think both features are really cool and useful: 0 cost hops and CLIENT_BASE preferential routing. I just think that, because the notion of a favorite becomes a config trap, they will cause more problems than they solve.

cross-linking #8338 (comment)

Config trap 2) when setting user HAM mode is_licensed, override duty cycle is set to true https://github.com/meshtastic/firmware/blob/master/src/modules/AdminModule.cpp#L1339
On both EU_433 and EU_868, overriding the duty cycle is not allowed: on EU_868 because it is not an amateur radio band, and on EU_433 because the duty cycle restriction applies even in HAM mode on the 70cm HAM band.

Plus, at least on the Android app, users would not even see that setting being enabled / true (override duty cycle) meshtastic/Meshtastic-Android#3324 (comment)
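
A minimal sketch of the kind of guard that would avoid this trap, written in Python for illustration only. The function name `resolve_override_duty_cycle` and the region set are hypothetical, the set contains only the two regions named in the comment above, and this is not how AdminModule.cpp behaves today:

```python
# Regions where, per the comment above, the duty cycle may not be overridden.
# Assumption: only the two regions named in this thread are listed here.
DUTY_CYCLE_LIMITED_REGIONS = {"EU_433", "EU_868"}


def resolve_override_duty_cycle(region, is_licensed):
    """Decide whether override_duty_cycle may be enabled (hypothetical helper)."""
    if region in DUTY_CYCLE_LIMITED_REGIONS:
        # Licensed or not, the duty cycle restriction still applies here.
        return False
    # Elsewhere, keep the behaviour described above: HAM mode turns the override on.
    return bool(is_licensed)
```

With such a guard, setting is_licensed on an EU_868 node would leave the override off, while other regions would keep the behaviour described above.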

@korbinianbauer
Contributor

korbinianbauer commented Oct 17, 2025

Dumb question maybe, but how do you even favorite another Router on a Router-Node? I don't see how that's available via remote administration.

In AdminModule.cpp I've found that a node that sent an authorized admin message is automatically added to the favorites, but that's not something one Router does to another (or is it?).

@shalberd

Hi @korbinianbauer

Dumb question maybe, but how do you even favorite another Router on a Router-Node

That only seems possible when connected via USB or Bluetooth.

@shalberd

shalberd commented Oct 17, 2025

@h3lix1 @fifieldt
With is_ignored nodes, it seems we also set is_favorite somewhere in the background: when I add a node to the ignore list (setting is_ignored to true), it is also added to the favorites. At least in the Android app, is_favorite is shown as true.
https://github.com/search?q=repo%3Ameshtastic%2Ffirmware+is_ignored+is_favorite&type=code
Android App 2.7.3, firmware 2.7.12

(Screenshot: SmartSelect_20251017_174852_Photos)

At first glance, a firmware search does not support my hunch that is_ignored also leads to is_favorite ... hmm
https://github.com/meshtastic/firmware/blob/master/src/modules/AdminModule.cpp#L350

then again, a search here shows that an is_favorite: true node could also be is_ignored: true or vice versa ...

https://github.com/meshtastic/firmware/blob/master/src/mesh/NodeDB.cpp#L1917

However, the nodes are then not seen in the Android app under favorites, only under filter "show ignored".
Are you making sure that is_favorite only counts for your 0 cost hop when the node is also !is_ignored?

I want to make sure that nodes I set as is_ignored do not accidentally end up being considered as the originating node of 0 hop links ...

https://github.com/meshtastic/firmware/pull/7992/files#diff-fb9b4a4da3229ad0f8213d3e55e97cb8068e71ba0225002aa050cf06788f6b8aR99
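
A minimal sketch of the check being asked for here, in Python for illustration only. `NodeEntry`, `ROUTER_LIKE_ROLES`, and `qualifies_for_zero_cost_hop` are hypothetical names and do not mirror the firmware's NodeDB structures; the role list follows the roles named in this thread:

```python
from dataclasses import dataclass

# Roles treated as "routers" in this discussion (assumption taken from the thread).
ROUTER_LIKE_ROLES = {"ROUTER", "ROUTER_LATE", "CLIENT_BASE"}


@dataclass
class NodeEntry:
    """Hypothetical stand-in for a node DB entry."""
    role: str
    is_favorite: bool = False
    is_ignored: bool = False


def qualifies_for_zero_cost_hop(node):
    """True only for favorited, non-ignored, router-like nodes."""
    return (
        node.is_favorite
        and not node.is_ignored
        and node.role in ROUTER_LIKE_ROLES
    )
```

Under this sketch, a node that is both is_favorite and is_ignored never qualifies, whichever way the two flags came to be set.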

As for a different field, I was suggesting a similar approach over at Client_Base preferred nodes (new field):
#8367 (reply in thread)

@h3lix1
Contributor Author

h3lix1 commented Oct 17, 2025

In AdminModule.cpp I've found that a node that sent an authorized admin message is automatically added to the favorites, but that's not something one Router does to another (or is it?).

Not as a normal course of business, unless you message the other router, which might be what @shalberd is finding above. (May have tried to message a node that was ignored to test it)

The favorites flag is a little overloaded. I might not have realized how much until after the fact. I think it's still manageable for now since routers favoriting routers is a little odd, but as you found, it normally takes wifi/bluetooth/serial access to set favorites. I'll see if there is a way to do this remotely one way or another.

@shalberd

shalberd commented Oct 17, 2025

(May have tried to message a node that was ignored to test it)

@thebentern @fifieldt no, I did not message that node at all after I set it to is_ignored
Which is why it is so unnerving to me.
If indeed is_ignored also leads to is_favorite, we have a major problem.
cross-linking the same, is_ignored and is_favorite from the related discussion at client_base:
#8367 (reply in thread)

I think it's still manageable for now since routers favoriting routers is a little odd

No, it is not. Isn't the whole point of this 0 cost hops feature that is_favorite determines whether a from-hop becomes a 0 cost hop? I mean, your feature "Feat/0-cost hops for favorite routers" requires operators to favorite other ROUTER, ROUTER_LATE, CLIENT_BASE nodes.

you currently only check for an originating node to be marked as is_favorite:

https://github.com/meshtastic/firmware/pull/7992/files#diff-fb9b4a4da3229ad0f8213d3e55e97cb8068e71ba0225002aa050cf06788f6b8aR99

However, as I have shown above, now, even is_ignored nodes, nodes clearly to be ignored, are given 0 hop cost because somewhere in the background, they are also marked as favorite.

@shalberd

shalberd commented Oct 17, 2025

@thebentern
so I checked this again, tried on a separate node also running the firmware 2.7.12 with Android App 2.7.3, importing the to-be-ignored node via QR code.
At least regarding the distinction to is_favorite, we are safe, but I would like to get your opinion.
I imported a node and set it to be ignored, and on further testing it does not end up also being is_favorite:true.
However, the testing now is done on a client_mute device, not a router or router_late device.

Update: same on Router_Late device role.
Importing a node via QR code first makes it seen, no favorite, no ignore, good.
Then, when I set ignore to true, even after a while, favorite is not set to true.
I don't know what happened earlier today up on the mountain, thin air maybe?

I just absolutely want to make sure that ignored nodes are never ever considered for 0 cost hops, nor for preferential routing early window in the case of CLIENT_BASE.

Nonetheless, for both features, using is_favorite is not good, in my opinion. Separation of concerns.

proposing a new NodeInfo field name, e.g. 0_cost_from_node true/false

@erayd
Contributor

erayd commented Oct 18, 2025

Dumb question maybe, but how do you even favorite another Router on a Router-Node? I don't see how that's available via remote administration.

You can set it via remote admin, but must use the CLI tool to do it. The phone apps don't have that ability currently.

@Hamberthm

Hamberthm commented Nov 19, 2025

Hi @h3lix1 I have a question about this great new feature.

My use case is that I have poor signal inside my home and at best I see only one nearby node and only sometimes.

To solve this, I installed a node on the roof. I understand that by seeing more nodes the hop cost should not matter very much but anyway I see no harm in taking a free hop from inside my house to the roof.

Why does this implementation need to have both devices as CLIENT_BASE? Why the first hop is always subtracted?

If I understand correctly, this new feature doesn't meet the needs of my use case. Am I correct? Do I have an alternative?

Thanks!

@h3lix1
Contributor Author

h3lix1 commented Nov 23, 2025

@Hamberthm The issue was that structures within the Meshtastic code treat the first hop as special for other functions like neighborinfo. They expect the first hop to be the actual sender, which can be very confusing for these functions if the sender is a few actual hops away. (Other things include gratuitous nodeinfo for neighbors)

It was easier to expect someone to just bump up their local max hops to account for their roof node than to handle a lot of these edge cases.

@Hamberthm

@Hamberthm The issue was that structures within the Meshtastic code treat the first hop as special for other functions like neighborinfo. They expect the first hop to be the actual sender, which can be very confusing for these functions if the sender is a few actual hops away. (Other things include gratuitous nodeinfo for neighbors)

It was easier to expect someone to just bump up their local max hops to account for their roof node than to handle a lot of these edge cases.

All right. I suppose setting one more hop on the outgoing messages would solve that part.

What about receiving? I see the previous relay also has to be set to ROUTER/ROUTER_LATE/CLIENT_BASE and favorited, so there's no way of preventing a "no hops left" message from dying on my roof before hopping one more time to my inside node.

@h3lix1
Contributor Author

h3lix1 commented Nov 24, 2025

@Hamberthm That is correct. It assumes that the hop before the roof node is a favorited ROUTER (or ROUTER_LATE) node. The expectation is that if there are one (or zero) hops left when the packet reaches your roof node, it will still forward it along. It's not a guarantee that the node will avoid the no-hops-left issue, although I guess we can try to make exceptions for all ROUTER_LATE packets.
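
A minimal sketch of the hop accounting as described in this thread, in Python for illustration only and not taken from the firmware. The names `relay` and `walk` are hypothetical. It assumes the first hop is always decremented, that a hop costs nothing when the previous relay is a favorited router, and that a zero-cost hop is still forwarded with zero hops left, as the comment above expects:

```python
def relay(hop_limit, hop_start, prev_relay_is_favorite_router):
    """Return the hop_limit on the rebroadcast packet, or None if it is dropped."""
    # The packet came straight from its sender if no hop has been spent yet.
    first_hop = hop_limit == hop_start
    zero_cost = prev_relay_is_favorite_router and not first_hop
    if zero_cost:
        return hop_limit  # budget preserved between favorited routers
    if hop_limit == 0:
        return None  # no hops left and no zero-cost exemption
    return hop_limit - 1


def walk(hop_start, favorite_flags):
    """Apply relay() along a chain of relays, one flag per relay."""
    hop_limit = hop_start
    trace = []
    for flag in favorite_flags:
        hop_limit = relay(hop_limit, hop_start, flag)
        trace.append(hop_limit)
        if hop_limit is None:
            break
    return trace


# Four relays, hop_start of 3, nobody favorited: the packet dies at the fourth.
print(walk(3, [False, False, False, False]))  # [2, 1, 0, None]
# Same chain, but relays 2 to 4 each favorite the relay before them.
print(walk(3, [False, True, True, True]))     # [2, 2, 2, 2]
```

In the roof-node case discussed above, `relay(0, 3, False)` returns `None`: a packet arriving with no hops left from a non-favorited relay stops at the roof node.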

@h3lix1 h3lix1 deleted the feat/router-hop-preservation branch November 24, 2025 02:00
@shalberd

shalberd commented Nov 24, 2025

@h3lix1 @compumike
aside from your discussion and all the warnings about the new feature when favoriting ... it'd be nice if the number of favorites on nodes of type Router, Router_Late, Client_Base could be limited to, say, 2-3 ...
It doesn't matter whether that is for this feature or for the client_base Router-like behavior.
Or maybe alternatively feature-gate 0 cost hops feature for the shorter presets only.
Maybe a combination of both.
Look, our goal here is, on that single EU_868 frequency for all presets, to limit channel utilization to max 35%.
We'd certainly appreciate some safeguards in this matter for a coming stable/beta release.


Labels

enhancement New feature or request

Projects

Archived in project


8 participants