weixin_39595537
2020-11-25 09:21

decouple sockets from network interface up/down

Currently, as with most socket implementations, bringing down the network interface that a socket is bound to invalidates the socket.

A network interface can be bounced by the system, such as when the access point goes out of range.

This makes it harder to program sockets reliably, since the application programmer has to continually check whether the socket is still valid before use.

It would be far simpler for the application developer if the system itself kept track of the source/target binding for each socket and reinstated that when the network interface it's bound to is bounced. This would allow the application developer to code as if the network was always available, since the system takes care of reinstating sockets.

To prevent unwanted changes in behavior, this could be an opt-in scheme, via a new property on the socket. For example:


TCPClient client;
...
client.autoUp();

There are some caveats for TCP - in particular when the socket is recreated it will be seen as a new connection to the server, and so is semantically different from a connection that was never interrupted.
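A minimal sketch of how such an opt-in could behave, written as a wrapper so it compiles off-device. Everything here except the proposed `autoUp()` name is hypothetical: `FakeClient` merely stands in for a TCP client, `interfaceBounced()` simulates the system invalidating the socket, and `AutoUpClient` is an illustrative name, not part of any existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for a TCP client so the sketch compiles without device firmware.
// Only the method shapes resemble a real client; the behaviour is faked.
struct FakeClient {
    bool up = false;       // is the underlying socket currently valid?
    int connectCalls = 0;  // how many times connect() was invoked

    bool connect(const std::string&, uint16_t) { ++connectCalls; up = true; return true; }
    bool connected() const { return up; }
    std::size_t write(const uint8_t*, std::size_t len) { return up ? len : 0; }
    void stop() { up = false; }
    void interfaceBounced() { up = false; }  // simulates the interface going down
};

// Hypothetical opt-in wrapper: remembers the target and revives the socket
// before a write, unless the application closed it explicitly.
template <typename Client>
class AutoUpClient {
public:
    explicit AutoUpClient(Client& c) : client_(c) {}

    void autoUp(bool enable = true) { autoUp_ = enable; }  // off by default

    bool connect(const std::string& host, uint16_t port) {
        host_ = host;
        port_ = port;
        wanted_ = true;  // the application wants this connection to exist
        return client_.connect(host_, port_);
    }

    void stop() {
        wanted_ = false;  // explicit close: never revive after this
        client_.stop();
    }

    std::size_t write(const uint8_t* buf, std::size_t len) {
        assert(buf != nullptr || len == 0);
        if (!client_.connected() && autoUp_ && wanted_) {
            // This is a brand new TCP connection; the server sees a fresh session.
            client_.connect(host_, port_);
        }
        return client_.connected() ? client_.write(buf, len) : 0;
    }

private:
    Client& client_;
    std::string host_;
    uint16_t port_ = 0;
    bool autoUp_ = false;
    bool wanted_ = false;
};
```

Note that the revival happens lazily on the next write, and any application-level session state on the server side is lost, which is the TCP caveat described above.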

With UDP we should be able to bind to the same local port and IP, and so the remote peer doesn't know the connection was bounced.

TCPServer could also be re-instated automatically, although all existing clients would be disconnected. Even so, it's one less thing for the application programmer to have to worry about.

Completeness: - [x] Minimum test case added - [x] Device, system and user firmware versions stated (as of 0.6.0)

This question originates from the open-source project: particle-iot/device-os


5 answers

  • weixin_39707612, 5 months ago

    For most Linux implementations, the socket activity is not directly coupled to interface state.

    UDP is a non-issue, and a UDP socket should persist through almost anything apart from a reboot.

    I agree that dropping in/out of AP range should not inherently terminate a socket, but I assert that we should just let the TCP stack decide if contact was lost for long enough to close that socket. If it comes back in range in time, then all is well and the socket/connection persists.

    However, connecting to a different AP behind a different firewall will result in a closed TCP connection because of NAT (in almost every case we care about). IMHO, spoofing that is wrong and breaks the normal socket semantics.

    So while I agree that closing sockets summarily is bad and should be stopped (just let the stack do what the stack should do), I do not believe that re-establishing connections autonomously is wise at the socket API level.

    If we want to provide a persistent abstraction that autonomously sets up new TCP connections, then I assert that should be a new API, not the socket API.

  • weixin_39595537, 5 months ago

    Andy, thanks for your input. Just to be clear, we are discussing the behaviour on an embedded platform - and not Linux, and consequently the network stack behaves differently from what you describe above. The WICED docs inform us that all sockets bound to a network interface should be disposed of before taking down that interface, so we follow that advice. (fwiw, I also see similar behavior on OSX and Windows - shutting down a wlan interface will cause any pending socket operations to immediately fail.)

    "If we want to provide a persistent abstraction that autonomously sets up new TCP connections, then I assert that should be a new API, not the socket API."

    This makes sense from a systems programming perspective. However, I'm not proposing that this be a change at the system level but in the application API (TCPClient/TCPServer/UDP), via a new property that enables automatic revival of disconnected sockets so long as they haven't been explicitly closed. The goal here is to make sockets application programming simple and uncluttered by unnecessary system details. The existing methods to read/write data would work as is, so I feel we can reduce confusion by using today's API without the need for a separate, parallel API. In other words, application devs only need to add:

    client.autoUp();

    to their sockets and the rest of their code doesn't need changing, since it's using the same API. A single line change and their sockets become more robust.

    I hope that clarifies things and aligns our perspectives!

  • weixin_39707612, 5 months ago

    I think so. I thought you were talking about doctoring the socket API; glad to see the clarification that this is not the case.

    If you're indeed talking about modifying classes like TCPClient etc., and not the fundamental socket API, then as long as the default behaviour is off and the consequences are clearly stated, I see that as a reasonable response. It's not going to free anyone from checking return values, though, if only because you have limited buffering resources (doesn't everyone?).
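To illustrate why return values still matter with limited buffering, here is a hedged sketch of a write loop that copes with short writes. `TinyBufferClient` and `writeAll` are hypothetical names made up for this example; the only assumption about a real client is that `write()` returns the number of bytes it actually accepted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fake client with a small buffer: accepts at most `room` bytes per call.
struct TinyBufferClient {
    std::size_t room;       // bytes accepted per write() call
    std::size_t total = 0;  // bytes accepted so far
    bool up = true;         // false simulates a dead socket

    explicit TinyBufferClient(std::size_t r) : room(r) {}

    std::size_t write(const uint8_t*, std::size_t len) {
        if (!up) return 0;
        const std::size_t n = std::min(len, room);
        total += n;
        return n;
    }
};

// Tries to send the whole buffer and returns how many bytes were accepted.
// Gives up after `maxStalls` consecutive zero-byte writes, so the caller
// must still compare the result against `len`.
template <typename Client>
std::size_t writeAll(Client& c, const uint8_t* buf, std::size_t len, int maxStalls = 3) {
    assert(buf != nullptr || len == 0);
    std::size_t sent = 0;
    int stalls = 0;
    while (sent < len && stalls < maxStalls) {
        const std::size_t n = c.write(buf + sent, len - sent);
        if (n == 0) {
            ++stalls;  // nothing accepted: buffer full or socket gone
            continue;
        }
        stalls = 0;
        sent += n;
    }
    return sent;
}
```

Even with automatic revival enabled, a result smaller than `len` tells the application that some data never left the device.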

    For people new to network programming, opening a connection in setup() and expecting to use it in loop() for the rest of eternity doesn't seem like a daft proposition (which it is). This will enable that behaviour and bring with it all the good and bad things that follow.

    I'm not lobbying for us to fully emulate Linux here, merely pointing out that the socket API stretches back a long way (obviously way past Linux to BSD) - and that plenty of embedded platforms adhere to this model.

    That the WICED stack has changed the behaviour to close sockets when the (presumably only) interface flaps is a simplification that I can totally see the WICED developers embracing to get a product suitable for thermostats etc. out of the door. If we were creating a product, we'd just adopt that behaviour and code the application accordingly.

    However, we are creating a platform, which is a subtly more difficult task, hence threads like this.

  • weixin_39982933, 5 months ago

    I think the best option would be a system event that user code can register a handler for, which tells user code that the interface bounced so it can handle any reconnects it wants to.
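A rough sketch of what such an event hook could look like. The names `LinkEvent`, `LinkEvents`, and `AppConnection` are invented for illustration and do not correspond to an existing system API; the point is only that the system publishes the bounce and the application decides what to reconnect.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical interface state change reported by the system.
enum class LinkEvent { Down, Up };

// Hypothetical registry: user code registers handlers, the system publishes.
class LinkEvents {
public:
    using Handler = std::function<void(LinkEvent)>;

    void on(Handler h) {
        assert(static_cast<bool>(h));  // handler must be callable
        handlers_.push_back(std::move(h));
    }

    void publish(LinkEvent e) const {
        for (const auto& h : handlers_) h(e);
    }

private:
    std::vector<Handler> handlers_;
};

// Application-side state that reacts to the bounce on its own terms.
struct AppConnection {
    bool connected = false;
    int reconnects = 0;

    void onLink(LinkEvent e) {
        if (e == LinkEvent::Down) {
            connected = false;  // socket is gone, stop using it
        } else {
            connected = true;   // real code would call connect() here
            ++reconnects;
        }
    }
};
```

With this shape the socket classes stay unchanged, and applications that do not care about reconnects simply register no handler.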

  • weixin_39707612, 5 months ago

    I think mat's idea is that while a fundamental capability like the one you describe might be present, incorporating the handling into some common access classes might be possible/desirable. I'm personally still not 100% sure about the "desirable" part, but I do acknowledge the idea of de-skilling the network programming required.
