Ethernet/IPv4/TCP: net_receive & net_reply in server mode #2016

Closed
zephyrbot opened this issue Jun 21, 2016 · 16 comments
Labels: area: Networking, bug, priority: high
Comments

@zephyrbot
Collaborator

zephyrbot commented Jun 21, 2016

Reported by Flavio Santes:

net_reply does not work when net_receive's timeout is >= 200.

Steps to reproduce (based on echo_server):

  1. Download the attached tar.gz.
  2. Modify the IP address according to your configuration.
  3. Compile and do everything necessary to run the binary on a Galileo Dev Board.
  4. Use telnet to send some text to the application. Watch what happens when the timeout is above 150.

NOTE:

  • Some applications may require a higher timeout.
  • The timeout can be set below 150, but net_reply sometimes fails even then.

(Imported from Jira ZEP-469)

@zephyrbot
Collaborator Author

zephyrbot commented Jun 21, 2016

by Flavio Santes:

@zephyrbot
Collaborator Author

zephyrbot commented Jul 4, 2016

by Jaakko Hannikainen:

Adding fiber_yield() between receiving and sending a packet seems to fix this. Adding a yield to the net_reply() code would probably also fix GH-2041 (though it would then always yield before sending...). I'll test it and try to send the patch today.
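
A minimal sketch of the ordering described above (receive, then yield, then reply). It does not call the real net_receive(), net_reply(), or fiber_yield(); every `mock_*` name and the `echo_once()` helper are hypothetical stand-ins that only record the order of the steps, so the sketch shows the call sequence and nothing about the stack's actual behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a network buffer; NOT the Zephyr net_buf type. */
struct mock_buf {
    char data[64];
    size_t len;
};

enum step { STEP_RECEIVE, STEP_YIELD, STEP_REPLY };

/* Trace of the steps taken, so the ordering can be inspected afterwards. */
enum step trace[8];
size_t trace_len;

static void record(enum step s)
{
    if (trace_len < sizeof(trace) / sizeof(trace[0])) {
        trace[trace_len++] = s;
    }
}

static struct mock_buf rx_slot;
static int rx_pending;

/* Queue one incoming payload for the mock receiver (truncated to fit). */
void mock_inject(const char *text)
{
    size_t n = strlen(text);

    if (n > sizeof(rx_slot.data)) {
        n = sizeof(rx_slot.data);
    }
    memcpy(rx_slot.data, text, n);
    rx_slot.len = n;
    rx_pending = 1;
}

/* Stand-in for the receive call: returns NULL when nothing is pending. */
struct mock_buf *mock_receive(void)
{
    record(STEP_RECEIVE);
    if (!rx_pending) {
        return NULL;
    }
    rx_pending = 0;
    return &rx_slot;
}

/* Stand-in for the yield call: only records that it happened. */
void mock_yield(void)
{
    record(STEP_YIELD);
}

/* Stand-in for the reply call: always succeeds in this sketch. */
int mock_reply(struct mock_buf *buf)
{
    assert(buf != NULL);
    record(STEP_REPLY);
    return 0;
}

/*
 * One pass of the echo loop with a yield between receive and reply.
 * Returns 1 if a reply was sent, 0 if nothing was received, -1 on error.
 */
int echo_once(void)
{
    struct mock_buf *buf = mock_receive();

    if (buf == NULL) {
        return 0;
    }
    mock_yield(); /* give other contexts a chance to run before replying */
    return mock_reply(buf) == 0 ? 1 : -1;
}
```

In a real application the three mock calls would be replaced by the corresponding stack and kernel calls; the placement of the yield between them is the only point being illustrated.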

@zephyrbot
Collaborator Author

zephyrbot commented Jul 12, 2016

by Flavio Santes:

Read last comment in GH-2041.

@zephyrbot
Collaborator Author

by Jaakko Hannikainen:

This is fixed with 2880, which is now merged in the main tree.

@zephyrbot
Collaborator Author

by Jaakko Hannikainen:

Flavio, the fixes have been merged, can this be closed?

@zephyrbot
Collaborator Author

zephyrbot commented Jul 19, 2016

by Flavio Santes:

Jaakko Hannikainen: as you pointed out, GH-2041 is fixed. However, in server mode this issue is still present.

@zephyrbot
Collaborator Author

by Jaakko Hannikainen:

Could you clarify the 'watch what happens' part? Right now I'm seeing different kinds of unexpected behavior (corrupted packets and a segfault) when sending multiple packets at once. If I send packets one by one and wait for responses, the connection seems to work fine.

@zephyrbot
Collaborator Author

by Jaakko Hannikainen:

Looks like the crash is actually an application error due to misuse of ip_buf_unref. See receive_and_reply() in the echo server sample for how the TCP packet handling should work. Still, looks like there is something going on in the ip core if it gets hit by multiple packets at once, occasionally the application replies with corrupted packets - even to ACKs telnet sends for them.
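
A rough sketch of the kind of buffer-ownership mistake this could refer to. It assumes a rule where a successful reply hands the buffer to the stack and the application releases it only when the reply fails; that rule is an assumption here, so check receive_and_reply() in the echo server sample for the actual handling. The `mock_*` functions and both `reply_*` helpers are hypothetical and do not use the real ip_buf_unref() or net_reply().

```c
#include <stddef.h>

/* Hypothetical reference-counted buffer; NOT the Zephyr buffer type. */
struct mock_buf {
    int refs;
};

/* Drop one reference. Returns -1 on a double release instead of crashing. */
int mock_unref(struct mock_buf *buf)
{
    if (buf->refs <= 0) {
        return -1; /* buffer was already released */
    }
    buf->refs--;
    return 0;
}

/*
 * Stand-in for the reply call. On success the "stack" takes over the
 * caller's reference and releases it; on failure the caller still owns it.
 */
int mock_reply(struct mock_buf *buf, int simulate_failure)
{
    if (simulate_failure) {
        return -1;
    }
    return mock_unref(buf); /* stack consumes the reference */
}

/* Assumed-correct pattern: release the buffer only if the reply failed. */
int reply_and_release(struct mock_buf *buf, int simulate_failure)
{
    if (mock_reply(buf, simulate_failure) != 0) {
        return mock_unref(buf);
    }
    return 0;
}

/* Misuse: releasing unconditionally double-frees after a successful reply. */
int reply_and_always_release(struct mock_buf *buf, int simulate_failure)
{
    (void)mock_reply(buf, simulate_failure);
    return mock_unref(buf);
}
```

With a real buffer pool the double release would not return an error code; it could corrupt the pool or crash, which would be consistent with the segfault mentioned earlier in this thread.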

@zephyrbot
Collaborator Author

by Flavio Santes:

Jaakko Hannikainen, thank you. I will review my code again and update the Jira description if I find something more specific.

@zephyrbot
Collaborator Author

by Andrei Laperie:

Per the last comment, assigning to Flavio.

@zephyrbot
Collaborator Author

by Mark Linkmeyer:

Flavio Santes, what's the latest status? Please provide an update in the comments. Thx.

@zephyrbot
Collaborator Author

by Flavio Santes:

According to the last comment from Jaakko: "Still, looks like there is something going on in the ip core if it gets hit by multiple packets at once, occasionally the application replies with corrupted packets - even to ACKs telnet sends for them."

We are still observing the behavior described by Jaakko.

Workaround: we are increasing the number of buffers. See comments at https://gerrit.zephyrproject.org/r/#/c/3936

@zephyrbot
Collaborator Author

by Flavio Santes:

Mark Linkmeyer: IMHO, this issue must be assigned to the IP stack maintainers.

@zephyrbot
Collaborator Author

by Mark Linkmeyer:

Andrei Laperie, it seems this is circling back to you. I haven't dug into the details of what this is about. I'm just trying to ensure it gets driven to the Closed state by getting it properly assigned. Please see the comment history and let me know who owns driving it to closure.
Anas Nashif

@zephyrbot
Collaborator Author

zephyrbot commented Aug 17, 2016

by Flavio Santes:

The workaround applied to solve this issue must be documented. There is a Jira issue describing IP stack documentation problems: GH-1754

@zephyrbot
Collaborator Author

zephyrbot commented Aug 17, 2016

Blocks GH-2015

@zephyrbot added the priority: high, area: Networking, and bug labels on Sep 23, 2017
@zephyrbot added this to the v1.5.0 milestone on Sep 23, 2017