Computing stuff tied to the physical world

EtherCard improvements

In Software on Apr 11, 2012 at 00:01

This has been an often-requested feature, so I’ve added a way to get an Ethernet reply back after you call tcpSend() in the EtherCard library:

[Screenshot: example sketch showing the new reply handling]

The one thing to watch out for is that, over time, packets going out and coming back will interleave in unforeseen ways, so it is important to keep track of which incoming reply is associated with which outgoing request. Fortunately, the EtherCard library already has some crude support for this:

  • Each new tcpSend() call increases an internal session ID, which consists of a small integer in the range 0..7 (it wraps after 8 calls).
  • You have to store the last ID to be able to look for its associated reply later, hence the “session” variable, which should be global (or at least static).
  • There’s a new tcpReply() call which takes that session ID as argument, and returns a pointer to the received data if there is any, or a null pointer otherwise. Each new reply is only returned once.
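
The bookkeeping described in these three points can be modelled in a few lines of plain C++. This is only a sketch of the idea, not the EtherCard source: the names SessionTracker, send(), deliver() and reply() are made up for illustration, and it assumes a 3-bit counter that starts at 0 and a single reply slot that is cleared once the reply has been handed out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the session bookkeeping, NOT the EtherCard code.
// Assumes a single reply slot, mirroring the single shared packet buffer.
class SessionTracker {
public:
    // Start a new request: bump the 3-bit ID (0..7, wraps after 8 calls).
    uint8_t send() {
        lastId_ = (lastId_ + 1) & 0x07;
        return lastId_;
    }

    // Stand-in for the network side: a reply arrived for session `id`.
    void deliver(uint8_t id, const char* data) {
        assert(id < 8);
        pendingId_ = id;
        pendingData_ = data;
    }

    // Return the reply for `id` if one is waiting, else a null pointer.
    // Each reply is handed out only once.
    const char* reply(uint8_t id) {
        if (pendingData_ == nullptr || pendingId_ != id)
            return nullptr;
        const char* data = pendingData_;
        pendingData_ = nullptr;
        return data;
    }

private:
    uint8_t lastId_ = 7;                 // assumption: first send() returns 0
    uint8_t pendingId_ = 0;
    const char* pendingData_ = nullptr;
};
```

In a sketch, the value returned by send() plays the role of the global "session" variable: store it, then poll reply(session) from the main loop. Since the ID wraps after 8 calls, a stored ID that is kept around for too long could end up matching a later request.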

A simple version of this had already been hacked into a Nanode-derived version of EtherCard, so I thought I might as well bring it into the EtherCard library in a more official way.

This code – the whole EtherCard library in fact – is fairly crude and not robust enough to handle all the edge cases. One reason for this is that everything is going through a single packet buffer, since RAM space is so tight. So that buffer gets constantly re-used, for both outgoing and incoming data.

Every time I go through the EtherCard code, my fingers start itching to refactor it. I already did quite a few sweeps of the code a while back, as a matter of fact, but some of the cruft still remains (such as callback functions setting up nested callbacks). It has to be said, though, that the code does work pretty well, warts and limitations and all, and it’s non-trivial. So I’d rather keep hopping from one working state to the next than start from scratch, work out all the cases, and track down all the new bugs that would introduce.

The biggest recent change was the addition of a “Stash” mechanism, which is a way to temporarily use the RAM inside the ENC28J60 Ethernet controller as a scratchpad for all sorts of data. It’s already useful in its current state because it lets you “print” data to it in the Arduino way to construct a request or a reply for an Ethernet session.

There are a few more steps planned, with the goal of avoiding the need for a full packet buffer in the ATmega’s RAM. Once that goal is reached, it should also become possible to track more than one session at the same time, so that more frequent requests (in and out) become possible. There is no reason, IMO, why an ENC28J60-based Ethernet board should be much less capable than a Wiznet-based one (apart from needing a bit more flash memory for the library code, and not supporting multi-packet TCP sessions).

The remaining steps to get away from the current high demands on RAM space are:

  • generate the final outgoing packet directly from one or more stashes, without going through our RAM-based buffer
  • collect the incoming request into a stash as well, again to avoid the RAM buffer, and to quickly release the receiver buffer again
  • reduce the RAM buffer to store only the headers and the first few bytes of data; this should not affect much of the current code
  • add logic to easily “read” incoming data from a stash as an Arduino stream (just as “writing” to a stash is already implemented)
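
To make the stash idea a bit more concrete, here is a small stand-alone C++ model of “printing” text into fixed-size blocks and reading it back as a byte stream. It is a sketch under assumptions, not the EtherCard Stash class: the ScratchStore name, its methods, the pool size and the block size are all invented for illustration, and an ordinary array stands in for the ENC28J60’s RAM.

```cpp
#include <cassert>

// Hypothetical model of a stash-like scratchpad, NOT the EtherCard Stash class.
// A plain array stands in for the Ethernet controller's RAM.
class ScratchStore {
public:
    static constexpr int kBlocks = 8;      // assumed pool size
    static constexpr int kBlockSize = 16;  // assumed block size in bytes

    // Number of blocks still available in the pool.
    int freeCount() const {
        int n = 0;
        for (int i = 0; i < kBlocks; ++i)
            if (!used_[i]) ++n;
        return n;
    }

    // Append a string ("print" style); returns false if the pool runs out.
    bool print(const char* s) {
        for (; *s != '\0'; ++s) {
            if (length_ % kBlockSize == 0 && !grow())
                return false;
            int block = chain_[length_ / kBlockSize];
            ram_[block][length_ % kBlockSize] = *s;
            ++length_;
        }
        return true;
    }

    // Read the next byte, or -1 at end of data (same convention as Arduino's Stream::read()).
    int read() {
        if (readPos_ >= length_)
            return -1;
        int block = chain_[readPos_ / kBlockSize];
        int value = static_cast<unsigned char>(ram_[block][readPos_ % kBlockSize]);
        ++readPos_;
        return value;
    }

    // Give all blocks back to the pool; skipping this leaks scratch space.
    void release() {
        for (int i = 0; i < chainLen_; ++i)
            used_[chain_[i]] = false;
        chainLen_ = 0;
        length_ = 0;
        readPos_ = 0;
    }

private:
    // Claim one free block and append it to the chain.
    bool grow() {
        for (int i = 0; i < kBlocks; ++i) {
            if (!used_[i]) {
                assert(chainLen_ < kBlocks);
                used_[i] = true;
                chain_[chainLen_++] = i;
                return true;
            }
        }
        return false;
    }

    char ram_[kBlocks][kBlockSize] = {};
    bool used_[kBlocks] = {};
    int chain_[kBlocks] = {};
    int chainLen_ = 0;
    int length_ = 0;
    int readPos_ = 0;
};
```

The point of the model is that the ATmega side only needs a few bytes of bookkeeping (the chain and two positions), while the data itself lives in the block pool. It also shows why blocks have to be released explicitly: the free count only goes back up when release() is called.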

Not there yet, but thinking this through in detail is really the first step…

  1. ENC28J60 could actually be MORE capable than Wiznet: think IPv6 :) or proprietary protocols over Ethernet (who says you HAVE to use IP?). Maybe it could be useful to have, on the ENC28J60 board, a RAM chip and a microSD slot, but that’s another story.

  2. Hi JC,

    Is your code still compatible with Arduino 0022, or only with 1.0? I’d like to know because I’m considering using the original Wiring platform instead, possibly migrating to 32-bit platforms someday…

  3. JC, I can’t get tcpReply() to work. Do you think you could add it to your pachube example sketch? https://github.com/jcw/ethercard/tree/master/examples/pachube

    I’m using a Nanode to upload to pachube and it hangs after a while. I know other people have this problem with Nanodes. tcpReply() should be a big help in detecting problems.

    • Sorry – good idea, I’ll look into it.

      I’m curious about the hangs, have a Nanode running here – I have seen a very occasional upload failure in the past week or so (my guess would be due to DHCP loss of lease).

    • Hi JC,

      I think you have now opened a can of worms – DHCP loss of lease :-) I’m having a similar issue with my feed – loss of data after some period of time, suspected DHCP.

      Is there a way to handle this specific DHCP condition with the library?

      Thanks!

    • You want a quick pragmatic fix? Make the ATmega reset itself after 86000 seconds (just under a day). Most DHCP leases last 24h.

      Here’s one way to force a reset (should work with OptiBoot, which disables WDT, else you’ll run into an interminable reset loop): http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1246541553/5#5.

  4. I’ve been forcing the WDT to reset every hour using an infinite loop: while (true) {}

    It’s been running almost a week without hanging. Usually it would last a day at most.

    The DHCP lease is interesting as a potential problem. I could try setting a static IP to see if that made a difference.

  5. I started using ethercard on my nanode in January and it’s great!! I have recently been having similar issues to people above with net connectivity stopping after a while. Literally an hour ago I tracked my problem down: stash space gradually gets used and not freed… Not every time, just sometimes… Stash::freecount dropping to zero being the indicator of doom! No idea why yet, probably something stoopid I am doing; meanwhile calling stash::initmap when freecount gets low seems to save me doing a full reset. Thought I would share in case this is the same thing others are finding.

  6. I’m looking at DHCP and DNS issues as I recently got a BT HomeHub 3 and it seems to have more frequent problems than my old router. Just updating my copy of the library. I also have a couple of changes to the http post functions to make them more flexible.

  7. Just updated my copy of the library with latest from github, added the tcpReply code to my sketch but doesn’t seem to get response every time. Is it missing packets, or are they just not finding their way up to the higher layers of the code (sketch)? Wondering if this is something that could be where I and others are seeing missed packets on dhcp and dns requests.

    Basically I’ve got a simple sketch that sends a ping and measures the response time, it then updates the value to one of my pachube feeds. I also have an LED turn on when tcpSend is called and off when tcpReply gets a response to the request. Most of the time the LED just flashes, but when it stays on I can see that it hasn’t detected the response. The request gets to its destination, just the response doesn’t get seen by the library.

    Time to delve into the code a bit more.

    Andy

    • Hm… Now I can get longer runs going, I am finding occasionally that the call to make an http post (to my server) isn’t getting transmitted (evidenced by an incrementing sequence count skipping occasionally)… So far I’ve only confirmed this from a wireshark trace (hooray for having an old 10 Mb/s hub…). I need to add some additional diags to my code to better understand what’s happening ‘at the source’, but it’s possible we’re looking at similar things….

  8. I think I’ve found a DNS issue with BT HomeHub 3. Checking dns.cpp, the function dnsRequest sets the high byte of the flags to 1. This is the Recursion Desired bit, which is what it should be doing. The response should then have the top bit of the next byte set to indicate that recursion is available on the server. However, it looks like my BT HomeHub 3 is not setting this bit every time, so some DNS responses get rejected as not being valid.

    I’ve yet to determine exactly why or when it sets it and when it doesn’t, but it could be to do with results being cached in the router. Looking at the returned response packets, they are almost identical in the DNS part, except that one has the recursion bit set and the other doesn’t.

    A workaround is to temporarily remove the check for this bit in the function checkForDnsAnswer in dns.cpp. The current check is:

    if (plen < 70 || gPB[UDP_SRC_PORT_L_P] != 53 || // OK
        gPB[UDP_DST_PORT_H_P] != DNSCLIENT_SRC_PORT_H || // OK
        gPB[UDP_DST_PORT_L_P] != dnstid_l || p[1] != dnstid_l ||
        (p[3] & 0x8F) != 0x80 ) {

    I changed mine to:

    if (plen < 70 || gPB[UDP_SRC_PORT_L_P] != 53 || // OK
        gPB[UDP_DST_PORT_H_P] != DNSCLIENT_SRC_PORT_H || // OK
        gPB[UDP_DST_PORT_L_P] != dnstid_l || p[1] != dnstid_l ) {

    This gets round the problem temporarily for my setup.

    Hope this helps.

    Andy

  9. Great work JC. The HTTP/1.1 status code for a successful Pachube post is ‘200 OK’. Can the response be configured to look for the 200 OK, and order a reboot if the response is a 404, 500 etc? I’m no expert in programming, but the sketch looks as though it currently accepts any status code.

  10. In my nanodeRF, the ENC28J60 is hanging the RX after transmitting any data. If the packetSend function of the driver is called, nothing comes in anymore. It works only with simple pings, if I do not transmit anything. Any suggestions?

  11. Today I tried the Pachube example on my new DINo board (from KM tronic, Arduino + relays and optos, on DIN rail). It compiles with no errors, but apparently dies during setup … it never comes back from “ether.begin(sizeof Ethernet::buffer, mymac)”. Any tips?
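
A note on the DNS check discussed in comment 8: the mask 0x8F tests two things at once in the second flags byte, namely the Recursion Available bit (0x80) and the four-bit response code (0x0F). Removing the whole test therefore also stops rejecting error responses. A narrower workaround might be to ignore only the RA bit and keep the response-code check. The helpers below are a hypothetical stand-alone sketch of that idea, not a patch to dns.cpp; flagsLow stands for the byte the comment reads as p[3], and the function names are invented.

```cpp
#include <cstdint>

// Bit layout of the second DNS flags byte (RFC 1035, section 4.1.1):
// RA (1 bit) | Z (3 bits) | RCODE (4 bits)
const uint8_t DNS_RA_BIT = 0x80;
const uint8_t DNS_RCODE_MASK = 0x0F;

// The test as quoted in comment 8: RA must be set and RCODE must be 0.
bool strictFlagsOk(uint8_t flagsLow) {
    return (flagsLow & (DNS_RA_BIT | DNS_RCODE_MASK)) == DNS_RA_BIT;
}

// Relaxed variant (hypothetical): ignore RA, but still reject error codes.
bool relaxedFlagsOk(uint8_t flagsLow) {
    return (flagsLow & DNS_RCODE_MASK) == 0;
}
```

With this split, a response that has RA cleared but no error code passes the relaxed test, while a response such as NXDOMAIN (RCODE 3) is still rejected by both.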

Comments are closed.