Computing stuff tied to the physical world

Web server software

by Thorsten von Eicken

Adding a web server to the TCP-serial bridge is the easiest way to make everything configurable, but it is not a simple matter. The most popular alternative for configuring a TCP-serial bridge is to use modem-style AT commands.

Fortunately Jeroen Domburg wrote a very nice and compact web server for the esp8266 called esphttpd. This web server has the following features:

  • supports GET requests with query string arguments
  • supports POST requests with query string arguments and POST body
  • has a built-in read-only file system to serve up files
  • has templating functionality to substitute placeholder strings in templates stored in the filesystem with computed strings
  • allows new handlers to be written and plugged in easily
  • supports streaming of POST data and of responses to minimize buffer use

In order to fully configure the esp-bridge the following features are desirable:

  • configure the wifi network
  • configure the serial port
  • view serial port input as a form of microcontroller debug console
  • upload new firmware to the esp8266 itself to make it easy to develop the firmware
  • view the esp8266 firmware debug log

With all this functionality the web server part of esp-link is much more involved than the TCP-serial bridge! In order to keep the size of this episode within reason it focuses on just a few high-level issues and design considerations.

Web framework, or lack thereof

Implementing all the desired functions takes quite a few web pages with a good number of interactive elements, so the templating system, which tends to generate static HTML pages, isn’t ideal. In esp-link it is completely replaced by javascript in the browser, which handles the interaction and makes AJAX requests to the esp8266 to get JSON data and submit changes. This leads to a much more modern “single-page app” style user interface but also to a much more streamlined esp-link firmware.

The esphttpd templating system worked as follows. The html template files were stored in the flash filesystem in compressed form using a simple compression scheme called heatshrink. The simplicity of the scheme allowed a handler to decompress the file and then search for %variables% to be substituted. For each variable it then called the user’s handler, asking it for the substitution string. If a template had 12 substitutions, the user’s handler would be called 12 times. Overall quite simple and effective.
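The substitution loop of the old templating scheme can be sketched roughly as follows. This is a minimal illustration in Python, not esphttpd's actual code (the firmware is not written in Python); the function name `render_template`, the handler, and the placeholder values are all hypothetical, and decompression is left out.

```python
import re

def render_template(template: str, handler) -> str:
    """Replace each %name% placeholder with the string returned by handler(name)."""
    # The handler is called once per placeholder occurrence, left to right.
    return re.sub(r"%(\w+)%", lambda m: handler(m.group(1)), template)

calls = []  # records every handler invocation, for illustration only

def demo_handler(name: str) -> str:
    """Hypothetical user handler: supplies the substitution string for one variable."""
    calls.append(name)
    values = {"ssid": "home-net", "ip": "192.168.1.20"}  # made-up sample data
    return values.get(name, "")

page = render_template("<p>%ssid% at %ip%</p>", demo_handler)
```

A template with two placeholders results in two handler calls, which mirrors the "12 substitutions, 12 calls" behavior described above.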

The new scheme functions differently. The filesystem contains html, javascript, and css files. The javascript (.js) and css files are all compressed using gzip and served in compressed form to the browser. This way the esp8266 does not need to decompress them. The html files are compressed using heatshrink so that the esp8266 can decompress them and prefix them with a fixed header file, because all the pages need the same html boilerplate at the start. (It may be that repeating the header in each file and then using gzip compression would result in a smaller flash filesystem; this has not been checked.)
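The idea of compressing once at build time and serving the stored bytes untouched can be sketched as below. This is an assumption-laden illustration in Python rather than the esp-link build tooling: `pack_asset`, `response_headers`, the content-type table, and the file name `/ui.js` are hypothetical. The only part that is standard is the `Content-Encoding: gzip` response header, which tells the browser to do the decompression.

```python
import gzip

# Hypothetical content-type table for this sketch.
CONTENT_TYPES = {".js": "text/javascript", ".css": "text/css"}

def pack_asset(raw: bytes) -> bytes:
    """Build step: compress an asset once, before it is written to flash."""
    return gzip.compress(raw)

def response_headers(path: str, stored: bytes) -> dict:
    """Serve step: describe the stored bytes; nothing is decompressed here."""
    ext = path[path.rfind("."):]
    return {
        "Content-Type": CONTENT_TYPES.get(ext, "application/octet-stream"),
        "Content-Encoding": "gzip",  # the browser decompresses, not the server
        "Content-Length": str(len(stored)),
    }

# Made-up sample asset.
raw_js = b"function $(id) { return document.getElementById(id); }\n" * 20
stored_js = pack_asset(raw_js)
headers = response_headers("/ui.js", stored_js)
```

The server side only needs to copy `stored_js` to the socket after the headers, which is why no decompression code is needed for these files.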

The upshot of all this is that it makes the web UI development for esp-link very similar to that of modern web sites. All the user-interface interaction is coded in javascript and the data to populate each page is fetched with AJAX requests that return JSON data that is then placed into the appropriate HTML elements using javascript code. When the user clicks a button or submits a form the data is sent to the server using another AJAX request with the data encoded in the query string.
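The request/response pattern described here might look roughly like this, sketched in Python so that both ends fit in one block. In the real system the client side is browser javascript and the server side is firmware; the path `/serial/config`, the setting names, and both function names are hypothetical.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

def build_request_url(path: str, changes: dict) -> str:
    """Client side: encode the changed values into the query string."""
    return path + "?" + urlencode(changes)

def handle_request(url: str, settings: dict) -> str:
    """Server side: apply known query-string arguments, reply with JSON."""
    query = parse_qs(urlsplit(url).query)
    for key, values in query.items():
        if key in settings:            # ignore arguments that are not settings
            settings[key] = values[-1]  # last value wins if a key repeats
    return json.dumps(settings)

settings = {"baud": "9600", "parity": "none"}  # made-up sample state
url = build_request_url("/serial/config", {"baud": "115200"})
reply = handle_request(url, settings)
```

The JSON reply is what the page's javascript would then place into the appropriate HTML elements.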

Where the javascript is different from typical web development is in the use of frameworks, or rather the absence thereof. The very widely used jQuery framework by itself would fill most of the esp8266’s flash so it is out of the question. Instead, esp-link uses a tiny CSS library called Pure-CSS and it uses native javascript functions and methods directly without the luxury of a framework. Fortunately the web has advanced past stone-age browsers and even Internet Explorer has fallen in line with standards such that frameworks with magic cross-browser tricks are no longer must-haves. However, don’t try to use IE7 with esp-link…

Simultaneous connections

One downside of the javascript/AJAX style of web pages is a little unexpected: each page requests a good number of assets at nearly the same time, namely the initial html page, the pure-css css file, the esp-link css file, a js file with helper functions, and then any AJAX requests that are made immediately. This may be a bit slow, but the bigger issue is that web browsers tend to make many concurrent requests. What this means is that the HTTP listening socket cannot be set to only process one or two concurrent connections. Given that each concurrent connection requires its own set of buffers, this puts quite some pressure on the available memory. Empirically, allowing fewer than 4 concurrent connections leads to errors.

Unfortunately the problem is somewhat compounded by the Espressif SDK’s handling of connection attempts past the configured maximum. When it receives a TCP SYN packet (which opens a connection) and all the connection slots are already busy, it sends a TCP Reset back. This results in an error at the client end and, in the case of a browser, in an unhappy user. Instead of sending a reset, the SDK could queue the (small) SYN packet for a short period of time, waiting for one of the existing connection slots to open up. This approach is actually used in normal operating systems, where it is called the accept queue.
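The difference between resetting and queueing can be illustrated with a toy model. This is a sketch in Python, not the SDK's or any operating system's implementation: the `Listener` class and its method names are hypothetical, and the timeout that would eventually expire a queued connection attempt is deliberately left out.

```python
from collections import deque

class Listener:
    """Toy model of a listening socket with a fixed number of connection slots."""

    def __init__(self, max_conns: int, backlog: int):
        self.max_conns = max_conns
        self.backlog = backlog      # 0 models the reset-immediately behavior
        self.active = set()
        self.pending = deque()

    def on_syn(self, client: str) -> str:
        if len(self.active) < self.max_conns:
            self.active.add(client)
            return "accepted"
        if len(self.pending) < self.backlog:
            self.pending.append(client)  # hold the attempt until a slot frees up
            return "queued"
        return "reset"                   # the client sees a connection error

    def on_close(self, client: str) -> None:
        self.active.discard(client)
        # A freed slot goes to the oldest waiting connection attempt.
        if self.pending and len(self.active) < self.max_conns:
            self.active.add(self.pending.popleft())

no_queue = Listener(max_conns=4, backlog=0)
results_no_queue = [no_queue.on_syn(f"c{i}") for i in range(5)]

with_queue = Listener(max_conns=4, backlog=2)
results_with_queue = [with_queue.on_syn(f"c{i}") for i in range(5)]
with_queue.on_close("c0")  # "c4" is promoted from the queue
```

With a small backlog the fifth request from the browser simply waits a moment instead of failing outright.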

One of the options provided by Espressif is to use a different configuration of the LwIP library which contains all the TCP networking code, namely one configured with a maximum packet size (“mss”) of 536 bytes (instead of 1460). With this configuration the esp8266 announces to clients that it will only accept packets of 536 bytes or less with the result that much less buffer space is required per connection and thus more concurrent connections can be supported with a given amount of memory. Esp-link will most likely switch to this configuration in the future after some testing.
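To see why a smaller MSS helps, here is a back-of-the-envelope calculation in Python. Only the two MSS values (1460 and 536) come from the text above; the memory budget and the number of segment-sized buffers per connection are made-up figures chosen purely for illustration and do not reflect how LwIP actually sizes its buffers.

```python
def max_connections(budget_bytes: int, mss: int, segments_per_conn: int) -> int:
    """How many connections fit if each one needs segments_per_conn buffers of mss bytes."""
    return budget_bytes // (mss * segments_per_conn)

BUDGET = 12 * 1024  # hypothetical memory budget for connection buffers, in bytes
SEGMENTS = 2        # hypothetical number of MSS-sized buffers per connection

large_mss_conns = max_connections(BUDGET, 1460, SEGMENTS)
small_mss_conns = max_connections(BUDGET, 536, SEGMENTS)
```

Under these assumed numbers the budget covers 4 connections at an MSS of 1460 and 11 at an MSS of 536. The real figures depend on the LwIP configuration, but the ratio shows the general effect.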

Wifi initialization

The trickiest piece of the web interface is the Wifi initialization. This comes as no surprise because bootstrapping is always complicated! The first step in tackling complicated flows is to establish the desired end goal. For the Wifi config it is:

  • esp-link is connected to the user’s local WiFi Access Point (AP) as a plain station (STA)
  • esp-link has an IP address and perhaps a hostname
  • the user can connect to that IP address or hostname and use the UI

Obviously there is a chicken-and-egg problem: how can esp-link connect to the WiFi so the user can tell esp-link how to connect to the WiFi? The solution is to use AP mode in which the esp-link becomes its own access point. This lets the user connect to esp-link and use the UI to make it also connect to the user’s regular AP.

With this addition the full set of steps becomes:

  1. esp-link starts an AP
  2. the user configures a laptop, tablet, or smartphone to connect to the esp-link’s AP
  3. a DHCP server built into the Espressif SDK assigns the laptop an IP address
  4. the user navigates to the web UI and configures the WiFi to connect to the local AP
  5. esp-link switches the WiFi to be both AP and station (STA) simultaneously so it can attempt to connect to the local WiFi network and simultaneously show progress information to the user
  6. the WiFi connection succeeds and esp-link displays the IP address to the user
  7. esp-link shuts down the AP, i.e., changes to STA-only mode
  8. the user reconfigures the laptop/tablet/phone back to the local AP and navigates to the UI using the esp-link’s hostname or IP address

Fortunately these steps only need to be done once because esp-link stores the WiFi parameters in flash. What these 8 steps do not cover is what happens when things don’t go as planned… For example, what happens if the user enters an incorrect password? Or if esp-link is powered up in a new location where the configured network does not exist? Or if the configured network goes down for a few minutes? Basically some fail-safe is needed.

The fail-safe that is built into esp-link is a 15-second timer at boot-up. If esp-link cannot connect to the configured network within 15 seconds it switches from the standard STA-only mode to STA+AP mode so the user can connect to its AP and redo the WiFi config. If at any point esp-link manages to connect to the configured network it goes back to STA-only mode after 15 seconds, which is enough time to send a web page to the user to show the obtained IP address.
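The fail-safe logic can be sketched as a small state machine. This is a simplified Python model, not the esp-link firmware: the class and method names are hypothetical, time is passed in explicitly as seconds since boot, and behavior the text does not spell out (for example, a disconnect during the 15-second grace period) is an assumption of the sketch.

```python
STA, STA_AP = "STA", "STA+AP"
FAILSAFE_SECONDS = 15

class WifiFailsafe:
    """Toy model of the boot-time fail-safe timer."""

    def __init__(self):
        self.mode = STA
        self.connected = False
        self.deadline = FAILSAFE_SECONDS  # when to re-evaluate the mode; None = no timer

    def on_connected(self, now: float) -> None:
        self.connected = True
        if self.mode == STA_AP:
            # Keep the AP up a little longer so the UI can show the new IP address.
            self.deadline = now + FAILSAFE_SECONDS

    def on_disconnected(self, now: float) -> None:
        # Losing the network later does not start a timer, so the AP stays off.
        self.connected = False

    def on_tick(self, now: float) -> None:
        if self.deadline is None or now < self.deadline:
            return
        self.deadline = None
        self.mode = STA if self.connected else STA_AP

# Scenario 1: the configured network is unreachable at boot, then appears later.
failing = WifiFailsafe()
failing.on_tick(15)          # no connection within 15 s: AP is turned on
mode_after_timeout = failing.mode
failing.on_connected(40)
failing.on_tick(55)          # 15 s after connecting: back to STA-only
mode_after_recovery = failing.mode
```

Because a timer is only armed at boot or when a connection succeeds in STA+AP mode, a network outage after a successful boot leaves the device in STA-only mode until it is reset.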

Now, why not always run in STA+AP mode or switch to that mode whenever the configured network becomes unavailable? The reason not to stay in STA+AP mode is that the esp8266 cannot enter any low-power state in this mode. This is because an access point has to always be ready to receive a station’s packet. In contrast, stations can tell the AP that they are going to sleep for a bit and the AP then automatically queues packets and includes a “you have a packet waiting” indicator in the beacons it sends every 100ms. This allows stations to wake up just in time to receive the beacon, check whether they have a packet waiting, and then go back to sleep if they have none. When running in STA+AP mode the esp8266 also has to send these beacons itself every 100ms, which consumes 4-5x more power than reception and 100x more power than sleep mode.

Another reason not to operate in STA+AP mode is that AP mode is an unencrypted network and that would allow others to take over the esp-link too easily. (If the AP was encrypted there would be another chicken-and-egg problem to set the encryption password.) This is also why esp-link doesn’t turn on the AP if the configured network goes away: it would be too easy to catch or provoke one of these moments in order to cause the AP to be turned on. Instead, a reset (typically power cycle) is necessary, which pretty much requires physical access, at which point WiFi security is meaningless anyway.

The web server software has many more nifty features (as well as shortcomings!). The next episode will look into the built-in self-update in more detail.
