Monday, April 20, 2015

Adapting an ESP-01 module for breadboard use


While esp8266 ESP-01 modules can easily be programmed after hooking up some dupont jumpers to a USB-TTL module, using them on a breadboard without an adapter or modification is not possible.  The obvious method is to make an adapter with a 2x4-pin female header, some stripboard, and two 1x4 male headers.  I thought of an even simpler way of adapting the ESP-01 for a breadboard without using any extra parts.

Of the 8 pins on the ESP-01, the CH_PD pin should be permanently tied to Vcc, so only three of the four middle pins are needed.  If you use my zero-wire reset solution, only the GPIO0 and GPIO2 pins are needed.  To modify the ESP-01 for breadboard use, heat up the CH_PD pin with a soldering iron, then pull it out with a pair of needle-nose pliers once the solder is liquid.  Then solder a short wire from Vcc to the CH_PD pad.  Next heat up the remaining three middle pins, and push them until they stick up out of the PCB.  To do this I put my needle-nose pliers under the pins, then pushed down on the module.  If you go too fast and get lumps of solder on a pin, add some flux and reheat to level out the solder so jumper wires can plug smoothly onto the pins.  If you're not using my reset solution, I still recommend a capacitor on RST, as it will reduce or eliminate spurious resets.  The RST line on the esp8266 is very sensitive, at least compared to RST on 8-bit AVR MCUs.  When you are done, your module should look like the one below, and can easily plug into a breadboard.


Wednesday, April 15, 2015

Continuous Integration: the wonderful world of free build servers


While I was contributing to the esp8266/Arduino project, Ivan posted a link to a test build using Appveyor.  After a bit of research, I learned that there is a whole slew of companies offering cloud-based build servers in a space called continuous integration.  Even more impressive, most companies in this space offer their service free to open-source projects.

When I first started writing software it was in BASIC and assembler on a Commodore 64.  When writing small programs on fixed-configuration systems like that, the development cycle was reasonably quick, with even "large" assembler programs taking seconds to build.  Deployment testing was simple as well; if it worked on my C64 it would work on everyone else's.  As computers got bigger and more complex, so did the development cycle.  While working on large projects at places like Nortel, full system builds could take several hours or even days.  Being able to get quick feedback on small code changes is very important to software development productivity.  The availability of low-cost, on-demand services like Amazon Web Services has enabled companies in the CI space to offer build services with minimal infrastructure investment.

Who's Who

The esp8266/Arduino project uses Appveyor for Windows builds and Travis for Linux builds.  Other CI companies offering Linux-based build services include Drone.io, Snap CI, and my favorite, Codeship.  All of these companies offer some level of service for free to open-source developers, so I decided to try all four of the Linux-based CI services.

For my work with embedded systems, I have been writing build scripts for avr-gcc, which I intend to extend to building a gcc cross-compiler for the xtensa lx106 CPU on the esp8266.  Full builds of binutils, avr-gcc, and avr-libc take a few hours on an Intel Core i5, so getting a working build was a slow process.  Having a large build like this also turned out to be helpful in differentiating between the different CI services.

One thing all the CI services have in common is they make it easy to set up an account and try their service if you already have a github account.  With Google Code shutting down, everyone in open-source development should have a github account already.  While the CI services support a number of different languages, I was only concerned with C++ using the gcc compiler.  All the servers had gcc >= 4.4 and typical tools like autoconf, flex, and bison.

Travis

Travis was the first CI service I tried, and it turned out to be one of the more complicated.  In order to get Travis to set up your build environment (setting environment variables, downloading dependencies), you need to create a .travis.yml file in the root of your repository.  The format is similar to a shell script, so it wasn't too hard to figure out.  After a bit of experimenting I was able to get a build started.

From some of the posts I read online, I was concerned whether the build would complete in the allowed time of 50 minutes.  The problem I ended up having was not build time but build output.  After 4MB of log output, Travis terminates the build.  If my build failed I wanted to see where in the build process the problem occurred.  So I turned off minor log output from things like tar extracting dependency libraries, but I still hit the 4MB limit.
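
A trick that helps stay under log limits (this is a sketch of the approach, not my exact build script) is to keep the noisy steps quiet and capture the full build output to a file, printing only its tail when something fails:

# extract quietly (no -v), so thousands of file names don't end up in the CI log
tar xjf gmp-5.1.3.tar.bz2
# send the verbose build output to a file; on failure, show only the last 200 lines
make -j2 > build.log 2>&1 || { tail -n 200 build.log; exit 1; }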

Another problem you might have with large amounts of log output relates to how your browser handles it.  Firefox started freezing on me when I tried to view a 4MB log file, but Chrome was OK.

Drone.io

Drone's service was easier to set up, allowing a shell script to be written in their web interface, which is run to start your build.  Drone has a limit of only 15 minutes on free builds, which turned out to be the showstopper with their service.

Codeship

I almost missed the boat on Codeship since they don't even list C/C++ in their supported languages.  I guess a gcc installation is taken for granted for Linux-based CI.  Codeship, like Drone, allows you to write a build script in their web interface.  Unlike Travis and Drone, Codeship provides no sudo access on the build servers, so installing updated packages is not possible.  Since the servers have a reasonably recent version of gcc (4.8.2-19ubuntu1) and GNU tools, this was not a problem for me.  Their build servers (running on AWS) are nice and fast, with a full build, using make -j4, taking about 13 minutes.

Codeship doesn't seem to have any limit on build time, though they do have a 10MB log output limit.  Fortunately that is just a limit on what is displayed in the web interface, and the build does not stop.  The most impressive thing about Codeship's service is that they give you ssh access to a build server instance for debugging!  Clicking on Open SSH Debug Session gives you the IP and port to ssh into, assuming you've already updated your account with your ssh public key.

On the debug server, your code is already copied to the "clone" directory.  The servers are running Ubuntu 14.04.2 LTS, The servers seem to have un-throttled GigE ports, as a download of gcc 4.9.2 clocks in at 24MB/s or 200mbps.  With the debug server I was able to manually run my build, review config.log files, and copy files using scp to my computer for later review.
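
A typical debug session looks something like this; the host, port, and user name are placeholders, since the real values come from the Open SSH Debug Session button:

ssh -p PORT user@HOST            # log in to the debug instance
cd clone                         # the repository is already checked out here
./configure && make              # re-run the failing step by hand and inspect the logs
# back on your own machine, copy a log file down for later review
scp -P PORT user@HOST:clone/config.log .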

Snap CI

One way Snap CI is different from the other services is that their servers run CentOS instead of Ubuntu.  I started using RedHat before Debian and Ubuntu existed, and never had a good reason to leave rpm-based distributions, so I like the CentOS support.  The version of gcc on their servers (4.4.7) is rather old, but they enable sudo so you can upgrade it with a newer RPM.  They also have a limited shell interface called snap-shell.  It's not full ssh access like Codeship, but it does make it easy to check the environment by running things like "gcc --version".

Snap CI also uses AWS servers, and build times were very similar to Codeship.  If your builds require downloading a lot of prerequisite files, Snap may take a bit longer than Codeship though, as gcc 4.9.2 took about twice as long to download on Snap.

Conclusion

CI services eliminate the time and cost of setting up and maintaining build servers.  They simplify software testing by having a clean server instantiated for each build.  No more broken or incompatible builds because someone installed a custom library version on the build server that normal users don't have on their machine.  I closed my Drone.io account, and will probably close my Travis account too.  I'll keep using both Codeship and Snap, to be sure the software I'm working on builds on both Ubuntu and CentOS.  If I continue to support programs like picoboot-avrdude that build under Linux and MinGW, I'll also try out Appveyor.

Thursday, April 9, 2015

Building avr-gcc from source

Although 8-bit AVR MCUs are widely used, it is hard to find recent builds of avr-gcc.  The latest releases of gcc are 4.9.2 and 4.8.4, yet the latest release of the Atmel AVR Toolchain only includes gcc 4.8.1.  For CentOS 6, the most recent RPM I could find is 4.7.1.  To take advantage of the latest improvements to compiler features like link-time optimization, it is often necessary to build gcc from source.  When I first attempted to build gcc for AVR targets, I quickly discovered it's not as simple as downloading the source and running "configure; make install".  Picking away at it over the course of several months, I've figured out how to do it.  This method should work for AVR targets, and with a small change to the build options, for other targets like ARM and MIPS.

Although both the avr-libc and gcc sites have some information on building gcc, I found both fell short of being concise and unambiguous.  The biggest source of problems I encountered was the other libraries that gcc requires for building.  GCC's prerequisites page states, "Several support libraries are necessary to build GCC, some are required, others optional."  I (mis)interpreted the list of libraries starting with the GNU Multiple Precision Library as being required only if those features were enabled in the compiler.  In the end I was only able to build avr-gcc 4.9.2 when I included GMP, MPFR, and MPC.  The ISL library was not required.


Required source files

The first thing to download before GCC is GNU binutils, which includes utilities like objdump for disassembling files.  If you already have an earlier version of binutils, it is not necessary to build a new version.  For example, on my server I have Atmel AVR Toolchain 3.4.4, which includes avr-gcc 4.8.1 and binutils 2.24.  In order to build avr-gcc 4.9.2 I don't need to make a new build of binutils.  I did try building binutils 2.25 (the latest), but instead of debugging a compile error I decided to stick with 2.24.  For building binutils, the following configure options were sufficient (though perhaps not all necessary):
-v --target=avr --quiet --enable-install-libbfd --with-dwarf2 --disable-werror CFLAGS="-Wno-format-security"

The next thing to download is GCC, followed by GMP, MPFR, and MPC.  I used GMP 5.1.3, MPFR 3.1.2, and MPC 1.0.3.  After extracting all the packages, links named gmp, mpfr, and mpc need to be created in the GCC source directory, pointing to the respective library source trees.  Then gcc can be configured with the following options before running make:
-v --target=avr --disable-nls --enable-languages="c,c++" --disable-libssp --with-dwarf2
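
Put together, the manual process looks roughly like this.  Consider it a sketch rather than the exact commands I ran; the version numbers match the ones above, and the /usr/local/avr install prefix is an assumption borrowed from the build script mentioned below.

# extract gcc and the support libraries side by side (assumes the tarballs are already downloaded)
tar xjf gcc-4.9.2.tar.bz2
tar xjf gmp-5.1.3.tar.bz2
tar xjf mpfr-3.1.2.tar.bz2
tar xzf mpc-1.0.3.tar.gz
# link the libraries into the gcc tree so they get built along with the compiler
cd gcc-4.9.2
ln -s ../gmp-5.1.3 gmp
ln -s ../mpfr-3.1.2 mpfr
ln -s ../mpc-1.0.3 mpc
# configure in a separate build directory, then build and install
mkdir ../avr-gcc-build && cd ../avr-gcc-build
../gcc-4.9.2/configure -v --target=avr --disable-nls --enable-languages="c,c++" --disable-libssp --with-dwarf2 --prefix=/usr/local/avr
make && make install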


Build script

Rather than downloading, extracting, and building gcc manually, I started with a build script made by Rod Moffitt and a couple of other contributors.  To use it, first run getfiles.gcc.sh, which will download the files, then run buildavr-gcc.sh.  After a long build process, the binaries will be in /usr/local/avr/bin/, which you should then add to your shell PATH variable.
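
In other words, the scripted route boils down to the following (the comments are just my summary of what each step does):

./getfiles.gcc.sh                       # fetch the binutils, gcc, gmp, mpfr, and mpc sources
./buildavr-gcc.sh                       # configure and build everything; expect this to take hours
export PATH=/usr/local/avr/bin:$PATH    # make avr-gcc visible in the current shell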


Binaries

You can download my build for Linux x86_64.  It's dynamically linked, and built on CentOS 6, so you may have to add some symlinks in /lib64 for other Linux distributions.

Monday, April 6, 2015

Zero-wire auto-reset for esp8266/Arduino


A little over a year ago I developed a zero-wire auto-reset solution for Arduino.  After I started using Arduino for the esp8266, I realized I could do the same thing with the ESP-01.

Flashing the esp8266

In order to download code to the esp8266 after reset, GPIO0 and GPIO15 must be low, and GPIO2 must be high.  The ESP-01 has GPIO15 grounded, and GPIO2 is set high after reset.  GPIO0 is pulled up to Vcc after reset, so in order to download code to the flash, it must be pulled low.  Although esptool-ck supports using RTS and DTR for flashing the esp8266, many cheap USB-TTL modules don't break out those lines.  With USB-TTL modules that break out DTR, the DTR line should be connected to GPIO0 in order to pull the line low after reset.  Otherwise, GPIO0 needs to be grounded with a jumper or by connecting a push-button switch to ground.

The circuit

The auto-reset circuit I used on the esp8266 is a simplified version of the circuit I used with the pro mini.  It consists of just a 7.5K resistor between Rx and RST, and a 4.7uF capacitor between RST and Vcc.  The values are not critical, as long as the RC time constant is between 10ms and 100ms; the suggested parts give RC = 7.5K * 4.7uF, or about 35ms, and a 15K resistor with a 1uF capacitor (15ms) would work fine as well.  A serial break signal is 250ms long, which is why I suggest an RC constant of less than 100ms, to allow the capacitor to discharge and trigger a reset before the break signal ends.  If the RC constant is less than 10ms, a sequence of zero bytes transmitted to the esp8266 could unintentionally trigger a reset.  At 9600bps, each bit is 104.2us long, so 8 zero bits plus the start bit would last 938us.  Several zero bytes in a row, even with the high voltage of the stop bit in between, could trigger a reset.  The esptool default upload speed is 115.2kbps, so unwanted resets are quite unlikely.

The auto-reset circuit has an added benefit of improving the stability of the esp8266 module.  The RST pin on the esp8266 is extremely sensitive.  Before I added the auto-reset circuit, simply touching a probe from my multimeter to the RST pin would usually reset the module, even when I tried adding a 15K pullup resistor to Vcc.  I would also get intermittent "espcomm_sync failed" messages when trying to upload code.  Since adding the auto-reset circuit, I can probe the RST pin without triggering a reset, and my uploads have been error-free.

Getting the updated esptool-ck

By the time you are reading this, Ivan may have already integrated my patch for esptool-ck.  If the issue is still open, then you can download the updated esptool-ck.  Extract the esptool.exe into hardware\tools\esp8266.  This version also includes support for 921.6kbps uploads, which can be enabled by putting esp01.upload.speed=921600 in hardware\esp8266com\esp8266\boards.txt.

Wednesday, April 1, 2015

A 4mbps shiftOut for esp8266/Arduino


Since I finished writing the fastest possible bit-banged SPI for AVR, I wanted to see how fast the ESP8266 is at bit-banging SPI.  The NodeMCU eLua interpreter I initially tested out on my ESP-01 has little hope of high performance since it is at best byte-code compiled.  For a simple way to develop C programs for the ESP8266, I decided to use ESP8266/Arduino, using Jeroen's installer for my existing Arduino 1.6.1 installation.  Starting with a basic shiftOut function that worked at around 640kbps, I was able to write an optimized version that is six times faster at almost 4mbps.

I modified the spi_byte AVR C code to use digitalWrite(), and called it twice in loop():
void shiftOut(byte dataPin, byte clkPin, byte data)
{
  byte i = 8;
  do{
    digitalWrite(clkPin, LOW);                    // clock low
    digitalWrite(dataPin, LOW);                   // default the data line to low
    if(data & 0x80) digitalWrite(dataPin, HIGH);  // raise it if the MSB is 1
    digitalWrite(clkPin, HIGH);                   // clock the bit out on the rising edge
    data <<= 1;                                   // move the next bit into the MSB position
  }while(--i);
  return;
}

void loop() {
  shiftOut(DATA, CLOCK, 'h'); 
  shiftOut(DATA, CLOCK, 'i'); 
}
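
The pin constants and a setup() that makes both lines outputs aren't shown above.  A minimal version would look like this; the GPIO0/GPIO2 pin assignment is my assumption here, so use whichever free pins your module has:

#define DATA  0   // GPIO0 on the ESP-01 (assumed pin assignment)
#define CLOCK 2   // GPIO2 on the ESP-01 (assumed pin assignment)

void setup() {
  pinMode(DATA, OUTPUT);    // both lines must be outputs before shifting
  pinMode(CLOCK, OUTPUT);
}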

Since I don't have a datasheet for the esp8266 that provides instruction timing, and am just starting to learn the lx106 assembler code, I used my oscilloscope to measure the timing of the data line:

The time to shift out 8 bits of data is around 12.5us, for a speed of 640kbps.  Looking at the signal in more detail I could see that the time between digitalWrite(dataPin, LOW) and digitalWrite(dataPin, HIGH) was 425ns.  Rather than setting the data pin low, then setting it high if the bit to shift out was a 1, I changed the code to do a single digitalWrite based on the bit being a 0 or a 1:
void shiftOut(byte dataPin, byte clkPin, byte data)
{
  byte i = 8;
  do{
    digitalWrite(clkPin, LOW);
    if(data & 0x80) digitalWrite(dataPin, HIGH);
    else digitalWrite(dataPin, LOW);
    digitalWrite(clkPin, HIGH);
    data <<= 1;
  }while(--i);
  return;
}

This change increased the speed slightly, to 770kbps.  Suspecting that the overhead of calling digitalWrite was a large part of the performance limitation, I looked at the source for the digitalWrite function.  If I could get the compiler to inline the digitalWrite function, I figured it would provide a significant speedup.  From my previous investigation of the performance of digitalWrite, I knew gcc's link-time optimization could do this kind of global inlining.  I tried enabling LTO by adding -flto to the compiler options in platform.txt, but unfortunately the xtensa-lx106-elf build of gcc 4.8.2 does not yet support LTO.

After looking at the source for the digitalWrite function, I could see that I could replace the digitalWrite calls with calls to an esp8266 library function, GPIO_REG_WRITE:
void shiftOutFast(byte data)
{
  byte i = 8;
  do{
    GPIO_REG_WRITE(GPIO_OUT_W1TC_ADDRESS, 1 << CLOCK);  // write-1-to-clear: clock low
    if(data & 0x80)
      GPIO_REG_WRITE(GPIO_OUT_W1TS_ADDRESS, 1 << DATA); // write-1-to-set: data high
    else
      GPIO_REG_WRITE(GPIO_OUT_W1TC_ADDRESS, 1 << DATA); // write-1-to-clear: data low
    GPIO_REG_WRITE(GPIO_OUT_W1TS_ADDRESS, 1 << CLOCK);  // write-1-to-set: clock high
    data <<= 1;
  }while(--i);
  return;
}

This modified version was much faster - the oscilloscope screen shot at the beginning of this article shows the performance of shiftOutFast.  One bit time is 262.5ns, for a speed of 3.81mbps.  This would be quite adequate for driving a Nokia 5110 black and white LCD which has a maximum speed of 4mbps.

Conclusion

While 4mbps is fast enough for a low-resolution LCD display or some LEDs controlled by a shift register like the 74595, it's quite slow compared to the 80MHz clock speed of the esp8266.  Each bit, at 262.5ns, takes 21 clock cycles.  I doubt the esp8266 supports modifying an I/O register in a single cycle like the AVR does, but it should be able to do it in two or three cycles.  While I don't have a proper datasheet for the esp8266, the Xtensa LX data book is a good start.  Combined with disassembling the compiled C, I should be able to further optimize the code, and maybe even figure out how to write the code in lx106 assembler.