Convolution, anyone?
Jun 22, 2016

Right now, the JeeLabs Energy Monitor only tracks and reports three mains pulse counters here at JeeLabs. The smart meter’s P1 serial data source will be added soon, but there have been issues with reading it out with only a 3.3V power source - some more experimentation is needed to sort this out.

The next task will be to read out 3 current transformers using the STM32F103’s 12-bit ADC, as well as the AC mains signal, so that we can perform lots of multiply-and-sum operations and estimate the real and reactive power consumption for each channel.

As described recently, AC mains can be read via the AC power supply transformer, using only the negative-going cycles.

This means that we must fill in the missing half of the input waveform, and this in turn requires detecting where the half wave gets cut off during rectification.

One way to calculate this would be with convolution, a fascinating Digital Signal Processing technique related to (auto-) correlation and the Fourier Transform.

I’m still in learning mode and trying to wrap my head around all this stuff, but one way of looking at convolution is that it takes a (long/continuous) signal and compares it to a “shorter” signal while shifting that signal along on the time line. The resulting stream of values will then correspond to how “similar” the two are.

If we take the 7-point sequence (-1 -1 -1 -1 +1 +1 +1) and apply it to our rectified wave, we get a nice peak where the waveform is flattening out. Here as zoomed-in detail:

No peak on the negative side of the wave:

And here’s an artist’s impression of what we’re after:

The solid blue line is the input wave, the solid red line was manually added to turn it all into a full waveform: it’s a flipped-and-shifted version of the blue line. It all relies on identifying the flat segments so that we can switch over cleanly and remove them.
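As a rough illustration of the flip-and-shift idea (not the actual Energy Monitor code), here is a minimal sketch in plain Forth. It assumes things the real signal will not deliver: exactly 8 samples per mains cycle, a flat segment which sits at exactly zero, and made-up sample values. The names rect, rect@, and full@ are hypothetical.

```forth
8 constant #cycle    \ samples per mains cycle (made-up value)

\ one cycle of the rectified wave: only the negative half survives
create rect  0 , 0 , 0 , 0 , 0 , -7 , -10 , -7 ,

\ fetch sample n, wrapping around at the end of the cycle
: rect@ ( n -- x ) #cycle mod cells rect + @ ;

\ full[n] = rect[n] - rect[n + half a cycle], i.e. the input wave
\ plus a flipped copy of itself, shifted by half a cycle
: full@ ( n -- x ) dup rect@ swap #cycle 2/ + rect@ - ;
```

With these numbers, 1 full@ gives 7 and 6 full@ gives -10, i.e. both halves of the cycle are present again. Real data will not line up this neatly, since the flat segments first have to be located.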

That’s where convolution could be used: the dotted lines are the convolution with that same 7-point signal, one for detecting the start of the flat segment, another (reversed) sequence for detecting the segment’s end. As you can see, both dotted lines will peak right next to these transitions, which can then be identified algorithmically.
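To make the sliding comparison concrete, here is a small sketch in plain Forth. It applies the 7-point sequence from above to a handful of made-up samples containing a single step (real ADC data will look different). Strictly speaking, sliding the sequence along without reversing it is a cross-correlation and not a convolution, which matches the “compare while shifting” description given earlier. The names kernel, samples, conv@, and peak are hypothetical.

```forth
\ the 7-point sequence from the text
create kernel  -1 , -1 , -1 , -1 , 1 , 1 , 1 ,
7 constant #taps

\ made-up samples: a low level, then a step up at index 5
create samples  1 , 0 , 2 , 1 , 0 , 50 , 51 , 49 , 50 , 50 , 51 , 50 ,
12 constant #samples

: sample@ ( n -- x ) cells samples + @ ;
: tap@    ( k -- x ) cells kernel  + @ ;

\ sum of kernel[k] * samples[n+k], for the window starting at index n
: conv@ ( n -- sum )
  0 #taps 0 do                 ( n sum )
    over i + sample@ i tap@ * +
  loop nip ;

\ window position with the largest response
: peak ( -- n )
  0 0 conv@                    ( best-n best )
  #samples #taps - 1+ 1 do
    i conv@ 2dup < if          ( best-n best new )
      nip nip i swap           ( new-n new )
    else drop then
  loop drop ;
```

Here peak returns 1: the response is largest (147) when the three +1 taps sit just past the step, so the edge itself is at index 1 + 4 = 5. The reversed sequence would respond to a falling edge in the same way.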

This is only a very first (but important) step in the whole processing task. We’ll also need to measure the (quasi-fixed) phase shifts present in each of the current transformers as well as the AC power transformer and correct for each of them (by shifting the acquired data around a bit).

Note that this process will also let us determine all the 50 Hz voltage waveform zero crossings.

PS. Here is a small 20 mV peak-to-peak 50 Hz sinewave, as acquired by the ADC:

As you can see there’s a fair bit of noise, but keep in mind that this is a very small signal. For larger currents, signal levels up to 3 Vpp can be measured, so this is less than 1% of full-scale.

There’s probably no need to clean this up: the noise should cancel out in the final multiply-accumulate step, i.e. when V (volt) times I (amp) produces W (true power).
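The multiply-accumulate step itself is simple. Here is a minimal sketch in plain Forth, assuming made-up samples which are already scaled to whole volts and amps and which cover exactly one cycle. The names volts, amps, amps90, and power are hypothetical.

```forth
4 constant #n    \ samples per cycle (made up, far too few for real use)

create volts     0 , 100 ,  0 , -100 ,   \ 100 V peak
create amps      0 ,   5 ,  0 ,   -5 ,   \ 5 A peak, in phase with volts
create amps90    5 ,   0 , -5 ,    0 ,   \ same current, a quarter cycle off

\ average of v[k] * i[k] over one cycle
: power ( v-addr i-addr -- p )
  0 #n 0 do                    ( v-addr i-addr sum )
    2 pick i cells + @         \ fetch v[k]
    2 pick i cells + @         \ fetch i[k]
    * +
  loop nip nip #n / ;
```

With current and voltage in phase, volts amps power gives 250 (W). With the current shifted by a quarter cycle, volts amps90 power gives 0, i.e. a purely reactive load. This is also why the phase shifts mentioned earlier matter: an uncorrected shift changes the computed power.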

Please keep in mind that a lot of what I’ve described above is uncharted territory for me - I’m still reading up and learning…

For comments, visit the forum.

Forth on Nandland Go Board
Jun 15, 2016

After a recent excursion into FPGAs, and in particular Z80 + CP/M emulation, I’ve been tracking developments and keeping tabs on what’s been going on in both FPGA- and Retrocomputing-land.

FPGAs are amazing chips. The interest in these universal chips is increasing, not least due to IceStorm - an open-source toolchain to perform the complex task of turning a high-level logical description of the chip into an actual “bitstream” which can be loaded into an FPGA.

One of the most interesting new low-end boards to bring FPGAs into the tinkerer’s reach is the Go Board by Russell Merrick:

It includes an ICE40HX1K FPGA chip by Lattice (that fat square chip in the picture above), as well as some buttons, LEDs, 7-segment displays, and a VGA out connector (VGA is quite easy to generate in an FPGA).

The HX1K series FPGA chips are relatively small, with only 1280 logic elements (the EP2C5 chip in the Z80 emulator has 4608 LEs). It’s more than enough to learn the basics and drive all sorts of interfaces at high speed, but it’s on the small side for creating a complete CPU “soft core”.

Except for the J1a by James Bowman, which implements a really nice and clean small Forth processor. It’s all open source on GitHub, making it a perfect candidate for IceStorm. The J1a processor is defined in 100 lines of amazing Verilog code. Best of all: it fits into the ICE40HX1K.

Matthias Koch then extended this code to make the Forth implementation more Mecrisp-like and came up with “Mecrisp-ICE”. The upcoming 0.8 release also supports the Go Board, which means you can now explore a complete system, from chip design to programming language!

Here is some sample code for Mecrisp-ICE 0.8 on the Go Board, which displays key-presses coming in over USB serial as hexadecimal on the two 7-segment LEDs:

create font
binary
  0111111 ,  \ 0
  0000110 ,  \ 1
  1011011 ,  \ 2
  1001111 ,  \ 3
  1100110 ,  \ 4
  1101101 ,  \ 5
  1111101 ,  \ 6
  0000111 ,  \ 7
  1111111 ,  \ 8
  1101111 ,  \ 9
  1110111 ,  \ A
  1111100 ,  \ B
  0111001 ,  \ C
  1011110 ,  \ D
  1111001 ,  \ E
  1110001 ,  \ F
decimal

: >seg  ( u -- x ) 2* font + @ ;

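: seg.x ( c -- )  \ show byte c as two hex digits
  dup
  $F and >seg 7 lshift   \ low nibble: segment pattern, moved up by 7 bits
  swap
  4 rshift $F and >seg   \ high nibble: segment pattern in the low 7 bits
  or $80 io! ;           \ combine both patterns and write them to I/O port $80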

: ascii ( -- )  \ display keys on 7-seg until ESC is pressed
  begin key dup seg.x 27 = until ;

If you look very closely at the image above, you can see it running and displaying “0d” for the Return key I just pressed.

The Go Board draws 135 mA, but most of it is drawn by the FT2232HL chip, which handles high-speed USB <-> serial conversion and FPGA programming. By themselves, the ICE40 chips are in fact very low-power.

For more info about FPGAs and the Go Board, see Russell’s tutorials on YouTube.

There’s also a very nice 146-page PDF by Xess, who offer the XuLA2 boards (a fairly substantial Xilinx-based FPGA, though you need their free-but-proprietary toolchain).

There’s yet another extensive tutorial series by Embedded Micro, who make the Mojo board and accessories (again with a Xilinx chip, and again no open-source toolchain).

Lastly, if the HX1K is too limited for your tastes, you can check out the iCEHX8K evaluation board, which has a larger FPGA and is also supported by Mecrisp-ICE.

It’s now only a matter of time: more open source projects based on FPGAs!

For comments, visit the forum.

Thoughts about app structure
Jun 8, 2016

These are some ideas about how to structure the flash and RAM memory for applications built on top of Mecrisp Forth.

First off: 64 KB of flash memory turns out to be plenty for very substantial applications. All of the current JeeLabs Energy Monitor code easily fits within 64K (even though the Olimexino has 128K). And that includes the RF69, OLED with graphics, ADC with DMA, pulse counters, DCF decoding, as well as the core hardware abstraction layer, clock and time management, SPI and I2C drivers, timers, PWM, RTC, interrupt-based UART with ring buffers, and the multi-tasker.

Code is added on-the-fly to the Forth “dictionary”, either in flash memory, or in RAM. All RAM-based definitions are lost on reset or a power-cycle, so this is really more for development than for permanent use.

The dictionary acts like a “stack of code” (and constant data). Mecrisp includes primitive words to erase and re-use flash memory, but this needs to be done with care since words can cross flash page boundaries. The solution to this is the “cornerstone” word, which can be called to set a marker on a flash page boundary. The use of cornerstones is really simple - this defines one:

cornerstone blah

And this erases all definitions added to the dictionary after that definition:

blah

Note that blah itself remains in the dictionary: it’s not self-destructive like the traditional “forget” mechanism used in other Forth systems.

With cornerstones, we can define an app structure which simplifies development and makes it easy to replace just the top layer with new code. This works really nicely in combination with the “include” mechanism in Folie, by setting up source files as follows:

Layer “A” (always.fs)

\ this file must be loaded on top of a pristine Mecrisp image
compiletoflash
... lots of definitions ...
: init ... ;  \ essential startup configuration (clock, console, ...)
: cornerstone ... ;
cornerstone eraseflash

This uses the trick from a previous article of having some code stay around essentially forever, by redefining the built-in eraseflash word. With the above code loaded into a fresh Mecrisp setup, we effectively install some code and init word which sets the stage for permanent use. Entering “eraseflash” at the Mecrisp command prompt will always revert the system and flash state to this configuration.

Layer “B” (board.fs)

The next layer is for “board” definitions, i.e. words which are for a specific µC and board, and words which allow abstracting away some of the basic differences. This is where a µC-specific implementation of the SPI driver resides - so that everything on top can see the same API:

eraseflash
compiletoflash
... lots of definitions: pins/buttons/LEDs, main h/w interfaces ...
cornerstone <<<board>>>

Board definitions rarely need reloading, their purpose is to simplify the code in the next layers. This would probably be called the “runtime library” in other languages.

Note how the source file starts with an eraseflash to restore the dictionary to the “always” state of layer A, before it adds the board definitions.

A side effect of calling eraseflash or a cornerstone is that Mecrisp will reset itself afterwards, and run the init word again. So all board definitions take place in the context set up by init.

Layer “C” (core.fs)

The “core” layer is intended for well-tested code, such as hardware drivers and libraries, which are needed for this specific application. It is likely to consist mostly of include lines for folie, to bring in certain features. Here is the outline of a sample “core.fs” source file:

<<<board>>>
compiletoflash
... lots of definitions: includes and tested application code ...
cornerstone <<<core>>>

As the code for a new app solidifies, it can be added to this source file (inline or via includes).

In this case, the source code starts with <<<board>>> to wipe out any definitions added after it (i.e. the previous version of itself). And it ends by adding a new cornerstone of its own.

Layer “D” (dev.fs)

This layer is for active development. It is not stored in flash but in RAM. That means that the code and definitions will all disappear on reset. The source code reflects this difference:

reset
... more definitions: work in progress, debug words ...

No more cornerstones now (they can only be used for flash memory). The goal here is to make reloading as fast as possible. A lot of development will take place in this layer, with code moved to the “C” layer once it’s working and deemed stable.

Layer “E” (exp.fs)

Sometimes, development needs more work and more exploration before we can settle on a design and work out all the details. Layer “E” is for experimentation and exploration, i.e. code which is not necessarily going to end up in the real application - tests, little helpers, custom debug words, and actual calls to these words - the goal here is even faster turnaround:

include dev.fs
... yet more definitions, but also actual calls to start up things...

The intention is to allow several different versions of this source code, so that there could be “e<sometag>.fs” files for all sorts of trials. And we can easily keep all of them around forever, just in case we ever need to chase that tricky issue again one day.

In this case, we don’t start off with a call to reset but we include the dev.fs source code (which includes the reset) to make sure all those words are loaded. With a bit of luck, we can simply hit up-arrow + return to include this file over and over again, as part of a fast-paced edit-run cycle.

Layer “F” (final.fs)

The last layer is the one which turns the board into a complete application, starting up the minute it is powered up:

<<<core>>>
compiletoflash
... lots of definitions: includes and application code ...
: init ... init ;  \ this will auto-start the application

Note that this layer does not load the D and E layers. It’s flash-only.

This one is risky, in that it “seals” the application. If the app is not working properly and if we haven’t built in an escape hatch of some sort, we will not regain control of the command prompt. In that case, all that’s left is to reflash the board by other means: a ROM-based boot or SWD.

One way out is to build a delay into init, allowing some key press or other external event to bypass a full auto-run. That way, a reset followed by some specific action will get us back to that oh so powerful Forth command prompt.
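Such an escape hatch could look roughly like the sketch below. It is untested and makes several assumptions: key? and key work on the console, and a millisecond delay word ms is available (it is standard in many Forths, but on Mecrisp it would have to come from one of the lower layers). The names app and escape? are hypothetical, and the empty init is only a stand-in for the init of the layers below.

```forth
: init ( -- ) ;                      \ stand-in for the lower-layer init
: app  ( -- ) ." app running" cr ;   \ hypothetical application entry point

\ poll the console for about 2 seconds, true if a key came in
: escape? ( -- flag )
  200 0 do
    key? if key drop true unloop exit then
    10 ms
  loop false ;

: init ( -- )
  init                     \ run the previous init first
  escape? if exit then     \ key pressed: stay at the command prompt
  app ;                    \ otherwise auto-start the application
```

The 2-second window is an arbitrary choice: long enough to hit a key after a reset, short enough not to delay a normal power-up too much.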

An even better alternative though, is to build in support for an interactive command prompt, while running all the application code from a second background task, using the multi-tasker. This way, we can peek and poke while the application is actually running, and even develop enhancements and new features on the fly (in RAM, for easy recovery).

So there you have it: a layered application structure, with built-in support for permanent console redirection, board-specific extensions, driver/library use, fast-turnaround development cycles, and an application freeze for turnkey use. It’s just a thought experiment for now, but worth a try!

There’s no IDE in sight: just your preferred source code editor, Folie, and Mecrisp.

As you may have noticed, the above layers are called A, B, C, D, E, and F - easy to remember!

For comments, visit the forum.

Weblog © Jean-Claude Wippler. Generated by Hugo.