Now that the Milkymist hardware is sufficiently advanced, it’s time to run some real software on it.
I have started designing the subsystem that evaluates the per-frame and per-point equations, a central component of the rendering process. Doing this requires parsing the preset code, and this kind of task is usually done using a so-called compiler-compiler (or parser generator). Lex and Yacc (and often their free replacements Flex and GNU Bison) are perhaps the most popular tools, and were what I tried first.
But it turned out that the code they generate is laden with ugly global variables and, more importantly, not-so-portable glibc calls that would cause problems in the minimalistic Milkymist software environment and would require rather dirty hacks to work around. The cleanest option would have been to modify Flex and Bison but, as is often the case with GNU software, the code readability standard is pretty low, and I would then have to maintain and distribute my modified tools, turning a little technical problem into a development and management nightmare.
Fortunately, after some web searching I found these two tools:
Neither of them uses global variables or glibc calls unless you enable assertions or debug output. I would say their only major problem is the scarcity of documentation; I basically ran into these two issues, which the documentation could point out better:
Lemon associates a number with each token type, and generates an include file listing them. You must use that include file. If you don’t, and try to supply your own numbers instead (coming straight from the lexer, for instance), parsing will fail because the numbers are hardcoded in the generated parser (rather than referenced through the identifiers from the generated include file).
Lemon uses a stack onto which it pushes tokens that have not yet caused a rule reduction. If you want to read the token string in the parser (and you often do), you will probably pass a pointer to the string to the Parse() function, and that pointer will be pushed onto the parser stack. You then have to make sure that the data it points to is not modified until the parser is destroyed. One way to solve this problem is to make copies of the data and use the %token_destructor directive to make the parser free those copies automatically.
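The copying approach can be sketched roughly as follows. This is only an illustration, assuming the lexer hands out a pointer and a length into a buffer that it later reuses; copy_token_text is a hypothetical helper name, and the grammar lines in the comment assume the token type is a plain char *.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate the len bytes starting at start into a
 * freshly allocated, NUL-terminated string. Returns NULL if malloc fails.
 *
 * The copy is what would be passed to Parse() instead of a pointer into
 * the lexer buffer. On the grammar side, this assumes something like:
 *
 *     %token_type { char * }
 *     %token_destructor { free($$); }
 */
char *copy_token_text(const char *start, size_t len)
{
    char *copy;

    assert(start != NULL);
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}
```

With this, the lexer is free to overwrite its buffer while the copy sits on the parser stack. It is worth checking the Lemon documentation for exactly when the destructor runs, since a copy that the destructor does not release has to be freed by hand.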
Over the last few days, I’ve been working on the warp engine for Milkymist. Warping is a computationally intensive step of the visualization rendering process that consists of taking the currently displayed picture and distorting it. This “distortion” can be of any kind (zooming, rotating, bumping, or all of them at the same time) and is configured by the MilkDrop preset.
The distorted picture then has some effects applied to it, waves and shapes are drawn onto it, and the process repeats. This is basically how MilkDrop works.
Warping is achieved by texture mapping on a triangle strip, which is what your GPU does when you run MilkDrop. Mathematically, this is a rather simple process, but implementing it efficiently brings about long pipelines, precision problems and memory access issues. And CPUs, especially softcores, are way too slow for this task.
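To give an idea of the arithmetic involved, here is a simplified software sketch of warping one mesh cell. It is not the Milkymist implementation: it interpolates over a rectangle instead of the two triangles of a strip, uses floating point, and samples the nearest source pixel. The struct vertex type and the warp_cell function are hypothetical names made up for this illustration.

```c
#include <assert.h>

/* Hypothetical mesh vertex: the source-image position this grid point reads from. */
struct vertex { float sx, sy; };

static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Fill destination pixels [x0,x1) x [y0,y1) of an 8-bit image by reading
 * the source at coordinates interpolated between the four corner vertices
 * (tl, tr, bl, br = top-left, top-right, bottom-left, bottom-right).
 */
void warp_cell(const unsigned char *src, unsigned char *dst,
               int width, int height,
               int x0, int y0, int x1, int y1,
               struct vertex tl, struct vertex tr,
               struct vertex bl, struct vertex br)
{
    assert(x1 > x0 && y1 > y0);
    for (int y = y0; y < y1; y++) {
        float v = (float)(y - y0) / (float)(y1 - y0);
        /* Interpolate down the left and right edges of the cell. */
        float lx = tl.sx + (bl.sx - tl.sx) * v;
        float ly = tl.sy + (bl.sy - tl.sy) * v;
        float rx = tr.sx + (br.sx - tr.sx) * v;
        float ry = tr.sy + (br.sy - tr.sy) * v;
        for (int x = x0; x < x1; x++) {
            float u = (float)(x - x0) / (float)(x1 - x0);
            /* Interpolate across the scanline, then sample the nearest pixel. */
            int sx = clamp((int)(lx + (rx - lx) * u), 0, width - 1);
            int sy = clamp((int)(ly + (ry - ly) * u), 0, height - 1);
            dst[y * width + x] = src[sy * width + sx];
        }
    }
}
```

Every destination pixel costs several multiplications, a division and a scattered memory read, which hints at why this is a poor fit for a softcore CPU.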
The new architecture is entirely based on the stream processing paradigm. Moving data around sometimes involves Verilog modules with nearly one hundred signals, but the resulting implementation should be very fast. Precision problems in linear interpolations are solved using Bresenham’s algorithm. Finally, the warp engine launches pixel DMAs over the high-performance burst-oriented FML (Fast Memory Link) bus that has been designed as part of Milkymist.
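The idea behind using Bresenham’s algorithm for interpolation can be sketched in software as below. This is an illustration of the technique only, not a transcription of the Verilog; the interpolate function and its parameters are hypothetical. The point is that the fractional part of the step is tracked as an integer error term, so no rounding error accumulates and the last sample lands exactly on the end value.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Write n + 1 samples going from a to b (both inclusive) into out[0..n],
 * using only integer arithmetic. The remainder of (b - a) / n is spread
 * over the steps with a Bresenham-style error accumulator.
 */
void interpolate(int a, int b, int n, int *out)
{
    assert(n > 0);

    int delta = b - a;
    int q = delta / n;            /* whole part of the step (truncated toward zero) */
    int r = abs(delta % n);       /* leftover units to distribute over n steps */
    int sign = (delta < 0) ? -1 : 1;
    int err = 0;
    int v = a;

    for (int i = 0; i <= n; i++) {
        out[i] = v;
        v += q;
        err += r;
        if (err >= n) {           /* a full extra unit has accumulated */
            v += sign;
            err -= n;
        }
    }
}
```

For example, interpolating from 0 to 10 in 4 steps yields 0, 2, 5, 7, 10.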
One last novelty is the use of Verilator in the test bench. This free cycle-based simulator appears to be full of good ideas. Its main distinguishing feature is that it generates a C++ class that implements the behaviour of your IP core as if it were synthesized. This makes it a bit harder to use than other simulators, but brings high performance and an easy connection of your simulated IP to the “outside world”: two key features when testing the warp engine, where tens of millions of transactions need to be simulated and reading from and writing to test images must be supported.
For smaller projects you can use Icarus Verilog and the GPL edition of CVer, two good event-based simulators. With such tools, you can completely forget about ModelSim and other stuff with crazy licenses for most of your FPGA projects.