Sunday, October 13, 2013

Casting problems (part 2)

In the previous post I discussed some problems with arithmetic operations. This post continues that series.

When you look at a cast operation, you might think: What's the deal? That can't be hard to implement correctly, right? At least that was my first thought when I started to implement it.

Let's start with some simple C code:

uint8_t  a=0xFF;
uint16_t b=(uint16_t)a;
int16_t  c=(uint16_t)a;
uint16_t d=(int16_t)a;
int16_t  e=(int16_t)a;  
uint16_t f=(int8_t)a;
int16_t  g=(int8_t)a;
printf("a=%4x b=%4x c=%4x d=%4x e=%4x f=%4x g=%4x\n",a,b,c,d,e,f,g);

The output of this program is:

a=  ff b=  ff c=  ff d=  ff e=  ff f=ffff g=ffffffff

The output of b through e is rather unsurprising: because a is unsigned, nothing really happens when it is cast to a larger size. With f however the type is first changed to a signed type, which sign extends 0xFF to -1, and the result is then converted to a uint16_t, so it prints as ffff. g on the other hand is correctly sign extended from int8 to int16 and then to 32 bits for the printf, which is why it shows up as ffffffff.

The rules I extract from this for PSHDL are the following:

  • Upon a cast, the value is first resized, with sign extension if the operand of the cast is of a signed type. It doesn't matter what type the cast itself specifies; the signedness of the operand determines whether sign extension is used.
  • The type is changed after the resize operation. You can actually see that in the generated VHDL.

PSHDL Example:

int<8> a=-5;
uint<16> b=a;

results in the following VHDL code:

b <= intToUint(resizeInt(a, 16));

The value is first resized as a signed int to 16 bits, and then the type is converted from signed int to unsigned uint.
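To make this order of operations concrete in C terms, here is a minimal sketch (not generated code, just an illustration of the rule above):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int8_t a = -5;
    // Resize first (sign extension from 8 to 16 bits), then change the type:
    uint16_t b = (uint16_t)(int16_t)a;  // 0xfffb, mirrors intToUint(resizeInt(a, 16))
    // Changing the type first would skip the sign extension:
    uint16_t c = (uint16_t)(uint8_t)a;  // 0x00fb
    printf("b=%04x c=%04x\n", b, c);
    return 0;
}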

Sign extension

When a signed value is resized to a larger size, the additional bits have to be filled with something. For a proper sign extension, the MSB is used. So the 4 bit number 1011 is sign extended to 8 bits by taking its MSB and filling the upper 4 bits with it. The result is then 11111011. When the MSB is zero, as in 0011, the result is 00000011. But what about a reduction of size?

When the size is reduced, two possible ways can be taken. The first one is to simply clip the value, which is what C does. The reasoning is that when you do a width reduction, you know what you're doing and it is your task to ensure that the result still makes sense. A simple example to demonstrate what problems might arise:

uint16_t  a=0xFFFF;
uint8_t b=a;
int8_t  c=a;

int8_t   d=-1;
uint16_t e= (uint8_t)d;
uint16_t f=d;
printf("a=%4d b=%4d c=%4d d=%4d e=%4d f=%4d\n",a,b,c,d,e,f);

The output of this is:

a=65535 b= 255 c=  -1 d=  -1 e= 255 f=65535

As you can see, during a width reduction an unsigned positive value can become a signed negative value. The same can happen vice versa if the type is changed first and the value is resized afterwards. To avoid that, the principle of sign extension can be used even when reducing the width, which is what the ieee.numeric_std resize operation does in VHDL. For PSHDL however I chose to implement the C way of resizing, mostly because it doesn't alter the bits in unexpected ways. The downside is that a change in signedness may happen. But when you perform a downsize that affects your value's data bits, you're probably doing something wrong anyway, or you really don't care.
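To make the difference concrete, here is a small C sketch (my own illustration, not PSHDL output) contrasting C-style clipping with a sign-preserving reduction in the spirit of numeric_std's resize:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int16_t v = 255;                    // 0x00FF, a positive 16 bit value
    // C-style clipping: keep the lowest 8 bits -> 0xFF, which now reads as -1
    int8_t clipped = (int8_t)v;
    // Sign-preserving reduction: keep the sign bit plus the lowest 7 data bits -> 127
    int8_t resized = (int8_t)(((v < 0) ? 0x80 : 0x00) | (v & 0x7F));
    printf("clipped=%d resized=%d\n", clipped, resized);
    return 0;
}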

Implementing sign extension

In a programming language where you know the size of your variable, there is a very simple yet effective way of implementing sign extension. Let's assume we want to cast an int<8> to an int<7>:

uint64_t data=0xFF; //Bits from an int<8>
uint64_t shift=64-min(8,7);
uint64_t seData=(((int64_t)data)<<shift)>>shift;

The minimum of the target and the current size is taken to ensure that the function works in either direction (from int<7> to int<8> and vice versa). The MSB of the current or target width is shifted up to the MSB of the variable, which in this case has 64 bits. An arithmetic right shift is then used to perform a sign correct extension.
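Wrapped into a small helper, a runnable version of this trick could look like the sketch below (the function name is my own, and the left shift is done on the unsigned value before casting, to avoid shifting bits into the sign bit of a signed type):

#include <stdint.h>
#include <stdio.h>

// Sign extend the lowest `width` bits of `data` into a full 64 bit value.
static int64_t signExtend(uint64_t data, unsigned width) {
    unsigned shift = 64 - width;
    // Move the MSB of the narrow value up to bit 63, then shift back
    // arithmetically so that bit is replicated into the upper bits.
    return (int64_t)(data << shift) >> shift;
}

int main(void) {
    // 0xFF read as an int<8> is -1; its lowest 7 bits read as an int<7> are -1 as well.
    printf("%lld %lld\n", (long long)signExtend(0xFF, 8), (long long)signExtend(0xFF, 7));
    return 0;
}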

This was easy, but what about the implementation in a true arbitrary arithmetic? This will have to wait for the next blog post.

Friday, September 13, 2013

Using Dart - a summary of my experience

On my list of languages that I hate with a passion, JavaScript takes second place, right behind VHDL. So when I thought about creating a Web UI for PSHDL, I started with GWT. GWT is nice in that I can use Java for everything and it works mostly well, but ultimately I wanted the ability to run simulations in the browser. Using GWT to run simulated code is awfully slow, so that was not an option. So I was looking at how to put some lipstick on that pig called JavaScript and found TypeScript. Well, it's definitely better than pure JavaScript, but it's not what I was hoping for. It really is just a thin layer around JavaScript, and I wanted something that makes simulation really fast. So I found Dart.

What's awesome about Dart?

The thing that I liked most about Dart is that it took me very little time to pick up. It is in many aspects so similar to Java that I have had very few surprises so far.

I also like that it comes with its own VM, which is generally faster than the JavaScript VM. While that alone would be enough to drive me to use Dart, this VM has also arbitrary arithmetic built into it! This is pretty awesome for my application where 128 bit signals are not unheard of. And in fact, right now, Dart is probably the fastest way to simulate PSHDL when the bit width exceeds 64 bits.

Speed

Even with the burden of having to take care of arbitrary precision integers, the VM is slowly approaching JVM speed, which I find really fantastic. To run in the browser however, you either need a browser with a built-in Dart VM or you have to compile to JavaScript. The chances of hitting a browser with a Dart VM are close to 0 right now; that might eventually change, and a user who seriously wants to get some simulation speed out of the browser might just install Dartium for that reason. The most likely situation however is that the dart2js compiler will be used to create JavaScript code that runs in every browser.

Web-ui

This is another advantage of Dart over TypeScript: some Web UI building blocks are included, and some very important things are built in. For example, there is really no need for jQuery; Dart already has query functions that you can use.

IDE

As a fan of Eclipse, I am very happy to see that a lot of work is put into creating an IDE that is enjoyable to use. For TypeScript an Eclipse plugin has been released just recently, but I bet the quality of the Dart IDE, which is an Eclipse RCP, is higher.

Language

Some of the language constructs are pretty nice. I especially like the .. notation for initializing objects. It allows you to chain method calls as if they were returning the this object. Example:

TestResult.addTest(new Test("Adding positive numbers", 10)..progress=5..status=Test.CANCELED);

is equivalent to:

Test t=new Test("Adding positive numbers", 10);
t.progress=5;
t.status=Test.CANCELED;
TestResult.addTest(t);

I like this so much that I actually wish it existed in Java. By the way, my AST model is designed in such a way that you can do this, but each method call returns a new immutable object.

Like most modern languages and Java 8 it also has functions as first class objects and closures, which makes filtering tasks very smooth. There are a few other little things that I like, for example get/set declarations.

What sucks about Dart?

Creating a language is hard, especially the validation part. So it is not very surprising that I encountered and reported a few bugs here and there, but one of them is so severe that it makes me question whether Google is serious about Dart or not.

Dart2JS

Fact is, right now the chance that someone is surfing the web with a Dart enabled browser is close to 0. So the dart2js compiler is VERY important. If this step fails, Dart is useless. Whatever you do in Dart has to be translatable in some way to JavaScript. Unfortunately this is not the case for arbitrary precision arithmetic. The feature that I like most about Dart is broken in JavaScript. Not just by a little, it's a wreck. You can argue philosophically whether there should be a difference for this piece of code:

var x = 123;
print(x is! double);

which will print TRUE in Dart VM and FALSE in JavaScript. But the fun is over when the arithmetic is off:

int billion = 1000000000;
int quintillion = billion * billion;
int quintillionOne = quintillion + 1;
print(quintillion == quintillionOne);

Because, in what must have been one hell of a party, some drunk person on the Dart team made the joke of using the JavaScript number as the translation target for int. Well, what could possibly go wrong... This is not an issue that was discovered yesterday, leaving the team no time to fix it; it has been known for more than 1.5 years now! There is not even a compiler warning that tells the programmer that something might not be right here!

To be clear about what the problem is: the Dart team integrated a feature into Dart (arbitrary precision arithmetic) which they are unable to implement properly in JavaScript, with no warning and no workaround just yet. As I said earlier, being able to run properly in JavaScript is really important.

For PSHDL this means that the simulation is currently limited to a maximum bit width of 32 bits. Beyond that, bits are lost and the simulation is incorrect when run in JavaScript.

A purely cosmetic issue is the fact that dart2js takes 3.5s for a very simple Dart file and produces an obscene amount of code, which is caused by the fact that each dart2js invocation essentially becomes its own complete application with all APIs included. I will come back to that later.

Language

At least one bug that took me a day to find was caused by the lack of enums. But there is light on the horizon: someone is working on it and it will probably be available post 1.0.

Another bug that I produced was caused by the fact that this code does not even generate a warning:

int doStuff(){}

When you call doStuff, it simply returns null... This is not what I expected, but oh well, some people like it that way..

Another thing that I can't get my head around is the package and library system. I somehow manage to import my stuff and am able to use it, but I still don't understand what I do when I place a

library filelisting;

at the top of the file. Sometimes you need it, because if you include two files that don't have it, Dart gets confused somehow. But it's not like you need it anywhere else... Maybe I am really just missing something here that is totally useful, but for now I just accept it and put something random in the first line so that the compiler is happy. I am also not the biggest fan of the URI based imports.

Dynamic loaded code

When I started to implement the Dart code generation for simulation, I was wondering how I could actually load some code at run-time. The solution to that is called isolates, in principle a nice multi-threading concept. You create a separate process for the loaded code and communicate via message passing. So far so good. But this also implies a heavy performance overhead, because all your data is serialized and deserialized for every message that you send. No shared memory, nothing. This also makes it quite cumbersome to send data. My first idea was to have a simple recipient on the generated code side that receives values to set, performs a run and returns the ports that changed. The performance of that was awful! The reason is quite simple: for the message passing, Maps are used and copied between processes. I then switched to another approach where the input for the generated code is produced within the simulation process. The UI just sends some configuration data to the simulation process, which then runs until something changes and reports those changes back. What that looks like can be seen here.

As those processes do not share anything, the dart2js compiler has to include all required Dart APIs within the JavaScript source, which explains why the resulting JavaScript source is so big. A more lightweight concept with shared resources, comparable to dynamic class loading in Java, would be awesome. This could significantly improve the performance of my simulation, as the message passing bottleneck would vanish.

Another thing that I learned: when working with isolates, debugging becomes really hard. You can't set breakpoints in the isolate, and it sometimes crashes without propagating the exception properly.

Summary

I really enjoy programming in Dart. While the frequent updates to the Dart SDK cause some slight shivering when I see that yet another update is ready, the overall stability is good. You can't blame the Dart team for breaking API changes that cause some trouble in this early stage of Dart development.

But whether I can recommend Dart to someone depends on the dart2js bug. If you know that you will not be affected, I can recommend it. But it also shows that the designers of Dart dreamed up a concept that they can't deliver right now. I really hope that this is fixed soon; if they deliver a 1.0 with this bug still open, I will call them morons and slam my door really loudly, and I certainly will not recommend Dart to anyone then.

Tuesday, September 3, 2013

Simulation in the browser

tl;dr: The simulation on the beta site now works with regular browsers and not just Dartium, but please don't overuse it just yet :)

When I started to develop the simulation for PSHDL I always thought that it would be a really cool idea to be able to run it in the browser. The reason is simple: if you don't have to install anything and can get a quick success within a few minutes (or seconds), people are more likely to take a closer second look. Especially when I was teaching the FPGA 101 lecture at the 29C3 last year, I noticed that there is a resistance to downloading and installing gigabytes of tooling just to get started with a blinking LED on an FPGA.

As JavaScript is probably my second most hated language (directly following VHDL), I didn't want to write a dedicated code generator for JavaScript. As I was generating Dart, which has many advantages over JavaScript, I simply use the dart2js tool. So when you hit the simulate button the following actions are triggered:

  • PSHDL code parsing and validation
  • ExecutableModel generation (this is the byte code form for simulation)
  • Dart code is generated from the ExecutableModel
  • dart2js is invoked on the Dart code

When you hit the simulate button in a Dart enabled browser (Dartium), the first three steps, including the upload and download of the generated source, take about 34ms. But when you do the same in a regular browser it takes 3.5s. Most of that time is spent by dart2js generating the JavaScript code. This is awfully long, but right now out of my hands. I really don't want to write a JavaScript output generator :) Maybe one day I will have the muse to do just that, but right now I won't.

While Dart has arbitrary precision integers, so you can simulate PSHDL code of any bit width, the generated JavaScript is limited to 32 bits :( I will work around this limitation sometime in the near future, but for now that is how it is.

So if you want to get started with some blinking LEDs, head over to the beta editor and try this code:

module BlinkinLED{
  register uint<28> counter=counter+1;
  //Change the 21 until you are happy with the blinking frequency
  out bit led=counter{21};
}

Have fun!

Friday, August 30, 2013

Interesting problems with bit arithmetic (Part 1)

During the implementation of code generators, you can run into all kinds of interesting numeric problems. Normally you would not encounter those, as you would use the same bit width most of the time and the arithmetic for those types is well defined. But when you develop a code generator or interpreter with arbitrary precision, you have to take care of all of this.

In this blog entry I discuss the simple arithmetic problems that I had to reason about during the implementation. In the next blog I will discuss the difficulties of implementing a cast operation. The last blog entry will then be about the pains of implementing shift operations.

Definition of arithmetic rules

When you perform arithmetic with two numbers it really doesn't matter whether a number is signed or not. The ALU always performs it the same way. What differs is the resulting interpretation. Let's take a look at a very simple C example:

uint32_t a0=4;
uint32_t b0=5;
uint32_t diff0=a0-b0;
int32_t diffInt0=diff0;
printf("%d-%d=%5d 0x%08x %5d 0x%08x %u\n", a0, b0, diff0, diff0, diffInt0, diffInt0, diffInt0);

uint16_t a1=4;
uint16_t b1=5;
uint16_t diff1=a1-b1;
int16_t diffInt1=diff1;
printf("%d-%d=%5d 0x%08x %5d 0x%08x %u\n", a1, b1, diff1, diff1, diffInt1, diffInt1, diffInt1);

Upon execution you will see the following results:

4-5=   -1 0xffffffff    -1 0xffffffff 4294967295
4-5=65535 0x0000ffff    -1 0xffffffff 4294967295

The reason for the difference between the display of diff0 and diff1 is that the printf function interprets those values as signed 32 bit for the %d option. The %x and %u options do not care about the signedness of the value and just dump the bits.

But what happens when one operand is signed and the other is not? Let's take a look at multiplication:

uint16_t a0=0xFFFF;
uint8_t  b0=-5;
uint16_t prod0=a0*b0;
int16_t  prodInt0=prod0;
printf("%d*%d=%5d 0x%08x %5d 0x%08x %u\n", a0, b0, prod0, prod0, prodInt0, prodInt0, prodInt0);

uint16_t a1=0xFFFF;
int8_t   b1=-5;
uint16_t prod1=a1*b1;
int16_t  prodInt1=prod1;
printf("%d*%d=%6d 0x%08x %5d 0x%08x %u\n", a1, b1, prod1, prod1, prodInt1, prodInt1, prodInt1);

This produces:

65535*251=65285 0x0000ff05  -251 0xffffff05 4294967045
65535*-5=     5 0x00000005     5 0x00000005 5

What is going on here? Let's take a look at the first printf. As b0 is unsigned, assigning -5 is just a strange way of writing 0xFB. So the operation that takes place is 0xFFFF * 0xFB, which is 0xFAFF05. As the assignment target is 16 bits and the result is 32 bits, the upper 2 bytes are cut off.

But what about the second printf? Well, now b1 is a signed type and gets assigned a signed value. This forces the multiplication in prod1 to become a fully signed operation. The 0xFFFF is a disguised -1 for a 16 bit value. Interpreted as int16_t it becomes -1, which multiplied with -5 gives 5. Or written in hex, the multiplication is 0xFFFF * 0xFFFB, which is 0xFFFA0005; truncated to 16 bits you get 5.

This leads to the following rules in PSHDL for the arithmetic operations (+,-,*,/,%):

  • If one operand is signed, both operands are treated as signed.
  • Both operands are sized to the bigger size operand.
  • A literal always adopts the size of the other operand. If this leads to a loss in information in the literal, a warning is generated.
  • Arithmetic with bit types is not allowed.

Unfortunately PSHDL does not just have a handful of bit widths to deal with; they can be arbitrary. Let's take a look at the following PSHDL code:

uint<16> a=0xFFFF;
int<16>  b=-5;
uint<32> p=a*b;

In C the result is 5 (when it is truncated to 16 bits, as in the example above). But is that really the intent of the programmer? One might argue that the programmer actually assumes that he is multiplying something big with a negative number, so the result should be something big that is negative. Even though it would be quite easy to simply add one bit to each operand, I decided to stay compatible with C and not perform such an extension. The result in PSHDL is thus the same as in most C based languages: 5.

Tuesday, August 20, 2013

REST API

After quite some while I finally found the time to fully document the REST API of PSHDL. Maybe you want to know why I put such a high emphasis on having a REST API vs. investing time in a command line compiler. Don't worry though, a follow-up on how to use the command line (which already exists in some pieces) will follow soon.

To make one thing very clear: I will not use, publish or sell the code that is on my server. I put some effort into making sure that you cannot extract the user name or email of a workspace, and the workspace is not exposed in such a way that a crawler might find it, unless you post a link to it somewhere. I might take a look at the code if I find that it creates some undesirable behavior (takes unusually long to compile, crashes the compiler, consumes large amounts of disk space or harms the server in some other way). I also might take a look at the interaction with the compiler to learn from it. One particular example I have in mind: how many edits does it take until the user recovers from a compile error?

You can find the documentation online at the API URL. Unfortunately Swagger, the framework that I am using for creating the API documentation, has some issues with correctly filling the forms. So don't be surprised if the sandbox functionality does not work correctly. I hope this will be fixed in a not too distant version.

Motivation

There are good reasons for not using the REST API, especially when you are not just toying around anymore, but are producing something more serious. But most other users can actually benefit from it.

Frequent releases

One of the hardest things about designing a language is to predict what people are going to try to do with your language. In a text based language with a fixed grammar some constructs are accepted by the grammar although they don't make any sense on a semantic level. Finding all those strange ways people intend to use your language is difficult for someone who knows the language very well. So occasionally some creative way of doing something can lead to problems that I did not cover well. In the best case they crash the compiler, in the worst case they produce code that is not correct.

My fear when handing out a command line version is that people try to solve a problem with PSHDL and fail, either because the compiler crashed, the documentation sucked, a warning wasn't clear enough or they were not using the latest version of the compiler. In a web environment the compiler and the documentation are always the latest version, and crash logs are collected so that I can take a look at them and hopefully fix the problems. When people are using a command line tool, the chances that they are going to report a bug are quite low, especially in the very crucial first few minutes of toying around with a new language.

Collaborative editing

With an online API it is very simple to share your code with others. If you run into a problem, you can simply send them a link to your source and they can then take a look at it. Or maybe you're working with someone else and want to work together on the same code base at the same time. This is quite easy to accomplish with the REST API. Especially when you start to make use of the streaming API which brings us to the next point.

Interface extensions

In the future I will add the ability to add your own extensions to the web UI. For this a streaming API has been added. The streaming API is essentially a publish/subscribe API realized with Atmosphere, a WebSocket/Comet framework. You open a connection to the streaming API and get event notifications for most actions, like adding a file, compiling the workspace etc. In addition to that, you can also publish your own messages.

With this API it is possible to create web pages that can react to events on the workspace. For example, a local application might automatically download and synthesize your code whenever you hit compile in the workspace. You could also implement some sort of chat facility to discuss the code that you are working on. So far you cannot directly hook into the UI itself, but I am working on that.

New applications

Think of the possibilities: Let's say you want to build an interactive tutorial for learning how to program FPGAs. You could provide code snippets, which upon the press of a button, are converted into JavaScript which you can then use to let the user not just run, but also interact with the example and play around a bit.

Also a REST API is not bound to a particular language. In some circumstances it might not be desirable or even possible to run the Java version of the compiler (think of mobile handsets/tablets). With the REST API this is not much of an issue.

Zero installation

In order to get started with programming in PSHDL all you need is curl/wget or any other http client. No need to install (m)any tools.

Tuesday, July 9, 2013

PSHDL is now open source

Many PhD students before me have tried to make a living out of their new compiler. Most of them failed, or the compiler never made it to market. One simple reason is that the people who control how money is spent are seldom the ones who would benefit from increased productivity. Or at least the feedback loop is too large for a narrow minded person to perceive. And so a lot of really awesome compilers end up not being used. The EDA industry is also a very conservative one.

I don't want PSHDL to end up in a junk drawer without anyone using it. I want to create a community where people can learn to program FPGAs (and have fun while doing so). For this I decided to make it open source. GPL3 to be more specific. My hope is that this encourages people to use PSHDL without the fear that once they have written everything in PSHDL, the development ceases and they are stuck with a critical bug. I also hope that it encourages people to develop their own generators, annotations and native functions that enhance the functionality of PSHDL.

There is only one thing that I want to avoid and that is a clone of PSHDL with a very similar, but different enough syntax to break existing code. When something is labeled PSHDL, it should be PSHDL as I invented it. Thus I limit the usage of the term PSHDL as Apache does with its license.

So if you want to take a look at the 64000 SLOC that constitute the compiler, go check out the code. Or if you encounter a bug, please file it.

While there is a command line version that you can use locally, I urge you to rather use the REST service, either with the new Dart web UI, which is currently considered a beta, or with the Sublime plugin that was developed by cryptix.

The very simple reason for recommending the REST service is that the PSHDL compiler has bugs, and I fix them as fast as I can. By using the online service you can be sure to always have the latest, and hopefully most bug-free, version of the compiler.

Thursday, May 23, 2013

Simulation

One very important aspect of implementing an IP core is to validate that it is working as expected. Or to figure out why that damn piece of code is not doing what its master told it to do. The first, and most of the time the second as well, is done using test-benches in VHDL. A test-bench provides the input for the IP core in such a way that it reaches a state where something interesting is happening. For validation, the outputs are then verified with asserts such that an error occurs during simulation when something unexpected happens.

Some people argue that VHDL is well suited for simulation, and in fact one of its biggest strengths is that you can write the test-bench in the same language as the IP core itself. After all, the same is true for Java with JUnit, for example. The problem is, this only holds true as long as you are programming esoterically small examples like ripple carry adders. But what about your new H.264 acceleration core? Are you really going to implement it in a structural way that is fit for synthesis and then again behaviorally? What are the chances that you make the same error twice?

A far more common case is that you use an implementation written in language X as your reference. The workflow then looks like this: you run your reference implementation with some input, write its output in a format that can be read by your VHDL test-bench, then run your test-bench and validate that the output matches. This workflow is so cumbersome to use that you will either spend a lot of time on getting the conversion steps automated, or you just don't test very intensively.

While PSHDL allows you to define test-benches as well, it has another, much more powerful way of testing. PSHDL can generate a byte-code file for you. This PSEX, as I call it, is a very simple byte code that can easily be implemented in any language. Of course the reference implementation is available in Java, another one in C++ is currently under development. So what is the advantage of having byte-code?

Why byte code?

An interpreter for byte code is essentially just a big switch case. It takes about 1-2k lines of code to implement in most languages. This interpreter can then take the same input as your reference implementation and you can directly compare the outputs. No more intermediate files! But you can go even further with this. First of all, because PSHDL is also a framework, you can easily replace the byte code at runtime, which allows you to work with real-time data in a live way. Think about it: you can tweak your audio codec while it is processing real data. The current interpreter will probably not be fast enough for such advanced processing, but it is not too hard to create a JIT compiler for it as well.
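To give a rough idea of what such an interpreter core looks like, here is a minimal sketch in C; the opcodes and the stack machine are made up for illustration and are not the actual PSEX encoding:

#include <stdint.h>
#include <stdio.h>

// Hypothetical opcodes, not the real PSEX instruction set.
enum { OP_PUSH, OP_ADD, OP_AND, OP_HALT };

static int64_t run(const int64_t *code) {
    int64_t stack[64];
    int sp = 0;
    for (int pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_PUSH: stack[sp++] = code[++pc]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_AND:  sp--; stack[sp - 1] &= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}

int main(void) {
    const int64_t prog[] = { OP_PUSH, 40, OP_PUSH, 2, OP_ADD, OP_HALT };
    printf("%lld\n", (long long)run(prog)); // prints 42
    return 0;
}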

Seeing is believing, so here is a video showing it in action.

Another advantage is that you can create specialized processors and run the interpreter/byte code in actual hardware. One of my biggest hopes however is something slightly different. I want to see a JavaScript interpreter that can run in the browser, so that interested people can run some examples there. Even cooler would be a run button on the website that uploads the byte code to a preconfigured FPGA, which then executes it in actual hardware.

Get your blinking LEDs onto real hardware in seconds without having to learn any vendor tools! Zero installation required!

You might say that an interpreter running on the FPGA, kind of in software, is much slower than proper synthesis, and that is absolutely correct. But for teaching ripple carry adders and some simple IO stuff it is sufficient.

Tuesday, April 2, 2013

Why VHDL sucks

I am quick to say that VHDL sucks and some people argue that it is without substance. Some even argue that there is nothing wrong with VHDL. So I want to write down the points that made me come to this conclusion.

First of all, there is no reason to get offended, if you like VHDL and want to use it for everything, please continue to do so. You put a lot of effort into learning that language and I am not going to take that away from you. This is just my opinion and based on the fact that I care about getting FPGA designs done.

Problem #1: VHDL is a very capable language

VHDL has a lot of functionality built into it. Why is that a problem? Because it can become very confusing very quickly. Consider a natural language like German. It has many ways to express the same problem in different terms, which is great and fun because you can put your emphasis on different things, but there are also dialects. If you're not used to Low German, Bavarian, Swiss German or any other German dialect, you will have a hard time understanding it. The ability to express the same problem in so many ways comes with the disadvantage of being able to make code less readable for others. The paper VHDL Synthese Vergleich describes a "novel" way to describe hardware in software terms. It also mentions that this comes with the price of reduced portability. Yet it is still valid VHDL.

With great power comes great responsibility. In VHDL you will have to constrain yourself to not use any uncommon syntax constructs. Not an easy task when you're learning it and there are 97 keywords to choose from. (C has 32, Verilog 103, PSHDL 31, Java 50, C++11 84). This is even further complicated by the 2 domains that VHDL is used in.

Problem #2: Usage domains

You can roughly split VHDL into 2 domains. One is simulation, where you can basically use everything the language has to offer (if your simulation tool of choice has implemented it). The other is the synthesizable subset. What constitutes the synthesizable subset of VHDL is defined by IEEE 1076.3 and by what your tool vendor thought might be possible and useful to implement. Especially the last part is what makes it so problematic: one vendor tool might synthesize some code correctly (as in the way the programmer intended it to work), while others might produce a different result.

As the VHDL simulation subset is significantly bigger than the synthesis part, newcomers are often confused by the choice. When you google some code, it is hard to immediately recognize whether it is suited for synthesis or not. It is also extraordinarily easy to create code that works in simulation but does not work in HW and vice versa. But the worst code is the one that works in simulation and only works most of the time in reality, for example when latches or race-conditions are created.

Problem #3: Old language

VHDL is an old language. Why is that a problem? Because a lot of people are coming from the software side of things, so VHDL is very likely not the first programming language for most of them. If you look at most procedural/object oriented languages, you will notice that the expression subset is C style. So when people see code, they put their pre-conceptions about what a language does into what they read. But VHDL being Ada based has a few traps that people can easily fall into if they are not 100% familiar with it.

But to back up a little bit, VHDL was invented to document ASIC designs. Then people realized that it could be used for simulation as well. And even later they discovered that you can use it for synthesis. This makes it very clear that synthesis is an afterthought of VHDL and explains why it is so damn easy to create a simulation/hw mismatch.

VHDL synthesis forces you to obey certain patterns so that the synthesis tools can detect those patterns and infer what you were describing. If you make a mistake, you are lucky if you get a meaningful error message on why that pattern didn't match. If you're unlucky, you will simply get a bit file that fails to work as expected. If you're really unlucky, you will get a model that only fails under certain conditions.

Problem #4: Tooling

Because VHDL synthesis is based on pattern detection, it is very hard for an editor to provide useful feedback to the user. If you look at the most advanced editor for VHDL that I know of, the Sigasi HDL Editor, you will notice that the errors you get are not the same as the errors that synthesis will report. This is in part because Sigasi can't know whether you are designing with simulation in mind or for synthesis. And if you're designing for synthesis, it can't know what the vendor tool will eventually accept and whether what you get is what you expected to get. The vendor tools on the other hand have absolutely no support for what a modern development environment provides, like VCS, instant error markers, quick fixes, templates, history comparisons, refactoring etc.

Additional issues arise from ambiguities in the language itself. a(9) for example can mean different things. It can be the access of the 9th index of a variable/signal/constant named a (and that index can count from the MSB or the LSB), it can be the invocation of a function/procedure named a with the argument 9, or it can be the access of the 9th index of the result of a parameterless function. In order to provide useful feedback for the user, a lot of context has to be known. Parsing VHDL is considered VERY hard by many people; here someone tried to implement a bison based parser. That the language itself is hard to parse does not really help to create better tooling. Fun fact: the reason why I started PSHDL is that I considered the idea of parsing VHDL and making sense of it too big a waste of time.

Problem #5: Verbosity / duplication

Because VHDL is based on Ada, you will find all those nice keywords that make up the language. Those are not exactly what I mean by verbosity, although they certainly don't help. What I mean is that when you want to instantiate something, you have to declare a local signal (and duplicate its type information) that the entity can connect to, and then declare the port mapping. If you're using component instantiation for some reason, you will actually have 3 lines that you need to add/change per port. Also, one of the recommended patterns for describing state machines is to use 3 processes. Each process repeats the same switch over all enum values. By the way, chances are that unless you know blocks, you will have a large distance between your enum declaration and the actual usage. If you add a state, you will have to add it in 3 places. Even if you just have 2 processes (most of the time you will not be able to have fewer, because you don't want everything to become a register), it is hard to see what is going on in a big state machine.

Problem #6: Describe what you want vs. just telling what you want

If you want to have a register in VHDL, you have to describe behavior that works like a register. One little mistake in its description and you will get something entirely different. Verilog has a slight advantage here in that you can make a contract by saying: beware compiler, I'm trying to describe a register here! If the result is not a register, Verilog warns you about it. Not so in VHDL. The earliest moment you will notice that your precious register is a dull latch is when you look at the synthesis report's warning section. That warning section, however, gets very crowded very quickly, and it is easy to miss the important ones.

Problem #7: Learning curve

At our university we teach students VHDL and we can see how they are learning. Most of them hate VHDL (and the tooling) after that class. VHDL is very frustrating to learn for all of the reasons listed above. Most people get frustrated when they start programming FPGAs because learning VHDL, the idea of how to program hardware, and the tools all at the same time is such a steep learning curve.

Conclusion

PSHDL is a drastically simplified version of VHDL with a modern syntax. This allows developers to carry over their prior knowledge more easily. It also avoids some of the most commonly observed pitfalls that students run into when they learn VHDL. It is also designed with advanced tooling in mind, and I am eagerly working on making PSHDL a joy to use without having to install gigabytes of tools.

Wednesday, March 27, 2013

Function signatures and parameters

One of the things that is bothering me for quite a while is the fact that currently functions do not have type information attached to them. The abs function for example looks like this:

inline function abs(a) -> (((a < 0)?-a:a))

This function is of type inline, which means that its invocation will be replaced by the code it is declaring. The parameters are essentially just placeholders. But what happens when you invoke it with a bit type?

bit<16> a=0xFFFF;
bit<16> b=abs(a);

Well, this will cause an error on the generated code. So those kind of errors can only be discovered after the method has been inlined and the error message that can be produced is rather confusing because the user never sees the generated code. This is what type information is for. In Java and actually most programming languages you need to declare what type a parameter has.

void bla(int a, uint8_t b)

Additionally to be able to produce error messages much earlier, it gives the user a much better idea of what he can put into that method. So why does PSHDL not have this feature? Well, there is a rather strange problem that the width of a type introduces. For example the max method might have the following signature:

uint max(uint a, uint b)

This looks quite intuitive, but what happens when we want to use signed ints? Ok, let's add a second method (and implementation)

uint max(uint a, uint b)
int max(int a, int b)

Fair enough, but what about the variable bit-width ints? Lets make a signature that shows that you want to have any width number.

uint max(uint a, uint b)
int max(int a, int b)
int<> max(int<> a, int<> b)
uint<> max(uint<> a, uint<> b)

Now we have a total of 4 methods that all have the same implementation. One simplification would be to say that uint, which is 32 bit can be cast to uint<> implicitly. After all, array indices are cast from uint<> to uint implicitly as well. So that would reduce the number to 2 methods. Another optimization we could do is to create a "super type" num for primitives that can be compared and used in equations. This would reduce the number of methods to one again:

num<> max(num<> a, num<> b)

The downside however is that num is not of much use outside of method parameters. Introducing it can cause the impression that it could be used elsewhere. The same holds true for the notation of having the diamond width <>. But I think that is something the user might understand. But now that you have the <> width, what happens when no width is specified? Does that means it requires the 32 bit type? This is just calling for disaster when users are forgetting it. Actually you always would want to have the <>. But if it always has to be there, why not leave it away? That would be confusing because the the uint in parameters means something different than in other source parts. Better leave the <> there and throw a warning if it is absent.

Another problem remains though. The width of a and b could be different, it would be nice to be able to constrain that. One example:

uint<T> max(uint<T> a, uint<T> b)

This would constrain the width of the arguments to be the same. Determining whether two width are equal is non trivial. For example the width of one could be 2*P while the other is P*2. From an compiler point of view this is nontrivial, and in reality the expression for a width is likely to be more complex. So lets ignore this idea as well. While it is unfortunate that you may have to write the same function 3 times for all primitive types: uint, int and bit, it is much clearer than any of the other ideas.

Thursday, March 21, 2013

How much code do you actually write?

Just today I was wondering how much code I wrote for the PSHDL compiler. The numbers are pretty interesting. The core code of PSHDL is split like this:

I was curious on how much of that code I actually wrote on my own. And this is the result:

So it can easily be seen that a lot of my code is generated. The PSHDL AST code is generated from my DSL (Domain specific language) model. The ANTLR part is the code generated by the parser generator that I need for parsing input files. The XTend Output on the other hand is generated from the 4282 lines of input code. So to compare the input and generated output:

What can we learn from this?

A lot of code in the PSHDL compiler is generated from a sparse description. This has distinct advantages for maintenance. For example the code for the AST model contains lots of boilerplate code for things like hashing, equals, getter, modifier, constructors and other things. Writing this code is not just boring, writing it on your own is also very error prone. This is why PSHDL features generators that can generate this boilerplate code for you. One of our IPCores has 325 lines of PSHDL code. From this we generate the following code:

In a rough estimate, the ratio of lines of PSHDL to lines of VHDL code is about 3 when the generator is not used. That means for 100 lines of PSHDL you get 300 lines of VHDL code. Another interesting thing to observe is the fact that very specialized generators can have crazy ratios from input to output, whereas the ratio goes down the more general a language gets. The ratio of 1:3 for PSHDL is possible because PSHDL is designed for (FPGA) synthesis (and a little test-benching).

Another thing to consider is that the generated code can either be more compact or less compact than hand written output code. For XTend I found that the generated code is about twice as long as I would write it by hand. So you can say that 1 line of XTend is as powerful as 1 line of Java. But what is more important here is the readability: the XTend code is far more precise and easy to understand than the Java code. The generated VHDL code however is probably slightly smaller than what you would write by hand. This is caused by the fact that in VHDL, for example, a state machine typically uses 3 processes, while PSHDL will have just 2 in most cases.

Thursday, March 14, 2013

Four neat examples of PSHDL in action

If you still don't know why you should use PSHDL, here are four neat examples of what PSHDL can do for you! Generally the amount of generated VHDL code is up to 5 times that of the original PSHDL code. For generators this ratio is even worse (or better, depending on how you see it). If you want to see the generated VHDL code, just paste it into the online compiler at: pshdl.org.

First example: Registers

module Calculus {
    param uint WIDTH=8;
    in uint<WIDTH> a,b;
    out register uint<WIDTH> sum=a+b;
}

If you want to describe a register in VHDL, you will have to go through some effort to actually describe how the register is supposed to work. This includes creating a process with proper clk and reset signals. In PSHDL a register is created by placing the register keyword in front of a variable. That variable will then be realized with a proper register description.

But where does the clock come from? Well, in PSHDL there is a special reference called $clk that, if it is referenced, creates a 1 bit in port. The default register, without any further refinement, uses this as the clock. Of course you are free to declare any port to become the default $clk or specify another signal as clock. You can also change on what edge the register is working, whether the reset is active low or high, synchronous or asynchronous, the reset value etc.

Another thing that is different from VHDL is that you can place your ports anywhere in the file. This allows you to place code together that logically belongs together. I can hear you think: but I actually like the port declaration at entity level! There are 2 solutions for your concern: firstly, you can simply place all your ports at the top of the file if you want to; secondly, you can annotate your module so that the compiler automatically creates an interface for you that will sit atop the module. But more about that in a later blog post.

The most important thing to remember about PSHDL is that it reverses the defaults of VHDL synthesis: it is easy to create a register, while it is not possible to create a latch (unless you explicitly tell the compiler to do so). To a seasoned VHDL coder this may seem like a triviality, but be clear:
Unexpected latches are the number one reason student designs don't work.

Second example: Interfaces

The idea of a component, the declaration of ports without providing the actual implementation, is called an interface in PSHDL. An interface can be declared and implemented by a module. It can also be instantiated.

module de.tuhh.ict.BitAdder {
    in bit a,b,cin;
    out bit sum=a^b^cin; 
    out bit cout=(a&b)^(cin&(a^b));
}

interface de.tuhh.ict.IBitAdder {
    in bit a,b,cin;
    out bit sum, cout;
}

module de.tuhh.ict.rippleCarry{
    param uint WIDTH=16;
    in uint<WIDTH> a, b;
    uint<WIDTH> cTemp;
    out register uint<WIDTH> sum;
    //Instantiate the other module
    BitAdder adder[WIDTH];
    //Uncomment to instantiate the interface 
    //(you have to make sure that the synthesis tool 
    //can find the actual implementation)
    // IBitAdder adder[WIDTH];

    adder[0].a=a{0};
    adder[0].b=b{0};
    adder[0].cin=0;
    sum{0}=adder[0].sum;
    cTemp{0}=adder[0].cout;

    for (I={1:WIDTH-1}){
        adder[I].a=a{I};
        adder[I].b=b{I};
        adder[I].cin=cTemp{I-1};
        cTemp{I}=adder[I].cout;
        sum{I}=adder[I].sum;
    }
}

If you are wondering what the curly braces are doing: they can be used to access bits. What you can also see is that it is easy to use arrays. You can create arrays of instances and variables and access them with square brackets, just like in other C based languages.

But what is more interesting here is that you don't need a port map. You simply say: Instantiate me a BitAdder or as in this case, a whole lot of BitAdders. Then you can access the ports of it with the dot notation.

If you change the BitAdder instance to IBitAdder, it is your task to provide an implementation of it with the same name. This is how you can instantiate any VHDL, Verilog or other black box. For VHDL however you get tool support to automatically create the interface for you.

Third example: Statemachines

module de.tuhh.ict.Statemachine{
    enum States={IDLE, WAITING, DOSOMETHING}
    //Do not assign a default state here unless you want to 
    //define a new state in each case statement
    register enum States state;
    in bit a;
    out register bit b;
    out bit c;
    switch (state) {
    case States.IDLE:
        if (a)
            state=States.WAITING;
    //You can leave out the enum prefix (States.) in switch cases where
    //the switch operates on an enum
    case WAITING: 
        if (!a)
            state=DOSOMETHING;
    case States.DOSOMETHING:
        b=1;
        c=1;
        state=States.IDLE;
    default:
        state=States.IDLE;
    }

    out bit x=0;
    if (state==IDLE)
        x=1;
}

In PSHDL there is no such thing as processes. As such, the clock domains are only separated by the definition of the register. This makes it possible to describe a Mealy state machine (one where the outputs depend directly on the inputs) without the need for multiple processes. This increases readability, as everything that is related to the state machine can be found in one place.

Fourth example: Generators

In most FPGA designs these days, you will find some kind of processor. To extend this processor IPCores are used to accelerate certain functions. PSHDL makes it very easy to get a proper IPCore.

package de.tuhh.ict;
module BUSTest{
    include Bus bus=generate plb()<[
        row input{
            rw register uint<16> a;
            rw register uint<16> b;
        }
        row output {
            fill;
            r register uint<16> result;
        }
        column adder {
            input;
            output;
        }
        memory {
            adder[4];
        }
    ]>;
    for (i={0:3}){
        bus.result[i]=bus.a[i]+bus.b[i];
    }
}

This little example generates the infrastructure for a PLB based IPCore. In case you are wondering, AXI and APB are supported as well. This code generates a total of 8 registers. Using a memory map, a description of how the peripheral can be accessed, is a pretty advanced feature, but it allows PSHDL to generate some very useful support files like this HTML representation of the memory layout:

Offset      31..16    15..0       Row
adder[0]
 0 [0x00]   a[0]      b[0]        input[0]
 4 [0x04]   unused    result[0]   output[0]
adder[1]
 8 [0x08]   a[1]      b[1]        input[1]
12 [0x0c]   unused    result[1]   output[1]
adder[2]
16 [0x10]   a[2]      b[2]        input[2]
20 [0x14]   unused    result[2]   output[2]
adder[3]
24 [0x18]   a[3]      b[3]        input[3]
28 [0x1c]   unused    result[3]   output[3]

The generated Interface looks like this:

interface Bus{
   in register uint<16> result[4];
   inout register uint<16> b[4];
   inout register uint<16> a[4];
}

And it also generates the C code that you need to access the SW registers:

//Typedef
typedef struct input {
    bus_uint16_t    a;
    bus_uint16_t    b;
} input_t;
// Setter
int setInputDirect(uint32_t *base, int index, bus_uint16_t a, bus_uint16_t b);
int setInput(uint32_t *base, int index, input_t *newVal);
//Getter
int getInputDirect(uint32_t *base, int index, bus_uint16_t *a, bus_uint16_t *b);
int getInput(uint32_t *base, int index, input_t *result);
//Typedef
typedef struct output {
    bus_uint16_t    result;
} output_t;
//Getter
int getOutputDirect(uint32_t *base, int index, bus_uint16_t *result);
int getOutput(uint32_t *base, int index, output_t *result);
typedef struct adder {
    input_t input;
    output_t output;
} adder_t;

And for printing it nicely to STDOUT:

void printInput(input_t *data);
void printOutput(output_t *data);

Additionally it will also create all the files that you need to place it into your local Xilinx IP directory and instantiate it directly.
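Putting the generated accessors to use from the processor side could look like the following sketch; the header name, the base address and the example values are assumptions for illustration, only the function signatures come from the generated code above:

#include <stdint.h>
#include <stdio.h>
#include "bustest.h" // hypothetical name of the generated header declaring the functions above

// Hypothetical base address of the IP core; in a real design this comes from the address map.
#define ADDER_BASE ((uint32_t *) 0x80000000)

void addAll(void) {
    for (int i = 0; i < 4; i++) {
        // Write the two operands of adder[i]...
        setInputDirect(ADDER_BASE, i, (bus_uint16_t) (10 + i), (bus_uint16_t) (20 + i));
        // ...and read back the sum that the hardware computed.
        bus_uint16_t result;
        getOutputDirect(ADDER_BASE, i, &result);
        printf("adder[%d] = %u\n", i, (unsigned) result);
    }
}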

Monday, March 11, 2013

Thoughts about functions

In the last few days I thought about the following problem:

int a[5]={1,2,3,4,5};
int sum=a[0]+a[1]+a[2]+a[3]+a[4];

It is quite clear what this code is supposed to do. But it scales very badly with the size of a. In a regular sequential programming language, you would solve the problem with a regular for loop like this:

int a[5]={1,2,3,4,5};
int sum=0;
for (I={0:4})
    sum+=a[I];

This is even correct PSHDL syntax. But due to its parallel nature, it does not do what the developer is expecting it to do here. In fact, this is just short for:

int a[5]={1,2,3,4,5};
int sum=0;
sum+=a[0];
sum+=a[1];
sum+=a[2];
sum+=a[3];
sum+=a[4];

Well, this looks just the same as the above. But you have to consider that the last assignment wins. A quick detour:

int a=0;
a=5;
a=7;

a is obviously 7, but it never has the value 5, not for a second, not for a picosecond, just plain never! You can think about it like this: PSHDL walks through the code from the top to the bottom. When an assignment a=X is found, it remembers that at the end of the code it will assign X to a. That is, unless somewhere below it finds a=Z. In that case the X assignment is totally forgotten and Z is assigned. The same holds true for our sum example. For PSHDL the slightly longer version is equivalent to:

int a[5]={1,2,3,4,5};
int sum=0;
sum+=a[4];

Sequential loops

One idea that comes to mind is to introduce a new keyword, for example sequential. When we put that in front of for, the contents are interpreted sequentially. The example:

int a[5]={1,2,3,4,5};
int sum=0;
sequential for (I={0:4})
    sum+=a[I];

This would now do what the user expected it to do. But unfortunately we just gave the user a very powerful tool in his hands. There is nothing, well nothing that I want to code, that keeps him from writing:

int a[5]={1,2,3,4,5};
int sum=0;
sequential for (I={0:3}) {
    sum+=a[I];
    a[I+1]=0;
}

What is that supposed to do? The user now has the ability to write very dysfunctional code. It is very difficult to distinguish valid use cases from invalid ones, so it is better to provide a clean concept for solving this problem.

Functions to the rescue?

However, summing up an array is a very plausible thing to do, and writing everything by hand is very error prone. Also, it cannot be done when the size of the array is parameterized. So my first idea for giving the user the ability to solve this problem properly was to allow declaring functions:

function uint doSum(uint a[]) {
    uint sum=0;
    for (I in a){
        sum+=I;
    }
    return sum;
}

int a[5]={1,2,3,4,5};
int sum=doSum(a);

This however has a few weaknesses:

  • The same syntax that is used for the parallel way of doing things is suddenly used for sequential stuff
  • The function can only sum uint arrays
  • It is very hard to create simulate-able code for this as the function would need to be unrolled into parallel code

While the second problem could be solved with type parameters, which would introduce other syntax problems, I really don't like the first problem. The user should be able to understand the model of PSHDL very quickly. Having two different kinds of models does not help with that.

Maybe lambda functions?

Another thing I thought about: if using the same language with sometimes parallel and other times sequential meaning might confuse the user, why not use a different paradigm which can elegantly solve these kinds of problems. In functional programming a solution might look like this:

int a[5]={1,2,3,4,5};
int sum=a.foldLeft{a,b|a+b};

This would have the benefit of being very short and precise. But it would require the user to have an idea of how functional programming works. Nonetheless, the type problem would be solved as well.

I have to admit that I like the lambda approach. But I am not sure whether many people will understand it. So to not cause utter confusion, I decided that I will provide the user with sum, (left|right)Difference, xor, or, and, and mul functions. Maybe those will map to a more generic version that takes an enum to select the operation, like this:

int a[5]={1,2,3,4,5};
int sum=apply(a, PLUS);

The nice thing about the apply function is that you can also write it like this:

int sum=a.apply(PLUS);

Another nice aspect is that PLUS is an enum, and as such you even provide the operation as a parameter. This approach, however, has some disadvantages that arise from its simplicity. How, for example, would we add every second item? Or how would we do something like a[0]*x + a[1]*x? Those cases would require decomposing the expressions into temporary signals. On the other hand, it would certainly be possible to have functions that can apply operations to an array. That case would look like this:

int sum=a.map(MUL, x).apply(PLUS)

Monday, February 25, 2013

Moving away from XText


When I started with PSHDL, I used XText for developing the language. It seemed very beneficial to me, as I had plenty of experience with it (some of it bad, but most of it very good). If you think about it, the idea of XText is pretty nice: you create a quite straightforward language definition and by some magic you get a complete Eclipse editor with syntax highlighting, outline, rename refactoring, linking, etc. This is really awesome. But as PSHDL evolved, it became harder to justify the usage of XText. After all, it comes with some baggage.

If you don't know XText, go check it out. One of the first things that annoyed me, however, is the fact that the generated model is based on EMF and Eclipse. Everything is fine as long as all you want is an Eclipse editor. However, I discovered that PSHDL has to be more than just an Eclipse editor. While it is possible to create a command-line version of it, that comes with some nasty side-effects. The first one is that it depends on no fewer than 45 external jars to run. This makes the compiler executable a 25 MByte biggie that takes 850ms just to fire up a completely useless OSGi and plugin management system. Compiling a bigger piece of code takes 5s.

As "replacement" I used ANTLR4. This of course does not give me a full blown Eclipse IDE, but it gives me the freedom to implement other things. For example I can now access the comments and bring most of them over to the VHDL code. This helps to bring some documentation to VHDL. I can also add documenting Comments JavaDoc style. This allows me to annotate the port declaration. Another thing is that I can give better error messages and provide better guidance towards a correct source in case of syntax errors. Also the compiler can now become much smaller. It is now 3.8MBytes and take 3.8s to compile the same source.

But what is more important than anything else: I can now easily embed it into any IDE that I like, not just Eclipse. People are more willing to integrate it into their workflow if it is small and fast. The downside is that I have to develop my own Eclipse tooling. But I am fairly familiar with Eclipse, so that is no big problem for me.

Thursday, February 7, 2013

Using XTend

In the previous post I explained how I moved away from AspectJ towards XTend, so I want to share my experience in migrating. Let's start with a quick list of things I like:

Things that are cool

  • Polymorphic runtime dispatch
    • This was the main reason to switch to XTend
  • Template Strings
    • They can come in very handy when you want to create stuff like, for example, an HTML file
  • Extension Methods
    • You can "extend" any type and make things look more OOP-like, instead of the regular procedural code where you don't see what the operation is actually performed on
  • The whole "accessing getters as fields" idea
    • I just have so many getters that it makes a lot of sense
  • The reduced noise from leaving the ; away

A cool example:

def dispatch String toString(HDLConcat concat, SyntaxHighlighter highlight) 
  '''«FOR HDLExpression cat : concat.cats SEPARATOR highlight.operator("#")» «highlight.operator(cat.toString(highlight))» «ENDFOR»'''

Instead of:

public String HDLConcat.toString(SyntaxHighlighter highlight) {
 StringBuilder sb = new StringBuilder();
 String spacer = "";
 for (HDLExpression cat : getCats()) {
  sb.append(spacer).append(highlight.operator(cat.toString(highlight)));
  spacer = highlight.operator("#");
 }
 return sb.toString();
}

Things I am not passionate about:

  • val and var: it's quite alright that you can leave the type out or declare something as final, but I like my types visible, so I declare them anyway in most cases
  • The tooling. It is decent, but not yet something to brag about
  • The "everything is an expression" idea. Ternary operators are nice; no need to replace them with if-then-else constructs. So far I have always written a return statement instead of relying on the "the last statement is the return value" thingy.
  • The idea that all imports should be explicit (vs. the usage of wildcards)

Things that are not optimal:

  • In order to use nested classes, you have to use the $ sign to refer to them. Knowing that this is the JVM way of referring to them is not a good excuse.

Things that are plain stupid:

  • As of now you cannot allocate arrays (because the [] brackets are used for closures). This, however, is already a planned change
  • There are no character literals. Because of stupid auto-boxing I had to write a little helper method instead

So, instead of simply writing:

Character.toString((char) (i + 'I'))

I now had to write:

def String asIndex(Integer integer) {
    val int i='I'.charAt(0)
    return Character::toString((i + integer) as char);
}

However, because this can be used as an extension method, I was able to write:

i.asIndex

  • Working with arrays yields horrible performance because they are wrapped into a list
  • Working with enums, especially in switch cases, is awkward to say the least

Here is a simple example:

switch (obj.presentation){
case HDLLiteral$HDLLiteralPresentation::STR:
    return null
case HDLLiteral$HDLLiteralPresentation::BOOL:
    return null
}

Where is your type inference now? To be fair, you can create a static import for those enums, but switches are strange beasts in XTend. They are not real switch cases translated 1:1 to Java; instead they are cascaded if statements. Also, fall-throughs and empty cases are not supported.
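
For contrast, here is the kind of plain Java switch, with empty cases that fall through, that has no direct equivalent in XTend. This is just my own illustration with a made-up Op enum, not code from the PSHDL sources:

class SwitchDemo {
 enum Op { OR, XOR, AND, LOGI_AND, LOGI_OR }

 static String kind(Op op) {
  switch (op) {
  case OR:       // empty case: falls through to XOR
  case XOR:
   return "bitwise";
  case LOGI_AND: // empty case: falls through to LOGI_OR
  case LOGI_OR:
   return "logical";
  default:
   return "other";
  }
 }
}

In XTend, the same grouping has to be expressed with boolean case expressions instead, as in this example: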

switch (type: obj.type) {
case type==OR || type==XOR:{
    return Ranges::closed(ZERO, ONE.shiftLeft( leftRange.upperEndpoint.bitLength ).subtract( ONE ))
}
case AND: {
    return Ranges::closed(ZERO, leftRange.upperEndpoint.min( ONE.shiftLeft( rightRange.upperEndpoint.bitLength ).subtract( ONE )))
}
case type==LOGI_AND || type==LOGI_OR: {
    return Ranges::closed(ZERO, ONE)
}
}

The resulting Java code for the last example is:

boolean _matched = false;
if (!_matched) {
 boolean _or = false;
 boolean _equals = Objects.equal(type, HDLBitOpType.OR);
 if (_equals) {
   _or = true;
 } else {
   boolean _equals_1 = Objects.equal(type, HDLBitOpType.XOR);
   _or = (_equals || _equals_1);
 }
 if (_or) {
   _matched=true;
   BigInteger _upperEndpoint = leftRange.upperEndpoint();
   int _bitLength = _upperEndpoint.bitLength();
   BigInteger _shiftLeft = BigInteger.ONE.shiftLeft(_bitLength);
   BigInteger _subtract = _shiftLeft.subtract(BigInteger.ONE);
   return Ranges.<BigInteger>closed(BigInteger.ZERO, _subtract);
 }
}
if (!_matched) {
 if (Objects.equal(type,HDLBitOpType.AND)) {
   _matched=true;
   BigInteger _upperEndpoint_1 = leftRange.upperEndpoint();
   BigInteger _upperEndpoint_2 = rightRange.upperEndpoint();
   int _bitLength_1 = _upperEndpoint_2.bitLength();
   BigInteger _shiftLeft_1 = BigInteger.ONE.shiftLeft(_bitLength_1);
   BigInteger _subtract_1 = _shiftLeft_1.subtract(BigInteger.ONE);
   BigInteger _min = _upperEndpoint_1.min(_subtract_1);
   return Ranges.<BigInteger>closed(BigInteger.ZERO, _min);
 }
}
if (!_matched) {
 boolean _or_1 = false;
 boolean _equals_2 = Objects.equal(type, HDLBitOpType.LOGI_AND);
 if (_equals_2) {
   _or_1 = true;
 } else {
   boolean _equals_3 = Objects.equal(type, HDLBitOpType.LOGI_OR);
   _or_1 = (_equals_2 || _equals_3);
 }
 if (_or_1) {
   _matched=true;
   return Ranges.<BigInteger>closed(BigInteger.ZERO, BigInteger.ONE);
 }
}

My summary

XTend is a nice language, but in some places strange decisions have been made, regarding the language itself as well as the realization of the generated code. Here is a last example that is just plain horrible:

def static listTest() {
    val list=new ArrayList<String>
    for (i:0..list.size-1)
        System::out.println("Item:"+i+" is "+list.get(i))
}

If the list is empty (as it is, because I didn't add anything) it will crash, because the index 0 is out of range. And it is not like you could use the plain Java-style for loop; you have to create a range...
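
For comparison, this is the plain Java-style loop I mean. Its condition is checked before the first iteration, so an empty list simply results in zero iterations instead of a crash. This is only my own sketch of the point, not code from the PSHDL code base:

import java.util.ArrayList;
import java.util.List;

class ListTest {
 static void listTest() {
  List<String> list = new ArrayList<String>();
  // The condition i < list.size() is evaluated before the first iteration,
  // so an empty list never reaches list.get(i).
  for (int i = 0; i < list.size(); i++) {
   System.out.println("Item:" + i + " is " + list.get(i));
  }
 }
}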

Switching away from AspectJ

Up until now, I used AspectJ to insert functions into the AST classes. This allowed me to conveniently call something like .constantEvaluate() on any node and get back the value, if applicable. Also, all functionality for creating a constant from the AST was encapsulated in one file. For literals, for example, this would look like this:
 
public BigInteger HDLLiteral.constantEvaluate(HDLEvaluationContext context) {
 switch (getPresentation()){
  case STR:
  case BOOL:
   return null;
  default:
   return getValueAsBigInt();
  }
}

This inserts the function constantEvaluate into the type HDLLiteral. On a bytecode level the function really is copied into it. But the downside of using AspectJ is that this only happens at the bytecode level, and so quite a few tools, including GWT, could not use the result. I really like GWT, however, and want to be able to do fancy stuff with it on pshdl.org. So I needed to find a replacement. Just inserting the functions into the respective classes was not an option for me, as I really want a good separation of concerns. So I decided to use XTend.

The nice thing about XTend is that it has polymorphic dispatch. You essentially just declare a bunch of overloaded functions, and at runtime XTend will dispatch the call to the most specific function. One would assume that Java does this automatically, but the called method is determined at compile time. Take this example:

protected CharSequence _print(final Integer i) {
 StringConcatenation _builder = new StringConcatenation();
 _builder.append("Integer:");
 _builder.append(i, "");
 return _builder;
}

protected CharSequence _print(final String s) {
 StringConcatenation _builder = new StringConcatenation();
 _builder.append("String:");
 _builder.append(s, "");
 return _builder;
}

protected CharSequence _print(final Object o) {
 StringConcatenation _builder = new StringConcatenation();
 _builder.append("Object:");
 _builder.append(o, "");
 return _builder;
}

public CharSequence print(final Object i) {
 if (i instanceof Integer) {
  return _print((Integer) i);
 } else if (i instanceof String) {
  return _print((String) i);
 } else if (i != null) {
  return _print(i);
 } else {
  throw new IllegalArgumentException("Unhandled parameter types");
 }
}

public static void main(String[] args) {
 Bla b = new Bla();
 Integer i = 5;
 String s = "Hallo";
 Object o = "Moin";
 System.out.println(b._print(i));
 System.out.println(b._print(s));
 System.out.println(b._print(o));
 System.out.println(b.print(i));
 System.out.println(b.print(s));
 System.out.println(b.print(o));
}

The output will be:

Integer:5
String:Hallo
Object:Moin
Integer:5
String:Hallo
String:Moin

What you can see is that the called method depends on the declared type, not the actual type. This is fixed with the call of print vs. _print. To generate the print method, all you have to do is write the following in XTend:

class Bla {

 def dispatch print(Integer i)'''Integer:«i»'''
 def dispatch print(String s)'''String:«s»'''
 def dispatch print(Object o)'''Object:«o»'''
 
}

With that in place, you can create a very nice collection of constantEvaluate functions that automatically dispatch on the type determined at runtime. This is very neat: you can collect all constantEvaluate implementations in a single XTend class, with one function per type. While playing around with XTend, however, I discovered a few downsides that I will explain in the next post.

Monday, January 21, 2013

Defining a good language Kernel

When you design a new language, you have the ultimate freedom of defining what your language should look like. There are plenty of language flavors available, each with their own set of advantages and disadvantages. As the language designer, it is your ultimate goal to create a language that can easily be understood, learned, and used. It should make common tasks easy but also allow more complicated cases.

A strictly procedural syntax seemed a bit too plain for me, and a functional one too far away from the hardware. So I went with the most successful approach so far: the object-oriented one. But I did not try to create academic overhead by forcing a full OOP style on the programmer. Instead, I designed the expression syntax to be equivalent to C/Java expressions and added a few OOP concepts where they made sense. This allows most programmers to get an idea of what is going on quite quickly. There are two additions to the familiar expressions: bit accesses, designated with curly braces, and concatenation of numbers with #.

Most statements also look quite a bit like regular C/Java code. But I had to add a few other concepts; interfaces, generators, variable-bit-width integers, and registers are among them. I will quickly explain them here and get into more detail in a later post.

Interfaces are kind of like binding contracts for developers. If an interface is declared, the implementation has to provide exactly the ports that are declared there. This is really useful for communicating an API when the implementation is not known, does not exist yet, or is in another language. For example, if you want to instantiate VHDL entities, you have to declare their interface, which you can then instantiate. It is up to the developer to ensure that the interface matches the VHDL implementation. Another example is generators...

Generators are a very powerful concept for automatically generating code from a short description. To declare a generator, multiple things are required: an interface which the generated unit has to obey, parameters which specify what code should be generated, and an optional piece of code. The last part is what makes generators so powerful; you can essentially embed another language with it. Take the bus generator for example: from a few lines of code it can generate a whole lot of PSHDL code, C code, and even documentation. Other use cases might include a processor with C code, a language for state machines, a C-to-hardware generator, etc.

What I was not able to transfer over to PSHDL, however, is the semantics of C: a fully sequential control flow. Instead, PSHDL took the path of other HDLs by having everything run in parallel. If you need sequential execution, you have to code a state machine. This is the price you pay for programming FPGAs instead of CPUs. In hardware, things are pipelined and run in parallel. The closest equivalent of functions is other IP cores, but to wire them up efficiently, the sequential paradigm had to go. To create efficient pipelines and state machines, registers come into play. They allow you to create really fast and robust synchronous logic, which is why they are an integral part of PSHDL. While on CPUs the width of int is chosen to be the most efficient for the given architecture, on an FPGA the developer has to decide how many bits a variable should have. Make it too big and expensive routing and fabric resources are wasted, and timing will start to degrade.

Friday, January 18, 2013

Why PSHDL?

The reason I created PSHDL is that I needed a language for programming FPGAs for my PhD thesis. My thesis is about FPGA architectures, that is, the very low-level architecture. So, for example, questions when designing FPGAs are: How many registers do I need? How many routing resources are necessary? What coarse-grained hardware elements can help to improve performance?

To answer any of those questions, benchmarks are essential. To create those benchmarks, I looked at various languages to check how easy it is to parse them and make sense of them. VHDL and Verilog failed miserably. One problem in VHDL, for example, is that it is easier to create a latch by accident than it is to create a properly described register. There are like a million ways to create a register, and even more to describe something that looks like a register but is one in one tool and not in another. Yet registers are the building blocks of every piece of hardware. Another problem is the simulation/hardware mismatch that makes it difficult to validate your results. Do I need to mention that VHDL is very hard to parse as well?

After I looked at the options, I decided that it would be a good idea to create my own language. It is not like there aren't a gazillion other languages with varying focuses, but I thought: let's create one that is truly fun to use, with proper IDE support and all that fancy tooling that you're used to when you program anything except HDLs. That is why I created my language in XText. From a very sparse language description it not only creates a proper parser and an AST (an abstract syntax tree that represents the input as a tree), but also quite nice tooling for Eclipse.

In the meantime I also created a web interface. The idea is that it allows interested people to take a look at PSHDL without the need to install anything on their computer. After a while I realized that it can be more than just a toy, and that is what I am aiming for now.

Welcome to PSHDL

In this blog I try to outline my motivation for the creation of PSHDL, which by the way stands for Plain Simple Hardware Description Language. As the name already suggests, a great emphasis has been put on creating a language that is easy to learn and fun to work with, unlike VHDL, which I hate with a passion. So expect many rumbling rants about VHDL here. If you're thinking "good, I am programming Verilog": well, I think that Verilog has the same problems in most cases, so it is just slightly better than VHDL, but not enough to make a difference.
If you have any feedback about PSHDL, don't hesitate to contact me. It is far from perfect, but every bug report I get helps to improve it.