Monday, April 7, 2014

The future of PSHDL part 2 (modules and sequential behavior)

This is part 2 of the improvements that I plan for the next language release v0.2. The first part can be found here.

Thoughts about modules and sequential behavior

One of the questions I was asked during my presentation of PSHDL at the 30C3 was about creating a catalog of easily usable IP cores. After all, this is key to the success of Arduino: without its libraries it would only be a nice-looking IDE, not the success it is now. So this question really is a key point in making PSHDL the Arduino for FPGAs.

When you take a look at OpenCores you will find plenty of cores that are freely available, but using them is hard, for multiple reasons. The first is documentation. After spending such a long time developing an IP core that works, you really have to motivate yourself to write extensive documentation that allows others to make use of it. This usually includes lengthy documents about the usage scenario, the input data format, the output data format, the control signals, the expected flow of signals and many others. Those are usually described in English, which, depending on the author, can result in ambiguous descriptions.

So the best way to package an IP core is to make as many of its parameters machine-parseable as possible. After all, the language that most developers speak is the language they have written the IP core in. While some parts of an IP core are very easy to formalize, such as the ports, others are harder, for example the timing that is required by a module.

Everyone knows and uses state machines for describing sequential behavior in hardware. Those, however, can become very annoying when you have to interact with other state machines. Let's pretend we invented a very useful FPU that consumes a varying amount of time, depending on the operation. We also want to re-use that one FPU because it is expensive. So for a simple math function like f(x)=(a*b+c)^2 we have to write a state machine like this:

enum OpTypes={ADD, SUB, MUL};
interface FPU {
    in bit<32> a;
    in bit<32> b;
    in bit start;
    in enum OpTypes op;
    out bit<32> res;
    out bit done;
}

module MulAddSqr {
    FPU fpu;
    enum FunctionState={IDLE, MUL_START, MUL_WAIT, ADD_START, ADD_WAIT, SQR_START, SQR_WAIT};
    register enum FunctionState state;
    in bit<32> a;
    in bit<32> b;
    in bit<32> c;
    out register bit<32> res;
    in bit start;
    out bit done=state==IDLE;
    switch (state) {
        case IDLE:
            if (start)
                state=MUL_START;
        case MUL_START:
            fpu.a=a;
            fpu.b=b;
            fpu.op=MUL;
            fpu.start=1;
            state=MUL_WAIT;
        case MUL_WAIT:
            fpu.start=0;
            if (fpu.done) {
                state=ADD_START;
            }
        case ADD_START:
            fpu.a=fpu.res;
            fpu.b=c;
            fpu.op=ADD;
            fpu.start=1;
            state=ADD_WAIT;
        case ADD_WAIT:
            fpu.start=0;
            if (fpu.done) {
                state=SQR_START;
            }
        case SQR_START:
            fpu.a=fpu.res;
            fpu.b=fpu.res;
            fpu.op=MUL;
            fpu.start=1;
            state=SQR_WAIT;
        case SQR_WAIT:
            fpu.start=0;
            if (fpu.done) {
                state=IDLE;
                res=fpu.res;
            }
        default:
    }
}

That is a lot of code for something rather trivial. The code is longer than it has to be because the state machine has to be described explicitly. We are not really interested in which state the state machine is in, only in the fact that something happens sequentially. Wouldn't it be awesome to be able to write something like this? (This is not yet implemented and subject to change, or maybe I will never implement it at all.)

//... FPU and OpTypes declarations remain the same

statemachine bit<32> do(interface<FPU> fpu, -bit<32> a, -bit<32> b, -enum<OpTypes> op) {
    {
        fpu.a=a;
        fpu.b=b;
        fpu.op=op;
        fpu.start=1;
        nextState();
    }
    {
        if (fpu.done)
            return fpu.res;
    }   
}

module MulAddSqr {
    FPU fpu;
    in bit<32> a;
    in bit<32> b;
    in bit<32> c;
    out register bit<32> res;
    in bit start;
    out bit done=0; 
    statemachine do fpu_ctrl;
    statemachine mulAddSqr{
        $idle: {
            done=1;
            if (start==1)
                nextState();
        }       
        fpu_ctrl.run(fpu, a, b, OpTypes.MUL);
        fpu_ctrl.run(fpu, fpu.res, c, OpTypes.ADD);
        res=fpu_ctrl.run(fpu, fpu.res, fpu.res, OpTypes.MUL);
    }
}

Here I combine a few things. Let's start with the statemachine keyword. Unlike a switch, a statemachine does not have case labels. Instead, every statement becomes a state with a unique, automatically generated label. If you want to jump to a specific state, you can optionally declare a label and pass it to the nextState function. If you simply want to continue with the next state, you call nextState without an argument.
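The labeling rule can be modeled outside of PSHDL. The following is a minimal Python sketch, not the compiler's implementation: the function names, the `state_` prefix and the wrap-around from the last state back to the first are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def assign_labels(states: List[Optional[str]], prefix: str = "state_") -> List[str]:
    """Give every unlabeled state an automatically generated label.

    `states` holds the explicit label of each state, or None if the
    author did not declare one. The prefix is a hypothetical choice.
    """
    labels = [
        explicit if explicit is not None else f"{prefix}{index}"
        for index, explicit in enumerate(states)
    ]
    if len(set(labels)) != len(labels):
        raise ValueError("state labels must be unique")
    return labels


def default_next(labels: List[str]) -> Dict[str, str]:
    """Map each state to the target of nextState() without an argument.

    Assumption: the last state continues with the first one.
    """
    return {
        label: labels[(index + 1) % len(labels)]
        for index, label in enumerate(labels)
    }


# One labeled state ($idle) followed by three unlabeled statements.
labels = assign_labels(["$idle", None, None, None])
transitions = default_next(labels)
print(labels)
print(transitions)
```

A call to nextState with a label would simply bypass the `transitions` table and name the target state directly.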

Internally state machines will be turned into modules. The inline state-machine mulAddSqr is replaced with the following equivalent code:

register enum mulAddSqr_states {$idle, 
    state_1_run, state_1_wait,
    state_2_run, state_2_wait, 
    state_3_run, state_3_wait } mulAddSqr_state;
enum mulAddSqr_states $nextState;
switch (mulAddSqr_state) {
    case $idle: {
        $nextState=state_1_run;
        done=1;
        if (start==1)
            mulAddSqr_state=$nextState;
    }
    case state_1_run: {
        $nextState=state_1_wait;
        do.fpu=fpu;
        do.a=a;
        do.b=b;
        do.op=OpTypes.MUL;
        do.run=1;
        mulAddSqr_state=$nextState;
    }
    case state_1_wait: {
        $nextState=state_2_run;
        if (do.done)
            mulAddSqr_state=$nextState;
    }
    // the states for the second and third run follow the same pattern
}

The function-like state machine do is equivalent to the following module:

module do {
    @smStart
    in bit start;
    @smOp("a")
    in bit<32> a;
    @smOp("b")
    in bit<32> b;
    @smOp("op")
    in enum OpTypes op;
    @smDone
    out bit done;
    @smResult
    out bit<32> result;
    @smOp("fpu")
    import record FPU fpu;

    register enum states {$idle, state_1, state_2} state;
    enum states $nextState;
    switch (state) {
        case $idle: {
            $nextState=state_1;
            if (start)
                state=$nextState;
        } 
        case state_1: {
            $nextState=state_2;
            fpu.a=a;
            fpu.b=b;
            fpu.op=op;
            fpu.start=1;
            state=$nextState;
        }
        case state_2: {
            $nextState=$idle;
            if (fpu.done){
                result=fpu.res;
                done=1;
                state=$idle;
            }
        }
    }
}

There are still some issues left to investigate, but I have people working on that. I think the most important aspect of all this is that you can write re-usable state machines and create sequential behavior much more easily.

Tuesday, April 1, 2014

The future of PSHDL (part 1)

With the PSHDL board campaign running, I think it is important to take a look at the future of PSHDL. In a series of posts I will show what I am working on right now and what can be expected to be realized within the next few months.

PSHDL Language features for V0.2

While I am busy with fixing the bugs that are being reported, I am also thinking about the next language features that I want to implement. In this blog entry I want to give a little preview of what I have in mind. Everything mentioned here is work in progress and subject to change, but I would be interested in what you think.

Any width types

One of the things that I see rather frequently is code like this:

in bit<16> addr;
out uint<16> bla;
bla=(uint)addr;

This code does not do what the author intended it to do. It transforms the 16 bit value into a 32 bit integer, and then back to a 16 bit integer. Fortunately the synthesis is smart enough to ignore this, but when you replace 16 with 64, things can get ugly. So a new type will be introduced, the any width type, which allows you to write something like this:

in bit<16> addr;
out uint<16> bla;
int<> temp=bla;
bla=(uint<>)addr;

The new type takes the width of the right-hand side and simply changes the value interpretation. It can also be used to create temporary new signals, but those signals are only allowed to be written exactly once, with the declaration.
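To see why the detour through 32 bits is harmless at 16 bits but not at 64, here is a small Python model of both casts. It is only a sketch: the helper names are made up for illustration, and the 32 bit width of the plain (uint) cast is taken from the description above.

```python
def mask(width: int) -> int:
    """All-ones bit mask of the given width."""
    return (1 << width) - 1


def as_unsigned(value: int, width: int) -> int:
    """Interpret the lowest `width` bits as an unsigned number."""
    return value & mask(width)


def as_signed(value: int, width: int) -> int:
    """Interpret the lowest `width` bits as a two's complement number."""
    value &= mask(width)
    sign_bit = 1 << (width - 1)
    return value - (1 << width) if value & sign_bit else value


def cast_via_32(value: int, width: int) -> int:
    """Model of (uint): squeeze through 32 bits, then resize to `width`."""
    return as_unsigned(as_unsigned(value, 32), width)


# 16 bit: the detour through 32 bits loses nothing.
print(hex(cast_via_32(0xFFFF, 16)))

# 64 bit: everything above bit 31 is silently dropped.
wide = 0x1_0000_0001
print(hex(cast_via_32(wide, 64)), hex(as_unsigned(wide, 64)))

# Any width cast: same bits, same width, different interpretation.
print(as_signed(0xFFFF, 16))
```

The any width cast corresponds to `as_unsigned` and `as_signed` being called with the width of the operand itself, so no bits are added or removed.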

Records or structs

Sometimes it makes sense to keep things together that belong together. For example, an SPI bus can have an interface like this:

interface SPI {
    in bit miso;
    out bit mosi;
    out bit sclk;
    out bit ss_n;
}

Now, if you want to connect some SPI buses internally, you would need to write something like this:

testbench SPITest {
    SPIMaster dut;
    SPISlave dummy;
    dut.miso=dummy.miso;
    dummy.mosi=dut.mosi;
    dummy.sclk=dut.sclk;
    dummy.ss_n=dut.ss_n;
}

With a record you could do something like this:

testbench SPITest {
    SPIMaster dut;
    SPISlave dummy;
    record SPI bus;
    bus.connectTo(dut);
    dummy.connectTo(bus);
}

Only signals with the same type, width and name are connected. The directions of all signals have to be either the same as in the record or the opposite of it. With that rule one might actually write:

testbench SPITest {
    SPIMaster dut;
    SPISlave dummy;
    dut.connectTo(dummy);
}
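The matching rule behind connectTo can be sketched in Python. This is an illustration under assumptions, not how PSHDL implements it: the `Port` class, the function names and the port lists for SPIMaster and SPISlave (including the extra `irq` port) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Port:
    name: str
    direction: str  # "in" or "out"
    type: str       # e.g. "bit"
    width: int


def matching_ports(a: List[Port], b: List[Port]) -> List[Tuple[Port, Port]]:
    """Pair up ports that agree in name, type and width."""
    b_by_name = {port.name: port for port in b}
    pairs = []
    for port in a:
        other = b_by_name.get(port.name)
        if other and (other.type, other.width) == (port.type, port.width):
            pairs.append((port, other))
    return pairs


def direction_relation(pairs: List[Tuple[Port, Port]]) -> str:
    """Return "same" or "opposite"; a mix of both is rejected."""
    same = [p.direction == q.direction for p, q in pairs]
    if all(same):
        return "same"
    if not any(same):
        return "opposite"
    raise ValueError("directions must all be the same or all be opposite")


# Hypothetical port lists for the two modules of the testbench.
master = [Port("miso", "in", "bit", 1), Port("mosi", "out", "bit", 1),
          Port("sclk", "out", "bit", 1), Port("ss_n", "out", "bit", 1)]
slave = [Port("miso", "out", "bit", 1), Port("mosi", "in", "bit", 1),
         Port("sclk", "in", "bit", 1), Port("ss_n", "in", "bit", 1),
         Port("irq", "out", "bit", 1)]  # no counterpart, stays unconnected

pairs = matching_ports(master, slave)
relation = direction_relation(pairs)
print([p.name for p, _ in pairs], relation)
```

Master and slave mirror each other, so every pair has opposite directions and the connection is accepted; a record that is connected to a module with identical directions would yield "same" instead.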

With the records another new feature can be implemented...

Conditional instances

When you design a library, for example a clock divider IP core, chances are that you will have to use a vendor specific IP core.

interface IClockDivider {
    in bit clk;
    out bit scaledClk;
}

module ClockDivider {
    export record IClockDivider div;
    switch (vendor) {
        case Xilinx:
            import xilinx.*;
            PLL pll;
            pll.clock=div.clk;
            div.scaledClk=pll.clkX;
        default:
            assert("Only supporting Xilinx");
    }
}

The export keyword would make the record appear as regular signals on the module. The vendor is an enum that is defined in the pshdl.* namespace and whose value is passed to the compiler via the synthesis settings.

Combined declaration and instantiation

Another simplification is that an enum or interface can be declared and instantiated at the same time. This eases the common case where you want to use your enum for a state machine immediately.

register enum X {A,B} inst;
interface VHDL.work.Blínk {
    in bit clock;
    in bit reset;
    out bit led;
} blink;

To the future and beyond!

Another very important feature that is being worked on is re-usable modules. This is something that deserves its own chapter and will be posted in the future.