An Automatic Approach to Generate Haste Code from Simulink ...


Table 2. Comparisons among different blocks using channels, shared variables, with state or without state implementation. (These results refer to a different implementation of the design depicted in Fig. 3 and are in number of gates.)

    Implementation choices             Area [µm²]
    Registers  Channels  Variables     Memory     C-gates    Total
    X          X         -             15857.6    1441.0     54829.1
    X          -         X             15857.6    490.6      54140.9
    -          X         -             0          1134.4     44470.9
    -          -         X             0          367.9      43683.8

Table 3. Comparisons among different coding styles for the design depicted in Fig. 3.

                Tuple               Registers           Area [µm²]
    Design      Not Used  Used      Not Used  Used      Total/C-gates
    Datapath    X         -         -         X         11454.9/4804.4
                -         X         -         X         11883.8/4792.2
                X         -         X         -         4067.3/438.3
                -         X         X         -         3670.3/254.5

cheap, but they require explicit synchronization between readers and writers in order to avoid missed or duplicated data, since registers are shared between the writer and the readers.

Channels, on the other hand, automatically synchronize input and output actions of modules running in parallel and thereby guarantee a correct timing relationship between the read and write actions. Their implementation is more expensive in terms of area than a shared variable (around 1.5%; see Tab. 2 for further details). To keep the conversion of the Simulink model to Haste straightforward, we would like to avoid explicit synchronization between modules. Therefore we choose to use channels instead of shared variables.

A channel is a communication mechanism shared between different objects with at least one transmitter and at least one receiver. The implementation of a channel relies on the bundled data approach. This implementation consists of a data part and a control part. The control part takes care of the communication protocol and the required delay matching of the data part.

The simplest way to describe how the blocks communicate in a Simulink diagram is to use a separate channel for each input/output.
This solution is straightforward to implement, but it can be more expensive since every input has its own control logic.

Haste allows the user to group together data channels, thereby sharing handshake control circuitry. Such a multiple data channel is called a tuple channel. This solution requires less area. However, deadlock can be introduced because all the input communications are synchronized together, which does not allow individual completion.

Figure 4. Example of a Simulink model that can lead to a deadlock (see Fig. 5). Blocks: I (o!I), A (i?v ; o!A(v)), B (i?[[ao,io]] ; o!B(ao,io)).

A typical example is the one depicted in Fig. 5: block A needs to have a complete handshake on its inputs to compute; block I needs to wait until all the blocks fed by its output have captured its value before continuing. For this reason, before concluding the communication with A, it needs to wait for the completion of the communication with B. However, B cannot finish its communication with I until it receives data from A, and this can never happen, since A cannot compute until it finishes its input communication with I. So the system is stuck waiting for a condition that will never happen.

4.1.2. Functions or Procedures. A module in Haste can be described as a fully combinational block or as a block with registers (Fig. 6). Data-flow networks usually do not include stages (since data is processed from input to output continuously). However, in order to increase system throughput, decoupling stages (i.e. registers or latches) can be required. The results presented in Tab. 3 show a large difference in terms of area between the two implementations. Since a design trade-off exists between area and speed, it is possible to choose the desired implementation.

4.1.3. Register Placement. As previously mentioned, Simulink models do not have the concept of registers as is usual in digital design. Most standard blocks perform operations regardless of the concept of time. Only a few blocks are related to timing events.
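As an aside, the circular wait behind the tuple-channel deadlock described above (Fig. 5) can be reproduced with a very small model. The Python sketch below is not Haste and is not part of the TiDE flow; it is only an illustrative abstraction in which each handshake step is a boolean event that fires once its preconditions hold. The event names (A_ack_I, I_done, A_offers, B_ack_I, B_ack_A, B_done) and the functions build_rules, simulate, and deadlocked are hypothetical and were introduced here for illustration.

```python
from typing import Callable, Dict

State = Dict[str, bool]


def build_rules(tupled: bool) -> Dict[str, Callable[[State], bool]]:
    """Return the precondition of every (hypothetical) handshake event."""
    rules: Dict[str, Callable[[State], bool]] = {
        # A is always ready to accept the value offered by I.
        "A_ack_I": lambda s: True,
        # I's broadcast completes only when every reader has captured it.
        "I_done": lambda s: s["A_ack_I"] and s["B_ack_I"],
        # A computes only after its input handshake with I has completed.
        "A_offers": lambda s: s["I_done"],
        # B can capture A's value only once A offers it.
        "B_ack_A": lambda s: s["A_offers"],
        # B computes when both of its inputs have been captured.
        "B_done": lambda s: s["B_ack_I"] and s["B_ack_A"],
    }
    if tupled:
        # Tuple channel: both inputs share one handshake, so B captures
        # I's value only when A's value is available as well.
        rules["B_ack_I"] = lambda s: s["A_offers"]
    else:
        # Separate channels: B captures I's value independently.
        rules["B_ack_I"] = lambda s: True
    return rules


def simulate(tupled: bool) -> State:
    """Fire enabled events until no further progress is possible."""
    rules = build_rules(tupled)
    state: State = {name: False for name in rules}
    progress = True
    while progress:
        progress = False
        for name, enabled in rules.items():
            if not state[name] and enabled(state):
                state[name] = True
                progress = True
    return state


def deadlocked(tupled: bool) -> bool:
    """The model is stuck if B never produces its output."""
    return not simulate(tupled)["B_done"]


if __name__ == "__main__":
    print("separate channels deadlock:", deadlocked(False))
    print("tuple channel deadlock:", deadlocked(True))
```

Under these assumptions, the only difference between the two variants is the precondition of B_ack_I, which mirrors the difference between listings (a) and (b) of Fig. 5: with the tuple channel, I_done, A_offers, and B_ack_I each wait on one another, so none of them can ever fire.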
We will come back to these blocks later.

Registers are necessary to achieve performance, but we have to decide where to insert them. Since each Simulink block has only one output, whereas it can have more than one input, it is natural to insert registers on the output in order to optimize area. Using the Haste language it is difficult to describe such an implementation, since when you get data from one or more input channels you have to store them into registers, and this results in latching the inputs.

In the present version of the TiDE flow (5.2) the compiler will put registers where the designer has inserted them in the Haste description. In the future release (6.0), the compiler can optimize the number of registers automatically given the required number of decoupling stages. For this reason we choose to use the more common way to describe modules (with input registers) and let the compiler decide where to put them.

Figure 5. A valid Simulink-like diagram (Fig. 4) can be described using separate input channels (a) or with a tupled input (b); the latter can lead to a deadlock, as we can see in the sequence diagram that describes its behavior (d), while the former works correctly (c).

(a) Separate input channels:

    B: process( i0?chan [0..255]
              & i1?chan [0..255]
              & o!chan [0..255] ).
    begin
      & ao : var [0..255]
      & io : var [0..255]
    |
      forever do
        ( i0?ao || i1?io ); o! b( ao, io )
      od
    end

(b) Tupled input:

    B: process( & i?chan [0..255]
                & o!chan [0..255] ).
    begin
      & ao : var [0..255]
      & io : var [0..255]
    |
      forever do
        i? [[ ao, io ]]; o! b( ao, io )
      od
    end

(c) Sequence diagram for (a), over the handshake events R1+, A1+, R1−, A1−, R2+, A2+, A2−, R2−.

(d) Sequence diagram for (b), with the annotations "waiting for Ab+", "Ra+ cannot be generated", "Input HS not yet completed", and "waiting for Ra+".

4.2. Sampling Blocks

There are three main blocks in Simulink that deal with a fixed sampling time: the “unit delay”, the “zero order hold”, and the “rate transition” (Fig. 7). These blocks are often used to change the input-to-output data rate of a given function, especially when the system has to deal with interfaces providing (or requiring) data at slower (or faster) data rates. Such blocks are also used when it is necessary to explicitly insert a storage element in a design (e.g. for an accumulator).

The “unit delay” block acts as a memory element, which can also oversample the input data in order to increase the output data rate. The “zero order hold” block can reduce the output data rate. Finally, the “rate transition” block is a superset of the previous one. There are also other blocks (like Buffer and Unbuffer) that are not taken into account now, since these are used less frequently than the previously cited ones.

These blocks are often used in Simulink diagrams for two main purposes:
• introducing an explicit storage element in a design (e.g. an accumulator, a decoupling register in loops, ...);
• an adaptation to different rates in multi-rated systems (e.g. high-speed ADC or DAC interfaced with lower speed circuitry or vice versa).

In the synchronous implementation, these blocks need to both sample and generate data at a given time, according to their parameters (input and output sampling time), to guarantee the same behavior as the Simulink model. In the asynchronous version we need the same behavior, and this can be achieved in two different ways:
• to introduce, in each of these blocks, a clock signal which can be used to derive the desired timing relationships;
• to move the clock interface only to the input blocks,