STM32 Journal

the option of bringing floating-point efficiency to an extensive range of low-cost embedded applications. The STM32 F4 integrates a floating-point unit (FPU) to execute these operations natively in hardware. The FPU is fully compliant with the IEEE 754 standard and has its own 32-bit single-precision registers to handle operands and results. These registers can also be viewed as double-word registers to enable more efficient load and store operations. The context of the FPU can be saved to the CPU stack using several methods, depending on the application architecture and whether the registers need to be preserved.

The FPU supports the five classes of numbers defined by the 754 standard: normalized, denormalized, zeros, infinities, and NaNs (Not-a-Number). It also supports the five exceptions of the standard (overflow, underflow, inexact, divide by zero, and invalid operation), allowing applications to handle operations such as computing the square root of a negative number (which results in NaN and an invalid operation exception). Exceptions are “untrapped”, meaning that the FPU will return the result as specified by the 754 standard and raise an exception flag. If needed, developers can also use the STM32 F4 floating-point global interrupt to address the issue.

    float function1(float number1, float number2)
    {
      float temp1, temp2;

      temp1 = number1 + number2;
      temp2 = number1/temp1;
      return temp2;
    }

    # float function1(float number1, float number2)
    # {
    #   float temp1, temp2;
    #
    #   temp1 = number1 + number2;
            VADD.F32 S1,S0,S1
    #   temp2 = number1/temp1;
            VDIV.F32 S0,S0,S1
    #
    #   return temp2;
            BX LR
    # }

Figure 1 Code size is significantly smaller when an integrated FPU is available (code on left) than when one is not (code on right).
The integrated FPU of the STM32 F4 offers a number of advantages to embedded designers:

〉〉 Access to the more useful range and precision that floating-point brings
〉〉 Reduced coding complexity by being able to work with numbers in a more natural format
〉〉 Greater throughput compared to software floating-point libraries
〉〉 Accelerated application development, as C code generated by high-level tools can be used without modification or wrappers
〉〉 Smaller code footprint, since operations that used to take multiple lines of code in software libraries are now implemented with a single instruction
〉〉 Simplified debugging, as macro calls in floating-point libraries are eliminated

Effectively, the STM32 F4’s FPU reverses the value proposition between fixed- and floating-point for many MCU-based designs.

[Figure 1 labels. FPU assembly code generation: 1 assembly instruction. Call Soft-FPU:]

    # float function1(float number1, float number2)
    # {
            PUSH {R4,LR}
            MOVS R4,R0
            MOVS R0,R1
            MOVS R1,R4
            BL __aeabi_fadd
            MOVS R1,R0
            MOVS R0,R4
            BL __aeabi_fdiv
            POP {R4,PC}

Seamless Integration

Figure 1 shows the difference between the assembly code generated when an FPU is available on an MCU as
void GenerateJulia_fpu(uint16_t size_x, uint16_t size_y, uint16_t offset_x, uint16_t offset_y, uint16_t zoom, uint8_t * buffer)
{
  float tmp1, tmp2;
  float num_real, num_img;
  float radius;
  uint8_t i;
  uint16_t x,y;

  for (y=0; y