Fused Multiply Add (FMA) units generally reduce delay in the overall circuit and also make it efficient in terms of energy in arithmetic operations. In this paper, the authors present an effective implementation of MFMA (Modified FMA) using pipelining by decreasing the delay compared to the parallel processing of the module. This MFMA uses two desperate Massif which are connected using pipelining. The final operation performed is A*B+C*D+F*G+H*I, parallel processing just implements unto A*B+C*D. The clock limiting stage for both these operations is involved in normalization stage and rounding stage. This paper is related to floating point calculations. Floating point calculations involve, a standard format for representing floating point numbers. The standard format for representing floating point numbers is IEEE 754- 2008.This floating point representation is used here. In this paper the pipelining implementation is mainly related to speed, i.e., delay of the circuit. This paper can be designed using trilogy HDL or VHDL and is simulated and synthesized in XILINX ISE 10.1 of FPGA.