Shivangi Tiwari, Nitin Meena
In this paper, we propose designs for matrix-matrix multiplication. The designs differ in hardware complexity, throughput rate, and input/output data format, so that they can match different application needs. We compare the proposed designs with an existing similar design and find that the proposed designs offer a higher throughput rate at relatively lower hardware cost. We synthesized the proposed designs and the existing design using Synopsys tools. The synthesis results show that, on average, the proposed designs consume nearly 30% less energy than the existing design and have a nearly 70% lower area-delay product. In particular, the proposed parallel-parallel input and single output (PPI-SO) structure consumes 40% less energy than the existing structure.