Type: Conference paper

Architectural Insights: Comparing Weight Stationary and Output Stationary Systolic Arrays for Efficient Computation

Journal: 2025 29th International Computer Conference, Computer Society of Iran, CSICC 2025

Year: 2024

Pages: 146-150

Kalbasi M.^a

a: University of Isfahan - Iran (IR) - Isfahan

DOI: 10.1109/IKT65497.2024.10892683

Language: English

Abstract

This paper compares two prevalent systolic array architectures: weight stationary and output stationary. Systolic arrays use interconnected processing elements (PEs) to compute in parallel, which makes them well suited to digital signal processing, image processing, and machine learning. We focus on how each architecture implements 2D matrix multiplication, a fundamental operation in neural networks. Both designs were described in Verilog HDL and simulated in the Xilinx Vivado Design Suite 2019, using a 3x1 input matrix and a 3x3 weight matrix. The simulations confirmed the functionality of both architectures: the output matrices matched the expected results. The weight stationary design minimized data movement, while the output stationary design improved throughput through effective reuse of input data. The critical path delay is approximately 8.8 ns, corresponding to a maximum frequency of about 113 MHz, and it remains constant as the number of PEs increases. Overall, the results validate both architectures for high-performance matrix operations and offer insights for future systolic array designs. © 2024 IEEE.
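As a rough illustration of the two dataflows named in the abstract, the following Python sketch models a 3x3 weight matrix multiplied by a 3x1 input using two loop orderings. It is a behavioral analogy only, not the paper's Verilog and not cycle-accurate; the function names and matrix values are hypothetical. The last helper simply converts a critical path delay into a maximum clock frequency, as in the 8.8 ns figure quoted above.

```python
# Behavioral sketch only: loop orderings that mimic the two dataflows.
# Function names and sample values are hypothetical, not taken from the paper.

def weight_stationary(W, x):
    """Each PE(i, j) keeps W[i][j] resident; inputs stream in and the
    partial sum is handed from PE to PE along row i."""
    y = []
    for i in range(len(W)):
        psum = 0  # partial sum travelling through the PEs of row i
        for j in range(len(W[0])):
            psum += W[i][j] * x[j]  # PE(i, j) multiplies by its fixed weight
        y.append(psum)
    return y


def output_stationary(W, x):
    """Each PE i keeps its own output accumulator; weights and inputs
    stream past, and x[t] is reused by every PE at step t."""
    acc = [0] * len(W)
    for t in range(len(W[0])):      # one streaming step per input element
        for i in range(len(W)):     # all PEs work on the same x[t]
            acc[i] += W[i][t] * x[t]
    return acc


def max_frequency_mhz(critical_path_ns):
    """Maximum clock frequency in MHz for a given critical path delay in ns."""
    return 1000.0 / critical_path_ns


W = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical 3x3 weight matrix
x = [1, 0, 2]                          # hypothetical 3x1 input

print(weight_stationary(W, x))
print(output_stationary(W, x))
print(round(max_frequency_mhz(8.8), 1))
```

Both functions compute the same product; they differ only in which operand stays fixed in a PE (the weight or the output accumulator), which is the distinction the comparison rests on.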

Author Keywords

Convolutional Neural Networks; Output stationary; Systolic arrays; Weight stationary

Other Keywords

Input output programs; Integrated circuit design; Matrix algebra; Convolutional neural network; Critical Paths; Digital signals; Efficient computation; Interconnected processing; Output stationary; Parallel processing; Processing elements; Signal-processing; Weight stationary; Systolic arrays