The discrete cosine transform (DCT) has been widely used in areas of speech and audio/video data compression. There are two traditional approaches to implementation of the DCT, implementations using butterfly structures or systolic arrays.Systolic array uses parallel pseudo-circular correlation structures as basic computational forms. The proposed algorithm can be mapped onto two linear systolic arrays with similar length and form that have a small number of I/O bandwidth that can be efficiently implemented into a VLSI chip. A highly efficient VLSI chip can be thus obtained that has good performances in the architectural topology, processing speed, hardware complexity and I/O costs and outperforms others especially in throughput. Here, we describe systolic array architectures for computation of the one-dimensional (1-D) Type-IV DCT. The proposed architectures employ simple PE's that require real multiplications and additions. They generate outputs sequentially with short computation time.