FPGA realization of FIR filters by efficient and flexible systolization using distributed arithmetic

PK Meher, S Chandrasekaran, A Amira

    Research output: Contribution to journal (article)

    105 Citations (Scopus)

    Abstract

    In this paper, we present the design optimization of one- and two-dimensional fully pipelined computing structures for area-delay-power-efficient implementation of finite-impulse-response (FIR) filters by systolic decomposition of distributed arithmetic (DA)-based inner-product computation. The systolic decomposition scheme is found to offer a flexible choice of the address length of the lookup tables (LUTs) for DA-based computation, so that a suitable area-time tradeoff can be chosen. It is observed that using smaller address lengths for the DA-based computing units reduces the memory size but increases the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Various key performance metrics, such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput, are estimated for different filter orders and address lengths. Analysis of the results indicates that the performance metrics of the proposed implementation are broadly in line with theoretical expectations. It is found that the choice of address length M = 4 yields the best area-delay-power-efficient realizations of the FIR filter for various filter orders. Moreover, the proposed FPGA implementation is found to involve significantly less area-delay complexity than the existing DA-based implementations of FIR filters.
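The tradeoff described in the abstract can be illustrated with a small software model. The sketch below is not the authors' systolic architecture or their Handel-C/VHDL code; it is a minimal bit-serial DA inner product in Python, assuming integer coefficients and two's-complement input samples. The names `build_luts`, `da_inner_product`, and `lut_cost` are hypothetical and chosen only for this illustration. The coefficients are split into groups of `m` (the address length M), each group gets its own LUT of 2^m precomputed partial sums, and the LUT outputs are added per bit plane and then shift-accumulated.

```python
def build_luts(coeffs, m):
    """Build one LUT per group of m coefficients.

    Entry `addr` of a LUT holds the sum of the group's coefficients
    whose corresponding bit is set in `addr`.
    """
    luts = []
    for start in range(0, len(coeffs), m):
        group = coeffs[start:start + m]
        lut = [sum(c for i, c in enumerate(group) if (addr >> i) & 1)
               for addr in range(1 << len(group))]
        luts.append(lut)
    return luts


def da_inner_product(coeffs, samples, m, bits):
    """Compute sum(coeffs[k] * samples[k]) without multiplying by samples.

    Assumes every sample fits in `bits`-bit two's complement.
    """
    luts = build_luts(coeffs, m)
    acc = 0
    for j in range(bits):                      # one pass per bit plane
        plane_sum = 0
        for g, lut in enumerate(luts):
            group = samples[g * m:(g + 1) * m]
            addr = 0
            for i, x in enumerate(group):      # gather bit j of each sample
                addr |= ((x >> j) & 1) << i
            plane_sum += lut[addr]             # adder across the groups
        # The most significant bit carries negative weight in two's complement.
        weight = -(1 << j) if j == bits - 1 else (1 << j)
        acc += weight * plane_sum
    return acc


def lut_cost(n, m):
    """Return (total LUT words, adders per bit plane) for n taps.

    Assumes m divides n, so every group uses a full 2**m-word LUT.
    """
    groups = n // m
    return groups * (1 << m), groups - 1
```

Under these assumptions, `lut_cost(16, 2)` gives 32 LUT words and 7 adders, `lut_cost(16, 4)` gives 64 words and 3 adders, and `lut_cost(16, 8)` gives 512 words and 1 adder, which mirrors the stated trend: shorter addresses shrink the memory but increase the number of adders.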
    Language: English
    Pages: 3009-3017
    Journal: IEEE Transactions on Signal Processing
    Volume: 56
    Issue number: 7, Par
    DOI: 10.1109/TSP.2007.914926
    Publication status: Published - Jul 2008

    Fingerprint

    FIR filters
    Field programmable gate arrays (FPGA)
    Decomposition
    Computer hardware description languages
    Table lookup
    Adders
    Electric power utilization
    Throughput
    Data storage equipment

    Keywords

    • distributed arithmetic
    • field-programmable gate arrays (FPGA)
    • finite-impulse-response (FIR) filter
    • linear convolution
    • systolic array

    Cite this

    Meher, PK; Chandrasekaran, S; Amira, A. / FPGA realization of FIR filters by efficient and flexible systolization using distributed arithmetic. In: IEEE Transactions on Signal Processing. 2008; Vol. 56, No. 7, Par. pp. 3009-3017.
    @article{9713cbcf1d6844b9aead6e20012b1a71,
    title = "FPGA realization of FIR filters by efficient and flexible systolization using distributed arithmetic",
    keywords = "distributed arithmetic, field-programmable gate arrays (FPGA), finite-impulse-response (FIR) filter, linear convolution, systolic array",
    author = "PK Meher and S Chandrasekaran and A Amira",
    year = "2008",
    month = "7",
    doi = "10.1109/TSP.2007.914926",
    language = "English",
    volume = "56",
    pages = "3009--3017",
    number = "7, Par",
    journal = "IEEE Transactions on Signal Processing",
    }
