Accelerating matrix product on reconfigurable hardware for image processing applications

F Bensaali, A Amira, A Bouridane

    Research output: Contribution to journal (Article)

    24 Citations (Scopus)

    Abstract

    Matrix multiplication is very important in many types of applications including image and signal processing. The suitability of reconfigurable hardware devices, in the form of field programmable gate arrays (FPGAs), is investigated as a low-cost solution for implementing two matrix multipliers for 3-D affine transformations and colour space conversion. A first solution based on processing large matrix multiplication, for large 3-D models, and for the evaluation of the Celoxica fixed-point library and Xilinx CoreGen performance has been reported. A novel architecture for efficient implementation of a colour space converter (CSC) based on distributed arithmetic (DA) principles has been presented. The two multipliers have been developed and implemented on the RC1000-PP Celoxica board-based development platform. Results show that the FPGA-based first parallel multiplier can achieve the performance of a graphics card when performing 3-D affine transformations, while the second multiplier, which is fully pipelined and platform-independent, has a low latency (8 cycles) and is capable of a sustained data rate of over 234 mega-conversions per second.
    Language: English
    Pages: 236-246
    Journal: IEE Proceedings - Circuits, Devices and Systems
    Volume: 152
    Issue number: 3
    DOI: 10.1049/ip-cds:20040838
    Publication status: Published - Jun 2005

    Fingerprint

    Reconfigurable hardware
    Field programmable gate arrays (FPGA)
    Image processing
    Color
    Signal processing
    Processing
    Costs

    Cite this

    Bensaali, F ; Amira, A ; Bouridane, A. / Accelerating matrix product on reconfigurable hardware for image processing applications. In: IEE Proceedings - Circuits, Devices and Systems. 2005 ; Vol. 152, No. 3. pp. 236-246.
    @article{75d2235a43af4df9a1102b01475b968f,
    title = "Accelerating matrix product on reconfigurable hardware for image processing applications",
    abstract = "Matrix multiplication is very important in many types of applications including image and signal processing. The suitability of reconfigurable hardware devices, in the form of field programmable gate arrays (FPGAs), is investigated as a low-cost solution for implementing two matrix multipliers for 3-D affine transformations and colour space conversion. A first solution based on processing large matrix multiplication, for large 3-D models, and for the evaluation of the Celoxica fixed-point library and Xilinx CoreGen performance has been reported. A novel architecture for efficient implementation of a colour space converter (CSC) based on distributed arithmetic (DA) principles has been presented. The two multipliers have been developed and implemented on the RC1000-PP Celoxica board-based development platform. Results show that the FPGA-based first parallel multiplier can achieve the performance of a graphics card when performing 3-D affine transformations, while the second multiplier, which is fully pipelined and platform-independent, has a low latency (8 cycles) and is capable of a sustained data rate of over 234 mega-conversions per second.",
    author = "F Bensaali and A Amira and A Bouridane",
    journal = "IEE Proceedings - Circuits, Devices and Systems",
    year = "2005",
    month = "6",
    doi = "10.1049/ip-cds:20040838",
    language = "English",
    volume = "152",
    pages = "236--246",
    number = "3",

    }

    TY - JOUR

    T1 - Accelerating matrix product on reconfigurable hardware for image processing applications

    AU - Bensaali, F

    AU - Amira, A

    AU - Bouridane, A

    PY - 2005/6

    Y1 - 2005/6

    JO - IEE Proceedings - Circuits, Devices and Systems

    N2 - Matrix multiplication is very important in many types of applications including image and signal processing. The suitability of reconfigurable hardware devices, in the form of field programmable gate arrays (FPGAs), is investigated as a low-cost solution for implementing two matrix multipliers for 3-D affine transformations and colour space conversion. A first solution based on processing large matrix multiplication, for large 3-D models, and for the evaluation of the Celoxica fixed-point library and Xilinx CoreGen performance has been reported. A novel architecture for efficient implementation of a colour space converter (CSC) based on distributed arithmetic (DA) principles has been presented. The two multipliers have been developed and implemented on the RC1000-PP Celoxica board-based development platform. Results show that the FPGA-based first parallel multiplier can achieve the performance of a graphics card when performing 3-D affine transformations, while the second multiplier, which is fully pipelined and platform-independent, has a low latency (8 cycles) and is capable of a sustained data rate of over 234 mega-conversions per second.

    U2 - 10.1049/ip-cds:20040838

    DO - 10.1049/ip-cds:20040838

    M3 - Article

    VL - 152

    SP - 236

    EP - 246

    IS - 3

    ER -