Accelerating 3D-FFT Using Hard Embedded Blocks in FPGAs

B. S. C. Varma, K. Paul, M. Balakrishnan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The three-dimensional Fast Fourier Transform (3D-FFT) is widely used in scientific applications in domains such as image processing, bioinformatics, and molecular dynamics. The 3D-FFT computation typically takes a significant part of the execution time of these applications, so speeding them up requires accelerating the 3D-FFT. The 3D-FFT can be accelerated using Field Programmable Gate Array (FPGA) based accelerators. However, a speedup is not always possible: FPGAs run at a lower clock frequency than processors, and the resources available in an FPGA device might not suffice to implement enough copies of the processing elements to compensate for the lower clock frequency. FPGAs that combine a heterogeneous mix of coarse-grained hard blocks with programmable soft logic can accommodate a much larger number of processing elements and thus achieve much higher speedups. Modern FPGAs already contain heterogeneous hard embedded blocks (HEBs) such as multipliers, DSP blocks, and memory units, and many more such hard blocks are likely to be embedded into future FPGAs. Identifying and incorporating HEBs is a complex evaluation problem, because many parameters and constraints (area, granularity, routing resources, etc.) must be considered in an integrated manner to obtain an efficient implementation. In this paper we show acceleration of the 3D-FFT using future fabrics incorporating HEBs. Using these fabrics, we show speedups of up to 1900x for a 2048-point FFT. We also present an evaluation methodology for designing future FPGA fabrics that incorporate accelerators as hard embedded blocks. This methodology will be useful for (i) selecting the blocks to be embedded into the fabric and (ii) evaluating the performance gain that such an embedding can achieve.
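The abstract does not say how the processing elements are organised. As a general illustration of why a 3D-FFT suits many replicated processing elements, the pure-Python sketch below uses the standard separability property: a 3D-FFT can be computed as three passes of independent 1D FFTs, one pass per axis. The function names (`fft1d`, `fft3d`, `dft3d_direct`) are hypothetical and chosen only for this example; this is not the authors' implementation and makes no claim about their hardware design.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft3d(cube):
    """3D-FFT of cube[i][j][k] built from independent 1D FFTs along each axis."""
    nx, ny, nz = len(cube), len(cube[0]), len(cube[0][0])
    data = [[[complex(v) for v in row] for row in plane] for plane in cube]
    # Pass 1: 1D FFTs along the innermost axis (nx * ny independent transforms).
    for i in range(nx):
        for j in range(ny):
            data[i][j] = fft1d(data[i][j])
    # Pass 2: 1D FFTs along the middle axis (nx * nz independent transforms).
    for i in range(nx):
        for k in range(nz):
            col = fft1d([data[i][j][k] for j in range(ny)])
            for j in range(ny):
                data[i][j][k] = col[j]
    # Pass 3: 1D FFTs along the outermost axis (ny * nz independent transforms).
    for j in range(ny):
        for k in range(nz):
            col = fft1d([data[i][j][k] for i in range(nx)])
            for i in range(nx):
                data[i][j][k] = col[i]
    return data


def dft3d_direct(cube):
    """Direct evaluation of the 3D DFT definition; slow, used only as a reference."""
    nx, ny, nz = len(cube), len(cube[0]), len(cube[0][0])
    out = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    for u in range(nx):
        for v in range(ny):
            for w in range(nz):
                s = 0j
                for i in range(nx):
                    for j in range(ny):
                        for k in range(nz):
                            phase = u * i / nx + v * j / ny + w * k / nz
                            s += cube[i][j][k] * cmath.exp(-2j * cmath.pi * phase)
                out[u][v][w] = s
    return out
```

Within each pass the 1D transforms do not depend on one another, so in principle each could be assigned to a separate processing element; how many fit on a device is the resource question the abstract raises.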
Language: English
Title of host publication: 2013 26th International Conference on VLSI Design and 2013 12th International Conference on Embedded Systems
Pages: 92-97
Number of pages: 6
DOIs: https://doi.org/10.1109/VLSID.2013.169
Publication status: Published - 1 Jan 2013

Fingerprint

Fast Fourier transforms
Field programmable gate arrays (FPGA)
Particle accelerators
Clocks
Bioinformatics
Processing
Molecular dynamics
Image processing
Data storage equipment

Keywords

  • clocks
  • digital signal processing chips
  • fast Fourier transforms
  • field programmable gate arrays
  • 3D-FFT
  • DSP blocks
  • FPGA
  • bioinformatics
  • clock frequency
  • coarse grained hard blocks
  • field programmable gate array
  • heterogeneous hard embedded blocks
  • image processing
  • memory units
  • molecular dynamics
  • multipliers
  • programmable soft logic
  • three dimensional fast Fourier transform
  • vis-a-vis processors
  • Acceleration
  • Bandwidth
  • Digital signal processing
  • Fabrics
  • Random access memory
  • FPGA based Acceleration
  • Hard Embedded Blocks

Cite this

Varma, B. S. C., Paul, K., & Balakrishnan, M. (2013). Accelerating 3D-FFT Using Hard Embedded Blocks in FPGAs. In 2013 26th International Conference on VLSI Design and 2013 12th International Conference on Embedded Systems (pp. 92-97) https://doi.org/10.1109/VLSID.2013.169