Abstract
Unifying the forward and inverse operations of the number theoretic transform (NTT) in a single hardware module is common practice when designing polynomial coefficient multiplier accelerators for post-quantum cryptographic algorithms. This letter shows experimentally that this design unification is not always advantageous. In this context, we present three NTT hardware architectures: 1) a forward NTT (FNTT) architecture; 2) an inverse NTT (INTT) architecture; and 3) a unified NTT (UNTT) architecture that computes both the FNTT and the INTT on a single design. We benchmark throughput/area and energy/area on Xilinx Virtex-7 field-programmable gate array (FPGA) and 28-nm application-specific integrated circuit (ASIC) platforms. On FPGA, the standalone FNTT and INTT designs exhibit, on average, 4.66× and 3.75× higher throughput/area and energy/area values, respectively, than the UNTT design. Similarly, on ASIC, the individual FNTT and INTT designs achieve, on average, 1.25× and 1.09× higher throughput/area and energy/area values, respectively, compared to the UNTT design.
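For readers unfamiliar with the transform pair, the following is a minimal software sketch of a forward NTT, an inverse NTT, and the pointwise multiplication they enable. It is only an illustration of the mathematics, not the hardware architectures evaluated in the letter: the modulus `Q = 17`, length `N = 8`, root `OMEGA = 2`, and the function names `fntt`, `intt`, and `cyclic_mul` are all assumptions chosen for a toy example, and the transforms are evaluated directly in O(N^2) rather than with butterflies. It requires Python 3.8 or later for modular inverses via `pow`.

```python
# Toy parameters (assumptions for illustration, not taken from the letter).
Q = 17      # prime modulus
N = 8       # transform length; N divides Q - 1
OMEGA = 2   # primitive N-th root of unity mod Q (2**8 % 17 == 1, 2**4 % 17 == 16)


def fntt(a, omega=OMEGA, q=Q):
    """Forward NTT by direct evaluation: A[k] = sum_j a[j] * omega^(j*k) mod q."""
    n = len(a)
    return [sum(a[j] * pow(omega, j * k, q) for j in range(n)) % q
            for k in range(n)]


def intt(A, omega=OMEGA, q=Q):
    """Inverse NTT: the same sum with omega^-1, followed by scaling with n^-1 mod q."""
    n = len(A)
    omega_inv = pow(omega, -1, q)  # modular inverse of the root (Python 3.8+)
    n_inv = pow(n, -1, q)          # modular inverse of the length
    return [n_inv * sum(A[k] * pow(omega_inv, j * k, q) for k in range(n)) % q
            for j in range(n)]


def cyclic_mul(a, b, q=Q):
    """Multiply two polynomials modulo (x^N - 1, q) via pointwise products in the NTT domain."""
    return intt([x * y % q for x, y in zip(fntt(a), fntt(b))])
```

In this sketch the inverse differs from the forward transform only in the root it uses and in the final scaling step, which is the kind of structural similarity that makes a unified module attractive in the first place. Lattice-based schemes typically work modulo x^N + 1 rather than x^N - 1, which additionally needs a 2N-th root of unity; that variant is omitted here for brevity.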
Original language | English |
---|---|
Pages (from-to) | 485-488 |
Number of pages | 4 |
Journal | IEEE Embedded Systems Letters |
Volume | 16 |
Issue number | 4 |
Early online date | 6 Jun 2024 |
Publication status | Published (in print/issue) - 31 Dec 2024 |
Bibliographical note
Publisher Copyright: © 2009-2012 IEEE.
Keywords
- Post-quantum cryptography
- number theoretic transform
- polynomial multiplication
- FPGA
- ASIC
- Computer architecture
- Routing
- Hardware
- Polynomials
- Registers
- Field programmable gate arrays
- Clocks
- Application-specific integrated circuit (ASIC)
- field-programmable gate array (FPGA)
- number theoretic transform (NTT)