Please use this identifier to cite or link to this item:
https://doi.org/10.1109/LSSC.2020.3024838
DC Field | Value | |
---|---|---|
dc.title | Broad-Purpose In-Memory Computing for Signal Monitoring and Machine Learning Workloads | |
dc.contributor.author | SAURABH JAIN | |
dc.contributor.author | LIN LONGYANG | |
dc.contributor.author | ALIOTO,MASSIMO BRUNO | |
dc.date.accessioned | 2021-04-12T07:38:42Z | |
dc.date.available | 2021-04-12T07:38:42Z | |
dc.date.issued | 2020-09-01 | |
dc.identifier.citation | SAURABH JAIN, LIN LONGYANG, ALIOTO,MASSIMO BRUNO (2020-09-01). Broad-Purpose In-Memory Computing for Signal Monitoring and Machine Learning Workloads. IEEE Solid-State Circuits Letters 3 : 394-397. ScholarBank@NUS Repository. https://doi.org/10.1109/LSSC.2020.3024838 | |
dc.identifier.issn | 2573-9603 | |
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/189165 | |
dc.description.abstract | In this paper, a broad-purpose compute-in-memory solution (±CIM) able to handle arbitrary sign in both inputs/features and weights/coefficients is introduced. The ability to operate on arbitrary sign and under variable precision on both operands enables a wide range of applications, ranging from conventional neural networks to digital signal processing and monitoring. The ±CIM pipelined architecture, the reconfigurable row encoder and the adoption of a commercial 2-port bitcell allow uninterrupted memory availability for conventional read/write, even when performing in-memory computations. A 40nm testchip shows the ability of the ±CIM architecture to perform both neural network computations and classical signal processing. At 6-bit precision, the measured worst-case mismatch (noise) is 0.38 (0.62) LSB. The achieved accuracy when executing a LeNet-5 neural net workload is 98.3%, which is within 1.3% of state-of-the-art software implementations. As an example of a signal processing workload, 91.7% accuracy is achieved in voice activity detection, which is within 2.8% of a software implementation. Overall, an energy efficiency (throughput) of 41 TOPS/W (122 GOPS) is achieved at 38% area overhead over a conventional SRAM with the same 4-KB capacity. | |
dc.publisher | IEEE | |
dc.rights | CC0 1.0 Universal | |
dc.rights.uri | http://creativecommons.org/publicdomain/zero/1.0/ | |
dc.type | Article | |
dc.contributor.department | DEPT OF ELECTRICAL & COMPUTER ENGG | |
dc.description.doi | 10.1109/LSSC.2020.3024838 | |
dc.description.sourcetitle | IEEE Solid-State Circuits Letters | |
dc.description.volume | 3 | |
dc.description.page | 394-397 | |
dc.published.state | Published | |
Appears in Collections: | Elements Staff Publications |
Files in This Item:
File | Description | Size | Format | Access Settings | Version |
---|---|---|---|---|---|
Broad-Purpose In-Memory Computing for Signal Monitoring and Machine Learning Workloads.pdf | | 595.27 kB | Adobe PDF | OPEN | Post-print |
This item is licensed under a Creative Commons License