Citation: Rui Shan, Lin Jiang, Junyong Deng, Xueting Li, Xubang Shen. Design and Implementation of Memory Access Fast Switching Structure in Cluster-Based Reconfigurable Array Processor[J]. Journal of Beijing Institute of Technology, 2017, 26(4): 494-504. doi: 10.15918/j.jbit1004-0579.201726.0409
[1] Shi C, Yang J, Han Y, et al. A 1000 fps vision chip based on a dynamically reconfigurable hybrid architecture comprising a PE array processor and self-organizing map neural network[J]. IEEE Journal of Solid-State Circuits, 2014, 49(9):2067-2082.
[2] Chen Yang, Leibo Liu, Yansheng Wang, et al. Configuration approaches to enhance computing efficiency of coarse-grained reconfigurable array[J]. Journal of Circuits, Systems and Computers, 2015, 24(3):426-429.
[3] Patel K, Bleakley C J. Coarse grained reconfigurable array based architecture for low power real-time seizure detection[J]. Journal of Signal Processing Systems, 2016, 82(1):55-68.
[4] Tang C, Liu D, Xing Z, et al. Memory access analysis of many-core system with abundant bandwidth[C]//IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, Turin, Italy, 2015.
[5] Chen Y, Liu L B, Yin S Y, et al. Efficient and flexible memory architecture to alleviate data and context bandwidth bottlenecks of coarse-grained reconfigurable arrays[J]. Science China Physics, Mechanics & Astronomy, 2014, 57(12):2214-2227.
[6] Liu Y, Zhang W. Scratchpad memory architectures and allocation algorithms for hard real-time multicore processors[J]. Journal of Computing Science & Engineering, 2015, 9(2):51-72.
[7] Chakraborty P, Panda P R, Sen S. Partitioning and data mapping in reconfigurable cache and scratchpad memory-based architectures[J]. ACM Transactions on Design Automation of Electronic Systems, 2016, 22(1):1-25.
[8] Nouri S, Hussain W, Nurmi J. Implementation of IEEE-802.11a/g receiver blocks on a coarse-grained reconfigurable array[C]//Design and Architectures for Signal and Image Processing, Cracow, Poland, 2015.
[9] Majzoub S, Diab H. MorphoSys reconfigurable hardware for cryptography: the Twofish case[J]. The Journal of Supercomputing, 2012, 59(1):22-41.
[10] Bell S, Edwards B, Amann J, et al. TILE64-processor: a 64-core SoC with mesh interconnect[C]//IEEE International Solid-State Circuits Conference, San Francisco, USA, 2008.
[11] Li T, Xiao L, Huang H, et al. PAAG: a polymorphic array architecture for graphics and image processing[C]//International Symposium on Parallel Architectures, Algorithms and Programming, Taipei, Taiwan, China, 2012.
[12] Wang K, Gu H, Yang Y, et al. Optical interconnection network for parallel access to multi-rank memory in future computing systems[J]. Optics Express, 2015, 23(16):20480-20494.
[13] Wang Y, Gu H, Wang K, et al. Low-power low-latency optical network architecture for memory access communication[J]. IEEE/OSA Journal of Optical Communications and Networking, 2016, 8(10):757-764.
[14] Li B M, Leong P H. Serial and parallel FPGA-based variable block size motion estimation processors[J]. Journal of Signal Processing Systems, 2008, 51(1):77-98.
[15] Medhat A, Shalaby A, Sayed M S, et al. A highly parallel SAD architecture for motion estimation in HEVC encoder[C]//Circuits and Systems, Okinawa, Japan, 2014.
[16] Hetul Sanghvi. 2D cache architecture for motion compensation in a 4K Ultra-HD AVC and HEVC video codec system[C]//2014 IEEE International Conference on Consumer Electronics, Las Vegas, USA, 2014.
[17] Liu L B, Wang Y S, Yin S Y, et al. Row-based configuration mechanism for a 2-D processing element array in coarse-grained reconfigurable architecture[J]. Science China Information Sciences, 2014, 57(10):1-18.
[18] Ruiz G A, Michell J A. An efficient VLSI architecture of fractional motion estimation in H.264 for HDTV[J]. Journal of Signal Processing Systems, 2011, 62(3):443-457.
[19] Hu Z, Cuvillo J D, Zhu W, et al. Optimization of dense matrix multiplication on IBM Cyclops-64: challenges and experiences[C]//Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, 2006.
[20] Zhang Y P, Jeong T, Chen F, et al. A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture[C]//Parallel and Distributed Processing Symposium, IPDPS 2006, Rhodes Island, Greece, 2006.
[21] Loi I, Benini L. An efficient distributed memory interface for many-core platform with 3D stacked DRAM[C]//IEEE Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 2010.