Welcome to Journal of Beijing Institute of Technology
Volume 25, Issue 3
LI Tie-jun, ZHANG Jian-min, MA Ke-fan, XIAO Li-quan, LI Si-kun. Virtual and physical address translation mechanism of interconnect network[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2016, 25(3): 365-374. doi: 10.15918/j.jbit1004-0579.201625.0309

Virtual and physical address translation mechanism of interconnect network

doi:10.15918/j.jbit1004-0579.201625.0309
  • Received Date: 2014-12-25
  • Most users are accustomed to using virtual addresses in parallel programs that run on scalable high-performance parallel computing systems. A virtual-to-physical address translation mechanism is therefore necessary to bridge the hardware interface and the software application. In this paper, a new virtual-to-physical address translation mechanism is proposed, which includes an address validity checker, an address translation cache (ATC), a complete refresh scheme and a number of reliability designs. The ATC employs a large-capacity embedded dynamic random access memory (eDRAM) to meet the requirement for a high hit ratio. It can also switch between cache mode and buffer mode to avoid the high latency of accessing external main memory. Many tests have been conducted on the real chip that implements the address translation mechanism. The results show that the ATC achieves a high hit ratio while running well-known benchmarks, and demonstrate that the new high-performance mechanism is well designed.
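The translation path outlined in the abstract (validity check first, then an ATC lookup, with a fall-back to main memory on a miss) can be pictured with a small software model. The sketch below is illustrative only and is not the authors' hardware design: the class name, the 4 KiB page size, the LRU replacement policy and the dictionary used as a page table are all assumptions made for this example, and the eDRAM storage, the cache/buffer mode switch and the refresh scheme are not modeled.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages; the abstract does not state a page size


class AddressTranslationCache:
    """Toy model: address validity check, then an LRU cache in front of a page table."""

    def __init__(self, page_table, valid_ranges, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.page_table = page_table      # virtual page number -> physical page number
        self.valid_ranges = valid_ranges  # list of (start, end) virtual addresses, end exclusive
        self.capacity = capacity
        self.entries = OrderedDict()      # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def is_valid(self, vaddr):
        # Validity checker: accept only addresses inside a registered range.
        return any(start <= vaddr < end for start, end in self.valid_ranges)

    def translate(self, vaddr):
        if not self.is_valid(vaddr):
            raise ValueError(f"invalid virtual address {vaddr:#x}")
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        ppn = self.entries.get(vpn)
        if ppn is not None:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            ppn = self.page_table[vpn]     # slow path: stands in for a main-memory access
            self.entries[vpn] = ppn
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
        return (ppn << PAGE_SHIFT) | offset

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Usage: two mapped pages, one cache entry, so the third and fourth lookups miss.
atc = AddressTranslationCache({0: 5, 1: 9}, [(0x0000, 0x2000)], capacity=1)
results = [atc.translate(a) for a in (0x0010, 0x0020, 0x1004, 0x0010)]
```

In this model a larger `capacity` raises the hit ratio for a given access pattern, which is the same trade-off the abstract gives as the reason for a large-capacity ATC.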
