Item metadata

dc.contributor.author: Kristien, Martin
dc.contributor.author: Spink, Tom
dc.contributor.author: Campbell, Brian
dc.contributor.author: Sarkar, Susmit
dc.contributor.author: Stark, Ian
dc.contributor.author: Franke, Björn
dc.contributor.author: Böhm, Igor
dc.contributor.author: Topham, Nigel
dc.date.accessioned: 2020-10-27T16:58:12Z
dc.date.available: 2020-10-27T16:58:12Z
dc.date.issued: 2020-10-02
dc.identifier: 270898567
dc.identifier: 112b05cb-3b3f-4414-8ca3-7fdc6602340b
dc.identifier: 85096033690
dc.identifier: 000587712700033
dc.identifier.citation: Kristien, M, Spink, T, Campbell, B, Sarkar, S, Stark, I, Franke, B, Böhm, I & Topham, N 2020, Fast and correct load-link/store-conditional instruction handling in DBT systems. in CASES '20: Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems. vol. Early Access, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, IEEE Computer Society, International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '20), 20/09/20. https://doi.org/10.1109/TCAD.2020.3013048 [en]
dc.identifier.citation: conference [en]
dc.identifier.issn: 0278-0070
dc.identifier.other: ORCID: /0000-0002-7662-3146/work/103138170
dc.identifier.other: ORCID: /0000-0002-4259-9213/work/125727570
dc.identifier.uri: https://hdl.handle.net/10023/20838
dc.description.abstract: Dynamic Binary Translation (DBT) requires the implementation of load-link/store-conditional (LL/SC) primitives for guest systems that rely on this form of synchronization. When targeting, e.g., x86 host systems, LL/SC guest instructions are typically emulated using atomic Compare-and-Swap (CAS) instructions on the host. Whilst this direct mapping is efficient, it is problematic due to subtle differences between LL/SC and CAS semantics. In this paper, we demonstrate that this is a real problem, and we provide code examples that fail to execute correctly on QEMU and a commercial DBT system, which both use the CAS approach to LL/SC emulation. We then develop two novel and provably correct LL/SC emulation schemes: (1) a purely software-based scheme, which uses the DBT system’s page translation cache to select correctly between fast but unsynchronized and slow but fully synchronized memory accesses, and (2) a hardware-accelerated scheme that leverages hardware transactional memory (HTM) provided by the host. We have implemented these two schemes in the Synopsys DesignWare® ARC® nSIM DBT system, and we evaluate our implementations against full applications and targeted micro-benchmarks. We demonstrate that our novel schemes are not only correct, but also deliver performance on par with or better than the widely used but broken CAS scheme.
dc.format.extent: 11
dc.format.extent: 806485
dc.language.iso: eng
dc.publisher: IEEE Computer Society
dc.relation.ispartof: CASES '20: Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems [en]
dc.relation.ispartofseries: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: T-NDAS [en]
dc.subject: MCP [en]
dc.subject.lcc: QA75 [en]
dc.title: Fast and correct load-link/store-conditional instruction handling in DBT systems [en]
dc.type: Conference item [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.identifier.doi: 10.1109/TCAD.2020.3013048
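
The abstract refers to "subtle differences between LL/SC and CAS semantics" without spelling them out. One well-known difference is the ABA problem: a CAS compares values only, so it cannot tell that a location was overwritten and then restored between the load-link and the store-conditional. The following is a minimal single-threaded Python sketch of that difference, not the paper's code. The class names, the `aba_scenario` helper, and the interleaving are all hypothetical, and the real failing examples in the paper may differ.

```python
class CasEmulatedMemory:
    """Hypothetical model: LL/SC emulated with a value-comparing CAS."""

    def __init__(self, value):
        self.value = value
        self.linked_value = None

    def load_link(self):
        # Remember the value seen at link time; this is all a CAS can check.
        self.linked_value = self.value
        return self.value

    def store(self, new_value):
        # An ordinary store by another thread; the emulation does not track it.
        self.value = new_value

    def store_conditional(self, new_value):
        # CAS semantics: succeed whenever the current value equals the linked one.
        if self.value == self.linked_value:
            self.value = new_value
            return True
        return False


class TrueLlScMemory:
    """Hypothetical model: LL/SC where any intervening write breaks the link."""

    def __init__(self, value):
        self.value = value
        self.link_valid = False

    def load_link(self):
        self.link_valid = True
        return self.value

    def store(self, new_value):
        # Any write to the linked location invalidates the link.
        self.value = new_value
        self.link_valid = False

    def store_conditional(self, new_value):
        if self.link_valid:
            self.value = new_value
            self.link_valid = False
            return True
        return False


def aba_scenario(mem):
    """Simulate: thread 1 links, thread 2 writes B then A, thread 1 stores."""
    old = mem.load_link()                   # thread 1 reads A and links
    mem.store(old + 1)                      # thread 2 writes B
    mem.store(old)                          # thread 2 restores A
    return mem.store_conditional(old + 10)  # thread 1 attempts its SC


if __name__ == "__main__":
    print("CAS emulation SC succeeded:", aba_scenario(CasEmulatedMemory(5)))
    print("True LL/SC SC succeeded:   ", aba_scenario(TrueLlScMemory(5)))
```

In this sketch the CAS-based store-conditional succeeds even though the location was written twice in between, while the link-tracking model rejects it. This is the kind of mismatch a value-comparing emulation cannot detect.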

