Item metadata

dc.contributor.author: Li, Lideng
dc.contributor.author: Yu, Teng
dc.contributor.author: Zhao, Wenlai
dc.contributor.author: Fu, Haohuan
dc.contributor.author: Wang, Chenyu
dc.contributor.author: Tan, Li
dc.contributor.author: Yang, Guangwen
dc.contributor.author: Thomson, John
dc.date.accessioned: 2018-11-13T13:30:05Z
dc.date.available: 2018-11-13T13:30:05Z
dc.date.issued: 2018-11-11
dc.identifier: 255501866
dc.identifier: dd552c75-e08e-471f-9367-a8f909aa25fb
dc.identifier: 85064131730
dc.identifier: 000494258800013
dc.identifier.citation: Li, L, Yu, T, Zhao, W, Fu, H, Wang, C, Tan, L, Yang, G & Thomson, J 2018, Large-scale hierarchical k-means for heterogeneous many-core supercomputers. in Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '18). IEEE Press, Piscataway, The International Conference for High Performance Computing, Networking, Storage, and Analysis, Dallas, Texas, United States, 11/11/18. https://doi.org/10.5555/3291656.3291674 [en]
dc.identifier.citation: conference [en]
dc.identifier.isbn: 9781538683842
dc.identifier.uri: https://hdl.handle.net/10023/16441
dc.description: Funding: J.Thomson and T.Yu are supported by the EPSRC grants “Discovery” EP/P020631/1, “ABC: Adaptive Brokerage for the Cloud” EP/R010528/1, and EU Horizon 2020 grant Team-Play: “Time, Energy and security Analysis for Multi/Many-core heterogenous PLAtforms” (ICT-779882, https://teamplay-h2020.eu) [en]
dc.description.abstract: This paper presents a novel design and implementation of k-means clustering algorithm targeting the Sunway TaihuLight supercomputer. We introduce a multi-level parallel partition approach that not only partitions by dataflow and centroid, but also by dimension. Our multi-level (nkd) approach unlocks the potential of the hierarchical parallelism in the SW26010 heterogeneous many-core processor and the system architecture of the supercomputer. Our design is able to process large-scale clustering problems with up to 196,608 dimensions and over 160,000 targeting centroids, while maintaining high performance and high scalability, significantly improving the capability of k-means over previous approaches. The evaluation shows our implementation achieves performance of less than 18 seconds per iteration for a large-scale clustering case with 196,608 data dimensions and 2,000 centroids by applying 4,096 nodes (1,064,496 cores) in parallel, making k-means a more feasible solution for complex scenarios.
dc.format.extent: 11
dc.format.extent: 926094
dc.language.iso: eng
dc.publisher: IEEE Press
dc.relation.ispartof: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '18) [en]
dc.subject: Supercomputer [en]
dc.subject: Multi/many-core Processors [en]
dc.subject: Clustering [en]
dc.subject: Parallel computing [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: NDAS [en]
dc.subject: BDC [en]
dc.subject.lcc: QA75 [en]
dc.title: Large-scale hierarchical k-means for heterogeneous many-core supercomputers [en]
dc.type: Conference item [en]
dc.contributor.sponsor: EPSRC [en]
dc.contributor.sponsor: EPSRC [en]
dc.contributor.sponsor: European Commission [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.identifier.doi: 10.5555/3291656.3291674
dc.date.embargoedUntil: 2018-11-11
dc.identifier.url: https://dl.acm.org/citation.cfm?id=3291674 [en]
dc.identifier.grantnumber: EP/P020631/1 [en]
dc.identifier.grantnumber: EP/R010528/1 [en]
dc.identifier.grantnumber: 779882 [en]

