Secure Data Analysis in collaboration with EPCC

An important part of The Data Lab’s analytic offering is access to our own supercomputer, Ultra. This resource is hosted by the Edinburgh Parallel Computing Centre (EPCC), a world-renowned centre of excellence in high performance computing (HPC). With this facility we can upload and store client data in a segregated, secure environment, and the platform's considerable capacity lets us carry out a wide range of data analyses efficiently.

Ultra is an SGI UV2000 supercomputer with 64 processor sockets, each holding an 8-core Intel Xeon E5-4620 v2 processor (20M Cache, 2.60 GHz). The system uses NUMAlink, SGI's proprietary interconnect, to run a single instance of SUSE Linux Enterprise Server 11.3 across a total of 512 cores and 8 TB of memory. Eight of the cores are set aside as a login/front-end partition. The remaining 504 cores are available to the PBS Pro batch job scheduler for executing HPC jobs.
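
The core counts above follow from simple arithmetic, which the short Python sketch below reproduces. The figure of 8 cores per socket is the published core count of the Xeon E5-4620 v2, and the memory-per-core value is only an illustrative average that assumes 1 TB = 1024 GB; on a shared-memory machine like this one, memory is not actually partitioned per core.

```python
# Figures taken from the system description; CORES_PER_SOCKET is the
# published core count of the Intel Xeon E5-4620 v2.
SOCKETS = 64
CORES_PER_SOCKET = 8
LOGIN_CORES = 8                # cores reserved for the login/front-end partition
TOTAL_MEMORY_GB = 8 * 1024     # 8 TB, assuming 1 TB = 1024 GB

total_cores = SOCKETS * CORES_PER_SOCKET
batch_cores = total_cores - LOGIN_CORES

# Illustrative average only: memory is shared, not divided per core.
memory_per_core_gb = TOTAL_MEMORY_GB / total_cores

print(f"Total cores:          {total_cores}")
print(f"Cores for batch jobs: {batch_cores}")
print(f"Average memory/core:  {memory_per_core_gb:.0f} GB")
```

This gives 512 cores in total, 504 of them available to the batch scheduler, and an average of 16 GB of memory per core.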

Each project requiring Ultra will have a nominated Data Scientist from The Data Lab who is responsible for the project account. This “principal investigator” can create new project workspaces on Ultra through EPCC, and can then help set up access for client-side users and any additional Data Lab staff as required. A standard SSH client such as PuTTY is used to log in to the system securely, and once a connection is established it is possible to start a windowed graphical environment using VNC Server. Ultra has analytical software such as R and Python pre-installed, as well as the usual suite of Linux tools and utilities.
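
As a rough illustration of the login step, the Python sketch below assembles an OpenSSH command line that forwards a local port to a VNC display on the remote machine. Everything specific here is an assumption made for illustration: the host name `ultra.example.org`, the user name `alice`, the display number, and the use of SSH port forwarding itself, which is a common way to reach a VNC session but is not confirmed as the procedure used on Ultra. The only fixed convention relied on is that VNC display N listens on TCP port 5900 + N. The command is printed, not executed.

```python
import shlex

VNC_BASE_PORT = 5900  # VNC convention: display N listens on port 5900 + N


def vnc_port(display: int) -> int:
    """Return the TCP port used by the given VNC display number."""
    if display < 0:
        raise ValueError("display number must be non-negative")
    return VNC_BASE_PORT + display


def ssh_tunnel_command(user: str, host: str, display: int) -> list[str]:
    """Build an ssh command that forwards the VNC port to localhost."""
    port = vnc_port(display)
    return ["ssh", "-L", f"{port}:localhost:{port}", f"{user}@{host}"]


# Hypothetical account, host and display, for illustration only.
command = ssh_tunnel_command("alice", "ultra.example.org", 1)
print(shlex.join(command))
```

With a tunnel like this in place, a VNC viewer on the local machine would connect to `localhost` on the forwarded port, so the graphical session travels inside the encrypted SSH connection. The actual host name, account details and display number would come from the project's principal investigator.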

