HPC_User_Best_Practices

fgeorgatos edited this page Aug 31, 2012 · 1 revision

TOP10 best practices for scalable HPC systems (For Uni.Lu and beyond)

  • Be a good HPC citizen: respect the current Acceptable Use Policy (AUP) and report any issues you identify via the ticketing system, as needed
  • Reuse the existing, tested mechanisms for submitting jobs to the system queues; read the FAQ thoroughly
  • Read about and apply standard HPC techniques and practices (at least check the content index!), e.g. the NCSA CI tutorials: http://www.citutor.org/login.php
  • Reuse existing optimized libraries and applications where possible (MPI, compilers, libraries, modules)
  • Ensure that disk sizing, backup and redundancy levels fit your application; declare a "project" if your needs are special
  • Make your scripts generic (respect the Project Directory Structure); use variables as aliases for paths rather than hardcoding full path names
  • Take advantage of modules to manage multiple versions of software
  • Take advantage of EasyBuild to organize software built from many sources, whether your own or third-party
  • Identify the policy class your tasks belong to and make the most efficient use of your allocation; avoid underutilization, since it harms other users
  • Consider sysadmin time planning: all incoming issues have to be prioritized according to their impact on the user community
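
The advice on generic scripts can be illustrated with a minimal Python sketch. It assumes a hypothetical environment variable `PROJECT_HOME` and hypothetical sub-directory names (`input`, `output`, `scripts`); none of these come from the actual Project Directory Structure, so adapt them to the layout used on your system.

```python
import os
from pathlib import Path


def project_paths(env=None):
    """Derive every project path from a single root, with no hardcoded full paths.

    PROJECT_HOME and the sub-directory names are hypothetical placeholders.
    """
    env = os.environ if env is None else env
    root = env.get("PROJECT_HOME")
    # Assumed fallback: $HOME/project when PROJECT_HOME is not set.
    root = Path(root) if root else Path.home() / "project"
    return {
        "root": root,
        "input": root / "input",
        "output": root / "output",
        "scripts": root / "scripts",
    }


if __name__ == "__main__":
    for name, path in project_paths().items():
        print(f"{name}: {path}")
```

Moving the project to another filesystem or user account then only requires changing the one variable, not editing every script.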

Hints & Tips:

  • Keep the sources and scripts you develop under version control (ref: GitHub/GForge); e.g. do you have a history of all of last month's revisions?
  • Keep a standard example (e.g. "Hello World") ready in case you need to do differential debugging on a suspected system problem
  • Opt for a scripting language for code integration, but a faster, optimized one for the "application kernel" (both maintainable and fast!)
  • Use some form of checkpointing if your individual jobs run for more than 1 day; the advantages are many; see the FAQ on http://hpc.uni.lu
  • Avoid looking for hacks to overcome existing policies; rather, document your need and the rationale behind it, and propose it as a "project"
  • Take advantage of GPU technology if it applies to your case; be careful with the GPU vs. CPU-core speedup ratios (i.e. is it worth the trouble to employ the GPUs?)
  • If you have a massive workflow of jobs to manage, do not reinvent the wheel: contact the sysadmins to ask for advice on your approach and to collect ideas
  • Report any plans to use the HPC systems in a special way as early as possible; it helps both sides prepare and avoids frustration
  • If you have deadlines to meet, kindly notify the sysadmins about them; the sysadmin service is always best-effort, but we do try to keep our users happy
  • If you find techniques that you consider elegant and relevant to other users' work, you are automatically invited to report them to the common mailing list hpc-users!
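
The checkpointing tip can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism described in the FAQ: the state is a small dictionary stored as JSON, the "work" is a toy summation, and the function names (`load_checkpoint`, `save_checkpoint`, `run`) and the `stop_after` parameter are hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename it into place, so that an
    interrupted write never leaves a truncated checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def run(path, n_steps, every=100, stop_after=None):
    """Sum the integers 0..n_steps-1, checkpointing every `every` steps.

    `stop_after` only simulates a job being killed early (hypothetical).
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated interruption: nothing saved here
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Resubmitting the same job after an interruption calls `run` again with the same checkpoint path; the work resumes from the last saved step rather than from zero, so at most `every` steps are repeated.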