Data Access and Storage Best Practices

Accessing user storage on a High-Performance Computing (HPC) cluster requires following specific rules and best practices to keep data management efficient and secure. Data shared between lab projects should live in designated storage areas, such as scratch storage and project storage, rather than in a user's home storage. Here are some key guidelines:

  1. Scratch Storage: Scratch storage is typically designed for temporary, high-performance data storage. It is suitable for intermediate data, job-specific input/output, and short-term data sharing among users working on the same project. Data in scratch storage may be purged periodically to free up space, so it is not suitable for long-term data retention.

  2. Project Storage: Project storage is a shared storage area specifically allocated for collaborative work within a research project or lab group. It offers a more stable and organized environment for sharing data among project members. Access controls can be applied to ensure that only authorized users have access to the project storage.

  3. User Home Storage: User home storage is intended for personal configuration files, scripts, and small data files that a user's account needs. It is not designed for collaborative data sharing between lab projects. Using home storage for collaboration can run into storage limits, use resources inefficiently, and create security risks.
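
Because scratch data may be purged periodically, it can help to check which files are old enough to be at risk. The following is a minimal Python sketch, not a site-provided tool: the function name find_purge_candidates, the 30-day default, and the use of last-modification time are all assumptions for illustration. Actual purge criteria and retention windows vary by cluster, so consult local policy.

```python
import os
import time
from pathlib import Path


def find_purge_candidates(root, max_age_days=30, now=None):
    """Return files under `root` last modified more than `max_age_days` ago.

    Hypothetical helper: the age threshold and the use of modification
    time are assumptions; a real purge policy may use other criteria.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat so that symlinks are judged by their own timestamp
                mtime = path.lstat().st_mtime
            except OSError:
                # File vanished or is unreadable; skip it
                continue
            if mtime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling find_purge_candidates on a scratch directory returns a sorted list of paths, which can then be reviewed and copied to project storage if the data still matters.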

By following these rules and best practices, users and research groups can manage and share data effectively on HPC clusters: data is stored in the right place, is accessible to authorized users, and is protected from loss or unauthorized access. This approach supports efficient collaboration and reduces the risk of data-related problems on the cluster.
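
As an illustration of the access controls mentioned for project storage, the sketch below inspects the POSIX permission bits of a shared directory. It assumes a conventional Unix group-based layout (group read and execute, setgid bit set, no access for other users); that layout and the function name check_project_dir are assumptions, and many sites use ACLs or their own tooling instead.

```python
import os
import stat


def check_project_dir(path):
    """Return a list of permission problems found for a shared directory.

    Hypothetical helper: the expected layout (group-accessible, setgid,
    closed to others) is an assumed convention, not a cluster requirement.
    """
    mode = os.stat(path).st_mode
    if not stat.S_ISDIR(mode):
        return [f"{path} is not a directory"]
    problems = []
    # Group members need read and execute to list and enter the directory
    if not (mode & stat.S_IRGRP) or not (mode & stat.S_IXGRP):
        problems.append("group members cannot read or enter the directory")
    # With setgid on a directory, new files inherit the directory's group
    if not (mode & stat.S_ISGID):
        problems.append("setgid bit is not set")
    # Any permission bit for 'other' exposes data outside the group
    if mode & stat.S_IRWXO:
        problems.append("directory is accessible to users outside the group")
    return problems
```

An empty list means the directory matches the assumed layout. Any reported problem is a prompt to review the permissions with the cluster's administrators, not proof of a misconfiguration.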