Indiana University
University Information Technology Services
  

Overview of data storage on TeraGrid resources

TeraGrid users can store their data in their home directories on individual compute resources, in temporary scratch space or parallel file systems, and in archival mass storage.

Choose where to store your data depending on your needs (e.g., speed, visibility, quotas, backup and purge policies):

  • Home directories: These have relatively small storage quotas, but the storage is permanent and backed up regularly. Home directories are visible to all nodes in the cluster, including the login nodes. As a best practice, move large data sets to mass storage as soon as possible to conserve space on individual compute resources.

  • Scratch space: For temporary storage of very large data sets, use scratch space. Scratch has more space than home directories, but data is purged regularly and not backed up. The amount of storage space available at any time depends on the level of concurrent use by others. Scratch space is visible to all nodes in a cluster, including the login nodes.

  • Parallel file systems: These provide fast access to large sets of data, but data is purged regularly and not backed up. The amount of storage space available at any time depends on the level of concurrent use by others. This space is visible to all nodes in a cluster, including the login nodes.

  • Archival (mass) storage: For long-term storage of large data sets, use archival storage. Access times are normally slower than for other storage options, but a GridFTP front end can increase transfer speeds. This space is accessible from all sites, but backups are the responsibility of the user.

    Long-term, archival storage on the TeraGrid is available on:

    • High Performance Storage System (HPSS) at San Diego Supercomputer Center (SDSC); uses HSI, an FTP-like interface, to access data. See SDSC's HPSS User Guide.
    • HPSS at Indiana University; uses GridFTP. See IU's Massive Data Storage System Service page.
    • Golem at Pittsburgh Supercomputing Center (PSC); files migrated to Golem initially reside on disk, then file size and time of last access determine when files get moved to tape. See PSC's Golem page.
    • DiskXtender Mass Storage System (MSS) at the National Center for Supercomputing Applications (NCSA); see NCSA's DiskXtender User Guide (in PDF format).
    • Ranch at Texas Advanced Computing Center (TACC); see the Ranch user guide.
    • HPSS for Frost (NCAR) users; features a maximum file size of 1 TB, an initial per-user quota of 5 TB, the ability to choose one or two copies for a file at creation time, and a POSIX-compliant interface. See HPSS on Frost in the NCAR Frost user documentation. To request an account, email help@teragrid.org.

Note: To determine the amount of available space on a scratch or parallel file system, use the df command. To see the data storage policies for a specific site, use tg-policy -data.
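If you want to check available space from inside a script (for example, before writing a large output file to scratch), the following is a minimal sketch of a df-like check in Python. It relies only on the standard library's shutil.disk_usage; the function names and the default path are illustrative assumptions and are not part of any TeraGrid tool.

```python
import shutil
import sys


def free_space_report(path):
    """Return total, used, and free bytes for the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return {"total": usage.total, "used": usage.used, "free": usage.free}


def format_gib(num_bytes):
    """Format a byte count as gibibytes with one decimal place."""
    return f"{num_bytes / 2**30:.1f} GiB"


if __name__ == "__main__":
    # Hypothetical usage: pass your scratch directory as the first argument.
    # With no argument, the current directory is checked instead.
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    report = free_space_report(path)
    for key in ("total", "used", "free"):
        print(f"{key:>5}: {format_gib(report[key])}")
```

Like df, this reports figures for the file system as a whole, so it may not reflect per-user quotas or space that a purge is about to free; consult the site's data storage policies for those.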

For specific data storage information for TeraGrid resources, see the Data Storage File Systems & Policies table on the Data Storage page in the TeraGrid User Support documentation.

This document was developed with support from the National Science Foundation (NSF) under Grant No. 0503697 to the University of Chicago and subcontracted to Indiana University. Additional support was provided by IU through its participation in the TeraGrid, which is supported by the NSF under Grants No. 0833618, SCI451237, SCI535258, and SCI504075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

This is document avyq in domains all and tgrid-all.
Last modified on October 07, 2009.
