Data Transfer
Table of Contents
- Data Transfer Overview
- Data Transfer Protocols
- Data Transfer Clients
- NCSA-TeraGrid Data Transfer Resources
- NCSA Mass Storage System Transfers
- Data Transfer Software Installation
- Transfer Performance Considerations
- Data Transfer Examples
Data Transfer Protocols
The software packages and command-line tools discussed below work on
top of a few standard protocols. Each protocol has its own semantics,
security model, and available software.
SSH/SCP
Advantages:
- Recursive feature allows simple reproduction of entire directory
hierarchies of files.
- Data is transmitted over a secure channel.
- Host-key-based authentication is possible.
- Convenient way to transfer source code or other relatively small
files to/from your /home directory.
Disadvantages:
- Individual files are transmitted separately, which becomes an issue
when network latency is high.
- Performance is poor over wide area links due to small TCP window
sizes.
- Transfers of files larger than 2 GB are not supported on some systems.
- Data encryption can become a bottleneck for large transfers.
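The window-size limitation above follows from a general property of TCP: a single stream can move at most one window of data per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below works that arithmetic for an illustrative 64 KB window; the function name and the 1 ms and 60 ms round-trip times are assumptions chosen for the example, not measurements from any NCSA system.

```shell
#!/bin/sh
# Upper bound on single-stream TCP throughput: window / round-trip time.
# max_throughput_bps is a hypothetical helper; the inputs are illustrative.
# Usage: max_throughput_bps WINDOW_BYTES RTT_MS  (prints bits per second)
max_throughput_bps() {
    window_bytes=$1
    rtt_ms=$2
    # bytes -> bits (x8), milliseconds -> seconds (x1000)
    echo $(( window_bytes * 8 * 1000 / rtt_ms ))
}

# Same 64 KB window on a local network and on a wide area link.
LAN_BPS=$(max_throughput_bps 65536 1)
WAN_BPS=$(max_throughput_bps 65536 60)

echo "64 KB window,  1 ms RTT: $LAN_BPS bits/s"
echo "64 KB window, 60 ms RTT: $WAN_BPS bits/s"
```

With these inputs the bound drops from roughly 524 Mbit/s on the local network to roughly 8.7 Mbit/s on the wide area link, regardless of how fast the link itself is.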
Recommendations:
- Use to transfer small files or directories containing source code
or other relatively small file sets.
- Tar directories containing large numbers of files when sending
over high-latency networks.
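The tar recommendation can be sketched as follows. The script builds a small demonstration directory, bundles it into a single archive, and sends that archive with scp only when a destination is supplied. The REMOTE variable, the demonstration file names, and the example host are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: bundle many small files into one archive before an scp transfer.
# REMOTE and the demo directory contents are hypothetical placeholders.
set -eu

# Build a throwaway directory tree with a few small files.
WORK=$(mktemp -d)
mkdir -p "$WORK/src_tree"
for i in 1 2 3 4 5; do
    echo "file $i" > "$WORK/src_tree/file_$i.txt"
done

# One archive means scp sends a single file instead of one per entry.
tar -cf "$WORK/src_tree.tar" -C "$WORK" src_tree

# Copy only when a destination is given, for example:
#   REMOTE=user@login.example.org:/home/user/ sh bundle.sh
if [ -n "${REMOTE:-}" ]; then
    scp "$WORK/src_tree.tar" "$REMOTE"
fi

# Confirm what went into the archive.
ARCHIVE_COUNT=$(tar -tf "$WORK/src_tree.tar" | grep -c 'file_')
echo "archived $ARCHIVE_COUNT files into $WORK/src_tree.tar"
```

On the receiving side, `tar -xf src_tree.tar` restores the directory hierarchy.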
HPN-SSH/SCP
Advantages:
- Works under the normal scp interface.
- Allows TCP receive buffers to be automatically adjusted or manually
set.
- Allows data encryption to be turned off, thus reducing the CPU
load and allowing high-bandwidth transfers to reach the full network
potential.
Disadvantages:
- Both client and server must be patched for full functionality.
- May degrade transfer performance on local low-latency networks.
Recommendations:
- Determine whether the client at the site where you will be issuing
scp commands is built with the HPN-SSH patch:
>ssh -V
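The check above can be scripted. The sketch below assumes that an HPN-patched build reports a version string containing "hpn" (for example a suffix such as "-hpn13v5"); that convention is an assumption about how the patch is usually packaged, so a missing marker is not proof that the patch is absent. The function name is_hpn_build is hypothetical.

```shell
#!/bin/sh
# Sketch: look for an "hpn" marker in the ssh version string.
# Assumption: HPN-patched builds include "hpn" in the output of ssh -V.
# is_hpn_build is a hypothetical helper name.
is_hpn_build() {
    # Lowercase the string so "HPN" and "hpn" both match.
    case "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" in
        *hpn*) return 0 ;;
        *)     return 1 ;;
    esac
}

# ssh -V writes its version to standard error, so redirect it.
if command -v ssh >/dev/null 2>&1; then
    VERSION=$(ssh -V 2>&1)
else
    VERSION="ssh not found"
fi

if is_hpn_build "$VERSION"; then
    echo "HPN marker found: $VERSION"
else
    echo "No HPN marker in: $VERSION"
fi
```

The same check should be repeated on the remote host, since both ends need the patch for full functionality.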
FTP
Advantages:
- Long-established Internet protocol, so it is widely available and
easy to implement.
- Many clients are available and are standard on most operating systems.
- Data is transmitted over an open (unencrypted) channel, which helps
if one is concerned with speed of transfer.
Disadvantages:
- Data is transmitted over an open (unencrypted) channel, which is a
problem if one is concerned with security of data content.
Recommendations:
- Good for interactive remote access.
- Use for quick access to MSS.
KFTP
- Same transfer protocol as FTP, but with added Kerberos security
features.
- Kerberized (or GSI-authenticated) FTP clients must be used to access
the NCSA MSS or other FTP servers protected by a Kerberos realm.
- Many standard FTP clients (named "ftp" not "kftp"),
including those bundled with popular Linux distributions, already
have Kerberos functionality built in.
GridFTP
Advantages:
- GSI authentication: Allows secure password-less authentication
with a valid X.509 certificate and proxy.
- Extended capabilities to accommodate performance increases through
parallelism, optimized buffering and other techniques.
Disadvantages:
- Limited to the FTP interface for recursive operations, listing, and
renaming.
- Detailed knowledge of server deployments and system characteristics
may be needed to obtain optimal performance.
Recommendations:
- Use when moving large data sets to or from a parallel file system.
- Striping or concurrent transfers (striped or non-striped) can help
take advantage of multiple server hosts.
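As an illustration of the parallelism and buffering options, the sketch below assembles a globus-url-copy command line with several parallel streams and an explicit TCP buffer size. The endpoint host names, paths, stream count, and buffer size are hypothetical values, not recommendations for any particular NCSA or TeraGrid server. The command is only printed unless RUN_TRANSFER=1 is set, and a real run also needs a valid proxy (for example from grid-proxy-init).

```shell
#!/bin/sh
# Sketch: build a GridFTP transfer command with parallel streams.
# All URLs and tuning values below are hypothetical placeholders.
SRC_URL="${SRC_URL:-gsiftp://gridftp.source.example.org/scratch/user/data.tar}"
DST_URL="${DST_URL:-gsiftp://gridftp.dest.example.org/scratch/user/data.tar}"
STREAMS=4                  # number of parallel TCP streams (-p)
TCP_BUFFER_BYTES=4194304   # TCP buffer size per stream in bytes (-tcp-bs)

# -vb reports bytes transferred and transfer rate while the copy runs.
GRIDFTP_CMD="globus-url-copy -vb -p $STREAMS -tcp-bs $TCP_BUFFER_BYTES $SRC_URL $DST_URL"
echo "$GRIDFTP_CMD"

# Run only when explicitly requested and when the client is installed.
if [ "${RUN_TRANSFER:-0}" = "1" ] && command -v globus-url-copy >/dev/null 2>&1; then
    $GRIDFTP_CMD
fi
```

Suitable stream counts and buffer sizes depend on the servers and the network path, so these values are best treated as a starting point for experimentation.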