Unylogix HSM - Hierarchical Storage Migration software
Overview, Features and Benefits
The Beginnings of HSM
HSM technology is not new. It has been around in mainframe environments
since the 1970s, where the management of large, centralized data repositories
was a major problem. (Imagine, if you will, the size of the disk farms that
had to exist on these computing mammoths; the data of a full enterprise!)
At that time, disk storage space was very expensive, and space was at a
premium. Accordingly, HSM was developed to automate the "freeing up"
of disk space, so that files (data, records, etc.) would be "retired"
to other storage technologies for longer term holding, primarily reel to
reel magnetic tape. The HSM software, through its own decision process,
made the selection as to which pieces of data had become inactive, and automatically
migrated that data out to the tape transports.
But Unylogix HSM takes data migration to a more intelligent level. It
weighs multiple parameters when deciding which pieces of information are
the best candidates for migration, which makes its selections more
accurate. Unylogix HSM is a true rules-based software tool, was one of the
first HSMs to offer a graphical user interface, and is one of the most
robust HSMs on the market.
HSM automates the storage management of all data under its control
by monitoring the hard disk space usage. Based upon the disk usage when
measured against predefined water-level marks, the HSM software will scan
the contents of the disk, and select candidates among that data for migration
out to another storage medium. When the space is finally needed, the HSM
software pushes the data out so that the disk does not fill up. At the
same time, it records the actual location of the data in its database
and leaves a stub (placeholder) behind on the disk layer, so that users
see their files as on-line (on the hard disk) when, in reality, the files
may be near-line (stored in a jukebox) or off-line on a shelf.
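The water-mark behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the product's actual algorithm: the high and low thresholds (90% and 70%), the `FileEntry` type, and the "oldest access first" ordering are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    path: str
    size: int           # bytes
    last_access: float  # seconds since the epoch

def select_for_migration(files, capacity, high_mark=0.90, low_mark=0.70):
    """Pick files to migrate once usage reaches the (assumed) high water mark.

    Files are chosen until usage would fall to the (assumed) low water mark.
    Returns an empty list while usage is still below the high mark.
    """
    used = sum(f.size for f in files)
    if used < high_mark * capacity:
        return []
    target = low_mark * capacity
    chosen = []
    # Simplification: the least recently accessed files are migrated first.
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= target:
            break
        chosen.append(f)
        used -= f.size
    return chosen

files = [FileEntry("a", 40, 1.0), FileEntry("b", 30, 2.0), FileEntry("c", 25, 3.0)]
picked = select_for_migration(files, capacity=100)
```

With a 100-byte disk that is 95% full, only the oldest file needs to move: migrating `a` brings usage down to 55 bytes, which is already below the low mark.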
All of this is transparent to the users: when they or their applications
request a file, the HSM software fetches it and provides it for use.
This process occurs seamlessly, without user or operator intervention.
The data, in general, is migrated upward to the disk layer, where it remains
available for use (refreshes, appending, etc.) until it falls into disuse
and again becomes a candidate for migration. Again, this process is all
automatic, based upon preset parameters selected by the system administrator.
All the user sees, if anything, is a slight time delay in the provision of
data by the HSM system.
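A toy model may help make the stub-and-recall idea concrete. The class below is a hypothetical sketch, not the Unylogix implementation: `HsmCatalog`, its dictionaries, and its method names are invented for illustration, and real stubs live in the file system rather than in memory.

```python
class HsmCatalog:
    """Toy model of an HSM-managed disk with a location database."""

    def __init__(self):
        self.disk = {}      # path -> data for files resident on the hard disk
        self.archive = {}   # path -> data held on secondary storage
        self.location = {}  # path -> "disk" or "archive" (stands in for the HSM database)

    def write(self, path, data):
        # All writes land on the hard disk layer.
        self.disk[path] = data
        self.location[path] = "disk"

    def migrate(self, path):
        # Move the data out; the entry in self.location acts as the stub.
        self.archive[path] = self.disk.pop(path)
        self.location[path] = "archive"

    def listing(self):
        # Migrated files still appear in listings, as if they were on-line.
        return sorted(self.location)

    def read(self, path):
        if self.location[path] == "archive":
            # Recall: bring the data back to the disk layer before serving it.
            self.disk[path] = self.archive.pop(path)
            self.location[path] = "disk"
        return self.disk[path]

catalog = HsmCatalog()
catalog.write("/home/report.txt", b"quarterly figures")
catalog.migrate("/home/report.txt")
data = catalog.read("/home/report.txt")  # triggers a recall transparently
```

The caller of `read` never needs to know where the file was; the only observable difference in a real system would be the recall delay.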
How the HSM Prioritizes Data
Unylogix HSM is a highly intelligent, robust software tool because it uses
four parameters to select candidates for migration: file size, date last
accessed, date last modified, and priority level.
Priority is a number assigned to a file or directory that increases or decreases
the probability of its remaining on the hard disk layer. It is adjustable
only by the superuser, who can use it to "lock" on the hard disk layer
any data that should not be migrated outward.
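One way to picture how four parameters might combine is a simple scoring function. The formula below is purely an assumption made for illustration (the source does not describe how the product weighs the parameters), as are the `Candidate` type, the `LOCKED` threshold, and the convention that a higher priority keeps a file on disk.

```python
from dataclasses import dataclass

DAY = 86400.0  # seconds per day
LOCKED = 100   # assumed priority value at which a file is never migrated

@dataclass
class Candidate:
    path: str
    size: int             # bytes
    last_access: float    # seconds since the epoch
    last_modified: float  # seconds since the epoch
    priority: int = 0     # assumed: higher means "keep on disk"

def migration_score(c, now):
    """Return a score (higher = better migration candidate), or None if locked."""
    if c.priority >= LOCKED:
        return None
    idle_days = (now - c.last_access) / DAY
    unmodified_days = (now - c.last_modified) / DAY
    size_mb = c.size / 1_000_000
    # Large, long-idle files score highest; priority scales the score down.
    return (idle_days + unmodified_days) * size_mb / (1 + c.priority)

def rank_candidates(candidates, now):
    """Return paths ordered from best to worst migration candidate."""
    scored = [(migration_score(c, now), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c.path for _, c in scored]

now = 100 * DAY
candidates = [
    Candidate("small_recent", 1_000_000, 99 * DAY, 99 * DAY),
    Candidate("big_old", 10_000_000, 0.0, 0.0),
    Candidate("locked", 50_000_000, 0.0, 0.0, priority=LOCKED),
]
order = rank_candidates(candidates, now)
```

Here the large, long-idle file is ranked first, the small recent file second, and the locked file is excluded regardless of its size or age.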
Unlike some HSM packages, Unylogix HSM is a true rules-based software tool
that allows you to extend disk storage to huge capacities without requiring
any intervention. It is used in conjunction with optical and tape jukeboxes,
either together or alone. Better HSM packages, like ours, have the ability
to back themselves up, independent of the network. Unylogix HSM makes the
disk self-grooming and keeps track of every piece of data it ever touches.
And, best of all, it is seamless to the users.
Different Uses of HSM
As a Buffer
Because all reads and writes to an HSM-controlled disk go through the
hard disk layer, one can do a "disk to disk" backup of all the
disks on a network, rather than backing up to a tape device, which may be
slower. And because the HSM ensures that the receiving disk never fills up,
the backups complete faster.
As a Self-Managing Repository
When a network has a large number of users who have a lot of local disk
space, the network administrator is burdened with keeping full and incremental
backups on a network that is too slow to allow the backups in the time windows
allotted. One solution, then, is to reduce the amount of data out on users'
workstations, and have them place the data on the HSM server, where
the software can do backups of its own contents, independent of the network
and independent of the network backup package. Let's say a network has 60
GB of data spread across the disks of an Ethernet network, which is a
formidable amount to back up within any time window. If the users agree to
place 40 GB of the data on the HSM server, then the network
backups would only have to cover 20 GB of storage, as the HSM
would handle the other 40 GB, automatically and independently of the network.
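The saving in the 60 GB example can be put in rough numbers. The sketch below assumes a sustained backup throughput of 1 MB/s, a figure chosen only for illustration since the text does not give one; `backup_hours` is a hypothetical helper.

```python
def backup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move data_gb gigabytes at a sustained rate (1 GB = 1000 MB)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600

ASSUMED_RATE = 1.0  # MB/s, illustrative assumption only

before = backup_hours(60, ASSUMED_RATE)  # everything backed up over the network
after = backup_hours(20, ASSUMED_RATE)   # 40 GB moved to the HSM server first
print(f"all 60 GB over the network: {before:.1f} h")
print(f"remaining 20 GB over the network: {after:.1f} h")
```

Whatever the real throughput, the network backup shrinks to one third of its original duration, since 20 GB is one third of 60 GB.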
As a Restore Server
A network administrator could closely link the network backup and the
HSM packages so that both use the same file system, letting
the backup package draw on the virtually unlimited capacity provided by the
HSM. That way, all of the backups would be accounted for and on-line.
In the event of a disaster, users could restore their own files to their
disks from the most recent backup, without operator intervention.
All of the data would be there; all the user has to do is request the
information, and it would be copied over the network automatically, with
no searching for files or pieces of media (tapes).
Performance of HSM Systems
All other things being equal, an HSM system may
be slower than a server that uses only hard disks as its data holder, but
only on reads, not on writes. The HSM will hold a lot more data, at a lower
cost per MB than the all-hard-disk alternative, and it will cost far less
to administer. But it will be slower on reads if the data requested is not
on the hard disk layer. If you are doing a dir or ls or
checking for folders, all of those files look as if they are on-line, so
the response time on these is the same as the all-disk server; of course,
if the file is already on the hard disk layer, then the performance for
the reading of the data is the same. As tape devices are serial in nature
(not random access like hard disk drives), they experience a little more
delay in finding a file than just the delay caused by the robotics intervention.
Optical media, though, will have less delay because it is random access.
HSM Tech Specs
Supported Platforms
Solaris (2.5, 2.6, 7, and 8) on both Sparc and Intel platforms, including
64-bit Sparc
Hardware Compatibility
Tested and certified with libraries or jukeboxes from all major manufacturers, including STK, Quantum ATL, Qualstar, Overland, HDS, SONY, Plasmon, etc.
Can handle libraries for all standard tape media, including: DLT, LTO, Exabyte, AIT, DAT, etc.
Can handle libraries for all standard optical media including: MO, CDR, CDRW, DVD, etc.
File System Support
Supports both UFS and NFS
Supported Devices
The following is a list of manufacturers whose robotic devices are supported.
The list is constantly changing, so if you do not find your device, please
contact us and we will let you know if it is supported.
Optical Jukeboxes
DISC, Maxoptix, Fujitsu, Plasmon, HP, Sony, IBM
Tape Libraries
ADIC, M4 Data, Storage Tek, ATL, Overland Data, StraightLine, Breece Hill Technologies, Plasmon, Sun, Exabyte, QualStar, Tandberg Data, IBM, Spectra Logic