Usage

hcpt is a command-line tool.

Syntax

$ hcpt [common arguments] subcommand [subcommand arguments]


usage: hcpt [-h] [--version] -u USER [-p PASSWORD] [-l LOGFILE]
            [-i seconds] [-t '# of threads'] [--nossl] [-v] [--gc t1.t2.t3]
            {cookie,load,list,retention,test,unload} ...
hcpt serves as a switchboard for the HCP Tools.
The subcommand is required; note that the arguments described here are in addition to each subcommand's own arguments (see the combined example after the options list below).

Positional arguments:

{cookie,load,list,retention,test,unload}
    cookie              calculate HCP access token
    load                load bulk testdata into HCP
    list                list HCP content
    retention           change retention setting for selected objects within
                        HCP (see specific instructions)
    test                test-run all the subcommands
    unload              delete content from HCP

Optional arguments:

-h, --help            show this help message and exit
--version             show program's version number and exit
-u USER, --user USER  data access account
-p PASSWORD, --password PASSWORD
                      password (will require manual input if not given)
-l LOGFILE, --logfile LOGFILE
                      logfile (defaults to 'hcpt.py_subcmd.log')
-i seconds, --loginterval seconds
                      logging interval (defaults to 10 sec.)
-t '# of threads', --threads '# of threads'
                      no. of parallel threads (defaults to 30)
--nossl               use http instead of https
-v                    verbosity (-v = INFO, -vv = DEBUG, -vvv = garbage
                      collection statistics)
--gc t1.t2.t3         garbage collection thresholds (defaults to
                      '700.10.10' - see
                      'http://docs.python.org/py3k/library/gc.html#gc.set_threshold')
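
Common arguments always precede the subcommand. An example of a combined
invocation (the logfile name, interval, and thread count are illustrative;
the namespace and directory are reused from the examples further below):

hcpt --user ns1 -l mylist.log -i 30 -t 10 -v \
     list --cluster ns1.matrix.hcp1.vm.local --dir /rest/hcpt_test1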

Subcommands

load

Performs bulk data ingestion into HCP for testing (!) purposes.

load already uses the hcpsdk to make proper use of HCP's resources while maintaining persistent http(s) sessions; Windows error 10048 doesn't apply here.

usage: hcpt load [-h] [--version] -c CLUSTER -d directory -f ingestfile
                 [-r retention_string] --structure # [# ...] --reqlogfile
                 REQLOGFILE

hcpt load performs bulk data ingestion into HCP for testing purposes. It always uses https (or http if '--nossl' is given) and allows for multi-threaded ingestion.

Optional arguments:

-h, --help            show this help message and exit
--version             show subfunction's version and exit
-c CLUSTER, --cluster CLUSTER
                      target namespace (fully qualified DNS name)
-d directory, --dir directory
                      target directory ('/rest/...' or '/fcfs_data/...')
-f ingestfile, --file ingestfile
                      file to be ingested
-r retention_string, --retention retention_string
                      retention (requires valid HCP retention string)
--structure # [# ...] directory structure to be built
--reqlogfile REQLOGFILE
                      log time needed per PUT into file

Controlled by --structure [#_of_dirs [#_of_dirs [...]]] #_of_files, a directory structure is built, and #_of_files copies of ingestfile are ingested into each lowest-level directory.

Example: --structure 3 3 3 causes three directories to be created below targetdir (0000, 0001, 0002), each containing another three subdirectories (0000, 0001, 0002), with three copies of ingestfile written into each of these subdirectories (see the illustration below).
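
With --structure 3 3 3, the resulting layout below targetdir would look
like this (27 object copies in total; an illustration, not actual tool
output):

targetdir
    0000
        0000    (3 copies of ingestfile)
        0001    (3 copies of ingestfile)
        0002    (3 copies of ingestfile)
    0001
        ...     (same as 0000)
    0002
        ...     (same as 0000)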

Warning

Be cautious: you could use up a lot of capacity in HCP and generate a lot of network traffic while using it…

Example:

hcpt --user ns1 --password ns101 -v -i3 load \
     --cluster ns1.matrix.hcp1.vm.local \
     --dir /rest/hcpt_test1 --file c:\hitachi_logo.txt \
     --structure 10 10 1

list

Discovers all objects in a given subdirectory within an HCP namespace by walking the directory tree top-down. The objects and directories found are listed in a CSV file (usable with MS Excel) and in an SQLite3 database file.

list already uses the hcpsdk to make proper use of HCP's resources while maintaining persistent http(s) sessions; Windows error 10048 doesn't apply here.

usage: hcpt list [-h] [--version] -c CLUSTER -d directory [--all]
                 [-B DATABASE] [--out {db,csv,both}] [--fatDB]
                 [--QF queuesize] [--Qdb queuesize] [--delay milliseconds]
                 [--outfile OUTFILE] [--showThreads]
                 [--pause_after PAUSE_AFTER]
                 [--pause_minutes PAUSE_MINUTES]

hcpt list lists all objects in a given subdirectory within an HCP namespace while discovering the directory tree top-down. Don't (!!!) run it against large directory trees on a production server - it may kill the server by eating up all its resources…

Optional arguments:

-h, --help            show this help message and exit
--version             show subfunction's version and exit
-c CLUSTER, --cluster CLUSTER
                      target namespace (fully qualified DNS name)
-d directory, --dir directory
                      target directory ('/rest/...' or '/fcfs_data/...')
--all                 find deleted objects, too (if versioning is configured
                      for the namespace)
-B DATABASE, --database DATABASE
                      database file (defaults to
                      'hcpthcptcmds.<timestamp>.[fat|slim].sqlite3')
--out {db,csv,both}   select the output format
--fatDB, --fat        include all available information in database
--QF queuesize        defines the allowed no. of items in FindQueue
--Qdb queuesize       defines the allowed no. of items in dbWriterQueue
--delay milliseconds  add a delay (pause) in ms between two requests
                      executed against HCP by a single thread
--outfile OUTFILE     filename for the resulting .csv file (defaults to
                      'hcpt_list.csv')
--showThreads         show info about running threads
--pause_after PAUSE_AFTER
                      pause discovery after <amt> files found
--pause_minutes PAUSE_MINUTES
                      pause discovery for <amt> minutes when
                      --pause_after triggers

Be aware: when discovering large directory trees, memory usage might become a problem, up to the point where this program might hang or even crash. You should monitor it by using -v or even -vvv. The best advice is to limit the number of threads (-t) to no more than 50 and to limit the queues (--QF and --Qdb) to 10,000 and 20,000 items, respectively. You might encounter a deadlock situation where --QF is at its maximum and no objects are found; in this case, you'll need to remove the limit on --QF and possibly lower the number of threads. Speeding up garbage collection by tuning --gc might help, too. But take care: this program might grab as much main memory as is available, potentially affecting other applications - it's up to you to monitor that! Expect long (and I mean: really long) run times when discovering multi-million-object directory trees!

If you'd like to work with the database generated by this program, you can use the tools provided at http://www.sqlite.org/download.html.
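
Applying that advice to the example below, a more conservative invocation
might look like this (the --gc values are illustrative; tune all numbers
to your environment):

hcpt --user ns1 --password ns101 -v -t 50 --gc 1000.15.15 list \
     --cluster ns1.matrix.hcp1.vm.local \
     --dir /rest/hcpt_test1 --QF 10000 --Qdb 20000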

Example:

hcpt --user ns1 --password ns101 -v -i3 list \
     --cluster ns1.matrix.hcp1.vm.local \
     --dir /rest/hcpt_test1

retention

Changes the retention setting for selected objects. Takes a database generated by hcpt list as input to update the retention setting of the objects listed in the database. After generating the database with hcpt list, the field flist.new_ret must be updated with the new retention setting for each object (see the description below).

usage: hcpt retention [-h] [--version] -B DATABASE [--delay DELAY]

hcpt retention takes a database generated by hcpt list, in which the column flist.new_ret has been altered with a new retention string (see below). For every object (!) with a value in column flist.new_ret, hcpt retention tries to change the object's retention within HCP to the given value.

Optional arguments:

-h, --help            show this help message and exit
--version             show subfunction's version and exit
-B DATABASE, --database DATABASE
                      database file generated by 'hcpt list' and altered as
                      described below
--delay DELAY         add a delay (pause) in ms between two requests
                      executed against HCP by a single thread

To alter the database, you can use the SQLite shell, available on Mac OS X and many Linux distributions, or from https://sqlite.org/download.html.

For example, if your database file is called hcplist.sqlite3 and you want to add 1 year to every object’s retention, you can follow these steps prior to running this tool:

$ sqlite3 hcplist.sqlite3
sqlite> UPDATE flist SET new_ret='R+1y' WHERE type='file' OR type='object';
sqlite> .quit
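
To sanity-check the change before running hcpt retention, you can count
the rows that now carry the new retention string:

$ sqlite3 hcplist.sqlite3
sqlite> SELECT COUNT(*) FROM flist WHERE new_ret='R+1y';
sqlite> .quit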

It is YOUR responsibility to specify a valid retention string - hcpt retention will not check it for validity!!!

Example:

hcpt --user ns1 --password ns101 -v -i3 retention \
     --database hcplist.sqlite3

test

Runs all subcommands against HCP to verify the tool's functionality.

usage: hcpt test [-h] [--version] -c CLUSTER -d directory -f ingestfile
                 [--versionedNS] [-r retention_string] --structure #
                 [# ...]

hcpt test runs all subcommands against HCP, making sure that the program works.

Optional arguments:

-h, --help            show this help message and exit
--version             show subfunction's version and exit
-c CLUSTER, --cluster CLUSTER
                      target namespace (fully qualified DNS name)
-d directory, --dir directory
                      target directory ('/rest/...' or '/fcfs_data/...')
-f ingestfile, --file ingestfile
                      file to be ingested
--versionedNS         set this if the target namespace has versioning
                      enabled
-r retention_string, --retention retention_string
                      retention (defaults to 'N+1s')
--structure # [# ...] directory structure to be built

Example:

hcpt -i5 --user <user> --password <password> test \
     --cluster ns.tenant.hcp.vm.loc --dir /rest/hcpt_test \
     --file <filename> --structure 10 100

unload

Performs deletion of data within HCP namespaces by discovering a directory tree top-down (alternatively, a list of objects to be deleted can be provided). It will find all directories and objects within that tree and begin deleting objects immediately as they are found. Directory deletion starts bottom-up once the whole tree has been discovered. It writes an SQLite3 database file with a single record for each directory and object found, containing all the information available for it. This can grow quite large…

usage: hcpt unload [-h] [--version] -c CLUSTER -d directory
                   [--infile INFILE] [-B DATABASE] [--fatDB] [--keepDB]
                   [--QF queuesize] [--Qdb queuesize] [--objonly] [--purge]
                   [--privileged REASON] [--YES] [--versionedNS]

hcpt unload performs deletion of data within HCP namespaces by discovering a directory tree top-down. It will find all directories and objects within that tree and begin deleting objects immediately as they are found. Directory deletion starts bottom-up once the whole tree has been discovered. It writes an SQLite3 database file with a single record for each directory and object found, containing all the information available for it. This can grow quite large…

Optional arguments:

-h, --help            show this help message and exit
--version             show subfunction's version and exit
-c CLUSTER, --cluster CLUSTER
                      target namespace (fully qualified DNS name)
-d directory, --dir directory
                      target directory ('/rest/...' or '/fcfs_data/...')
--infile INFILE       file holding a list of objects to be deleted (full
                      path: '/rest/.../object' or '/fcfs_data/.../object').
                      If set, '--dir' is used only to determine the type
                      of namespace.
-B DATABASE, --database DATABASE
                      database file (defaults to
                      'hcpthcptcmds.<timestamp>.[fat|slim].sqlite3')
--fatDB               include all available information in database
--keepDB              do not delete the database file when finished
--QF queuesize        size of internal queue (defaults to unlimited)
--Qdb queuesize       defines the allowed no. of items in dbWriterQueue
--objonly             do not delete directories
--purge               purge versions (if not set, directory deletion will
                      fail if versioning is enabled)
--privileged REASON   perform privileged delete (requires a 'reason')
--YES                 ...if you really (!) want to delete the found
                      objects/directories (defaults to 'generate a list of
                      objects/directories only')
--versionedNS         set this if the target namespace has versioning
                      enabled

Be aware: if you have directories with a huge number (10,000+) of objects, main memory will be used excessively, increasingly so the more threads you use. This can lead to runtime errors; in that case, you will need to serialize processing by limiting the number of threads, down to 1 (one) depending on the available main memory. Of course, this leads to a much longer runtime - monitor the processing by using the command-line switch -v.
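
For instance, to serialize processing as described above and to delete only
the objects named in a prepared list (the list file and its contents are
illustrative, following the path format described for '--infile'):

$ cat objects_to_delete.txt
/rest/hcpt_test1/0000/0000/hitachi_logo.txt
/rest/hcpt_test1/0000/0001/hitachi_logo.txt

hcpt --user ns1 --password ns101 -v -t 1 unload \
     --cluster ns1.matrix.hcp1.vm.local --dir /rest/hcpt_test1 \
     --infile objects_to_delete.txt --YES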

Example:

hcpt --user ns1 --password ns101 -v -i3 unload --cluster \
     ns1.matrix.hcp1.vm.local --dir /rest/hcpt_test1 --YES