sul-dlss/preservation_catalog

Name: preservation_catalog

Owner: Stanford University Digital Library

Description: Rails application to track, audit and replicate archival artifacts associated with SDR objects.

Created: 2017-08-16 17:55:52.0

Updated: 2018-05-24 18:50:22.0

Pushed: 2018-05-25 02:28:08.0

Homepage:

Size: 271577

Language: Ruby


README


Rails application to track, audit and replicate archival artifacts associated with SDR objects.

Getting Started
PostgreSQL
Installing Postgres

If you use homebrew you can install PostgreSQL with:

brew install postgresql

Make sure Postgres starts every time your computer starts up.

brew services start postgresql

Check to see if Postgres is installed with postgres -V and that it's accepting connections with pg_isready.

Configuring Postgres

Using the psql utility, run these two setup scripts from the command line, like so:

psql -f db/scripts/pres_setup.sql postgres
psql -f db/scripts/pres_test_setup.sql postgres

These scripts do the following for you:

For more info on postgres commands, see https://www.postgresql.org/docs/

Redis

Install and run redis. For example, using homebrew:

brew install redis
brew services start redis
Usage Instructions
General Info About Running These Rake Tasks

As an alternative to screen, you can also run tasks in the background using nohup so the invoked command is not killed when you exit your session. Output that would've gone to stdout is instead redirected to a file called nohup.out, or you can redirect the output explicitly. For example:

RAILS_ENV=production nohup bundle exec rake seed_catalog >seed_whole_catalog_nohup-2017-12-12.txt &
Seed the catalog

Seeding the catalog presumes an empty or nearly empty database. Otherwise, running the seed task will throw "druid NOT expected to exist in catalog but was found" errors for each found object.
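To illustrate why seeding expects an empty database, here is a minimal sketch. It is not the application's code: the `seed` helper and the in-memory hash standing in for the catalog are hypothetical, and only the error wording is taken from the text above.

```ruby
# Hypothetical in-memory stand-in for the catalog: druid => version.
# The real application keeps this data in PostgreSQL.
def seed(catalog, moabs_on_disk)
  errors = []
  moabs_on_disk.each do |druid, version|
    if catalog.key?(druid)
      # Seeding expects to create every entry, so an existing one is reported.
      errors << "#{druid} NOT expected to exist in catalog but was found"
    else
      catalog[druid] = version
    end
  end
  errors
end

moabs_on_disk = { 'bz514sm9647' => 3 }

catalog = {}
first_run = seed(catalog, moabs_on_disk)   # catalog starts empty
second_run = seed(catalog, moabs_on_disk)  # catalog is now populated

puts first_run.inspect
puts second_run.inspect
```

In this sketch the second run reports one error per object already present, which is the behavior the paragraph above warns about.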

Without profiling:

RAILS_ENV=production bundle exec rake seed_catalog

With profiling:

RAILS_ENV=production bundle exec rake seed_catalog[profile]

This will generate a log at, for example, log/profile_seed_catalog_for_all_storage_roots2017-11-13T13:57:01-flat.txt

Reset the catalog for re-seeding

WARNING! this will erase the catalog, and thus require re-seeding from scratch. It is mostly intended for development purposes, and it is unlikely that you'll need to run this against production once the catalog is in regular use.

Drop or Populate the catalog for a single endpoint

To run either of the rake tasks below, give the name of the moab storage_root (e.g. from settings/development.yml) as an argument.

Drop all database entries:
RAILS_ENV=production bundle exec rake drop[fixture_sr1]
Populate the catalog:
RAILS_ENV=production bundle exec rake populate[fixture_sr1]
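
The bracket syntax used by these commands is how Rake passes positional arguments to a task. The sketch below shows the mechanism with a hypothetical task named `example_check`; it is not one of this application's tasks, and its body only records the arguments it receives.

```ruby
require 'rake'

# Hypothetical task mirroring the `task_name[storage_root,profile]` call shape.
# The body only records its arguments; it does not touch a catalog.
calls = []
Rake::Task.define_task(:example_check, [:storage_root, :profile]) do |_task, args|
  calls << { storage_root: args[:storage_root], profile: args[:profile] == 'profile' }
end

# Equivalent to `rake example_check[fixture_sr1,profile]` on the command line.
Rake::Task[:example_check].invoke('fixture_sr1', 'profile')

# A task runs only once per process unless it is re-enabled.
Rake::Task[:example_check].reenable
Rake::Task[:example_check].invoke('fixture_sr1')

puts calls.inspect
```

Arguments that are omitted arrive as `nil`, which is why an optional trailing argument such as `profile` can simply be left off.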
Run Moab to Catalog existence check for a single root and for all storage roots

To run rake tasks below, give the name of the moab storage_root (e.g. from settings/development.yml) as an argument.

Single Root

RAILS_ENV=production bundle exec rake m2c_exist_single_root[fixture_sr1,profile]

This will generate a log at, for example, `log/profiler_check_existence_for_dir2017-12-11T14:34:06-flat.txt`

All Roots
Without profiling:

RAILS_ENV=production bundle exec rake m2c_exist_all_storage_roots

With profiling:

RAILS_ENV=production bundle exec rake m2c_exist_all_storage_roots[profile]

This will generate a log at, for example, `log/profile_check_existence_for_all_storage_roots2017-12-11T14:25:31-flat.txt`

Run Catalog to Moab existence check for a single root or for all storage roots

Given a catalog entry for an online moab, ensure that the online moab exists and that the catalog version matches the online moab version.
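The comparison described here can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the helper names and status symbols are hypothetical, and the object directory is placed directly under a temporary root rather than in a real storage root layout. It relies only on Moab version directories being named like `v0001`.

```ruby
require 'tmpdir'
require 'fileutils'

# Highest version number among `v0001`-style directories, or nil when the
# object directory is missing or holds no version directories.
def online_moab_version(object_dir)
  return nil unless Dir.exist?(object_dir)

  versions = Dir.children(object_dir).map { |name| name[/\Av(\d+)\z/, 1] }.compact
  versions.map(&:to_i).max
end

# :ok when versions agree, :missing when no moab is found, :mismatch otherwise.
def c2m_status(catalog_version, object_dir)
  disk_version = online_moab_version(object_dir)
  return :missing if disk_version.nil?

  disk_version == catalog_version ? :ok : :mismatch
end

Dir.mktmpdir do |root|
  object_dir = File.join(root, 'bz514sm9647')
  FileUtils.mkdir_p(File.join(object_dir, 'v0001'))
  FileUtils.mkdir_p(File.join(object_dir, 'v0002'))

  puts c2m_status(2, object_dir)                  # catalog agrees with disk
  puts c2m_status(3, object_dir)                  # catalog is ahead of disk
  puts c2m_status(1, File.join(root, 'absent'))   # nothing on disk
end
```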

To run rake tasks below, give a date and the name of the moab storage_root (e.g. from settings/development.yml) as arguments.

The (date/timestamp) argument is a threshold: it will run the check on all catalog entries which last had a version check BEFORE the argument. It should be in the format '2018-01-22 22:54:48 UTC'.

Note: Must enter date/timestamp argument as a string.
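As a rough illustration of the threshold semantics, the sketch below parses a timestamp in the documented format and keeps only entries last checked before it. The `Entry` struct, its `last_version_audit` field, and the second druid are placeholders invented for the example, not names taken from the application.

```ruby
require 'time'

# Hypothetical stand-in for catalog rows: a druid plus the time of its last version check.
Entry = Struct.new(:druid, :last_version_audit)

# Return the entries whose last version check happened strictly before the threshold.
def entries_needing_check(entries, threshold_string)
  threshold = Time.parse(threshold_string)
  entries.select { |entry| entry.last_version_audit < threshold }
end

entries = [
  Entry.new('bz514sm9647', Time.utc(2018, 1, 20, 12, 0, 0)),
  Entry.new('aa111bb2222', Time.utc(2018, 1, 23, 8, 30, 0)) # placeholder druid
]

stale = entries_needing_check(entries, '2018-01-22 22:54:48 UTC')
puts stale.map(&:druid).inspect
```

Only the entry checked on 2018-01-20 falls before the threshold, so it is the only one selected.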

Single Root
Without profiling:

RAILS_ENV=production bundle exec rake c2m_check_version_on_dir['2018-01-22 22:54:48 UTC',fixture_sr1]

With profiling:

RAILS_ENV=production bundle exec rake c2m_check_version_on_dir['2018-01-22 22:54:48 UTC',fixture_sr1,profile]

This will generate a log at, for example, `log/profile_c2m_check_version_on_dir2018-01-01T14:25:31-flat.txt`

All Roots
Without profiling:

RAILS_ENV=production bundle exec rake c2m_check_version_all_dirs['2018-01-22 22:54:48 UTC']

With profiling:

RAILS_ENV=production bundle exec rake c2m_check_version_all_dirs['2018-01-22 22:54:48 UTC',profile]

This will generate a log at, for example, `log/profile_c2m_check_version_all_roots2018-01-01T14:25:31-flat.txt`

Run Checksum Validation for a single root or for all storage roots
Parse all manifestInventory.xml and most recent signatureCatalog.xml for stored checksums and verify against computed checksums.
To run rake tasks below, give the name of the endpoint (e.g. from settings/development.yml).
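
The core idea of checksum validation, comparing a stored value against one recomputed from the file, can be sketched with Ruby's standard `Digest` library. This is a simplified illustration: it does not parse manifestInventory.xml or signatureCatalog.xml, the `checksum_matches?` helper is hypothetical, and SHA-256 is chosen here only for the example.

```ruby
require 'digest'
require 'tempfile'

# Compare a stored checksum against one computed from the file's current bytes.
def checksum_matches?(path, stored_sha256)
  Digest::SHA256.file(path).hexdigest == stored_sha256
end

Tempfile.create('moab_file') do |file|
  file.write('archival content')
  file.flush

  # Stands in for a value that would be read from a manifest.
  stored = Digest::SHA256.hexdigest('archival content')
  puts checksum_matches?(file.path, stored)

  # Simulate on-disk corruption and validate again.
  File.write(file.path, 'corrupted content')
  puts checksum_matches?(file.path, stored)
end
```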

Single Root
Without profiling:

RAILS_ENV=production bundle exec rake cv_single_endpoint[fixture_sr3]

With profiling:

RAILS_ENV=production bundle exec rake cv_single_endpoint[fixture_sr3,profile]

This will generate a log at, for example, `log/profile_cv_validate_disk2018-01-01T14:25:31-flat.txt`

All Roots
Without profiling:

RAILS_ENV=production bundle exec rake cv_all_endpoints

With profiling:

RAILS_ENV=production bundle exec rake cv_all_endpoints[profile]

This will generate a log at, for example, `log/profile_cv_validate_disk_all_endpoints2018-01-01T14:25:31-flat.txt`

One druid at a time
Without profiling:

RAILS_ENV=production bundle exec rake cv_druid[bz514sm9647]

Development

Running Tests

Run the tests:

rake spec

Deploying

Capistrano is used to deploy.

Run `rake db:seed` in a deploy environment:

bundle exec cap stage db_seed # for the stage servers


bundle exec cap prod db_seed # for the prod servers


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.