Name: base_indexer
Owner: Stanford University Digital Library
Description: working
Created: 2015-03-19 00:02:14.0
Updated: 2018-03-19 18:55:00.0
Pushed: 2018-03-12 15:50:24.0
Size: 326
Language: Ruby
To set up for the first time and generate all docs:
$ rake
To run only the tests after that:
$ bundle exec rake spec
$ rails new my_indexer_app
gem 'base_indexer'
$ bundle install
$ rails g base_indexer:install
The engine looks for the following configuration values:
config.solr_config_file_path = "#{config.root}/config/solr.yml"
DiscoveryIndexer::PURL_DEFAULT='https://purl.stanford.edu'
The engine lets the developer extend any of its classes.
To extend any of the indexer's features (purl-reader, mods-reader, mapper, solr-writer)
To extend mapper functionality.
All rake tasks that perform batch indexing write log files to the "log" folder within the app itself. You can tail a log file to watch progress. The log file can also be passed to the "reindexer" rake task to retry only the errored-out druids. The log file name depends on which rake task you run and is timestamped so that it is unique.
$ rake index RAILS_ENV=production target=revs_prod druid=oo000oo0001
$ rake log_indexer RAILS_ENV=production target=revs_prod log_file=/tmp/mailander_1.yaml log_type=preassembly # preassembly run
$ nohup rake log_indexer RAILS_ENV=production target=revs_prod log_file=/tmp/mailander_1.yaml log_type=preassembly & # nohup long-running processes, which will be most runs with more than a few dozen druids
$ rake log_indexer RAILS_ENV=production target=revs_prod log_file=/tmp/mailander_1_remediate.yaml log_type=remediate # remediation run
$ rake log_indexer RAILS_ENV=production target=revs_prod log_file=/tmp/mailander.csv log_type=csv # a simple CSV file; it must have a header line, with the "druid" header defining the items you wish to index
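The CSV input described above needs a header line containing "druid". As a rough sketch of what that means in practice, the following uses Ruby's standard `csv` library to read the "druid" column from such a file. The helper name `druids_from_csv` and the extra `note` column are hypothetical and are not part of base_indexer; the gem's own parsing may differ.

```ruby
require 'csv'

# Hypothetical helper (not part of base_indexer): return the values found
# under the "druid" header of CSV text that starts with a header line.
def druids_from_csv(csv_text)
  table = CSV.parse(csv_text, headers: true)
  raise ArgumentError, 'CSV needs a "druid" header' unless table.headers.include?('druid')

  # Drop blank cells and surrounding whitespace.
  table['druid'].compact.map(&:strip).reject(&:empty?)
end

# Sample input; the "note" column is only there to show that other
# columns are ignored by this sketch.
sample = <<~CSV
  druid,note
  oo000oo0001,first item
  oo000oo0002,second item
CSV

puts druids_from_csv(sample)
```

Running a check like this before starting a long `log_indexer` run can catch a missing or misspelled header early.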
$ rake collection_indexer RAILS_ENV=production target=revs_prod collection_druid=oo000oo0001
$ nohup rake collection_indexer RAILS_ENV=production target=revs_prod collection_druid=oo000oo0001 & # nohup long-running processes, e.g. a collection with more than a few dozen druids
If you had errors when indexing from a preassembly/remediation log or when indexing an entire collection, you can re-run only the errored-out druids by passing the log file. All log files are kept in the log folder in the revs-indexer-service app.
$ rake reindexer RAILS_ENV=production target=revs_prod file=log/logfile.log
$ nohup rake reindexer RAILS_ENV=production target=revs_prod file=log/logfile.log & # probably no need to nohup unless there were a lot of errors
Delete a list of druids specified in a CSV/txt file. Be careful, this will delete from all targets! Put one druid per line, no header is necessary.
$ rake delete_druids RAILS_ENV=production file=druid_list.txt
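Because the list delete applies to all targets, it can be worth checking the file before running the task. The sketch below splits a one-druid-per-line list into well-formed and suspicious entries. The helper name `check_druid_list` is hypothetical, and the pattern is an assumption inferred only from the example identifiers in this README (two letters, three digits, two letters, four digits); it is not the gem's own validation.

```ruby
# Assumed druid shape, inferred from examples such as oo000oo0001.
DRUID_SHAPE = /\A[a-z]{2}\d{3}[a-z]{2}\d{4}\z/

# Hypothetical helper: returns [well_formed, suspicious] entries from
# list text with one druid per line and no header.
def check_druid_list(text)
  entries = text.each_line.map(&:strip).reject(&:empty?)
  entries.partition { |entry| entry.match?(DRUID_SHAPE) }
end

good, bad = check_druid_list("oo000oo0001\nnot-a-druid\n\noo000oo0002\n")
puts "ok: #{good.inspect}"
puts "check before deleting: #{bad.inspect}"
```

In real use the text would come from something like `File.read('druid_list.txt')`.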
$ rake delete RAILS_ENV=production druid=oo000oo0001
/items/:druid
DELETE /items/:druid
Deletes a druid from all registered subtargets.
Name | Located In | Description | Required | Schema | Default
---- | ---------- | ----------- | -------- | ------ | -------
druid | url | object identifier | yes | String |

Code | Description
---- | -----------
200 | Request received and successfully processed for all subtargets
500 | Request received but did not complete successfully (one or more subtargets may not have been deleted)
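As an illustration of calling this endpoint from Ruby, the sketch below builds the DELETE request with the standard `net/http` library. The base URL `http://localhost:3000` and the helper name `delete_item_request` are assumptions for the example. The request is only built, not sent, so running the snippet deletes nothing.

```ruby
require 'net/http'
require 'uri'

# Assumption: the indexer app is reachable at this base URL; use your own host.
BASE_URL = 'http://localhost:3000'

# Hypothetical helper: build (but do not send) the DELETE request for a druid.
def delete_item_request(druid, base_url = BASE_URL)
  uri = URI.join(base_url, "/items/#{druid}")
  [uri, Net::HTTP::Delete.new(uri)]
end

uri, request = delete_item_request('oo000oo0001')
puts "#{request.method} #{request.path}"

# To actually send it (this removes the druid from all subtargets):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# response.code would then be "200" or "500", matching the response codes listed above.
```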
/items/:druid/subtargets/:subtarget
DELETE /items/:druid/subtargets/:subtarget
Deletes a druid from a specific subtarget.
Name | Located In | Description | Required | Schema | Default
---- | ---------- | ----------- | -------- | ------ | -------
druid | url | object identifier | yes | String |
subtarget | url | subtarget name (usually capitalized) | yes | String |

Code | Description
---- | -----------
200 | Request received and successfully processed for subtarget
500 | Request received but did not complete successfully
PATCH/PUT /items/:druid/subtargets/:subtarget
Indexes a druid into a specific subtarget.
Name | Located In | Description | Required | Schema | Default
---- | ---------- | ----------- | -------- | ------ | -------
druid | url | object identifier | yes | String |
subtarget | url | subtarget name (usually capitalized) | yes | String |

Code | Description
---- | -----------
200 | Request received and successfully processed for subtarget
500 | Request received but did not complete successfully
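The two subtarget endpoints share the same path and differ only in the HTTP verb. A minimal sketch of building either request with `net/http` follows. The helper name `subtarget_request`, the base URL, and the subtarget name `MYTARGET` are all hypothetical. As written, nothing is sent over the network.

```ruby
require 'net/http'
require 'uri'

# Map of supported verbs to the corresponding Net::HTTP request classes.
VERBS = {
  delete: Net::HTTP::Delete,
  put: Net::HTTP::Put,
  patch: Net::HTTP::Patch
}.freeze

# Hypothetical helper: build (but do not send) a request against
# /items/:druid/subtargets/:subtarget.
# Assumption: the app is reachable at base_url.
def subtarget_request(verb, druid, subtarget, base_url = 'http://localhost:3000')
  uri = URI.join(base_url, "/items/#{druid}/subtargets/#{subtarget}")
  [uri, VERBS.fetch(verb).new(uri)]
end

# Index a druid into one subtarget, then remove it from that subtarget.
[:put, :delete].each do |verb|
  _uri, request = subtarget_request(verb, 'oo000oo0001', 'MYTARGET')
  puts "#{request.method} #{request.path}"
end

# Sending works the same way for either request:
# Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```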