hasadna/nli-z3950

Name: nli-z3950

Owner: The Public Knowledge Workshop

Description: Script to help get bibliographic data from the National Library of Israel using the Z39.50 protocol and the MARC format

Created: 2018-03-12 11:25:14.0

Updated: 2018-04-23 14:50:16.0

Pushed: 2018-04-23 14:51:11.0

Homepage:

Size: 39

Language: Python

README

nli-z3950

Script to help get bibliographic data from the National Library of Israel using the Z39.50 protocol and the MARC format

By default, the script dumps a JSON serialization of the MARC data. Optionally, it can also dump the MARC data in the original binary MARC21 format.

Usage
Stateful search using CCL queries

Search queries should be provided in data/ccl_queries/ccl_queries.csv with a single ccl_query column
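
As a minimal sketch, the query file could be generated with Python's standard csv module. The helper name `write_ccl_queries` and the two example queries are illustrative assumptions: the qualifiers (`ti`, `au`) follow the YAZ CCL examples and may not match what the library's server accepts.

```python
import csv
import os


def write_ccl_queries(queries, path="data/ccl_queries/ccl_queries.csv"):
    """Write CCL queries to a CSV file with a single ccl_query column."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["ccl_query"])  # the single column described above
        for query in queries:
            writer.writerow([query])
    return path


if __name__ == "__main__":
    # Hypothetical queries; adjust the qualifiers to what the server supports.
    write_ccl_queries(["ti=jerusalem", "au=agnon and ti=stories"])
```

Run it from the repository root so that the file lands in the data directory mounted into the container.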

The search takes the existing results as input and only adds new entries
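
The idea of a stateful search can be sketched as a merge that keeps what was already stored and appends only unseen records. This is not the script's actual implementation: the function name, the dictionary record shape, and the `id` key are assumptions made for illustration.

```python
def merge_new_entries(existing, fetched, key="id"):
    """Return existing records plus fetched records whose key is not already present."""
    seen = {record[key] for record in existing}
    merged = list(existing)  # never modify or drop stored records
    for record in fetched:
        if record[key] not in seen:
            merged.append(record)
            seen.add(record[key])  # also guards against duplicates within fetched
    return merged


if __name__ == "__main__":
    stored = [{"id": "001", "title": "first"}]
    fresh = [{"id": "001", "title": "first"}, {"id": "002", "title": "second"}]
    print(merge_new_entries(stored, fresh))
```

Keeping stored records untouched means that re-running the search is safe to repeat: a second run with the same server results adds nothing.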

docker run -it -v `pwd`/data:/data orihoch/nli-z3950 run ./search

Output data will be available under data/search_results directory

Export search results
docker run -it -v `pwd`/data:/data orihoch/nli-z3950 run ./search_export

Using CCL Queries

See https://software.indexdata.com/yaz/doc/tools.html#CCL for some examples

Development
Using Docker

Build and run locally

docker build -t nli-z3950 . &&\
docker run -it -v `pwd`/data:/data nli-z3950 run --verbose ./search

Locally

See the Dockerfile for installation instructions. You need both Python 2.7 and Python 3.6 and some dependencies.

PYTHON2=python2 MAX_RECORDS=50 dpp run --verbose ./search

Sync with Google Storage

sudo chown -R $USER data
gsutil -m rsync -r ./data gs://knesset-data-pipelines/hasadna-migdar-data/$USER-`date +%Y-%m-%d_%H-%M`
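
The destination name can also be built in Python, which makes the timestamp format easy to check. This is a sketch that assumes the suffix is meant to be the date followed by hour and minute; in strftime notation `%M` is the minute, while `%m` is the month. The function name `sync_destination` is hypothetical.

```python
import getpass
from datetime import datetime

BUCKET_PREFIX = "gs://knesset-data-pipelines/hasadna-migdar-data"


def sync_destination(user, now, bucket_prefix=BUCKET_PREFIX):
    """Build the per-user, timestamped destination path for the sync."""
    stamp = now.strftime("%Y-%m-%d_%H-%M")  # %M = minutes, %m = month
    return "{}/{}-{}".format(bucket_prefix, user, stamp)


if __name__ == "__main__":
    print(sync_destination(getpass.getuser(), datetime.now()))
```

The printed path can be passed as the destination argument of the rsync command.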
