Name: metasanity
Owner: DataONE
Description: A bare bones metadata validation tool
Created: 2017-06-22 23:55:57.0
Updated: 2017-06-26 19:35:07.0
Pushed: 2017-08-23 19:15:37.0
Homepage: null
Size: 68
Language: Java
No-frills, schema-aware metadata validator.
Attempts to validate content against local copies of the metadata schemas
used by Coordinating Nodes, emulating the validation performed during
the internal create operation.
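As a rough illustration of what schema-aware validation involves, the sketch below uses the JDK's javax.xml.validation API. It is not taken from the metasanity source: the class name, the method name, and the tiny inline schema are assumptions made for the example, and the real tool works from the schemas copied from a Coordinating Node.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaValidationSketch {

    // Hypothetical stand-in schema: a <record> containing exactly one <title>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='record'><xs:complexType><xs:sequence>"
      + "<xs:element name='title' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Returns true if the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, the first validation error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<record><title>ok</title></record>")); // true
        System.out.println(isValid(XSD, "<record><name>bad</name></record>"));  // false
    }
}
```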
To build, cd to the metasanity folder and run:

    mvn package

If all goes as expected, the result is target/metasanity-X.Y-SNAPSHOT.jar.
First, populate the local schema folder with a copy of the schemas from a
Coordinating Node (requires shell access to a CN):

    rsync -avz -e "ssh" cn.dataone.org:/var/lib/tomcat7/webapps/metacat/schema .
Run metasanity from the command line, for example:

    java -jar target/metasanity-1.0-SNAPSHOT.jar samples/iso_01.xml
metasanity expects an XML catalog file named "schemas.xml" to be in the
working directory. Use -c to specify a different catalog.
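The catalog selection described above could be handled along the following lines. This is only a sketch of the behaviour as documented here (default of schemas.xml, overridden by -c); the class and field names are hypothetical, and metasanity's actual argument handling may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class CatalogOptionSketch {

    // Default catalog name, looked up relative to the working directory.
    static final String DEFAULT_CATALOG = "schemas.xml";

    /** Parsed command line: the catalog to use and the documents to validate. */
    static final class Options {
        String catalog = DEFAULT_CATALOG;
        final List<String> documents = new ArrayList<>();
    }

    /** Treats "-c PATH" as the catalog override and everything else as a document. */
    static Options parse(String[] args) {
        Options opts = new Options();
        for (int i = 0; i < args.length; i++) {
            if ("-c".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-c requires a catalog path");
                }
                opts.catalog = args[++i];
            } else {
                opts.documents.add(args[i]);
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        Options opts = parse(args);
        System.out.println("Using catalog: " + opts.catalog);
        for (String doc : opts.documents) {
            System.out.println("Parsing: " + doc);
        }
    }
}
```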
The output from the tool is something like:

    java -jar target/metasanity-1.0-SNAPSHOT.jar samples/iso_01.xml
    Parsing: samples/iso_01.xml
    Document is valid.

or:

    java -jar target/metasanity-1.0-SNAPSHOT.jar samples/iso_02_cn-invalid.xml
    Parsing: samples/iso_02_cn-invalid.xml
    Error:
       Public ID: null
       System ID: file:///Users/vieglais/Documents/Projects/DataONE_PhaseII/Projects/NetBeans/metasanity/samples/iso_02_cn-invalid.xml
       Line number: 632
       Column number: 21
       Message: cvc-complex-type.2.4.a: Invalid content was found starting with element 'gmd:taxonomy'. One of '{"http://www.isotc211.org/2005/gmd":aggregationInfo, "http://www.isotc211.org/2005/gmd":spatialRepresentationType, "http://www.isotc211.org/2005/gmd":spatialResolution, "http://www.isotc211.org/2005/gmd":language}' is expected.
    Document is not valid. Please review issues noted above.
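A report with these fields can be produced by a SAX ErrorHandler that records each problem rather than stopping at the first one. The sketch below shows one way to do that with the JDK alone; the class names and the inline schema are assumptions for illustration and are not taken from metasanity's own error handler.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ErrorReportSketch {

    // Hypothetical stand-in schema: a <record> containing exactly one <title>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='record'><xs:complexType><xs:sequence>"
      + "<xs:element name='title' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Collects every reported problem so that all issues can be listed together. */
    static final class CollectingHandler implements ErrorHandler {
        final List<String> issues = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            issues.add(level + ":\n"
                + "   Public ID: " + e.getPublicId() + "\n"
                + "   System ID: " + e.getSystemId() + "\n"
                + "   Line number: " + e.getLineNumber() + "\n"
                + "   Column number: " + e.getColumnNumber() + "\n"
                + "   Message: " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("Warning", e); }
        @Override public void error(SAXParseException e) { record("Error", e); }
        @Override public void fatalError(SAXParseException e) throws SAXException {
            record("Fatal", e);
            throw e; // parsing cannot continue after a fatal error
        }
    }

    /** Validates the XML text and returns one formatted entry per issue found. */
    static List<String> validate(String xsd, String xml) {
        CollectingHandler handler = new CollectingHandler();
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
            validator.setErrorHandler(handler);
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            // Make sure a failure is never silently reported as "valid".
            if (handler.issues.isEmpty()) handler.issues.add("Fatal: " + e.getMessage());
        }
        return handler.issues;
    }

    public static void main(String[] args) {
        List<String> issues = validate(XSD, "<record><name>bad</name></record>");
        issues.forEach(System.out::println);
        System.out.println(issues.isEmpty()
            ? "Document is valid."
            : "Document is not valid with " + issues.size() + " issues.");
    }
}
```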
Note that metasanity uses an XML Catalog and so differs from the implementation on the DataONE CNs.
Three examples of XML Catalog files are provided:
schemas.xml
Generic catalog for Dublin Core and ISOTC211
isotc211-catalog.xml
Catalog specifically for the formatID http://www.isotc211.org/2005/gmd
isotc211-noaa-catalog.xml
Catalog specifically for the formatID http://www.isotc211.org/2005/gmd-noaa
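To make the role of a catalog concrete, the sketch below maps a formatID-style namespace to a local schema file and looks it up. It uses the javax.xml.catalog API that ships with Java 9 and later, which is a stand-in chosen for the example and not necessarily the resolver metasanity uses. The catalog entry and the local path schema/gmd/gmd.xsd are hypothetical, not copied from the provided catalog files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

public class CatalogLookupSketch {

    // Hypothetical catalog: one namespace mapped to a path relative to the catalog file.
    static final String CATALOG_XML =
        "<?xml version='1.0'?>\n"
      + "<catalog xmlns='urn:oasis:names:tc:entity:xmlns:xml:catalog'>\n"
      + "  <uri name='http://www.isotc211.org/2005/gmd' uri='schema/gmd/gmd.xsd'/>\n"
      + "</catalog>\n";

    /**
     * Writes the catalog text to a temporary file, loads it, and returns the
     * local URI mapped to the given namespace, or null when there is no entry.
     */
    static String lookup(String catalogXml, String namespace) {
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            file.toFile().deleteOnExit();
            Files.write(file, catalogXml.getBytes(StandardCharsets.UTF_8));
            Catalog catalog =
                CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
            return catalog.matchURI(namespace);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup(CATALOG_XML, "http://www.isotc211.org/2005/gmd"));
        System.out.println(lookup(CATALOG_XML, "http://www.isotc211.org/2005/gmd-noaa"));
    }
}
```

Because each catalog decides which local schema a namespace resolves to, pointing the same document at a different catalog can change the validation result.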
Example of ISOTC211 from NOAA valid for the gmd-noaa schema variant:
    java -jar target/metasanity-1.0-SNAPSHOT.jar -c isotc211-noaa-catalog.xml samples/iso_01.xml
    23, 2017 3:03:54 PM org.dataone.metasanity.MetaSanity main
    INFO: Using catalog: isotc211-noaa-catalog.xml
    23, 2017 3:03:54 PM org.dataone.metasanity.MetaSanity main
    INFO: Parsing: samples/iso_01.xml
    23, 2017 3:03:55 PM org.dataone.metasanity.MetaSanity main
    INFO: Document is valid.
And invalid for the plain ISOTC211 variant:
    java -jar target/metasanity-1.0-SNAPSHOT.jar -c isotc211-catalog.xml samples/iso_01.xml
    23, 2017 3:10:15 PM org.dataone.metasanity.MetaSanity main
    INFO: Using catalog: isotc211-catalog.xml
    23, 2017 3:10:15 PM org.dataone.metasanity.MetaSanity main
    INFO: Parsing: samples/iso_01.xml
    23, 2017 3:10:22 PM org.dataone.metasanity.MetaSanity$ValidationErrorHandler error
    SEVERE: cvc-complex-type.2.4.a: Invalid content was found starting with element 'gmx:Anchor'. One of '{"http://www.isotc211.org/2005/gco":CharacterString}' is expected.
      Public ID: null
      System ID: file:///Users/vieglais/Documents/Projects/DataONE_PhaseII/Projects/NetBeans/metasanity/samples/iso_01.xml
      Line number: 136
      Column number: 167
    23, 2017 3:10:22 PM org.dataone.metasanity.MetaSanity$ValidationErrorHandler error
    SEVERE: cvc-complex-type.2.4.a: Invalid content was found starting with element 'gmx:Anchor'. One of '{"http://www.isotc211.org/2005/gco":CharacterString}' is expected.
      Public ID: null
      System ID: file:///Users/vieglais/Documents/Projects/DataONE_PhaseII/Projects/NetBeans/metasanity/samples/iso_01.xml
      Line number: 884
      Column number: 125
    23, 2017 3:10:22 PM org.dataone.metasanity.MetaSanity main
    WARNING:
    Document is not valid with 70 issues. Please review issues noted above.