Name: uri-normalizer
Owner: Histograph
Description: Histograph URI Normalizer
Created: 2015-06-24 12:39:47.0
Updated: 2016-01-13 00:33:32.0
Pushed: 2017-01-10 11:10:26.0
Homepage: null
Size: 17
Language: JavaScript
Used by Histograph to define one set of identifiers to be used by Graphmalizer.
Graphmalizer is stupid (by design): when two identifiers are lexicographically equal (equal as character strings) they are considered to refer to the same thing.
Histograph is more flexible when it comes to specifying identifiers:
- URIs (http://vocab.getty.edu/tgn/7006952, urn:ietf:rfc:2141)
- HGIDs (123, foo/234)
- an id or a uri field (but not both)
- a from and a to field, each of which can be a URI or an HGID

This project matches any Histograph identifier string and normalizes it into a URI.
We have URLs (https://api.histograph.io/foo/123) and URNs (urn:foo:123).
They are documented in
Lexical equivalence means equality as character strings: just by looking at two identifiers, you can decide whether they refer to the same thing.
RFC 2141: For various purposes such as caching, it's often desirable to determine if two URNs are the same without resolving them.
Example: the following URNs are lexically equivalent.
URN:foo:a123,456
urn:foo:a123,456
urn:FOO:a123,456
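The comparison behind this example can be sketched as follows. This is a minimal illustration of the RFC 2141 rule (the leading urn: token and the NID are case-insensitive, the rest is case-sensitive apart from the hex digits of %-escapes); urnLexicalKey and lexicallyEquivalent are hypothetical helper names and are not part of this module.

```javascript
// Hypothetical helpers, not part of histograph-uri-normalizer.
// Reduce a URN to a key that is equal for lexically equivalent URNs.
function urnLexicalKey(urn) {
  // "urn" ":" NID ":" NSS; the NSS may itself contain colons.
  var match = /^urn:([^:]+):(.+)$/i.exec(urn);
  if (!match) {
    return null; // not a URN
  }
  var nid = match[1].toLowerCase(); // the NID is case-insensitive
  // Only %-escapes are case-normalized inside the NSS.
  var nss = match[2].replace(/%[0-9a-fA-F]{2}/g, function (escape) {
    return escape.toLowerCase();
  });
  return 'urn:' + nid + ':' + nss;
}

function lexicallyEquivalent(a, b) {
  var keyA = urnLexicalKey(a);
  return keyA !== null && keyA === urnLexicalKey(b);
}

console.log(lexicallyEquivalent('URN:foo:a123,456', 'urn:FOO:a123,456')); // true
console.log(lexicallyEquivalent('urn:foo:a123,456', 'urn:foo:A123,456')); // false
```

Note that this is looser than plain string equality: two URNs can be lexically equivalent in the RFC sense while still differing as character strings, which is one reason to normalize them first.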
Again, quoting RFC 2141.
RFC 2141: Functional equivalence is determined by practice within a given namespace and managed by resolvers for that namespeace. Thus, it is beyond the scope of this document. Namespace registration must include guidance on how to determine functional equivalence for that namespace, i.e. when two URNs are the identical within a namespace.
Fictional example:
urn:hgconcept:bag/123,tgn/234,geonames/345
urn:hgconcept:geonames/345,bag/456
These might refer to the same concepts.
Known URLs are converted into canonical-form URNs with NID hg:
http://vocab.getty.edu/tgn/7006952 ~> urn:hg:tgn:7006952
http://sws.geonames.org/2758064/ ~> urn:hg:geonames:2758064
URNs are left untouched (if canonical form is known, convert to that):
urn:hg:geonames:2758064 ~> urn:hg:geonames:2758064
urn:ietf:rfc:2141 ~> urn:ietf:rfc:2141
HGIDs within a dataset foo are expanded to URNs with NID hgid:
12345-nl ~> urn:hgid:foo:12345-nl
bar/45678901 ~> urn:hgid:bar:45678901
Reverse (resolving):
urn:hg:geonames:2758064 ~> http://sws.geonames.org/2758064/
Etc.
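The rules above can be sketched as a small table-driven function. This is only an illustration under stated assumptions: the KNOWN table, toURN and toURL are hypothetical names, the base URLs are taken from the examples above, and leaving unknown URIs untouched is an assumption rather than documented behaviour of this module.

```javascript
// Hypothetical lookup table built from the examples above;
// it is not the module's namespaces.js.
var KNOWN = {
  tgn: { base: 'http://vocab.getty.edu/tgn/', suffix: '' },
  geonames: { base: 'http://sws.geonames.org/', suffix: '/' }
};

function toURN(id, dataset) {
  // 1. Known URLs become urn:hg:<namespace>:<id>.
  for (var name in KNOWN) {
    var base = KNOWN[name].base;
    if (id.indexOf(base) === 0) {
      var rest = id.slice(base.length).replace(/\/$/, ''); // drop trailing slash
      return 'urn:hg:' + name + ':' + rest;
    }
  }
  // 2. Anything else with a scheme (including URNs) is left untouched (assumption).
  if (/^[a-zA-Z][a-zA-Z0-9+.\-]*:/.test(id)) {
    return id;
  }
  // 3. dataset/id already names its dataset.
  var slash = id.indexOf('/');
  if (slash !== -1) {
    return 'urn:hgid:' + id.slice(0, slash) + ':' + id.slice(slash + 1);
  }
  // 4. A bare id is expanded with the dataset passed in.
  return 'urn:hgid:' + dataset + ':' + id;
}

// Reverse direction: resolve a urn:hg URN back to its URL, or null if unknown.
function toURL(urn) {
  var match = /^urn:hg:([^:]+):(.+)$/.exec(urn);
  if (!match || !KNOWN[match[1]]) {
    return null;
  }
  return KNOWN[match[1]].base + match[2] + KNOWN[match[1]].suffix;
}

console.log(toURN('http://sws.geonames.org/2758064/')); // urn:hg:geonames:2758064
console.log(toURN('12345-nl', 'foo'));                  // urn:hgid:foo:12345-nl
console.log(toURL('urn:hg:geonames:2758064'));          // http://sws.geonames.org/2758064/
```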
namespaces.js contains a set of default namespaces.
See also:
Identifier strings are matched according to the following regular expressions:
// Matching strings look like a URI to use, based on RFC2141
var SCHEME = /^[a-zA-Z][a-zA-Z0-9+-\.]*:$/
// Match `foo/123` HGIDs
var HGID = /^[a-zA-Z0-9\.+-_]+\/[a-zA-Z0-9\.+-_]+$/
// Everything else
var ID = /^[a-zA-Z0-9\.+-_]+$/
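A classifier built on these three patterns might look like the sketch below. It rests on two assumptions. First, classify is a hypothetical helper, and applying SCHEME to the text up to and including the first colon is a guess at how the pattern is meant to be used. Second, the character classes here place the hyphen last so that it is read literally; in a JavaScript character class, `+-_` and `+-\.` are parsed as ranges, so the patterns as written also admit characters such as `/` and `:` (for `+-_`) or `,` (for `+-\.`). Whether the module intends the stricter reading is not stated.

```javascript
// Variants of the patterns above with the hyphen placed last in each
// character class, so it is a literal hyphen rather than a range (assumption).
var SCHEME = /^[a-zA-Z][a-zA-Z0-9+.-]*:$/;
var HGID = /^[a-zA-Z0-9.+_-]+\/[a-zA-Z0-9.+_-]+$/;
var ID = /^[a-zA-Z0-9.+_-]+$/;

// Hypothetical helper: report which of the three patterns a string matches.
function classify(s) {
  // Everything up to and including the first colon, or '' if there is none.
  var prefix = s.slice(0, s.indexOf(':') + 1);
  if (SCHEME.test(prefix)) {
    return 'uri';
  }
  if (HGID.test(s)) {
    return 'hgid';
  }
  if (ID.test(s)) {
    return 'id';
  }
  return 'unknown';
}

console.log(classify('http://sws.geonames.org/2758064/')); // uri
console.log(classify('urn:ietf:rfc:2141'));                // uri
console.log(classify('foo/123'));                          // hgid
console.log(classify('123'));                              // id
```

The order of the tests matters: the URI check runs first because a URL contains slashes, and the bare-identifier check runs last as the fallback.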
First:
npm install histograph/uri-normalizer
Just do the right thing:
var n = require('histograph-uri-normalizer').normalize;
console.log(n('http://sws.geonames.org/2758064/'))
// => urn:hg:geonames:2758064
// Don't need to, but might as well pass dataset identifier
console.log(n('foo/123', 'bar'))
// => urn:hgid:foo:123
// Need to pass dataset identifier
console.log(n('123', 'bar'))
// => urn:hgid:bar:123
Or use the more specific methods:
var normalizer = require('histograph-uri-normalizer');
var urn = normalizer.URLtoURN('http://sws.geonames.org/2758064/');
console.log(urn); // contains 'urn:hg:geonames:2758064'
normalizer.normalize(s, nid)
Tries to detect whether s is a URI, a local HGID, or a global HGID, then normalizes it accordingly.
If s is a URI, all available namespaces are used to convert it.
normalizer.URLtoURN(url, [nid])
Tries to normalize url using all available namespaces. If nid is specified, only that namespace is used.
normalizer.URNtoURL(urn)
Resolves urn to a URL.
normalizer.addNamespace(nid, namespace)
Adds a new namespace nid to the available namespaces. A new namespace must define a string baseUrl and two functions, URLtoURN(url) and URNtoURL(nid, nss).
Example:
var newNamespace = {
  baseUrl: 'http://sws.geonames.org/',
  URLtoURN: function(url) {
    // Take the first run of digits in the URL as the identifier
    var match = /.*?(\d+).*/.exec(url);
    return 'urn:geonames:' + match[1];
  },
  URNtoURL: function(nid, nss) {
    return this.baseUrl + nss + '/';
  }
};
normalizer.addNamespace('geonames', newNamespace)
normalizer.removeNamespace(nid)
Removes namespace nid from the list of namespaces.
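To make the namespace contract concrete, here is a minimal stand-in for such a registry. It is a sketch, not the module's implementation: the namespaces object, the validation in addNamespace, and the choice to select a namespace by baseUrl prefix are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for the module's namespace list.
var namespaces = {};

function addNamespace(nid, namespace) {
  // Enforce the documented contract (the check itself is an assumption).
  if (typeof namespace.baseUrl !== 'string' ||
      typeof namespace.URLtoURN !== 'function' ||
      typeof namespace.URNtoURL !== 'function') {
    throw new Error('namespace must define baseUrl, URLtoURN and URNtoURL');
  }
  namespaces[nid] = namespace;
}

function removeNamespace(nid) {
  delete namespaces[nid];
}

// Try every namespace, or only nid when it is given.
function URLtoURN(url, nid) {
  var candidates = nid ? [nid] : Object.keys(namespaces);
  for (var i = 0; i < candidates.length; i++) {
    var ns = namespaces[candidates[i]];
    // Assumption: a namespace applies when the URL starts with its baseUrl.
    if (ns && url.indexOf(ns.baseUrl) === 0) {
      return ns.URLtoURN(url);
    }
  }
  return null;
}

// Split urn:<nid>:<nss> and hand both parts to the matching namespace.
function URNtoURL(urn) {
  var parts = urn.split(':');
  var ns = namespaces[parts[1]];
  return ns ? ns.URNtoURL(parts[1], parts.slice(2).join(':')) : null;
}

addNamespace('geonames', {
  baseUrl: 'http://sws.geonames.org/',
  URLtoURN: function (url) {
    var match = /.*?(\d+).*/.exec(url);
    return 'urn:geonames:' + match[1];
  },
  URNtoURL: function (nid, nss) {
    return this.baseUrl + nss + '/';
  }
});

console.log(URLtoURN('http://sws.geonames.org/2758064/')); // urn:geonames:2758064
console.log(URNtoURL('urn:geonames:2758064'));             // http://sws.geonames.org/2758064/
```

Calling URNtoURL as a method of the namespace object keeps this.baseUrl bound, which the example namespace relies on.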
Copyright © 2015 Waag Society.