NVIDIA/dfcpub

Name: dfcpub

Owner: NVIDIA Corporation

Description: null

Created: 2017-12-14 01:07:30.0

Updated: 2018-04-02 23:57:53.0

Pushed: 2018-04-03 00:37:04.0

Homepage: null

Size: 2676

Language: Go

README

DFC: Distributed File Cache with Amazon and Google Cloud backends
Overview

DFC is a simple distributed caching service written in Go. The service consists of an arbitrary number of gateways (realized as HTTP proxy servers) and any number of storage targets utilizing local disks:

DFC overview

Users connect to the proxies and execute RESTful commands. Data then moves directly between storage targets that cache this data and the requesting http(s) clients.

Prerequisites

Extended attributes, or xattrs, are currently supported by all mainstream filesystems. However, xattrs may not always be enabled in the OS kernel configuration, which can be checked by running the setfattr (Linux) or xattr (macOS) command, as shown in this single-host local deployment script. If xattrs are not available, you can configure DFC not to use them at all (see the Configuration section below).

To get started, it is also optional (albeit desirable) to have access to an Amazon S3 or GCP bucket. If you don't have Amazon and/or Google Cloud accounts, you can use DFC local buckets as illustrated a) in the API section below and b) in the test sources. Note that local and Cloud-based buckets support the same API with minor exceptions (only local buckets can be renamed, for instance).

Getting Started

If you've already installed Go, getting started with DFC takes about 30 seconds and consists of the following 4 steps:

go get -u -v github.com/NVIDIA/dfcpub/dfc
cd $GOPATH/src/github.com/NVIDIA/dfcpub/dfc
make deploy
go test ./tests -v -run=down -numfiles=2 -bucket=<your bucket name>

The 1st command will install both the DFC source code and all its dependencies under your configured $GOPATH.

The 3rd - deploys DFC daemons locally (for details, please see the script).

Finally, for the 4th command to work, you'll need the name of a bucket. The bucket could be an AWS or GCP based one, or a DFC-own so-called “local bucket”.

Assuming the bucket exists, the 'go test' command above will download 2 (two) objects. Similarly:

go test ./tests -v -run=download -args -numfiles=100 -match='a\d+' -bucket=myS3bucket

downloads up to 100 objects from the bucket called myS3bucket, whereby names of those objects will match 'a\d+' regex.

For more testing/running command line options, please refer to the source.

For other useful commands, see the Makefile.

Helpful Links: Go
Helpful Links: AWS
Configuration

DFC configuration is consolidated in a single JSON file where all of the knobs are meant to be self-explanatory and most of them have pre-assigned default values. The notable exceptions include:

DFC configuration: TCP port and URL

and

DFC configuration: local filesystems

Disabling extended attributes

To make sure that DFC does not utilize xattrs, configure “checksum”=“none” and “versioning”=“none” for all targets in a DFC cluster. This can be done via the common configuration “part” that'd be further used to deploy the cluster.

Enabling HTTPS

To switch from HTTP to encrypted HTTPS, configure “use_https”=“true” and modify the “server_certificate” and “server_key” values so they point to your OpenSSL certificate and key files respectively.
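Once HTTPS is enabled, clients have to trust the configured certificate. The following is a minimal client-side sketch, assuming a self-signed PEM certificate; the helper name `newHTTPSClient` and the local test server that stands in for a DFC proxy are illustrative and not part of DFC.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newHTTPSClient returns an HTTP client that trusts only the given
// PEM-encoded certificate (hypothetical helper, not a DFC API).
func newHTTPSClient(certPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(certPEM) {
		return nil, errors.New("no valid certificate found in PEM data")
	}
	tr := &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}}
	return &http.Client{Transport: tr}, nil
}

// demoHTTPS starts a local TLS test server (a stand-in for an HTTPS-enabled
// proxy) and returns the status code seen by a client built from its certificate.
func demoHTTPS() (int, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: srv.Certificate().Raw})
	client, err := newHTTPSClient(certPEM)
	if err != nil {
		return 0, err
	}
	resp, err := client.Get(srv.URL + "/v1/daemon")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	code, err := demoHTTPS()
	fmt.Println(code, err)
}
```

With a real deployment, the PEM bytes would be read from the file that “server_certificate” points to.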

Miscellaneous

The following sequence downloads 100 objects from the bucket called “myS3bucket”:

go test -v -run=down -bucket=myS3bucket

and then finds the corresponding cached objects in the local bucket and cloud buckets, respectively:

find /tmp/dfc -type f | grep local
find /tmp/dfc -type f | grep cloud

This, of course, assumes that all DFC daemons are local and non-containerized.

Further, to locate all the logs, run:

find $LOGDIR -type f | grep log

where $LOGDIR is the configured logging directory as per DFC configuration.

To terminate a running DFC service and cleanup local caches, run:

make kill
make rmcache
REST operations

DFC supports a growing number and variety of RESTful operations. To illustrate common conventions, let's take a look at the example:

curl -X GET -H 'Content-Type: application/json' -d '{"what": "config"}' http://192.168.176.128:8080/v1/daemon

This command queries the DFC configuration; at the time of this writing it'll result in a JSON output that looks as follows:

{"smap":{"":{"node_ip_addr":"","daemon_port":"","daemon_id":"","direct_url":""},"15205:8081":{"node_ip_addr":"192.168.176.128","daemon_port":"8081","daemon_id":"15205:8081","direct_url":"http://192.168.176.128:8081"},"15205:8082":{"node_ip_addr":"192.168.176.128","daemon_port":"8082","daemon_id":"15205:8082","direct_url":"http://192.168.176.128:8082"},"15205:8083":{"node_ip_addr":"192.168.176.128","daemon_port":"8083","daemon_id":"15205:8083","direct_url":"http://192.168.176.128:8083"}},"version":5}

Notice the 4 (four) ubiquitous elements in the curl command line above:

  1. HTTP verb aka method.

In the example, it's a GET but it can also be POST, PUT, and DELETE. For a brief summary of the standard HTTP verbs and their CRUD semantics, see, for instance, this REST API tutorial.

  2. URL host: hostname or IP address of one of the DFC servers.

By convention, a RESTful operation performed on a DFC proxy server usually implies a “clustered” scope. Exceptions include querying the proxy's own configuration via the {"what": "config"} message.

  3. URL path: version of the REST API, resource that is operated upon, and possibly more forward-slash delimited specifiers.

For example: /v1/cluster where 'v1' is the currently supported API version and 'cluster' is the resource.

  4. Control message in JSON format, e.g. {"what": "config"}.

Combined, all these elements tell the following story. They specify the most generic action (e.g., GET) and designate the target aka “resource” of this action: e.g., an entire cluster or a given daemon. Further, they may also include context-specific and JSON-encoded control message to, for instance, distinguish between getting system statistics ({"what": "stats"}) versus system configuration ({"what": "config"}).
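The four elements can be assembled programmatically as well. The sketch below is illustrative only: `controlMsg` and `newDFCRequest` are hypothetical names, and a local test server that echoes what it receives stands in for a DFC proxy.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// controlMsg mirrors the {"what": "..."} message from the text
// (illustrative type, not taken from the dfc package).
type controlMsg struct {
	What string `json:"what"`
}

// newDFCRequest combines the HTTP verb, the server base URL, the versioned
// resource path, and a JSON-encoded control message into one request.
func newDFCRequest(verb, baseURL, path string, msg interface{}) (*http.Request, error) {
	body, err := json.Marshal(msg)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(verb, baseURL+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// demoRequest sends GET {"what":"stats"} /v1/cluster to a stand-in server
// and returns the echoed "<verb> <path> <body>" line.
func demoRequest() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		fmt.Fprintf(w, "%s %s %s", r.Method, r.URL.Path, b)
	}))
	defer srv.Close()

	req, err := newDFCRequest(http.MethodGet, srv.URL, "/v1/cluster", controlMsg{What: "stats"})
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	return string(out), err
}

func main() {
	fmt.Println(demoRequest())
}
```

Against a real cluster, `srv.URL` would be replaced by the address of a proxy or target, such as http://192.168.176.128:8080 in the examples here.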

| Operation | HTTP action | Example |
| --- | --- | --- |
| Unregister storage target | DELETE /v1/cluster/daemon/daemonID | curl -i -X DELETE http://192.168.176.128:8080/v1/cluster/daemon/15205:8083 |
| Register storage target | POST /v1/cluster/register | curl -i -X POST -H 'Content-Type: application/json' -d '{"node_ip_addr": "172.16.175.41", "daemon_port": "8083", "daemon_id": "43888:8083", "direct_url": "http://172.16.175.41:8083"}' http://192.168.176.128:8083/v1/cluster/register |
| Get cluster map | GET {"what": "smap"} /v1/daemon | curl -X GET -H 'Content-Type: application/json' -d '{"what": "smap"}' http://192.168.176.128:8080/v1/daemon |
| Get proxy or target configuration | GET {"what": "config"} /v1/daemon | curl -X GET -H 'Content-Type: application/json' -d '{"what": "config"}' http://192.168.176.128:8080/v1/daemon |
| Update individual DFC daemon (proxy or target) configuration | PUT {"action": "setconfig", "name": "some-name", "value": "other-value"} /v1/daemon | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action": "setconfig","name": "stats_time", "value": "1s"}' http://192.168.176.128:8081/v1/daemon |
| Update individual DFC daemon (proxy or target) configuration | PUT {"action": "setconfig", "name": "some-name", "value": "other-value"} /v1/daemon | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action":"setconfig","name":"loglevel","value":"4"}' http://192.168.176.128:8080/v1/daemon |
| Set cluster-wide configuration (proxy) | PUT {"action": "setconfig", "name": "some-name", "value": "other-value"} /v1/cluster | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action": "setconfig","name": "stats_time", "value": "1s"}' http://192.168.176.128:8080/v1/cluster |
| Shutdown target/proxy | PUT {"action": "shutdown"} /v1/daemon | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action": "shutdown"}' http://192.168.176.128:8082/v1/daemon |
| Shutdown cluster (proxy) | PUT {"action": "shutdown"} /v1/cluster | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action": "shutdown"}' http://192.168.176.128:8080/v1/cluster |
| Rebalance cluster (proxy) | PUT {"action": "rebalance"} /v1/cluster | curl -i -X PUT -H 'Content-Type: application/json' -d '{"action": "rebalance"}' http://192.168.176.128:8080/v1/cluster |
| Get cluster statistics (proxy) | GET {"what": "stats"} /v1/cluster | curl -X GET -H 'Content-Type: application/json' -d '{"what": "stats"}' http://192.168.176.128:8080/v1/cluster |
| Get target statistics | GET {"what": "stats"} /v1/daemon | curl -X GET -H 'Content-Type: application/json' -d '{"what": "stats"}' http://192.168.176.128:8083/v1/daemon |
| Get object (proxy) | GET /v1/objects/bucket-name/object-name | curl -L -X GET http://192.168.176.128:8080/v1/objects/myS3bucket/myobject -o myobject [1] |
| Put object (proxy) | PUT /v1/objects/bucket-name/object-name | curl -L -X PUT http://192.168.176.128:8080/v1/objects/myS3bucket/myobject -T filenameToUpload |
| Get bucket names | GET /v1/buckets/* | curl -L -X GET 'http://192.168.176.128:8080/v1/buckets/*?local=true' |
| List bucket | GET { properties-and-options… } /v1/buckets/bucket-name | curl -X GET -L -H 'Content-Type: application/json' -d '{"props": "size"}' http://192.168.176.128:8080/v1/buckets/myS3bucket [2] |
| Rename/move object (local buckets) | POST {"action": "rename", "name": new-name} /v1/objects/bucket-name/object-name | curl -i -X POST -L -H 'Content-Type: application/json' -d '{"action": "rename", "name": "dir2/DDDDDD"}' http://192.168.176.128:8080/v1/objects/mylocalbucket/dir1/CCCCCC [3] |
| Copy object | PUT /v1/objects/bucket-name/object-name?from_id=&to_id= | curl -i -X PUT 'http://192.168.176.128:8083/v1/objects/mybucket/myobject?from_id=15205:8083&to_id=15205:8081' [4] |
| Delete object | DELETE /v1/objects/bucket-name/object-name | curl -i -X DELETE -L http://192.168.176.128:8080/v1/objects/mybucket/mydirectory/myobject |
| Evict object from cache | DELETE '{"action": "evict"}' /v1/objects/bucket-name/object-name | curl -i -X DELETE -L -H 'Content-Type: application/json' -d '{"action": "evict"}' http://192.168.176.128:8080/v1/objects/mybucket/myobject |
| Create local bucket (proxy) | POST {"action": "createlb"} /v1/buckets/bucket-name | curl -i -X POST -H 'Content-Type: application/json' -d '{"action": "createlb"}' http://192.168.176.128:8080/v1/buckets/abc |
| Destroy local bucket (proxy) | DELETE {"action": "destroylb"} /v1/buckets/bucket | curl -i -X DELETE -H 'Content-Type: application/json' -d '{"action": "destroylb"}' http://192.168.176.128:8080/v1/buckets/abc |
| Rename local bucket (proxy) | POST {"action": "renamelb"} /v1/buckets/bucket-name | curl -i -X POST -H 'Content-Type: application/json' -d '{"action": "renamelb", "name": "newname"}' http://192.168.176.128:8080/v1/buckets/oldname |
| Prefetch a list of objects | POST '{"action":"prefetch", "value":{"objnames":"[o1[,o]]"[, deadline: string][, wait: bool]}}' /v1/buckets/bucket-name | curl -i -X POST -H 'Content-Type: application/json' -d '{"action":"prefetch", "value":{"objnames":["o1","o2","o3"], "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Prefetch a range of objects | POST '{"action":"prefetch", "value":{"prefix":"your-prefix","regex":"your-regex","range":"min:max" [, deadline: string][, wait:bool]}}' /v1/buckets/bucket-name | curl -i -X POST -H 'Content-Type: application/json' -d '{"action":"prefetch", "value":{"prefix":"__tst/test-", "regex":"\\d22\\d", "range":"1000:2000", "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Delete a list of objects | DELETE '{"action":"delete", "value":{"objnames":"[o1[,o]]"[, deadline: string][, wait: bool]}}' /v1/buckets/bucket-name | curl -i -X DELETE -H 'Content-Type: application/json' -d '{"action":"delete", "value":{"objnames":["o1","o2","o3"], "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Delete a range of objects | DELETE '{"action":"delete", "value":{"prefix":"your-prefix","regex":"your-regex","range":"min:max" [, deadline: string][, wait:bool]}}' /v1/buckets/bucket-name | curl -i -X DELETE -H 'Content-Type: application/json' -d '{"action":"delete", "value":{"prefix":"__tst/test-", "regex":"\\d22\\d", "range":"1000:2000", "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Evict a list of objects | DELETE '{"action":"evict", "value":{"objnames":"[o1[,o]]"[, deadline: string][, wait: bool]}}' /v1/buckets/bucket-name | curl -i -X DELETE -H 'Content-Type: application/json' -d '{"action":"evict", "value":{"objnames":["o1","o2","o3"], "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Evict a range of objects | DELETE '{"action":"evict", "value":{"prefix":"your-prefix","regex":"your-regex","range":"min:max" [, deadline: string][, wait:bool]}}' /v1/buckets/bucket-name | curl -i -X DELETE -H 'Content-Type: application/json' -d '{"action":"evict", "value":{"prefix":"__tst/test-", "regex":"\\d22\\d", "range":"1000:2000", "deadline": "10s", "wait":true}}' http://192.168.176.128:8080/v1/buckets/abc [5] |
| Get bucket props | HEAD /v1/buckets/bucket-name | curl --head http://192.168.176.128:8080/v1/buckets/mybucket |
| Get object props | HEAD /v1/objects/bucket-name/object-name | curl --head http://192.168.176.128:8080/v1/objects/mybucket/myobject |
| Set primary proxy (primary proxy only) | PUT /v1/cluster/proxy/new primary-proxy-id | curl -i -X PUT http://192.168.176.128:8080/v1/cluster/proxy/26869:8080 |


1: This will fetch the object "myobject" from the bucket "myS3bucket". Notice the -L option: it must be used in all DFC supported commands that read or write data, usually via the URL path /v1/objects/. For more on -L and other useful options, see Everything curl: HTTP redirect.

2: See the List Bucket section for details.

3: Notice the -L option here and elsewhere.

4: Advanced usage only.

5: See the List/Range Operations section for details.
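The role of -L can be reproduced in Go, whose default HTTP client follows redirects for GET requests. The sketch below uses two local test servers as stand-ins for a proxy and a storage target; the 307 status code is an assumption, since this document does not state which redirect code DFC returns.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// demoRedirect issues an object GET against a stand-in "proxy" that redirects
// to a stand-in "target". It returns the body received and whether the final
// response was served by the target.
func demoRedirect() (string, bool, error) {
	target := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "contents of %s", r.URL.Path)
	}))
	defer target.Close()

	proxy := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Assumed status code; the data itself never passes through the proxy.
		http.Redirect(w, r, target.URL+r.URL.Path, http.StatusTemporaryRedirect)
	}))
	defer proxy.Close()

	// http.Get follows the redirect automatically, like curl -L.
	resp, err := http.Get(proxy.URL + "/v1/objects/mybucket/myobject")
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", false, err
	}
	// resp.Request is the last request that was sent, i.e. the redirected one.
	servedByTarget := "http://"+resp.Request.URL.Host == target.URL
	return string(data), servedByTarget, nil
}

func main() {
	fmt.Println(demoRedirect())
}
```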

Example: querying runtime statistics
curl -X GET -H 'Content-Type: application/json' -d '{"what": "stats"}' http://192.168.176.128:8080/v1/cluster

This single command causes execution of multiple GET {"what": "stats"} requests within the DFC cluster, and results in a JSON-formatted consolidated output that contains both http proxy and storage targets request counters, as well as per-target used/available capacities. For example:

DFC statistics

More usage examples can be found in the source.

List Bucket

The ListBucket API returns a page of object names (and, optionally, their properties including sizes, creation times, checksums, and more), in addition to a token allowing the next page to be retrieved.

properties-and-options

The properties-and-options specifier must be a JSON-encoded structure, for instance '{"props": "size"}' (see examples). An empty structure '{}' results in getting just the names of the objects (from the specified bucket) with no other metadata.

| Property/Option | Description | Value |
| --- | --- | --- |
| props | The properties to return with object names | A comma-separated string containing any combination of: "checksum", "size", "atime", "ctime", "iscached", "bucket", "version". [6] |
| time_format | The standard by which times should be formatted | Any of the following golang time constants: RFC822, Stamp, StampMilli, RFC822Z, RFC1123, RFC1123Z, RFC3339. The default is RFC822. |
| prefix | The prefix which all returned objects must have | For example, "my/directory/structure/" |
| pagemarker | The token identifying the next page to retrieve | Returned in the "nextpage" field from a call to ListBucket that does not retrieve all keys. When the last key is retrieved, NextPage will be the empty string |
| pagesize | The maximum number of object names returned in response | Default value is 1000. GCP and local bucket support greater page sizes. AWS is unable to return more than 1000 objects in one page. |

6: The objects that exist in the Cloud but are not present in the DFC cache will have their atime property empty (""). The atime (access time) property is supported for the objects that are present in the DFC cache.
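The properties-and-options message can also be built from a Go struct rather than written by hand. The type below is a hypothetical stand-in whose JSON field names follow the table above; it is not the dfc package's own message type, and treating pagesize as an integer is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listMsg is an illustrative properties-and-options structure.
// Empty fields are omitted so that the zero value encodes as {}.
type listMsg struct {
	Props      string `json:"props,omitempty"`
	TimeFormat string `json:"time_format,omitempty"`
	Prefix     string `json:"prefix,omitempty"`
	PageMarker string `json:"pagemarker,omitempty"`
	PageSize   int    `json:"pagesize,omitempty"` // assumed to be numeric
}

// encodeListMsg returns the JSON text to send as the request body.
func encodeListMsg(m listMsg) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

func main() {
	withProps, _ := encodeListMsg(listMsg{Props: "size,checksum", Prefix: "smoke/"})
	fmt.Println(withProps) // {"props":"size,checksum","prefix":"smoke/"}

	namesOnly, _ := encodeListMsg(listMsg{})
	fmt.Println(namesOnly) // {}
}
```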

Example: listing local and Cloud buckets

To list objects in the smoke/ subdirectory of a given bucket called 'myBucket', and to include in the listing their respective sizes and checksums, run:

curl -X GET -L -H 'Content-Type: application/json' -d '{"props": "size, checksum", "prefix": "smoke/"}' http://192.168.176.128:8080/v1/buckets/myBucket

This request will produce an output that (in part) may look as follows:

DFC list directory

For many more examples, please refer to the test sources in the repository.

Example: Listing All Pages

The following Go code retrieves a list of all object names from a named bucket (note: error handling omitted):

// e.g. proxyurl: "http://localhost:8080"
url := proxyurl + "/v1/buckets/" + bucket

msg := &dfc.GetMsg{}
fullbucketlist := &dfc.BucketList{Entries: make([]*dfc.BucketEntry, 0)}
for {
    // 1. First, send the request
    jsbytes, _ := json.Marshal(msg)
    request, _ := http.NewRequest("GET", url, bytes.NewBuffer(jsbytes))
    r, _ := http.DefaultClient.Do(request)

    // 2. Read and unmarshal the response; close the body right away
    //    (a defer inside the loop would keep every page's body open until return)
    pagelist := &dfc.BucketList{}
    respbytes, _ := ioutil.ReadAll(r.Body)
    r.Body.Close()
    _ = json.Unmarshal(respbytes, pagelist)

    // 3. Add the entries to the list
    fullbucketlist.Entries = append(fullbucketlist.Entries, pagelist.Entries...)
    if pagelist.PageMarker == "" {
        // If PageMarker is the empty string, this was the last page
        break
    }
    // If not, update PageMarker to the next page returned from the request.
    msg.GetPageMarker = pagelist.PageMarker
}

Note that the PageMarker returned as a part of pagelist is for the next page.

Cache Rebalancing

DFC rebalances its cached content based on the DFC cluster map. When cache servers join or leave the cluster, the next updated version (aka generation) of the cluster map gets centrally replicated to all storage targets. Each target then starts, in parallel, a background thread to traverse its local caches and recompute locations of the cached items.

Thus, the rebalancing process is completely decentralized. When a single server joins (or leaves) a cluster of N servers, approximately 1/Nth of the content will get rebalanced via direct target-to-target transfers.
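The 1/N behavior can be illustrated with a small simulation. The sketch below assumes a rendezvous (highest-random-weight) mapping of objects to targets; this document does not say which placement scheme DFC uses, so the scheme, the SHA-256 based weight, and all names here are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// weight derives a pseudo-random 64-bit score for a (key, target) pair.
func weight(key, target string) uint64 {
	sum := sha256.Sum256([]byte(key + "|" + target))
	return binary.BigEndian.Uint64(sum[:8])
}

// owner returns the target with the highest weight for key.
func owner(key string, targets []string) string {
	best, bestW := "", uint64(0)
	for _, t := range targets {
		if w := weight(key, t); best == "" || w > bestW {
			best, bestW = t, w
		}
	}
	return best
}

// movedFraction reports the share of nkeys objects that change owner when one
// target joins n existing ones, and whether every moved object went to the
// newly added target.
func movedFraction(n, nkeys int) (float64, bool) {
	targets := make([]string, n)
	for i := range targets {
		targets[i] = fmt.Sprintf("target-%d", i)
	}
	grown := append(append([]string{}, targets...), "target-new")

	moved, onlyToNew := 0, true
	for i := 0; i < nkeys; i++ {
		key := fmt.Sprintf("bucket/object-%d", i)
		before, after := owner(key, targets), owner(key, grown)
		if before != after {
			moved++
			if after != "target-new" {
				onlyToNew = false
			}
		}
	}
	return float64(moved) / float64(nkeys), onlyToNew
}

func main() {
	frac, onlyToNew := movedFraction(9, 10000)
	fmt.Printf("moved %.3f of objects; all moves went to the new target: %v\n", frac, onlyToNew)
}
```

Growing the simulated cluster from 9 to 10 targets should relocate roughly one tenth of the objects, all of them to the new target, while the rest stay where they are.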

List/Range Operations

DFC provides two APIs to operate on groups of objects: List, and Range. Both of these share two optional parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| deadline | The amount of time before the request expires formatted as a golang duration string. A timeout of 0 means no timeout. | 0 |
| wait | If true, a response will be sent only when the operation completes or the deadline passes. When false, a response will be sent once the operation is initiated. When setting wait=true, ensure your request has a timeout at least as long as the deadline. | false |

List

List APIs take a JSON array of object names, and initiate the operation on those objects.

| Parameter | Description |
| --- | --- |
| objnames | JSON array of object names |
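A List request body, together with a client timeout that respects the wait/deadline advice, might be prepared as in the sketch below. The struct types, the `buildListOp` helper, and the 5-second slack are hypothetical; only the JSON shape is taken from the prefetch, delete, and evict examples in the REST operations table.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// listValue and actionMsg are illustrative types shaped after the JSON
// examples in this document; they are not the dfc package's own types.
type listValue struct {
	ObjNames []string `json:"objnames"`
	Deadline string   `json:"deadline,omitempty"`
	Wait     bool     `json:"wait,omitempty"`
}

type actionMsg struct {
	Action string    `json:"action"`
	Value  listValue `json:"value"`
}

// buildListOp returns the JSON body for a list operation and an HTTP client.
// When wait is true and a deadline is set, the client timeout is made longer
// than the deadline so the client does not give up before the server answers.
func buildListOp(action string, names []string, deadline time.Duration, wait bool) ([]byte, *http.Client, error) {
	msg := actionMsg{Action: action, Value: listValue{ObjNames: names, Wait: wait}}
	if deadline > 0 {
		msg.Value.Deadline = deadline.String() // e.g. "10s"
	}
	body, err := json.Marshal(msg)
	if err != nil {
		return nil, nil, err
	}
	client := &http.Client{} // zero Timeout means no client-side timeout
	if wait && deadline > 0 {
		client.Timeout = deadline + 5*time.Second // slack value is arbitrary
	}
	return body, client, nil
}

func main() {
	body, client, _ := buildListOp("prefetch", []string{"o1", "o2", "o3"}, 10*time.Second, true)
	fmt.Println(string(body), client.Timeout)
}
```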

Range

Range APIs take an optional prefix, a regular expression, and a numeric range. A matching object name will begin with the prefix and contain a number that satisfies both the regex and the range as illustrated below.

| Parameter | Description |
| --- | --- |
| prefix | The prefix that all matching object names will begin with. Empty prefix ("") will match all names. |
| regex | The regular expression, represented as an escaped string, to match the number embedded in the object name. Note that the regular expression applies to the entire name - the prefix (if provided) is not excluded. |
| range | Represented as "min:max", corresponding to the inclusive range from min to max. Either or both of min and max may be empty strings (""), in which case they will be ignored. If regex is an empty string, range will be ignored. |

Examples

| Prefix | Regex | Escaped Regex | Range | Matches | Doesn't Match |
| --- | --- | --- | --- | --- | --- |
| "tst/test-" | "\d22\d" | "\\d22\\d" | "1000:2000" | "tst/test-1223", "tst/test-1229-4000.dat", "tst/test-1111-1229.dat", "tst/test-12222-40000.dat" | "prod/test-1223", "tst/test-1333", "tst/test-2222-4000.dat" |
| "a/b/c" | "^\d+1\d" | "^\\d+1\\d" | ":100000" | "a/b/c/110", "a/b/c/99919-200000.dat", "a/b/c/2314video-big" | "a/b/110", "a/b/c/d/110", "a/b/c/video-99919-20000.dat", "a/b/c/100012", "a/b/c/30331" |
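One possible reading of the Range rules is sketched below. It is not DFC's implementation: `matchRange` is a hypothetical helper that applies the regex to the whole name, considers only the first regex match, and treats a match that is not a plain number as a non-match. The `main` function runs it on the names from the first example row (prefix "tst/test-", regex \d22\d, range 1000:2000).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// matchRange reports whether name is selected by the given prefix, regex,
// and "min:max" range (illustrative interpretation, see the assumptions above).
func matchRange(name, prefix, regex, rng string) (bool, error) {
	if !strings.HasPrefix(name, prefix) {
		return false, nil
	}
	if regex == "" {
		return true, nil // without a regex the range is ignored
	}
	re, err := regexp.Compile(regex)
	if err != nil {
		return false, err
	}
	n, err := strconv.ParseInt(re.FindString(name), 10, 64)
	if err != nil {
		return false, nil // no match, or the matched text is not a number
	}
	bounds := strings.SplitN(rng, ":", 2)
	if len(bounds) == 2 {
		if bounds[0] != "" { // empty min is ignored
			lo, err := strconv.ParseInt(bounds[0], 10, 64)
			if err != nil {
				return false, err
			}
			if n < lo {
				return false, nil
			}
		}
		if bounds[1] != "" { // empty max is ignored
			hi, err := strconv.ParseInt(bounds[1], 10, 64)
			if err != nil {
				return false, err
			}
			if n > hi {
				return false, nil
			}
		}
	}
	return true, nil
}

func main() {
	names := []string{"tst/test-1223", "tst/test-1111-1229.dat", "tst/test-1333", "tst/test-2222-4000.dat", "prod/test-1223"}
	for _, name := range names {
		ok, _ := matchRange(name, "tst/test-", `\d22\d`, "1000:2000")
		fmt.Printf("%-26s %v\n", name, ok)
	}
}
```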

Multiple Proxies

DFC can be run with multiple proxies. When there are multiple proxies, one of them is the primary proxy, and any others are secondary proxies. The primary proxy is the only one allowed to be used for actions related to the Smap (Registration, Local Bucket actions). The URL of the current primary proxy must be specified in the config file at the time a proxy or target is run. On startup, a proxy will start as Primary if the environment variable DFCPRIMARYPROXY is set to any non-empty string. If it is unset, it will start as primary if its id matches the id of the current primary proxy in the configuration file, unless the command line variable -proxyurl is set.
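The startup rule in the last two sentences above can be condensed into a small predicate. This is a sketch of the rule as described here, not DFC's code; the function and parameter names are hypothetical, and the daemon ID in `main` is a placeholder.

```go
package main

import (
	"fmt"
	"os"
)

// startsAsPrimary applies the documented rule: a non-empty DFCPRIMARYPROXY
// value wins; otherwise the proxy starts as primary only if its own ID equals
// the primary ID from the configuration file and -proxyurl was not given.
func startsAsPrimary(envPrimary, ownID, configPrimaryID, proxyURLFlag string) bool {
	if envPrimary != "" {
		return true
	}
	return ownID == configPrimaryID && proxyURLFlag == ""
}

func main() {
	env := os.Getenv("DFCPRIMARYPROXY")
	// Placeholder IDs; an empty last argument means -proxyurl was not set.
	fmt.Println(startsAsPrimary(env, "26869:8080", "26869:8080", ""))
}
```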

When any target or proxy discovers that the primary proxy is not working (because a keepalive fails), it initiates a vote to determine the next primary proxy. The election process is as follows:

Proxy Startup Process

While it is running, a proxy persists the cluster map when it changes, loading it as the hint cluster map on startup. When a proxy starts up as primary, it performs the following process:

This process allows a proxy to be rerun with the same command and environment variables, even if it should no longer be primary.

Current Limitations
WebDAV

WebDAV aka “Web Distributed Authoring and Versioning” is the IETF standard that defines an HTTP extension for collaborative file management and editing. The DFC WebDAV server is a reverse proxy (with interoperable WebDAV on the front and DFC's RESTful interface on the back) that can be used with any of the popular WebDAV-compliant clients.

For information on how to run it and details, please refer to the WebDAV README.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.