Name: go-syncstorage
Owner: Mozilla Services
Description: SyncStorage Server with more golang and less indexes!
Created: 2015-11-24 06:04:57.0
Updated: 2018-03-01 09:07:34.0
Pushed: 2017-09-06 14:03:07.0
Size: 5253
Language: Go
Go-syncstorage is the next generation sync storage server. It was built to solve the data degradation problem with the python+mysql implementation. Logical separation of data is now physical separation of data. In go-syncstorage each user gets their own sqlite database. Many indexes were harmed in the making of this product.
The server is distributed as a Docker container. Latest builds and releases can be found on Dockerhub.
Running the server is easy:

    docker pull mozilla/go-syncstorage:latest
    docker run -it \
      -e "PORT=8000" \
      -e "SECRETS=secret0,secret1,secret2" \
      -e "DATA_DIR=/data" \
      -v "/host/data/path:/data" \
      mozilla/go-syncstorage

Only three configurations are required: `PORT`, `SECRETS` and `DATA_DIR`.

- `PORT` - where to listen for HTTP requests
- `SECRETS` - CSV of secrets preshared with the token service
- `DATA_DIR` - where to save files (relative to inside the container)

The server has a few knobs that can be tweaked.

| Env. Var | Info |
|---|---|
| `HOST` | Address to listen on. Defaults to `0.0.0.0`. |
| `PORT` | Port to listen on. |
| `DATA_DIR` | Where to save DB files. Use an absolute path. `:memory:` is valid and saves databases in RAM, but is recommended only for testing. |
| `SECRETS` | Comma separated list of shared secrets. Secrets are tried in order, which allows secret rotation without downtime. |
| `LOG_LEVEL` | Log verbosity, allowed: `fatal`, `error`, `warn`, `debug`, `info`. Default `info`. |
| `LOG_MOZLOG` | Can be `true` or `false`. Outputs logs in mozlog format. Default `false`. |
| `LOG_DISABLE_HTTP` | Can be `true` or `false`. Disables logging of HTTP requests. Default `false`. |
| `LOG_ONLY_HTTP_ERRORS` | Can be `true` or `false`. Logs only when `errno != 0` to reduce noise. Default `false`. |
| `HOSTNAME` | Set a hostname value for mozlog output. |
| `LIMIT_MAX_REQUESTS_BYTES` | The maximum size in bytes of the overall HTTP request body that will be accepted by the server. |
| `LIMIT_MAX_POST_BYTES` | Maximum size of a POST request. Default: 2097152 (2MB). |
| `LIMIT_MAX_POST_RECORDS` | Maximum number of BSOs per POST request. Default 100. |
| `LIMIT_MAX_TOTAL_BYTES` | Maximum total size of a POST batch job. Default: 26,214,400 (25MB). |
| `LIMIT_MAX_TOTAL_RECORDS` | Maximum total BSOs in a POST batch job. Default 1000. |
| `LIMIT_MAX_BATCH_TTL` | Maximum TTL for a batch to remain uncommitted in seconds. Default 7200 (2 hours). |
| `LIMIT_MAX_RECORD_PAYLOAD_BYTES` | Maximum bytes for a BSO payload. Default 2MB. |
| `INFO_CACHE_SIZE` | Cache size in MB for `<uid>/info/collections` and `<uid>/info/configuration`. Default 0 (disabled). |
| `HAWK_TIMESTAMP_MAX_SKEW` | Sets number of seconds hawk timestamps can differ from the server. Default 60. |

| Env. Var | Info |
|---|---|
| `POOL_NUM` | Number of DB pools. Defaults to number of CPUs. |
| `POOL_SIZE` | Number of open DB files per pool. Defaults to `25`. |
| `POOL_VACUUM_KB` | Threshold of free space in kilobytes to trigger a database vacuum. Defaults to `0` (disabled). |
| `POOL_PURGE_MIN_HOURS` | Minimum hours before purging BSOs, Batches, etc. for a user. Defaults to `168` (1 week). |
| `POOL_PURGE_MAX_HOURS` | Max hours before purging. Defaults to `336` (2 weeks). |

go-syncstorage limits the number of open SQLite database files to keep memory usage constant. This allows a small server to handle thousands of users for a small performance hit.
Multiplying `POOL_NUM` x `POOL_SIZE` gives the maximum number of open files. The product should be large enough that pools are not starved and do not have to clean up too often. A sign that things are too small is when `sql: database is closed` errors appear in the logs.
A low level lock is used in each pool when opening and closing files. Having a larger `POOL_NUM` decreases lock contention.
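
Why more pools mean less contention can be sketched as follows. This is only an illustration: how the server really assigns users to pools is not described here, so the FNV hash and the names `Pools`, `Index` and `Open` are assumptions.

```go
package main

import (
	"hash/fnv"
	"sync"
)

// pool guards its own set of open databases with its own lock.
type pool struct {
	mu   sync.Mutex
	open map[string]bool // uid -> open; stands in for real DB handles
}

// Pools is a fixed set of independent pools (POOL_NUM of them).
type Pools struct {
	pools []*pool
}

// NewPools creates poolNum pools, using at least one.
func NewPools(poolNum int) *Pools {
	if poolNum < 1 {
		poolNum = 1
	}
	ps := &Pools{pools: make([]*pool, poolNum)}
	for i := range ps.pools {
		ps.pools[i] = &pool{open: make(map[string]bool)}
	}
	return ps
}

// Index maps a uid to a pool. Hashing with FNV-1a is an assumption.
func (ps *Pools) Index(uid string) int {
	h := fnv.New32a()
	h.Write([]byte(uid))
	return int(h.Sum32() % uint32(len(ps.pools)))
}

// Open marks the uid's database as open, locking only the pool that owns it.
func (ps *Pools) Open(uid string) {
	p := ps.pools[ps.Index(uid)]
	p.mu.Lock()
	defer p.mu.Unlock()
	p.open[uid] = true
}

// OpenCount returns the number of open databases across all pools.
func (ps *Pools) OpenCount() int {
	total := 0
	for _, p := range ps.pools {
		p.mu.Lock()
		total += len(p.open)
		p.mu.Unlock()
	}
	return total
}
```

Two requests only wait on each other when their users land in the same pool, so spreading users over more pools lowers the chance of that happening.
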
When a pool reaches `POOL_SIZE` number of open files it will close the least recently used database. Having a larger `POOL_SIZE` reduces open/close disk IO. It also increases memory usage.
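
The least recently used policy can be sketched in a few lines of Go. The type `LRUPool` and its methods are hypothetical and only track user ids; a real pool would hold database handles and close the evicted one.

```go
package main

import "container/list"

// LRUPool tracks which databases are open, most recently used first.
type LRUPool struct {
	size  int
	order *list.List               // front = most recently used uid
	items map[string]*list.Element // uid -> its element in order
}

// NewLRUPool creates a pool that keeps at most size databases open.
func NewLRUPool(size int) *LRUPool {
	if size < 1 {
		size = 1
	}
	return &LRUPool{
		size:  size,
		order: list.New(),
		items: make(map[string]*list.Element),
	}
}

// Use marks uid as just used, opening it if needed. It returns the uid whose
// database had to be closed to stay within size, or "" if none was closed.
func (p *LRUPool) Use(uid string) (evicted string) {
	if el, ok := p.items[uid]; ok {
		p.order.MoveToFront(el)
		return ""
	}
	p.items[uid] = p.order.PushFront(uid)
	if p.order.Len() > p.size {
		oldest := p.order.Back()
		p.order.Remove(oldest)
		evicted = oldest.Value.(string)
		delete(p.items, evicted)
	}
	return evicted
}

// Len returns the number of databases currently open.
func (p *LRUPool) Len() int { return p.order.Len() }
```

With a size of 2, using `a`, `b`, `a` and then `c` closes `b`, because `a` was touched more recently.
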
Tweaking these values from their defaults won't provide significant performance gains in production. However, setting `POOL_NUM=1` and `POOL_SIZE=1` is useful for testing the overhead of opening and closing database files.
`POOL_PURGE_MIN_HOURS` and `POOL_PURGE_MAX_HOURS` define a time range to trigger a purge job for a user. The default range is between 168 and 336 hours. This means a user will have a purge job run only once every one to two weeks. A large range evens out IO load.
`POOL_VACUUM_KB` sets the threshold before a vacuum is run. Purging batches and BSOs frees up database pages but not disk space. A vacuum rewrites the database, defragments it and frees up disk space. Depending on the number of records, it can take seconds to vacuum a database.
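
The threshold check could look like the sketch below. It assumes the free space is derived from SQLite's `PRAGMA freelist_count` and `PRAGMA page_size`; the function names and the "greater than or equal" comparison are illustrative, not taken from the server.

```go
package main

// FreeKB converts a count of free SQLite pages into kilobytes.
// freelistCount and pageSize would come from PRAGMA freelist_count and
// PRAGMA page_size on the user's database.
func FreeKB(freelistCount, pageSize int64) int64 {
	return freelistCount * pageSize / 1024
}

// ShouldVacuum reports whether the reclaimable space has reached the
// POOL_VACUUM_KB threshold. A threshold of 0 disables vacuuming.
func ShouldVacuum(freelistCount, pageSize, thresholdKB int64) bool {
	if thresholdKB <= 0 {
		return false
	}
	return FreeKB(freelistCount, pageSize) >= thresholdKB
}
```

For example, 1000 free pages of 4096 bytes amount to 4000 KB, which would trigger a vacuum with `POOL_VACUUM_KB=2048` but not with the default of 0.
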
| Env. Var | Info |
|---|---|
| `SQLITE3_CACHE_SIZE` | Sets sqlite's internal cache size for each open DB. Busy servers open/close the db files often so a smaller cache size may be more efficient. Follows the `PRAGMA cache_size` rules. Positive integers are number of pages to cache, negative numbers are KB of RAM to use for cache. Default 0 (no cache). |

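
To make the sign convention concrete, the sketch below turns a `SQLITE3_CACHE_SIZE` value into the pragma statement and an approximate memory figure per open database. `CachePragma` and `CacheBytes` are hypothetical helpers, and the page size passed in is whatever the database actually uses.

```go
package main

import "fmt"

// CachePragma builds the statement that applies a cache size to a connection.
func CachePragma(cacheSize int) string {
	return fmt.Sprintf("PRAGMA cache_size = %d;", cacheSize)
}

// CacheBytes estimates the cache memory used per open database.
// Positive values count pages, negative values are a budget in units of
// 1024 bytes, following the PRAGMA cache_size rules.
func CacheBytes(cacheSize, pageSize int) int {
	if cacheSize < 0 {
		return -cacheSize * 1024
	}
	return cacheSize * pageSize
}
```

Multiplying the result by `POOL_NUM` x `POOL_SIZE` gives a rough upper bound on cache memory when every pool is full.
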
When deploying, choose the EXT4 filesystem. EXT4 is an extent-based filesystem and may help improve performance on magnetic storage media.
go-syncstorage gives each user their own sqlite database. On a production server that is enough files to be a real burden for a human when troubleshooting. Thus, files are organized into a directory structure like this:

    data-dir/
      00/
      01/
      34/
        21/
          100001234.db
      ..
      99/

The database for a user with the id `100001234` is located at `34/21/100001234.db`. The path is built from the end of their id, two digits at a time: the last two digits name the first directory and the two before them name the second. Their id is used for the actual database name.

Using this scheme, one million users are spread across 10,000 directories, about 100 files per directory. This is a relatively low number that CLI tools like `ls` will have no trouble with. Always optimize for the proper care and feeding of your sysadmins.
A Linux binary is also available as a build artifact from Circle CI.
See LICENSE.