nuxeo/esync

Name: esync

Owner: Nuxeo

Description: Nuxeo Elasticsearch VCS sync checker

Created: 2015-02-20 13:44:29.0

Updated: 2018-03-23 14:39:36.0

Pushed: 2018-03-29 18:45:25.0

Homepage: null

Size: 150

Language: Java


README

esync: a tool to compare Nuxeo repository and Elasticsearch content

When using nuxeo-elasticsearch, we want to be sure that the repository content is in sync with the content indexed in Elasticsearch.

This tool detects differences between the Nuxeo database repository and the content indexed in Elasticsearch.

Install

Download

Download the nuxeo-esync-VERSION-capsule-full.jar from https://maven.nuxeo.org.

Version Support

| Esync Version | Nuxeo Version | Elasticsearch Version |
| --- | --- | --- |
| 1.1.X | 7.10 | 1.5.2 |
| 2.0.X | 8.10 | 2.3.5 |
| 3.0.X | 9.10 | 5.6.4 |

From esync version 3, the Elasticsearch REST client is used instead of the transport client.

Building from sources

Create the all-in-one jar:

mvn package

The jar is located here:

./target/nuxeo-esync-VERSION-capsule-full.jar

Usage

Configuration

Create /etc/esync.conf or ~/.esync.conf, using one of the provided samples as a starting point.

You will need to configure the database and Elasticsearch access.

Refer to the source for the full list of options available.

Invocation
 # using a default conf located in /etc/esync.conf or ~/.esync.conf
 java -jar /path/to/nuxeo-esync-$VERSION-capsule-full.jar

 # using another config file
 java -jar /path/to/nuxeo-esync-$VERSION-capsule-full.jar /path/to/config-file.conf

 # customizing the log
 java -Dlog4j.configuration=file:mylog4j.xml -jar nuxeo-esync-$VERSION-capsule-full.jar

A default log4j.xml is provided with the sources. By default, the log file is written to /tmp/trace.log.

Checkers

The tool runs several checkers concurrently.

Checkers compare the reference database (the "expected" content) with the Elasticsearch content (the "actual" content). You should run a full re-index on Elasticsearch before running the tool.
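To illustrate what running checkers concurrently can look like, here is a minimal Java sketch. It is not taken from the esync sources: the class name `ConcurrentCheckersSketch`, the `runAll` helper, and the idea that each checker returns a report string are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: a hypothetical runner, not the esync implementation. */
class ConcurrentCheckersSketch {

    /** Runs every checker on its own thread and collects the reports in submission order. */
    static List<String> runAll(List<Callable<String>> checkers) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, checkers.size()));
        try {
            List<String> reports = new ArrayList<>();
            // invokeAll blocks until all tasks are done and preserves input order.
            for (Future<String> future : pool.invokeAll(checkers)) {
                reports.add(future.get());
            }
            return reports;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running checkers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A checker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> checkers = new ArrayList<>();
        // Hypothetical checkers: real ones would query the database and Elasticsearch.
        checkers.add(() -> "cardinality: ok");
        checkers.add(() -> "type cardinality: ok");
        runAll(checkers).forEach(System.out::println);
    }
}
```

Collecting the results in submission order keeps the report stable from run to run, even though the checkers themselves finish in an arbitrary order.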

Checkers report different things:

Here is a list of available checkers.

Cardinality Checker

This is a quick check to count the total number of documents in the db and Elasticsearch. There are 4 document counts:

False positive cases:

False negative cases:

Type Cardinality Checker

Checks the number of documents of each type, for both documents and versions.
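The comparison can be pictured as a diff between two per-type count maps. The following Java sketch is only an illustration under that assumption: `TypeCardinalitySketch`, `diff`, and the sample type names and counts are hypothetical, and how esync actually obtains and compares the counts is not shown here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: not the esync implementation. */
class TypeCardinalitySketch {

    /**
     * Returns, for every type whose counts differ, the pair {expected, actual}.
     * A type missing from one side is treated as having a count of 0.
     */
    static Map<String, long[]> diff(Map<String, Long> expected, Map<String, Long> actual) {
        Set<String> types = new TreeSet<>(expected.keySet());
        types.addAll(actual.keySet());
        Map<String, long[]> result = new LinkedHashMap<>();
        for (String type : types) {
            long e = expected.getOrDefault(type, 0L);
            long a = actual.getOrDefault(type, 0L);
            if (e != a) {
                result.put(type, new long[] {e, a});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up counts: "expected" stands for the database, "actual" for Elasticsearch.
        Map<String, Long> db = new HashMap<>();
        db.put("File", 120L);
        db.put("Folder", 15L);
        Map<String, Long> es = new HashMap<>();
        es.put("File", 118L);
        es.put("Folder", 15L);
        diff(db, es).forEach((type, counts) ->
                System.out.println(type + ": expected " + counts[0] + ", actual " + counts[1]));
    }
}
```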

False positive cases:

False negative cases:

Type Document Lister

When the Type Cardinality checker reports a difference for a type, the lists of document ids for that type are compared to find the missing and spurious document ids.

False positive cases: None

False negative cases: None

Listing all document ids from the database can take time and memory.
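Conceptually, missing and spurious ids are the two set differences between the database ids and the Elasticsearch ids. The Java sketch below shows that idea only; `IdDiffSketch` and its methods are hypothetical names, the ids are made up, and the code holds both id sets in memory, which is exactly the cost mentioned above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: not the esync implementation. */
class IdDiffSketch {

    /** Ids present in the database (expected) but absent from Elasticsearch (actual). */
    static Set<String> missing(Set<String> expected, Set<String> actual) {
        Set<String> result = new TreeSet<>(expected);
        result.removeAll(actual);
        return result;
    }

    /** Ids present in Elasticsearch (actual) but absent from the database (expected). */
    static Set<String> spurious(Set<String> expected, Set<String> actual) {
        Set<String> result = new TreeSet<>(actual);
        result.removeAll(expected);
        return result;
    }

    public static void main(String[] args) {
        Set<String> dbIds = new TreeSet<>(Arrays.asList("doc-1", "doc-2", "doc-3"));
        Set<String> esIds = new TreeSet<>(Arrays.asList("doc-2", "doc-3", "doc-4"));
        System.out.println("missing:  " + missing(dbIds, esIds));  // [doc-1]
        System.out.println("spurious: " + spurious(dbIds, esIds)); // [doc-4]
    }
}
```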

ACL Checker

It performs 2 checks:

False positive cases:

False negative cases:

License

Apache License, Version 2.0

About Nuxeo

Nuxeo dramatically improves how content-based applications are built, managed and deployed, making customers more agile, innovative and successful. Nuxeo provides a next generation, enterprise ready platform for building traditional and cutting-edge content oriented applications. Combining a powerful application development environment with SaaS-based tools and a modular architecture, the Nuxeo Platform and Products provide clear business value to some of the most recognizable brands including Verizon, Electronic Arts, Netflix, Sharp, FICO, the U.S. Navy, and Boeing. Nuxeo is headquartered in New York and Paris. More information is available at www.nuxeo.com.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.