openeuropa/europa-search-client

Name: europa-search-client

Owner: OpenEuropa

Description: null

Created: 2017-07-16 19:15:10.0

Updated: 2018-02-20 16:17:56.0

Pushed: 2018-02-21 16:33:23.0

Homepage: null

Size: 1317

Language: PHP

README

Europa Search Client Library

The Europa Search Client Library aims to hide the complexity of the Europa Search services behind an easy-to-use client library, so that users don't have to build their own request messages or implement the REST interactions themselves.

Library use

To send a request to one of the Europa Search REST services, 3 elements are necessary: a parameters array, a message object, and the EuropaSearch object. Each is described below.

Parameters array

It allows defining the library settings that will be used by the different layers.

It requires the following items:

Example:

Suppose Europa Search provides the connection data used below (service URLs, API key, and database name), and you want to use your own PSR-3 logger object to record log messages.

The array must then be defined as follows:

$clientConfiguration = [
  'indexing_settings' => [
    'url_root' => 'http://www.es-ingestion/ingestion-api',
    'api_key' => 'aaa-eifnzff-blabla-0828',
    'database' => 'TEST-DUMMY-EXAMPLE',
  ],
  'search_settings' => [
    'url_root' => 'http://www.es-search/searching-api',
    'api_key' => 'aaa-eifnzff-blabla-0828',
  ],
  'services_settings' => [
    'logger' => new Logger(),
    'log_level' => LogLevel::DEBUG,
  ],
];

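Because a missing connection setting only surfaces once a request is sent, it can help to check the array before handing it to the library. The following is a minimal sketch, not part of the library: `findMissingSettings()` is a hypothetical helper, and treating every indexing and search key of the example array as required is an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical pre-flight check (not provided by the library).
 *
 * Assumption: the indexing and search keys shown in the example array
 * are all required.
 *
 * @return string[] Dotted paths of the settings that are missing or empty.
 */
function findMissingSettings(array $config): array
{
    $expected = [
        'indexing_settings' => ['url_root', 'api_key', 'database'],
        'search_settings' => ['url_root', 'api_key'],
    ];

    $missing = [];
    foreach ($expected as $section => $keys) {
        foreach ($keys as $key) {
            // isset() is safe even when the whole section is absent.
            if (!isset($config[$section][$key]) || $config[$section][$key] === '') {
                $missing[] = $section . '.' . $key;
            }
        }
    }

    return $missing;
}

// The 'database' entry is left out on purpose to show the check at work.
$incompleteConfiguration = [
    'indexing_settings' => [
        'url_root' => 'http://www.es-ingestion/ingestion-api',
        'api_key' => 'aaa-eifnzff-blabla-0828',
    ],
    'search_settings' => [
        'url_root' => 'http://www.es-search/searching-api',
        'api_key' => 'aaa-eifnzff-blabla-0828',
    ],
];

$missingSettings = findMissingSettings($incompleteConfiguration);
print_r($missingSettings);
```

Returning the list of missing paths, rather than throwing on the first one, lets the host application report every problem at once.
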
Message objects
Messages for indexing

So far, there are 2 types of messages for sending indexing requests, plus 1 type for deleting an indexed item.

IndexWebContentMessage

This message object defines an indexing request for web content. To learn more about its structure, please consult the API documentation.

IndexFileMessage (Not completely implemented yet)

This message object will define an indexing request for a file, once this part of the client is implemented.

DeleteIndexItemMessage

This message object defines a deletion request for an indexed item.

In the current implementation, the object differs from IndexWebContentMessage and IndexFileMessage in its mandatory properties: only the document id (setDocumentId) is mandatory; the other properties can be omitted.

To learn more about its structure, please consult the API documentation.

Component objects: the metadata

This section only concerns adding to or updating the Europa Search index with the IndexWebContentMessage and IndexFileMessage objects.

Each document (web content or file) sent for indexing is characterized by metadata, which form the components of the indexing messages.

There are 7 types:

For more information about the related classes, read the API documentation.

As the library supports the Dynamic schema of Europa Search, the metadata names can be those already used in the consumer system.
The client service, via its proxy layer, ensures that the names are formatted correctly before the request is sent; i.e. each name is prefixed with the sequence declaring its type.
For instance, if a string metadata is named blabla in the system, it becomes esST_blabla when it is sent.
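
The renaming rule can be pictured with a short sketch. This is an illustration only, not the library's proxy code: `prefixMetadataName()` is a hypothetical function, and the only prefix taken from this document is `esST` for string metadata; the prefixes of the other types are not listed here, so the prefix is passed in as a parameter.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical illustration of the naming rule (not the library's code).
 *
 * @param string $name       Metadata name as used in the consumer system.
 * @param string $typePrefix Sequence declaring the type, e.g. 'esST' for a string.
 */
function prefixMetadataName(string $name, string $typePrefix): string
{
    $prefix = $typePrefix . '_';

    // Leave already-prefixed names untouched so the call is idempotent.
    if (strncmp($name, $prefix, strlen($prefix)) === 0) {
        return $name;
    }

    return $prefix . $name;
}

echo prefixMetadataName('blabla', 'esST'), PHP_EOL; // esST_blabla
```

Thanks to this rule, the host application keeps using its own names (here `blabla`) and never has to handle the prefixed form.
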

Example
$webContentMessage = new IndexWebContentMessage();
$webContentMessage->setDocumentURI('http://europa.test.com/content.html');
$webContentMessage->setDocumentContent('<div id="lipsum">

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus tempor mattis sem vitae egestas. Nulla sed mauris sed ante convallis scelerisque. Vestibulum urna nisl, aliquam non risus vel, varius commodo augue. Aliquam efficitur elementum dapibus. Aliquam erat volutpat. Nulla orci purus, ultricies non velit at, venenatis fringilla ipsum. Sed porta nunc sit amet felis semper, at tempor erat dapibus. Sed id ipsum enim. Mauris suscipit pharetra lacinia. In nisi sem, tincidunt ac vestibulum ut, ultrices sed nisi. Phasellus nec diam at libero suscipit consequat. Nunc dapibus, ante ac hendrerit varius, sapien ex consequat ante, non venenatis ipsum metus eu ligula. Phasellus mattis arcu ut erat vulputate, sit amet blandit magna egestas. Vivamus nisl ipsum, maximus non tempor nec, finibus eu nisl. Phasellus lacinia interdum iaculis.
\n

 pellentesque, risus id efficitur convallis, elit justo sollicitudin elit, in convallis urna est id nibh. Sed rhoncus est nec leo hendrerit, ut tempus urna feugiat. Ut sed tempor orci, eu euismod massa. Phasellus condimentum sollicitudin ante, vel pretium mauris auctor quis. Etiam sit amet consectetur lorem. Phasellus at massa ex. Fusce porta est sit amet arcu pretium, ut suscipit eros molestie. Fusce malesuada ornare cursus. Curabitur sit amet eros nibh. Sed imperdiet magna quis odio tempus vehicula. Praesent auctor porta dolor, eu lacinia ante venenatis vel.
\n

iam tellus, sagittis sit amet finibus eget, ultrices sed turpis. Proin sodales dictum elit eget mollis. Aliquam nec laoreet purus. Pellentesque accumsan arcu vitae ipsum euismod, nec faucibus tellus rhoncus. Sed lacinia at augue vitae hendrerit. Aliquam egestas ante sit amet erat dignissim, non dictum ligula iaculis. Nulla tempor nec metus vitae pellentesque. Nulla porta sit amet lacus eu porttitor.
\n

consectetur leo eu felis vehicula sollicitudin. Aliquam pharetra, nulla quis tempor malesuada, odio nunc accumsan dui, in feugiat turpis ipsum vel tortor. Praesent auctor at justo convallis convallis. Aenean fringilla magna leo, et dictum nisi molestie sed. Quisque non ornare sem. Duis quis felis erat. Praesent rutrum vehicula orci ac suscipit.
\n

nec eros sit amet lorem convallis accumsan sed nec tellus. Maecenas eu odio dapibus, mollis leo eget, interdum urna. Phasellus ac dui commodo, cursus lorem nec, condimentum erat. Pellentesque eget imperdiet nisl, at convallis enim. Sed feugiat fermentum leo ac auctor. Aliquam imperdiet enim ac pellentesque commodo. Mauris sed sapien eu nulla mattis hendrerit ac ac mauris. Donec gravida, nisi sit amet rhoncus volutpat, quam nisl ullamcorper nisl, in luctus sapien justo et ex. Fusce dignissim felis felis, tempus faucibus tellus pulvinar vitae. Proin gravida tempus eros sit amet viverra. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur bibendum libero quis tellus commodo, non vestibulum lacus rutrum. Etiam euismod odio ipsum, nec pulvinar nisl ultrices sit amet. Nunc feugiat orci vel odio interdum, non dignissim erat hendrerit. Vestibulum gravida et elit nec placerat.
</div>');
$webContentMessage->setDocumentId('web_content_client_1');
$webContentMessage->setDocumentLanguage('en');

// Component definition for the message.

$metadata = new FullTextMetadata('title');
$metadata->setValues(['this the title']);
$webContentMessage->addMetadata($metadata);

$metadata = new StringMetadata('tag');
$metadata->setValues(['taxonomy term']);
$webContentMessage->addMetadata($metadata);

$metadata = new IntegerMetadata('rank');
$metadata->setValues([1]);
$webContentMessage->addMetadata($metadata);

$metadata = new FloatMetadata('percentage');
$metadata->setValues([0.1]);
$webContentMessage->addMetadata($metadata);

$metadata = new DateMetadata('publishing_date');
$metadata->setValues([date('F j, Y, g:i a', strtotime('11-12-2018'))]);
$webContentMessage->addMetadata($metadata);

$metadata = new URLMetadata('uri');
$metadata->setValues(['http://www.europa.com']);
$webContentMessage->addMetadata($metadata);

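
The date value in the example depends on how PHP parses `'11-12-2018'`: with dashes as separators, `strtotime()` reads the string as day-month-year, whereas slashes are read as month/day/year. The sketch below uses plain PHP only, with no library class involved; pinning the timezone to UTC is an assumption made to keep the output reproducible.

```php
<?php

declare(strict_types=1);

// Assumption for reproducibility: pin the timezone instead of relying on php.ini.
date_default_timezone_set('UTC');

// Dash-separated input is interpreted as day-month-year.
$dashed = date('F j, Y, g:i a', strtotime('11-12-2018'));

// Slash-separated input is interpreted as month/day/year.
$slashed = date('F j, Y, g:i a', strtotime('11/12/2018'));

echo $dashed, PHP_EOL;  // December 11, 2018, 12:00 am
echo $slashed, PHP_EOL; // November 12, 2018, 12:00 am
```
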
Messages for searching

So far, there is 1 type of message.

SearchMessage

This message object allows defining the message representing a search request. To learn more about its structure, please consult the API documentation.

Component objects: the filter and query components

Each search request contains a query used for filtering the search results. This query is made of the different filter types that form the components of the search messages.
Each available component represents a component of the Europa Search Search API syntax. To learn more, please consult the “Advanced Search Parameters” page of the official documentation of the Europa Search API.

There are 2 main types, each composed of sub-types:

For more information about the related classes, read the API documentation.

As the library supports the Dynamic schema of Europa Search, the metadata names involved in the queries can be those already used in the consumer system.
The client service, via its proxy layer, ensures that the names are formatted correctly before the request is sent; i.e. each name is prefixed with the sequence declaring its type.
For instance, if a string metadata is named blabla in the system, it becomes esST_blabla when it is sent.

Example
$searchMessage = new SearchMessage();
$searchMessage->setSearchedLanguages(['fr']);
$searchMessage->setHighLightParameters('<strong>{}</strong>', 250);
$searchMessage->setPagination(20, 1);
$searchMessage->setSearchedText('Lorem ipsum');

// Query component definition for the message.

$booleanQuery = new BooleanQuery();
$filter = new RangeClause(new IntegerMetadata('rank'));
$filter->setLowerBoundaryIncluded(1);
$filter->setUpperBoundaryIncluded(5);
$booleanQuery->addMustFilterClause($filter);

$filter = new TermClause(new FullTextMetadata('title'));
$filter->setTestedValue('title');
$booleanQuery->addMustFilterClause($filter);

$searchMessage->setQuery($booleanQuery);

EuropaSearch object

This object is the entry point for host applications such as Drupal. From it, the Europa Search ingestion and search REST services are accessible.

To call an ingestion or a search service, proceed as follows:

  1. Instantiate the object with the array of configuration parameters.
    If we use the parameters array from the example of the “Parameters array” section, we get:

    $factory = new EuropaSearch($clientConfiguration);

  2. Call the proper application instance:

    - For an ingestion request:

      $indexApp = $factory->getIndexingApplication();

    - For a search request:

      $searchApp = $factory->getSearchApplication();

  3. Send the proper message through the application instance:

    - For an ingestion request, if we use the OpenEuropa\EuropaSearch\Messages\Index\IndexWebContentMessage from the example of the “Message objects” section:

      $response = $indexApp->sendMessage($webContentMessage);

      $response is an OpenEuropa\EuropaSearch\Messages\Index\IndexingResponse object containing the indexed reference and a tracking id returned by the REST services.

    - For a search request, if we use the OpenEuropa\EuropaSearch\Messages\Search\SearchMessage from the example of the “Message objects” section:

      $response = $searchApp->sendMessage($searchMessage);

      $response is an OpenEuropa\EuropaSearch\Messages\Search\SearchResponse object containing the search results and other data related to the current search, such as the total number of results.

Architectural overview
Architecture in layers

The library is organized into 3 layers, each with a specific scope:

For more information about these layers, please consult the API documentation.

API Documentation

The documentation can be generated with your preferred phpDoc generator, such as phpDocumentor.

It is recommended to generate and consult the documentation for these packages:

And the documentation of the class OpenEuropa\EuropaSearch\EuropaSearch.
Together they complete the information given in the Library use section.

Quality control

The automatic quality control is managed by the “OpenEuropa code review” component.

The component depends on GrumPHP and bases its checks on the Drupal coding conventions.

Check the “OpenEuropa code review” documentation for more.

Component's Usage

GrumPHP tasks are run at every commit. To run them without performing a commit, use the following command:

vendor/bin/grumphp run

To simulate the pre-commit hook, use:

vendor/bin/grumphp git:pre-commit

Check GrumPHP documentation for more.

Testing

The client process is covered by unit tests written with PHPUnit.

All tests are located in the tests/src directory and can be run with the following command line:

vendor/bin/phpunit

The basic configuration of the PHPUnit environment is defined in phpunit.xml.dist.

Dependencies:

The client library depends on the following projects:

