Name: data-import
Owner: Simpleweb
Description: Import data from and export data to a range of different file formats and media
Created: 2013-10-10 08:09:21.0
Updated: 2013-12-03 22:46:36.0
Pushed: 2013-10-10 15:16:19.0
Homepage: null
Size: 278
Language: PHP
This PHP library offers a way to read data from, and write data to, a range of file formats and media. Additionally, it includes tools to manipulate your data.
This library is available on Packagist. The recommended way to install it is through Composer:
composer require ddeboer/data-import:0.2
For integration with Symfony2 projects, the DdeboerDataImportBundle is available.
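Each data import revolves around the workflow and takes place along the following lines:
1. Construct a reader.
2. Construct a workflow and pass the reader to it.
3. Add at least one writer to the workflow.
4. Optionally, add filters and value converters to the workflow.
5. Process the workflow.
So, schematically: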
use Ddeboer\DataImport\Workflow;
use Ddeboer\DataImport\Reader;
use Ddeboer\DataImport\Writer;
use Ddeboer\DataImport\Filter;

$reader = new Reader\...;
$workflow = new Workflow($reader);
$workflow
    ->addWriter(new Writer\...())
    ->addWriter(new Writer\...())
    ->addFilter(new Filter\CallbackFilter(...))
    ->process();
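To make the flow concrete, here is a minimal plain-PHP sketch of the same idea: items come from a reader, pass through filters, and are handed to every writer. It does not use the library; `runSketchWorkflow` and its parameters are hypothetical names chosen for illustration only.

```php
<?php
// Plain-PHP sketch of the workflow idea; none of these names come from the library.

/**
 * Pass every item from $reader through the filters, then hand it to each writer.
 *
 * @param iterable   $reader  Source of items (arrays).
 * @param callable[] $filters Each returns true to keep an item.
 * @param callable[] $writers Each receives an accepted item.
 * @return int Number of items written.
 */
function runSketchWorkflow(iterable $reader, array $filters, array $writers): int
{
    $count = 0;
    foreach ($reader as $item) {
        foreach ($filters as $filter) {
            if (!$filter($item)) {
                continue 2; // rejected: skip to the next item
            }
        }
        foreach ($writers as $writer) {
            $writer($item);
        }
        $count++;
    }
    return $count;
}

$written = [];
$count = runSketchWorkflow(
    new ArrayIterator([['name' => 'The Beatles'], ['name' => 'The Kinks']]),
    [function (array $item) { return $item['name'] !== 'The Beatles'; }],
    [function (array $item) use (&$written) { $written[] = $item; }]
);
```

Here one item is rejected by the filter, so only the second one reaches the writer.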
This library includes a handful of readers:
ArrayReader for reading arrays (or testing your workflow).
CsvReader for reading CSV files, optimized to use as little memory as possible:
use Ddeboer\DataImport\Reader\CsvReader;

$reader = new CsvReader(new \SplFileObject('/path/to/csv_file.csv'));
DbalReader to read data through Doctrine's DBAL:
use Ddeboer\DataImport\Reader\DbalReader;

$reader = new DbalReader(
    $connection, // Instance of \Doctrine\DBAL\Connection
    'SELECT u.id, u.username, g.name FROM `user` u INNER JOIN groups g ON u.group_id = g.id'
);
DoctrineReader to read data through the Doctrine ORM:
use Ddeboer\DataImport\Reader\DoctrineReader;

$reader = new DoctrineReader($entityManager, 'Your\Namespace\Entity\User');
ExcelReader that acts as an adapter for the PHPExcel library:
use Ddeboer\DataImport\Reader\ExcelReader;

$reader = new ExcelReader(new \SplFileObject('/path/to/excel_file.xls'));
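As a rough illustration of why an \SplFileObject keeps memory use low, the following plain-PHP sketch iterates over a CSV file one row at a time using PHP's built-in `SplFileObject::READ_CSV` flag. It is not the CsvReader implementation; the temporary file and the column names are made up for the example, and rows are collected in an array only so the result can be inspected.

```php
<?php
// Write a small CSV file to a temporary location so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name\n1,The Kinks\n2,The Who\n");

$file = new SplFileObject($path);
// READ_CSV makes iteration yield parsed rows; the other flags skip blank lines.
$file->setFlags(
    SplFileObject::READ_CSV
    | SplFileObject::READ_AHEAD
    | SplFileObject::SKIP_EMPTY
    | SplFileObject::DROP_NEW_LINE
);
$file->setCsvControl(',', '"', '\\');

$header = null;
$rows = [];
foreach ($file as $row) {
    if ($header === null) {
        $header = $row; // the first row holds the column names
        continue;
    }
    // Only the current row is parsed at this point; the file is never loaded whole.
    $rows[] = array_combine($header, $row);
}

$file = null; // release the file handle before deleting the file
unlink($path);
```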
After you've set up your reader, construct the workflow from it:
$workflow = new Workflow($reader);
Many of the data writers closely resemble their reader counterparts:
ArrayWriter.
CsvWriter.
DoctrineWriter.
Also available is ConsoleProgressWriter, which displays import progress when you start the workflow from the command line:
use Ddeboer\DataImport\Writer\ConsoleProgressWriter;
use Symfony\Component\Console\Output\ConsoleOutput;

$output = new ConsoleOutput(...);
$progressWriter = new ConsoleProgressWriter($output, $reader);
Build your own writer by implementing the WriterInterface.
If you want, you can use multiple writers:
$workflow->addWriter($progressWriter)
    ->addWriter($csvWriter);
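The sketch below shows, in plain PHP, what "multiple writers" amounts to: every item is passed to each writer in turn. The `SketchWriter` interface and both classes are hypothetical and exist only for this example; the library's WriterInterface may define different methods.

```php
<?php
// Hypothetical interface for this sketch only; the library's WriterInterface may differ.
interface SketchWriter
{
    public function writeItem(array $item): void;
}

// Collects items in memory.
final class MemoryWriter implements SketchWriter
{
    public $items = [];

    public function writeItem(array $item): void
    {
        $this->items[] = $item;
    }
}

// Counts items, as a stand-in for a progress display.
final class CountingWriter implements SketchWriter
{
    public $count = 0;

    public function writeItem(array $item): void
    {
        $this->count++;
    }
}

$memory = new MemoryWriter();
$counter = new CountingWriter();
$writers = [$counter, $memory];

foreach ([['id' => 1], ['id' => 2]] as $item) {
    foreach ($writers as $writer) {
        $writer->writeItem($item); // each writer sees every item
    }
}
```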
A filter decides whether data input is accepted into the import process. The library currently ships with a CallbackFilter:
use Ddeboer\DataImport\Filter\CallbackFilter;

// Don't import The Beatles
$filter = new CallbackFilter(function($data) {
    if ('The Beatles' == $data['name']) {
        return false;
    } else {
        return true;
    }
});
$workflow->addFilter($filter);
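The effect of such a callback can be tried out without the library. This plain-PHP sketch applies an equivalent predicate with `array_filter`; the sample rows are invented for illustration.

```php
<?php
$rows = [
    ['name' => 'The Beatles'],
    ['name' => 'The Kinks'],
];

// Returning false rejects the row, returning true accepts it.
$accept = function (array $data) {
    return 'The Beatles' !== $data['name'];
};

// array_filter keeps the original keys, so reindex with array_values.
$accepted = array_values(array_filter($rows, $accept));
```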
Value converters are used to convert specific fields (e.g., columns in database).
DateTimeValueConverter that converts a date representation in a format you specify into a DateTime object:
use Ddeboer\DataImport\ValueConverter\DateTimeValueConverter;

$converter = new DateTimeValueConverter('d/m/Y H:i:s');
$workflow->addValueConverter('my_date_field', $converter);
StringToObjectConverter that looks up an object in the database based on a string value:
use Ddeboer\DataImport\ValueConverter\StringToObjectConverter;

$converter = new StringToObjectConverter($repository, 'name');
$workflow->addValueConverter('input_name', $converter);
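To see what a format string such as 'd/m/Y H:i:s' accepts, the following sketch uses PHP's own `DateTime::createFromFormat`. Whether DateTimeValueConverter relies on this function internally is an assumption; the input strings are invented for the example.

```php
<?php
$format = 'd/m/Y H:i:s';

// A string that matches the format is parsed into a DateTime object.
$date = DateTime::createFromFormat($format, '03/12/2013 22:46:36');

// createFromFormat returns false when the input does not match the format.
$invalid = DateTime::createFromFormat($format, '2013-12-03');
```

Checking for `false` before using the result is a sensible precaution when the input data may contain dates in a different format.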
use Ddeboer\DataImport\Workflow;
use Ddeboer\DataImport\Source\HttpSource;
use Ddeboer\DataImport\Source\Filter\Unzip;
use Ddeboer\DataImport\Reader\CsvReader;
use Ddeboer\DataImport\ValueConverter\DateTimeValueConverter;

// Create the source; here we use an HTTP one
$source = new HttpSource('http://www.opta.nl/download/registers/nummers_csv.zip');

// As the source file is zipped, we add an unzip filter
$source->addFilter(new Unzip('nummers.csv'));

// Retrieve the \SplFileObject from the source
$file = $source->getFile();

// Create and configure the reader
$csvReader = new CsvReader($file);
$csvReader->setHeaderRowNumber(0);

// Create the workflow
$workflow = new Workflow($csvReader);
$dateTimeConverter = new DateTimeValueConverter();

// Add converters to the workflow
$workflow
    ->addValueConverter('twn_datumbeschikking', $dateTimeConverter)
    ->addValueConverter('twn_datumeind', $dateTimeConverter)
    ->addValueConverter('datummutatie', $dateTimeConverter)
    // You can also add closures as converters
    ->addValueConverter('twn_nummertm',
        new \Ddeboer\DataImport\ValueConverter\CallbackValueConverter(
            function($input) {
                return str_replace('-', '', $input);
            }
        )
    )
    ->addValueConverter('twn_nummervan',
        new \Ddeboer\DataImport\ValueConverter\CallbackValueConverter(
            function($input) {
                return str_replace('-', '', $input);
            }
        )
    )
    // Use one of the writers supplied with this bundle, implement your own, or use
    // a closure:
    ->addWriter(new \Ddeboer\DataImport\Writer\CallbackWriter(
        function($csvLine) {
            var_dump($csvLine);
        }
    ));

// Process the workflow
$workflow->process();
The ArrayValueConverterMap is used to convert values in a multi-level array.
The converters defined in the map are applied to every data item's value whose key matches one of the defined array keys.
//...
$data = array(
    'products' => array(
        0 => array(
            'name' => 'some name',
            'price' => '€12,16',
        ),
        1 => array(
            'name' => 'some name',
            'price' => '€12,16',
        ),
    )
);
// ...
// create the workflow and reader etc.
// ...
$workflow->addValueConverter(new ArrayValueConverterMap(array(
    'name' => array(new CharsetValueConverter('UTF-8', 'UTF-16')), // encode to UTF-8
    'price' => array(new CallbackValueConverter(function($input) { // remove € char
        return str_replace('€', '', $input);
    })),
)));
// ...
// after conversion, the data looks as follows
$data = array(
    'products' => array(
        0 => array(
            'name' => 'some name', // in UTF-8
            'price' => '12,16',
        ),
        1 => array(
            'name' => 'some name',
            'price' => '12,16',
        ),
    )
);
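The idea of applying converters by key at any depth can be sketched in plain PHP as follows. This is not the ArrayValueConverterMap implementation; `applyConverterMap` is a hypothetical helper, and plain callables stand in for the library's converter objects.

```php
<?php
/**
 * Recursively walk $data and run the converters registered for a key on each
 * non-array value stored under that key.
 *
 * @param array $data Possibly nested input data.
 * @param array $map  Key name => list of callables, applied in order.
 */
function applyConverterMap(array $data, array $map): array
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Descend into nested arrays such as the list of products.
            $data[$key] = applyConverterMap($value, $map);
        } elseif (isset($map[$key])) {
            foreach ($map[$key] as $converter) {
                $value = $converter($value);
            }
            $data[$key] = $value;
        }
    }
    return $data;
}

$data = ['products' => [
    ['name' => ' some name ', 'price' => '€12,16'],
    ['name' => 'other', 'price' => '€3,50'],
]];

$result = applyConverterMap($data, [
    'name'  => ['trim'],
    'price' => [function ($input) { return str_replace('€', '', $input); }],
]);
```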
Global mapping lets you define an array that is used to rename the fields of an item.
It can hold renaming rules for a multi-level array and is applied after the standard mapping rules.
//...
$data = array(
    0 => array(
        'foo' => 'bar',
        'baz' => array(
            'some' => 'value',
            'some2' => 'value'
        )
    )
);
// ...
// create the workflow and reader etc.
// ...
// this defines a single mapping
$workflow->addMapping('baz', 'bazinga');
// this defines global renaming rules
$workflow->setGlobalMapping(array(
    'foo' => 'fooloo',
    'bazinga' => array( // we need to use the new name here because global mapping is applied after standard mapping
        'some' => 'something',
        'some2' => 'somethingelse'
    )
));
// ...
// after mapping, the data looks as follows
$data = array(
    0 => array(
        'fooloo' => 'bar',
        'bazinga' => array(
            'something' => 'value',
            'somethingelse' => 'value'
        )
    )
);
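The ordering rule (standard mapping first, global mapping second) can be illustrated with a small plain-PHP sketch. `renameKeys` is a hypothetical helper written for this example and is not how the library implements mapping.

```php
<?php
/**
 * Rename keys of $item according to $mapping. A string target renames the key;
 * an array target keeps the key and renames the nested keys instead.
 */
function renameKeys(array $item, array $mapping): array
{
    $result = [];
    foreach ($item as $key => $value) {
        if (isset($mapping[$key]) && is_array($mapping[$key]) && is_array($value)) {
            $result[$key] = renameKeys($value, $mapping[$key]);
        } elseif (isset($mapping[$key]) && is_string($mapping[$key])) {
            $result[$mapping[$key]] = $value;
        } else {
            $result[$key] = $value;
        }
    }
    return $result;
}

$item = ['foo' => 'bar', 'baz' => ['some' => 'value', 'some2' => 'value']];

// Step 1: the single mapping, comparable to addMapping('baz', 'bazinga').
$item = renameKeys($item, ['baz' => 'bazinga']);

// Step 2: the global mapping, which must refer to the already renamed key.
$item = renameKeys($item, [
    'foo' => 'fooloo',
    'bazinga' => ['some' => 'something', 'some2' => 'somethingelse'],
]);
```

If the second step used 'baz' instead of 'bazinga', the nested keys would stay unchanged, because that key no longer exists after the first step.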