Name: Zander
Owner: F# Community Project Incubation Space
Description: Regular expression types for matrix information. I.e. parse structured blocks of information from csv or excel files (or similar 2d matrixes)
Created: 2015-09-17 15:25:31.0
Updated: 2018-03-06 11:56:06.0
Pushed: 2018-03-06 08:34:10.0
Size: 136
Language: F#
Named after the fish: Zander. It's a small library that eases parsing structured blocks of information within a 2-dimensional matrix. Typically you get this sort of information from report generators. You might still want to extract this information programmatically, thus the need for the fish.
Use it when you have data in a structured format, but with different blocks of information. A very simple example is the following:
| | | | | | | Report Title | | | | 16/09/15 16:17 | Page: 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Company AB | | | | | | | | | | | |
| Some text | | | | | | | | | | | |
| that goes on and explains the report | | | | | | | | | | | |
| | Id | | Value | Type | | | Attribute 1 | | Attribute 2 | | |
| | 1244 | | 25 | A | | | | | | | |
| | 1244 | | 25 | B | | | 255 | | 155 | | |
| | 1244 | | 25 | C | | | | | | | |
| | 1250 | | 25 | B | | | 255 | | 100 | | |
| | 1250 | | 25 | C | | | | | | | |
| | | | | | | Report Title | | | | 16/09/15 16:17 | Page: 2 |
| Company AB | | | | | | | | | | | |
| Some text | | | | | | | | | | | |
| that goes on and explains the report | | | | | | | | | | | |
| | Id | | Value | Type | | | Attribute 1 | | Attribute 2 | | |
| | 1251 | | 25 | A | | | 255 | | | | |
| | 1251 | | 25 | B | | | 130 | | | | |
| | 1251 | | 25 | C | | | | | | | |
| | 1260 | | 25 | A | | | | | | | |
| | 1260 | | 25 | B | | | 255 | | 15 | | |
| | 1260 | | 25 | C | | | | | | | |
But the structure of the block layout might change from “page” to “page”.
A row specification uses the following syntax:

- `_` to indicate that there should be an empty column
- `"Some constant"` or `constant` to indicate a column with a constant value
- `@Value` to indicate that you want the value in that column
- `( .. | .. )` to match any of the listed alternatives

In order to match rows, you supply the row specification with a name by postfixing it with ` : title`. If you want the row to match many rows with the same format, you add a '+': ` : title+`.
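To make the column rules concrete, here is a minimal C# sketch of how one row specification could be checked against one row of cells. It is not Zander's implementation or API: `RowSpecSketch`, `Tokenize`, `MatchCell` and `TryMatchRow` are hypothetical names. The sketch assumes that `_` matches a blank cell, that `@Name` requires a non-blank cell, and that alternatives in `( .. | .. )` are tried left to right. The ` : name` suffix is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative sketch only: this is NOT Zander's implementation or API.
static class RowSpecSketch
{
    // Splits a row specification into tokens: a "quoted constant",
    // an alternation ( .. | .. ), or a bare word such as _ , @Id or Id.
    static List<string> Tokenize(string spec) =>
        Regex.Matches(spec, "\"[^\"]*\"|\\([^)]*\\)|\\S+")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToList();

    // Checks one cell against one token and records @Name captures.
    static bool MatchCell(string token, string cell, Dictionary<string, string> captured)
    {
        if (token.StartsWith("(", StringComparison.Ordinal) &&
            token.EndsWith(")", StringComparison.Ordinal))
        {
            // Assumption: alternatives are tried left to right, first match wins.
            return token.Substring(1, token.Length - 2)
                        .Split('|')
                        .Any(alternative => MatchCell(alternative.Trim(), cell, captured));
        }
        if (token == "_")
            return string.IsNullOrWhiteSpace(cell);      // empty column
        if (token.StartsWith("@", StringComparison.Ordinal))
        {
            // Assumption: a capture needs a non-blank cell.
            if (string.IsNullOrWhiteSpace(cell)) return false;
            captured[token.Substring(1)] = cell;
            return true;
        }
        return token.Trim('"') == cell;                   // constant value
    }

    // True when every cell matches its token; captured holds the @Name values.
    public static bool TryMatchRow(string spec, string[] cells,
                                   out Dictionary<string, string> captured)
    {
        captured = new Dictionary<string, string>();
        var tokens = Tokenize(spec);
        if (tokens.Count != cells.Length) return false;
        for (var i = 0; i < tokens.Count; i++)
        {
            if (!MatchCell(tokens[i], cells[i], captured)) return false;
        }
        return true;
    }

    static void Main()
    {
        const string spec = "_ @Id _ @Value @Type _ _ (@Attribute1|_) _ (@Attribute2|_) _ _";
        var row = new[] { "", "1244", "", "25", "B", "", "", "255", "", "155", "", "" };
        if (TryMatchRow(spec, row, out var captured))
        {
            foreach (var pair in captured)
                Console.WriteLine($"{pair.Key} = {pair.Value}");
        }
    }
}
```

For the sample row, the sketch captures five values: `Id`, `Value`, `Type`, `Attribute1` and `Attribute2`. A row whose attribute cells are blank would still match, because the `_` alternative accepts the blank cell.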
How do you use this library to extract the information above? You use the parser builder:
    using Zander;

    // arrayOfArrays is the 2d matrix to parse, e.g. the rows read from a csv or excel file.
    var parsed = new BlockEx(@" _ _ _ _ _ _ ""Report Title"" _ _ _ @Time @Page : report_title
    ""Company AB"" _ _ _ _ _ _ _ _ _ _ _ : company
    @Text _ _ _ _ _ _ _ _ _ _ _ : text+
    _ Id _ Value Type _ _ ""Attribute 1"" _ ""Attribute 2"" _ _ : header
    _ @Id _ @Value @Type _ _ (@Attribute1|_) _ (@Attribute2|_) _ _ : row+
    ")
        .Matches(arrayOfArrays);
This will give you structured information that will be easy to consume.
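As a rough picture of what a block specification does with its named rows, the following C# sketch walks a matrix and groups consecutive rows under rule names. It is not how Zander works internally and does not use its API: `BlockSketch`, `RowRule` and `MatchBlock` are hypothetical. It assumes that `+` means "one or more rows", matched greedily and without backtracking, and that a block may start at any row. Row rules are plain predicates here rather than parsed specifications, and the matrix is narrowed down to three columns.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: this is NOT Zander's implementation or API.
static class BlockSketch
{
    // One named row rule. Many mirrors the '+' suffix, assumed to mean "one or more rows".
    sealed class RowRule
    {
        public string Name;
        public bool Many;
        public Func<string[], bool> IsMatch;
    }

    // Applies the rules in order starting at rows[start]. Returns the consumed
    // (rule name, row) pairs, or null when the rows do not fit the block.
    // Simplification: '+' is greedy and there is no backtracking.
    static List<KeyValuePair<string, string[]>> MatchBlock(
        RowRule[] rules, string[][] rows, int start)
    {
        var matched = new List<KeyValuePair<string, string[]>>();
        var index = start;
        foreach (var rule in rules)
        {
            var consumed = 0;
            while (index < rows.Length
                   && (rule.Many || consumed == 0)
                   && rule.IsMatch(rows[index]))
            {
                matched.Add(new KeyValuePair<string, string[]>(rule.Name, rows[index]));
                index++;
                consumed++;
            }
            if (consumed == 0) return null;
        }
        return matched;
    }

    static void Main()
    {
        // Hypothetical, narrowed-down matrix with two "pages".
        var rows = new[]
        {
            new[] { "Report Title", "", "Page: 1" },
            new[] { "Id", "Value", "Type" },
            new[] { "1244", "25", "A" },
            new[] { "1244", "25", "B" },
            new[] { "Report Title", "", "Page: 2" },
            new[] { "Id", "Value", "Type" },
            new[] { "1251", "25", "A" },
        };
        var rules = new[]
        {
            new RowRule { Name = "report_title", IsMatch = r => r[0] == "Report Title" },
            new RowRule { Name = "header", IsMatch = r => r[0] == "Id" },
            new RowRule { Name = "row", Many = true, IsMatch = r => int.TryParse(r[0], out _) },
        };

        var position = 0;
        while (position < rows.Length)
        {
            var block = MatchBlock(rules, rows, position);
            if (block == null) { position++; continue; }   // no block starts here
            Console.WriteLine($"Block at row {position} with {block.Count} rows:");
            foreach (var pair in block)
            {
                var cells = string.Join(" | ", pair.Value);
                Console.WriteLine($"  {pair.Key}: {cells}");
            }
            position += block.Count;
        }
    }
}
```

With the sample rows, the sketch reports two blocks: four rows starting at row 0 and three rows starting at row 4. Each row is labelled with the name of the rule that matched it, which is the kind of grouping the named rows in the specification are meant to provide.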