JuliaArrays/AxisArrays.jl

Name: AxisArrays.jl

Owner: JuliaArrays

Description: Performant arrays where each dimension can have a named axis with values

Created: 2015-02-13 01:08:29.0

Updated: 2017-12-18 23:57:20.0

Pushed: 2018-01-18 07:13:35.0

Homepage: http://JuliaArrays.github.io/AxisArrays.jl/latest/

Size: 920

Language: Julia

GitHub Committers

UserMost Recent Commit# Commits

Other Committers

UserEmailMost Recent Commit# Commits

README

AxisArrays

Build Status Coverage Status

This package for the Julia language provides an array type (the AxisArray) that knows about its dimension names and axis values. This allows for indexing with the axis name without incurring any runtime overhead. AxisArrays can also be indexed by the values of their axes, allowing column names or interval selections. This permits one to implement algorithms that are oblivious to the storage order of the underlying arrays. In contrast to similar approaches in Images.jl and NamedArrays.jl, this allows for type-stable selection of dimensions and compile-time axis lookup. It is also better suited for regularly sampled axes, like samples over time.

Collaboration is welcome! This is still a work-in-progress. See the roadmap for the project's current direction.

Example of currently-implemented behavior:
a> Pkg.add("AxisArrays")
a> using AxisArrays, Unitful
a> import Unitful: s, ms, µs

a> rng = MersenneTwister(123) # Seed a random number generator for repeatable examples
a> fs = 40000 # Generate a 40kHz noisy signal, with spike-like stuff added for testing
a> y = randn(rng, 60*fs+1)*3
a> for spk = (sin.(0.8:0.2:8.6) .* [0:0.01:.1; .15:.1:.95; 1:-.05:.05] .* 50,
              sin.(0.8:0.4:8.6) .* [0:0.02:.1; .15:.1:1; 1:-.2:.1] .* 50)
       i = rand(rng, round(Int,.001fs):1fs)
       while i+length(spk)-1 < length(y)
           y[i:i+length(spk)-1] += spk
           i += rand(rng, round(Int,.001fs):1fs)
       end
   end

a> A = AxisArray([y 2y], Axis{:time}(0s:1s/fs:60s), Axis{:chan}([:c1, :c2]))
mensional AxisArray{Float64,2,...} with axes:
:time, 0.0 s:2.5e-5 s:60.0 s
:chan, Symbol[:c1, :c2]
data, a 2400001×2 Array{Float64,2}:
5708     7.14161
14454   12.2891  
42795    6.85591
37825    2.75649
19004   -2.38007
99414   -3.98828
9429     5.88581
226449  -0.452898
821446   1.64289
582687  -1.16537
                 
50593   -7.01187
26783    4.53565
16902   -0.33804
84852   -7.69703
226457   0.452914
560809   1.12162
67663    9.35326
41005   -4.8201  
71612   -7.43224

AxisArrays behave like regular arrays, but they additionally use the axis information to enable all sorts of fancy behaviors. For example, we can specify indices in any order, just so long as we annotate them with the axis name:

a> A[Axis{:time}(4)]
mensional AxisArray{Float64,1,...} with axes:
:chan, Symbol[:c1, :c2]
data, a 2-element Array{Float64,1}:
7825
5649

a> A[Axis{:chan}(:c2), Axis{:time}(1:5)]
mensional AxisArray{Float64,1,...} with axes:
:time, 0.0 s:2.5e-5 s:0.0001 s
data, a 5-element Array{Float64,1}:
14161
2891
85591
75649
38007

We can also index by the values of each axis using an Interval type that selects all values between two endpoints a .. b or the axis values directly. Notice that the returned AxisArray still has axis information itself… and it still has the correct time information for those datapoints!

a> A[40µs .. 220µs, :c1]
mensional AxisArray{Float64,1,...} with axes:
:time, 5.0e-5 s:2.5e-5 s:0.0002 s
data, a 7-element Array{Float64,1}:
42795
37825
19004
99414
9429  
226449
821446

a> axes(ans, 1)
Arrays.Axis{:time,StepRangeLen{Quantity{Float64, Dimensions:{?}, Units:{s}},Base.TwicePrecision{Quantity{Float64, Dimensions:{?}, Units:{s}}},Base.TwicePrecision{Quantity{Float64, Dimensions:{?}, Units:{s}}}}}(5.0e-5 s:2.5e-5 s:0.0002 s)

You can also index by a single value on an axis using atvalue. This will drop a dimension. Indexing with an Interval type retains dimensions, even when the ends of the interval are equal:

a> A[atvalue(2.5e-5s), :c1]
453912336772

a> A[2.5e-5s..2.5e-5s, :c1]
mensional AxisArray{Float64,1,...} with axes:
:time, 2.5e-5 s:2.5e-5 s:2.5e-5 s
data, a 1-element Array{Float64,1}:
4454

You can even index by multiple values by broadcasting atvalue over an array:

a> A[atvalue.([2.5e-5s, 75.0µs])]
mensional AxisArray{Float64,2,...} with axes:
:time, Quantity{Float64, Dimensions:{?}, Units:{s}}[2.5e-5 s, 7.5e-5 s]
:chan, Symbol[:c1, :c2]
data, a 2×2 Array{Float64,2}:
4454  12.2891
7825   2.75649

Sometimes, though, what we're really interested in is a window of time about a specific index. One of the operations above (looking for values in the window from 40µs to 220µs) might be more clearly expressed as a symmetrical window about a specific index where we know something interesting happened. To represent this, we use the atindex function:

a> A[atindex(-90µs .. 90µs, 5), :c2]
mensional AxisArray{Float64,1,...} with axes:
:time_sub, -7.5e-5 s:2.5e-5 s:7.500000000000002e-5 s
data, a 7-element Array{Float64,1}:
85591
75649
38007
98828
88581
452898
64289

Note that the returned AxisArray has its time axis shifted to represent the interval about the given index! This simple concept can be extended to some very powerful behaviors. For example, let's threshold our data and find windows about those threshold crossings.

a> idxs = find(diff(A[:,:c1] .< -15) .> 0);

a> spks = A[atindex(-200µs .. 800µs, idxs), :c1]
mensional AxisArray{Float64,2,...} with axes:
:time_sub, -0.0002 s:2.5e-5 s:0.0008 s
:time_rep, Quantity{Float64, Dimensions:{?}, Units:{s}}[0.162 s, 0.20045 s, 0.28495 s, 0.530325 s, 0.821725 s, 1.0453 s, 1.11967 s, 1.1523 s, 1.22085 s, 1.6253 s  ?  57.0094 s, 57.5818 s, 57.8716 s, 57.8806 s, 58.4353 s, 58.7041 s, 59.1015 s, 59.1783 s, 59.425 s, 59.5657 s]
data, a 41×247 Array{Float64,2}:
.672063    7.25649      0.633375  ?    1.54583     5.81194    -4.706
.65182     2.57487      0.477408       3.09505     3.52478     4.13037
.46035     2.11313      4.78372        1.23385     7.2525      3.57485
.25651    -2.19785      3.05933        0.965021    6.78414     5.94854
.8537      0.345008     0.960533       0.812989    0.336715    0.303909
.466816    0.643649    -3.67087   ?    3.92978    -3.1242      0.789722
.0445    -13.2441      -4.60716        0.265144   -4.50987    -8.84897
.21703   -13.2254     -14.4409        -8.6664    -13.3457    -11.6213
.1809    -22.7037     -25.023        -15.9376    -28.0817    -16.996
.2671    -31.2021     -25.3787       -24.4914    -32.2599    -26.1118
                                  ?                ?
.301629    0.0683982   -4.36574        1.92362    -5.12333    -3.4431
.7182      1.18615      4.40717       -4.51757    -8.64314     0.0800021
.43775    -0.151882    -1.40817       -3.38555    -2.23418     0.728549
.2482     -0.60967      0.471288  ?    2.53395     0.468817   -3.65905
.26967     2.24747     -3.13758        1.74967     4.5052     -0.145357
.752487    1.69446     -1.20491        1.71429     1.81936     0.290158
.64348    -3.94187     -1.59213        7.15428    -0.539748    4.82309
.09652    -2.66999      0.521931      -3.80528     1.70421     3.40583
.94341     2.60785     -3.34291   ?    1.10584     4.31118     3.6404

By indexing with a repeated interval, we have added a dimension to the output! The returned AxisArray's columns specify each repetition of the interval, and each datapoint in the column represents a timepoint within that interval, adjusted by the time of the theshold crossing. The best part here is that the returned matrix knows precisely where its data came from, and has labeled its dimensions appropriately. Not only is there the proper time base for each waveform, but we also have recorded the event times as the axis across the columns.

Indexing
Indexing axes

Two main types of Axes supported by default include:

Here is an example with a Dimensional axis representing a time sequence along rows and a Categorical axis of symbols for column headers.

AxisArray(reshape(1:15, 5, 3), .1:.1:0.5, [:a, :b, :c])
is{:row}(Interval(.2,.4))] # restrict the AxisArray along the time axis
terval(0.,.3), [:a, :c]]   # select an interval and two of the columns

User-defined axis types can be added along with custom indexing behaviors.

Example: compute the intensity-weighted mean along the z axis
AxisArray(randn(100,100,100), :x, :y, :z)
al = sumz = 0.0
iter in eachindex(B)  # traverses in storage order for cache efficiency
I = B[iter]  # intensity in a single voxel
Itotal += I
sumz += I * iter[axisdim(B, Axis{:z})]  # axisdim "looks up" the z dimension

z = sumz/Itotal

The intention is that all of these operations are just as efficient as they would be if you used traditional position-based indexing with all the inherent assumptions about the storage order of B.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.