hammerlab/iterators

Name: iterators

Owner: Hammer Lab

Description: Enrichment-methods for Scala collections (Iterators, Iterables, Arrays)

Created: 2016-11-13 21:33:16.0

Updated: 2017-11-07 16:02:08.0

Pushed: 2018-01-15 00:25:34.0

Homepage:

Size: 210

Language: Scala

README

iterators


Enrichment-methods for Scala collections (Iterators, Iterables, and Arrays):

import hammerlab.iterator._

Iterator(1, 2, 3).nextOption           // Some(1)
Iterator(1, 2, 3).buffered.headOption  // Some(1)

Array(1, 2, 1, 3).countElems
// Map(1 -> 2, 2 -> 1, 3 -> 1)

List(1, 1, 2, 1, 7, 7, 7).runLengthEncode
// Iterator(1 -> 2, 2 -> 1, 1 -> 1, 7 -> 3)

Methods are defined in org.hammerlab.iterator and made available for convenient importing in hammerlab.iterator.

examples

by package:

count

Array(1, 2, 1, 3).countElems
// Map(1 -> 2, 2 -> 1, 3 -> 1)

Iterator("a" -> 1, "b" -> 2, "a" -> 10, "c" -> 3).countByKey
// Map("a" -> 11, "b" -> 2, "c" -> 3)

either

def L[T](t: T) = Left(t)
def R[T](t: T) = Right(t)

Iterator(R('a), R('b), L(4)).findLeft
// Some(4)

Iterator(
  R('a'),
  L(1),
  R('b'),
  R('c'),
  L(2),
  L(3),
  R('d')
)
.groupByLeft
.mapValues(_.mkString(""))
.toList
// List((1,bc), (2,""), (3,d))

end

.finish: run a closure when the iterator is finished traversing:

import scala.io.Source.fromFile
val source = fromFile("build.sbt")
source
  .filter(_ == 'a')
  .finish({
    println("closing!")
    source.close()
  })
  .size
// the number of 'a' characters in build.sbt
// prints "closing!" and closes `source` after traversal is finished
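
To illustrate the idea behind .finish, here is a minimal sketch of an iterator wrapper that runs a closure once the underlying iterator is exhausted. The names FinishSketch and onFinish are hypothetical and this is not the library's implementation; it assumes the consumer calls hasNext after the last element, which is when the closure fires.

```scala
object FinishSketch {
  // Hypothetical helper: wraps `it` and runs `cleanup` exactly once,
  // the first time `hasNext` reports that no elements remain.
  def onFinish[T](it: Iterator[T])(cleanup: => Unit): Iterator[T] =
    new Iterator[T] {
      private var done = false

      def hasNext: Boolean = {
        val more = it.hasNext
        if (!more && !done) {
          done = true
          cleanup
        }
        more
      }

      def next(): T = it.next()
    }
}
```

Consuming the wrapped iterator with a method such as size or toList triggers the closure, because those methods poll hasNext until it returns false.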

.dropright: drop k elements from the end of an iterator in O(k) space:

Iterator(1 to 10: _*).dropright(4)
// Iterator(1, 2, 3, 4, 5, 6)
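
One way to get the O(k) space bound is to keep a small look-ahead buffer and only emit an element once k more elements are known to follow it. The sketch below shows that approach with the standard library only; DropRightSketch and dropRight are hypothetical names, and the library may well do this differently.

```scala
import scala.collection.mutable

object DropRightSketch {
  // Hypothetical helper: lazily drops the last `k` elements of `it`,
  // buffering at most k + 1 elements at any time.
  def dropRight[T](it: Iterator[T], k: Int): Iterator[T] =
    if (k <= 0) it
    else new Iterator[T] {
      private val buffer = mutable.Queue.empty[T]

      // Top the buffer up to k + 1 elements while input remains.
      private def fill(): Unit =
        while (buffer.size <= k && it.hasNext) buffer.enqueue(it.next())

      // An element may be emitted only if k others are queued behind it.
      def hasNext: Boolean = { fill(); buffer.size > k }

      def next(): T = {
        if (!hasNext) throw new NoSuchElementException("next on empty iterator")
        buffer.dequeue()
      }
    }
}
```

With this sketch, DropRightSketch.dropRight(Iterator(1 to 10: _*), 4) yields 1 through 6, matching the example above.
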
group

Group runs of elements that satisfy a predicate or equivalence relation:

Iterator(1, 0, 2, 3, 0, 0, 4, 5, 6).groupRuns(_ > 0)
// Iterator(Iterator(1), Iterator(0), Iterator(2, 3), Iterator(0), Iterator(0), Iterator(4, 5, 6))
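
The output above suggests that consecutive elements satisfying the predicate are grouped together, while each non-matching element forms a group of its own. The following sketch reproduces that behavior for the predicate case with the standard library; GroupRunsSketch is a hypothetical name, and it returns Lists rather than nested Iterators to keep the example short.

```scala
import scala.collection.mutable.ListBuffer

object GroupRunsSketch {
  // Hypothetical helper: groups maximal runs of elements matching `pred`;
  // each non-matching element is emitted as a singleton group.
  def groupRuns[T](it: Iterator[T])(pred: T => Boolean): Iterator[List[T]] = {
    val buffered = it.buffered
    new Iterator[List[T]] {
      def hasNext: Boolean = buffered.hasNext

      def next(): List[T] = {
        val first = buffered.next()
        if (!pred(first)) List(first)
        else {
          val run = ListBuffer(first)
          // Extend the run while the next element also matches.
          while (buffered.hasNext && pred(buffered.head)) run += buffered.next()
          run.toList
        }
      }
    }
  }
}
```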

Run-length encode elements:

List(1, 1, 2, 1, 7, 7, 7).runLengthEncode
// Iterator(1 -> 2, 2 -> 1, 1 -> 1, 7 -> 3)
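
For readers unfamiliar with run-length encoding, here is a small lazy version built on the standard library's BufferedIterator. RunLengthSketch is a hypothetical name used for illustration only; it is a sketch of the technique, not the library's code.

```scala
object RunLengthSketch {
  // Hypothetical helper: lazily emits (element, run length) pairs.
  def runLengthEncode[T](it: Iterator[T]): Iterator[(T, Int)] = {
    val buffered = it.buffered
    new Iterator[(T, Int)] {
      def hasNext: Boolean = buffered.hasNext

      def next(): (T, Int) = {
        val elem = buffered.next()
        var count = 1
        // Consume the remainder of the current run.
        while (buffered.hasNext && buffered.head == elem) {
          buffered.next()
          count += 1
        }
        (elem, count)
      }
    }
  }
}
```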

Group contiguous elements so that each group's total cost stays within a limit:

Iterator(1 to 6: _*).cappedCostGroups(costFn = x => x, limit = 10)
// Iterator(Iterator(1, 2, 3, 4), Iterator(5), Iterator(6))
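
The grouping above can be reproduced with a greedy pass: keep adding elements to the current group until the next one would push the total cost over the limit. The sketch below makes two assumptions that the example does not confirm: costs are Ints, and an element whose own cost exceeds the limit still gets a group to itself. CappedGroupsSketch and cappedGroups are hypothetical names.

```scala
import scala.collection.mutable.ListBuffer

object CappedGroupsSketch {
  // Hypothetical helper: greedily packs contiguous elements into groups
  // whose summed cost does not exceed `limit`.
  def cappedGroups[T](it: Iterator[T], cost: T => Int, limit: Int): Iterator[List[T]] = {
    val buffered = it.buffered
    new Iterator[List[T]] {
      def hasNext: Boolean = buffered.hasNext

      def next(): List[T] = {
        // Assumption: every group takes at least one element,
        // even if that element alone exceeds the limit.
        val first = buffered.next()
        val group = ListBuffer(first)
        var total = cost(first)
        while (buffered.hasNext && total + cost(buffered.head) <= limit) {
          val elem = buffered.next()
          total += cost(elem)
          group += elem
        }
        group.toList
      }
    }
  }
}
```
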
level

Flatten a nested iterator while retaining access to a cursor into the unflattened version:

val it1 = Iterator(1, 2)
val it2 = Iterator(3, 4)
val it = Iterator(it1, it2).level

it.cur.get == it1
// true

it.next
// 1

it.cur.get == it1
// true

it.next
// 2

it.cur.get == it2
// true

it.next
// 3

it.cur.get == it2
// true

it.next
// 4

it.cur
// None

ordered

A variety of merge operations are available for sequences that are mutually ordered (possibly with respect to some 3rd type that each of their elements can be converted to).

.eitherMerge

Merge two ordered sequences using Eithers to preserve provenance (or handle the case that the sequences' elements are not the same type):

Seq(1, 3, 4).eitherMerge(Seq(2, 3, 5, 6))
// Iterator(L(1), R(2), L(3), R(3), L(4), R(5), R(6))

.orMerge

Merge two ordered sequences using Ors:

Seq(1, 3, 4).orMerge(Seq(2, 3, 5, 6))
// Iterator(L(1), R(2), Both(3, 3), L(4), R(5), R(6))

.leftMerge

Collecting right-side elements for each left-side element:

Seq(1, 3, 4).leftMerge(Seq(2, 3, 5, 6))
// Iterator((1,Iterator(2)), (3,Iterator(3)), (4,Iterator(5, 6)))

.merge

Seq(1, 3, 4).merge(Seq(2, 3, 5, 6))
// Iterator(1, 2, 3, 3, 4, 5, 6)
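
As a point of reference, a plain merge of two already-sorted iterators can be written with two buffered iterators and an Ordering. This is a sketch of the general technique under that assumption, not the library's implementation; MergeSketch is a hypothetical name, and sending ties to the left side is a choice made here to match the order shown in the .eitherMerge example.

```scala
object MergeSketch {
  // Hypothetical helper: lazily merges two iterators that are each
  // already sorted according to `ord`.
  def merge[T](left: Iterator[T], right: Iterator[T])(implicit ord: Ordering[T]): Iterator[T] = {
    val l = left.buffered
    val r = right.buffered
    new Iterator[T] {
      def hasNext: Boolean = l.hasNext || r.hasNext

      def next(): T =
        if (!r.hasNext) l.next()
        else if (!l.hasNext) r.next()
        else if (ord.lteq(l.head, r.head)) l.next() // ties go to the left side
        else r.next()
    }
  }
}
```
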
Merging with a 3rd type

Instances of the View type-class let merges use a type other than that of the elements being merged:

// Rank a (Symbol,Int) pair using its Int value
implicit val view = View[(Symbol, Int), Int](_._2)

Seq('a -> 1, 'b -> 3).merge('c -> 2)
// Iterator('a -> 1, 'c -> 2, 'b -> 3)

Seq('a -> 1, 'b -> 3).eitherMerge(2)
// Iterator(L('a -> 1), R(2), L('b -> 3))

range

sliceOpt, given a start and length:

(0 to 9).sliceOpt(0,  5)
// 0 to 4

(0 to 9).sliceOpt(0, 11)
// 0 to 9

(0 to 9).sliceOpt(2, 10)
// 2 to 9

(0 to 9).sliceOpt(2,  1)
// 2 to 2

Also, .joinOverlaps left-merges sequences of Ranges, sorted by start-coordinate, based on overlaps.

sample: reservoir-sample

Reservoir-sample:

Iterator(1 to 100: _*).sample(5)
// Array(15, 18, 55, 63, 98)
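
Reservoir sampling picks k elements uniformly at random from a stream of unknown length in a single pass. The sketch below uses the classic "Algorithm R" variant with the standard library; it is only an illustration of the technique, and the library may use a different variant. ReservoirSketch and its rng parameter are hypothetical, and unlike the output shown above, this sketch makes no attempt to return the sample in input order.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object ReservoirSketch {
  // Hypothetical helper: returns up to `k` elements chosen uniformly at
  // random from `it`, using O(k) memory and one pass over the input.
  def sample[T](it: Iterator[T], k: Int, rng: Random = new Random()): Vector[T] = {
    val reservoir = ArrayBuffer.empty[T]
    var seen = 0
    for (elem <- it) {
      seen += 1
      if (reservoir.size < k) reservoir += elem // fill the reservoir first
      else {
        // Replace a random slot with probability k / seen.
        val j = rng.nextInt(seen)
        if (j < k) reservoir(j) = elem
      }
    }
    reservoir.toVector
  }
}
```
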
scan

import hammerlab.monoid._  // some Monoid defaults

Seq(1, 2, 3, 4).scanL
// Iterator(0, 1, 3, 6)

Seq(1, 2, 3, 4).scanLeftInclusive
// Iterator(1, 3, 6, 10)

Seq(1, 2, 3, 4).scanR
// Iterator(9, 7, 4, 0)

Seq(1, 2, 3, 4).scanRightInclusive
// Iterator(10, 9, 7, 4)
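
To relate these to the standard library: Scala's own scanLeft and scanRight return one more element than the input (the zero plus every running total), and the four results above correspond to dropping one end of that sequence. The sketch below shows the correspondence for Int sums only; ScanSketch and its method names are hypothetical, and the library versions work on a Monoid rather than a hard-coded zero and +.

```scala
object ScanSketch {
  // scanLeft(0)(_ + _) on Seq(1, 2, 3, 4) gives 0, 1, 3, 6, 10.
  def leftExclusive(xs: Seq[Int]): Seq[Int] = xs.scanLeft(0)(_ + _).init // drop the final total
  def leftInclusive(xs: Seq[Int]): Seq[Int] = xs.scanLeft(0)(_ + _).tail // drop the leading zero

  // scanRight(0)(_ + _) on Seq(1, 2, 3, 4) gives 10, 9, 7, 4, 0.
  def rightExclusive(xs: Seq[Int]): Seq[Int] = xs.scanRight(0)(_ + _).tail // drop the full total
  def rightInclusive(xs: Seq[Int]): Seq[Int] = xs.scanRight(0)(_ + _).init // drop the trailing zero
}
```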

Additionally, scan over values of kv-pairs:

Seq('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4).scanLeftValues
// Iterator((a,0), (b,1), (c,3), (d,6))

Seq('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4).scanLeftValuesInclusive
// Iterator((a,1), (b,3), (c,6), (d,10))

Seq('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4).scanRightValues
// Iterator((a,9), (b,7), (c,4), (d,0))

Seq('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4).scanRightValuesInclusive
// Iterator((a,10), (b,9), (c,7), (d,4))

sliding

Windows of size 2, including an optional next or previous element:

Seq(1, 2, 3).sliding2
// Iterator((1,2), (2,3))

Seq(1, 2, 3).sliding2Opt
// Iterator((1,Some(2)), (2,Some(3)), (3,None))

Seq(1, 2, 3).sliding2Prev
// Iterator((None,1), (Some(1),2), (Some(2),3))

Windows of size 3: full tuples only, with an optional predecessor and successor, or with up to two optional succeeding elements:

Seq(1, 2, 3, 4).sliding3
// Iterator((1,2,3), (2,3,4))

Seq(1, 2, 3, 4).sliding3Opt
// Iterator((None,1,Some(2)), (Some(1),2,Some(3)), (Some(2),3,Some(4)), (Some(3),4,None))

Seq(1, 2, 3, 4).sliding3NextOpts
// Iterator((1,Some(2),Some(3)), (2,Some(3),Some(4)), (3,Some(4),None), (4,None,None))

Windows of arbitrary size, with the output having the same number of elements as the input:

Seq(1, 2, 3, 4, 5).slide(4)
// Iterator(Seq(1, 2, 3, 4), Seq(2, 3, 4, 5), Seq(3, 4, 5), Seq(4, 5), Seq(5))
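
For comparison, the standard library's sliding(4) would stop after the last full window, whereas the output above keeps the shorter trailing windows. For an in-memory Seq, the same shape can be sketched with tails; SlideSketch is a hypothetical name, and this version is not a lazy single-pass iterator implementation.

```scala
object SlideSketch {
  // Hypothetical helper: one window per input element, each holding that
  // element and up to n - 1 successors (trailing windows are shorter).
  def slide[T](xs: Seq[T], n: Int): Iterator[Seq[T]] =
    xs.tails.filter(_.nonEmpty).map(_.take(n))
}
```
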
start

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.