Enrichment methods for Scala collections (Iterators, Iterables, and Arrays):
import hammerlab.iterator._
Iterator(1, 2, 3).nextOption // Some(1)
Iterator(1, 2, 3).buffered.headOption // Some(1)
Array(1, 2, 1, 3).countElems
// Map(1→2, 2→1, 3→1)
List(1, 1, 2, 1, 7, 7, 7).runLengthEncode
// Iterator(1→2, 2→1, 1→1, 7→3)
Methods are defined in org.hammerlab.iterator and made available for convenient importing in hammerlab.iterator. By package:
Array(1, 2, 1, 3).countElems
// Map(1→2, 2→1, 3→1)
Iterator('a→1, 'b→2, 'a→10, 'c→3).countByKey
// Map('a→2, 'b→1, 'c→1)
def L[T](t: T) = Left(t)
def R[T](t: T) = Right(t)
Iterator(R('a), R('b), L(4)).findLeft
// Some(4)
Iterator(
R('a),
L( 1),
R('b),
R('c),
L( 2),
L( 3),
R('d)
)
.groupByLeft
.mapValues(_.mkString(""))
.toList
// List((1,bc), (2,""), (3,d))
.finish: run a closure when the iterator has been fully traversed:
import scala.io.Source.fromFile
val source = fromFile("build.sbt")
source
.filter(_ == 'a')
.finish({
println("closing!")
source.close()
})
.size
// 32
// prints "closing!" and closes `source` after traversal is finished
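The behaviour described above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation; `FinishSketch` and `onFinish` are hypothetical names, and the sketch assumes the callback should fire exactly once, the first time `hasNext` reports exhaustion.

```scala
object FinishSketch {
  /** Wrap `elems` so that `onFinish` runs once, when the wrapped iterator is exhausted. */
  def finish[T](elems: Iterator[T])(onFinish: => Unit): Iterator[T] =
    new Iterator[T] {
      private var finished = false

      def hasNext: Boolean = {
        val more = elems.hasNext
        // Fire the callback only on the first observation of exhaustion.
        if (!more && !finished) {
          finished = true
          onFinish
        }
        more
      }

      def next(): T = elems.next()
    }
}
```

Note that in this sketch the callback never runs if the caller abandons the iterator before reaching its end.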
.dropright: drop k elements from the end of an iterator in O(k) space:
Iterator(1 to 10: _*).dropright(4)
// Iterator(1, 2, 3, 4, 5, 6)
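One way to get the O(k) space bound is to keep a look-ahead buffer of at most k + 1 elements and emit an element only once k further elements are known to follow it. This is a sketch under that assumption, using only the standard library; `DropRightSketch` is a hypothetical name and this is not necessarily how the library does it.

```scala
import scala.collection.mutable

object DropRightSketch {
  /** Lazily yield all but the last `k` elements of `elems`, buffering at most k + 1 elements. */
  def dropRight[T](elems: Iterator[T], k: Int): Iterator[T] = {
    require(k >= 0, "k must be non-negative")
    val buffer = mutable.Queue.empty[T]

    new Iterator[T] {
      // Top up the look-ahead buffer until it holds k + 1 elements or the input runs out.
      private def fill(): Unit =
        while (buffer.size <= k && elems.hasNext) buffer.enqueue(elems.next())

      // The front element is safe to emit only if k elements follow it.
      def hasNext: Boolean = {
        fill()
        buffer.size > k
      }

      def next(): T = {
        if (!hasNext) throw new NoSuchElementException("next on empty iterator")
        buffer.dequeue()
      }
    }
  }
}
```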
Group runs of elements that satisfy a predicate or equivalence relation:
Iterator(1, 0, 2, 3, 0, 0, 4, 5, 6).groupRuns(_ > 0)
// Iterator(Iterator(1), Iterator(0), Iterator(2, 3), Iterator(0), Iterator(0), Iterator(4, 5, 6))
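Judging from the example output, consecutive elements that satisfy the predicate are grouped together, while each element that fails it forms its own singleton group. A minimal sketch of that reading, using a `BufferedIterator` for one-element look-ahead, might look as follows; `GroupRunsSketch` is a hypothetical name, and groups are materialized as `List`s here rather than nested iterators.

```scala
object GroupRunsSketch {
  /** Group maximal runs of elements satisfying `pred`; other elements become singletons. */
  def groupRuns[T](elems: Iterator[T])(pred: T => Boolean): Iterator[List[T]] = {
    val it = elems.buffered

    new Iterator[List[T]] {
      def hasNext: Boolean = it.hasNext

      def next(): List[T] = {
        val first = it.next()
        if (!pred(first)) List(first)
        else {
          val run = List.newBuilder[T]
          run += first
          // Extend the run while the upcoming element also satisfies the predicate.
          while (it.hasNext && pred(it.head)) run += it.next()
          run.result()
        }
      }
    }
  }
}
```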
Run-length encode elements:
List(1, 1, 2, 1, 7, 7, 7).runLengthEncode
// Iterator(1→2, 2→1, 1→1, 7→3)
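For readers curious how a lazy run-length encoder can be written, here is a minimal standard-library sketch. `RunLengthSketch` is a hypothetical name, and the sketch assumes plain `==` equality between neighbouring elements; it is not taken from the library's source.

```scala
object RunLengthSketch {
  /** Lazily collapse consecutive equal elements into (element, count) pairs. */
  def runLengthEncode[T](elems: Iterator[T]): Iterator[(T, Int)] = {
    val it = elems.buffered

    new Iterator[(T, Int)] {
      def hasNext: Boolean = it.hasNext

      def next(): (T, Int) = {
        val elem = it.next()
        var count = 1
        // Consume further elements while they equal the one that started this run.
        while (it.hasNext && it.head == elem) {
          it.next()
          count += 1
        }
        (elem, count)
      }
    }
  }
}
```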
Contiguous weighted sums up to a maximum:
Iterator(1 to 6: _*).cappedCostGroups(costFn = x⇒x, limit = 10)
// Iterator(Iterator(1, 2, 3, 4), Iterator(5), Iterator(6))
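The grouping rule in the example (start a new group whenever adding the next element would push the running cost above the limit) can be sketched as below. `CappedCostGroupsSketch` is a hypothetical name, and one detail is an assumption of this sketch rather than documented behaviour: an element whose own cost exceeds the limit is emitted as a group by itself.

```scala
object CappedCostGroupsSketch {
  /** Split `elems` into contiguous groups whose summed cost stays within `limit`. */
  def cappedCostGroups[T](elems: Iterator[T], limit: Int)(costFn: T => Int): Iterator[List[T]] = {
    val it = elems.buffered

    new Iterator[List[T]] {
      def hasNext: Boolean = it.hasNext

      def next(): List[T] = {
        // Assumption: the first element always joins the group, even if it alone exceeds the limit.
        val first = it.next()
        val group = List.newBuilder[T]
        group += first
        var cost = costFn(first)

        // Peek at the next element and take it only if the group stays within the limit.
        while (it.hasNext && cost + costFn(it.head) <= limit) {
          val elem = it.next()
          cost += costFn(elem)
          group += elem
        }
        group.result()
      }
    }
  }
}
```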
Flatten a nested iterator but retain access to a cursor into unflattened version:
val it1 = Iterator(1, 2)
val it2 = Iterator(3, 4)
val it = Iterator(it1, it2).level
it.cur.get == it1
// true
it.next
// 1
it.cur.get == it1
// true
it.next
// 2
it.cur.get == it2
// true
it.next
// 3
it.cur.get == it2
// true
it.next
// 4
it.cur
// None
A variety of merge operations are available for sequences that are mutually ordered (possibly with respect to some third type that each of their elements can be converted to).
Merge two ordered sequences using Eithers to preserve provenance (or handle the case that the sequences' elements are not the same type):
Seq(1, 3, 4).eitherMerge(Seq(2, 3, 5, 6))
// Iterator(L(1), R(2), L(3), R(3), L(4), R(5), R(6))
Merge two ordered sequences using Ors:
Seq(1, 3, 4).orMerge(Seq(2, 3, 5, 6))
// Iterator(L(1), R(2), Both(3, 3), L(4), R(5), R(6))
Collecting right-side elements for each left-side element:
Seq(1, 3, 4).leftMerge(Seq(2, 3, 5, 6))
// Iterator((1,Iterator(2)), (3,Iterator(3)), (4,Iterator(5, 6)))
Merge two ordered sequences of the same type into a single ordered sequence:
Seq(1, 3, 4).merge(Seq(2, 3, 5, 6))
// Iterator(1, 2, 3, 3, 4, 5, 6)
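The plain merge is the classic two-way merge of sorted inputs. A minimal lazy sketch with the standard library follows; `MergeSketch` is a hypothetical name, it relies on an implicit `Ordering` rather than the library's own type-classes, and it assumes ties are resolved in favour of the left-hand input.

```scala
object MergeSketch {
  /** Lazily merge two iterators that are each already sorted according to `ord`. */
  def merge[T](left: Iterator[T], right: Iterator[T])(implicit ord: Ordering[T]): Iterator[T] = {
    val l = left.buffered
    val r = right.buffered

    new Iterator[T] {
      def hasNext: Boolean = l.hasNext || r.hasNext

      def next(): T =
        if (!r.hasNext) l.next()        // right side exhausted (throws if both are empty)
        else if (!l.hasNext) r.next()   // left side exhausted
        else if (ord.lteq(l.head, r.head)) l.next() // ties go to the left side
        else r.next()
    }
  }
}
```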
Instances of the View type-class let merges use a type other than that of the elements being merged:
// Rank a (Symbol,Int) pair using its Int value
implicit val view = View[(Symbol, Int), Int](_._2)
Seq('a→1, 'b→3).merge('c→2)
// Iterator('a→1, 'c→2, 'b→3)
Seq('a → 1, 'b → 3).eitherMerge(2)
// Iterator(L('a→1), R(2), L('b→3))
sliceOpt, given a start and length:
(0 to 9).sliceOpt(0, 5)
// 0 to 4
(0 to 9).sliceOpt(0, 11)
// 0 to 9
(0 to 9).sliceOpt(2, 10)
// 2 to 9
(0 to 9).sliceOpt(2, 1)
// 2 to 2
Also, .joinOverlaps left-merges sequences of Ranges, sorted by start-coordinate, based on overlaps.
sample: reservoir-sample elements from an iterator:
Iterator(1 to 100: _*).sample(5)
// e.g. Array(15, 18, 55, 63, 98)
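For reference, reservoir sampling picks k elements uniformly at random from a stream of unknown length in a single pass. The sketch below uses the textbook "Algorithm R" with `scala.util.Random`; it is an illustration, not the library's code. `ReservoirSampleSketch` and the `rng` parameter are hypothetical, the result is not sorted, and the sketch assumes the input has fewer than `Int.MaxValue` elements.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object ReservoirSampleSketch {
  /** Choose up to `k` elements uniformly at random from `elems` in one pass (Algorithm R). */
  def sample[T](elems: Iterator[T], k: Int, rng: Random = new Random()): Vector[T] = {
    require(k >= 0, "k must be non-negative")
    val reservoir = ArrayBuffer.empty[T]
    var seen = 0 // number of elements consumed so far

    elems.foreach { elem =>
      seen += 1
      if (reservoir.size < k) {
        // The first k elements fill the reservoir directly.
        reservoir += elem
      } else {
        // Keep the new element with probability k / seen, evicting a uniformly chosen slot.
        val j = rng.nextInt(seen)
        if (j < k) reservoir(j) = elem
      }
    }
    reservoir.toVector
  }
}
```

Passing a seeded `Random` makes the sample reproducible, which is convenient in tests.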
- scanL, scanR
- in terms of cats.Monoid
- optionally include the final (total/sum) element instead of the initial empty/zero
import hammerlab.monoid._ // some Monoid defaults
Seq(1, 2, 3, 4).scanL
// Iterator(0, 1, 3, 6)
Seq(1, 2, 3, 4).scanLeftInclusive
// Iterator(1, 3, 6, 10)
Seq(1, 2, 3, 4).scanR
// Iterator(9, 7, 4, 0)
Seq(1, 2, 3, 4).scanRightInclusive
// Iterator(10, 9, 7, 4)
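The relationship between the four variants can be seen by comparing them with the standard library's `scanLeft` and `scanRight`, which return both the initial zero and the final total. The sketch below fixes the monoid to `Int` addition instead of using cats.Monoid, and `ScanSketch` is a hypothetical name; it only illustrates which end of the standard scan each variant drops.

```scala
object ScanSketch {
  // Standard scanLeft on Seq(1, 2, 3, 4) gives 0, 1, 3, 6, 10 (zero first, total last).
  def scanL(xs: Seq[Int]): Seq[Int] =
    xs.scanLeft(0)(_ + _).init // keep the zero, drop the total

  def scanLeftInclusive(xs: Seq[Int]): Seq[Int] =
    xs.scanLeft(0)(_ + _).tail // drop the zero, keep the total

  // Standard scanRight on Seq(1, 2, 3, 4) gives 10, 9, 7, 4, 0 (total first, zero last).
  def scanR(xs: Seq[Int]): Seq[Int] =
    xs.scanRight(0)(_ + _).tail // drop the total, keep the zero

  def scanRightInclusive(xs: Seq[Int]): Seq[Int] =
    xs.scanRight(0)(_ + _).init // keep the total, drop the zero
}
```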
Additionally, scan over values of kv-pairs:
Seq('a'→1, 'b'→2, 'c'→3, 'd'→4).scanLeftValues
// Iterator((a,0), (b,1), (c,3), (d,6))
Seq('a'→1, 'b'→2, 'c'→3, 'd'→4).scanLeftValuesInclusive
// Iterator((a,1), (b,3), (c,6), (d,10))
Seq('a'→1, 'b'→2, 'c'→3, 'd'→4).scanRightValues
// Iterator((a,9), (b,7), (c,4), (d,0))
Seq('a'→1, 'b'→2, 'c'→3, 'd'→4).scanRightValuesInclusive
// Iterator((a,10), (b,9), (c,7), (d,4))
Windows of size 2, including an optional next or previous element:
Seq(1, 2, 3).sliding2
// Iterator((1,2), (2,3))
Seq(1, 2, 3).sliding2Opt
// Iterator((1,Some(2)), (2,Some(3)), (3,None))
Seq(1, 2, 3).sliding2Prev
// Iterator((None,1), (Some(1),2), (Some(2),3))
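The three size-2 variants differ only in how the missing neighbour at the boundary is handled. As a rough illustration on strict `Seq`s (the library operates on iterators; `Sliding2Sketch` is a hypothetical name and this is not its implementation), they can be expressed with `zip`:

```scala
object Sliding2Sketch {
  /** Adjacent pairs only; the last element has no successor and starts no pair. */
  def sliding2[T](xs: Seq[T]): Seq[(T, T)] =
    xs.zip(xs.drop(1))

  /** Every element paired with its successor, if any. */
  def sliding2Opt[T](xs: Seq[T]): Seq[(T, Option[T])] = {
    val nexts: Seq[Option[T]] = xs.drop(1).map(x => Some(x): Option[T]) :+ None
    xs.zip(nexts)
  }

  /** Every element paired with its predecessor, if any. */
  def sliding2Prev[T](xs: Seq[T]): Seq[(Option[T], T)] = {
    val prevs: Seq[Option[T]] = None +: xs.map(x => Some(x): Option[T])
    prevs.zip(xs)
  }
}
```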
Windows of size 3, including 2 succeeding elements, one successor and one predecessor, or full tuples only:
Seq(1, 2, 3, 4).sliding3
// Iterator((1,2,3), (2,3,4))
Seq(1, 2, 3, 4).sliding3Opt
// Iterator((None,1,Some(2)), (Some(1),2,Some(3)), (Some(2),3,Some(4)), (Some(3),4,None))
Seq(1, 2, 3, 4).sliding3NextOpts
// Iterator((1,Some(2),Some(3)), (2,Some(3),Some(4)), (3,Some(4),None), (4,None,None))
Windows of arbitrary size, output having same number of elems as input:
Seq(1, 2, 3, 4, 5).slide(4)
// Iterator(Seq(1, 2, 3, 4), Seq(2, 3, 4, 5), Seq(3, 4, 5), Seq(4, 5), Seq(5))
- takes, drops, collects with deterministic semantics around the iterator that was operated on
- headOption, nextOption