-
qubole/spark-state-store 1.0.0
RocksDB state storage implementation for Structured Streaming.
Scala versions: 2.11
-
derrickburns/generalized-kmeans-clustering 1.2.2
Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high-dimensional data, and very large data sets.
Scala versions: 2.10
-
data-tools/big-data-types 1.3.4
A library to transform Scala product types and schemas from different systems into other schemas. Any implemented type automatically gets methods to convert it into each of the other types and vice versa. For example, a Spark schema can be transformed into a BigQuery table.
Scala versions: 3.x 2.13 2.12
-
phymbert/spark-search 0.2.0
Spark Search: high-performance advanced search features based on Apache Lucene
Scala versions: 2.12 2.11
-
tupol/spark-tools 0.4.1
Executable Apache Spark Tools: Format Converter & SQL Processor
Scala versions: 2.12 2.11
-
anskarl/parsimonious 0.5.1
Parsimonious is a helper library for encoding/decoding Apache Thrift and Twitter Scrooge classes to Spark DataFrames and Jackson JSON.
Scala versions: 2.13 2.12
-
housepower/clickhouse-native-jdbc 2.7.1
ClickHouse Native Protocol JDBC implementation
Scala versions: 2.12 2.11
-
databeans/lighthouse 0.1.0
Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations should be performed.
Scala versions: 2.12
-
coxautomotivedatasolutions/vegalite4s 0.4
Vega-Lite4s is a small library over the comprehensive Vega-Lite JavaScript visualisation library, allowing you to create beautiful Vega-Lite visualisations in Scala.
Scala versions: 2.12 2.11
-
sadikovi/spark-netflow 2.1.0
NetFlow data source for Spark SQL and DataFrames
Scala versions: 2.12
-
izhangzhihao/sbt-spark-submit 0.0.5
sbt plugin for spark-submit
-
keks51/spark_plan_as_uml 1.0.0
Visualizes a Spark plan as a UML diagram
Scala versions: 2.12 2.11