A TeraSort-like benchmark for Spark 2.x that uses DataFrames, saves files in Parquet, etc., for more realistic testing.