Standard Deviation in one Action [Spark Scala]


#1

// definiton of stdev = sqrt( E[X^2] - E[X]^2)

var rdd = sc.parallelize(Array(2, 3, 5, 6))
var rdd_x2_count = rdd.map(x=> (x, x*x, 1))
var (sum, sumsq, count) = rdd_x2_count.reduce((x, y) => (x._1 + y._1, x._2 + y._2, x._3 + y._3))

import math._;
var stdev = sqrt((sumsq.toDouble/count.toDouble) - pow(sum.toDouble/count.toDouble,2))