Computing Pi and word count with Spark

2019-11-07 10:31:25

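Method: the Monte Carlo method, also known as random sampling or statistical simulation

Steps

1. Construct a square with side length 1 and a quarter circle of radius 1 centered on one corner of the square (the square's area, 1, is larger than the quarter circle's area, π/4)

2. Pick n random points inside the square and compute each point's distance to the circle's center; points whose distance is less than 1 lie inside the circle. Call the number of such points count

3. count/n approximates the area ratio π/4, so 4*count/n approximates π; the Pi example that ships with Spark uses this method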
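The three steps above can be tried without a cluster first. The following is a minimal sketch in plain Scala, assuming a single JVM and a seeded generator so runs are repeatable; the names `MonteCarloPi`, `estimatePi` and `seed` are made up for illustration and are not part of Spark. It compares the squared distance with 1, which is equivalent to comparing the distance itself.

```scala
import scala.util.Random

object MonteCarloPi {
  // Estimate π from n uniform points in the unit square [0, 1) x [0, 1).
  // `estimatePi` and `seed` are illustrative names, not a Spark API.
  def estimatePi(n: Int, seed: Long): Double = {
    val rng = new Random(seed)
    // Count the points whose squared distance to the origin
    // (the center of the quarter circle) is below 1.
    val count = (1 to n).count { _ =>
      val x = rng.nextDouble()
      val y = rng.nextDouble()
      x * x + y * y < 1
    }
    // count / n approximates π/4, so scale by 4.
    4.0 * count / n
  }

  def main(args: Array[String]): Unit = {
    println(estimatePi(600000, seed = 42L))
  }
}
```

The estimate converges slowly: the error shrinks roughly in proportion to 1/sqrt(n), which is why distributing a large n across partitions is a natural fit for Spark.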

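import org.apache.spark.sql.SparkSession
import scala.math.random

// Run Spark locally in a single JVM
val sparkSession = SparkSession.builder().master("local").getOrCreate()
val sc = sparkSession.sparkContext
val slices = 6      // number of partitions
val n = 600000      // total number of random points
val count = sc.parallelize(1 to n, slices).map { i =>
    // Sample a point from the square [-1, 1) x [-1, 1);
    // the circle-to-square area ratio is still π/4
    val x = random * 2 - 1
    val y = random * 2 - 1
    // 1 if the point falls inside the unit circle, otherwise 0
    if (x*x + y*y < 1) 1 else 0
   }.reduce(_ + _)
val pi = 4.0 * count / n
println(pi)
sparkSession.stop()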

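import org.apache.spark.{SparkConf, SparkContext}

object WordCount {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount")
    val sc = new SparkContext(conf)
    val input = sc.textFile("/data/spark/demo/word_count")
    // Split each line on spaces to get individual words
    val words = input.flatMap(line => line.split(" "))
    // Pair each word with 1, then sum the counts per word
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // The output directory must not already exist
    counts.saveAsTextFile("/data/spark/demo/word_count_result")
    sc.stop()
  }
}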

File - Project Structure - Artifacts - "+" - Jar - from modules
Build - Build Artifacts

./bin/spark-submit \
  --master spark://localhost:9000 \
  --class WordCount /data/spark/demo/jar/spark-demo.jar

References

https://www.cnblogs.com/aze-003/p/5127192.html
