Implement a word-frequency count over the following List collection:
val list = List("hadoop spark hive ", "", " hue spark hadoop hadoop", "hue hive hive hive", "spark hadoop hadoop")
Count how often each word occurs and sort the result by count in descending order. The expected output is:
hadoop-5
hive-4
spark-3
hue-2
val list = List("hadoop spark hive ", "", " hue spark hadoop hadoop", "hue hive hive hive", "spark hadoop hadoop")
// var m = Map[String, Int]()
// readLine.trim.split(" ").foreach(i => if (m.contains(i)) m += (i -> (m(i) + 1)) else m += (i -> 1))
// val sorted = m.toSeq.sortWith(_._2 > _._2)
// sorted.foreach(println)
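val counts = list
  .flatMap(_.split(" "))                      // 1. split each string on spaces and flatten into one List
  .filter(_.trim.nonEmpty)                    // 2. drop the empty tokens left by "" and by leading spaces
  .groupBy(identity)                          // 3. group identical words together
  .map { case (word, ws) => (word, ws.size) } // 4. replace each group with its size
  .toList                                     // 5. convert the Map to a List of (word, count) pairs
  .sortBy(-_._2)                              // 6. sort by count, descending
// 7. print each pair as word-count; foreach returns Unit, so it is kept out of the val above
counts.foreach { case (word, n) => println(s"$word-$n") }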
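The same steps can also be wrapped in a reusable function. The following is only a sketch under a few assumptions that are not in the original post: the names `WordCount` and `wordFrequencies` are hypothetical, tokens are split on any run of whitespace rather than a single space, counting uses `foldLeft` instead of `groupBy`, and ties are broken alphabetically so the order is deterministic.

```scala
object WordCount {
  // Hypothetical helper (not from the original post): returns (word, count)
  // pairs sorted by count descending, with ties broken alphabetically.
  def wordFrequencies(lines: List[String]): List[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))   // split on runs of whitespace
      .filter(_.nonEmpty)         // "" and leading whitespace produce empty tokens
      .foldLeft(Map.empty[String, Int]) { (acc, word) =>
        acc.updated(word, acc.getOrElse(word, 0) + 1) // increment this word's counter
      }
      .toList
      .sortBy { case (word, n) => (-n, word) }

  def main(args: Array[String]): Unit = {
    val list = List("hadoop spark hive ", "", " hue spark hadoop hadoop", "hue hive hive hive", "spark hadoop hadoop")
    wordFrequencies(list).foreach { case (word, n) => println(s"$word-$n") }
  }
}
```

Returning the sorted list instead of printing inside the chain keeps the result usable by callers; formatting as `word-count` is left to `main`.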