@羲凡 - Just to live better
Converting between Spark RDD and DataFrame
Q: Which operator converts an RDD to a DataFrame in Spark?
A: .toDF
Q: Which operator converts a DataFrame to an RDD in Spark?
A: .rdd
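In other words, assuming a SparkSession named `spark` and an RDD of tuples are already in scope (the names here are illustrative, not from the full example below), the two directions reduce to:

```scala
// Minimal sketch, assuming `spark: SparkSession` and `rdd: RDD[(String, Int)]` exist
import spark.implicits._                 // required for .toDF on RDDs of tuples/case classes

val df = rdd.toDF("name", "age")         // RDD -> DataFrame, naming the columns
val backToRdd = df.rdd                   // DataFrame -> RDD[Row]
```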
1.Straight to the code
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
import org.apache.spark.sql.{DataFrame, SparkSession}

object Demo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("Demo").master("local[*]").getOrCreate()
    val file = "file:///D:\\data\\test.txt"
    val schema = StructType(Array(
      StructField("name", StringType),
      StructField("age", IntegerType)))

    // DataFrame to RDD
    println("=========DataFrame to RDD=========")
    val dataFrame: DataFrame = spark.read.option("delimiter", ",").schema(schema).csv(file)
    dataFrame.show()
    val rdd: RDD[(String, Int)] = dataFrame.rdd.map(t => (t.getAs[String](0), t.getAs[Int](1)))
    println("===Result after conversion===")
    rdd.foreach(println)

    // RDD to DataFrame
    println("=========RDD to DataFrame=========")
    import spark.implicits._
    val rdd2: RDD[(String, String)] = spark.sparkContext.textFile(file).map(t => {
      val arr = t.split(",")
      (arr(0), arr(1))
    })
    rdd2.foreach(println)
    val dataFrame2: DataFrame = rdd2.toDF("name", "age")
    println("===Result after conversion===")
    dataFrame2.show()

    spark.stop()
  }
}
2.Results
=========DataFrame to RDD=========
+----+---+
|name|age|
+----+---+
| 扎克|227|
| 赵信|200|
| 魔腾|188|
+----+---+
===Result after conversion===
(扎克,227)
(赵信,200)
(魔腾,188)
=========RDD to DataFrame=========
(魔腾,188)
(扎克,227)
(赵信,200)
===Result after conversion===
+----+---+
|name|age|
+----+---+
| 扎克|227|
| 赵信|200|
| 魔腾|188|
+----+---+
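Note that `rdd2.toDF` above yields two String columns, since the RDD holds `(String, String)` tuples. When the column types matter, an RDD of `Row` objects can instead be passed to `spark.createDataFrame` together with an explicit schema. A sketch of this alternative, assuming the same `spark`, `file`, and schema as in the code above:

```scala
// Alternative sketch (assumption): build the DataFrame from an RDD[Row] plus a schema,
// so that age comes out as IntegerType rather than StringType.
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val schema = StructType(Array(
  StructField("name", StringType),
  StructField("age", IntegerType)))

val rowRdd = spark.sparkContext.textFile(file).map { line =>
  val arr = line.split(",")
  Row(arr(0), arr(1).toInt)    // cast age to Int so it matches IntegerType in the schema
}
val df = spark.createDataFrame(rowRdd, schema)
```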
====================================================================
@羲凡 - Just to live better
If you have any questions about this post, feel free to leave a comment.