
PySpark: Converting a DataFrame to an RDD

Date: 2019-07-08 09:21:36


# -*- coding: utf-8 -*-
from __future__ import print_function

from pyspark.sql import SparkSession
from pyspark.sql import Row

if __name__ == "__main__":
    # Initialize the SparkSession
    spark = SparkSession \
        .builder \
        .appName("RDD_and_DataFrame") \
        .config("spark.some.config.option", "some-value") \
        .getOrCreate()

    sc = spark.sparkContext

    # Read the text file; each line is expected to look like "name,salary"
    lines = sc.textFile("employee.txt")
    parts = lines.map(lambda l: l.split(","))
    employee = parts.map(lambda p: Row(name=p[0], salary=int(p[1])))

    # Convert the RDD to a DataFrame
    employee_temp = spark.createDataFrame(employee)

    # Show the DataFrame contents
    employee_temp.show()

    # Register the DataFrame as a temporary view
    employee_temp.createOrReplaceTempView("employee")

    # Filter the data with SQL
    employee_result = spark.sql("SELECT name,salary FROM employee WHERE salary >= 14000 AND salary <= 20000")

    # Convert the DataFrame back to an RDD and format each Row as a string
    result = employee_result.rdd.map(lambda p: "name: " + p.name + " salary: " + str(p.salary)).collect()

    # Print the RDD data
    for n in result:
        print(n)
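The per-record logic in the script above (parse a line, keep salaries between 14000 and 20000, format the row) can be tried without a Spark cluster. The following is a minimal sketch, not part of the original script: the function names `parse_employee`, `in_salary_range` and `format_employee`, the `Employee` namedtuple (a stand-in for `pyspark.sql.Row`), and the sample lines are all assumptions made for illustration.

```python
from collections import namedtuple

# Assumption: a namedtuple stands in for pyspark.sql.Row so this runs without Spark.
Employee = namedtuple("Employee", ["name", "salary"])


def parse_employee(line):
    """Split one 'name,salary' line into a record, like the two map() calls."""
    parts = line.split(",")
    return Employee(name=parts[0], salary=int(parts[1]))


def in_salary_range(emp, low=14000, high=20000):
    """Mirror the SQL filter: salary >= 14000 AND salary <= 20000."""
    return low <= emp.salary <= high


def format_employee(emp):
    """Mirror the lambda applied to employee_result.rdd."""
    return "name: " + emp.name + " salary: " + str(emp.salary)


# Hypothetical sample data; the real script reads employee.txt instead.
lines = ["Alice,12000", "Bob,15000", "Carol,20000", "Dave,25000"]

result = [
    format_employee(emp)
    for emp in map(parse_employee, lines)
    if in_salary_range(emp)
]

for n in result:
    print(n)
```

Because a `Row` returned by `employee_result.rdd` exposes `name` and `salary` as attributes, a named function such as `format_employee` could likely be passed to `.rdd.map(...)` in place of the lambda, which makes the formatting step easier to check in isolation.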
