python-pandas: Reading and Writing Files

Date: 2022-11-20 16:17:39

import numpy as np
import pandas as pd
from pandas import Series, DataFrame
from numpy import nan as NA
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']

Reading CSV

# pd.read_csv() uses a comma as the default separator and takes the first row as the column names
pd.read_csv('data/ex1.csv')

   a   b   c   d message
0  1   2   3   4   hello
1  5   6   7   8   world
2  9  10  11  12     foo

# pd.read_table() can read the same file, but its default separator is a tab, so pass sep=','
pd.read_table('data/ex1.csv', sep=',')

   a   b   c   d message
0  1   2   3   4   hello
1  5   6   7   8   world
2  9  10  11  12     foo
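The equivalence of the two readers can be checked without any file on disk. The following is a minimal sketch, assuming an inline string (the hypothetical `csv_text`) stands in for data/ex1.csv:

```python
import io

import pandas as pd

# Hypothetical stand-in for data/ex1.csv, inlined so the example is self-contained.
csv_text = "a,b,c,d,message\n1,2,3,4,hello\n5,6,7,8,world\n9,10,11,12,foo\n"

# read_csv splits on commas by default.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table splits on tabs by default, so the comma has to be passed explicitly.
df_table = pd.read_table(io.StringIO(csv_text), sep=",")

print(df_csv.equals(df_table))
```

Both calls go through the same parser; only the default separator differs.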

# For a file without a header row, pass header=None so the first row is not used as column names
df2 = pd.read_csv('data/ex2.csv', header=None)
df2

   0   1   2   3      4
0  1   2   3   4  hello
1  5   6   7   8  world
2  9  10  11  12    foo

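# Read with custom column names via names=[]
# If names has fewer entries than the file has columns, the names are assigned to the rightmost columns and the remaining leading columns become the index
# Names beyond the number of columns produce all-NaN columns
df3 = pd.read_csv('data/ex2.csv', names=['aa', 'bb', 'cc', 'dd', 'message', 'nn'])
df3

   aa  bb  cc  dd message  nn
0   1   2   3   4   hello NaN
1   5   6   7   8   world NaN
2   9  10  11  12     foo NaN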
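The effect of passing too many or too few names is easier to see side by side. This is a small sketch under the assumption that the inline string `raw` (a hypothetical stand-in for data/ex2.csv) has five columns and no header row:

```python
import io

import pandas as pd

# Hypothetical stand-in for data/ex2.csv: five columns, no header row.
raw = "1,2,3,4,hello\n5,6,7,8,world\n9,10,11,12,foo\n"

# One name too many: the surplus column 'nn' is filled with NaN.
extra = pd.read_csv(io.StringIO(raw), names=["aa", "bb", "cc", "dd", "message", "nn"])

# One name too few: the names go to the rightmost columns,
# and the leftover first column is used as the row index.
fewer = pd.read_csv(io.StringIO(raw), names=["bb", "cc", "dd", "message"])

print(extra)
print(fewer)
```

If the implicit index is not wanted, supplying exactly one name per column avoids it.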

# Set the row index with index_col='<column name>'
df4 = pd.read_csv('data/ex2.csv', names=['aa', 'bb', 'cc', 'dd', 'message'], index_col='message')
df4

         aa  bb  cc  dd
message
hello     1   2   3   4
world     5   6   7   8
foo       9  10  11  12

# Use the values of several columns as a hierarchical row index
df5 = pd.read_csv('data/csv_mindex.csv', index_col=['key1', 'key2'])
df5

           value1  value2
key1 key2
one  a          1       2
     b          3       4
     c          5       6
     d          7       8
two  a          9      10
     b         11      12
     c         13      14
     d         15      16
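Once two columns form the index, rows can be selected by the outer key or by the full key pair. A minimal sketch, assuming a shortened inline copy of the data (the variable `text` is illustrative, not the original file):

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for data/csv_mindex.csv.
text = (
    "key1,key2,value1,value2\n"
    "one,a,1,2\n"
    "one,b,3,4\n"
    "two,a,9,10\n"
    "two,b,11,12\n"
)

df = pd.read_csv(io.StringIO(text), index_col=["key1", "key2"])

# Select every row under the outer key 'one'.
print(df.loc["one"])

# Select a single cell with the full (key1, key2) pair.
print(df.loc[("two", "b"), "value2"])
```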

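# For data with irregular delimiters, pass a regular expression as sep
# There is no need to import re; write the pattern as a raw string
# \s+ matches one or more whitespace characters
df6 = pd.read_csv('data/ex3.txt', sep=r'\s+')
df6

            A         B         C
aaa -0.264438 -1.026059 -0.619500
bbb  0.927272  0.302904 -0.032399
ccc -0.264273 -0.386314 -0.217601
ddd -0.871858 -0.348382  1.100491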
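The whitespace pattern can be tried on an inline string. The sketch below assumes the file looks like the printed output above, with a header row that has one field fewer than the data rows; in that case pandas uses the first data column as the index:

```python
import io

import pandas as pd

# Assumed layout of data/ex3.txt: columns separated by a varying number of spaces,
# and a header with three names for four data fields.
text = (
    "            A         B         C\n"
    "aaa -0.264438 -1.026059 -0.619500\n"
    "bbb  0.927272  0.302904 -0.032399\n"
    "ccc -0.264273 -0.386314 -0.217601\n"
    "ddd -0.871858 -0.348382  1.100491\n"
)

# A raw string keeps the backslash intact and avoids an invalid-escape warning.
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

print(df)
```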

# Skip specific lines (by 0-based line number) with skiprows=[]
df7 = pd.read_csv('data/ex4.csv', skiprows=[0, 2, 3])
df7

   a   b   c   d message
0  1   2   3   4   hello
1  5   6   7   8   world
2  9  10  11  12     foo

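# comment='#' treats everything from # to the end of the line as a comment, so lines starting with # are not read
pd.read_csv('data/ex4.csv', comment='#')

   a   b   c   d message
0  1   2   3   4   hello
1  5   6   7   8   world
2  9  10  11  12     foo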
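Both approaches give the same frame when the unwanted lines are comment lines. A minimal sketch, assuming an inline string with three `#` lines at positions 0, 2 and 3 (the comment text is made up for illustration):

```python
import io

import pandas as pd

# Hypothetical stand-in for data/ex4.csv; lines 0, 2 and 3 are comments.
text = (
    "# first note\n"
    "a,b,c,d,message\n"
    "# second note\n"
    "# third note\n"
    "1,2,3,4,hello\n"
    "5,6,7,8,world\n"
    "9,10,11,12,foo\n"
)

# Skip the comment lines by their 0-based line numbers.
by_rows = pd.read_csv(io.StringIO(text), skiprows=[0, 2, 3])

# Or let the parser drop anything that starts with '#'.
by_comment = pd.read_csv(io.StringIO(text), comment="#")

print(by_rows.equals(by_comment))
```

`comment` is the more robust choice when the position of the notes is not known in advance; `skiprows` also works for lines that carry no marker.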

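# Treat additional values as NaN with na_values=[]; empty fields and strings such as NA are already read as NaN by default
df8 = pd.read_csv('data/ex5.csv', na_values=[1, 2, 'world'])
df8

  something    a     b     c   d message
0       one  NaN   NaN   3.0   4     NaN
1       two  5.0   6.0   NaN   8     NaN
2     three  9.0  10.0  11.0  12     foo
3       foo  NaN   NaN  11.0  12     two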

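# Use different NaN markers for different columns
# Pass a dict that maps each column to one value or a list of values
nans = {'something': 'two', 'message': ['foo', 'world']}
df9 = pd.read_csv('data/ex5.csv', na_values=nans)
df9

  something  a   b     c   d message
0       one  1   2   3.0   4     NaN
1       NaN  5   6   NaN   8     NaN
2     three  9  10  11.0  12     NaN
3       foo  1   2  11.0  12     two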
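The per-column behaviour can be verified on inline data. This sketch assumes a three-row excerpt shaped like data/ex5.csv (the string `text` and the name `sentinels` are illustrative):

```python
import io

import pandas as pd

# Hypothetical excerpt shaped like data/ex5.csv.
text = (
    "something,a,b,c,d,message\n"
    "one,1,2,3,4,NA\n"
    "two,5,6,,8,world\n"
    "three,9,10,11,12,foo\n"
)

# Each key is a column name; each value lists the strings to read as NaN in that column only.
sentinels = {"something": ["two"], "message": ["foo", "world"]}
df = pd.read_csv(io.StringIO(text), na_values=sentinels)

# Count missing values per column.
print(df.isna().sum())
```

The default markers (empty field, NA) still apply to every column, which is why column c keeps its NaN even though it is not listed in the dict.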

Writing CSV

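# Build a mock score table with 50 students; scores may repeat across subjects
# Columns: 语文 Chinese, 数学 Math, 英语 English, 化学 Chemistry, 物理 Physics, 生物 Biology, 政治 Politics, 历史 History, 地理 Geography
names = '尺,寸,人,下,匕,卜,之,田,丫,乃,贝,井,工,几,女,巨,爪,火,了,方,木,中,寸,石,户,友,夫,不,可,主,又,丑,巾,口,电,门,术,儿,羊,丁,心,天,化,气,正,页,兄,伏,大,计'.split(',')
df10 = DataFrame({
    '语文': np.random.randint(90, 100, 50),
    '数学': np.random.randint(80, 100, 50),
    '英语': np.random.randint(60, 100, 50),
    '化学': np.random.randint(60, 100, 50),
    '物理': np.random.randint(60, 100, 50),
    '生物': np.random.randint(60, 100, 50),
    '政治': np.random.randint(60, 100, 50),
    '历史': np.random.randint(60, 100, 50),
    '地理': np.random.randint(60, 100, 50),
},
    # Row labels: a random surname plus a given-name character drawn without replacement
    index=[np.random.choice(list('赵钱孙李周吴郑王')) + names.pop(np.random.randint(0, len(names))) for i in range(50)])
df10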

      语文  数学  英语  化学  物理  生物  政治  历史  地理
赵大    97    90    68    86    73    94    70    70    72
孙兄    91    99    86    89    68    96    70    82    60
钱术    95    87    87    67    99    88    85    65    95
吴羊    92    98    94    64    96    82    65    85    69
王女    91    84    84    88    92    75    82    97    91
王卜    94    89    90    99    95    99    67    82    82
王可    94    90    76    68    93    82    93    99    83
钱巾    97    85    92    81    71    85    90    60    65
李丁    90    94    86    71    90    98    60    87    75
吴丫    91    82    70    70    86    68    61    94    99
吴丑    91    97    78    79    82    95    83    80    92
钱火    97    97    72    99    66    67    80    80    73
王石    96    83    69    66    90    60    86    93    92
李人    91    95    60    96    99    94    69    64    89
郑正    90    84    82    62    67    82    88    71    65
王乃    93    96    60    79    86    95    89    61    65
李寸    94    91    62    71    77    88    61    94    87
郑之    96    97    96    74    70    70    72    78    73
钱口    96    80    98    69    70    66    73    70    69
孙寸    96    81    70    97    74    92    64    94    87
吴儿    98    85    71    96    72    83    61    97    98
李木    94    82    69    71    88    92    93    92    76
赵主    95    80    83    80    63    94    76    67    75
钱巨    93    96    98    72    80    84    86    98    96
郑电    99    95    63    97    98    92    60    90    79
赵工    99    86    60    83    86    97    80    98    80
李方    91    83    85    75    76    85    74    87    76
李门    90    89    64    76    75    89    75    76    72
李几    96    82    70    88    79    96    60    85    61
赵夫    94    94    68    98    95    85    65    98    63
吴井    92    89    86    79    78    68    98    78    73
郑天    91    85    97    84    64    93    90    97    93
孙计    96    84    74    91    85    64    69    64    66
郑尺    99    84    93    88    99    87    76    80    85
孙友    98    95    75    85    94    89    69    74    82
郑中    96    81    88    66    68    71    88    83    63
赵页    98    97    63    66    71    98    97    63    60
郑户    91    88    68    73    76    81    87    87    89
王了    95    83    64    94    89    71    96    85    76
赵爪    92    92    88    83    81    98    81    97    63
吴又    99    99    71    61    90    60    65    74    77
钱化    93    92    63    82    86    82    88    96    88
吴不    94    83    75    90    71    91    84    99    83
吴伏    91    84    66    93    62    88    78    81    68
王贝    92    93    75    86    70    84    84    84    64
李气    97    99    72    60    69    68    67    74    75
李田    91    90    84    71    79    80    91    65    66
李下    98    81    65    92    72    65    72    64    77
李匕    90    82    96    99    87    88    88    64    95
李心    99    97    87    72    71    63    98    75    78

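# Save; both the row index and the column names are written
df10.to_csv('成绩.csv', encoding='utf-8')
# index=False omits the row index
df10.to_csv('成绩1.csv', index=False)
# header=False omits the column names
df10.to_csv('成绩2.csv', header=False)
# na_rep='...' writes NaN as the given string; the default is an empty string
df10.iloc[0, 3] = NA
df10.to_csv('成绩3.csv', na_rep='空')
# Preview the result by writing to standard output instead of a file
import sys
# Preview with a vertical bar as the separator
df10.to_csv(sys.stdout, sep='|', na_rep='空空')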

|语文|数学|英语|化学|物理|生物|政治|历史|地理
赵大|97|90|68|空空|73|94|70|70|72
孙兄|91|99|86|89.0|68|96|70|82|60
钱术|95|87|87|67.0|99|88|85|65|95
吴羊|92|98|94|64.0|96|82|65|85|69
王女|91|84|84|88.0|92|75|82|97|91
王卜|94|89|90|99.0|95|99|67|82|82
王可|94|90|76|68.0|93|82|93|99|83
钱巾|97|85|92|81.0|71|85|90|60|65
李丁|90|94|86|71.0|90|98|60|87|75
吴丫|91|82|70|70.0|86|68|61|94|99
吴丑|91|97|78|79.0|82|95|83|80|92
钱火|97|97|72|99.0|66|67|80|80|73
王石|96|83|69|66.0|90|60|86|93|92
李人|91|95|60|96.0|99|94|69|64|89
郑正|90|84|82|62.0|67|82|88|71|65
王乃|93|96|60|79.0|86|95|89|61|65
李寸|94|91|62|71.0|77|88|61|94|87
郑之|96|97|96|74.0|70|70|72|78|73
钱口|96|80|98|69.0|70|66|73|70|69
孙寸|96|81|70|97.0|74|92|64|94|87
吴儿|98|85|71|96.0|72|83|61|97|98
李木|94|82|69|71.0|88|92|93|92|76
赵主|95|80|83|80.0|63|94|76|67|75
钱巨|93|96|98|72.0|80|84|86|98|96
郑电|99|95|63|97.0|98|92|60|90|79
赵工|99|86|60|83.0|86|97|80|98|80
李方|91|83|85|75.0|76|85|74|87|76
李门|90|89|64|76.0|75|89|75|76|72
李几|96|82|70|88.0|79|96|60|85|61
赵夫|94|94|68|98.0|95|85|65|98|63
吴井|92|89|86|79.0|78|68|98|78|73
郑天|91|85|97|84.0|64|93|90|97|93
孙计|96|84|74|91.0|85|64|69|64|66
郑尺|99|84|93|88.0|99|87|76|80|85
孙友|98|95|75|85.0|94|89|69|74|82
郑中|96|81|88|66.0|68|71|88|83|63
赵页|98|97|63|66.0|71|98|97|63|60
郑户|91|88|68|73.0|76|81|87|87|89
王了|95|83|64|94.0|89|71|96|85|76
赵爪|92|92|88|83.0|81|98|81|97|63
吴又|99|99|71|61.0|90|60|65|74|77
钱化|93|92|63|82.0|86|82|88|96|88
吴不|94|83|75|90.0|71|91|84|99|83
吴伏|91|84|66|93.0|62|88|78|81|68
王贝|92|93|75|86.0|70|84|84|84|64
李气|97|99|72|60.0|69|68|67|74|75
李田|91|90|84|71.0|79|80|91|65|66
李下|98|81|65|92.0|72|65|72|64|77
李匕|90|82|96|99.0|87|88|88|64|95
李心|99|97|87|72.0|71|63|98|75|78

# print() can likewise write to any file object through its file argument
with open('123.txt', 'w') as f:
    print(123, file=f)
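The to_csv options can be compared without touching the disk, because to_csv returns the CSV text as a string when no path is given. A minimal sketch on a tiny, made-up frame (the names `df`, `full`, `no_index`, `no_header` and `piped` are illustrative):

```python
import numpy as np
import pandas as pd

# Small illustrative frame with one missing value.
df = pd.DataFrame({"x": [1.5, np.nan], "y": [3, 4]}, index=["r1", "r2"])

full = df.to_csv()                          # row index and column names included
no_index = df.to_csv(index=False)           # drop the row index
no_header = df.to_csv(header=False)         # drop the column names
piped = df.to_csv(sep="|", na_rep="NULL")   # custom separator and NaN marker

for label, text in [("full", full), ("no_index", no_index),
                    ("no_header", no_header), ("piped", piped)]:
    print(label)
    print(text)
```

Note that NaN is written as an empty field unless na_rep is set, which is why the second row of `full` ends in `,,4`.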

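# Read back the score CSV we saved. Its first column (the names, i.e. the saved row index) has no header, so pandas labels it 'Unnamed: 0'
# Read without index_col, it stays an ordinary column next to a default integer index, as in the output shown here; pass index_col='Unnamed: 0' to restore the names as the row index
s1 = pd.read_csv('成绩.csv')
s1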

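    Unnamed: 0  语文  数学  英语  化学  物理  生物  政治  历史  地理
 0        赵大    97    90    68    86    73    94    70    70    72
 1        孙兄    91    99    86    89    68    96    70    82    60
 2        钱术    95    87    87    67    99    88    85    65    95
 3        吴羊    92    98    94    64    96    82    65    85    69
 4        王女    91    84    84    88    92    75    82    97    91
 5        王卜    94    89    90    99    95    99    67    82    82
 6        王可    94    90    76    68    93    82    93    99    83
 7        钱巾    97    85    92    81    71    85    90    60    65
 8        李丁    90    94    86    71    90    98    60    87    75
 9        吴丫    91    82    70    70    86    68    61    94    99
10        吴丑    91    97    78    79    82    95    83    80    92
11        钱火    97    97    72    99    66    67    80    80    73
12        王石    96    83    69    66    90    60    86    93    92
13        李人    91    95    60    96    99    94    69    64    89
14        郑正    90    84    82    62    67    82    88    71    65
15        王乃    93    96    60    79    86    95    89    61    65
16        李寸    94    91    62    71    77    88    61    94    87
17        郑之    96    97    96    74    70    70    72    78    73
18        钱口    96    80    98    69    70    66    73    70    69
19        孙寸    96    81    70    97    74    92    64    94    87
20        吴儿    98    85    71    96    72    83    61    97    98
21        李木    94    82    69    71    88    92    93    92    76
22        赵主    95    80    83    80    63    94    76    67    75
23        钱巨    93    96    98    72    80    84    86    98    96
24        郑电    99    95    63    97    98    92    60    90    79
25        赵工    99    86    60    83    86    97    80    98    80
26        李方    91    83    85    75    76    85    74    87    76
27        李门    90    89    64    76    75    89    75    76    72
28        李几    96    82    70    88    79    96    60    85    61
29        赵夫    94    94    68    98    95    85    65    98    63
30        吴井    92    89    86    79    78    68    98    78    73
31        郑天    91    85    97    84    64    93    90    97    93
32        孙计    96    84    74    91    85    64    69    64    66
33        郑尺    99    84    93    88    99    87    76    80    85
34        孙友    98    95    75    85    94    89    69    74    82
35        郑中    96    81    88    66    68    71    88    83    63
36        赵页    98    97    63    66    71    98    97    63    60
37        郑户    91    88    68    73    76    81    87    87    89
38        王了    95    83    64    94    89    71    96    85    76
39        赵爪    92    92    88    83    81    98    81    97    63
40        吴又    99    99    71    61    90    60    65    74    77
41        钱化    93    92    63    82    86    82    88    96    88
42        吴不    94    83    75    90    71    91    84    99    83
43        吴伏    91    84    66    93    62    88    78    81    68
44        王贝    92    93    75    86    70    84    84    84    64
45        李气    97    99    72    60    69    68    67    74    75
46        李田    91    90    84    71    79    80    91    65    66
47        李下    98    81    65    92    72    65    72    64    77
48        李匕    90    82    96    99    87    88    88    64    95
49        李心    99    97    87    72    71    63    98    75    78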
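Where the 'Unnamed: 0' label comes from, and how index_col removes it, can be seen in a small round trip through an in-memory buffer. This is a sketch with made-up data (the frame `scores` and its labels are illustrative):

```python
import io

import pandas as pd

# Small illustrative frame whose row labels should survive a CSV round trip.
scores = pd.DataFrame({"math": [90, 85], "english": [70, 95]}, index=["Ann", "Bob"])

buf = io.StringIO()
scores.to_csv(buf)  # the index is written as a first column with an empty header

# Without index_col the labels come back as a regular column named 'Unnamed: 0'.
buf.seek(0)
plain = pd.read_csv(buf)

# With index_col=0 the first column becomes the row index again.
buf.seek(0)
restored = pd.read_csv(buf, index_col=0)

print(plain)
print(restored)
```

Giving the index a name before saving (for example through `scores.index.name`) is another way to avoid the placeholder label.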

Excel

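# Writing .xlsx files requires openpyxl to be installed
s1.to_excel('成绩.xlsx')
# startrow=1 starts writing at the second worksheet row; no encoding argument is needed (to_excel dropped it in pandas 2.0)
s1.to_excel('成绩1.xlsx', startrow=1)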

# Read Excel; the parameters mirror those of read_csv, with the addition of sheet_name
pd.read_excel('成绩.xlsx')

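    Unnamed: 0  Unnamed: 0.1  语文  数学  英语  化学  物理  生物  政治  历史  地理
 0           0          赵大    97    90    68    86    73    94    70    70    72
 1           1          孙兄    91    99    86    89    68    96    70    82    60
 2           2          钱术    95    87    87    67    99    88    85    65    95
 3           3          吴羊    92    98    94    64    96    82    65    85    69
 4           4          王女    91    84    84    88    92    75    82    97    91
 5           5          王卜    94    89    90    99    95    99    67    82    82
 6           6          王可    94    90    76    68    93    82    93    99    83
 7           7          钱巾    97    85    92    81    71    85    90    60    65
 8           8          李丁    90    94    86    71    90    98    60    87    75
 9           9          吴丫    91    82    70    70    86    68    61    94    99
10          10          吴丑    91    97    78    79    82    95    83    80    92
11          11          钱火    97    97    72    99    66    67    80    80    73
12          12          王石    96    83    69    66    90    60    86    93    92
13          13          李人    91    95    60    96    99    94    69    64    89
14          14          郑正    90    84    82    62    67    82    88    71    65
15          15          王乃    93    96    60    79    86    95    89    61    65
16          16          李寸    94    91    62    71    77    88    61    94    87
17          17          郑之    96    97    96    74    70    70    72    78    73
18          18          钱口    96    80    98    69    70    66    73    70    69
19          19          孙寸    96    81    70    97    74    92    64    94    87
20          20          吴儿    98    85    71    96    72    83    61    97    98
21          21          李木    94    82    69    71    88    92    93    92    76
22          22          赵主    95    80    83    80    63    94    76    67    75
23          23          钱巨    93    96    98    72    80    84    86    98    96
24          24          郑电    99    95    63    97    98    92    60    90    79
25          25          赵工    99    86    60    83    86    97    80    98    80
26          26          李方    91    83    85    75    76    85    74    87    76
27          27          李门    90    89    64    76    75    89    75    76    72
28          28          李几    96    82    70    88    79    96    60    85    61
29          29          赵夫    94    94    68    98    95    85    65    98    63
30          30          吴井    92    89    86    79    78    68    98    78    73
31          31          郑天    91    85    97    84    64    93    90    97    93
32          32          孙计    96    84    74    91    85    64    69    64    66
33          33          郑尺    99    84    93    88    99    87    76    80    85
34          34          孙友    98    95    75    85    94    89    69    74    82
35          35          郑中    96    81    88    66    68    71    88    83    63
36          36          赵页    98    97    63    66    71    98    97    63    60
37          37          郑户    91    88    68    73    76    81    87    87    89
38          38          王了    95    83    64    94    89    71    96    85    76
39          39          赵爪    92    92    88    83    81    98    81    97    63
40          40          吴又    99    99    71    61    90    60    65    74    77
41          41          钱化    93    92    63    82    86    82    88    96    88
42          42          吴不    94    83    75    90    71    91    84    99    83
43          43          吴伏    91    84    66    93    62    88    78    81    68
44          44          王贝    92    93    75    86    70    84    84    84    64
45          45          李气    97    99    72    60    69    68    67    74    75
46          46          李田    91    90    84    71    79    80    91    65    66
47          47          李下    98    81    65    92    72    65    72    64    77
48          48          李匕    90    82    96    99    87    88    88    64    95
49          49          李心    99    97    87    72    71    63    98    75    78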
