Python Pandas to_pickle cannot pickle large dataframes
I have a dataframe "df" with 500,000 rows. Here are the data types per column:

id      int64
time    datetime64[ns]
data    object

Each entry in the "data" column is an array of size [5,500].
When I try to save the dataframe using

df.to_pickle("my_filename.pkl")

it returns the following error:

     12 """
     13 with open(path, 'wb') as f:
---> 14     pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)

OSError: [Errno 22] Invalid argument
I also tried this method and got the same error:

import pickle
with open('my_filename.pkl', 'wb') as f:
    pickle.dump(df, f)
I tried saving only 10 rows of the dataframe:

df.head(10).to_pickle('test_save.pkl')

and got no error at all. So I can save a small df, but not a large one.
I am using Python 3 with IPython Notebook 3 on a Mac.

Please help me solve this problem. I really need to save my df to a pickle file, and I cannot find a solution on the internet.
Probably not the answer you were hoping for, but this is what I did:
Split the dataframe into smaller chunks using np.array_split (although numpy functions are not guaranteed to work on dataframes, it does work now, although there used to be a bug in it).

Then pickle the smaller dataframes.

When you unpickle them, use pandas.append or pandas.concat to glue everything back together.
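A minimal sketch of the steps above; the example dataframe, the chunk count of 4, and the file naming scheme are all illustrative, not from the original question. In practice you would pick enough chunks that each pickle file stays well under the size at which the write fails.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the large dataframe (the real one had id/time/data columns).
df = pd.DataFrame({"id": range(100),
                   "data": [np.zeros(5) for _ in range(100)]})

# Step 1: split into smaller chunks; np.array_split splits along the rows.
n_chunks = 4
chunks = np.array_split(df, n_chunks)

# Step 2: pickle each chunk to its own file.
out_dir = tempfile.mkdtemp()
for i, chunk in enumerate(chunks):
    chunk.to_pickle(os.path.join(out_dir, f"my_filename_{i}.pkl"))

# Step 3: later, read the pieces back and glue them together with pd.concat.
parts = [pd.read_pickle(os.path.join(out_dir, f"my_filename_{i}.pkl"))
         for i in range(n_chunks)]
df_restored = pd.concat(parts)
```

Because np.array_split preserves row order and pd.concat keeps the original index, the reassembled dataframe matches the one you started with.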
I agree it is a fudge and suboptimal. If anyone can suggest a "proper" answer I'd be interested in seeing it, but I think simple dataframes are not supposed to get above a certain size.