Python Pandas to_pickle cannot pickle large dataframes


I have a dataframe "df" with 500,000 rows. Here are the data types per column:

id      int64
time    datetime64[ns]
data    object

Each entry in the "data" column is an array of size [5, 500].

When I try to save the dataframe using

df.to_pickle("my_filename.pkl") 

it returns the following error:

     12     """
     13     with open(path, 'wb') as f:
---> 14         pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)

OSError: [Errno 22] Invalid argument

I also tried this method, with the same error:

import pickle

with open('my_filename.pkl', 'wb') as f:
    pickle.dump(df, f)

I tried to save only 10 rows of the dataframe:

df.head(10).to_pickle('test_save.pkl') 

and got no error at all. So I can save a small df but not a large df.

I am using Python 3 and IPython Notebook 3 on a Mac.

Please help me solve this problem. I need to save the df to a pickle file and cannot find a solution on the internet.

This is probably not the answer you were hoping for, but here is what I did......

Split the dataframe into smaller chunks using np.array_split (although NumPy functions are not guaranteed to work on dataframes, this one does now, although there used to be a bug in it).

Then pickle the smaller dataframes.

When you unpickle them, use pd.concat (or DataFrame.append) to glue them back together.
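The steps above can be sketched as follows. This is a minimal, hedged example on a small stand-in dataframe: the filenames, chunk count, and use of a temporary directory are illustrative choices, and plain iloc slicing is used for the split so the sketch does not depend on NumPy-on-DataFrame behavior (np.array_split(df, n) works the same way where supported):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Small stand-in for the real 500,000-row dataframe (same column layout).
df = pd.DataFrame({
    "id": np.arange(100, dtype="int64"),
    "time": pd.date_range("2015-01-01", periods=100, freq="h"),
    "data": [np.zeros((5, 500)) for _ in range(100)],
})

n_chunks = 4                          # pick enough chunks that each pickle
chunk_size = -(-len(df) // n_chunks)  # stays small; ceiling division

with tempfile.TemporaryDirectory() as tmp:
    # 1. Split the dataframe into smaller pieces and pickle each one.
    paths = []
    for i, start in enumerate(range(0, len(df), chunk_size)):
        path = os.path.join(tmp, f"my_filename_{i}.pkl")  # hypothetical names
        df.iloc[start:start + chunk_size].to_pickle(path)
        paths.append(path)

    # 2. Unpickle the pieces and glue them back together with pd.concat.
    df_restored = pd.concat(pd.read_pickle(p) for p in paths)

# The round trip preserves shape and row order.
assert df_restored.shape == df.shape
```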

I agree it is a fudge and suboptimal. If anyone can suggest a "proper" answer I'd be interested in seeing it, but I think simple dataframes are not supposed to grow above a certain size.


