python - Numpy: use dtype from genfromtxt() when exporting with savetxt()
numpy.genfromtxt(infile, dtype=None) does a pretty good job of determining the number formats in each column of my input files. How can I use the same determined types when saving the data to file with numpy.savetxt()? savetxt uses a different format syntax.
    indata = '''
    1000 254092.500 1630087.500 9144.00 9358.96 214.96
    422 258667.500 1633267.500 6096.00 6490.28 394.28
    15 318337.500 1594192.500 9144.00 10524.28 1380.28
    -15 317392.500 1597987.500 6096.00 4081.26 -2014.74
    -14 253627.500 1601047.500 21336.00 20127.51 -1208.49
    end
    '''

Code:
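    import numpy as np

    header = 'scaled_residual,x,y,local_std_error,vertical_std_error,unscaled_residual'
    # genfromtxt treats a plain string as a filename, so pass the text as a list of lines
    data = np.genfromtxt(indata.splitlines(), names=header, dtype=None, comments='e')  # skip 'end' lines
    print(data.dtype)

This emits: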
    [('scaled_residual', '<i4'), ('x', '<f8'), ('y', '<f8'), ('local_std_error', '<f8'), ('vertical_std_error', '<f8'), ('unscaled_residual', '<f8')]

So how do I elegantly reconstruct data.dtype so that it fits the savetxt(... fmt='%i, %f, ...') syntax without manually stepping through it? Is there a savefromgentxt() corollary I haven't discovered?
A simplistic, hopeful attempt at fmt=data.dtype fails completely. ;-)
    np.savetxt('test.csv', data, header=header, delimiter=',', fmt=data.dtype)

Result:
    ...snip...\numpy\lib\npyio.py", line 1047, in savetxt
        fh.write(asbytes(format % tuple(row) + newline))
    UnboundLocalError: local variable 'format' referenced before assignment
fmt is supposed to be a format string, or a list of format strings, not a dtype. See the examples in the savetxt documentation.
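To make the two accepted forms of fmt concrete, here is a minimal sketch. It uses two rows and two columns borrowed from the question's data and writes to an in-memory buffer instead of a file; the helper name to_text is made up for illustration.

```python
import io
import numpy as np

# Two rows and two columns taken from the question's data, as a structured array.
demo = np.array([(1000, 254092.5), (-15, 317392.5)],
                dtype=[('scaled_residual', 'i4'), ('x', 'f8')])

def to_text(arr, fmt):
    """Return what savetxt would write, as a string (hypothetical helper)."""
    buf = io.StringIO()
    np.savetxt(buf, arr, fmt=fmt, delimiter=',')
    return buf.getvalue()

single = to_text(demo, '%s')                # one specifier, reused for every column
per_column = to_text(demo, ['%d', '%.3f'])  # one specifier per column

print(single)
print(per_column)
```

With a single specifier, savetxt repeats it for each column; with a list, it joins the entries with the delimiter, so the list needs one entry per field.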
    np.savetxt('test.csv', data, fmt='%10s')

gets 90% of the way there:
          1000   254092.5  1630087.5     9144.0    9358.96     214.96
           422   258667.5  1633267.5     6096.0    6490.28     394.28
            15   318337.5  1594192.5     9144.0   10524.28    1380.28
           -15   317392.5  1597987.5     6096.0    4081.26   -2014.74
           -14   253627.5  1601047.5    21336.0   20127.51   -1208.49

You get closer by specifying a fmt string with the number of decimals etc. for each column.
    np.savetxt('test.csv', data, fmt='%10d %10.3f %10.3f %10.2f %10.2f %10.2f')

does better. You can tweak the fmt further.
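One way to check that a hand-written per-column fmt keeps the types intact is to write the data out and read it back with genfromtxt. The sketch below does that with a three-column excerpt of the question's data and an in-memory buffer; the choice of columns and the variable names are assumptions made for brevity.

```python
import io
import numpy as np

# Three columns from two of the question's rows, passed as a list of lines.
lines = ['1000 254092.500 9144.00', '-15 317392.500 6096.00']
data = np.genfromtxt(lines, dtype=None, names='scaled_residual,x,local_std_error')

# Write with an explicit format per column, then rewind and read it back.
buf = io.StringIO()
np.savetxt(buf, data, fmt='%10d %10.3f %10.2f')
buf.seek(0)
back = np.genfromtxt(buf, dtype=None, names=data.dtype.names)

print(data.dtype)
print(back.dtype)
```

Because the integer column is written with %d, genfromtxt detects it as an integer again; writing it with %f would turn it into a float column on the way back.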
The Python code for savetxt is not complex. I'd suggest looking at it.
The problem with generating a fancier fmt from the dtype is that there isn't any more information in it.
    In [154]: [x[1] for x in data.dtype.descr]
    Out[154]: ['<i4', '<f8', '<f8', '<f8', '<f8', '<f8']

Compare these formats:
    In [158]: '%i %f %f %f %f %f' % tuple(data[0])
    Out[158]: '1000 254092.500000 1630087.500000 9144.000000 9358.960000 214.960000'
    In [159]: '%s %s %s %s %s %s' % tuple(data[0])
    Out[159]: '1000 254092.5 1630087.5 9144.0 9358.96 214.96'
    In [160]: ' '.join(['%10s']*6) % tuple(data[0])
    Out[160]: '      1000   254092.5  1630087.5     9144.0    9358.96     214.96'

A simple translation of the dtype info:
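    def foo(astr):
        # map a dtype string such as '<i4' or '<f8' to a printf-style format
        if 'i' in astr:
            return '%10i'
        elif 'f' in astr:
            return '%10f'
        return '%10s'  # fallback for any other type

    [foo(x[1]) for x in data.dtype.descr]
    # ['%10i', '%10f', '%10f', '%10f', '%10f', '%10f']

You could use the dtype names to generate a header line.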
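Putting both ideas together, here is a rough sketch that derives fmt from the dtype's kind codes and the header from the dtype names. The mapping table and the function name fmt_from_dtype are made up for illustration, and the precision in '%.3f' is an arbitrary choice, since the dtype cannot supply it. The data is a three-column excerpt of the question's input.

```python
import io
import numpy as np

# Hypothetical mapping from dtype "kind" codes to printf-style specifiers.
# The precision is an assumption; a dtype like '<f8' carries no such detail.
KIND_TO_FMT = {'i': '%d', 'u': '%d', 'f': '%.3f', 'U': '%s'}

def fmt_from_dtype(dtype, default='%s'):
    """Build a list of savetxt formats, one per field of a structured dtype."""
    return [KIND_TO_FMT.get(dtype[name].kind, default) for name in dtype.names]

data = np.array([(1000, 254092.5, 214.96), (-14, 253627.5, -1208.49)],
                dtype=[('scaled_residual', 'i4'), ('x', 'f8'),
                       ('unscaled_residual', 'f8')])

fmt = fmt_from_dtype(data.dtype)
header = ','.join(data.dtype.names)  # header line built from the field names

buf = io.StringIO()
# comments='' stops savetxt from prefixing the header line with '# '
np.savetxt(buf, data, fmt=fmt, delimiter=',', header=header, comments='')
text = buf.getvalue()
print(text)
```

Passing a filename such as 'test.csv' instead of the buffer writes the same text to disk.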