python - Numpy: use dtype from genfromtxt() when exporting with savetxt()
numpy.genfromtxt(infile, dtype=None)
does a pretty good job of determining the number formats in each column of my input files. How can I use the same determined types when saving the data to a file with numpy.savetxt()
? savetxt uses a different format syntax.
indata = '''
1000 254092.500 1630087.500 9144.00 9358.96 214.96
422 258667.500 1633267.500 6096.00 6490.28 394.28
15 318337.500 1594192.500 9144.00 10524.28 1380.28
-15 317392.500 1597987.500 6096.00 4081.26 -2014.74
-14 253627.500 1601047.500 21336.00 20127.51 -1208.49
end
'''
Code:
import numpy as np

header = 'scaled_residual,x,y,local_std_error,vertical_std_error,unscaled_residual'
# genfromtxt treats a plain string as a file name, so pass the lines instead
data = np.genfromtxt(indata.splitlines(), names=header, dtype=None, comments='e')  # skip 'end' lines
print(data.dtype)
emits:
[('scaled_residual', '<i4'), ('x', '<f8'), ('y', '<f8'), ('local_std_error', '<f8'), ('vertical_std_error', '<f8'), ('unscaled_residual', '<f8')]
So how can I elegantly reconstruct data.dtype
so that it fits the savetxt(... fmt='%i, %f, ...')
syntax without manually stepping through it? Is there a savefromgentxt() corollary that I haven't discovered?
A simplistic, hopeful attempt at fmt=data.dtype
fails completely. ;-)
np.savetxt('test.csv', data, header=header, delimiter=',', fmt=data.dtype)
Result:
...snip...\numpy\lib\npyio.py", line 1047, in savetxt
    fh.write(asbytes(format % tuple(row) + newline))
UnboundLocalError: local variable 'format' referenced before assignment
fmt is supposed to be a format string, or a list of format strings (see the examples in the savetxt documentation), not a dtype.
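To illustrate the two accepted forms of fmt, here is a minimal sketch. The array arr and the buffer names are made up for the illustration and are not the data from the question; it writes to in-memory buffers instead of files, which assumes a reasonably recent NumPy.

```python
import io
import numpy as np

# Hypothetical two-column sample, not the data from the question.
arr = np.array([[1, 2.5], [3, 4.25]])

# A single format string is applied to every column.
buf_single = io.StringIO()
np.savetxt(buf_single, arr, fmt='%.2f', delimiter=',')

# A list of format strings gives one format per column.
buf_list = io.StringIO()
np.savetxt(buf_list, arr, fmt=['%d', '%.3f'], delimiter=',')

single_text = buf_single.getvalue()
list_text = buf_list.getvalue()
print(single_text)
print(list_text)
```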
np.savetxt('test.csv',data, fmt='%10s')
gets you 90% of the way there:
      1000   254092.5  1630087.5     9144.0    9358.96     214.96
       422   258667.5  1633267.5     6096.0    6490.28     394.28
        15   318337.5  1594192.5     9144.0   10524.28    1380.28
       -15   317392.5  1597987.5     6096.0    4081.26   -2014.74
       -14   253627.5  1601047.5    21336.0   20127.51   -1208.49
You can get closer by specifying a fmt string that sets the width, number of decimals, etc. for each column.
np.savetxt('test.csv',data, fmt='%10d %10.3f %10.3f %10.2f %10.2f %10.2f')
does better. You can tweak fmt
further.
The Python code for savetxt
is not complex. I'd suggest looking at it.
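The problem with generating a fancier format from the dtype
is that there isn't more information in it: it records only the type code of each field, not the width or number of decimals used in the input file.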
In [154]: [x[1] for x in data.dtype.descr]
Out[154]: ['<i4', '<f8', '<f8', '<f8', '<f8', '<f8']
Compare these formats:
In [158]: '%i %f %f %f %f %f'%tuple(data[0])
Out[158]: '1000 254092.500000 1630087.500000 9144.000000 9358.960000 214.960000'
In [159]: '%s %s %s %s %s %s'%tuple(data[0])
Out[159]: '1000 254092.5 1630087.5 9144.0 9358.96 214.96'
In [160]: ' '.join(['%10s']*6)%tuple(data[0])
Out[160]: '      1000   254092.5  1630087.5     9144.0    9358.96     214.96'
A simple translation of the dtype
info:
def foo(astr):
    # map a type code such as '<i4' or '<f8' to a format string
    if 'i' in astr:
        return '%10i'
    elif 'f' in astr:
        return '%10f'

[foo(x[1]) for x in data.dtype.descr]
# ['%10i', '%10f', '%10f', '%10f', '%10f', '%10f']
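The same translation can also be keyed on each field's dtype.kind instead of searching the type-code string, which may be a little more robust. This is only a sketch: fmt_from_dtype and its width parameter are hypothetical names, and the fallback to '%s' for non-numeric fields is an assumption rather than anything savetxt does itself.

```python
import numpy as np

def fmt_from_dtype(dtype, width=10):
    """Hypothetical helper: one printf-style format per structured field."""
    formats = []
    for name in dtype.names:
        kind = dtype[name].kind  # 'i'/'u' for integers, 'f' for floats
        if kind in 'iu':
            formats.append('%%%dd' % width)
        elif kind == 'f':
            formats.append('%%%df' % width)
        else:
            # Assumed fallback for strings and anything else.
            formats.append('%%%ds' % width)
    return formats

# A shortened version of the dtype shown in the question.
dt = np.dtype([('scaled_residual', '<i4'), ('x', '<f8'), ('y', '<f8')])
formats = fmt_from_dtype(dt)
print(formats)
```

The resulting list can be passed directly as fmt, or joined with a delimiter into a single format string.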
You can use the dtype
names to generate a header line.
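Putting the pieces together, a round trip might look like the sketch below. It uses a shortened three-column version of the question's data and in-memory buffers instead of files; the '%.3f' precision is a guess, since the dtype does not record how many decimals the input had. Passing comments='' is a choice made here so the header line is written without the default '# ' prefix.

```python
import io
import numpy as np

# Three columns of the question's data, shortened for illustration.
text = """1000 254092.500 9144.00
422 258667.500 6096.00
"""
names = 'scaled_residual,x,local_std_error'
data = np.genfromtxt(io.StringIO(text), names=names, dtype=None)

# One format per field: integers as '%d', everything else as '%.3f'
# (assumed precision; this sample has only integer and float fields).
fmt = ['%d' if data.dtype[n].kind in 'iu' else '%.3f'
       for n in data.dtype.names]

out = io.StringIO()
np.savetxt(out, data, fmt=fmt, delimiter=',',
           header=','.join(data.dtype.names), comments='')
result = out.getvalue()
print(result)
```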