python - Numpy: use dtype from genfromtxt() when exporting with savetxt() -

numpy.genfromtxt(infile, dtype=None) does a pretty good job of determining the number formats in each column of my input files. How can I use the same determined types when saving the data to a file with numpy.savetxt()? savetxt uses a different format syntax.

indata = '''
     1000  254092.500 1630087.500  9144.00  9358.96   214.96
      422  258667.500 1633267.500  6096.00  6490.28   394.28
       15  318337.500 1594192.500  9144.00 10524.28  1380.28
      -15  317392.500 1597987.500  6096.00  4081.26 -2014.74
      -14  253627.500 1601047.500 21336.00 20127.51 -1208.49
end
'''

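Code:

import io
import numpy as np

header = 'scaled_residual,x,y,local_std_error,vertical_std_error,unscaled_residual'
# Wrap the string in StringIO so genfromtxt reads it as file content, not as a filename
data = np.genfromtxt(io.StringIO(indata), names=header, dtype=None,
    comments='e')  # skip 'end' lines

print(data.dtype)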

emits:

[('scaled_residual', '<i4'), ('x', '<f8'), ('y', '<f8'), ('local_std_error', '<f8'), ('vertical_std_error', '<f8'), ('unscaled_residual', '<f8')]

So how do I elegantly reconstruct data.dtype so that it fits the savetxt(... fmt='%i, %f, ...') syntax without manually stepping through it? Is there a savefromgentxt() corollary I haven't discovered?

A simplistic, hopeful attempt at fmt=data.dtype fails completely. ;-)

np.savetxt('test.csv', data, header=header, delimiter=',',
    fmt=data.dtype)

result:

  ...snip...\numpy\lib\npyio.py", line 1047, in savetxt
    fh.write(asbytes(format % tuple(row) + newline))
UnboundLocalError: local variable 'format' referenced before assignment

fmt is supposed to be a format string, or a list of format strings. See the examples in the savetxt documentation. It is not a dtype.
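
To illustrate the two accepted forms of fmt, here is a minimal sketch. The array values are made up for the example (they are not the question's data), and the output goes to an in-memory buffer rather than a file:

```python
import io
import numpy as np

# Hypothetical sample data, not taken from the question.
arr = np.array([[1.0, 2.5], [3.0, 4.25]])

# Form 1: a single format string, applied to every column.
buf_single = io.StringIO()
np.savetxt(buf_single, arr, fmt='%8.2f')

# Form 2: a list with one format string per column.
buf_list = io.StringIO()
np.savetxt(buf_list, arr, fmt=['%d', '%.3f'], delimiter=',')

print(buf_single.getvalue())
print(buf_list.getvalue())
```

With a list, savetxt joins the per-column formats with the delimiter, so the list length has to match the number of columns.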

np.savetxt('test.csv',data, fmt='%10s')

gets you 90% of the way there:

      1000   254092.5  1630087.5     9144.0    9358.96     214.96
       422   258667.5  1633267.5     6096.0    6490.28     394.28
        15   318337.5  1594192.5     9144.0   10524.28    1380.28
       -15   317392.5  1597987.5     6096.0    4081.26   -2014.74
       -14   253627.5  1601047.5    21336.0   20127.51   -1208.49

You can get closer by specifying a fmt string with the number of decimals etc. for each column.

np.savetxt('test.csv',data, fmt='%10d  %10.3f %10.3f %10.2f %10.2f %10.2f')

does even better. You can tweak the fmt further.

The Python code for savetxt is not complex. I'd suggest looking at it.

The problem with generating a fancier fmt from the dtype is that the dtype doesn't carry any more information:

In [154]: [x[1] for x in data.dtype.descr]
Out[154]: ['<i4', '<f8', '<f8', '<f8', '<f8', '<f8']

compare these formats:

In [158]: '%i %f %f %f %f %f'%tuple(data[0])
Out[158]: '1000 254092.500000 1630087.500000 9144.000000 9358.960000 214.960000'

In [159]: '%s %s %s %s %s %s'%tuple(data[0])
Out[159]: '1000 254092.5 1630087.5 9144.0 9358.96 214.96'

In [160]: ' '.join(['%10s']*6)%tuple(data[0])
Out[160]: '      1000   254092.5  1630087.5     9144.0    9358.96     214.96'

a simple translation of dtype info:

def foo(astr):
    # Map a dtype descriptor string such as '<i4' or '<f8' to a printf-style format
    if 'i' in astr:
        return '%10i'
    elif 'f' in astr:
        return '%10f'

[foo(x[1]) for x in data.dtype.descr]
# ['%10i', '%10f', '%10f', '%10f', '%10f', '%10f']
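
A slightly more general variant of the same translation might look like the sketch below. It keys on each field's dtype.kind instead of searching the descriptor string. The helper name fmt_from_dtype, the short field names, and the width and precision defaults are all assumptions made for illustration. The precision in particular is a guess, since the dtype does not record how many decimals the input used.

```python
import io
import numpy as np

def fmt_from_dtype(dtype, width=10, precision=3):
    """Build one printf-style format per field of a structured dtype.

    width and precision are illustrative defaults, not values
    recovered from the data.
    """
    formats = []
    for name in dtype.names:
        kind = dtype[name].kind
        if kind in 'iu':      # signed or unsigned integer
            formats.append('%{}d'.format(width))
        elif kind == 'f':     # floating point
            formats.append('%{}.{}f'.format(width, precision))
        else:                 # strings and anything else
            formats.append('%{}s'.format(width))
    return formats

# Small made-up sample in the spirit of the question's data.
sample = io.StringIO("1000 254092.500 9144.00\n-15 317392.500 6096.00\n")
data = np.genfromtxt(sample, dtype=None, names='resid,x,err', encoding='utf-8')

fmt = fmt_from_dtype(data.dtype)
out = io.StringIO()
np.savetxt(out, data, fmt=fmt)
print(fmt)
print(out.getvalue())
```

Falling back to a string format for unrecognised kinds means unexpected column types still get written instead of raising an error.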

You can use the dtype names to generate the header line.
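
As a rough sketch of that idea, the header can be joined from data.dtype.names. The two-field array here is made up for the example. Passing comments='' is a choice made for this sketch to suppress the '# ' prefix that savetxt puts in front of header lines by default.

```python
import io
import numpy as np

# Hypothetical two-field structured array standing in for the real data.
data = np.array([(1000, 254092.5), (-15, 317392.5)],
                dtype=[('scaled_residual', 'i4'), ('x', 'f8')])

# Build the header from the field names stored in the dtype.
header = ','.join(data.dtype.names)

out = io.StringIO()
np.savetxt(out, data, fmt=['%d', '%.3f'], delimiter=',',
           header=header, comments='')
print(out.getvalue())
```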
