Julia performance compared to Python+Numba LLVM/JIT-compiled code -


The performance benchmarks for Julia I have seen so far, such as those at http://julialang.org/, compare Julia to pure Python or Python+NumPy. Unlike NumPy, SciPy uses the BLAS and LAPACK libraries, which provide optimally multi-threaded SIMD implementations. If we assume that Julia and Python performance is the same when calling BLAS and LAPACK functions (under the hood), how does Julia performance compare to CPython when using Numba or NumbaPro for code that doesn't call BLAS or LAPACK functions?

One thing I notice is that Julia is using LLVM v3.3, while Numba uses llvmlite, which is built on LLVM v3.5. Does Julia's older LLVM prevent an optimal SIMD implementation on newer architectures, such as Intel Haswell (AVX2 instructions)?

I am interested in performance comparisons for both spaghetti code and small DSP loops that handle large vectors. The latter are handled more efficiently by the CPU than the GPU in my case, due to the overhead of moving data in and out of GPU device memory. I am only interested in performance on a single Intel Core i7 CPU, so cluster performance is not important to me. Of particular interest to me is the ease and success of creating parallelized implementations of DSP functions.
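For example, the kind of small DSP loop I have in mind might look like the following naive FIR kernel. This is a hypothetical illustration of my own, not a benchmark; it assumes Numba is installed, with a no-op fallback decorator so the sketch still runs without it:

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # no-op stand-in so the sketch runs without Numba (just much slower)
    def jit(**kwargs):
        return lambda f: f

@jit(nopython=True)
def fir_filter(x, h):
    # direct-form FIR: y[n] = sum_k h[k] * x[n + len(h) - 1 - k]
    y = np.zeros(len(x) - len(h) + 1)
    for n in range(len(y)):
        acc = 0.0
        for k in range(len(h)):
            acc += h[k] * x[n + len(h) - 1 - k]
        y[n] = acc
    return y

x = np.random.rand(10_000)
h = np.array([0.25, 0.5, 0.25])  # small smoothing kernel, chosen arbitrarily
y = fir_filter(x, h)
```

With nopython=True, Numba compiles the inner loop to machine code via LLVM, which is exactly where SIMD quality (and hence the LLVM version) matters.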

A second part of this question is a comparison of Numba to NumbaPro (ignoring the MKL BLAS). Is NumbaPro's target="parallel" still needed, given the new nogil argument to the @jit decorator in Numba?
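My understanding of what nogil enables is something like the following sketch: a compiled kernel that releases the GIL, driven by ordinary Python threads. The scale kernel and the chunking scheme are my own illustration, not part of Numba's API; a no-op fallback is included so it runs without Numba:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

try:
    from numba import jit
except ImportError:
    # no-op stand-in so the sketch runs without Numba
    def jit(**kwargs):
        return lambda f: f

@jit(nopython=True, nogil=True)  # nogil=True releases the GIL inside the kernel
def scale(x, out, factor):
    for i in range(x.shape[0]):
        out[i] = x[i] * factor

x = np.arange(1_000_000, dtype=np.float64)
out = np.empty_like(x)
n = len(x) // 2

# two threads, each working on its own half of the arrays (views, no copies)
with ThreadPoolExecutor(max_workers=2) as pool:
    pool.submit(scale, x[:n], out[:n], 2.0)
    pool.submit(scale, x[n:], out[n:], 2.0)
```

With nogil this partitioning is done by hand, whereas target="parallel" parallelizes the kernel for you; that difference is the crux of my question.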

This is a very broad question. Regarding the benchmark requests, you may be best off running a few small benchmarks matching your own needs. To answer one of the questions:
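As a starting point for rolling your own benchmarks, a minimal timing harness in pure Python might look like this (the two kernels here are placeholders for whatever implementations you want to compare):

```python
import timeit
import numpy as np

def numpy_sum(x):
    return x.sum()

def loop_sum(x):
    acc = 0.0
    for v in x:
        acc += v
    return acc

x = np.random.rand(100_000)

for fn in (numpy_sum, loop_sum):
    # best of 3 repeats, 10 calls each, to reduce timer noise
    t = min(timeit.repeat(lambda: fn(x), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {t * 1e3:.3f} ms per call")
```

Taking the minimum over repeats is the usual way to filter out scheduling noise; for JIT-compiled functions, also make one warm-up call first so compilation time is not counted.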

One thing I notice is that Julia is using LLVM v3.3, while Numba uses llvmlite, which is built on LLVM v3.5. Does Julia's older LLVM prevent an optimal SIMD implementation on newer architectures, such as Intel Haswell (AVX2 instructions)?

[2017/01+: The information below no longer applies to current Julia releases.]

Julia does turn off AVX2 when built with LLVM 3.3, because there are deep bugs affecting Haswell.

Julia is built with LLVM 3.3 for current releases and nightlies, but you can build it with 3.5, 3.6, and svn trunk (if we haven't yet updated for an API change on a given day, please file an issue). To do so, set LLVM_VER=svn (for example) in Make.user and then proceed to follow the build instructions.
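Concretely, from the root of a Julia source checkout, that build step amounts to something like (using the LLVM_VER=svn example from above; this is a build-config fragment, not something to run blindly):

```shell
# in the root of a Julia source tree
echo "LLVM_VER=svn" >> Make.user
make
```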

