How to "join" with the split-apply-combine method in Julia
I have a complex join (in the SQL sense) to perform in Julia, and I can't figure out how to get it working with the split-apply-combine method (although it can be written out by hand). It seems like it should be easy, though. The problem looks like this. I have a DataFrame of data on turtles running races:

    using DataFrames

    data = DataFrame()
    data[:turtle] = ["suzy", "suzy", "bob", "batman", "batman", "batman", "bob"]
    data[:event] = ["5k", "5k", "1k", "5k", "5k", "1k", "1k"]
    data[:time] = [6.2, 6.7, 2.1, 3.2, 3.1, 0.9, 2.4]
    data[:photo] = ["111.jpg", "123.jpg", "145.jpg", "167.jpg", "189.jpg", "190.jpg", "195.jpg"]
    data

What I want is a datatable that consists of the rows of this table for each turtle's personal (turtlenal?) best in each event it ran. I can get the best times with:

    # One row per (turtle, event) pair, holding only the fastest time
    bestfinishes = by(data, [:turtle, :event]) do df
        DataFrame(fastesttime = minimum(df[:time]))
    end

But I also need the photo column from the matching rows. How do I do this?
Well, as I typed this in I realized one way to do it, based on this question.
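
    # indmin gives the index of the fastest time within each group,
    # which is then used to pick the photo from the same row
    bestfinishes = by(data, [:turtle, :event]) do df
        DataFrame(fastesttime = minimum(df[:time]),
                  winningphoto = df[indmin(df[:time]), :photo])
    end
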
However, a more general way is:

    # Keep the full rows (all columns) whose time equals the group's minimum
    bestfinishes = by(data, [:turtle, :event]) do df
        thisfastesttime = minimum(df[:time])
        df[df[:time] .== thisfastesttime, :]
    end

which makes things easier if you want to prune rows from a large data set efficiently. I'll see if I can add an example to the documentation, since this didn't seem to be covered (or it assumed more familiarity with the method than I had).
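For anyone reading this on a newer setup: `by` is no longer available in DataFrames.jl 1.x, and `indmin` was renamed to `argmin` in Julia 1.0. Here is a rough sketch of the same row-pruning idea using `groupby` and `combine`, assuming DataFrames.jl 1.x. I build the table with keyword arguments and index columns as `sdf.time`, which is not how the original snippets did it.

```julia
using DataFrames  # assumption: DataFrames.jl 1.x

data = DataFrame(
    turtle = ["suzy", "suzy", "bob", "batman", "batman", "batman", "bob"],
    event  = ["5k", "5k", "1k", "5k", "5k", "1k", "1k"],
    time   = [6.2, 6.7, 2.1, 3.2, 3.1, 0.9, 2.4],
    photo  = ["111.jpg", "123.jpg", "145.jpg", "167.jpg", "189.jpg", "190.jpg", "195.jpg"],
)

# Split by turtle and event, then keep every full row whose time
# equals the fastest time in its group (ties are all kept).
bestfinishes = combine(groupby(data, [:turtle, :event])) do sdf
    sdf[sdf.time .== minimum(sdf.time), :]
end
```

If exactly one row per group is wanted even when times tie, returning `sdf[argmin(sdf.time), :]` from the `do` block should do it instead, since `argmin` picks the first minimum.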