Channel: First steps - JuliaLang
How to select top 5 results per group in dataframe?

I’m looking for a Julian way of selecting a subset of each group. I have a DataFrame with (among others) two columns, say name and length. I want to group by name and pick the 5 tallest people (the 5 largest length values) within each name. I tried this, but it does not return the correct result:

df = ...
sort!(df, [:length])
df2 = df |> @groupby([:name]) |> @take(5) |> collect
print(DataFrame(df2))

Replacing collect with DataFrame does not work either. The printed result has as many rows as the original DataFrame df. This kind of operation (taking a DataFrame, grouping it, selecting a subset of the rows of each group, and recombining the selected rows into a DataFrame) is something I would assume is common.
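
For reference, here is a minimal sketch of the group, select, and recombine pattern using plain DataFrames.jl instead of the Query.jl macros. The sample data and the names n and top are made up for illustration, since the real df is not shown. As far as I understand, @take(5) placed after @groupby acts on the sequence of groups rather than on the rows inside each group, which may be why the attempt above does not trim the groups; treat that as an assumption rather than a confirmed diagnosis.

```julia
using DataFrames

# Hypothetical sample data standing in for the elided `df = ...`
df = DataFrame(
    name   = repeat(["Ann", "Bob"], inner = 7),
    length = [150, 160, 170, 180, 190, 200, 210,
              151, 161, 171, 181, 191, 201, 211],
)

n = 5  # number of rows to keep per group

# For each group: sort by length (largest first) and keep the first n rows.
# combine stacks the per-group results back into a single DataFrame.
top = combine(groupby(df, :name)) do sdf
    first(sort(sdf, :length, rev = true), n)
end

print(top)
```

Sorting inside each group means the result does not depend on df being sorted beforehand, so the separate sort! call is not needed in this version.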

4 posts - 2 participants
