Issue
What's the syntax for using a groupby-having in Spark without an sql/hiveContext? I know I can do
DataFrame df = some_df;
df.registerTempTable("df");
DataFrame df1 = sqlContext.sql("SELECT * FROM df GROUP BY col1 HAVING some stuff");
but how do I do it with a syntax like
df.select(df.col("*")).groupBy(df.col("col1")).having("some stuff")
This .having() does not seem to exist.
Solution
Correct, there is no .having() method in the DataFrame API. You express the same logic with agg followed by where, filtering on the aggregated column:
df.groupBy(someExpr).agg(someAgg).where(somePredicate)
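To make the one-liner concrete, here is a minimal sketch of a HAVING COUNT(*) > 1 filter written in the DataFrame API. It assumes Spark 2.x or later, where the Java type is Dataset<Row> rather than the older DataFrame class. The class name GroupByHaving, the helper groupsWithMoreThanOneRow, the alias cnt, the column col2, and the sample rows are all hypothetical and chosen only for illustration.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.lit;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class GroupByHaving {

    // Rough equivalent of:
    //   SELECT col1, COUNT(*) AS cnt FROM df GROUP BY col1 HAVING COUNT(*) > 1
    // The aggregate gets an alias (hypothetical name "cnt") so that
    // where() can refer to it after the aggregation.
    static Dataset<Row> groupsWithMoreThanOneRow(Dataset<Row> df) {
        return df.groupBy(col("col1"))
                 .agg(count(lit(1)).alias("cnt"))
                 .where(col("cnt").gt(1));
    }

    public static void main(String[] args) {
        // Local session for the demo; a real job would configure this differently.
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("groupby-having")
                .getOrCreate();

        // Made-up sample data: two rows for "a", one row for "b".
        StructType schema = new StructType()
                .add("col1", DataTypes.StringType)
                .add("col2", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("a", 1),
                RowFactory.create("a", 2),
                RowFactory.create("b", 3));
        Dataset<Row> df = spark.createDataFrame(rows, schema);

        // Only the group "a" survives the filter, with cnt = 2.
        groupsWithMoreThanOneRow(df).show();

        spark.stop();
    }
}
```

The predicate in where() runs on the result of agg(), so it can only see the grouping columns and the aggregated columns. That is why the aggregate is aliased first. filter() is an alias of where() and could be used in the same position.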
Answered By - zero323