apache spark - How to use an Array of column names against a DataFrame row inside a map function and create a new DF
I have a DataFrame with 200 columns, and I have created an array of its column names from df.columns.
While iterating through the DataFrame df, how do I tell each row to select only those columns, and then create a new DataFrame from the result?
    val df = df1.join(df2)
    val colnames = df.columns
    df.map { row =>
      val createnewdf = row(colnames)
    }

How do I write the line below so that it works?

    val createnewdf = row(colnames)
If you intend to select a limited number of columns from the joined DataFrame, you need to create an array of the column names and use the select method:

    import org.apache.spark.sql.functions._

    val colnames = Array("col1", "col2", "col4")
    // Turn each name into a Column and pass them all to select as varargs
    val createnewdf = df.select(colnames.map(col): _*)

df.columns already returns all of the column names as an array, so I don't see the use of selecting columns inside a loop over the DataFrame's rows when the DataFrame already has those columns.
Moreover, you can change the values of the selected columns inside a loop over the rows, but looping over DataFrame rows is not recommended unless no built-in function is defined for what you need.
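If you really do need to pick the values out row by row, a minimal sketch could look like the one below. It is not part of the original answer: the names `SelectInsideMap` and `selectByRow` are hypothetical, and it assumes every name in `colNames` exists in `df`. It goes through `df.rdd` because `df.map` on a `Row` result needs an implicit `Encoder`, and it rebuilds the schema from the fields of the original DataFrame.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.StructType

object SelectInsideMap {
  // Hypothetical helper: builds a new DataFrame containing only `colNames`,
  // extracting the values from each row by field name.
  def selectByRow(spark: SparkSession, df: DataFrame, colNames: Array[String]): DataFrame = {
    // Result schema: the chosen fields, in the order they were requested.
    val schema = StructType(colNames.map(name => df.schema(name)))

    // For every row, look up each requested column by name and build a new Row.
    val rows = df.rdd.map { row =>
      Row.fromSeq(colNames.map(name => row.getAs[Any](name)).toSeq)
    }

    spark.createDataFrame(rows, schema)
  }
}
```

For a plain projection this should give the same columns as `df.select(colNames.map(col): _*)`, which remains the simpler option; the row-wise version is only worth considering when the per-row logic cannot be expressed with built-in column functions.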