apache spark - How to use an Array of column names against a DataFrame row inside a map function and create a new DF


I have a dataframe with 200 columns, and I have created an array of the column names from df.columns.

While iterating through the dataframe df, how do I tell each row to select those columns from the row and create a new dataframe?

val df = df1.join(df2)
val colnames = df.columns
df.map { row =>
  val createnewdf = row(colnames)   // this is the line I cannot work out
}

How do I write the line below?

val createnewdf = row(colnames)

If you intend to select a limited number of columns from the joined dataframe, you need to create an array of those column names and pass it to the select method:

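import org.apache.spark.sql.functions._

// Array (capital A) builds the list of column names to keep
val colnames = Array("col1", "col2", "col4")
// map each name to a Column and expand the array into varargs
val createnewdf = df.select(colnames.map(col): _*)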
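For reference, the same select call can be tried end to end on a small local dataframe. This is only a sketch: the `SelectByNames` object, the `selectByNames` helper, the local SparkSession settings and the sample data are assumptions made for illustration, not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectByNames {
  // Hypothetical helper: keep only the named columns, in the given order.
  def selectByNames(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.map(col): _*)

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; a real job would get its session elsewhere.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("select-by-names")
      .getOrCreate()
    import spark.implicits._

    // Made-up stand-in for the joined dataframe.
    val df = Seq((1, "a", 10.0, true), (2, "b", 20.0, false))
      .toDF("col1", "col2", "col3", "col4")

    val colnames = Array("col1", "col2", "col4")
    val createnewdf = selectByNames(df, colnames)

    createnewdf.printSchema() // lists col1, col2 and col4 only
    spark.stop()
  }
}
```

The selection happens once on the whole dataframe, so no per-row code is involved.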

df.columns returns all the column names as an array, so I don't see the use of selecting all of those columns inside a loop over the dataframe rows: the dataframe already has all the columns.

Moreover, you can change the values of selected columns inside a loop over the rows. However, looping through dataframe rows is not recommended unless no inbuilt function is defined for what you need.
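To make the contrast concrete, here is a rough sketch of the same change written first with an inbuilt column function and then with a map over rows. The object name, the helper names, the column names `col1` and `col2`, and their types (Int and String) are all assumptions for illustration. `Row.getAs` looks a value up by column name, which is the closest row-level equivalent of indexing a row with names; the sketch assumes a Spark version in which `import df.sparkSession.implicits._` is available to supply the encoder.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, upper}

object RowLevelSketch {
  // Preferred: an inbuilt function applied to a column, no explicit row loop.
  def withBuiltin(df: DataFrame): DataFrame =
    df.select(col("col1"), upper(col("col2")).as("col2"))

  // Row-level version: read values by column name inside map.
  // Assumes col1 holds Int values and col2 holds non-null String values.
  def withMap(df: DataFrame): DataFrame = {
    import df.sparkSession.implicits._ // encoder for the (Int, String) tuple
    df.map { row =>
      (row.getAs[Int]("col1"), row.getAs[String]("col2").toUpperCase)
    }.toDF("col1", "col2")
  }
}
```

Both helpers are meant to produce the same rows. The first one stays entirely within column expressions, which is why it is usually the better choice when an inbuilt function exists.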

