distcp - Hadoop distcp issue while copying a single file
(Note: I need to use distcp for its parallelism.)
I have 2 files in the /user/bhavesh folder.
I have 1 file in the /user/bhavesh1 folder.
Copying the 2 files from /user/bhavesh to the /user/uday folder works fine.
This creates the /user/uday folder.
Copying the 1 file from /user/bhavesh1 to /user/uday1 creates a file named uday1 instead of a folder.
What I need: if there is 1 file, /user/bhavesh1/emp1.csv, the copy should create /user/uday1/emp1.csv (uday1 should be a directory). Any suggestion or help is highly appreciated.
On Unix systems, some copy tools (rsync, for example) create the destination directory when you copy a single file and give a destination name ending in a slash, such as /user/uday1/. The hadoop fs -cp command, by contrast, fails if the destination directory is missing.
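When it comes to HDFS distcp, a trailing / on a file or directory name is ignored if the source is a single file, so a missing destination is created as a file. One workaround is to create the destination directory before executing the distcp command. You can add the -p option to -mkdir to avoid a "directory exists" error.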
hadoop fs -mkdir -p /user/uday1 ; hadoop distcp /user/bhavesh1/emp*.csv /user/uday1/
This works for both a single file and multiple files in the source directory.
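If you run this often, the two steps can be wrapped in a small shell function. This is only a sketch of the workaround above: the function name copy_into_dir is made up for illustration, and it assumes the hadoop command is on your PATH.

```shell
# Hypothetical helper (name is an assumption, not part of Hadoop):
#   copy_into_dir SOURCE DEST_DIR
# Creates DEST_DIR first so distcp treats it as a directory,
# even when SOURCE matches only one file.
copy_into_dir() {
    # $1 = source path or glob, $2 = destination directory
    # -p creates parent directories and does not fail if DEST_DIR exists
    hadoop fs -mkdir -p "$2" || return 1
    # Quoted so the local shell passes the glob through unchanged
    hadoop distcp "$1" "$2"
}

# Example call (commented out so that sourcing this file runs nothing):
# copy_into_dir '/user/bhavesh1/emp*.csv' /user/uday1/
```

If the mkdir step fails, the function returns early and distcp is not started, so a permissions problem cannot leave a file named uday1 behind.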