distcp - Hadoop distcp issue while copying a single file


(Note: I need to use distcp for its parallelism.)

I have 2 files in the /user/bhavesh folder.


I have 1 file in the /user/bhavesh1 folder.


Copying the 2 files from /user/bhavesh to the /user/uday folder works fine.


This creates the /user/uday folder.

Copying the 1 file from /user/bhavesh1 to /user/uday1, however, creates a file named uday1 instead of a folder.


What I need: if there is only 1 file, /user/bhavesh1/emp1.csv, the copy should create /user/uday1/emp1.csv [uday1 should be a directory]. Any suggestion or help is highly appreciated.

On Unix systems, some copy tools create the destination directory when you copy a single file to a destination ending in /, such as /user/uday1/. The hadoop fs -cp command, however, fails if the destination directory is missing.

When it comes to HDFS distcp, a trailing / on a file or directory name is ignored if the source is a single file. One workaround is to create the destination directory before executing the distcp command. You can add the -p option to -mkdir to avoid the "directory exists" error.

hadoop fs -mkdir -p /user/uday1  ; hadoop distcp /user/bhavesh1/emp*.csv /user/uday1/   

This works for both a single file and multiple files in the source directory.
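
If you run this copy often, the two steps can be wrapped in a small shell function. This is only a sketch of the workaround above: the function name copy_into_dir and the HADOOP variable are made up for illustration and are not part of Hadoop. The source pattern is quoted so that the local shell does not try to expand it and Hadoop resolves the glob itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): create the destination
# directory first, then run distcp into it.

# Command used to invoke Hadoop; override it (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Usage: copy_into_dir SOURCE_PATH_OR_GLOB DEST_DIR
copy_into_dir() {
    src="$1"
    dest="$2"

    # -p makes mkdir succeed even when the directory already exists.
    "$HADOOP" fs -mkdir -p "$dest" || return 1

    # The destination now exists as a directory, so a single source file
    # is copied into it rather than being written as a file named $dest.
    "$HADOOP" distcp "$src" "$dest"
}
```

For example, copy_into_dir '/user/bhavesh1/emp*.csv' /user/uday1/ runs the same two commands as the one-liner above, but stops if the mkdir step fails.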

