Compress and Decompress
tar
Reference: https://man.linuxde.net/tar
- Compress:
tar -zcvf File.tar.gz File
- Decompress:
tar -zxvf File.tar.gz
Format | Decompress | Compress |
---|---|---|
.tar | xvf | cvf |
.tar.gz | zxvf | zcvf |
.tar.bz2 | jxvf | jcvf |
.tar.bz | jxvf | |
.tar.Z | Zxvf | Zcvf |
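A quick round trip makes the table concrete. This is a minimal sketch in a throwaway directory; the names `demo`, `notes.txt`, and `out` are made up for illustration. It archives a directory as `.tar.gz`, lists the archive, and extracts it somewhere else with `-C`.

```shell
# Work in a scratch directory so nothing real is touched (all names are illustrative).
work=$(mktemp -d)
cd "$work"
mkdir demo out
echo "hello" > demo/notes.txt

# Compress: c = create, z = gzip, f = archive file name.
tar -zcf demo.tar.gz demo

# List the archive contents without extracting (t = list).
tar -ztf demo.tar.gz

# Decompress into another directory: x = extract, -C = change to that directory first.
tar -zxf demo.tar.gz -C out

# The extracted copy should match the original.
diff -r demo out/demo && echo "round trip OK"
```

The `v` flag from the table is left out here only to keep the output short; adding it prints each file name as it is processed.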
Tar with ssh to substitute scp
This pipe can upload large files much faster than scp, which makes it a very good substitute.
I tried to back up 1.4 T of files from a removable hard drive with scp, and it took half an hour for 24 KB files. Most of that time was spent reading files.
After I switched to this pipeline, a few tens of gigabytes were uploaded within a few minutes. It is crazy fast!
Cite: roaima; 2015
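The general shape of the pipe, following the command that is timed in the tests further down, can be wrapped in a small function. This is only a sketch: the function name `tar_pipe` and its three arguments are hypothetical, and the host in the example call is a placeholder.

```shell
# tar_pipe SRC HOST DEST  (hypothetical helper, not part of tar or ssh)
# Streams SRC as a gzip-compressed tar archive over ssh
# and unpacks it under DEST on HOST.
tar_pipe() {
    src=$1
    host=$2
    dest=$3
    tar cf - "$src" | gzip | ssh "$host" "cd '$dest' && gzip -d | tar xf -"
}

# Example call (placeholder host and path):
# tar_pipe Github ken@example.com /media/Side/ken
```

Because tar writes one continuous stream, the receiving side does not negotiate each file separately, which is probably where the gain over scp comes from on large transfers.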
cp files with tar
Small files:
time cp -r Github /media/Side/ken/Github
cp -r Github /media/Side/ken/Github 0.00s user 0.23s system 27% cpu 0.835 total
time tar cf - Github | gzip | ssh ken@0.0.0.0 'cd /media/Side/ken && gzip -d | tar xvf -'
tar cf - Githu* 0.06s user 0.32s system 2% cpu 13.564 total
gzip 10.70s user 0.04s system 79% cpu 13.566 total
ssh ken@0.0.0.0 'cd /media/Side/ken && gzip -d | tar xvf -' 0.59s user 0.26s system 6% cpu 13.567 total
For the Github directory, cp takes less than 1 s, while the tar pipe takes 13.5 s. So if you have lots of small files, cp is still your first choice.
Large file test
Check the size of the directory: du -sh Mutation/Raw_VCF
23G Mutation/Raw_VCF
time cp -r Mutation/Raw_VCF /media/Side/ken/
cp -r Mutation/Raw_VCF /media/Side/ken/ 0.53s user 59.35s system 7% cpu 12:31.78 total
time tar cf - Mutation/Raw_VCF | gzip | ssh ken@0.0.0.0 'cd /media/Side/ken && gzip -d | tar xvf -'
tar cf - Mutation/Raw_VCF 3.21s user 27.98s system 4% cpu 10:36.64 total
gzip 532.32s user 2.73s system 84% cpu 10:36.65 total
ssh ken@0.0.0.0 'cd /media/Side/ken && gzip -d | tar xvf -' 18.95s user 7.16s system 4% cpu 10:36.65 total
In this test, cp takes about 12.5 minutes, while the tar pipe takes about 10.6 minutes.
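To put a number on the gap, the two wall-clock totals reported above can be converted to seconds and compared. The helper name `to_seconds` is made up for this sketch; the two times are copied from the `time` output.

```shell
# Convert a "mm:ss.ss" wall-clock total (as printed by time) to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

cp_s=$(to_seconds 12:31.78)      # cp -r
pipe_s=$(to_seconds 10:36.65)    # tar | gzip | ssh

# Report the absolute and relative difference.
awk -v a="$cp_s" -v b="$pipe_s" 'BEGIN {
    printf "cp: %.2f s, tar pipe: %.2f s\n", a, b
    printf "saved: %.2f s (%.1f%% less time)\n", a - b, (a - b) / a * 100
}'
```

That works out to 751.78 s against 636.65 s, so the pipe saves roughly 115 s, or about 15% of the cp time, on this 23G directory.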
A better way
Although the pipeline works, the ssh part wastes a lot of resources. The best approach for this situation is:
|
And it only takes roughly 2 minutes.
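The exact command for this step is not shown. As a minimal sketch, assuming the improvement is to skip ssh and gzip when source and target sit on the same machine and to pipe one tar straight into another, it could look like the following. This is a guess at the approach, not the original command, and the scratch directory names are illustrative.

```shell
# Scratch directories standing in for the source tree and the target drive.
work=$(mktemp -d)
mkdir -p "$work/src/Raw_VCF" "$work/dest"
echo "sample" > "$work/src/Raw_VCF/a.vcf"

# Pack on the left, unpack on the right; no ssh, no compression.
# -C makes each tar change directory before it reads or writes.
tar cf - -C "$work/src" Raw_VCF | tar xf - -C "$work/dest"

ls "$work/dest/Raw_VCF"
```

Dropping gzip is part of the assumption: in the large-file run it used far more CPU time than tar or ssh.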
gzip
- Compress:
gzip -cr 220725_KEGG > KEGG.gz
- Decompress:
gzip -d KEGG.gz
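One caveat worth checking: gzip is a compressor, not an archiver. With `-r` it compresses each file under the directory, and `-c` sends all of those streams to standard output, so the single `.gz` decompresses to one concatenated file with no file names or directory layout. The sketch below shows this and the tar alternative; the scratch names are illustrative.

```shell
work=$(mktemp -d)
cd "$work"
mkdir KEGG_demo
echo "alpha" > KEGG_demo/a.txt
echo "beta"  > KEGG_demo/b.txt

# Same form as above: every file's gzip stream ends up in one output file.
gzip -cr KEGG_demo > KEGG_demo.gz

# Decompressing yields the file contents back to back; the names are gone.
gzip -dc KEGG_demo.gz | sort

# To keep names and structure, let tar do the archiving.
tar -zcf KEGG_demo.tar.gz KEGG_demo
tar -ztf KEGG_demo.tar.gz
```

For a single file, plain `gzip file` and `gzip -d file.gz` are enough; for a directory, the `.tar.gz` route from the first section is usually the safer choice.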