Tar up all PDFs in a directory, preserving directory structure
I'm trying to create a compressed tarball which contains all PDF files that exist in one of my directories. The directory structure needs to be preserved. Empty directories are not needed, but I really don't care if they are there.
For example, say I had a directory that looked like this:
dir
dir/subdir1
dir/subdir1/subsubdir1/song.mp3
dir/subdir2
dir/subdir2/subsubdir1
dir/subdir2/subsubdir1/document.pdf
dir/subdir2/subsubdir1/another-song.mp3
dir/subdir2/subsubdir1/top-ten-movies.txt
dir/subdir3
dir/subdir3/another-document.pdf
After running the command, I would like dir.tar.gz to contain this:
dir
dir/subdir2
dir/subdir2/subsubdir1
dir/subdir2/subsubdir1/document.pdf
dir/subdir3
dir/subdir3/another-document.pdf
Possible?
This will list all the PDFs:
$ find dir/ -name '*.pdf'
dir/subdir2/subsubdir1/document.pdf
dir/subdir3/another-document.pdf
You can pipe that to xargs to get it as a single space-delimited line, and feed that to tar to create the archive:
$ find dir/ -name '*.pdf' | xargs tar czf dir.tar.gz
(This way omits the empty directories.)
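The xargs pipeline above splits on whitespace, so it can break on file names containing spaces, and with a very long list xargs may run tar more than once, each run overwriting the archive. A minimal sketch of one way around both issues, assuming GNU tar (for --null and -T -) and a find that supports -print0; the scratch directory and the extra "name with spaces.pdf" file are made up for illustration:

```shell
#!/bin/sh
# Sketch only: rebuilds the example tree from the question in a scratch
# directory (hypothetical location) so it can be run safely anywhere.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p dir/subdir1/subsubdir1 dir/subdir2/subsubdir1 dir/subdir3
touch dir/subdir1/subsubdir1/song.mp3 \
      dir/subdir2/subsubdir1/document.pdf \
      dir/subdir2/subsubdir1/another-song.mp3 \
      dir/subdir2/subsubdir1/top-ten-movies.txt \
      dir/subdir3/another-document.pdf \
      "dir/subdir3/name with spaces.pdf"   # illustrative extra file

# find emits NUL-terminated paths; tar reads the list from stdin (-T -).
# --null must precede -T so that the list is parsed as NUL-delimited.
find dir -type f -name '*.pdf' -print0 |
    tar -czf dir.tar.gz --null -T -

# Show what ended up in the archive.
listing=$(tar -tzf dir.tar.gz | sort)
printf '%s\n' "$listing"
```

Because tar is invoked exactly once and reads the names itself, the length of the command line is not a concern here. The scratch directory is left in place so the archive can be inspected.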
With bash ≥ 4 (with the globstar option enabled via shopt -s globstar) or zsh, and GNU tar:
tar -czf dir.tar.gz dir/**/*.pdf
This might not work if you have a large number of PDF files and the command line is too long. Then you would need a more complex find-based solution (again, using GNU tar):
tar -cf dir.tar -T /dev/null
find dir -name '*.pdf' -exec tar -rf dir.tar {} +
gzip dir.tar
Alternatively (and portably) you can create the archive with pax.
pax -w -x ustar -s '/\.pdf$/&/' -s '/.*//' . | gzip >dir.tar.gz
The first -s says to include all .pdf files, without changing their name. The second -s says to rename all other files to an empty name, which actually means not to include them in the archive.
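Whichever approach is used, it may be worth checking that the resulting archive holds only PDFs. A rough sketch, assuming GNU tar and gzip; it builds a cut-down version of the example tree in a scratch directory (a hypothetical location), creates the archive with the append-based commands shown earlier, and then looks for entries that are neither directories nor .pdf files:

```shell
#!/bin/sh
# Sketch only: works in a throwaway directory, not on real data.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p dir/subdir2/subsubdir1 dir/subdir3
touch dir/subdir2/subsubdir1/document.pdf \
      dir/subdir2/subsubdir1/another-song.mp3 \
      dir/subdir3/another-document.pdf

# Build the archive: start empty, append PDFs in batches, then compress.
tar -cf dir.tar -T /dev/null
find dir -name '*.pdf' -exec tar -rf dir.tar {} +
gzip dir.tar

# Keep only entries that are neither directories (trailing slash) nor PDFs.
# grep exits non-zero when nothing is left, hence the "|| true".
unexpected=$(tar -tzf dir.tar.gz | grep -v -e '/$' -e '\.pdf$' || true)
if [ -z "$unexpected" ]; then
    echo "archive contains only PDFs"
else
    printf 'unexpected entries:\n%s\n' "$unexpected"
fi
```

The same tar -tzf check can be run against an archive produced by any of the other commands.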