Stream tar.gz file from FTP server
Posted by linker on Server Fault
Published on 2012-06-26T19:41:03Z
Here is the situation: I have a tar.gz file on an FTP server, and the archive can contain an arbitrary number of files.
What I'm trying to accomplish is to have this file streamed and uploaded to HDFS through a Hadoop job. The fact that it's Hadoop is not important; in the end, what I need is a shell script that fetches this file from the FTP server with wget and writes the output to a stream.
The reason why I really need to use streams is that there will be a large number of these files, and each file will be huge.
It's fairly easy to do if I have a gzipped file and I'm doing something like this:
wget -O - "ftp://${user}:${pass}@${host}/$file" | zcat
But I'm not sure this is even possible for a tar.gz file, especially since there are multiple files in the archive. I'm a bit confused about which direction to take; any help would be greatly appreciated.
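For reference, here is a rough sketch of what the tar.gz equivalent of the zcat pipeline might look like. It is not a tested solution for the Hadoop setup: the function names are made up for illustration, the second helper assumes GNU tar (for --to-command and the TAR_FILENAME variable), and the HDFS target directory in the usage comment is a placeholder.

```shell
# Hypothetical helper: reads a tar.gz stream on stdin and writes the
# contents of every archive member, back to back, to stdout.
# -x extract, -z gunzip, -O write members to stdout, -f - read from stdin.
stream_tar_gz() {
    tar -xzOf -
}

# Hypothetical helper: instead of concatenating the members, pipes each
# one into its own invocation of the command given as $1.
# Assumes GNU tar, which exports the member name as $TAR_FILENAME.
stream_tar_gz_each() {
    tar -xzf - --to-command="$1"
}

# Usage with the wget call from the question (not executed here;
# /some/hdfs/dir is a placeholder):
#   wget -O - "ftp://${user}:${pass}@${host}/$file" | stream_tar_gz
#   wget -O - "ftp://${user}:${pass}@${host}/$file" \
#     | stream_tar_gz_each 'hadoop fs -put - "/some/hdfs/dir/$TAR_FILENAME"'
```

The first form loses the boundaries between files, which only works if the downstream consumer does not care where one file ends and the next begins. The second keeps one stream per member, and nothing is unpacked to local disk in either case.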
© Server Fault or respective owner