Questions about writing your own multithreaded file downloader in Java
- by Shekhar
Hello
In my current company, I am doing a PoC on how to write a file downloader utility. We have
to use socket programming (TCP/IP) to download the files. One of the client's requirements
is that a large file should be transferred in chunks. For example, a 5 MB file could be
transferred by 5 threads that handle 1 MB each. I have written a small application that
downloads a file. You can download the Eclipse project
from http://www.fileflyer.com/view/QM1JSC0
A brief explanation of my classes
FileSender.java
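This class provides the bytes of the file. It has a method called
sendBytesOfFile(long start, long end, long sequenceNo), which returns the bytes from offset
start (inclusive) to end (exclusive), wrapped in a ByteArrayWrapper together with the sequence number.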
package com.filedownloader;

import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;

public class FileSender {
    private static final String FILE_NAME = "C:\\shared\\test.pdf";

    // Returns the bytes in [start, end) of the file, tagged with the given sequence number.
    public ByteArrayWrapper sendBytesOfFile(long start, long end, long sequenceNo) {
        try {
            File file = new File(FILE_NAME);
            // Note: this reads the whole file into memory on every call.
            byte[] fileBytes = FileUtils.readFileToByteArray(file);
            System.out.println("Size of file is " + fileBytes.length);
            System.out.println();
            System.out.println("Start " + start + " end " + end);
            byte[] bytes = getByteArray(fileBytes, start, end);
            ByteArrayWrapper wrapper = new ByteArrayWrapper(bytes, sequenceNo);
            return wrapper;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Copies bytes[start..end) into a new array.
    private byte[] getByteArray(byte[] bytes, long start, long end) {
        long arrayLength = end - start;
        System.out.println("Start : " + start + " end : " + end + " Arraylength : " + arrayLength
                + " length of source array : " + bytes.length);
        byte[] arr = new byte[(int) arrayLength];
        for (int i = (int) start, j = 0; i < end; i++, j++) {
            arr[j] = bytes[i];
        }
        return arr;
    }

    public static long fileSize() {
        File file = new File(FILE_NAME);
        return file.length();
    }
}
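For comparison, a range read can be sketched with java.io.RandomAccessFile, so that only the requested bytes are loaded instead of the whole file. This is a minimal sketch, not the code from the project: RangeReader and readRange are hypothetical names, and it assumes Java 7 or later (try-with-resources) and that a single chunk fits in one byte array.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper (not part of the original project).
final class RangeReader {

    // Reads the bytes in [start, end) of the file without loading the rest of it.
    static byte[] readRange(File file, long start, long end) throws IOException {
        if (start < 0 || end < start || end > file.length()) {
            throw new IllegalArgumentException("Invalid range " + start + ".." + end);
        }
        if (end - start > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("Chunk too large for one byte array");
        }
        byte[] chunk = new byte[(int) (end - start)];
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(start);      // jump straight to the first requested byte
            raf.readFully(chunk); // read exactly chunk.length bytes or fail
        }
        return chunk;
    }
}
```

With a helper like this, memory use per call is bounded by the chunk size rather than the file size.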
The second class is FileReceiver.java. This class receives the file.
A short explanation of what this class does:
It finds the size of the file to be fetched from the sender.
Depending on the size of the file, it computes the start and end positions of the byte range each thread has to read.
It starts n threads, giving each thread a start, an end, a sequence number, and a list that all the threads share.
Each thread reads its range of bytes and creates a ByteArrayWrapper.
The ByteArrayWrapper objects are added to the shared list.
Then I have a while loop that makes sure all threads have finished their work.
Finally, it sorts the list by sequence number.
Then the byte arrays are joined into one complete byte array, which is written to a file.
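The range computation in the steps above can be sketched as follows. One thing worth noting: if every thread gets exactly filesize/n bytes, integer division leaves up to n-1 trailing bytes unassigned (for example, 1,000,005 bytes split 10 ways gives ten ranges of 100,000 bytes and 5 bytes left over). The sketch below lets the last range absorb that remainder. ChunkRanges and split are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the original project).
final class ChunkRanges {

    // Returns {start, end} pairs (end exclusive) that together cover [0, fileSize).
    static List<long[]> split(long fileSize, int parts) {
        if (fileSize < 0 || parts <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and parts > 0");
        }
        long chunk = fileSize / parts; // integer division, remainder handled below
        List<long[]> ranges = new ArrayList<long[]>();
        for (int i = 0; i < parts; i++) {
            long start = i * chunk;
            // The last range runs to the end of the file so no bytes are dropped.
            long end = (i == parts - 1) ? fileSize : start + chunk;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(10, 3)) {
            System.out.println(r[0] + " .. " + r[1]); // 0 .. 3, 3 .. 6, 6 .. 10
        }
    }
}
```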
Code of FileReceiver
package com.filedownloader;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import org.apache.commons.io.FileUtils;

public class FileReceiver {

    public static void main(String[] args) {
        FileReceiver receiver = new FileReceiver();
        receiver.receiveFile();
    }

    public void receiveFile() {
        long startTime = System.currentTimeMillis();
        long numberOfThreads = 10;
        long filesize = FileSender.fileSize();
        System.out.println("File size received " + filesize);
        // Size of the range handled by each thread (integer division).
        long start = filesize / numberOfThreads;
        List<ByteArrayWrapper> list = new ArrayList<ByteArrayWrapper>();
        for (long threadCount = 0; threadCount < numberOfThreads; threadCount++) {
            FileDownloaderTask task = new FileDownloaderTask(threadCount * start, (threadCount + 1) * start, threadCount, list);
            new Thread(task).start();
        }
        while (list.size() != numberOfThreads) {
            // this is done so that all the threads should complete their work before processing further.
            //System.out.println("Waiting for threads to complete. List size "+list.size());
        }
        if (list.size() == numberOfThreads) {
            System.out.println("All bytes received " + list);
            // Put the chunks back in file order.
            Collections.sort(list, new Comparator<ByteArrayWrapper>() {
                @Override
                public int compare(ByteArrayWrapper o1, ByteArrayWrapper o2) {
                    long sequence1 = o1.getSequence();
                    long sequence2 = o2.getSequence();
                    if (sequence1 < sequence2) {
                        return -1;
                    } else if (sequence1 > sequence2) {
                        return 1;
                    } else {
                        return 0;
                    }
                }
            });
            // Join all chunks into one array.
            byte[] totalBytes = list.get(0).getBytes();
            byte[] firstArr = null;
            byte[] secondArr = null;
            for (int i = 1; i < list.size(); i++) {
                firstArr = totalBytes;
                secondArr = list.get(i).getBytes();
                totalBytes = concat(firstArr, secondArr);
            }
            System.out.println(totalBytes.length);
            convertToFile(totalBytes, "c:\\tmp\\test.pdf");
            long endTime = System.currentTimeMillis();
            System.out.println("Total time taken with " + numberOfThreads + " threads is " + (endTime - startTime) + " ms");
        }
    }

    private byte[] concat(byte[] A, byte[] B) {
        byte[] C = new byte[A.length + B.length];
        System.arraycopy(A, 0, C, 0, A.length);
        System.arraycopy(B, 0, C, A.length, B.length);
        return C;
    }

    private void convertToFile(byte[] totalBytes, String name) {
        try {
            FileUtils.writeByteArrayToFile(new File(name), totalBytes);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
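As an alternative to polling a shared list in a while loop, the "wait for all threads" step could be sketched with java.util.concurrent. A plain ArrayList is not safe for concurrent add() calls, whereas Future.get() blocks until each task is done, and collecting the futures in submission order keeps the chunks in file order without sorting. This is only a sketch: ParallelChunks, ChunkSource and fetchAll are hypothetical names, ChunkSource stands in for whatever actually fetches a byte range, and Java 8 or later is assumed for the lambda.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not part of the original project).
final class ParallelChunks {

    // Stand-in for the code that fetches bytes [start, end) from the sender.
    interface ChunkSource {
        byte[] fetch(long start, long end) throws Exception;
    }

    // Fetches all chunks in parallel and returns them in file order.
    static List<byte[]> fetchAll(ChunkSource source, long fileSize, int parts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            long chunk = fileSize / parts;
            List<Future<byte[]>> futures = new ArrayList<Future<byte[]>>();
            for (int i = 0; i < parts; i++) {
                final long start = i * chunk;
                // Last chunk runs to the end of the file.
                final long end = (i == parts - 1) ? fileSize : start + chunk;
                Callable<byte[]> task = () -> source.fetch(start, end);
                futures.add(pool.submit(task));
            }
            List<byte[]> chunks = new ArrayList<byte[]>();
            for (Future<byte[]> future : futures) {
                chunks.add(future.get()); // blocks until this chunk is ready
            }
            return chunks;
        } finally {
            pool.shutdown();
        }
    }
}
```

If a task throws, future.get() rethrows the failure wrapped in an ExecutionException, so a failed chunk is not silently lost.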
Code of ByteArrayWrapper
package com.filedownloader;

import java.io.Serializable;

public class ByteArrayWrapper implements Serializable {
    private static final long serialVersionUID = 3499562855188457886L;

    private byte[] bytes;
    private long sequence;

    public ByteArrayWrapper(byte[] bytes, long sequenceNo) {
        this.bytes = bytes;
        this.sequence = sequenceNo;
    }

    public byte[] getBytes() {
        return bytes;
    }

    public long getSequence() {
        return sequence;
    }
}
Code of FileDownloaderTask
package com.filedownloader;

import java.util.List;

public class FileDownloaderTask implements Runnable {
    private List<ByteArrayWrapper> list;
    private long start;
    private long end;
    private long sequenceNo;

    public FileDownloaderTask(long start, long end, long sequenceNo, List<ByteArrayWrapper> list) {
        this.list = list;
        this.start = start;
        this.end = end;
        this.sequenceNo = sequenceNo;
    }

    @Override
    public void run() {
        // Fetch this task's byte range and add it to the list shared by all tasks.
        ByteArrayWrapper wrapper = new FileSender().sendBytesOfFile(start, end, sequenceNo);
        list.add(wrapper);
    }
}
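Instead of collecting every chunk in a list and joining them in memory, each task could write its chunk straight to the right position in the output file. The following is a minimal sketch under that assumption. OffsetWriter and writeAt are hypothetical names, and Java 7 or later is assumed. Each call opens its own RandomAccessFile, so tasks do not share a file pointer.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper (not part of the original project).
final class OffsetWriter {

    // Writes chunk into target starting at the given byte offset.
    // Bytes outside [offset, offset + chunk.length) are left untouched.
    static void writeAt(File target, long offset, byte[] chunk) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
            raf.seek(offset); // seeking past the current end is allowed; the write extends the file
            raf.write(chunk);
        }
    }
}
```

With this approach the chunks can arrive in any order, and only one chunk per task needs to be held in memory at a time.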
Questions related to this code
1) Does file downloading become faster when multiple threads are used? In this code I am not able to see any benefit.
2) How should I decide how many threads to create?
3) Are there any open-source libraries that do this?
4) The file that FileReceiver writes is valid and not corrupted, but its checksum (I used FileUtils from commons-io) does not match that of the original. What is the problem?
5) This code runs out of memory when used with a large file (above 100 MB) because of the byte arrays that are created. How can I avoid this?
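For reference on question 4, this is roughly how the two files could be compared without loading either into memory. It is a minimal sketch using java.util.zip.CRC32 from the JDK rather than commons-io. FileChecksum and crc32 are hypothetical names, and Java 7 or later is assumed.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

// Hypothetical helper (not part of the original project).
final class FileChecksum {

    // Streams the file through CRC32 in 8 KB blocks and returns the checksum.
    static long crc32(File file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n); // only the bytes actually read
            }
        }
        return crc.getValue();
    }
}
```

Comparing crc32(source) with crc32(copy), together with source.length() and copy.length(), would show whether the copy is missing bytes: two files of different length cannot have identical content, even if the shorter one still opens fine in a viewer.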
I know this is very bad code, but I have to write this in one day :-). Please suggest any
other good way to do this.
Thanks
Shekhar