Python Forum
file transfer - Printable Version


file transfer - DPaul - Feb-15-2024

Hi,
My business is converting legacy files, lists, databases, images, PDFs, cards, etc.
into a format that can be searched by genealogy enthusiasts.
In doing this I create a lot of small files (millions) that need to be transferred to the server.
We are talking about a Windows 10/11 environment.
It is becoming a bottleneck, even using robocopy (from the Windows command line).
My transfer speed is +/- 30,000 to 50,000 files/hour.
Note: the files are mostly very small, around 25 KB (small PNG images).
Any suggestions to speed up the process, given that I cannot change the hardware configuration?
thx,
Paul


RE: file transfer - DeaD_EyE - Feb-15-2024

My first idea was to put everything in a database to get a single big file. But this will add overhead.

You could keep a database on both sides that holds only metadata such as path, mtime, size and hash. Before you transfer files, you update the database, and then you transfer only the changed and new files. On the other side, you must delete the files that were removed from the source.
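
A minimal sketch of that metadata idea, assuming a plain in-memory dict stands in for the database. The function names `build_manifest` and `diff_manifests` are made up for illustration, and SHA-256 is just one possible hash:

```python
import hashlib
import os
from pathlib import Path


def build_manifest(root):
    """Map each file's relative path to (size, mtime_ns, sha256 hex digest)."""
    root = Path(root)
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            stat = path.stat()
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            rel = path.relative_to(root).as_posix()
            manifest[rel] = (stat.st_size, stat.st_mtime_ns, digest)
    return manifest


def diff_manifests(source, target):
    """Return (to_copy, to_delete) as sorted lists of relative paths."""
    to_copy = []
    for rel, (size, _mtime, digest) in source.items():
        # Compare size and hash only: mtime can legitimately differ after a copy.
        if rel not in target or (target[rel][0], target[rel][2]) != (size, digest):
            to_copy.append(rel)
    to_delete = [rel for rel in target if rel not in source]
    return sorted(to_copy), sorted(to_delete)
```

With `to_copy, to_delete = diff_manifests(build_manifest(src), build_manifest(dst))` you get the two work lists. Hashing every file costs a full read, so for millions of files it may be cheaper to compare size and mtime first and hash only when those differ.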


RE: file transfer - Gribouillis - Feb-15-2024

If you can install rsync on Windows, you could try to synchronize the directories with rsync. This program transfers only the files that have changed.


RE: file transfer - DPaul - Feb-15-2024

Let me reflect on these proposals.
rsync does not seem a solution, because all the files are new (and they don't change)
Making one big file might be better, but how?
Think, think, think ...
thx,
Paul


RE: file transfer - Gribouillis - Feb-15-2024

(Feb-15-2024, 06:24 PM)DPaul Wrote: Making one big file might be better, but how?
If you need a single big file, you can just compress the directory containing all the files and images and transfer the compressed file.
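
As a rough sketch of how this could be scripted in Python with the standard `zipfile` module: the helper names `pack_directory` and `unpack_archive` are invented here, and using `ZIP_STORED` (no compression) is an assumption on my side, based on PNG data already being compressed, so recompressing it is unlikely to save much:

```python
import os
import zipfile
from pathlib import Path


def pack_directory(src_dir, archive_path):
    """Store every file under src_dir in one uncompressed ZIP; return the file count.

    archive_path should live outside src_dir, otherwise the archive
    would try to include itself.
    """
    src_dir = Path(src_dir)
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                path = Path(dirpath) / name
                # Keep paths relative so the tree can be recreated anywhere.
                zf.write(path, arcname=path.relative_to(src_dir).as_posix())
                count += 1
    return count


def unpack_archive(archive_path, dest_dir):
    """Extract the whole archive into dest_dir (run this on the receiving side)."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
```

Whether this beats a file-by-file copy depends on where the time actually goes, so it is worth timing both the packing and the unpacking step.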


RE: file transfer - DPaul - Feb-16-2024

It is becoming a real problem, so I'm ready to try anything. :)
Compress, you say. Not Zip?

- As long as the compressing, when added to the transfer time, does not take more time overall.
- I will need to establish what batch size I will transfer in one compressed file.
- I'll try with 25,000 first, then 50,000, and see if it is linear or not.
- The number of MB/GB must be proportional, of course.
- And the total time must be < robocopy.
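
A small timing harness for that kind of comparison might look like the sketch below. `time_batches` is a hypothetical helper, and `transfer` is a placeholder for whatever is being measured (zip + copy + unzip, a robocopy call, and so on):

```python
import time


def time_batches(transfer, files, batch_sizes):
    """Run transfer(batch) on the first n files for each n in batch_sizes.

    Returns a list of (file_count, seconds, files_per_hour) tuples.
    """
    results = []
    for n in batch_sizes:
        batch = files[:n]
        start = time.perf_counter()
        transfer(batch)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading from a trivially fast call.
        rate = len(batch) / elapsed * 3600 if elapsed > 0 else float("inf")
        results.append((len(batch), elapsed, rate))
    return results
```

If the files-per-hour figure stays roughly constant between the two batch sizes, the process scales linearly.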
I'll report back.
thanks,
Paul


RE: file transfer - DPaul - Feb-16-2024

(Feb-15-2024, 08:28 PM)Gribouillis Wrote: compress the directory containing all the files and images and transfer the compressed file.
I did a small preliminary test, before I try big transfers.
Batch = 11,629 PNG files = +/- 100 MB
1) Benchmark: ctrl-c, ctrl-v to an SSD via USB3 => time 3 min 6 seconds.
2) "Send to" (zip) -> ctrl-c, ctrl-v of the zip file -> SSD.
The transfer is of course lightning fast, but then you need to unzip. The whole process takes 3 min 8 seconds.
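
To put these two timings in the same unit as the robocopy figure (files/hour), here is a small calculation. The helper names are made up, and since this test went to a local SSD rather than to the server, the result is only a rough reference point and not a like-for-like comparison:

```python
def files_per_hour(file_count, seconds):
    """Convert a measured batch time into a files/hour rate."""
    return file_count / seconds * 3600


def megabytes_per_second(megabytes, seconds):
    """Average data rate of the batch in MB/s."""
    return megabytes / seconds


FILES = 11629          # batch size from the test
SIZE_MB = 100          # approximate batch size in MB
PLAIN_COPY_S = 3 * 60 + 6   # 1) plain copy: 3 min 6 s
ZIP_ROUTE_S = 3 * 60 + 8    # 2) zip, copy, unzip: 3 min 8 s

print(round(files_per_hour(FILES, PLAIN_COPY_S)))          # plain copy rate
print(round(files_per_hour(FILES, ZIP_ROUTE_S)))           # zip route rate
print(round(megabytes_per_second(SIZE_MB, PLAIN_COPY_S), 2))
```

Both routes come out at roughly 220,000 to 225,000 files/hour but only about half a MB per second, which suggests the cost is per file rather than per byte.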

Unless I am missing something, this is not very encouraging to continue.
Paul


RE: file transfer - buran - Feb-18-2024

I don't know how you create the files and how long it takes, but right now it looks like your workflow is to create all (or a batch of) files, then move them at once, and you clock that time. Is it a plausible approach to monitor the folder(s) and transfer any new file right after it is created, while more files are being created at the same time? Basically, the destination folder will mirror the source folder in [almost] real time.
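
A minimal polling sketch of that idea, using only the standard library. The names `copy_new_files` and `mirror_forever` and the 2-second interval are my own assumptions. It handles a single flat folder and assumes a file is complete once it shows up; if that is not the case, the producer could write to a temporary name and rename when done:

```python
import os
import shutil
import time
from pathlib import Path


def copy_new_files(src_dir, dst_dir, seen):
    """Copy files in src_dir whose names are not in `seen`; return the names copied."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    with os.scandir(src_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name not in seen:
                shutil.copy2(entry.path, dst_dir / entry.name)
                seen.add(entry.name)
                copied.append(entry.name)
    return sorted(copied)


def mirror_forever(src_dir, dst_dir, interval=2.0):
    """Poll src_dir every `interval` seconds and copy whatever is new."""
    seen = set()
    while True:
        copy_new_files(src_dir, dst_dir, seen)
        time.sleep(interval)
```

The transfer then overlaps with file creation, so the total wall-clock time is closer to the longer of the two steps than to their sum.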


RE: file transfer - DPaul - Feb-18-2024

(Feb-18-2024, 06:42 AM)buran Wrote: transfer any new file right after it is created
Yes, you are right. I did think of that, but:
The place where I create these "files", is not the same (physically)
as where the server is. Hence.
But as it is a real problem, and if I don't find another solution,
I'll have to do the development over there, at least for large batches.

The trick with the "send to" (zip), transfer, unzip would be OK,
but why does the unzip take so much more time than the zip?
thx,
Paul


RE: file transfer - buran - Feb-18-2024

(Feb-18-2024, 07:10 AM)DPaul Wrote: Yes, you are right. I did think of that, but:
The place where I create these "files", is not the same (physically)
as where the server is. Hence.
I don't see why this would be an issue, as long as they are on the same network or you otherwise establish a connection between the two. Actually, I never thought it was the same place, hence the need to transfer...