Jun-19-2018, 06:18 AM
(This post was last modified: Jun-19-2018, 06:19 AM by Gingmeister.)
Hi All,
Forgive my ignorance on the realities of networking, but... how can I support many users simultaneously with my (cloud-hosted) scraping app? I have written a demo that supports only a single user:
After user input (via browser), the script does several cycles of scraping and crunching, which usually takes 5-10 minutes. (Users will get the output by email.) This is fine for a single user, but how could I support 100 or even 1000 users simultaneously? A separate script running for every single user?! :-O
Queue? - not really feasible, because users won't wait very long for their report
Multi-thread? - this will slow things down (each thread will take longer than it would as a single thread - right?)
...so I am left wondering if I have to have one script running for every single user?!
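To make the multi-thread question concrete, here is a rough sketch of the "one pool of workers shared by many users" idea, as opposed to one script per user. It assumes the scraping part is mostly waiting on the network, which I have not measured. The names scrape_job, run_serial, run_pooled and timed are made up for illustration, and time.sleep stands in for the real scraping:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_job(user_id, delay=0.1):
    """Hypothetical stand-in for one user's scrape (assumed to be network-bound)."""
    time.sleep(delay)  # simulates waiting on the network; other threads can run meanwhile
    return f"report for user {user_id}"


def run_serial(user_ids):
    """Handle users one after another, like the single-user demo would."""
    return [scrape_job(u) for u in user_ids]


def run_pooled(user_ids, workers=10):
    """Handle users with a fixed pool of worker threads inside one process."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as user_ids
        return list(pool.map(scrape_job, user_ids))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    users = list(range(10))
    _, serial_s = timed(run_serial, users)
    _, pooled_s = timed(run_pooled, users)
    print(f"serial: {serial_s:.2f}s, pooled: {pooled_s:.2f}s")
```

If the work really is dominated by waiting, the pooled run should finish well before the serial one, so threads would not slow each user down in that case. The "crunching" part is different: CPU-heavy Python code in threads is limited by the interpreter lock, so that part would probably need separate processes or machines instead.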
Somewhere else I saw a reference to Twisted, but I am not sure how this fits.
There must be other web apps or cloud-based services that have to deliver serious real-time data crunching to users. How do they do that?
Thanks a lot for any help in advance - I really appreciate any advice.