Python Forum
While loop and read_lines() - Printable Version



RE: While loop and read_lines() - pooria - Feb-08-2019

Could you tell me the Python code?

I've been playing around with it for about 24 hours and I couldn't come up with a solution. I'd appreciate it.


RE: While loop and read_lines() - buran - Feb-08-2019

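pfile = 'processed.txt'  # lines that have already been processed
infile = 'source_file.txt'  # source file

# 'a+' creates processed.txt if it is missing and allows reading as well as appending
with open(infile, 'r') as inf, open(pfile, 'a+') as pf:
    pf.seek(0)  # rewind to the start before reading the processed lines
    processed_lines = {line.strip() for line in pf}
    for line in inf:
        query = line.strip()
        if query not in processed_lines:  # skip lines that were already processed
            # process query here
            pf.write(query + '\n')  # record the line as processed
            processed_lines.add(query)  # so a repeated line in the source is skipped too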



RE: While loop and read_lines() - pooria - Feb-08-2019

That is one elegant solution. Thank you.

Two questions:

1) Technically, my function should go right after "query = line.strip()", correct?
2) This sees which lines are processed and copies those to another file. But how can I delete them from the original file? I think I should change the "r" to "r+", then use either writelines() or truncate(). Could you show me that code as well?

And listen, I appreciate all the hard work you've done for me. These things must be a breeze for you, but that doesn't diminish your efforts.

If there is any way I could repay you, please let me know.


RE: While loop and read_lines() - buran - Feb-08-2019

(Feb-08-2019, 08:01 PM)pooria Wrote: 1) Technically, my function should go right after "query = line.strip()", correct?
It should go in place of the comment "# process query here".
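A minimal sketch of that placement, assuming a hypothetical process_query() function that stands in for your own code (its name and body are illustrative only and do not come from this thread):

```python
def process_query(query):
    """Hypothetical stand-in for the real work done on each line."""
    return query.upper()


def process_new_lines(infile, pfile):
    """Process every line of infile that is not yet recorded in pfile."""
    results = []
    with open(infile, 'r') as inf, open(pfile, 'a+') as pf:
        pf.seek(0)  # rewind before reading the already processed lines
        processed_lines = {line.strip() for line in pf}
        for line in inf:
            query = line.strip()
            if query and query not in processed_lines:
                # the function call sits where the comment used to be
                results.append(process_query(query))
                pf.write(query + '\n')
                processed_lines.add(query)
    return results
```

Calling process_new_lines('source_file.txt', 'processed.txt') a second time should do no work, because every line is already recorded in processed.txt.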
(Feb-08-2019, 08:01 PM)pooria Wrote: 2) This sees which lines are processed and copies those to another file. But how can I delete them from the original file? I think I should change the "r" to "r+", then use either writelines() or truncate(). Could you show me that code as well?
I am not a native English speaker, but "you CANNOT delete rows while iterating over the file" is simple enough to be correct and for you to understand.
This approach stores the processed lines in a new file while keeping the original file intact (again, you CANNOT delete rows while iterating over the file).
Then, in a separate run, you can read all the lines from the original file INTO MEMORY. Read all the lines from the processed-lines file. Iterate over the lines in memory and write to a NEW file those lines that are not processed. This new file is what you want. Technically you can overwrite the original file directly, but you run the risk of an unexpected error, and in that case you will lose all the information. That is why the correct way is to write to a new file and, only after it is written, delete the old file and rename the newly created one.
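The separate run described above could look roughly like this. It is only a sketch under assumptions: the function name remove_processed and the '.tmp' suffix for the new file are hypothetical, and os.replace() is used to do the "delete the old file and rename the new one" step in a single call.

```python
import os


def remove_processed(infile, pfile):
    """Rewrite infile so it keeps only the lines not listed in pfile."""
    with open(pfile, 'r') as pf:
        processed_lines = {line.strip() for line in pf}
    with open(infile, 'r') as inf:
        lines = inf.readlines()  # the whole source file is held in memory

    tmpfile = infile + '.tmp'  # hypothetical temporary name in the same directory
    with open(tmpfile, 'w') as out:
        for line in lines:
            if line.strip() not in processed_lines:
                out.write(line)

    # Swap the files only after the new one is fully written and closed.
    os.replace(tmpfile, infile)
```

Run it as its own step, for example remove_processed('source_file.txt', 'processed.txt'), once the processing script has finished and closed both files. If an error occurs before the os.replace() call, the original file is left unchanged.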


RE: While loop and read_lines() - pooria - Mar-11-2019

Hi,

I have updated my code.

It worked flawlessly before. But now it just asks one question and fails.

Could you take a look at it?

https://python-forum.io/Thread-Web-automation-quora