How to Analyze a File Line By Line With Python

Using the While Loop Statement to Analyze a Text File

One of the primary reasons people use Python is to analyze and manipulate text. If your program needs to work through a file, it is usually best to read the file one line at a time, which keeps memory use low and lets processing start before the whole file has been read. One straightforward way to do this is with a while loop.

Code Sample for Analyzing Text Line by Line

 import sys

 fileIN = open(sys.argv[1], "r")  # first command line argument names the file
 line = fileIN.readline()

 while line:
     # [some bit of analysis here]
     line = fileIN.readline()

 fileIN.close()
 

This code takes the first command line argument as the name of the file to be processed. The call to open() opens that file and creates a file object, "fileIN". The first call to readline() then reads the first line of the file and assigns it to a string variable, "line". The while loop keeps running as long as "line" is not empty. Each pass analyzes the current line and then reads the next one. A blank line in the file still contains its newline character, so it does not stop the loop; readline() returns an empty string only at the end of the file, and that is what ends the loop. The program then exits.
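The end-of-file behavior described above can be checked with a small, self-contained sketch. The function name count_nonblank_lines and the temporary sample file are assumptions made for illustration; they are not part of the original sample.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count the lines that contain something other than whitespace."""
    count = 0
    with open(path, "r") as file_in:
        line = file_in.readline()
        # readline() returns "" only at end of file. A blank line is "\n",
        # which is still truthy, so the loop does not stop early.
        while line:
            if line.strip():
                count += 1
            line = file_in.readline()
    return count


# Build a small throwaway file so the sketch runs without arguments.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\n\nthird\n")
    sample_path = tmp.name

nonblank_total = count_nonblank_lines(sample_path)
os.remove(sample_path)
print(nonblank_total)  # 2
```

The with statement is used here only so the file is closed automatically; the loop itself follows the same readline() pattern as the sample.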

Reading the file this way, the program never takes in more data than it is ready to process. Only the current line (plus a small read buffer) needs to be held in memory, and output can be produced incrementally instead of after the whole file has been loaded. The memory footprint of the program stays low, and the program does not stall while a large file is read up front. This can be important if you are writing a CGI script that may see a few hundred instances of itself running at a time.
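As a rough illustration of incremental output, the sketch below writes each result as soon as its line has been read, so nothing accumulates in a list. The function number_lines and the in-memory io.StringIO streams are hypothetical stand-ins for a real input file and output destination.

```python
import io


def number_lines(source, sink):
    """Copy source to sink, prefixing each line with its line number."""
    number = 0
    line = source.readline()
    while line:
        number += 1
        # Write the result immediately; only the current line is kept.
        sink.write(f"{number}: {line}")
        line = source.readline()
    return number


# In-memory streams stand in for real files in this sketch.
source = io.StringIO("alpha\nbeta\n")
sink = io.StringIO()
lines_seen = number_lines(source, sink)
print(sink.getvalue(), end="")
```

Because the function only calls readline() and write(), the same code should work unchanged with file objects returned by open().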

More About "While" in Python

The while loop statement repeatedly executes a target statement as long as the condition is true.

The syntax of the while loop in Python is: 

while expression:
 statement(s)

The statement may be a single statement or a block of statements. All the statements indented by the same amount are considered to be part of the same code block. Indentation is how Python indicates groups of statements.
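A minimal sketch of how indentation defines the loop body is shown below. The countdown function is a made-up example, not something from the file-reading sample.

```python
def countdown(start):
    """Return the numbers from start down to 1 as a list."""
    values = []
    remaining = start
    while remaining > 0:
        # Both indented statements belong to the loop body.
        values.append(remaining)
        remaining -= 1
    # Back at the outer indentation level: this runs once, after the loop.
    return values


print(countdown(3))  # [3, 2, 1]
```

If the condition is false the first time it is tested, as with countdown(0), the body never runs and the function returns an empty list.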