As seen in the previous post, opening a file creates a file handle that does not itself contain the data in the file. To read the file, we write a for loop that iterates over the file handle; in this post the loop is used to go through the file and count its lines.
Note: The examples in this post use the text of the Wikipedia page on Mathematics. Go to https://en.wikipedia.org/wiki/Mathematics, copy the text into a Notepad file and save it as "Mathematics.txt". The file cannot be read from Python with the default encoding because of an encoding issue: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4383: character maps to <undefined>. To avoid this error, we have to open the file with the encoding set to "utf-8". To do that, write the command for opening the file as follows.
file_name = open("Mathematics.txt", encoding='utf-8')

In the following example, we will write a program that opens "Mathematics.txt" and retrieves the number of lines in the file.
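As a side note on the encoding argument used above, the following is a minimal sketch of why it matters. It does not use the Wikipedia text: the sample sentence, the temporary file name and the choice of cp1252 as the "wrong" codec are assumptions made for illustration (the 'charmap' wording of the error suggests a Windows code page, but the post does not say which one).

```python
import os
import tempfile

# Hypothetical sample text. The letter "Á" is stored in UTF-8 as the two
# bytes 0xC3 0x81, and byte 0x81 has no meaning in the cp1252 code page.
sample_text = "Ágnes studies mathematics.\n"

# Write the sample to a temporary file as UTF-8 so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample_utf8.txt")
with open(path, "w", encoding="utf-8") as handle:
    handle.write(sample_text)

# Reading with the matching encoding returns the original text.
with open(path, encoding="utf-8") as handle:
    decoded = handle.read()
print(decoded == sample_text)

# Reading the same bytes as cp1252 (assumed here to be the default that
# caused the error) fails on byte 0x81.
decode_failed = False
try:
    with open(path, encoding="cp1252") as handle:
        handle.read()
except UnicodeDecodeError as error:
    decode_failed = True
    print("cp1252 failed:", error)
```

Passing encoding="utf-8" explicitly makes the result independent of the platform default, which is why the examples in this post always include it.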
Example 1: Write a program that opens "Mathematics.txt" and counts its lines, displaying the line count as the output of the program.
Solution: First of all, "Mathematics.txt" has to be located in the current Python working directory or in the folder in which the Python script is saved. The first step is then to open the "Mathematics.txt" file using the open function with 'utf-8' encoding.
fileM = open("Mathematics.txt", encoding="utf-8")

Then we have to define a variable that will be updated in each iteration of the for loop and used to count the lines in the "Mathematics.txt" file. The initial value assigned to this variable is 0.
LINE_COUNT = 0

Now we have to create a for loop that counts the lines of "Mathematics.txt". The body of the for loop contains a single line of code, which updates the LINE_COUNT variable.
for line in fileM:
    LINE_COUNT += 1

Before going further with the program, we have to describe the for loop. Here, the for loop is used to read the file. Python takes care of splitting the data stored in the file into separate lines using the newline character ("\n"). At each iteration, Python reads one line up to and including the newline, so the newline ("\n") is the last character of the line variable. Finally, we can show the line count value as the output of the program.
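The newline behaviour described for the for loop can be checked with a small sketch. The three-line sample file below is hypothetical and only stands in for "Mathematics.txt"; the variable names are likewise chosen for illustration.

```python
import os
import tempfile

# Hypothetical three-line sample file standing in for "Mathematics.txt".
path = os.path.join(tempfile.mkdtemp(), "sample_lines.txt")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("first line\nsecond line\nthird line\n")

line_count = 0
raw_lines = []
with open(path, encoding="utf-8") as handle:
    for line in handle:
        raw_lines.append(line)  # each line still ends with "\n"
        line_count += 1

print(repr(raw_lines[0]))          # 'first line\n'
print(raw_lines[0].rstrip("\n"))   # first line
print("line_count =", line_count)  # line_count = 3
```

Using repr makes the trailing newline visible, and rstrip("\n") is one way to drop it when only the text of the line is needed. With that behaviour in mind, Example 1 finishes by printing the counter: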
print("LINE_COUNT = {}".format(LINE_COUNT))

The entire code for this program is given below.
fileM = open("Mathematics.txt", encoding="utf-8")
LINE_COUNT = 0
for line in fileM:
    LINE_COUNT += 1
print("LINE_COUNT = {}".format(LINE_COUNT))

The output of the program:
LINE_COUNT = 525

Python also offers the ability to read the whole file into one string using the read method of the file handle. In the following example, we will read the entire file at once, so that the entire file becomes one string, and we will use the len function to determine the length of that string.

Example 2: Read the entire file "Mathematics.txt" into a variable STRING, assign the length of that string to a variable STRING_LENGTH, and show the length as the output of the program.
Solution: The file "Mathematics.txt" is opened in the same way as in the previous example.
fileM = open("Mathematics.txt", encoding="utf-8")

Now we call the read method and assign its result to a variable STRING.
STRING = fileM.read()

To determine the length of the string, we use the len function and assign the result to a variable STRING_LENGTH.
STRING_LENGTH = len(STRING)

Finally, we will show the length of STRING as the output.
print("STRING_LENGTH = {}".format(STRING_LENGTH))

The entire code in this example:
fileM = open("Mathematics.txt", encoding="utf-8")
STRING = fileM.read()
STRING_LENGTH = len(STRING)
print("STRING_LENGTH = {}".format(STRING_LENGTH))

The output is given below.
STRING_LENGTH = 64534

The output shows that the "Mathematics.txt" file contains 64534 characters in total. Since the read method returns the entire content of fileM as a string, we can apply all string methods to it. For example, if we want to slice the string, i.e. see the characters from index 500 up to (but not including) index 600, we type:
print("STRING[500:600] = {}".format(STRING[500:600]))

which will generate the following output.
STRING[500:600] = line citations. Please help to improve this article by introducing more precise citations. (June 202
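To connect the two examples, here is a minimal sketch showing that the string returned by read has the same length as all lines of the for loop taken together (newlines included), and that slices and other string methods work on it as on any string. The sample file and its content are hypothetical and replace "Mathematics.txt" so that the sketch runs on its own.

```python
import os
import tempfile

# Hypothetical sample file; its content is chosen only for illustration.
path = os.path.join(tempfile.mkdtemp(), "sample_read.txt")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("alpha\nbeta\ngamma\n")

# Read the whole file as one string.
with open(path, encoding="utf-8") as handle:
    text = handle.read()

# Sum the lengths of the individual lines, newline characters included.
with open(path, encoding="utf-8") as handle:
    total = sum(len(line) for line in handle)

print(len(text))           # 17
print(len(text) == total)  # True
print(repr(text[6:10]))    # 'beta' (index 10 itself is excluded)
print(text.count("\n"))    # 3
```

Because the end index of a slice is excluded, a slice such as STRING[500:600] always contains at most 100 characters.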