13. Reading and Writing Files
Files can generally be categorized into binary files and text files. Binary files store data as raw bytes that are not meant to be read as text; this allows compact storage and is used by well-known formats for images, video, and audio. Such files usually require specialized tools or Python libraries to read. Text files, on the other hand, can be read with a standard text editor and include formats like .txt, .csv, .yml, or .py. These are the files we’ll primarily focus on.
13.1. Reading Files
Files can be opened in Python using the open() function. For example, consider a file named testfile.txt located in the same directory as your script:
data = open("testfile.txt", "r")
print(data.readline())  # readline() returns 'Here is some sample text for testing.\n'
data.close()  # release the file handle when done
Here, the "r" argument specifies the mode in which the file is opened. The following table summarizes the most common modes:
Mode | Meaning | Explanation
---|---|---
"r" | Read mode, text | Cursor starts at the beginning of the file. Does not create new files.
"rb" | Read mode, binary |
"w" | Write mode, text | Overwrites existing files or creates a new file.
"wb" | Write mode, binary |
"a" | Append mode, text | Writes to the end of the file without overwriting the existing content.
"ab" | Append mode, binary |
"x" | Exclusive creation, text | Fails if the file exists.
"xb" | Exclusive creation, binary |
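To make the table more concrete, here is a minimal sketch of how a few of these modes behave. The file names are hypothetical, and a temporary directory (from the standard tempfile module) is used so that no real files are touched.

```python
import os
import tempfile

# The temporary directory is deleted automatically at the end of the block.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # "r" does not create files: opening a missing file raises an error.
    try:
        f = open(path, "r")
        f.close()
        missing_file_failed = False
    except FileNotFoundError:
        missing_file_failed = True

    # "x" creates the file, but only if it does not exist yet.
    f = open(path, "x")
    f.write("first line\n")
    f.close()

    # A second "x" on the same path fails because the file now exists.
    try:
        f = open(path, "x")
        f.close()
        second_x_failed = False
    except FileExistsError:
        second_x_failed = True

    # "rb" returns raw bytes instead of a str.
    f = open(path, "rb")
    raw = f.read()
    f.close()

print(missing_file_failed)  # True
print(second_x_failed)      # True
print(type(raw))            # <class 'bytes'>
```

The exact bytes in raw can differ between operating systems, because text mode may translate the line ending when writing.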
13.1.1. Reading Line by Line
You can also read a file line by line:
data = open("testfile.txt", "r")
for line in data:
    print(line, end="")  # each line already ends with "\n"
data.close()
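Besides iterating over the file object, file objects also offer read() and readlines(). The following sketch compares the three approaches on a small sample file; the file content is made up for illustration and written to a temporary directory first.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "testfile.txt")

    # Create a small sample file (content is made up for this example).
    sample = open(path, "w")
    sample.write("first line\nsecond line\n")
    sample.close()

    # read() returns the whole file as a single string.
    data = open(path, "r")
    whole_text = data.read()
    data.close()

    # readlines() returns a list of lines, each still ending in "\n".
    data = open(path, "r")
    lines = data.readlines()
    data.close()

    # Iterating reads one line at a time; rstrip("\n") drops the line break.
    data = open(path, "r")
    stripped = [line.rstrip("\n") for line in data]
    data.close()

print(repr(whole_text))  # 'first line\nsecond line\n'
print(lines)             # ['first line\n', 'second line\n']
print(stripped)          # ['first line', 'second line']
```

For large files, iterating is usually the safer choice, since read() and readlines() load the entire file into memory.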
If the file isn’t in the same folder, you need to provide the file path.
Windows: Use "C:/User/my_folder/testfile.txt" or escape backslashes like "C:\\User\\my_folder\\testfile.txt".
macOS/Linux: Use "/home/user/my_folder/testfile.txt".
You can also use relative paths:
data = open("my_data/testfile2.txt", "r")
For advanced path handling, consider using standard-library modules like os.
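As a small sketch of what such path handling can look like, the example below builds the relative path from above with os.path.join. It also shows pathlib, which is not mentioned in the text but is another standard-library module many people find convenient. Both pick the correct path separator for the current operating system.

```python
import os
from pathlib import Path

# Build the relative path from the example above without hard-coding "/" or "\\".
joined = os.path.join("my_data", "testfile2.txt")

# pathlib (an alternative, not used in the text) combines parts with "/".
path = Path("my_data") / "testfile2.txt"

print(path.name)      # testfile2.txt
print(path.suffix)    # .txt
print(path.parent)    # my_data
print(path.exists())  # True only if the file really exists on your machine
```

A Path object can be passed directly to open(), so open(path, "r") works just like it does with a string.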
13.2. Writing Files
To write to a file, use the write() method. First, open a file in write mode:
output = open("output_file.txt", "w")
text = "Now for something completely different."
output.write(text + "\n")
output.close()
With mode "w", the file is created from scratch each time, so any existing content is overwritten. To append to an existing file instead, use "a":
output = open("output_file.txt", "a")
output.write("Yes. Exactly.\n")
output.write("Yes " * 5 + "\n")
output.close()
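The difference between the two modes is easiest to see when the same file is opened several times. This is a minimal sketch that writes to a temporary directory instead of your working folder; the file name matches the one used above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output_file.txt")

    # Opening with "w" twice: the second run replaces what the first one wrote.
    for _ in range(2):
        output = open(path, "w")
        output.write("Now for something completely different.\n")
        output.close()

    # Opening with "a" twice: each run adds a line at the end.
    for _ in range(2):
        output = open(path, "a")
        output.write("Yes. Exactly.\n")
        output.close()

    check = open(path, "r")
    content = check.read()
    check.close()

print(content)
# Now for something completely different.
# Yes. Exactly.
# Yes. Exactly.
```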
13.2.1. Using with for File Management
A common and recommended approach is to use the with statement for file handling. This ensures the file is properly closed, even if an error occurs.
with open("output_file.txt", "w") as file:
    for i in range(1, 6):
        file.write(f"Line {i} \n")
Reading a file with with:
with open("output_file.txt", "r") as file:
    for line in file:
        if "3" in line:
            print(line, end="")  # the line already ends with "\n"
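The claim that the file is closed even when an error occurs can be checked with the closed attribute of the file object. The sketch below simulates an error inside the block and then shows a rough try/finally equivalent of what with does for files; the error itself is artificial.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output_file.txt")

    with open(path, "w") as file:
        file.write("Line 1\n")
    # After the block, the file has been closed automatically.
    closed_after_block = file.closed

    try:
        with open(path, "r") as file:
            raise ValueError("simulated problem while processing the file")
    except ValueError:
        # The file was closed before the error reached this handler.
        closed_after_error = file.closed

    # Roughly the same behavior written by hand with try/finally.
    file = open(path, "r")
    try:
        first_line = file.readline()
    finally:
        file.close()

print(closed_after_block)  # True
print(closed_after_error)  # True
```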
13.3. Encoding
Computers don’t inherently understand letters or characters, only bytes, which are numbers from 0 to 255. To map these numbers to characters, encoding systems are used.
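The mapping between characters and numbers can be inspected directly in Python. This short sketch uses the string methods encode() and decode() together with the built-ins ord() and chr(); the sample text is arbitrary.

```python
text = "Hi!"

# encode() turns a string into bytes according to an encoding.
data = text.encode("ascii")

# A bytes object is a sequence of numbers between 0 and 255.
numbers = list(data)
print(numbers)  # [72, 105, 33]

# decode() reverses the mapping and turns the numbers back into characters.
restored = bytes(numbers).decode("ascii")
print(restored)  # Hi!

# ord() and chr() convert a single character to its number and back.
print(ord("H"), chr(72))  # 72 H
```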
13.3.1. ASCII
ASCII (American Standard Code for Information Interchange) is a 7-bit encoding that can represent 128 characters. However, it lacks many special characters, such as umlauts and accented letters.
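What the missing special characters mean in practice can be seen by trying to encode text as ASCII. In this sketch, the two sample strings are arbitrary: the plain English word encodes without problems, while the German word with an umlaut and a sharp s raises a UnicodeEncodeError.

```python
plain = "Hello"
special = "Grüße"  # contains characters outside ASCII

# Every ASCII character maps to a number in the range 0 to 127.
ascii_bytes = plain.encode("ascii")
all_below_128 = all(value < 128 for value in ascii_bytes)

# Characters such as "ü" and "ß" have no ASCII code, so encoding fails.
try:
    special.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(all_below_128)  # True
print(ascii_failed)   # True
```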
13.3.2. Unicode
Unicode has become the standard, especially for the web (since HTML 4.0), supporting over 100,000 characters. The most commonly used Unicode encoding in Python is UTF-8: it stores ASCII characters in a single byte and uses two to four bytes for everything else, which keeps typical text compact.
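How UTF-8 saves space can be illustrated by counting the bytes it needs for different characters. The four sample characters below are chosen for illustration and written as escape sequences so the example does not depend on how this page itself is encoded.

```python
samples = [
    "a",           # plain ASCII letter
    "\u00e4",      # a with umlaut
    "\u20ac",      # euro sign
    "\U0001F600",  # grinning face emoji
]

# Number of bytes each character occupies when encoded as UTF-8.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)  # [1, 2, 3, 4]

# For pure ASCII text, UTF-8 and ASCII produce identical bytes.
same_as_ascii = "Hello".encode("utf-8") == "Hello".encode("ascii")
print(same_as_ascii)  # True
```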
For more on encoding, refer to the SelfHTML Wiki on character encoding.
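13.3.3. Handling Encoding in Python
If you encounter issues with special characters (e.g., umlauts), the cause is often a mismatch between the encoding used to write the file and the one used to read it. When no encoding is given, open() may fall back to a platform-dependent default. To avoid such problems, explicitly specify the encoding: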
with open("output_file.txt", "w", encoding="utf-8") as file:
    for i in range(1, 6):
        file.write(f"Line {i} \n")
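The effect of an encoding mismatch can be reproduced on purpose. The sketch below writes a line with umlauts as UTF-8 and then reads it back twice: once with the matching encoding and once with Latin-1, which is used here only as an example of a wrong choice. The file name and text are made up.

```python
import os
import tempfile

text = "Grüße aus Köln\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "umlauts.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as file:
        file.write(text)

    # Same encoding for reading and writing: the text comes back unchanged.
    with open(path, "r", encoding="utf-8") as file:
        correct = file.read()

    # Wrong encoding: each two-byte UTF-8 character is read as two characters.
    with open(path, "r", encoding="latin-1") as file:
        garbled = file.read()

print(correct == text)  # True
print(garbled == text)  # False
print(len(text), len(garbled))  # garbled is longer than the original
```

Reading with the wrong encoding does not always raise an error, which is why such problems often go unnoticed until the garbled text shows up in the output.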