Description
Idea 1: LoopGlobalUsage
Description
Accessing global variables in Python is slower than accessing local variables: in CPython, locals are read from a fixed-size array by index, while globals require a dictionary lookup in the module namespace. The difference is marginal per access, but it can become significant inside a loop, where the same global lookup is repeated on every iteration.
Compliant Code Example
d = {
    "x": 1234,
    "y": 5678,
}

def copy_dict_key_to_fast():
    # Read the global dict once and keep the values in local variables
    i = d["x"]
    j = d["y"]
    for _ in range(100000):
        i + j
        i + j
        i + j
        i + j
        i + j
Non-Compliant Code Example
d = {
    "x": 1234,
    "y": 5678,
}

def dont_copy_dict_key_to_fast():
    for _ in range(100000):
        # Every expression looks up the global `d` again
        d["x"] + d["y"]
        d["x"] + d["y"]
        d["x"] + d["y"]
        d["x"] + d["y"]
        d["x"] + d["y"]
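A rough way to check the claim on a given interpreter is to time both variants. The sketch below is only an illustration: the function names `sum_with_globals`, `sum_with_locals` and `measure` are made up for this example, and the numbers it produces depend on the Python version and machine.

```python
import timeit

d = {"x": 1234, "y": 5678}

def sum_with_globals(n):
    total = 0
    for _ in range(n):
        # Global name lookup plus dict lookup on every iteration
        total += d["x"] + d["y"]
    return total

def sum_with_locals(n):
    # Copy the values into locals once, before the loop
    i = d["x"]
    j = d["y"]
    total = 0
    for _ in range(n):
        total += i + j
    return total

def measure(n=100_000, repeats=5):
    """Return (global_seconds, local_seconds), best of `repeats` runs."""
    t_global = min(timeit.repeat(lambda: sum_with_globals(n), number=1, repeat=repeats))
    t_local = min(timeit.repeat(lambda: sum_with_locals(n), number=1, repeat=repeats))
    return t_global, t_local
```

Calling `measure()` returns the two timings so they can be compared; both functions compute the same total, so only the lookup strategy differs.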
Idea 2: DottedImportInLoop
Description
Accessing submodules or functions through dotted names (e.g., os.path.exists) within loops is inefficient. On each iteration the interpreter resolves the module name and then performs one attribute lookup per dot. Importing the target name directly removes those attribute lookups and improves performance, especially in tight loops.
Compliant Code Example
from os import environ
from os.path import exists

def test_direct_import(items):
    for item in items:
        val = environ[item]

def test_direct_import_exists(items):
    for item in items:
        val = exists(item)
Non-Compliant Code Example
import os  # NOQA

def test_dotted_import(items):
    for item in items:
        val = os.environ[item]  # Use `from os import environ`

def even_worse_dotted_import(items):
    for item in items:
        val = os.path.exists(item)  # Use `from os.path import exists` instead
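The extra lookups can be made visible by inspecting the bytecode. The sketch below counts attribute-load instructions with the standard `dis` module; the helper name `attribute_loads` is made up for this example, and the opcode names it checks (`LOAD_ATTR`, `LOAD_METHOD`) are CPython implementation details that vary between versions, so treat the counts as indicative only.

```python
import dis
import os
from os.path import exists

def dotted(items):
    found = 0
    for item in items:
        if os.path.exists(item):  # two attribute lookups per iteration
            found += 1
    return found

def direct(items):
    found = 0
    for item in items:
        if exists(item):  # a single name lookup, no attribute access
            found += 1
    return found

def attribute_loads(func):
    """Count attribute-lookup instructions in the function's bytecode."""
    return sum(
        1
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_ATTR", "LOAD_METHOD")
    )
```

Comparing `attribute_loads(dotted)` with `attribute_loads(direct)` shows how many attribute lookups the direct import removes from the loop body.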
Idea 3: UseTupleOverList
Description
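Tuples are immutable and more efficient than lists for fixed-size, read-only sequences. In CPython a tuple uses less memory than a list of the same length, and a tuple of constants can be built once at compile time instead of being rebuilt on every call.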
Compliant Code Example
def get_coordinates():
    # Using tuple for immutable data
    coordinates = (10, 20, 30)
    for i in range(3):
        print(f"Coordinate {i}: {coordinates[i]}")
    return coordinates
Non-Compliant Code Example
def get_coordinates():
    # Using list for immutable data is less efficient
    coordinates = [10, 20, 30]
    for i in range(3):
        print(f"Coordinate {i}: {coordinates[i]}")
    return coordinates
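The memory side of this can be checked with `sys.getsizeof`. This is a small sketch that assumes CPython (other interpreters may not support `getsizeof`); the helper name `container_sizes` is made up for this example, and the exact byte counts differ between versions.

```python
import sys

coordinates_tuple = (10, 20, 30)
coordinates_list = [10, 20, 30]

def container_sizes():
    """Return (tuple_bytes, list_bytes) for the two containers themselves."""
    return sys.getsizeof(coordinates_tuple), sys.getsizeof(coordinates_list)
```

`getsizeof` reports only the container, not the integers it references, which is what matters here since both containers hold the same elements.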
Idea 4: UseContextManagersForFileOperations
Description
Using with statements to handle file operations ensures files are properly closed, even if an exception occurs. This prevents resource leaks and improves resource efficiency.
Compliant Code Example
def read_data_from_file(filename):
    # Using context manager ensures file is properly closed
    with open(filename, 'r') as file:
        data = file.read()
    # File is automatically closed when exiting the with block
    return data
Non-Compliant Code Example
def read_data_from_file(filename):
    # Manually opening file risks forgetting to close it
    file = open(filename, 'r')
    data = file.read()
    # If an exception occurs before this line, file remains open
    file.close()
    return data
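To see the "even if an exception occurs" part in action, the sketch below raises inside a with block and then checks the file object's `closed` attribute. The function names `read_and_fail` and `demo` are made up for this example, and the temporary file exists only so the sketch is self-contained.

```python
import os
import tempfile

def read_and_fail(path):
    """Raise while the file is open, then report whether it was closed."""
    handle = None
    try:
        with open(path, "r") as file:
            handle = file
            file.read()
            raise ValueError("simulated failure while the file is open")
    except ValueError:
        pass
    # The with block closed the file before the exception left it
    return handle.closed

def demo():
    # Create a throwaway file to read from
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("hello")
        path = tmp.name
    try:
        return read_and_fail(path)
    finally:
        os.remove(path)
```

`demo()` returns True, showing that the file was closed even though the block exited through an exception.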
Idea 5: AvoidDataParallelInsteadOfDistributedDataParallel
Context
This rule is already implemented by AghilesAzzoug here:
https://github.com/AghilesAzzoug/GreenPyData/blob/main/pytorch-plugin/src/main/java/fr/greenpydata/pytorch/checks/AvoidDataParallelInsteadofDistributedDataParallel.java
Description
Use DistributedDataParallel (DDP) instead of DataParallel in PyTorch, even for single-node multi-GPU setups. DDP offers better performance, scalability, and reduced overhead.
Compliant Code Example
import torch

def compliant_example_with_ddp(model):
    # setup(rank, world_size) is assumed to be a helper (not shown) that initialises the process group
    setup(0, 1)
    model_ = torch.nn.parallel.DistributedDataParallel(model)
    optim = torch.optim.SGD(model_.parameters(), lr=1e-3)
    for _ in range(10):
        inputs = torch.rand(5, 10, requires_grad=True)
        targets = torch.randint(0, 2, (5,), dtype=torch.int64)
        preds = model_(inputs)
        loss = torch.nn.functional.cross_entropy(preds, targets)
        loss.backward()
        optim.step()
        optim.zero_grad(set_to_none=True)
Non-Compliant Code Example
def non_compliant_example_fully_qualified(model):
    setup(0, 1)
    model_ = torch.nn.DataParallel(model)  # Noncompliant {{Use DistributedDataParallel instead of DataParallel.}}
    print(model_(torch.rand(5, 10)))
References
https://github.com/AghilesAzzoug/GreenPyData
Idea 6: UseInPlaceOperationsInModulesWhenPossible
Context
This rule is already implemented by AghilesAzzoug here:
https://github.com/AghilesAzzoug/GreenPyData/blob/main/pytorch-plugin/src/main/java/fr/greenpydata/pytorch/checks/UseInPlaceOperationsInModulesWhenPossible.java
Description
In PyTorch, in-place operations reduce memory usage and improve performance by modifying tensors directly. This is especially useful inside sequential modules like nn.Sequential.
Compliant Code Example
import torch.nn as nn
import torch.nn.functional as F

class CompliantReLU(nn.Module):
    def __init__(self):
        super(CompliantReLU, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1),
            nn.MaxPool2d(2),
            nn.ReLU(inplace=True),
            nn.Conv2d(10, 20, kernel_size=5, bias=True),
            nn.BatchNorm2d(20),
            nn.MaxPool2d(2),
            nn.ReLU(inplace=True)
        )
        self.dense1 = nn.Linear(in_features=320, out_features=50)
        self.dense1_bn = nn.BatchNorm1d(50)
        self.dense2 = nn.Linear(50, 10)

    def forward(self, x):
        x = self.encoder(x)
        x = x.view(-1, 320)
        x = F.relu(self.dense1_bn(self.dense1(x)))
        return F.relu(self.dense2(x))
Non-Compliant Code Example
class NonCompliantReLU(nn.Module):
    def __init__(self):
        super(NonCompliantReLU, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1),
            nn.MaxPool2d(2),
            nn.ReLU(),  # Noncompliant {{Use InPlace operations when possible.}}
            nn.Conv2d(10, 20, kernel_size=5, bias=True),
            nn.BatchNorm2d(20),
            nn.MaxPool2d(2),
            nn.ReLU(inplace=False)  # Noncompliant {{Use InPlace operations when possible.}}
        )
        self.dense1 = nn.Linear(in_features=320, out_features=50)
        self.dense1_bn = nn.BatchNorm1d(50)
        self.dense2 = nn.Linear(50, 10)

    def forward(self, x):
        x = self.encoder(x)
        x = x.view(-1, 320)
        x = F.relu(self.dense1_bn(self.dense1(x)))
        return F.relu(self.dense2(x))
References
https://github.com/AghilesAzzoug/GreenPyData
Idea 7: AsynchronousDataTransfer
Description
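Asynchronous host-to-GPU transfers using non_blocking=True allow data transfer to overlap with computation. The overlap only happens when the source tensor lives in pinned (page-locked) host memory, for example when it comes from a DataLoader created with pin_memory=True or from tensor.pin_memory().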
Compliant Code Example
x = x.to("cuda:0", non_blocking=True)
y = y.to("cuda:0", non_blocking=True)
Non-Compliant Code Example
x = x.to("cuda:0")
y = y.to("cuda:0")
Idea 8: AvoidTryCatchInLoop
Description
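Placing try-except blocks inside loops can degrade performance because the exception handler is set up again on every iteration. The cost depends on the interpreter: since CPython 3.11, entering a try block is nearly free when no exception is raised, so the gain is mostly relevant on older versions. When per-item error handling is not required, move exception handling outside the loop.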
Compliant Code Example
def process_items(items):
    results = []
    for item in items:
        # Regular processing without try-except in the loop
        result = transform_item(item)
        results.append(result)
    try:
        # Handle all results at once outside the loop
        return finalize_results(results)
    except Exception as e:
        print(f"Error finalizing results: {e}")
        return None
Non-Compliant Code Example
def process_items(items):
    results = []
    for item in items:
        try:
            # Try-except inside the loop is inefficient
            result = transform_item(item)
            results.append(result)
        except Exception as e:
            print(f"Error processing item {item}: {e}")
    return results
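Moving the handler out of the loop is not a pure refactoring, because it changes what happens after a failure. The sketch below illustrates the difference; `transform_item` here is a hypothetical stand-in (the real one is not shown in this issue), and the two function names are made up for this example.

```python
def transform_item(item):
    # Hypothetical stand-in: fails with ZeroDivisionError for item == 0
    return 10 / item

def handle_per_item(items):
    results = []
    for item in items:
        try:
            results.append(transform_item(item))
        except ZeroDivisionError:
            continue  # skip the bad item and keep going
    return results

def handle_around_loop(items):
    results = []
    try:
        for item in items:
            results.append(transform_item(item))
    except ZeroDivisionError:
        pass  # the first failure ends the whole loop
    return results
```

With the input `[1, 0, 2]`, the per-item version keeps processing after the bad item, while the version with the handler around the loop stops at it. A rule implementation would probably need to account for this before flagging code that relies on per-item recovery.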
Idea 9: UseBuiltInFunctions
Context
This rule comes from the 2023 hackathon and is not yet implemented in the codebase.
Previous discussion: green-code-initiative/creedengo-challenge#47
This rule can be divided into sub-rules.
Description
Built-in functions in Python (like max, sum, etc.) are implemented in C and are much faster than custom Python equivalents. Use them when available for better performance.
Compliant Code Example
def find_max_value(data):
    # Using built-in max function
    return max(data)
Non-Compliant Code Example
def find_max_value(data):
    # Manual implementation is slower than built-in
    max_val = data[0]
    for item in data[1:]:
        if item > max_val:
            max_val = item
    return max_val
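Since the rule can be divided into sub-rules, the same pattern can be sketched for other built-ins. The example below pairs hand-written loops with `sum` and `min`; which built-ins the final rule should cover is an open question, and the function names are made up for this example.

```python
def manual_sum(data):
    total = 0
    for item in data:
        total += item
    return total

def manual_min(data):
    smallest = data[0]
    for item in data[1:]:
        if item < smallest:
            smallest = item
    return smallest

def builtin_sum(data):
    # The loop runs inside the interpreter's C implementation
    return sum(data)

def builtin_min(data):
    return min(data)
```

The built-in and manual versions return the same values for non-empty input. They differ on empty input: `min([])` raises ValueError, whereas the manual version raises IndexError, which a detector or an automatic fix would need to keep in mind.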
References
green-code-initiative/creedengo-challenge#47
Idea 10: UseNumpyArrayInsteadOfStandardList
Context
This rule comes from the 2023 hackathon and is not yet implemented in the codebase.
Previous discussion:
green-code-initiative/creedengo-challenge#31
Description
NumPy arrays are significantly more efficient than standard Python lists for numerical operations and large datasets. They offer vectorized operations and more compact storage; the benefit is largest when whole-array operations replace element-by-element Python loops.
Compliant Code Example
import numpy as np

my_array = np.arange(1000000)
for item in my_array:
    tmp = item
Non-Compliant Code Example
my_list = list(range(1000000))
for item in my_list:
    tmp = item
Idea 11: W8402 - UseListCopyInsteadOfForLoop
Description
Use list() or .copy() to duplicate lists instead of manually looping through elements. This is more concise and optimized.
Compliant Code Example
def copy_list(original):
    # Using list constructor for copying
    filtered = list(original)
    return filtered

# Alternative using copy method
def copy_list_alt(original):
    filtered = original.copy()
    return filtered
Non-Compliant Code Example
def copy_list(original):
    # Inefficient way to copy a list
    filtered = []
    for i in original:
        filtered.append(i)
    return filtered
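All of these approaches produce a shallow copy: a new list whose elements are the same objects as in the original. The sketch below checks that the loop, `list()`, `.copy()` and slicing agree on this; the variable and function names are made up for this example.

```python
def copy_with_loop(original):
    copied = []
    for element in original:
        copied.append(element)
    return copied

original = [[1, 2], [3, 4]]

by_loop = copy_with_loop(original)
by_constructor = list(original)
by_method = original.copy()
by_slice = original[:]

# Every copy is a distinct list that shares its elements with the original
copies = (by_loop, by_constructor, by_method, by_slice)
all_equal = all(c == original for c in copies)
all_distinct = all(c is not original for c in copies)
all_shallow = all(c[0] is original[0] for c in copies)
```

Because the behaviour is the same, replacing the loop with `list(original)` or `original.copy()` should be safe for plain lists.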
Idea 12: W8201 - LoopInvariantStatement
Description
Avoid evaluating expressions that remain constant across iterations inside loops. Move them outside the loop to save redundant computations.
Compliant Code Example
def process_data(data):
    x = (1, 2, 3, 4)
    n = len(x)  # Computed once outside the loop
    results = []
    for i in range(10_000):
        results.append(n * i)
    return results
Non-Compliant Code Example
def process_data(data):
    x = (1, 2, 3, 4)
    results = []
    for i in range(10_000):
        n = len(x)  # Computed inside the loop
        results.append(n * i)
    return results
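The redundant work can be counted directly. The sketch below wraps the sequence in a small class that records how often `len()` is called on it; `CountingLen` and the two function names are made up for this example and are not part of any proposed rule implementation.

```python
class CountingLen:
    """Wrap a sequence and count how many times len() is called on it."""

    def __init__(self, values):
        self.values = values
        self.calls = 0

    def __len__(self):
        self.calls += 1
        return len(self.values)

def invariant_inside(x, iterations):
    results = []
    for i in range(iterations):
        n = len(x)  # re-evaluated on every iteration
        results.append(n * i)
    return results

def invariant_hoisted(x, iterations):
    n = len(x)  # evaluated once, before the loop
    results = []
    for i in range(iterations):
        results.append(n * i)
    return results
```

Running both functions for 100 iterations on separate `CountingLen` instances yields identical results, with 100 recorded calls for the first version and 1 for the hoisted one.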
Enjoy the 2025 Hackathon! 🎉
Cléophas