Reducing code runtime is important for developers. Python profilers, such as cProfile, help us find which parts of a program take the most time to run. Whether you are using a Python GUI or the command line, profiling is a huge help in tracking down the code bottlenecks that impact performance.
This article will walk you through the process of using the cProfile module for extracting profiling data and the snakeviz module for visualizing it, and then applying those steps to test machine learning scripts.
What is code profiling?
Code profiling is a technique for figuring out how time is spent in a program. More precisely, a profile is a set of statistics that describes how often and for how long various parts of the program are executed.
With these statistics, we can find the “hot spot” of a program and think about ways of improvement. Sometimes, a hot spot in an unexpected location may give you hints about bugs in your program.
A program can run slowly for two general reasons: a part of it is slow, or a part of it runs so many times that the calls add up and take too much time. We call these performance hogs the hot spots.
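To make this concrete before we get to the machine learning examples, here is a minimal, self-contained sketch (the functions slow_once and fast_but_frequent are invented for illustration) showing how cProfile surfaces both kinds of hot spot:

import cProfile

def slow_once():
    # One expensive call: a single large computation
    return sum(i * i for i in range(2_000_000))

def fast_but_frequent():
    # A cheap call executed many times: the calls add up
    return sum(range(100))

def main():
    slow_once()
    for _ in range(10_000):
        fast_but_frequent()

# Sort by tottime; the ncalls column exposes the frequent caller
cProfile.run("main()", sort="tottime")

In the resulting table, slow_once stands out by tottime, while fast_but_frequent stands out by ncalls.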
How do I get the cProfile library?
As cProfile is a built-in Python library, no further installation is needed.
How do I get the snakeviz library to visualize the profiling results?
Here is how you can get a stable release of snakeviz using pip:
pip install snakeviz
How do I implement Python profiling tools into my machine learning code?
How can I use a profiler inside Python code?
The advantage of this method is that we can profile only a part of the code instead of the entire program. For example, if we load a large module that takes time to bootstrap, we may want to exclude it from the profile. In this case, we can invoke the profiler only for certain lines.
The following is an example of profiling an ordinary least squares (OLS) linear regression program, covering only the steps from the regression through the plotting:
# Import profiling tools
import cProfile as profile
import pstats

# Code source for Ordinary Linear Regression: Jaques Grobler
# License: BSD 3 clause
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Use only one feature
diabetes_X = diabetes_X[:, np.newaxis, 2]

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes_y[:-20]
diabetes_y_test = diabetes_y[-20:]

# Perform all the regression steps with profiling
prof = profile.Profile()
prof.enable()

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)

# The coefficients
print("Coefficients: \n", regr.coef_)

# The mean squared error
print("Mean squared error: %.2f" % mean_squared_error(diabetes_y_test, diabetes_y_pred))

# The coefficient of determination: 1 is perfect prediction
print("Coefficient of determination: %.2f" % r2_score(diabetes_y_test, diabetes_y_pred))

# Plot outputs
plt.scatter(diabetes_X_test, diabetes_y_test, color="black")
plt.plot(diabetes_X_test, diabetes_y_pred, color="blue", linewidth=3)
plt.xticks(())
plt.yticks(())

prof.disable()

# Print profiling output
stats = pstats.Stats(prof).strip_dirs().sort_stats("cumtime")
stats.print_stats(10)  # Print only top 10 rows

# Show plot
plt.show()
Here is the output in the PyScripter IDE:
For the second example, let's consider a program that uses a hill climbing algorithm to find hyperparameters for a Perceptron model. We want to profile only the hill climbing search part:
# Import profiling tools
import cProfile as profile
import pstats

# Manually search perceptron hyperparameters for binary classification
from numpy import mean
from numpy.random import randn
from numpy.random import rand
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import Perceptron

# Objective function
def objective(X, y, cfg):
    # Unpack config
    eta, alpha = cfg
    # Define model
    model = Perceptron(penalty='elasticnet', alpha=alpha, eta0=eta)
    # Define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # Evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # Calculate mean accuracy
    result = mean(scores)
    return result

# Take a step in the search space
def step(cfg, step_size):
    # Unpack the configuration
    eta, alpha = cfg
    # Step eta
    new_eta = eta + randn() * step_size
    # Check the bounds of eta
    if new_eta <= 0.0:
        new_eta = 1e-8
    if new_eta > 1.0:
        new_eta = 1.0
    # Step alpha
    new_alpha = alpha + randn() * step_size
    # Check the bounds of alpha
    if new_alpha < 0.0:
        new_alpha = 0.0
    # Return the new configuration
    return [new_eta, new_alpha]

# Hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter, step_size):
    # Starting point for the search
    solution = [rand(), rand()]
    # Evaluate the initial point
    solution_eval = objective(X, y, solution)
    # Run the hill climb
    for i in range(n_iter):
        # Take a step
        candidate = step(solution, step_size)
        # Evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # Check if we should keep the new point
        if candidate_eval >= solution_eval:
            # Store the new point
            solution, solution_eval = candidate, candidate_eval
            # Report progress
            print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]

# Define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# Define the total iterations
n_iter = 100
# Step size in the search space
step_size = 0.1

# Perform the hill climbing search with profiling
prof = profile.Profile()
prof.enable()
cfg, score = hillclimbing(X, y, objective, n_iter, step_size)
prof.disable()

# Print program output
print('Done!')
print('cfg=%s: Mean Accuracy: %f' % (cfg, score))

# Print profiling output
stats = pstats.Stats(prof).strip_dirs().sort_stats("cumtime")
stats.print_stats(10)  # Print only top 10 rows
Here is the output in the PyScripter IDE:
How can I use a Python code profiler at runtime from a command prompt?
Another way to profile a machine learning script is by running cProfile at runtime. The advantage of this method is that you can profile the whole program with a single command, and you can export the profiling results to a file for further analysis.
Here is how to profile machine learning code at the command prompt. First, remove the profiler parts and save the code as ols.py. Next, run the profiler in the command line as follows:
python -m cProfile ols.py
The following is an excerpt of the profiling results:
Apply the same treatment to the hill climbing script: remove the profiler parts and save the code as hillclimb.py. Next, run the profiler in the command line as follows:
python -m cProfile hillclimb.py
The following is an excerpt of the profiling results:
It provides very rich and detailed code profiling data.
How to sort Python profiling results by call count?
The profiling output presented in the previous sections is very long, and it can be difficult to tell which function is the hot spot. We can sort the output by call count (ncalls) to find the parts that run too many times, using the following command:
python -m cProfile -s ncalls ols.py
The following is an excerpt of the profiling results for ols.py, sorted from the most-called function:
Run the following command to sort the profiling results of the hill climbing script by call count:
python -m cProfile -s ncalls hillclimb.py
The following is an excerpt of the profiling results for hillclimb.py, sorted from the most-called function:
How to sort Python profiling results by total time spent?
We can also sort the cProfile output by the total time spent in each function (tottime) to find the parts that run slowly, using the following command:
python -m cProfile -s tottime ols.py
And here is the output on the command prompt:
Run the following command to sort the profiling results of the hill climbing script by total time spent:
python -m cProfile -s tottime hillclimb.py
And here is the output on the command prompt:
How to save machine learning code profiling results for further analysis?
Instead of only printing the profiling results on the command line, we can make them more useful by exporting them to a file for further analysis.
Here is how you can do it:
python -m cProfile -o statsOls.dump ols.py
And use the following command to save the profiling results of the hill climbing script:
python -m cProfile -o statsHillclimb.dump hillclimb.py
The above commands export the profiling results into the statsOls.dump and statsHillclimb.dump files.
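The dump files can also be analyzed without snakeviz by loading them back with the standard pstats module. Here is a minimal sketch, assuming statsOls.dump sits in the current directory; the "sklearn" pattern is just an illustrative restriction:

import pstats

# Load the saved profiling results from disk
stats = pstats.Stats("statsOls.dump")

# Clean up file paths, sort by cumulative time, and print the top 10 rows
stats.strip_dirs().sort_stats("cumtime").print_stats(10)

# Print only entries whose name matches a pattern, limited to 5 rows
stats.print_stats("sklearn", 5)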
How to visualize Python profiling results using snakeviz?
To visualize your Python code profiling results, open the .dump file with snakeviz using this command:
snakeviz statsOls.dump
This starts a snakeviz web server and opens the visualization in your default browser. By default, the snakeviz web server listens on 127.0.0.1:8080.
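If that address is already in use, snakeviz accepts -H/--hostname and -p/--port options, and -s/--server starts the server without opening a browser (handy on remote machines). For example, with an arbitrarily chosen port:

snakeviz -H 127.0.0.1 -p 8900 statsOls.dump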
You can set the Style, Depth, and Cutoff of the visualization.
Visualize the profiling results for the ordinary least squares (OLS) linear regression program in Icicle style:
Visualize the profiling results for the ordinary least squares (OLS) linear regression program in Sunburst style:
An excerpt of the full profiling results for the ordinary least squares (OLS) linear regression program in tabular format:
Do the same for the hill climbing script using this command:
snakeviz statsHillclimb.dump
Visualize the profiling results for the hill climbing script in Icicle style:
Visualize the profiling results for the hill climbing script in Sunburst style:
An excerpt of the full profiling results for the hill climbing script in tabular format:
The following table explains each column:
ncalls | The number of calls.
tottime | The total time spent in the given function, excluding time spent in calls to sub-functions.
percall | The quotient of tottime divided by ncalls.
cumtime | The cumulative time spent in this function and all sub-functions (from invocation until exit). This figure is accurate even for recursive functions.
percall | The quotient of cumtime divided by primitive calls.
filename:lineno(function) | The file name, line number, and name of the respective function.
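These columns double as sort keys. As a small sketch reusing the statsOls.dump file saved earlier, the pstats.SortKey constants map onto the same columns:

import pstats
from pstats import SortKey

stats = pstats.Stats("statsOls.dump").strip_dirs()

# SortKey.CALLS corresponds to ncalls, SortKey.TIME to tottime,
# and SortKey.CUMULATIVE to cumtime
stats.sort_stats(SortKey.CALLS).print_stats(5)
stats.sort_stats(SortKey.TIME).print_stats(5)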
Amazing, isn't it? Now you can easily find the bottlenecks in your machine learning programs using cProfile and visualize them professionally with snakeviz. From now on, you can add code profiling as an optional but powerful step in your machine learning workflow.
Finally, note that Python's profiler gives you statistics on time only, not memory usage. You may need another library or tool for that purpose.
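One option that ships with the standard library is tracemalloc. Here is a minimal sketch of how it reports allocation hot spots; the list comprehension is a stand-in for whatever code you want to measure, and the top-10 cutoff is arbitrary:

import tracemalloc

tracemalloc.start()

# Stand-in workload: replace with the code you want to measure
data = [list(range(1000)) for _ in range(1000)]

# Take a snapshot and show the top 10 allocation sites by size
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)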
Click here to start using PyScripter, a free, feature-rich, and lightweight IDE for Python developers.
Download RAD Studio to build more powerful Python GUI Windows Apps 5x Faster with Less Code.
Check out Python4Delphi which easily allows you to build Python GUIs for Windows using Delphi.
Also, check out DelphiVCL which easily allows you to build GUIs for Windows using Python.
References & further readings
[1] Nguyen, D. (2021). How to profile code in Python. AnyMind Group. anymindgroup.com/news/tech-blog/15280
[2] Shrivarsheni. (2020). cProfile – How to profile your python code. Machine Learning Plus. machinelearningplus.com/python/cprofile-how-to-profile-your-python-code
[3] Stack Overflow. (2011). cProfile saving data to file causes jumbles of characters. stackoverflow.com/questions/8283112/cprofile-saving-data-to-file-causes-jumbles-of-characters
[4] Tam, A. (2022). Profiling Python Code. Machine Learning Mastery. machinelearningmastery.com/profiling-python-code