
How To Add Python Profiling Tools Into Machine Learning Code


Reducing code runtime is important for developers. Python profilers such as cProfile help us find which parts of a program take the most time to run. Whether you are using a Python GUI or the command line, profiling is a huge help in tracking down the bottlenecks that hurt performance.

This article walks you through using the cProfile module to extract profiling data and the snakeviz module to visualize it, and applies those steps to machine learning scripts.

What is code profiling?

Code profiling is a technique for figuring out how time is spent in a program. More precisely, a profile is a set of statistics that describes how often and for how long various parts of the program are executed.

With these statistics, we can find the “hot spots” of a program and think about ways to improve them. Sometimes a hot spot in an unexpected location also hints at a bug in your program.

A program generally runs slowly for one of two reasons: a part of it is slow, or a part of it runs so many times that the cost adds up. We call these “performance hogs” hot spots.

How do I get the cProfile library?

cProfile is part of the Python standard library, so no further installation is needed.
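As a quick check that the module is available, the short sketch below profiles a single built-in call and prints the collected statistics. The workload (summing a range) is only an illustration and is not taken from the article's scripts.

```python
import cProfile
import pstats

profiler = cProfile.Profile()

# runcall profiles one function call and returns that call's result.
total = profiler.runcall(sum, range(1000))

# Print the collected statistics, ordered by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats()
```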

How do I get the snakeviz library, to visualize the profiling results?

Here is how you can get a stable release of snakeviz using pip:

pip install snakeviz

How do I implement Python profiling tools into my machine learning code?

How can I use a profiler inside Python code?

The advantage of this method is that we can profile only part of the program instead of all of it. For example, loading a large module takes time to bootstrap, and we may want to keep that cost out of the profile. In this case, we can invoke the profiler only around certain lines.
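A minimal sketch of this pattern, assuming a hypothetical slow_setup function that stands in for the expensive bootstrap step: the profiler is enabled only around the call we care about, so the setup never appears in the report.

```python
import cProfile
import io
import pstats
import time


def slow_setup():
    """Hypothetical stand-in for importing or bootstrapping a large module."""
    time.sleep(0.05)
    return list(range(50_000))


def compute(values):
    """The part we actually want to measure."""
    return sum(v * v for v in values)


data = slow_setup()            # runs before the profiler starts, so it is not recorded

profiler = cProfile.Profile()
profiler.enable()              # start collecting statistics here
result = compute(data)
profiler.disable()             # stop collecting statistics here

# Write the report to a string so it can be printed or inspected.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats()
report = buffer.getvalue()
print(report)
```

Only compute and the calls made inside it show up in the report; the setup cost is excluded because it ran before enable() was called.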

As a first example, consider profiling an ordinary least squares (OLS) linear regression program, covering only the steps from the regression through the plotting.

Here is the output in the PyScripter IDE:

[Screenshot: cProfile output for the OLS regression script in PyScripter]
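The same idea can be sketched with the standard library alone. The code below is an illustrative stand-in rather than the script behind the screenshot: it assumes synthetic data around y = 2x + 1, fits a simple linear regression with the closed-form OLS formulas, leaves the plotting step out, and profiles only the fit.

```python
import cProfile
import pstats
import random


def make_data(n, rng):
    """Synthetic points around y = 2x + 1 (illustrative data only)."""
    xs = [i / 10 for i in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs, ys = make_data(1000, random.Random(0))   # data preparation is not profiled

profiler = cProfile.Profile()
profiler.enable()
slope, intercept = fit_ols(xs, ys)           # only the regression is profiled
profiler.disable()

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```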

For the second example, let’s consider a program that uses a hill-climbing algorithm to find hyperparameters for a Perceptron model. Here we want to profile only the hill-climbing search itself.

Here is the output in the PyScripter IDE:

[Screenshot: cProfile output for the hill-climbing script in PyScripter]
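A rough sketch of the same setup using only the standard library. The objective function here is a hypothetical stand-in for a cross-validated Perceptron score (it is a simple formula, not a trained model), and the names eta and alpha are assumed hyperparameters; the point is only to show the profiler wrapped around the search loop.

```python
import cProfile
import pstats
import random


def objective(params):
    """Hypothetical stand-in for a cross-validated Perceptron score."""
    eta, alpha = params
    return 1.0 - (eta - 0.3) ** 2 - (alpha - 0.01) ** 2


def step(params, step_size, rng):
    """Propose a nearby candidate, keeping every value positive."""
    return [max(1e-6, p + rng.gauss(0, 1) * step_size) for p in params]


def hillclimb(start, n_iter, step_size, rng):
    """Keep a candidate whenever it scores at least as well as the current best."""
    best, best_score = start, objective(start)
    for _ in range(n_iter):
        candidate = step(best, step_size, rng)
        score = objective(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


rng = random.Random(1)
start = [rng.random(), rng.random()]
start_score = objective(start)               # evaluated outside the profiler

profiler = cProfile.Profile()
profiler.enable()
best, best_score = hillclimb(start, 200, 0.1, rng)   # only the search is profiled
profiler.disable()

print(f"start score={start_score:.4f}, best score={best_score:.4f}")
pstats.Stats(profiler).sort_stats("cumulative").print_stats(8)
```

In a real run, objective would train and score the model, so it would dominate the profile far more than it does in this toy version.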

How can I use a Python code profiler at runtime from a command prompt?

Another way to profile a machine learning script is to run cProfile on it from the command line. The advantage of this method is that you can profile the whole script with a single command, and you can export the profiling results to a file for further analysis.

To profile the machine learning code at the command prompt, first remove the profiler parts and save the code as ols.py. Then run the profiler from the command line as follows:

python -m cProfile ols.py

The following is an excerpt of the profiling results:

[Screenshot: command-line cProfile output for ols.py]

Do the same for the Hillclimb algorithm script: remove the profiler parts and save the code as hillclimb.py. Then run the profiler from the command line as follows:

python -m cProfile hillclimb.py

The following is an excerpt of the profiling results:

[Screenshot: command-line cProfile output for hillclimb.py]

This output provides rich and detailed profiling data.

How to sort Python profiling results by call count?

The profiling output shown in the previous sections is very long, which makes it hard to tell which function is the hot spot. To find the parts that run too many times, we can sort the output by call count (ncalls) using the following command:

python -m cProfile -s ncalls ols.py

The following is an excerpt of the profiling results for ols.py, sorted from the most called function:

[Screenshot: cProfile output for ols.py ordered by call count]

Run the following command to sort the profiling results of the Hillclimb algorithm by call count:

python -m cProfile -s ncalls hillclimb.py

The following is an excerpt of the profiling results for hillclimb.py, sorted from the most called function:

[Screenshot: cProfile output for hillclimb.py ordered by call count]
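The same ordering is available from inside Python through pstats. The sketch below uses a made-up workload in which a cheap function, tiny, is called 10,000 times, so it rises to the top when the statistics are sorted by call count.

```python
import cProfile
import io
import pstats


def tiny(x):
    """Cheap on its own, but called many times."""
    return x + 1


def workload():
    total = 0
    for _ in range(10_000):
        total = tiny(total)
    return total


profiler = cProfile.Profile()
result = profiler.runcall(workload)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("ncalls").print_stats(3)    # same sort key as -s ncalls
report = buffer.getvalue()
print(report)
```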

How to sort Python profiling results by total time spent?

To find the parts that run slowly, we can also sort the cProfile output by the total time spent in each function (tottime) using the following command:

python -m cProfile -s tottime ols.py

Here is the output at the command prompt:

[Screenshot: cProfile output for ols.py ordered by total time]

Run the following command to sort the profiling results of the Hillclimb algorithm by the total time spent in each function:

python -m cProfile -s tottime hillclimb.py

Here is the output at the command prompt:

[Screenshot: cProfile output for hillclimb.py ordered by total time]
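Sorting by tottime can also be done programmatically. In the sketch below (an invented workload, not one of the article's scripts), one function sleeps for a tenth of a second while another does a little arithmetic many times. Because tottime excludes time spent in sub-calls, the sleep call itself, rather than the function wrapping it, should come out on top.

```python
import cProfile
import io
import pstats
import time
from pstats import SortKey


def slow_once():
    """One call that is slow by itself."""
    time.sleep(0.1)


def cheap_many():
    """Many cheap operations that add up to very little."""
    return sum(abs(i) for i in range(1000))


def workload():
    slow_once()
    return cheap_many()


profiler = cProfile.Profile()
result = profiler.runcall(workload)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats(SortKey.TIME).print_stats(5)   # same ordering as -s tottime
report = buffer.getvalue()
print(report)
```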

How to save Machine Learning code profiling results for further analysis?

Instead of only printing the profiling results on the command line, we can make them available for further analysis by exporting them to a file.

Here is how you can do it for the OLS script:

python -m cProfile -o statsOls.dump ols.py

And use the following command to save the profiling results of the Hillclimb algorithm:

python -m cProfile -o statsHillclimb.dump hillclimb.py

These commands export the profiling results to the files statsOls.dump and statsHillclimb.dump.
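When the profiler is used inside the code instead, a similar file can be written with dump_stats and read back later with pstats. This is a minimal sketch; the workload and the temporary file name are illustrative assumptions. The saved file is binary, so it is meant to be loaded by pstats or snakeviz rather than opened in a text editor.

```python
import cProfile
import os
import pstats
import tempfile


def workload():
    return sorted(str(i) for i in range(20_000))


profiler = cProfile.Profile()
profiler.runcall(workload)

# Save the raw statistics; the file name is a temporary, illustrative one.
dump_path = os.path.join(tempfile.mkdtemp(), "stats_example.dump")
profiler.dump_stats(dump_path)

# Later (even in another session) load the file and analyse it.
loaded = pstats.Stats(dump_path)
loaded.strip_dirs().sort_stats("tottime").print_stats(5)
```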

How to visualize Python profiling results using snakeviz?

To visualize your Python code profiling results, pass the .dump file to snakeviz with this command:

snakeviz statsOls.dump

This starts a snakeviz web server and opens the visualization in your default browser. By default, the server listens on 127.0.0.1:8080.

On the snakeviz page, you can set the Style, Depth, and Cutoff of the visualization.

Profiling results for the ordinary least squares (OLS) linear regression program in Icicle style:

[Screenshot: snakeviz Icicle view for the OLS script]

Profiling results for the OLS linear regression program in Sunburst style:

[Screenshot: snakeviz Sunburst view for the OLS script]

Excerpt of all the profiling results for the OLS linear regression program in tabular format:

[Screenshot: snakeviz statistics table for the OLS script]

Do the same for the Hillclimb algorithm script using this command:

snakeviz statsHillclimb.dump

Profiling results for the Hillclimb algorithm script in Icicle style:

[Screenshot: snakeviz Icicle view for the Hillclimb script]

Profiling results for the Hillclimb algorithm script in Sunburst style:

[Screenshot: snakeviz Sunburst view for the Hillclimb script]

Excerpt of all the profiling results for the Hillclimb algorithm script in tabular format:

[Screenshot: snakeviz statistics table for the Hillclimb script]

The columns in these outputs have the following meanings:

ncalls: The number of calls.
tottime: The total time spent in the given function (excluding time spent in calls to sub-functions).
percall: tottime divided by ncalls.
cumtime: The cumulative time spent in this function and all sub-functions (from invocation until exit). This figure is accurate even for recursive functions.
percall: cumtime divided by the number of primitive calls.
filename:lineno(function): The file name, line number, and name of each function.
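These columns can also be read as attributes in code. The sketch below assumes Python 3.9 or later, where pstats offers get_stats_profile, and profiles a deliberately recursive function (an illustrative fib, not part of the article's scripts) so that the total and primitive call counts differ.

```python
import cProfile
import pstats


def fib(n):
    """Deliberately recursive so primitive and total call counts differ."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


profiler = cProfile.Profile()
value = profiler.runcall(fib, 15)

# get_stats_profile() (Python 3.9+) exposes the table columns as attributes.
profile = pstats.Stats(profiler).get_stats_profile()
row = profile.func_profiles["fib"]

print("ncalls :", row.ncalls)            # total/primitive for recursive functions
print("tottime:", row.tottime)
print("percall:", row.percall_tottime)   # tottime divided by ncalls
print("cumtime:", row.cumtime)
print("percall:", row.percall_cumtime)   # cumtime divided by primitive calls
print("where  :", f"{row.file_name}:{row.line_number}(fib)")
```

For a recursive function, ncalls is reported as two numbers: the total number of calls, followed by the number of primitive (non-recursive) calls.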

Now you can find the bottlenecks in your machine learning programs using cProfile and visualize them with snakeviz. From now on, you can add code profiling as an optional but powerful step in your machine learning workflow.

Finally, note that cProfile reports only timing statistics, not memory usage. For memory profiling, you will need to look for another library or tool.




