Powerful Data Analysis And Manipulation Using Pandas Library In A Delphi Windows App

Are you looking for powerful tools to analyze and manipulate structured data, with a nice GUI on top? You can build fast, expressive, and scalable data analysis tools by combining pandas with the Python4Delphi library inside Delphi and C++Builder.

pandas is a Python package that provides fast, flexible, and expressive data structures designed to work with structured (tabular, multidimensional, potentially heterogeneous) and time-series data easily and intuitively.

pandas aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open-source data analysis/manipulation tool available in any language. It is already on its way toward this goal.


10+ Amazing pandas Examples inside the Delphi Windows GUI App

This post shows how to run a range of data analysis and manipulation examples with the pandas library and display the results in a Delphi Windows GUI app through Python for Delphi (Python4Delphi).

First, open and run project Demo01 from Python4Delphi in RAD Studio. Then paste a script into the lower Memo, click the Execute script button, and read the result in the upper Memo. You can find the Demo01 source on GitHub. The behind-the-scenes details of how Delphi runs your Python code in this GUI can be found at this link.
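Before pasting the longer examples, a short sanity check can confirm that the Python engine behind Demo01 is able to import the libraries used in this post. This is a minimal sketch, assuming pandas and NumPy are already installed for the Python distribution that Python4Delphi loads:

```python
import sys

import numpy as np
import pandas as pd

# Report which interpreter and library versions the embedded engine is using.
print("Python:", sys.version.split()[0])
print("NumPy:", np.__version__)
print("pandas:", pd.__version__)
```

If the upper Memo shows three version lines, the remaining scripts should be able to import both libraries.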

These examples cover many of the functions and methods you are most likely to use in a typical data analysis process. Let’s run them all in our Python4Delphi Demo01 GUI:

1. Reading the CSV file into a pandas dataframe

import numpy as np
import pandas as pd

# Read the dataset
df = pd.read_csv("/Churn_Modelling.csv")

# See the data
print(df)

2. Check the shape or dimension of the dataset

# Print the data shape
print(df.shape)

3. See the column labels of the DataFrame

# Print the data columns (print is needed for the output to reach the Memo)
print(df.columns)

4. Dropping columns

We want to remove 4 columns: 'RowNumber', 'CustomerId', 'Surname', and 'CreditScore'. The axis parameter is set to 1 to drop columns (0 would drop rows). The inplace parameter is set to True so that df itself is modified instead of a new DataFrame being returned:

# Drop 4 columns
df.drop(['RowNumber', 'CustomerId', 'Surname', 'CreditScore'],
        axis=1,
        inplace=True)

print(df.shape)

We dropped 4 columns, so the number of columns went from 14 down to 10.
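When the original DataFrame should stay untouched, drop can also return a new object. The sketch below assumes a tiny made-up DataFrame (hypothetical values, with a few of the dataset's column names) and uses the columns keyword, which is equivalent to passing the labels with axis=1:

```python
import pandas as pd

# Hypothetical miniature of the dataset: made-up values, real column names.
df_demo = pd.DataFrame({
    "RowNumber": [1, 2, 3],
    "CustomerId": [101, 102, 103],
    "Age": [42, 41, 39],
    "Balance": [0.0, 83807.86, 159660.8],
})

# Without inplace=True, drop() returns a new DataFrame and leaves df_demo as it was.
df_trimmed = df_demo.drop(columns=["RowNumber", "CustomerId"])

print(df_demo.shape)             # (3, 4)
print(df_trimmed.shape)          # (3, 2)
print(list(df_trimmed.columns))  # ['Age', 'Balance']
```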

5. Select particular columns while reading

We want to read only specific columns: 'Gender', 'Age', 'Tenure', and 'Balance':

# Select particular columns
df_spec = pd.read_csv("/Churn_Modelling.csv", usecols=['Gender', 'Age', 'Tenure', 'Balance'])

print(df_spec.head())
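If the CSV file is not at hand, the effect of usecols can be tried on an in-memory file. This is a minimal sketch: the CSV text is made up for illustration (it is not the real dataset), and io.StringIO stands in for the file path:

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for Churn_Modelling.csv (made-up rows).
csv_text = """RowNumber,Surname,Gender,Age,Tenure,Balance
1,Smith,Female,42,2,0.0
2,Jones,Male,41,1,83807.86
3,Brown,Female,39,8,159660.8
"""

# usecols keeps only the named columns; the others are left out of the result.
df_cols = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Gender", "Age", "Tenure", "Balance"],
)

print(df_cols.columns.tolist())
print(df_cols.shape)
```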

6. Reading part of the file (the first n rows)

We want to read the first 5000 rows of the CSV file:

# Reading a part of the dataframe
df_partial = pd.read_csv("/Churn_Modelling.csv", nrows=5000)

print(df_partial.shape)

7. Select rows from the end of the file

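We can also read rows from the end of the file by using the skiprows parameter. An integer such as skiprows=5000 skips the first 5000 lines of the file, and that count includes the header line, so the first remaining data row would be treated as the header. Passing skiprows=range(1, 5001) keeps the header (line 0) and skips the first 5000 data rows:

# Select rows from the end of the file, keeping the header line
df_partialEnd = pd.read_csv("/Churn_Modelling.csv", skiprows=range(1, 5001))

print(df_partialEnd.shape)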
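One detail worth checking on a small file is how nrows and skiprows treat the header: an integer skiprows counts lines from the very top of the file, while a list-like skiprows names the exact line numbers to skip. The sketch below assumes a made-up 6-row CSV (hypothetical data) and splits it into two halves that both keep the column names:

```python
import io

import pandas as pd

# Hypothetical 6-row CSV (made-up values) standing in for the real file.
csv_text = "Id,Age\n" + "\n".join(f"{i},{20 + i}" for i in range(1, 7)) + "\n"

# First half: the header plus the first 3 data rows.
first_half = pd.read_csv(io.StringIO(csv_text), nrows=3)

# Second half: skip file lines 1..3 (data rows) and keep line 0 (the header).
second_half = pd.read_csv(io.StringIO(csv_text), skiprows=range(1, 4))

print(first_half["Id"].tolist())     # [1, 2, 3]
print(second_half["Id"].tolist())    # [4, 5, 6]
print(second_half.columns.tolist())  # ['Id', 'Age']
```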

8. Draw a small sample to work with

We can either use the n parameter or frac parameter to determine the sample size.

n: The number of rows in the sample

# The number of rows in the sample
df_sample = df.sample(n=1000)

print(df_sample.shape)

frac: The ratio of the sample size to the whole dataframe size

# The ratio of the sample size to the whole dataframe size
df_sample2 = df.sample(frac=0.2)

print(df_sample2.shape)
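Each call to sample draws different rows unless a seed is fixed. A minimal sketch, assuming a made-up 100-row DataFrame, shows how the random_state parameter makes the draw repeatable, which helps when results need to be compared between runs:

```python
import pandas as pd

# Hypothetical stand-in DataFrame with 100 made-up rows.
df_demo = pd.DataFrame({"Age": range(100)})

# Same seed, same sample: random_state makes the draw repeatable.
sample_a = df_demo.sample(n=10, random_state=42)
sample_b = df_demo.sample(n=10, random_state=42)

# frac=0.2 draws 20% of the rows (20 of 100 here).
sample_frac = df_demo.sample(frac=0.2, random_state=42)

print(sample_a.index.equals(sample_b.index))  # True
print(sample_frac.shape)                      # (20, 1)
```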

9. Checking the missing values

Using isna together with sum, we can see the number of missing values in each column:

# Check the missing values
print(df.isna().sum())
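To see what the counts look like when gaps are present, here is a minimal sketch on a made-up DataFrame (hypothetical values). It also shows that taking the mean of the same mask gives the share of missing values per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one gap in Geography and two in Balance.
df_demo = pd.DataFrame({
    "Geography": ["France", None, "Spain", "France"],
    "Balance": [0.0, np.nan, np.nan, 125510.82],
})

# isna() gives a True/False mask; sum() counts the True values per column.
missing_counts = df_demo.isna().sum()
print(missing_counts)

# mean() on the same mask gives the fraction of missing values per column.
missing_share = df_demo.isna().mean()
print(missing_share)
```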

10. Adding missing values using loc and iloc

loc and iloc select rows and columns by label or by integer position.

loc: selects by label
iloc: selects by integer position

# Adding missing values using loc and iloc

## Create 20 random indices to select
missing_index = np.random.randint(10000, size=20)

## Use these indices to set some values to np.nan (missing value)
df.loc[missing_index, ['Balance','Geography']] = np.nan

## Another example using integer positions instead of labels (select the last column)
df.iloc[missing_index, -1] = np.nan

print(df.isna().sum())
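With the default index (0, 1, 2, ...), row labels and row positions happen to coincide, which hides the difference between the two accessors. The sketch below assumes a made-up DataFrame whose index labels are 10, 20, and 30, so the distinction becomes visible:

```python
import pandas as pd

# Hypothetical frame whose index labels (10, 20, 30) differ from positions (0, 1, 2).
df_demo = pd.DataFrame(
    {"Age": [42, 41, 39], "Balance": [0.0, 83807.86, 159660.8]},
    index=[10, 20, 30],
)

# loc uses labels: the row labelled 20 and the column named "Age".
by_label = df_demo.loc[20, "Age"]

# iloc uses integer positions: second row (1), first column (0).
by_position = df_demo.iloc[1, 0]

print(by_label, by_position)  # 41 41

# Position -1 means the last column, here "Balance".
print(df_demo.iloc[0, -1])  # 0.0
```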

11. Fill the missing values

Fill NA using the most common value (mode)

# Filling missing values

## See the "Geography" column
print(df["Geography"].value_counts())

## Fill NA using the most common value (mode)
mode = df['Geography'].value_counts().index[0]

## Assign the result back; inplace=True on a selected column may not update df
df['Geography'] = df['Geography'].fillna(value=mode)

Fill NA using the mean value

## Fill NA using the mean value
avg = df['Balance'].mean()

## Assign the result back; inplace=True on a selected column may not update df
df['Balance'] = df['Balance'].fillna(value=avg)

print(df.isna().sum())
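The two fill strategies can be checked on a small example where the expected values are easy to work out by hand. This is a minimal sketch on made-up data (hypothetical values); it assigns the result of fillna back to the column, an alternative to inplace=True that avoids the chained-assignment warnings recent pandas releases may emit:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one gap in each column.
df_demo = pd.DataFrame({
    "Geography": ["France", "France", None, "Spain"],
    "Balance": [100.0, np.nan, 300.0, 200.0],
})

# Most frequent category; mode() returns a Series, so take its first entry.
most_common = df_demo["Geography"].mode()[0]
df_demo["Geography"] = df_demo["Geography"].fillna(most_common)

# mean() skips NaN by default: (100 + 300 + 200) / 3 = 200.
avg = df_demo["Balance"].mean()
df_demo["Balance"] = df_demo["Balance"].fillna(avg)

print(df_demo)
print(df_demo.isna().sum().sum())  # 0
```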

Congratulations, you have now learned how to run a range of data analysis and manipulation examples with the pandas library and display the results in a Delphi Windows GUI app using Python for Delphi!

Check out the pandas library for Python and use it in your projects: https://pypi.org/project/pandas/

Check out Python4Delphi which easily allows you to build Python GUIs for Windows using Delphi: https://github.com/pyscripter/python4delphi

