Hands-On Bioinformatics With These 6 Powerful Python Libs

Are you looking for Python development tools that can be used in bioinformatics and to create a graphical user interface (GUI)?

You can build scalable Bioinformatics systems easily by combining these 6 powerful Python libraries with Python4Delphi for the GUI. Python4Delphi (P4D) is a free tool that allows you to execute Python scripts and create new Python modules and types in Delphi.

What is Bioinformatics?

According to the National Human Genome Research Institute, Bioinformatics is a subdiscipline of biology and computer science concerned with the acquisition, storage, analysis, and dissemination of biological data, most often DNA and amino acid sequences. Bioinformatics uses computer programs for a variety of applications, including determining gene and protein functions, establishing evolutionary relationships, and predicting the three-dimensional shapes of proteins.

Why use Python for Bioinformatics?

According to Bitesize Bio, Python is particularly well suited to researchers because several biology programmers have already contributed many libraries to make Python science-friendly. Python documentation also has a section dedicated to its scientific audience. Here are some more reasons why Python could be your best choice of programming language for biology research:

Widely used in the scientific community.
Well-built libraries for complex scientific problems.
Compatible with other existing tools.
Easy manipulation of sequences like DNA, RNA, amino acids.
Easy data manipulation and visualization.

Read these articles to see how powerful Python is for scientific work:

Powerful Advanced Scientific Computing – it’s easy!

Perform Ultra-Fast Time Series Analysis To Empower Your Apps

Data Visualization: 5 Ways To Be An Enterprise-Grade Master!

Machine Learning: 5 Ways To Use ML in your Windows Apps

Delphi adds Powerful GUI Features and Functionalities to Python

In this tutorial, we’ll build Windows Apps with extensive Bioinformatics capabilities by integrating Python’s Bioinformatics libraries with Embarcadero’s Delphi, using Python4Delphi (P4D).

P4D empowers Python users with Delphi’s award-winning VCL functionalities for Windows which enables us to build native Windows apps 5x faster. This integration enables us to create a modern GUI with Windows 10 looks and responsive controls for our Python for Bioinformatics applications. Python4Delphi also comes with an extensive range of demos, use cases, and tutorials.

We’re going to cover the following…

How to use Biopython, DEAP, Nilearn, PsychoPy, scikit-bio, and scikit-image Python libraries for Bioinformatics

All of them will be integrated with Python4Delphi to create Windows apps with Bioinformatics capabilities.

Prerequisites

Before we begin to work, download and install the latest Python for your platform. Follow the Python4Delphi installation instructions mentioned here. Alternatively, you can check out the easy instructions found in the Getting Started With Python4Delphi video by Jim McKeeth.

A practical demo app

First, open and run our Python GUI using project Demo01 from Python4Delphi with RAD Studio. Then insert the script into the lower Memo, click the Execute button, and get the result in the upper Memo. You can find the Demo01 source on GitHub. The behind-the-scenes details of how Delphi manages to run your Python code in this Python GUI can be found at this link.
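
To get a feel for what the demo does with its two Memos, here is a minimal pure-Python sketch of the same idea: run a script held in a string and capture whatever it prints. This is only an analogy, not how Python4Delphi is implemented (P4D embeds the interpreter inside the Delphi process), and the helper name run_script is hypothetical.

```python
import contextlib
import io


def run_script(source: str) -> str:
    """Execute Python source text and return everything it printed."""
    buffer = io.StringIO()
    namespace = {"__name__": "__main__"}  # fresh globals for each run
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


# "Lower Memo": the script text. "Upper Memo": the captured output.
script = "for i in range(3):\n    print(i * i)"
output = run_script(script)
print(output)
```

Every script in this tutorial follows the same pattern: paste it in, execute it, and read the printed output.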

1. How do you perform Bioinformatics tasks with Biopython?

The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology.

What can I find in the Biopython package?

The main Biopython releases have lots of functionality, including:

The ability to parse bioinformatics files into Python utilizable data structures, including support for the following formats:
- Blast output – both from standalone and WWW Blast
- Clustalw
- FASTA
- GenBank
- PubMed and Medline
- ExPASy files, like Enzyme and Prosite
- SCOP, including ‘dom’ and ‘lin’ files
- UniGene
- SwissProt
Files in the supported formats can be iterated over record by record or indexed and accessed via a Dictionary interface.
Code to deal with popular online bioinformatics destinations such as:
- NCBI – Blast, Entrez, and PubMed services
- ExPASy – Swiss-Prot and Prosite entries, as well as Prosite searches
Interfaces to common bioinformatics programs such as:
- Standalone Blast from NCBI
- Clustalw alignment program
- EMBOSS command-line tools
A standard sequence class that deals with sequences, ids on sequences, and sequence features.
Tools for performing common operations on sequences, such as translation, transcription, and weight calculations.
Code to perform classification of data using k Nearest Neighbors, Naive Bayes, or Support Vector Machines.
Code for dealing with alignments, including a standard way to create and deal with substitution matrices.
Code making it easy to split up parallelizable tasks into separate processes.
GUI-based programs to do basic sequence manipulations, translations, BLASTing, etc.
Extensive documentation and help with using the modules, including this file, online wiki documentation, the website, and the mailing list.
Integration with BioSQL, a sequence database schema also supported by the BioPerl and BioJava projects.
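
To make "iterated over record by record or indexed and accessed via a Dictionary interface" concrete, here is a rough standard-library sketch of what such a parser does for FASTA text. It is not Biopython's implementation; the function name iter_fasta and the two records are made up for illustration.

```python
import io


def iter_fasta(handle):
    """Yield (record_id, sequence) pairs from FASTA-formatted text."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            header = line[1:].split()
            record_id, chunks = (header[0] if header else ""), []
        elif record_id is not None:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


# Made-up records, for illustration only.
fasta_text = """>seq1 first example
ACGTAC
GTTA
>seq2 second example
GGCC
"""

# Record-by-record iteration.
for record_id, sequence in iter_fasta(io.StringIO(fasta_text)):
    print(record_id, len(sequence))

# Dictionary-style access by record id.
records = dict(iter_fasta(io.StringIO(fasta_text)))
print(records["seq2"])
```

In real projects, prefer Biopython's own parsers, which also handle the many edge cases and other formats listed above.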

After installing Python4Delphi properly, you can get Biopython by running pip in your command prompt:

pip install biopython

Don’t forget to add the path where Biopython is installed to your system environment variables:

System Environment Variable Examples

C:/Users/YOUR_USERNAME/AppData/Local/Programs/Python/Python38/Lib/site-packages

C:/Users/YOUR_USERNAME/AppData/Local/Programs/Python/Python38/Scripts

C:/Users/YOUR_USERNAME/AppData/Local/Programs/Python/Python38
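
If you are unsure where these folders live on your machine, Python can report them. The following is a small sketch using only the standard library; note that packages installed with pip's --user option may land in a different, per-user directory.

```python
import sys
import sysconfig

# Installation paths of the interpreter that runs this script.
paths = sysconfig.get_paths()

print("Interpreter:  ", sys.executable)
print("site-packages:", paths["purelib"])
print("Scripts:      ", paths["scripts"])
```

Run it with the same Python installation that Python4Delphi is configured to use, since each installation has its own folders.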

The following Biopython example works with sequences and parses a FASTA-formatted text file (run it inside the lower Memo of the Python4Delphi Demo01 GUI, and adjust the file path to where your copy of ls_orchid.fasta is stored):

from Bio.Seq import Seq
from Bio import SeqIO

# Create a sequence and print it with its complement and reverse complement
my_seq = Seq("AGTACACTGGT")
print(my_seq)
print(my_seq.complement())
print(my_seq.reverse_complement())

# Parse a FASTA file record by record
for seq_record in SeqIO.parse("C:/Users/ASUS/Bio/examples/ls_orchid.fasta", "fasta"):
    print(seq_record.id)
    print(repr(seq_record.seq))
    print(len(seq_record))

Here is the Biopython result in the Python GUI:

Figure: **Biopython** demo with Python4Delphi in Windows.
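
To check the first lines of that output by hand, here is a plain-Python sketch of what complement and reverse complement compute for a DNA string. It assumes uppercase, unambiguous bases (A, C, G, T) only, whereas Biopython's Seq also handles ambiguity codes; the function names are chosen for illustration.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(seq)[::-1]


print(complement("AGTACACTGGT"))          # TCATGTGACCA
print(reverse_complement("AGTACACTGGT"))  # ACCAGTGTACT
```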

2. How do you perform Bioinformatics tasks with DEAP?

DEAP is a novel evolutionary computation framework for the rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelization mechanisms such as multiprocessing and SCOOP.

DEAP includes the following features:

Genetic algorithm using any imaginable representation
- List, Array, Set, Dictionary, Tree, Numpy Array, etc.
Genetic programming using prefix trees
- Loosely typed, Strongly typed
- Automatically defined functions
Evolution strategies (including CMA-ES)
Multi-objective optimization (NSGA-II, NSGA-III, SPEA2, MO-CMA-ES)
Coevolution (cooperative and competitive) of multiple populations
Parallelization of the evaluations (and more)
Hall of Fame of the best individuals that lived in the population
Checkpoints that take snapshots of a system regularly
Benchmarks module containing most common test functions
Genealogy of an evolution (that is compatible with NetworkX)
Examples of alternative algorithms: Particle Swarm Optimization, Differential Evolution, Estimation of Distribution Algorithm

How do I get the DEAP Python library?

First, here is how you can get DEAP:

pip install deap

The following code implements the One Max problem with DEAP. The code is credited to these authors: Félix-Antoine Fortin, EunSeop Shin, and François-Michel De Rainville:

import random

import numpy

from deap import algorithms
from deap import base
from deap import creator
from deap import tools

# Maximize a single objective; individuals are numpy arrays that carry a fitness
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", numpy.ndarray, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bool, n=100)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)


def evalOneMax(individual):
    # DEAP expects fitness values as a tuple, hence the trailing comma
    return sum(individual),


def cxTwoPointCopy(ind1, ind2):
    """Execute a two points crossover with copy on the input individuals. The
    copy is required because the slicing in numpy returns a view of the data,
    which leads to a self overwriting in the swap operation. It prevents
    the following situation:

        >>> import numpy
        >>> a = numpy.array((1,2,3,4))
        >>> b = numpy.array((5,6,7,8))
        >>> a[1:3], b[1:3] = b[1:3], a[1:3]
        >>> print(a)
        [1 6 7 4]
        >>> print(b)
        [5 6 7 8]
    """
    size = len(ind1)
    cxpoint1 = random.randint(1, size)
    cxpoint2 = random.randint(1, size - 1)
    if cxpoint2 >= cxpoint1:
        cxpoint2 += 1
    else:  # Swap the two cx points
        cxpoint1, cxpoint2 = cxpoint2, cxpoint1

    # Copy both slices before assigning so that neither side is overwritten
    ind1[cxpoint1:cxpoint2], ind2[cxpoint1:cxpoint2] = (
        ind2[cxpoint1:cxpoint2].copy(),
        ind1[cxpoint1:cxpoint2].copy(),
    )

    return ind1, ind2


toolbox.register("evaluate", evalOneMax)
toolbox.register("mate", cxTwoPointCopy)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selTournament, tournsize=3)


def main():
    random.seed(64)

    pop = toolbox.population(n=300)

    # Numpy equality function (operators.eq) between two arrays returns the
    # equality element wise, which raises an exception in the if similar()
    # check of the hall of fame. Using a different equality function like
    # numpy.array_equal or numpy.allclose solves this issue.
    hof = tools.HallOfFame(1, similar=numpy.array_equal)

    stats = tools.Statistics(lambda ind: ind.fitness.values)
    stats.register("avg", numpy.mean)
    stats.register("std", numpy.std)
    stats.register("min", numpy.min)
    stats.register("max", numpy.max)

    algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=40, stats=stats, halloffame=hof)

    return pop, stats, hof


if __name__ == "__main__":
    main()

Here is the DEAP example in the Python GUI:

Figure: **DEAP** demo with Python4Delphi in Windows.
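
To see the moving parts of a One Max run without DEAP, here is a simplified standard-library sketch of the same ingredients: tournament selection, two-point crossover, and bit-flip mutation on plain lists. It is not DEAP's eaSimple algorithm, and the function names, population size, and number of generations are arbitrary choices for illustration.

```python
import random


def one_max(bits):
    """Fitness for One Max: the number of ones in the bit list."""
    return sum(bits)


def two_point_crossover(a, b, rng):
    """Swap the segment between two cut points, in place.

    List slices are copies, so no explicit .copy() is needed here,
    unlike with numpy arrays.
    """
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    a[i:j], b[i:j] = b[i:j], a[i:j]
    return a, b


def flip_bits(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def tournament(pop, k, rng):
    """Return the fittest of k randomly chosen individuals."""
    return max(rng.sample(pop, k), key=one_max)


def evolve(n_bits=20, pop_size=30, generations=25, seed=64):
    """Run a small genetic algorithm and return the best individual seen."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=one_max)[:]
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1 = tournament(pop, 3, rng)[:]
            p2 = tournament(pop, 3, rng)[:]
            if rng.random() < 0.5:  # crossover probability
                two_point_crossover(p1, p2, rng)
            children.append(flip_bits(p1, 0.05, rng))
            children.append(flip_bits(p2, 0.05, rng))
        pop = children[:pop_size]
        best = max(pop + [best], key=one_max)[:]
    return best


best = evolve()
print(one_max(best), "ones out of", len(best))
```

The perfect score equals the number of bits, so the printed fitness shows how close the run came to an all-ones individual.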

3. How do you perform Bioinformatics tasks with Nilearn?

Nilearn enables approachable and versatile analyses of brain volumes. It provides statistical and Machine Learning tools, with instructive documentation & a friendly community.

It supports general linear model (GLM) based analysis and leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modeling, classification, decoding, or connectivity analysis.

First, here is how you can get Nilearn:

pip install nilearn

Below is the code for fetching a dataset using Nilearn (run it inside the lower Memo of the Python4Delphi Demo01 GUI):

from nilearn import datasets

haxby_dataset = datasets.fetch_haxby()

# The different files
print(sorted(list(haxby_dataset.keys())))

# Path to first functional file
print(haxby_dataset.func[0])

# Print the data description
print(haxby_dataset.description)

Here are the Nilearn results in the Python GUI:

Figure: **Nilearn** demo with Python4Delphi in Windows.

4. How do you perform Bioinformatics tasks with PsychoPy?

PsychoPy is an open-source package for creating experiments in behavioral science. It aims to provide a single package that is:

precise enough for psychophysics
easy enough for teaching
flexible enough for everything else
able to run experiments in a local Python script or online in JavaScript

To meet these goals PsychoPy provides a choice of interface – you can use a simple graphical user interface called Builder or write your experiments in Python code. The entire application and library are written in Python and are platform-independent.

How to get the PsychoPy library?

First, here is how you can get PsychoPy:

pip install PsychoPy

Run this simple example of PsychoPy code inside the lower Memo of the Python4Delphi Demo01 GUI to generate your first stimulus:

# Import some libraries from PsychoPy
from psychopy import visual, core, event

# Create a window
mywin = visual.Window([800,600], monitor="testMonitor", units="deg")

# Create some stimuli
grating = visual.GratingStim(win=mywin, mask='circle', size=3, pos=[-4,0], sf=3)
fixation = visual.GratingStim(win=mywin, size=0.2, pos=[0,0], sf=0, rgb=-1)

# Draw the stimuli and update the window
while True:  # this creates a never-ending loop
    grating.setPhase(0.05, '+')  # advance phase by 0.05 of a cycle
    grating.draw()
    fixation.draw()
    mywin.flip()

    # Leave the loop as soon as any key is pressed
    if len(event.getKeys()) > 0:
        break
    event.clearEvents()

# Cleanup
mywin.close()
core.quit()

Here is the PsychoPy example in the Python GUI:

Figure: **PsychoPy** demo with Python4Delphi in Windows.
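
The sf and phase values in the script are easier to picture with a little arithmetic. The sketch below is a simplified one-dimensional model of a sinusoidal grating, not PsychoPy's rendering code: it assumes spatial frequency in cycles per degree, phase in cycles, and that phase wraps around after one full cycle because a sinusoid is periodic. The function names are hypothetical.

```python
import math


def grating_profile(positions_deg, sf=3.0, phase=0.0):
    """Contrast in [-1, 1] of a sinusoidal grating at each position (degrees)."""
    return [math.sin(2 * math.pi * (sf * x + phase)) for x in positions_deg]


def advance_phase(phase, step=0.05):
    """Add `step` cycles and wrap into [0, 1), like one frame of drift."""
    return (phase + step) % 1.0


# Sample one degree of visual angle at seven evenly spaced points.
positions = [i / 6 for i in range(7)]
phase = 0.0
for frame in range(3):
    profile = grating_profile(positions, sf=3.0, phase=phase)
    print(frame, [round(value, 2) for value in profile])
    phase = advance_phase(phase)
```

With a step of 0.05 cycles per frame, the pattern returns to its starting position every 20 frames, which is what makes the grating appear to drift.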

5. How do you perform Bioinformatics tasks with scikit-bio?

scikit-bio is an open-source, BSD-licensed Python package providing data structures, algorithms, and educational resources for bioinformatics.

Here is how you can install scikit-bio:

pip install scikit-bio

Run the following code to create a TabularMSA object with three DNA sequences and four positions:

from skbio import DNA, TabularMSA

# Three aligned DNA sequences of four positions each ('-' marks a gap)
seqs = [
    DNA('ACGT'),
    DNA('AG-T'),
    DNA('-C-T')
]

msa = TabularMSA(seqs)
print(msa)

Here is the scikit-bio result in the Python GUI:

Figure: **scikit-bio** demo with Python4Delphi in Windows.
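
"Three sequences and four positions" describes a table: one row per sequence and one column per alignment position. As a plain-Python sketch of that layout (not scikit-bio's API), the same alignment can be viewed column by column and its gaps counted; the variable names are illustrative.

```python
# The same three aligned sequences as in the scikit-bio example above.
rows = ["ACGT", "AG-T", "-C-T"]

# An alignment is rectangular: every row must have the same length.
assert len({len(row) for row in rows}) == 1

# Transpose rows into columns, one string per alignment position.
columns = ["".join(column) for column in zip(*rows)]
gap_counts = [column.count("-") for column in columns]
shape = (len(rows), len(columns))

print(shape)       # (3, 4)
print(columns)     # ['AA-', 'CGC', 'G--', 'TTT']
print(gap_counts)  # [1, 0, 2, 0]
```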

6. How do you perform Bioinformatics tasks with scikit-image?

scikit-image is an image processing library that implements algorithms and utilities for use in research, education, and industry applications. It is released under the liberal Modified BSD open source license, provides a well-documented API in the Python programming language, and is developed by an active, international team of collaborators.

scikit-image aims to:

Provide high-quality, well-documented, and easy-to-use implementations of common image processing algorithms.
Facilitate education in image processing.
Address industry challenges.

First, here is how you can get scikit-image:

pip install scikit-image

Here is an example to interact with 3D images of kidney tissue:

import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage as ndi

import plotly
import plotly.express as px
from skimage import data

# Load image
data = data.kidney()

print(f'number of dimensions: {data.ndim}')
# Dimensions are provided in the following order: (z, y, x, c), i.e., [plane, row, column, channel]:
print(f'shape: {data.shape}')
print(f'dtype: {data.dtype}')

# Unpack the shape into named dimensions
n_plane, n_row, n_col, n_chan = data.shape

# Display the middle plane as a 2D image
_, ax = plt.subplots()
ax.imshow(data[n_plane // 2])
plt.show()

# According to the warning message, the range of values is unexpected. The image rendering is clearly not satisfactory colour-wise.
vmin, vmax = data.min(), data.max()
print(f'range: ({vmin}, {vmax})')

# Turn to plotly’s implementation of the imshow function, for it supports value ranges beyond (0.0, 1.0) for floats and (0, 255) for integers.
fig = px.imshow(data[n_plane // 2], zmax=vmax)
#plotly.io.show(fig)
fig.show()

Here are the scikit-image results with Python4Delphi:

Figure: **scikit-image** demo with Python4Delphi in Windows.

The second output will show up in your default browser (just like the default Plotly output):

Figure: **scikit-image** demo with Python4Delphi in Windows (Plotly output in the browser).
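
The range warning mentioned in the script comments comes down to how intensities are mapped for display. Another common way to deal with it is to rescale the values into the range 0.0 to 1.0 before plotting. The sketch below shows that min-max arithmetic on a flat list of made-up intensities; it is an illustration only, and for real arrays a library routine such as skimage.exposure.rescale_intensity is the better choice.

```python
def rescale_to_unit(values):
    """Min-max rescale a flat list of intensities into [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [0.0] * len(values)
    span = hi - lo
    return [(value - lo) / span for value in values]


# Hypothetical raw intensities, far outside the 0-255 display range.
pixels = [10, 500, 4095, 2000]
scaled = rescale_to_unit(pixels)
print([round(value, 3) for value in scaled])
```

After rescaling, the darkest pixel maps to 0.0 and the brightest to 1.0, which is the range plotting functions expect for floating-point images.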

7. Are you ready to build awesome things with these Python Bioinformatics libraries?

We have demonstrated 6 powerful Python libraries for Bioinformatics (Biopython, DEAP, Nilearn, PsychoPy, scikit-bio, and scikit-image), all of them wrapped inside a powerful GUI provided by Python4Delphi. We can’t wait to see what you build with Python4Delphi!

Want to know more? Then check out Python4Delphi, which easily allows you to build Python GUIs for Windows using Delphi, and download RAD Studio to build more powerful Python GUI Windows apps 5x faster with less code.