Table of Contents
What is quantum chemistry?
What do you think of when you hear the word chemistry? You imagine explosions, smoking glassware, and colorful compounds, right? But did you know that chemistry can be done outside a lab, and we can use Python tools to help too? There are chemical reactions that are difficult to perform experimentally. The reagents might be too expensive, the process outcome might be yet unknown, or it might be too time-consuming to perform a large series of reaction pathways in order to see which ones (if any) give the desired product. Chemistry can also be done by modelling atoms’ and molecules’ chemical and physical properties. The field of computational chemistry includes several research areas, including quantum chemistry and molecular dynamics.
An introduction to quantum chemical calculations
Quantum chemistry is a borderline field where chemistry and quantum physics meet. Quantum chemists typically use supercomputers to perform complicated calculations.
Quantum chemical calculations aim to obtain the total energy of a molecule and, from there, derive its properties. It sounds simple, however, only very few cases can be solved exactly, one of which is the hydrogen atom. The so-called Schrödinger equation can be solved for practically all molecules after applying a series of approximations. Not even the most powerful supercomputer can give the exact solution.
Quantum chemistry has been traditionally implemented using Fortran and C, though, in recent times, Python scripting has become increasingly popular due to its simplicity and the pre-existing libraries.
Atoms and molecules
Everything around us consists of atoms organised in a specific way. Some atoms form crystalline lattices, while others combine into molecules. Most of these interactions are due to how the electrons of the atoms respond to each other.
Electrons have specific energy, and they are assigned to what are called orbitals. The term came about at the time when scientists believed that electrons are moving around the atomic nuclei just like planets orbit around the Sun. This was later proven to be wrong.
Electrons exhibit wave-particle duality, which is an unintuitive concept where the electron can act as a wave or as a particle in different cases. This is an intrinsic property of many particles.
What does quantum mechanics tell us about chemical systems?
Electronic structure theory describes the electronic behaviour and the electronic configuration of atoms, molecular systems and materials. Within it, molecular orbital theory describes how chemical bonding occurs as an overlap between atomic orbitals. Typically, a chemical bond is formed by two electrons.
Knowing the energy and the electronic structure of a chemical system makes it possible to derive its properties, such as chemical reactivity and spectroscopic behavior.
How do we know that the quantum mechanical model is correct?
As mentioned above, the exact solution to the Schrödinger equation is usually unknown. The calculations give approximate results. They need to be compared to experimental data for a set of cases before the quantum chemical method can be used routinely.
Significant advances came about in the 90s when a method called density functional theory (DFT) started to predict chemical properties accurately. The results obtained from such studies can explain the nature of various molecules, and it serves as a starting point for further investigations.
What is quantum chemistry used for?
Modern quantum chemistry is a widely recognised field of science that is indispensable in rationalising chemical reaction mechanisms, predicting molecular properties and developing new materials for various practical applications.
Large systems of thousands of atoms are typically too heavy to perform. A protein would require months and even years to complete the computation, even when using a simplified approximation.
How can we implement quantum chemistry in a Python program?
One masterful implementation of various theoretical methods is done in the PySCF code available as a package that can be installed using pip. I use PyScripter to perform calculations with a relatively low computational cost on my personal computer. Heavier ones can only be done on a supercomputer where they run on many processors, sometimes for several days.
How to install PySCF?
PySCF is available for Linux and Mac. Currently, compilation under Windows fails because some libraries do not exist for Windows. Linux users only need to type
1 |
pip install pyscf |
Preferably, pip is part of a conda installation so that other required packages are pre-installed.
You will see the following error if you try installing PySCF under Windows:
I have installed PySCF on the Windows Subsystem for Linux (WSL). There are many tutorials online on how to enable it. I can write PySCF scripts using the feature of PySCF to open remote files. The WSL virtual machine is the “remote” server in this case. For more information, check out this blog post:
In the SSH server settings, I have defined the hostname as localhost, and I use the program called Putty to handle the SSH connection.
How to calculate molecular properties in PyScripter
A PySCF run requires that specific parameters be defined in an input file. The computational method, the coordinates of the atoms have to be specified
I need to create an input file where I will add the values that the code needs.
First, I need to prepare the molecular structure by drawing it in an external program called Avogadro. It is free and open source. The nuclear positions are shown as balls for simplicity. The red one is oxygen, and the two white ones are hydrogen atoms. You probably still remember basic chemistry that H2O is the formula of water.
Next, I went to the menu Extensions and selected Gaussian, which opens a small window with some settings. What I am interested in are the coordinates of the atoms in the text field. The last three rows contain the chemical symbol, O or H, and three numbers for each of them. These are the x, y and z coordinate. The rest of the settings in this window are not relevant to the example. I will copy the three lines and use them in the input file for PySCF.
The input file starts by importing two modules from the PySCF package. The data structures are initialised, and the coordinates from Avogadro are subsequently given inside quote marks.
After the coordinates, I need to specify something called the basis set. It consists of mathematical functions that are used to approximate the molecular orbitals in the calculation.
Then, I tell PySCF to analyse if the molecule is symmetric – which is true for water.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
#!/usr/bin/env python from pyscf import gto, scf mol = gto.Mole() # optionally enable a more detailed output # mol.verbose = 4 # defining the molecule mol.atom = ''' O -1.82473 1.24653 -0.01294 H -0.85684 1.28190 0.01714 H -2.10358 1.95337 0.58838''' # specifying the basis set mol.basis = 'ccpvdz' mol.symmetry = 1 mol.build() # Computational method # Hartree-Fock theory mf = scf.RHF(mol) mf.kernel() # Perform additional analyses # mf.analyze() |
I can save and call the above script just like any other Python script, and it gives the energy of the molecule.
1 2 |
$ ./scf.py converged SCF energy = -76.0267679973746 |
What happens under the hood?
The user defines the parameters in the Python script that calls PySCF. The first step is constructing the so-called Fock matrix, which is essentially a way of representing molecular orbitals using a set of mathematical functions called a basis set. Each function is taken with a certain weight. These weight coefficients are optimised during the calculation. They are used to calculate the energy of the molecule. By varying the values, one obtains different energies. A theorem states that the lower the energy, the closer it is to the correct value. Typically, a specific threshold is compared to the difference in the energies of two sets of coefficients in the matrix.
What happens if I choose another computational method for PySCF?
If I select another computational method, the energy will be slightly different. I have replaced the lines under the comment “Hartree-Fock theory” with the following ones:
1 2 3 4 |
# Density functional theory mf = mol.KS() mf.xc = 'b3lyp' mf.kernel() |
Everything else in the script is unchanged. Here, I am using a popular density functional called B3LYP. There are hundreds of other options.
Calling PySCF gives the value
1 |
converged SCF energy = -76.3832396709755 |
You might be wondering why the numbers are not identical. Why are there different methods to begin with? Well, it is complicated.
The problem stems from the fact that it is impossible to measure what molecular orbitals are like, so we need to do mathematical tricks to approximate them. Only an infinite number of basis functions can approximate exactly the actual orbital. Apparently, we cannot use an infinite number of functions on a computer, so we have to limit them to a feasible amount.
What have you learned about quantum chemistry?
You made it to the end, congratulations!
Quantum chemistry is an exciting field of science. I am biased because it is my job, but I am sure many will get interested in checking what it is like to be a researcher. How about you calculate the energy of aspirin? Or maybe caffeine?
Take up the challenge and set up PySCF to become a quantum chemist, too!
PyScripter is available for download for free at
https://www.embarcadero.com/free-tools/pyscripter/free-download