Chemical Modelling: Applications and Theory comprises critical literature reviews of all aspects of molecular modelling. Molecular modelling in this context refers to modelling the structure, properties and reactions of atoms, molecules and materials.
The tenth volume of the series brings Jan-Ole Joswig to the editorial team, along with a wealth of new reviews spanning several disciplines. For example, materials scientists will benefit from the review on Inverse Molecular Design for Materials, while Modelling PAHs will be of interest to environmental scientists. Other reviews focus in detail on modelling, such as Reaction Kinetics and Accurate Modelling of Electric Properties of Polyatomic Molecules from First Principles.
Each chapter provides a selective review of recent literature, incorporating sufficient historical perspective for the non-specialist to gain an understanding.
With chemical modelling covering such a wide range of subjects, this Specialist Periodical Report serves as the first port of call to any chemist, biochemist, materials scientist or molecular physicist needing to acquaint themselves with major developments in the area.
Prof. Dr. Michael Springborg heads one of the three groups in Physical Chemistry at the University of Saarland, where the main activities concentrate on teaching and research. The major part of his research concentrates on the development and application of theoretical methods, including accompanying computer programs, for the determination of materials properties. Quantum theory forms the theoretical foundation for most of the group's work. The materials of interest to the group range from atoms, via clusters and polymers, to solids. The group studies their structural, electronic, energetic, and optical properties.
Preface: Michael Springborg and Jan-Ole Joswig
Inverse molecular design for materials discovery: Dequan Xiao, Ingolf Warnke, Jason Bedford and Victor S. Batista
Complete basis set results in electron correlation methods using F12 theory: Andreas Köhn
Reactive intermediates with large amplitude degrees of freedom: Rex T. Skodje
Modelling electron quantum dynamics in large molecular systems: Diego A. Hoff and Luis G. C. Rego
Modelling polycyclic aromatic hydrocarbons and their derivatives: Mathias Rapacioli
Surface reactivity of the sulfide minerals: Guilherme Ferreira de Lima, Heitor Avelino de Abreu and Hélio Anderson Duarte
Electric dipole moments of small polyatomic molecules from first principles: Sergei N. Yurchenko
Inverse molecular design for materials discovery
Dequan Xiao, Ingolf Warnke, Jason Bedford and Victor S. Batista
DOI: 10.1039/9781849737241-00001
1 Introduction
Discovering materials with optimum properties is a long-standing dream for both experimental and theoretical researchers. Historically, scientists used a 'trial and error' approach to find new materials that exhibit desired properties. Owing to developments in modern theoretical and computational chemistry (e.g., density functional theory), predicting molecular properties using accurate and efficient quantum chemistry methods has become increasingly practical. As a consequence, inverse molecular design has emerged as an attractive computational approach to take on the challenges in materials discovery.
Inverse molecular design is a general term for molecular design strategies that stand in contrast to direct design methods. In direct design, a new molecule is proposed first, and then the molecular property is computed or measured to check its potential use. In contrast, inverse molecular design aims at searching for optimum points on hypersurfaces defining property-structure relationships, and then mapping out the molecular structures at the optimum points. Hence, inverse molecular design could significantly enhance the efficiency and success rate of molecular design and save costs in materials discovery.
Inverse molecular design has been implemented as an optimization method in theory, assisting the search for optimum chemical structures using global optimization algorithms.
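$$ f_{\mathrm{inv}} : \min_{\lambda_1, \lambda_2, \ldots, \lambda_n} \left| \hat{O}\big[ H[\lambda_1, \lambda_2, \ldots, \lambda_n] \big] - O_T \right| \qquad (1) $$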
Here, f_inv is a notation for the operation of inverse molecular design. Ô denotes a molecular property (an observable), which is a functional of the Hamiltonian operator H. λ1, λ2, ..., λn are the user-defined variables used to vary the Hamiltonian. For example, these variables could be the indices defining a molecule as a composition of molecular fragments, a set of nuclear coordinates, or even atomic numbers.
O_T denotes a given target value of a molecular property, e.g. a maximum point of the molecular property. The minimization operation 'min' may be performed through a variety of different optimization algorithms that minimize the quantity |Ô[H[λ1, λ2, ..., λn]] - O_T|. Thus, f_inv aims at finding a particular set {λ1, λ2, ..., λn} (and thereby a molecular structure) that has the best match to the target molecular property. This is a general formulation for the idea of inverse molecular design. When applied to specific systems, the formulation may be transformed for the purpose of optimizing particular target molecular properties.
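As a rough illustration of this formulation, the following Python sketch searches a discrete set of variables for the combination whose property best matches a target. It is only a toy under stated assumptions: the fragment names, their numerical contributions, the additive `toy_property` model and the exhaustive search are all hypothetical stand-ins for a real Hamiltonian-based property calculation and a real optimizer.

```python
from itertools import product

# Hypothetical fragment library: each site of a scaffold takes one fragment,
# and each fragment carries a made-up additive contribution to the property.
FRAGMENT_CONTRIBUTION = {"H": 0.0, "CH3": 0.4, "NH2": 1.1, "NO2": 2.3}


def toy_property(lambdas):
    """Stand-in for O[H[lambda_1, ..., lambda_n]]; not a real calculation."""
    return sum(FRAGMENT_CONTRIBUTION[frag] for frag in lambdas)


def inverse_design(target, n_sites):
    """Return the tuple of fragments minimizing |O[H[lambda]] - O_T|.

    Exhaustive enumeration is only feasible for tiny search spaces; it is
    used here purely to make the objective explicit.
    """
    candidates = product(FRAGMENT_CONTRIBUTION, repeat=n_sites)
    return min(candidates, key=lambda lam: abs(toy_property(lam) - target))


best = inverse_design(target=3.4, n_sites=2)
print(best, toy_property(best))
```

In a realistic setting the search space grows combinatorially with the number of variables, which is why the optimization algorithms reviewed in the following sections replace the exhaustive loop.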
In this work, we focus on reviewing inverse molecular design based on hypersurfaces of molecular properties vs. molecular structures that are constructed through direct calculations of molecular properties from variable Hamiltonians (for representing different chemical structures). Alternatively, analogous hypersurfaces relating property and structure have been constructed based on statistical models for molecular properties with respect to sets of chosen molecular descriptors (for molecular structures or properties). Such hypersurfaces are used extensively for inverse design based on quantitative structure-activity relationships (QSAR). We refer interested readers to the literature on inverse QSAR, which is not the focus of this review.
In molecular structure space, the Hamiltonian variables are associated with the atom types and their spatial arrangement. Different stochastic and deterministic optimization algorithms have been adapted to work in inverse molecular design methods. The choice of an optimization method depends on how the particular Hamiltonian, linking structure and property, is varied during a search (i.e. depends on the set of Hamiltonian parameters/variables that are varied).
We begin by reviewing optimization algorithms that are based on discrete molecular objects, such as genetic algorithms and Monte Carlo methods. Then, we describe an emerging approach named linear combination of atomic potentials (LCAP), developed by Beratan and Yang for inverse molecular design. This approach allows us to search for optimum molecules using continuous or discrete optimization algorithms. In particular, we will review recent progress made in applying LCAP in the tight-binding (TB) framework, which could provide an efficient way to search for molecules. Finally, we review applications of TB-LCAP for optimizing non-linear optical materials and dye-sensitized solar cells. Notably, novel materials have been proposed by TB-LCAP and verified by experiments. Due to the low computational cost of tight-binding electronic structure calculations, we envision that TB-LCAP will be a promising inverse molecular design method for taking on challenges in materials discovery such as catalyst design and solar fuels applications.
2 Strategies in inverse molecular design
Genetic algorithms and Monte Carlo methods are commonly used as optimizers for inverse molecular design in discrete molecular structure space of chemically representable candidates.
2.1 Genetic algorithms
Genetic algorithms (GAs) are methods tailored to address complicated multi-dimensional optimization problems. They belong to a broader class of evolutionary algorithms which originated in the 1960s and 1970s, when scientists started to explore the possibility of using basic principles of evolution to develop adaptive, highly efficient and general optimization schemes. Nowadays, GAs find extensive use in a large variety of scientific fields. Notably, over the past 30 years, they have become a major tool for computational disciplines in physics, chemistry and materials science, where they are used for atomistic and electronic structure level optimizations of molecular geometries, energies and properties.
In inverse molecular design, GAs may be applied as optimization algorithms to locate vectors X = {λ1, λ2, ..., λn} such that |Ô[H[λ1, λ2, ..., λn]] - O_T| becomes minimal. In electronic structure calculations there is generally no simple relationship between X and the quantity of interest |Ô[H[λ1, λ2, ..., λn]] - O_T|. The calculation of these properties is often complicated and comes at high computational cost. Efficient optimization algorithms such as GAs are necessary for locating optima on the hypersurface |Ô[H[λ1, λ2, ..., λn]] - O_T|.
Zunger et al. adapted GAs for the purpose of inverse molecular design, and discovered structural motifs with optimal bandgaps in quaternary (In, Ga)(As, Sb) semiconductors. Hutchison et al. used GAs for optimizing organic polymers for photovoltaics. To shed light on how GAs are used for inverse molecular design, we review selected basic principles and current developments.
GAs are population-based methods. In the GA context, a population consists of a set of trial solutions {Xa | a = 1, ..., population size}. Basic selection and recombination principles inspired by Darwinian evolution drive the adaptation of the population towards a target property, i.e. the optimum of an objective function. Genetic operators are realizations of the selection and recombination rules. They determine which trial solutions (individuals) are selected from the population and specify how they are recombined to form new ones. The concept of fitness refers to the difference between the objective function value of a trial solution and the global optimum.
Figure 1 illustrates the basic steps common to all GA optimizations. (1) The first step in a GA optimization consists in generating an initial population which, in practice, will often be made of randomized trial solutions X. (2) Subsequently, GAs proceed to evaluate the individuals' fitness. (3) Based on the fitness a selection operator chooses a number of individuals (parents) for recombination and formation of new individuals (mating). (4) During the mating process, crossover operators recombine characteristics of parent-solutions to form new trial solutions while mutation operators account for the introduction of small random changes with a certain likelihood. (5) The resulting new trial solutions are then pitched against their parents and form a new generation by replacing the weakest individuals of the previous one. Selection and mating are repeated until convergence is achieved.
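The five steps above can be sketched as a short Python loop. This is a minimal illustration rather than the scheme used in any of the cited work: the toy `objective`, the population settings, the binary tournament selection, the single-point crossover and the Gaussian mutation are all assumptions chosen for brevity, and a fixed generation count stands in for a convergence test.

```python
import random


def objective(x):
    """Toy stand-in for |O[H[X]] - O_T|; lower values mean fitter individuals."""
    return sum(v * v for v in x)


def run_ga(n_vars=4, pop_size=20, n_children=10, generations=60,
           mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # (1) Initial population of randomized trial solutions X.
    population = [[rng.uniform(-5.0, 5.0) for _ in range(n_vars)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(n_children):
            # (2)+(3) Evaluate fitness and select two parents; each parent
            # is the fitter of two randomly drawn individuals.
            mother = min(rng.sample(population, 2), key=objective)
            father = min(rng.sample(population, 2), key=objective)
            # (4) Single-point crossover followed by occasional mutation.
            cut = rng.randrange(1, n_vars)
            child = mother[:cut] + father[cut:]
            child = [v + rng.gauss(0.0, 0.3) if rng.random() < mutation_rate else v
                     for v in child]
            children.append(child)
        # (5) Children compete with their parents; the weakest individuals
        # are dropped so that the population size stays constant.
        population = sorted(population + children, key=objective)[:pop_size]
    return population[0]


best = run_ga()
print(best, objective(best))
```

Because step (5) always retains the best individuals found so far, the best objective value in this sketch can never get worse from one generation to the next.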
Fitness function. The criteria applied in the selection process are expressed in terms of a fitness function f_fit(X). To minimize a given objective function, f_fit can often be set equal to |Ô[H[λ1, λ2, ..., λn]] - O_T|. However, appropriate scaling of the objective function may influence the roughness of the potential hypersurface and play an important role in the efficiency of the optimization. Approximations to the objective function can be used to estimate the fitness of individuals. Such strategies can increase computational efficiency when the objective function evaluation comes at high computational cost. Especially when electronic structure methods are needed to calculate energies or locally optimize individuals, the evaluation of a single individual's fitness may take hours or even days. In such cases, an efficient and sufficiently accurate approximation can often be used to identify promising candidates, which are in turn treated at a higher theoretical level.
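The two-level strategy described above can be sketched as follows. The functions `cheap_objective` and `expensive_objective` are hypothetical placeholders (simple analytic formulas here) for a low-cost approximation and a higher-level calculation; the shortlist size `keep` is likewise an assumed parameter.

```python
def prescreen(candidates, cheap_objective, expensive_objective, keep):
    """Rank all candidates with the cheap model, then re-evaluate only the
    `keep` most promising ones with the expensive model."""
    shortlist = sorted(candidates, key=cheap_objective)[:keep]
    best = min(shortlist, key=expensive_objective)
    return best, shortlist


def cheap_objective(x):
    # Hypothetical low-cost estimate whose optimum is slightly misplaced.
    return abs(x - 2.3)


def expensive_objective(x):
    # Hypothetical "accurate" objective with its optimum at x = 2.
    return (x - 2.0) ** 2


best, shortlist = prescreen([0.0, 1.0, 1.9, 2.5, 4.0],
                            cheap_objective, expensive_objective, keep=2)
print(best, shortlist)
```

The approach only works if the approximation ranks candidates well enough that the true optimum survives the shortlist; how large `keep` must be depends on the quality of the cheap model.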
Selection operators realize methods for choosing individuals from the existing population to generate new individuals (children) in the recombination process. Population size, the number of individuals chosen for recombination and the number of new individuals to be generated from a given generation are parameters that may be chosen to suit the problem. In a straightforward selection, a certain number of the fittest individuals would be chosen for recombination. In practice, however, a variety of selection strategies exist which involve the drawing of random numbers. Such probabilistic schemes introduce a likelihood for weak individuals to be drawn for recombination while ensuring that the fittest individuals have the highest probability of being selected. Most commonly applied are fitness-proportionate selection schemes, where the likelihood of selection is proportional to the individual's fitness. In the alternative tournament selection method, the fittest individuals within randomly chosen sub-populations are drawn. Here, the number of sub-populations and the number of individuals to be drawn from each sub-population are parameters that adjust the selection pressure, i.e. the likelihood of weak individuals being drawn for recombination. Other selection schemes exist and are outlined in the literature.
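The two selection schemes named above might be sketched in Python as follows. This is an illustrative sketch only: it assumes a non-negative fitness where larger values are better, so an objective value to be minimized is first mapped to a fitness via the assumed transformation 1 / (1 + objective). Function and parameter names are hypothetical.

```python
import random


def fitness_proportionate(population, fitness, k, rng):
    """Draw k parents with probability proportional to their fitness."""
    weights = [fitness(ind) for ind in population]
    return rng.choices(population, weights=weights, k=k)


def tournament(population, fitness, k, tournament_size, rng):
    """Draw k parents; each is the fittest member of a randomly chosen
    sub-population of the given size."""
    return [max(rng.sample(population, tournament_size), key=fitness)
            for _ in range(k)]


rng = random.Random(1)
# Individuals are represented here only by their objective values.
population = [0.1, 0.5, 2.0, 4.0]


def fitness(objective_value):
    return 1.0 / (1.0 + objective_value)


print(fitness_proportionate(population, fitness, k=5, rng=rng))
print(tournament(population, fitness, k=5, tournament_size=2, rng=rng))
```

In the tournament variant, a larger `tournament_size` raises the selection pressure: when it equals the population size, only the fittest individual is ever drawn.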
Crossover operators realize the set of rules according to which the selected parents are recombined to form new individuals that exhibit combined or completely new characteristics. As is the case with all other GA operators, there is no unique way of defining such a procedure and several variants are commonly used. The simplest version is a single point crossover operator. It chooses a random crossing point between two parent vectors X and then interchanges the crossed sections. The number of crossing points can be increased to allow for more flexibility in the recombination process. A schematic of the procedure is presented in Fig. 2. The number of parents entering the recombination process, as well as the number of children resulting from the crossover is sometimes increased. "Cut-and-Splice" techniques differ slightly from the described crossover method; parents can have different crossover points which allows for evolutionary variation of the dimensionality of the optimization problem, or size of the system. The uniform crossover method uses a fixed mixing ratio (e.g. 0.5) and allows each bit of information to be interchanged between the parents individually and independently. This strategy is equivalent to choosing a random number of crossing points while maintaining a given ratio of information that is exchanged between the parents.
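The single-point and uniform crossover operators described above can be illustrated with a short Python sketch. It assumes individuals are equal-length lists and, for the uniform variant, that each position is swapped independently with probability `mixing_ratio`, so the exchanged fraction matches the ratio only on average. The function names are hypothetical.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Pick one random crossing point and interchange the tails."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def uniform_crossover(parent_a, parent_b, rng, mixing_ratio=0.5):
    """Swap each position between the parents independently."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < mixing_ratio:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


rng = random.Random(0)
a, b = [0] * 6, [1] * 6
print(single_point_crossover(a, b, rng))
print(uniform_crossover(a, b, rng))
```

Using parents made of all zeros and all ones makes it easy to see which parent each position of a child came from.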
Depending on how individuals are encoded in the GA, exchange of random regions during the crossover procedure may not always produce meaningful outcomes. Whenever trial solutions need to satisfy conditions such as minimum or maximum distances between nuclei, the ordering of variables may become important. A number of strategies exist to overcome this challenge. Some schemes involve repairing a random trial solution, while others create solutions that satisfy existing conditions by design. For example, Ahlrichs and co-workers describe a crossing method designed for optimizing geometries of clusters. The structures are cut into fragments and recombined such that
(i) only contiguous groups of atoms are interchanged and
(ii) the minimum interatomic distance is controlled.
Before the created individuals are pitched against the rest of the current population, their structures are locally relaxed to the closest local minimum.
Mutation operators. Most GAs incorporate the biological concept of mutation to preserve a degree of diversity during evolution of the population. Mutation operators introduce random changes to trial solutions while they are being generated in the crossover process. They are designed to help the GA avoid local trapping, which may occur whenever the diversity of the population has been reduced by evolution towards a non-global optimum. Mutation creates random features to enable sampling over new regions in the space of possible variables. The motivation for mutation operators is similar to the reasoning for introducing randomized selection operators that avoid an exclusive drawing of only the fittest individuals. Different versions of mutation operators exist and are appropriate depending on the nature of the optimization and on whether the algorithm uses vectors of real numbers or concatenated binary strings to represent individuals. For real-coded GAs, a common procedure is to add a Gaussian-distributed random value to an arbitrary variable in the vector. If the population is encoded in the form of binary vectors, bit operations such as swapping, inverting, and scrambling the bits of variables can be used to introduce mutations. The likelihood of mutation and the number of mutations allowed per individual influence sampling and convergence of the optimization. In general, an excessive mutation probability will slow convergence, as the genetic information stored in the population is lost at an elevated rate and the method becomes equivalent to a random search algorithm.
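As a hedged illustration of the two encodings mentioned above, the following Python sketch shows a Gaussian mutation for real-coded individuals and a bit-inversion mutation for binary-coded ones. The per-variable mutation probability `rate` and the width `sigma` are assumed parameters, and applying the mutation independently to every variable is one of several possible conventions.

```python
import random


def gaussian_mutation(x, rng, rate=0.1, sigma=0.1):
    """Real-coded individuals: add a Gaussian-distributed value to each
    variable with probability `rate`."""
    return [v + rng.gauss(0.0, sigma) if rng.random() < rate else v for v in x]


def bit_flip_mutation(bits, rng, rate=0.1):
    """Binary-coded individuals: invert each bit with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


rng = random.Random(0)
print(gaussian_mutation([1.0, 2.0, 3.0], rng, rate=0.5))
print(bit_flip_mutation([0, 1, 1, 0, 1, 0, 0, 1], rng, rate=0.25))
```

Setting `rate` close to 1 changes almost every variable in every individual, which illustrates why an excessive mutation probability turns the GA into an essentially random search.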
Convergence. Ideally, the algorithm would stop producing new generations when the global optimum is found. However, in most GA applications, there is no clear indicator that the global optimum has been located, i.e., no single criterion exists that is sufficient for global convergence. In practice, different criteria are used to decide when to terminate an optimization. The time available on a computer system might limit the number of generations that can be produced. For some applications it might be sufficient to locate solutions that exhibit a certain target fitness regardless of whether they correspond to the global optimum. Often, convergence is assumed when the population's fittest individuals stop evolving and remain constant for a number of generations. Sometimes, manual inspection of the individuals can give a good indication of convergence. For example, chiral structures may be among the low-energy geometries of metal clusters. If this is the case, then the GA has to locate both iso-energetic enantiomers. The lack of a sufficient convergence criterion is a problem for all global optimization methods.
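The practical termination criteria listed above (a generation budget, a target fitness, and stagnation of the best individual) could be combined as in the following Python sketch. The function name, the `patience` window and the tolerance `tol` are illustrative assumptions; none of these checks proves that the global optimum has been found.

```python
def should_stop(best_history, max_generations, target=None,
                patience=10, tol=1e-8):
    """Return the reason for stopping, or None to continue.

    best_history holds the best objective value of each generation so far
    (lower is better).
    """
    # Budget: the available computer time limits the number of generations.
    if len(best_history) >= max_generations:
        return "budget"
    # Target: a solution that is good enough has been located.
    if target is not None and best_history and best_history[-1] <= target:
        return "target"
    # Stagnation: the best value has not improved over `patience` generations.
    if (len(best_history) > patience
            and best_history[-patience - 1] - best_history[-1] <= tol):
        return "stagnation"
    return None


print(should_stop([5.0, 4.0, 3.0], max_generations=100, patience=2))
print(should_stop([3.0, 3.0, 3.0, 3.0], max_generations=100, patience=3))
```

Returning the reason rather than a bare boolean makes it easier to judge afterwards whether a run ended because it stagnated or merely because it ran out of budget.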
More aspects of GAs.
Seeding. In certain instances, the performance of GA optimizations may be improved greatly by seeding. If common features or patterns in solutions are known prior to the GA optimization, they may be incorporated into the initial population. For instance, when searching for minimum energy metal cluster structures of a given size N, the structural motifs that appear in the most stable clusters of size N - 1 may be used to generate initial structures of size N that may already be close to the optimum.
Excerpted from Chemical Modelling Applications and Theory Volume 10 by J-O Joswig, M. Springborg. Copyright © 2014 The Royal Society of Chemistry. Excerpted by permission of The Royal Society of Chemistry.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.