Ionic Molecular Crystallization

A few months ago, I encountered an interesting problem related to molecular crystallization. Consider a two-dimensional system with two types of molecules: the first type consists of rigid triangles with positive charges localized at the vertices (shown as blue triangles in the image below), and the second type is a single-atom molecule with a negative charge (shown as a green dot).

Our key questions are:

  1. Does this molecular system crystallize in the thermodynamic limit?
  2. If so, what are the characteristics of its ground states?

Let’s assume the system’s intermolecular interactions follow a normalized Coulomb/electrostatic potential:

    \begin{equation*} \phi_{(i,j)} = \frac{q_i q_j}{r_{(i,j)}},  i \in I, j \in J \end{equation*}

where I and J denote molecules of either type (green negative or blue positive triangular), r_{(i,j)} is the Euclidean distance between charges q_i and q_j. For green particles, q_i=-1, while for vertices of blue triangles, q_i=1. This creates attractive forces between opposite charges and repulsive forces between like charges, proportional to their relative distance.

The intermolecular energy is defined as the sum of all pairwise interactions \phi_{(i,j)} between molecules I and J:

    \begin{equation*} E(I,J) = \sum_{i \in I,j \in J} \phi_{(i,j)} \end{equation*}

This quantity generally lacks an infimum, so we need additional constraints.

We aim to find a sequence of molecular configurations X_1 \subset X_2 \subset \ldots \subset X_N \subset \ldots where the following limit converges to a constant:

    \begin{equation*} \lim_{ \ |X_N|\rightarrow \infty} \frac{1}{| X_N|} \min_{\iota: X_N \rightarrow \mathbb{R}^{(2N + n_{+})} } \sum_{\substack{ \{I,J\}| I \neq J \\ I,J \in X_N }} E(I,J) = \epsilon \end{equation*}

Here, X_N represents a collection of N molecules of both types (N=n_{-} + n_{+}).

Let’s focus on local clusters, as shown below:

For any k where (n_{-} + n_{+}) = 2k, an alternating configuration of charges in a local cluster can found for any \epsilon. Moreover, due to the rigidity of triangular molecules and the form of the potential \phi, the maximal cluster contains n_{-}=n_{+} = 6 charges.

In our example cluster, C, we set the energy to:

    \begin{equation*} E^{C}(C_0)= \sum_{\substack{\{i,j\} \in C_0 \\ i\neq j}}\phi(i,j) = -1 \end{equation}

for demonstration purposes.

Alternatively, the cluster configuration can be expressed in the form of the optimization problem:

    \begin{equation*} C_0 = \text{argmin}_{\iota:C_0 \rightarrow \mathbb{R}^{2k}}  \left[ E^{C}(C_0^{\iota}) +1 \right]^2 \end{equation}

In the C_0 configuration, the blue positive charges occupy the interstitial voids of the hexagonal close-packed arrangement of negative green charges:

Following these local optimizations, we define a quantity we call the energy per cluster as:

    \begin{equation*} E^C_{\mathcal{A}}(\xi)=E^{C}(C_0)+\sum_{\substack{\{i,j\} \\ i\in C_0 \\ j \in C_\alpha | \alpha \in \mathcal{A}}}\phi(i,j) \end{equation}

where \mathcal{A} is an index set . Minimizing E^C_{\mathcal{A}}(\xi) for such a cluster collection becomes a well-defined problem:

    \begin{equation*} \min_{\mathcal{\xi}:\mathcal{A} \rightarrow  \mathbb{R}^{2|\mathcal{A}|}} E^C_{\mathcal{A}}(\xi) \end{equation}

For cardinality |\mathcal{A}| = 6, the solution is illustrated below.

Note that the interaction potential \phi(i,j) respects molecular memberships—for i \in I and j \in J, we require I \neq J.

We can expand E_C^\Alpha to |\mathccal{A}|=18:

or any |\mathcal{A}|=6(2n-1).

What we did is that we transformed the problem of finding molecular system minimizers into finding minimizers of a single-particle system with potential \frac{1}{6}E^C_{\mathcal{A}} for |\mathcal{A}|=6, and we already know that the global minimizer is a dilated, translated, or rotated version of the triangular lattice.

Geometrically, this transformation effectively contracts local clusters into single points, resulting in a regular triangular tiling:

Having resolved the crystallization question, let’s examine the ground states’ characteristics, starting with a natural question: Are there other possible crystal phases? (From the materials science perspective, this question is important because of Crystal polymorphism)

On one side we can view our ground state configuration as a triangular tiling because of the rigidity of the blue molecules. On the other side, because of the way we have reframed the intermolecular interactions in terms of pairwise cluster interactions we may view the ground state configuration as the triangular lattice. The connecting point of these two views are the Archimedean circle packings and their associated semi-regular tessellations.

The densest packing of regular triangles is the p3m1 configuration with a density of 1. It is, in fact, a tiling of triangles as shown below.

The maximal non-isomorphic subgroup of p3m1 is the p31m plane group. The densest packing of regular triangles when the packing configurations are restricted to the p31m isomorphism class has a density of \frac{3}{4} and is shown in the image below.

This is, in fact, a consequence of removing one of the mirror symmetries from the p3m1 plane group.

Another way to get here is to look at p3m1, the Archimedean framework

associated with the p3m1 disc packing with packing density of \frac{\sqrt(3)}{9}\pi

, and the Archimedean framework associated with the hexagonal close packed configuration of disc with density \frac{\sqrt(3)}{6}\pi

and its associated Archimedean framework

By interlacing the hexagonal and triangular frameworks

we construct new framework. The common (blue) intersection point of the hexagona and triangular frameworks in fact define a triangular sub-tiling

and the p31m is constructed by removing triangle tiles containing a vertex at its center of mass

Alternatively we can create a complementary triangle packing with density of \frac{1}{4} by removing tiles not containing an interior vertex

The resulting structure is sometimes referred to as the Kagome Lattice, although it is not a lattice per se.

Starting from the cm triangle packing which is in fact a tiling

and decomposing it into two complementary packings with pg symmetry with density of \frac{1}{2}

This should come as no surprise since cm is the a maximal t-subgroup of p3m1 of index 3.

To demonstrate this point, we start again from the p3m1 disc packing framework

and interlace it with the triangular framework associated with p1 packing of discs

in such a way that disc centers are are subsets of the hexagon vertices

and we remove one pg symmetry from the tiling, the one not having a hexagon vertex at its center.

In fact, there are two possibilities to achieve the same result by using the triangle tiling above and rotating it by \pi with respect to zero (center of the middle pentagon).

and do exactly the same thing as previously to get the second pg triangle tiling

Combining these two triangle packings,

the cm symmetry of p3m1 should be clearly visible.

This tiling is a isomorphic to the one of the Archimedean (semi-regular vertex transitive) tilings associated with the densest cm packing of discs with density of (2-\sqrt{3})\pi

with it’s 3^3.4^2 semi-regular tiling

These constructions are nothing new. For a comprehensive treatment of regular and semi-regular tilings, I recommend Regular Polytopes by H. S. M. Coxeter, first published in 1947. (Coxeter was the geometer who inspired M. C. Escher, which resulted in the Circle Limit I–IV series.)

Returning to our original problem of characterizing and enumerating possible Coulomb potential ground states of our molecular system, all these triangle packings represent viable local energy minima. This conclusion follows from our transformation of the intermolecular interactions into a nearly spherically symmetric “energy per cluster” potential, which preserves all symmetries of the hexagonal close-packing configuration of discs: p2, p2gg, pg, p3, and p1, as well as the rigidity of the triangular molecule.

We have classified the packings by associating each configuration with Archimedean disc packings and tessellations. The global energy minimizer corresponds to the triangular tiling and lattice, as demonstrated by Theil in “A Proof of Crystallization in Two Dimensions” (Commun. Math. Phys. 262, 209–236, 2006). https://doi.org/10.1007/s00220-005-1458-7. Since all our configurations are subsets of the triangular tessellation relative to some tiling symmetry subgroup, they are energy minimizers within their respective symmetry groups. (See Bétermin, L. “Effect of Periodic Arrays of Defects on Lattice Energy Minimizers,” Ann. Henri Poincaré 22, 2995–3023, 2021. https://doi.org/10.1007/s00023-021-01045-0.)

A property of Archimedean tessellations is that they are vertex-transitive, meaning the same number, q, of regular p-gons meet at each vertex. In triangular tessellations, six regular triangles (3-gons) meet at each vertex. For regular tessellations like the triangular one, this follows a simple equation:

    \begin{equation*} q\left( 1- \frac{2}{p}  \right)\pi = 2\pi \end{equation}

This is only for one type of polygon, but we have more types. However, we can generalize this formula to n-types of polygons:

    \begin{equation*} q_1\left( 1- \frac{2}{p_1}  \right)\pi + q_2\left( 1- \frac{2}{p_2}  \right)\pi + \ldots + q_n\left( 1- \frac{2}{p_n}  \right)\pi = 2\pi \end{equation}

For example, the tessellation associated with the cm packing of discs shown above has every vertex surrounded by q_1 = 3 regular p_1 = 3-gons (triangles) and q_2 = 2 regular p_2 = 4-gons (squares).

Ultimately, the equation above is simply a disguised form of the Euler characteristic for plane-connected graphs induced by the vertex figure of a semi-regular tessellation.

What does this mean in the context of our molecular system?

  1. The vertex figure of each ground state is determined by the leading contributions to its total energy.
  2. In training Graph neural networks for Machine-learned interatomic potentials, learning these interactions alone suffices for ground state prediction in such molecular systems, significantly improving computational efficiency.
  3. Combined with the Crystallographic restriction theorem, this formula allows us to explore possible ground state configurations based on molecular shape. In a periodic configuration, for every p_i, there exists at least one k \in \{2,3,4,6\} such that the expression p_i \ \pmod{k} = 0 holds. Thus, for any such p_i-gon shaped molecule, the vertex figure equation has only a finite number of integer solutions. Unfortunately, in cases like the pentagon, we might need to delve into the realm of aperiodic tilings.

Chiral Interaction Energy Ground States

A few days ago, my beloved fiancé surprised me with a GEOMAG construction toy for my birthday. GEOMAG consists of small rods with embedded magnets and metallic spheres that can be assembled into various structures. It’s truly amazing—I had no idea something like this existed!

This toy can effectively model crystallography problems. For example, below is a geometric proof of the Crystallographic restriction theorem

It can also demonstrate structural rigidity problems, as shown in this framework

which becomes flexible after removing just one link.

The stable framework is actually the snub trihexagonal tiling – a regular tessellation associated with the p6 packing of discs.

This framework is created by drawing links between the centers of touching disks. The tessellation on the right is called a floret pentagonal tiling. We can create this tiling from the p6 packing of hexagons by connecting each hexagon’s edge vertices to their nearest lattice points.

From the crystallographic perspective, the vertices of the dual tessellation lie in the interstitial sites of the regular tessellation. Thus, the circles of the p6 packing are incircles of the pentagons in the floret pentagonal tiling and the hexagons in the hexagonal p6 packing.

This serves as a physical model of the Lennard-Jones system I described in my blog entry Lennard-Jones hexagonal molecular system. Through this model, I discovered that there are two chiral ground states connected by a continuous path in the configuration space.

Take a look at the animation below to see how it all works!

This is closely related to the vibrational part of lattice energy, degrees of freedom and structural stability of molecular crystals. Let’s have closer looks at this. Below is a visualization illustrating the transition between the chiral ground states of a 3-molecule cluster. Each molecule in the cluster contains six atoms. The intermolecular energy is given by the Lennard-Jones potential:

    \begin{equation*} E_{LJ}=\sum_{\substack{\{I,J\} \\ I \cap J =0}} \sum_{\substack{(i,j ) \\ i \in I,  j \in J}} \left[\left(\frac{1}{r_{(i,j)}}\right)^{12} - \left(\frac{1}{r_{(i,j)}}\right)^{6}\right] \end{equation*}

where r_{( i,j ) } = \| x_i - x_j \| is the Euclidean distance between the i-th atom of molecule I and the j-th atom of molecule J, as shown in the image below.

The chiral ground states are in clear energy potential wells and represent two different global solutions of the energy cluster E_{LJ} minimization problem.

From the GEOMAG model animation, one can imagine this cluster being held together only by the bonds between nearest neighbors. By allowing three of the total bonds to stretch, one introduces one degree of freedom into the otherwise rigid cluster, making it possible for the configuration to move to its neighboring potential well.

This however requires a short range interaction potential, since the pairwise interaction is computed among all atoms of different molecules not only the first neighbors. The Lennard-Jones potential is one such example where the leading contribution to the overall energy is the three atom cluster in the middle and the transition from one ground state to the other is animated as revolving around the centre of mass of this cluster.

It is also by this Lennard-Jones potential property that the ground state of this two-dimensional molecular system coincides with the densest \mathbf{p6} packing of discs. This means that there are also two chiral densest p6 packings of all n-gons with six-fold rotational symmetry. This observation connects directly to my earlier work on plane group packings during my PhD, as it shows an aspect these packings I overlooked. The results of this study are published in our Densest plane group packings of regular polygons manuscript.

Moreover, this explains the crystal defects in the Lennard-Jones hexagonal molecular system. Since mirror symmetry is not permitted in this two-dimensional system, we observe incompatible local ground state patches.

I was wondering if a real material with this crystal structure exists. In fact, it does, and it was published only a few years ago. See the image and reference below.

Rusek, M., Kwaśna, K., Budzianowski, A., & Katrusiak, A. (2019). Fluorine··· fluorine interactions in a high-pressure layered phase of perfluorobenzene. The Journal of Physical Chemistry C124(1), 99-106.

The above is a space-filling visualization of this high-pressure phase of perfluorobenzene, which is in excellent agreement with our highly simplified model. The sphere radii are set to the van der Waals radii. The left image shows one layer, while the right image displays the same layer with the carbons removed. The space group of this crystal is C2/c. C2/c symmetry consists of a two-fold rotational symmetry (here, within one layer of perfluorobenzene) and a perpendicular mirror symmetry. This effectively means that the layers alternate as two chiral two-dimensional ground states, interchanging chirality layer by layer.

Why is this important beyond just creating appealing pictures and animations? One reason is that it can significantly reduce computational time complexity by eliminating unnecessary calculations in Crystal Structure Prediction (CSP). For example, if we know beforehand that our target compound satisfies a few assumptions, the degrees of freedom to explore can be reduced to a maximum of 12, sometimes even less, as demonstrated in our Close-Packing approach to CSP (Alternatively, we can think of the hard sphere model as a pairwise interaction potential. However, in this blog, we are considering soft core interactions).

And we know time is money, as illustrated by the recent developments around the DeepSeek model and its ensemble learning approach to large language model training. The concept of ensemble here mirrors the Thermodynamic ensemble that Gibbs introduced in 1902.

This also shows how chemistry can inspire the development of novel machine learning algorithms beyond Machine-learned interatomic potentials.

Sphere packings continued – space groups P2_1 and C2/c

In this post, we’re diving back into the world of sphere packings, focusing on the space groups P2_1 and C2/c. Feel free to check out an earlier discussion on Densest P1, P-1 and P21/c packings of spheres. Both of these groups reach the packing density of \frac{\pi}{\sqrt{18}}.

The densest P2_1 packing almost identical to the P\overline{1} group. Even though they belong to different crystal systems—triclinic for P\overline{1} and monoclinic for P2_1—they share the same unit cell parameters: a = c = 2, b = 2\sqrt{2}, and \beta = \frac{\pi}{2}. Here’s a snapshot of what this packing looks like:

Next up is the C2/c space group, which also falls under the monoclinic category. This group is a bit more complex, with eight symmetry operations. Its densest packing has unit cell parameters of a = b = 4, c = 2\sqrt{3}, and \beta = \pi - \text{arcsin} \left( \frac{\sqrt{2}}{\sqrt{3}} \right). Check out the illustration below of the densest packing configuration:

So far, we’ve seen that the optimal packing density for the space groups P1, P\overline{1}, P21/c, P2_1, and C2/c matches the general optimal sphere packing density. This isn’t too surprising since these groups are all related to the Hexagonal Closed Packed structure, via group – subgroup relations. Interestingly, these space groups are among the top ten of the most frequently occurring in the Cambridge Structural Database, making up 74% of its entries.

Yayoi Kusama’s Chandelier of Grief: Symmetry in Art and Science

This photo captures my fiancé Lina and myself at Yayoi Kusama’s “Infinity Mirror Rooms” exhibit at Tate Modern in London, taken in March 2024. The installation, called “Chandelier of Grief,” uses mirrors and light to explore ideas about symmetry.

The centrepiece of the artwork is a chandelier surrounded by six mirrors arranged in a regular hexagonal prism. This creates an interesting visual effect—an infinite lattice of chandeliers extending in all directions, with each reflected chandelier appearing at the centre of symmetry within this lattice.

What we’re experiencing is a visual representation of a hexagonal lattice, one of the five two-dimensional Bravais lattices. Mathematically, it represents a discrete group of isometries in two-dimensional Euclidean space. The point group of this hexagonal lattice is isomorphic to the dihedral group D_6, highlighting its sixfold rotational symmetry and mirror reflection properties.

Physically, the hexagonal lattice emerges as the ground state configuration for systems of particles interacting via certain potentials, such as the Lennard-Jones potential, in the thermodynamic limit. Moreover, the hexagonal lattice is realised in materials like graphene, where carbon atoms arrange themselves in this highly symmetric pattern, leading to its unique properties.

Kusama’s installation is not only aesthetically pleasing but also serves as a visual representation of abstract concepts in mathematics and physics. The infinite reflections illustrate how artistic expression can intersect with scientific principles, making complex ideas accessible and engaging to a wider audience.

Exercise in Monte-Carlo packing density computation.

I needed a way to calculate the packing densities of structures modeled using the van der Waals sphere model, generated through CSP and DFT computations. Specifically, for periodic configurations, this comes down to computing:: \rho = \frac{\text{vol}(\mathbf{O})}{\text{vol}(\mathbf{U})}. Here, \text{vol}(\mathbf{U}) represents the volume of the unit cell \mathbf{U}, and \text{vol}(\mathbf{O}) is the volume of the subset of \mathbf{U} occupied by the van der Waals spheres denoted as \mathbf{O}.

Calculating the volume of \mathbf{U} is is straightforward; it’s simply the determinant of the unit cell’s generators: \text{vol}(\mathbf{U})}=\text{det}(\mathbf{U}). However, analytically computing the volume of \mathbf{O} involves solving numerous intricate integrals of intersections of spheres with various radii, making it cumbersome.

Let’s try to do this the Monte-Carlo way, instead. The volume of \mathbf{O} can be expressed as an integral over the unit cell \mathbf{U}, i.e.

    \begin{equation*}\text{vol}(\mathbf{O})=\int_{\mathbf{O}} d\mathbf{V}\end{equation*}

Here, d\mathbf{V} is the natural volume form on \mathbf{R}^3. Since this integral is zero everywhere except in the unit cell, it can be redefined as an integral of the indicator function over the unit cell:

    \begin{equation*}\text{vol}(\mathbf{O})=\int \bm{1}_{\mathbf{O}} d\mathbf{U}\end{equation*}

By changing coordinates to an integral over the unit cube \textbf{C}, we get:

    \begin{equation*}\text{vol}(\mathbf{O})=\int \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} \text{det}(\mathbf{U}) d\mathbf{C}\end{equation*}

Plugging this into the density equation yields:

    \begin{equation*}$\rho=\int \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} d\mathbf{C}\end{equation*}

Considering \mathbf{C} as a uniform random vector on the unit cube, this integral represents the expected value of the indicator function with respect to the 3-variate uniform distribution. This, in turn, equals the probability of X \sim \mathbf{C} lying in the collection of the van der Waals spheres \mathbf{U}^{-1}\mathbf{O}, scaled by the inverse of the unit cell:

    \begin{equation*}$\rho=\textbf{E}\left[ \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} \right]=\text{P}\left( X \in \mathbf{U}^{-1}\mathbf{O} \right)\end{equation*}

This means that the crystal’s density can be estimated by sampling X_1, \ldots X_N \sim \mathbf{C}, counting how many fall inside \mathbf{U}^{-1}\mathbf{O}, and dividing by N:

    \begin{equation*}\hat{\rho} = \frac{\# \left \{ X_i \in \mathbf{U}^{-1}\mathbf{O} \right \}}{N}\end{equation*}

Now, say I want to determine the density of this P21C anthracene structure

Equivalently, I can sample uniformly from the parallelepiped defined by the unit cell, e.g. 100 000 realisations,

and count how many fall inside at least one of the van der Waals spheres

which is 73056. Therefore, the density estimate is \hat{\rho}=0.73056.

But how accurate is this estimate? Or even better, how large does N need to be so that | \hat{\rho} - \rho | < c, where c is some constant?

To answer this, consider each X_i as a Bernoulli trial. Then \mathbm{X}=\sum_{i=1}^N X_i is a binomially distributed random variable with the expected value N\rho and variance N\rho\left(1-\rho \right). According to the central limit theorem, the random variable \mathbm{Z}=\frac{\sqrt{N}\left( \hat{\rho} - \rho \right)}{\sqrt{\rho\left(1-\rho \right)}} converges in distribution to a normally distributed random variable with a mean of 0 and a variance of 1. Therefore, I can request \mathbm{Z} being in the interval (-z, z), and I want the size of this interval to be less than some constant with probability p:

    \begin{equation*} P \left( | \hat{\rho} - \rho | < 2 z \sqrt{\frac{\hat{\rho}\left(1-\hat{\rho} \right)}{N}} \right)=p \end{equation*}

where z is the \frac{1+p}{2} quantile of the standard normal distribution.

For the P21C anthracene structure mentioned earlier, to achieve | \hat{\rho} - \rho | < 0.001 with probability p=0.999, I need N=9,000,000. The density estimate in this case is then:

    \begin{equation*} \hat{\rho} \approx 0.731501 \end{equation*}

As a point of comparison, CCDC’s Mercury reports the packing coefficient of this structure as 0.733255.

Closing note: This method could also be used to quantify uncertainty about the true packing during ETRPA‘s optimization run in the tradition of Sequential analysis.

Repairing Unfeasible Solutions in ETRPA.

The ETRPA is designed to optimize a multi-objective function by:

1. Maximizing packing density.
2. Minimizing overlap between shapes in a configuration.

This is achieved through the use of a penalty function, which generally works quite effectively. However, there are some downsides to this approach. Many generated samples are flagged as unfeasible, leading to inefficiency. Additionally, some potentially good configurations with minimal overlaps are discarded. A more effective approach would involve mapping unfeasible solutions to feasible ones, essentially repairing the overlaps.

I’ve designed a relatively straightforward method to accomplish this almost analytically. It boils down to finding roots of a second-order polynomial, which is essentially a high school algebra problem. Admittedly, you’ll need to solve quite a few of these second-order polynomials, but they are relatively easy to handle.

So, with this approach, you can now map a configuration with overlap/s

to one without overlaps

Now, a couple of new issues have cropped up. First, there are some seriously bad solutions that are being sampled. For example, consider this one:

which is then mapped to this configuration

This is not helpful at all. Second, the parameters of this configuration are way above the optimisation boundaries that were used to create all configurations. As a result, it falls outside the configuration space where my search is taking place.

To address this issue, after repairing the configuration, I conduct a check to determine if the repaired configuration falls within my optimization boundaries. Any configurations that fall outside of these bounds are labeled as unfeasible solutions.

It might seems this puts me back in the same situation of multi-objective optimization, but that’s not the case. In my scenario, the Euclidean gradient at time t takes this form:

\nabla_{\bm{\theta}} J(\bm{\theta}^t) \ = \ \bm{\mu}^t_{\textbf{F}_{1- \frac{1}{q^{*}}}}-\bm{\mu}^t.

Here, \bm{\mu}^t represents the mean of all samples in the current batch, and I only consider the configurations without overlap or the repaired ones that lie within the optimization boundaries for \bm{\mu}^t_{\textbf{F}_{1- \frac{1}{q^{*}}}}. This effectively means I’ve eliminated the need for the penalty function and am optimizing only based on the objective function given by the packing density.

This is a promising, but it does introduce another problem. When I repair some of the unfeasible configurations, I end up making slight alterations to the underlying probability distribution, shifting it slightly from \bm{\theta}^t

to \bm{\theta}_{\text{rep}}^t

The issue here is that if I were to compute the Natural gradient as I used to, I’d obtain \widetilde{\nabla} J(\bm{\theta}_{\text{rep}}^t) =\mathcal{I}_{\bm{\theta}_{\text{rep} }^t}^{-1} \left({\bm{\mu}_{\text{rep}}^t}_{\textbf{F}_{1- \frac{1}{q^{*}}}}-\bm{\mu_\text{rep}}^t\right) in the tangent space \mathcal{T}_{\bm{\theta}_\text{rep}^t} of the statistical manifold, where everything is unfolding, but what I actually need is \widetilde{\nabla} J(\bm{\theta}^t) in the tangent space at the point \bm{\theta}^t.

Fortunately, the statistical manifold of the exponential family is a flat Riemannian manifold. So, first, I move from \bm{\theta}^t to \bm{\theta}_{\text{rep} }^t using this vector \mathcal{I}_{\bm{\theta}^t}^{-1} \left(\bm{\mu_\text{rep}}^t-\bm{\mu}^t\right) (where \bm{\mu}^t corresponds to the dual Legendre parametrization of \bm{\theta}^t, and \bm{\mu}_\text{rep}^t to \bm{\theta}_\text{rep}^t), and then I compute the natural gradient at the point \bm{\theta}_\text{rep}^t.

In addition, to increase the competition within my packing population, I introduce a sort of tournament by retaining the N-best configurations I’ve found so far and incorporating them with the repaired configurations from the current generation. This leads to yet another slightly modified distribution, \bm{\theta}_\text{rep + N-best}^t

and I have to perform the same Riemannian gymnastics once more. In the end, the approximation to the natural gradient at the point \bm{\theta}^t is as follows

\widetilde{\nabla} J(\bm{\theta}^t) = \mathcal{I}_{\bm{\theta}_{\text{rep + N-best} }^t}^{-1} \left({\bm{\mu}_{\text{rep + N-best}}^t}_{\textbf{F}_{1- \frac{1}{q^{*}}}}-\bm{\mu}_{\text{rep + N-best} }^t\right) + \mathcal{I}_{\bm{\theta}_{\text{rep} }^t}^{-1} \left(\bm{\mu}_{\text{rep + N-best}}^t-\bm{\mu_\text{rep}}^t\right) + \mathcal{I}_{\bm{\theta}^t}^{-1}\left(\bm{\mu_\text{rep}}^t-\bm{\mu}^t\right)

So far, experiments show that these modifications to the ETRPA are far more effective than the original ETRPA. Here’s a visualization from one of the test optimization runs:

In this particular run, the best packing I achieved attains the density of 0.7845. Previously, reaching such a level of density required multiple rounds of the refinement process, each consisting of 8000 iterations. Now, a single run with just 2000 iterations is sufficient. It’s worth noting that even after only 200 iterations, the algorithm already reaches areas with densities exceeding 0.7.

It’s important to mention that currently, I’ve only implemented the repair process for unfeasible configurations for the P1 space group. However, similar repairs can be done for configurations in any space group. Furthermore, the algorithm’s convergence speed can be further enhanced by fine-tuning its hyperparameters. Perhaps it’s time to make use of the BARKLA cluster for the hyperparameter tuning.

Minimising a Lennard-Jones hexagonal molecular system

Each frame of this simulation showcases the outcomes of Sequential Quadratic Programming (SQP) minimization applied to a Lennard-Jones system. The system’s energy is defined by the following expression:

    \begin{equation*} E_{LJ}=\sum_{\{ A, B \} } \frac{1}{2}\sum_{a \in A} \sum_{b \in B} 4\epsilon\left[\left(\frac{\sigma}{r_{ab}}\right)^{12} - \left(\frac{\sigma}{r_{ab}}\right)^{6}\right] \end{equation*}

where the parameters are set as \sigma = 1 and \epsilon = \frac{1}{4}. Here, \{ A, B \} represents the ordered pair of molecules A and B. The simulation involves a system of 500 molecules, each comprising 6 atoms. At the start, a uniform random configuration was generated, with each molecule represented by two center of mass coordinates and one rotational coordinate.

As the simulation progresses, the boundaries of the box are incrementally reduced following each SQP minimization. This effectively simulates an increase in pressure on the system. Notice the hexagonal tiling patches as the simulation advances. These are depicted as blue regular hexagons, each scaled by a factor of 1.93185 and rotated by an angle of 0.17862 relative to the molecules.

It’s important to note that the empty spaces observed in the simulation are a result of the rate at which pressure is increased. A sufficiently slow increase in pressure would eventually lead to a 6^3 regular tiling.

Densest P1 packing of the tetrahedron

I applied the Entropic Trust Region Packing Algorithm to search for the densest lattice packing of tetrahedra. It turned out that the GPU implementation of the overlap constraint evaluation for space group packings of polyhedra was more complicated than I expected. The good news was that the optimization algorithm worked as intended.

The densest lattice packing of the regular tetrahedron is \frac{18}{49} \approx 0.36734 (https://www.ams.org/journals/bull/1970-76-01/S0002-9904-1970-12400-4/). This is the configuration the algorithm found. See the visualization of the output packing with density 0.36734 below.