Exercise in Monte-Carlo packing density computation.

I needed a way to calculate the packing densities of structures modeled using the van der Waals sphere model, generated through CSP and DFT computations. Specifically, for periodic configurations, this comes down to computing:: \rho = \frac{\text{vol}(\mathbf{O})}{\text{vol}(\mathbf{U})}. Here, \text{vol}(\mathbf{U}) represents the volume of the unit cell \mathbf{U}, and \text{vol}(\mathbf{O}) is the volume of the subset of \mathbf{U} occupied by the van der Waals spheres denoted as \mathbf{O}.

Calculating the volume of \mathbf{U} is is straightforward; it’s simply the determinant of the unit cell’s generators: \text{vol}(\mathbf{U})}=\text{det}(\mathbf{U}). However, analytically computing the volume of \mathbf{O} involves solving numerous intricate integrals of intersections of spheres with various radii, making it cumbersome.

Let’s try to do this the Monte-Carlo way, instead. The volume of \mathbf{O} can be expressed as an integral over the unit cell \mathbf{U}, i.e.

    \begin{equation*}\text{vol}(\mathbf{O})=\int_{\mathbf{O}} d\mathbf{V}\end{equation*}

Here, d\mathbf{V} is the natural volume form on \mathbf{R}^3. Since this integral is zero everywhere except in the unit cell, it can be redefined as an integral of the indicator function over the unit cell:

    \begin{equation*}\text{vol}(\mathbf{O})=\int \bm{1}_{\mathbf{O}} d\mathbf{U}\end{equation*}

By changing coordinates to an integral over the unit cube \textbf{C}, we get:

    \begin{equation*}\text{vol}(\mathbf{O})=\int \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} \text{det}(\mathbf{U}) d\mathbf{C}\end{equation*}

Plugging this into the density equation yields:

    \begin{equation*}$\rho=\int \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} d\mathbf{C}\end{equation*}

Considering \mathbf{C} as a uniform random vector on the unit cube, this integral represents the expected value of the indicator function with respect to the 3-variate uniform distribution. This, in turn, equals the probability of X \sim \mathbf{C} lying in the collection of the van der Waals spheres \mathbf{U}^{-1}\mathbf{O}, scaled by the inverse of the unit cell:

    \begin{equation*}$\rho=\textbf{E}\left[ \bm{1}_{\mathbf{U}^{-1}\mathbf{O}} \right]=\text{P}\left( X \in \mathbf{U}^{-1}\mathbf{O} \right)\end{equation*}

This means that the crystal’s density can be estimated by sampling X_1, \ldots X_N \sim \mathbf{C}, counting how many fall inside \mathbf{U}^{-1}\mathbf{O}, and dividing by N:

    \begin{equation*}\hat{\rho} = \frac{\# \left \{ X_i \in \mathbf{U}^{-1}\mathbf{O} \right \}}{N}\end{equation*}

Now, say I want to determine the density of this P21C anthracene structure

Equivalently, I can sample uniformly from the parallelepiped defined by the unit cell, e.g. 100 000 realisations,

and count how many fall inside at least one of the van der Waals spheres

which is 73056. Therefore, the density estimate is \hat{\rho}=0.73056.

But how accurate is this estimate? Or even better is to ask, how large does N need to be so that | \hat{\rho} - \rho | < c, where c is some constant?

To answer this, consider each X_i as a Bernoulli trial. Then \mathbm{X}=\sum_{i=1}^N X_i is binomially distributed random variable with the expected value N\rho and variance N\rho\left(1-\rho \right). According to the central limit theorem, the random variable \mathbm{Z}=\frac{\sqrt{N}\left( \hat{\rho} - \rho \right)}{\sqrt{\rho\left(1-\rho \right)}} converges in distribution to a normally distributed random variable with mean of 0 and a variance of 1. Therefore, I can request \mathbm{Z} being in the interval (-z;z), and I want the size of this interval to be less then some constant with probability p:

    \begin{equation*}P \left( | \hat{\rho} - \rho | < 2 z \sqrt{\frac{\hat{\rho}\left(1-\hat{\rho} \right)}{N}} \right)=p \end{equation*}

where z is the \frac{1+p}{2} quantile of the standard normal distribution.

For the P21C anthracene structure mentioned earlier, to achieve | \hat{\rho} - \rho | < 0.001 with probability p=0.999, I need N=9,000,000. The density estimate in this case is then:

    \begin{equation*}\hat{\rho} \approx 0.731501\end{equation*}

As a point of comparison, CCDC’s Mercury reports the packing coefficient of this structure as 0.733255.