Problem with particle correlation functions

Question

Given a probability density $\rho(\textbf{r})$ defined in a volume $V$ in $2$ or $3$ dimensions one can define the autocorrelation function as: $$ \Phi(r) = \frac{1}{4 \pi}\int_{V}{\rho{(\textbf{r}')}[\int_{S(r)}{\rho{(\textbf{r}' + \textbf{t})}ds}] dV} $$ for a distance $r$. Where $S(r)$ represents a surface of radius $r$. This is basically an autocorrelation for points at a distance $r$ averaged over all directions. Forgive me for my limited mathematical background, this is not exactly my field, but I think my question is valid.

Assuming this represents a distribution of particles whose exact distribution $\rho(\textbf{r})$ is unknown, but $\Phi(r)$ is known and the distribution is homogeneous, this would mean that if I want to estimate the probability $P$ to find a particle around the point $p$ inside $V$, if not given any information it would be a fairly good estimation to say $P = dV/V$. This would be the best effort assuming an uniform distribution.

But if I pick a random particle from the box whose position is $\textbf{r}_{a}$, I would improve the initial estimation with this new information and the use of the isotropic correlation function $\Phi$. The new value would be $P = \Phi(|\textbf{r}_{p} - \textbf{r}_{a}|)$. So far so good.

But what if I pick another particle from the box whose position is $\textbf{r}_{b}$. It is not clear to me what is the most obvious way to use this additional information to improve the density estimation at $\textbf{r}_{p}$. If I ignore the previous particle, it would be $P = \Phi(|\textbf{r}_{p} - \textbf{r}_{b}|)$, does it mean I take the average of both estimations? I am lost here, I have the impression that given the fact that I know the position of 2 particles, I know more about the system and the estimation has to be better than using one particle. I just don't know how to combine both information at the same time the proper way. This would be solved of course knowing the three point correlation function, but supposing we have isotropy, $\Phi$ should be enough. Any ideas?

Andrew · Accepted Answer · 2021-07-06 05:29:40Z

Let's back up for a moment and fill in some of the background in your question (I will get to your question, I promise).

I'm going to call your collection of particles a "gas" for lack of a better word. But we won't consider the velocities, just the positions of the particles.

If our gas has $N$ particles, then the complete distribution of the gas particle positions will give a probability (density) for each state of the gas. Meaning, the state is fully specified if we can assign a probability (density) to each combination $\{\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N\}$. Let's call this distribution $p(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N)$. We will assume that the distribution is unchanged if we permute the particles.

From this distribution, you can define (at least formally) a few different quantities:

To find the probability of finding one particle at a given location $\mathbf{r}$, we average (or marginalize) over the possible positions of the other particles: \begin{equation} \rho(\mathbf{r}) = \int {\rm d} \mathbf{r}'_2 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}, \mathbf{r}'_2, \cdots, \mathbf{r}'_N) \end{equation} where, to make things as explicit as possible, I've put primes on the variables that are being integrated over
To find the probability of finding two particles at positions $\mathbf{r}_1$ and $\mathbf{r}_2$, we marginalize over particles $3$ to $N$ \begin{equation} \rho(\mathbf{r}_1, \mathbf{r}_2) = \int {\rm d} \mathbf{r}'_3 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}'_3 \cdots, \mathbf{r}'_N) \end{equation}
To find the probability of finding particle 1 at position $\mathbf{r}_1$, given that we know that particle 2 is at position $\mathbf{r}_2$, we can use conditional probability. Let's build up in steps. First we use the definition of conditional probability \begin{equation} \rho(\mathbf{r}_1|\mathbf{r}_2) = \frac{\rho(\mathbf{r}_1,\mathbf{r}_2)}{\rho(\mathbf{r}_2)} \end{equation} Then, we use the above bullet points to express this as \begin{equation} \rho(\mathbf{r}_1|\mathbf{r}_2) = \frac{\int {\rm d} \mathbf{r}'_3 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}'_3 \cdots, \mathbf{r}'_N)}{\int {\rm d} \mathbf{r}'_1 {\rm d} \mathbf{r}'_3 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}'_1, \mathbf{r}_2, \mathbf{r}'_3, \cdots, \mathbf{r}'_N)} \end{equation}
Finally, to get closer to your question, to get the position of particle 1 given knowledge of particles 2 and 3, we again use conditional probability \begin{eqnarray} \rho(\mathbf{r}_1|\mathbf{r}_2, \mathbf{r}_3) &=& \frac{\rho(\mathbf{r}_1,\mathbf{r}_2, \mathbf{r}_3)}{\rho(\mathbf{r}_2, \mathbf{r}_3)} \\ &=& \frac{\int {\rm d} \mathbf{r}'_4 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{r}'_4 \cdots, \mathbf{r}'_N)}{\int {\rm d} \mathbf{r}'_1 {\rm d} \mathbf{r}'_4 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}'_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{r}'_4, \cdots, \mathbf{r}'_N)} \end{eqnarray}

You can compute correlation functions from the general distribution functions by taking appropriate expectation values. For example, the two point correlation function for $\mathbf{r}_1$, given knowledge of $\mathbf{r}_2$ and $\mathbf{r}_3$, is \begin{eqnarray} c(\mathbf{s}|\mathbf{r}_2,\mathbf{r}_3) &=& \langle (\mathbf{r}_1) (\mathbf{r}_1 + \mathbf{s}) \rangle \\ &=& \int {\rm d} \mathbf{r}_1 (\mathbf{r}_1) (\mathbf{r}_1 + \mathbf{s}) \rho(\mathbf{r}_1|\mathbf{r}_2,\mathbf{r}_3)\\ &=& \frac{\int {\rm d} \mathbf{r}_1 \int {\rm d} \mathbf{r}'_4 \cdots {\rm d} \mathbf{r}'_N (\mathbf{r}_1) (\mathbf{r}_1 + \mathbf{s}) p(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3,\mathbf{r}'_4, \cdots, \mathbf{r}'_N)}{\int {\rm d} \mathbf{r}'_1 {\rm d} \mathbf{r}'_4 \cdots {\rm d} \mathbf{r}'_N p(\mathbf{r}'_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{r}'_4, \cdots, \mathbf{r}'_N)} \end{eqnarray}

That's about all we can say in general.

However, we can make some progress if we are willing to make some simplifying assumptions.

The simplest possible assumption would be that the particles positions are independent, so the probability factorizes $p(\mathbf{r}_1, \mathbf{r}_2, \cdots \mathbf{r}_N) = p(\mathbf{r}_1) p(\mathbf{r}_2) \cdots p(\mathbf{r}_N)$. But this is a little bit too boring to get an interesting answer. In particular it means that knowing the position of particle 2 tells us nothing about the position of particle 1, $\rho(\mathbf{r}_1|\mathbf{r}_2)=\rho(\mathbf{r}_1)$, which you can check explicitly using the formulas above.

The next simplest assumption we can make (which is complicated enough to be somewhat interesting) is that the particles are Gaussian distributed. This means that there is covariance between the particles, so the two point correlation function between different particles is not zero but because of Wick's theorem there is no additional information contained in higher order correlation functions.

Finally, we will assume (as you specify in the question) that the correlations are isotropic, meaning that they only depend on the distance between the particles, and not their orientation in space.

A bit more explicitly, by assuming that the correlations are Gaussian and isotropic, we are assuming we can express the probability distribution as \begin{equation} p(\mathbf{r}_1, \mathbf{r}_2, \cdots \mathbf{r}_N) = \frac{1}{\sqrt{(2\pi)^N\det C}} \exp\left(-\frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N C_{ij}^{-1} |\mathbf{r}_i - \mathbf{r}_j|^2 \right) \end{equation} where $C_{ij}$ is the covariance matrix between particle $i$ and particle $j$.

I am going to assume that the particles are approximately uniformly distributed over some volume. To avoid getting too far into the weeds, a simple way to implement this is to assume the variance of each particle is $\sigma^2$, and that $\sigma^2$ (the diagonal elements of the covariance matrix) is much larger than the covariance between any two particles. Then the covariance matrix has the form $C_{ij} = \sigma^2 \delta_{ij} + \epsilon \Sigma_{ij}$, with $\delta_{ij}$ being the Kronecker delta symbol and the diagonal elements of $\Sigma$ being 0, and $\epsilon \ll 1$. The inverse covariance matrix has the form $C^{-1}_{ij}=\sigma^{-2} \delta_{ij} - \frac{\epsilon}{\sigma^4}\Sigma_{ij} + O(\epsilon^2)$. Then we will later on define quantities like $\sigma_{12}^{-2} = \frac{\epsilon}{\sigma^4}\Sigma_{12}$ describing covariances between particles.

These Gaussian distributions are nice because we can proceed to do the integrals analytically. In particular, after marginalizing over a subset of variables in a multi-variate distribution (assuming zero mean), the result is a multi-variate distribution with the covariance matrix replaced by the covariance matrix for the variables remaining after marginalization. (See for instance wikipedia)

For two particles, we find the result you intuited in your question

\begin{equation} \rho(\mathbf{r}_1|\mathbf{r}_2) = \frac{1}{\sqrt{2\pi \sigma_{12}^2}} \exp\left(-\frac{|\mathbf{r}_1-\mathbf{r}_2|^2}{2\sigma_{12}^2}\right) \end{equation} where $\sigma_{12}^2 \equiv C_{12}$ is the covariance between particles $1$ and $2$.

For the position of particle 1, given knowledge of particles 2 and 3, we find \begin{eqnarray} \rho(\mathbf{r}_1|\mathbf{r}_2,\mathbf{r}_3) &=& \sqrt{\frac{1}{(2\pi)^2 \sigma_{12}^2 \sigma_{13}^2}} \exp\left( -\frac{|\mathbf{r}_1-\mathbf{r}_2|^2}{2\sigma_{12}^2} -\frac{|\mathbf{r}_1-\mathbf{r}_3|^2}{2\sigma_{13}^2} \right) \\ &=&\rho(\mathbf{r}_1|\mathbf{r}_2) \rho(\mathbf{r}_1|\mathbf{r}_3) \end{eqnarray} Note that in dong this, I used the assumed form of the covariance matrix as a large diagonal term and small off-diagonal terms, described above.

So, assuming Gaussianity, the extra information added by additional particles can be incorporated by multiplying the conditional probabilities (this encodes the idea that the Gaussian distribution is fully described by the covariances between pairs of points).

If we break the assumption of Gaussianity, then we won't be able to do the integrals explicitly for arbitrary $N$ and we will need to make some assumption about the form of $\rho(\mathbf{r}_1|\mathbf{r}_2,\mathbf{r}_3)$ to make more progress. This will require some kind of special theoretical or experimental input based on the specific problem. Additionally, the factorization property in the last line above will typically break down, encoding the fact that there can be correlations between triplets of particles which the Gaussian distribution assumes are zero.

Really nice answer, I will go slowly through the steps you mention. In fact, I was thinking about galaxies rather than gas particles, but I thought these ideas would be similar for molecules and pretty much still applies. For galaxies, the 2 point C.F. is rather a power law instead of gaussian, so I wonder if it makes sense to assume there could be a galaxy distribution with a typical power-law C.F. for which the three point C.F. would not give any additional information. — Agustín
– Agustín, Commented Jul 13, 2021 at 22:12
@Agustín Cool! The "particles" can certainly be galaxies :) The 2 point CF is never actually a Gaussian except in textbooks. I suspect (but don't know for sure) that you won't be able to compute $\rho(r_1|r_2,r_3)$ exactly for anything other than a Gaussian distribution, because of the magic property I linked to on wikipedia. More physically, I suspect $\rho(r_1|r_2,r_3)=\rho(r_1|r_2)\rho(r_1|r_3)$ probably breaks down -- (a) if $r_2$ and $r_3$ are in a galaxy cluster, and (b) if $r_1$ is more likely to be attracted gravitationally to them, and therefore closer compared to the Gaussian case. — Andrew
– Andrew, Commented Jul 14, 2021 at 4:33

Stack Exchange Network

Problem with particle correlation functions

1 Answer 1

Your Answer

Hot Network Questions

Problem with particle correlation functions

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions