# First weeks of coding phase

In this blog post I will share my progress so far in GSoC 2020 with ArviZ. My intention for this blog post is to reflect the work in progress for discussion and planning purposes. Beware that there is little to no finished work displayed.

# Uploading a new `InferenceData` object

Given that my project involves working with circular variables, the first task I encountered was to find a suitable `InferenceData` object to use in examples and tests. Since I have worked with the φ and ψ torsion angles of molecules in the past, I had some models and data that came in handy.

The `InferenceData` object I decided to upload to figshare.com contains the sampled values for two pairs of φ and ψ torsion angles in a glycan molecule. This glycan molecule is part of a resolved protein structure with PDB (Protein Data Bank) ID 2LIQ. It is quite a small glycan, containing only three subunits.

After uploading it to figshare.com, I made it available to import from ArviZ with:

```
import arviz as az

torsionals = az.load_arviz_data('glycan_torsion_angles')
```

Here’s a code snippet that contains a brief description of the model used to obtain this `InferenceData` object.

```
    "glycan_torsion_angles": RemoteFileMetadata(
        filename="glycan_torsion_angles.nc",
        url="http://ndownloader.figshare.com/files/22882652",
        checksum="4622621fe7a1d3075c18c4c34af8cc57c59eabbb3501b20c6e2d9c6c4737034c",
        description="""
Torsion angles phi and psi are critical for determining the three dimensional
structure of bio-molecules. Combinations of phi and psi torsion angles that
produce clashes between atoms in the bio-molecule result in high energy, unlikely structures.
This model uses a Von Mises distribution to propose torsion angles for the
structure of a glycan molecule (pdb id: 2LIQ), and a Potential to estimate
the proposed structure's energy. Said Potential is bound by Boltzman's law.
""",
    ),
}
```

# Circular Histogram plot

I managed to obtain a circular histogram by modifying the ArviZ `plot_dist` function. I added a new argument to `plot_dist` called `is_circular` in order to obtain the following plots.

We have been discussing with the ArviZ developers whether to incorporate each variable's domain into the `InferenceData` object, so that ArviZ can automatically detect if a variable is circular and proceed accordingly. This would be very convenient for plotting circular variables.

Here’s an example of a circular histogram:

```
az.plot_dist(torsionals.posterior.tors, is_circular=True, kind='hist')
```
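To make the idea concrete, here is a minimal sketch of how a circular histogram can be built with plain NumPy and drawn on a Matplotlib polar axis. It is not the code inside `plot_dist`; the function names `circular_hist_bars` and `plot_circular_hist` and the default of 36 bins are my own assumptions for illustration.

```python
import numpy as np


def circular_hist_bars(values, bins=36):
    """Return bar centers, heights and bar width for a histogram on [-pi, pi).

    Hypothetical helper for illustration, not part of ArviZ.
    """
    values = np.asarray(values, dtype=float).ravel()
    # Wrap angles into [-pi, pi) so every sample falls inside the bin range.
    wrapped = (values + np.pi) % (2 * np.pi) - np.pi
    heights, edges = np.histogram(
        wrapped, bins=bins, range=(-np.pi, np.pi), density=True
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    return centers, heights, width


def plot_circular_hist(values, bins=36):
    """Draw the histogram bars on a polar axis (requires matplotlib)."""
    import matplotlib.pyplot as plt

    centers, heights, width = circular_hist_bars(values, bins)
    _, ax = plt.subplots(subplot_kw={"projection": "polar"})
    # On a polar axis, the bar position and width are angles in radians.
    ax.bar(centers, heights, width=width, edgecolor="k", alpha=0.6)
    return ax
```

Note that the polar axis expects angles in radians, which is why the units of the input matter for this kind of plot.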

While I was at it, I realized that the plot was not correct when the input was in degrees. The issue was caused by the plotting function misinterpreting the computed bins. I decided to check internally whether the input is in degrees and, if so, transform it to radians.

```
if values.min() < np.pi and values.max() > np.pi:
    values = np.deg2rad(values)
```

This is just a practical rule I came up with. I would like to check if there is a more theoretically supported way to do it.
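As a quick way to probe the rule, the sketch below wraps it in a small function and tries it on three inputs. The name `maybe_deg2rad` and the sample angles are made up for illustration. It suggests one limitation: radians expressed in [0, 2π) also have a minimum below π and a maximum above π, so the rule treats them as degrees.

```python
import numpy as np


def maybe_deg2rad(values):
    """Apply the practical rule: convert only if the data look like degrees.

    Hypothetical helper for illustration, not part of ArviZ.
    """
    values = np.asarray(values, dtype=float)
    if values.min() < np.pi and values.max() > np.pi:
        return np.deg2rad(values)
    return values


degrees = np.array([-170.0, -30.0, 45.0, 120.0])
radians = np.deg2rad(degrees)          # same angles, in [-pi, pi]
radians_0_2pi = radians % (2 * np.pi)  # same angles, in [0, 2*pi)

converted = maybe_deg2rad(degrees)       # detected as degrees and converted
untouched = maybe_deg2rad(radians)       # max is below pi, so left alone
misread = maybe_deg2rad(radians_0_2pi)   # max is above pi, so wrongly converted
```

The opposite failure is also possible: degree data that happen to stay within ±π (about ±3.14 degrees) would be left unconverted. An explicit units argument or metadata stored with the variable would avoid guessing, but that is a design question still to be discussed.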

With this little rule and providing an input in degrees, I got this plot:

# Circular KDE plot

After working a bit on a circular histogram, getting a circular KDE plot was quite straightforward.

```
az.plot_dist(torsionals, is_circular=True)
```

As you can see, there’s an issue with this KDE as the density’s edges don’t meet. This is one of the main points to solve.
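One possible direction, sketched here only as an assumption and not as what ArviZ does, is to build the density from von Mises kernels. Each kernel depends on the angle only through a cosine, so the estimate is 2π-periodic and the two edges meet by construction. The function name `vonmises_kde` and the concentration `kappa=25.0` (which plays the role of a bandwidth) are hypothetical choices.

```python
import numpy as np


def vonmises_kde(samples, kappa=25.0, grid_len=256):
    """Average of von Mises kernels centred on each sample, on a grid over [-pi, pi].

    Hypothetical sketch, not the ArviZ KDE. Larger kappa means narrower kernels.
    """
    samples = np.asarray(samples, dtype=float).ravel()
    grid = np.linspace(-np.pi, np.pi, grid_len)
    # Pairwise angular differences, shape (grid_len, n_samples).
    diffs = grid[:, None] - samples[None, :]
    # von Mises kernel: exp(kappa * cos(d)) / (2 * pi * I0(kappa)).
    # np.i0 is the modified Bessel function of the first kind, order 0.
    kernels = np.exp(kappa * np.cos(diffs)) / (2 * np.pi * np.i0(kappa))
    return grid, kernels.mean(axis=1)


# Example: samples concentrated around the +/- pi boundary.
rng = np.random.default_rng(1)
samples = rng.vonmises(mu=np.pi, kappa=3.0, size=1000)
grid, density = vonmises_kde(samples)
```

With samples concentrated near the boundary, the first and last grid values of `density` coincide, which is the property the current plot is missing. Choosing `kappa` from the data is a separate problem that this sketch does not address.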

# Circular Trace plot

For the circular trace plot I added an argument to the function called `circular_vars`. This is to identify which variables need a circular KDE.

```
az.plot_trace(torsionals, var_names=['tors', 'E', 'beta', 'alpha'], circular_vars=['tors'])
```

I am not so happy about how it looks, but it is a start…

Besides the problem I already pointed out with the KDE plot, I think the circular plots are too small, and as a consequence the entire plot has too much white space. In general, there is a lot to improve in this circular trace plot.

# To-Do:

- Try to improve on ways to check if the input is in degrees or radians.
- Develop appropriate tests for the new arguments.
- Fix `plot_kde` for circular variables. This involves analysing the ArviZ KDE computation and understanding why it does not work well in a circular setting.
- Write documentation for the new arguments.
- Improve the general appearance of the trace plot.

You can take a look at the code on my branch. You can always contact me with suggestions and comments. Have a great weekend!