Sarina Li
Adam, an optimizer originally built for neural networks, has been observed to have issues when optimizing vector-valued (geometric) quantities, since its per-coordinate updates are not rotation-equivariant. As a result, a variant, VectorAdam, was introduced.
Now, let $a$ and $b$ be arbitrary points on a sphere $S$. The goal is to use a variant of VectorAdam so that any point $n$ on the sphere can be optimized while staying on the sphere.
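One way to formalize this (my phrasing; $f$ stands for whatever loss we are minimizing, e.g. the distance to a target point):

$$\min_{n \in \mathbb{R}^3} f(n) \quad \text{subject to} \quad \|n\| = r,$$

where $r$ is the radius of $S$.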
See the original VectorAdam paper by my very smart, beautiful and awesome mentor Selena :^)
I began this exploratory project on May 22nd, 2024. I set up the repository and code environment, and did a couple of runs of the initial VectorAdam demo code to understand what it was trying to do.
On this day, I also played around with the code and created a simulation that uses VectorAdam to minimize the distance between two points (a sketch of the setup follows the recording below).
[Video: Screen Recording 2024-05-26 at 16.00.35.mp4]
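For reference, here is a minimal sketch of that simulation. I use torch.optim.Adam as a stand-in, and the target point, learning rate, and step count are placeholder values rather than the demo's exact setup; in the actual run, the VectorAdam optimizer from the paper's code is dropped in with the same step()/zero_grad() interface.

import torch

target = torch.tensor([0.0, 0.0, 1.0])  # fixed point on the unit sphere
point = torch.tensor([1.0, 0.0, 0.0], requires_grad=True)  # point being optimized

optimizer = torch.optim.Adam([point], lr=1e-2)  # stand-in for VectorAdam

for _ in range(500):
    optimizer.zero_grad()
    loss = (point - target).norm()  # Euclidean distance between the two points
    loss.backward()
    optimizer.step()  # note: nothing here keeps `point` on the sphere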
I noticed that the point would shoot out of the sphere. This is expected: gradient descent simply "descends" the gradient toward the minimum, and nothing in the update constrains the point to the surface. However, we don't want it to shoot out of the sphere.
This week, I worked on a naïve solution to the optimization problem. Basically, I forcefully normalized the point that VectorAdam would spit out so that it lands back on the sphere:
import torch

def normalize_tensor(tensor: torch.Tensor, radius: float) -> torch.Tensor:
    # Project the point back onto the sphere: rescale it so that its
    # magnitude equals the sphere's radius.
    magnitude = tensor.norm()
    normalization_constant = radius / magnitude
    normalized_tensor = normalization_constant * tensor
    return normalized_tensor
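A sketch of how the projection slots into the optimization loop from earlier (the in-place update under torch.no_grad() is my assumption about how it's applied; any equivalent reassignment works):

for _ in range(500):
    optimizer.zero_grad()
    loss = (point - target).norm()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        # snap the point back onto the sphere after every step
        point.copy_(normalize_tensor(point, radius=1.0))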
Using this code, I could get the point to stay on the sphere. However, I noticed visually that once the point reached a certain region, it would start oscillating between two points on the sphere instead of settling. With the regular, unnormalized point, the loss slowly converges. So the next issue is figuring out the root cause of the oscillation.