Sarina Li
Adam, an optimizer originally built for neural networks, has been observed to have issues when optimizing vector-valued (geometric) quantities, since its per-coordinate updates are not rotation-equivariant. As a result, a variant, VectorAdam, was introduced.
Now, let $a$ and $b$ be arbitrary points on a sphere $S$. The goal is to use a variant of VectorAdam so that any point $n$ on the sphere can be optimized while staying on the sphere.
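One way to formalize this (my phrasing; $f$ stands for whatever loss we are minimizing, e.g. the distance to a target point):

$$\min_{n \in \mathbb{R}^3} f(n) \quad \text{subject to} \quad \|n\| = r,$$

where $r$ is the radius of $S$.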
See the original VectorAdam paper by my very smart, beautiful and awesome mentor Selena :^)
I began this exploratory project on May 22nd, 2024. I set up the repository and code environment, and did a couple of runs of the initial VectorAdam demo code to understand what it was trying to do.
On this day, I also played around with the code and created a simulation that uses VectorAdam to minimize the distance between two points (a sketch of the setup follows the recording below).
[Video: Screen Recording 2024-05-26 at 16.00.35.mp4]
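For reference, here is a minimal sketch of that simulation. I use torch.optim.Adam as a stand-in, and the target point, learning rate, and step count are placeholder values rather than the demo's exact setup; in the actual run, the VectorAdam optimizer from the paper's code is dropped in with the same step()/zero_grad() interface.

import torch

target = torch.tensor([0.0, 0.0, 1.0])  # fixed point on the unit sphere
point = torch.tensor([1.0, 0.0, 0.0], requires_grad=True)  # point being optimized

optimizer = torch.optim.Adam([point], lr=1e-2)  # stand-in for VectorAdam

for _ in range(500):
    optimizer.zero_grad()
    loss = (point - target).norm()  # Euclidean distance between the two points
    loss.backward()
    optimizer.step()  # note: nothing here keeps `point` on the sphere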
I noticed that the point would shoot out of the sphere. This is expected: gradient descent simply "descends" the gradient toward the minimum, and nothing in the update constrains the point to the surface. However, we don't want it to shoot out of the sphere.
This week, I worked on a naïve solution to the optimization problem. Basically, I forcefully normalized the point that VectorAdam would spit out so that it lands back on the sphere:
import torch

def normalize_tensor(tensor: torch.Tensor, radius: float) -> torch.Tensor:
    # Project the point back onto the sphere: rescale it so that its
    # magnitude equals the sphere's radius.
    magnitude = tensor.norm()
    normalization_constant = radius / magnitude
    normalized_tensor = normalization_constant * tensor
    return normalized_tensor
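A sketch of how the projection slots into the optimization loop from earlier (the in-place update under torch.no_grad() is my assumption about how it's applied; any equivalent reassignment works):

for _ in range(500):
    optimizer.zero_grad()
    loss = (point - target).norm()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        # snap the point back onto the sphere after every step
        point.copy_(normalize_tensor(point, radius=1.0))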
Using this code, I could get the point to stay on the sphere. However, I noticed visually that once the point reached a certain region, it would start oscillating between two points on the sphere instead of settling. With the regular, unnormalized point, the loss slowly converges. So the next issue is figuring out the root cause of the oscillation.