The Aya Project: My Experience and Learnings

I don’t think anyone is born knowing the reason why they’re here. It’s just something you have to find as you go along. - Tohru Honda, Fruits Basket. Well well, isn’t this something we’ve always wondered about? Can we even grasp the impact that a decision we make in the present will have in the future? Anyways, that got too philosophical, but I think that’s how this blog will be too. Why? Aya is going to ACL LFGG 🚀🚀 ...

May 20, 2024 · 12 min · 2531 words · Herumb Shandilya

Writing a Compiler in Rust #1: Lexical Analysis

Compilers are not a game of luck. If you want it to work, code hard. - Sora (No Game No Life). Another blog, another butchered quote, but what matters is that they never made a season 2 for this series and I’m max pissed about it!! Anyways, as someone with a Computer Science degree, it’s a bit shameful for me to admit that I never formally studied compilers. In my defense, it was an elective and I chose an ML elective instead of Compiler Design, though it’s something I’ve always enjoyed hearing about. ...

February 26, 2024 · 21 min · 4378 words · Herumb Shandilya

LIMA: Less is More for Alignment

I can’t even begin to explain how I felt reading this paper; the moment I finished it I shared it ASAP with everyone because it deserved it. Essentially, what LIMA wants to address is that big instruction datasets and RLHF aren’t necessary to produce high-quality output. Beyond that, it’s an investigation into how many examples are needed to align a model’s output. I’ll explain more soon, but that’s the basic idea of the paper. Let’s just dive into it. ...

May 31, 2023 · 4 min · 778 words · Herumb Shandilya

Transformers: Attention is all you need

Attention is Transformer’s breath, Multi-Head is Transformer’s release, RNNs thou wert and art, May thy model reach greater accuracy. Látom. - Enen No Shouboutai. I’m getting tired of butchering anime and video game quotes, so I’m thinking I should butcher some meme quotes next time. Anyways, I’ve been writing on a lot of topics, but something I’ve always wanted to do is explain a research paper. ...

December 10, 2021 · 26 min · 5520 words · Herumb Shandilya

PyTorch Lightning: DataModules, Callbacks, TPU, and Loggers

When I was a young man, I had liberty but I didn’t see it, I had time but I didn’t know it, and I had PyTorch Lightning but I didn’t use it. - Newbie PyTorch User. Another blog, another great video game quote butchered by these hands. Anyways, when I was getting started with PyTorch, one of the things that made me jealous was how much support TensorFlow has for monitoring model performance. I mean, I had to write a training loop with redundant steps while TensorFlow beginners were just passing and chilling. ...

June 8, 2021 · 12 min · 2441 words · Herumb Shandilya

Class Imbalance comes in Like a Lion

In a world without class imbalance we might’ve been heroes. - Neural Networks. Setting aside the fact that I butchered one of the greatest video game quotes of all time, class imbalance can be a tricky thing to handle, especially if you are a beginner. When I first encountered class imbalance I treated it like any other dataset, I know right, and on top of that I used accuracy to judge performance. Needless to say, that went quite badly, so let me help you avoid an embarrassing situation in front of your teacher or whoever you report to. ...

June 5, 2021 · 15 min · 3140 words · Herumb Shandilya
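The accuracy trap this post warns about is easy to demonstrate. A minimal sketch (illustrative toy data, not code from the post): a classifier that always predicts the majority class scores 95% accuracy on a 95/5 split while completely missing the minority class.

```python
import numpy as np

# Toy imbalanced labels: 95 samples of class 0, 5 of class 1
y_true = np.array([0] * 95 + [1] * 5)

# A "model" that always predicts the majority class
y_pred = np.zeros_like(y_true)

accuracy = (y_true == y_pred).mean()
print(accuracy)  # 0.95 -- looks great on paper

# Recall on the minority class tells the real story
recall_class_1 = (y_pred[y_true == 1] == 1).mean()
print(recall_class_1)  # 0.0 -- every minority sample is missed
```

This is why metrics like per-class recall, precision, or F1 are the usual go-to under imbalance.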

Training SVM over Custom Kernels

One thing that always intrigued me about ML is that the more you learn about it, the more you realize how little you know. One such case happened to me a few months ago, when someone asked me if I could help him with SVMs, and me being me, I was like sure, not a big deal. It was a big deal. Most of us might be familiar with training models. A few of us might be familiar with when to use which model. But when it comes to the details of these models, we might fail to utilize them. In SVM, most of us might use the default RBF, a few of us might play with other kernels to find a better model, and the chosen ones might understand the working and purpose of these kernels. But can you create a kernel of your own? ...

April 29, 2021 · 7 min · 1392 words · Herumb Shandilya
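As a teaser for the question above: scikit-learn’s `SVC` accepts any callable that maps two sample matrices to a Gram matrix, which is all a custom kernel is. A minimal sketch (assuming scikit-learn and the Iris dataset, not code from the post), using a hand-written linear kernel:

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

def linear_kernel(X, Y):
    # Custom kernel: Gram matrix of dot products between rows of X and rows of Y,
    # shape (n_samples_X, n_samples_Y)
    return X @ Y.T

X, y = datasets.load_iris(return_X_y=True)

# SVC treats any callable with this signature as the kernel function
clf = SVC(kernel=linear_kernel)
clf.fit(X, y)
print(clf.score(X, y))
```

Swapping in a different function (polynomial, string kernel, anything positive semi-definite) is what the full post builds toward.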
