I'm a 21-year-old developer from India with a passion for AI/ML, product development, and open source. My expertise spans machine learning, natural language processing, and building AI solutions. I'm constantly exploring new technologies and methodologies in the rapidly evolving field of AI, with a keen interest in developing products that solve real-world problems. Beyond AI, I enjoy reverse engineering, web scraping, scripting, and gaming.
• Content Moderation Leadership: Led the development and deployment of advanced moderation solutions, utilizing vision models to create and maintain a robust Moderation API. This initiative proactively addressed harmful content across platforms, ensuring user safety and responsible AI application.
• Large Language Model (LLM) Optimization & Fine-tuning: Specialized in fine-tuning and optimizing LLMs for diverse applications, including creating engaging role-play scenarios and developing effective question-answering systems. Work focused on significantly improving model performance and accuracy for specific use cases relevant to content generation and user interaction.
• Reinforcement Learning for Model Alignment: Implemented Reinforcement Learning techniques, including Direct Preference Optimization (DPO), to refine LLM outputs and align them with user preferences. This resulted in enhanced output quality, improved user satisfaction, and AI-driven experiences more closely aligned with user expectations.
• Generative AI & Diffusion Models: Pioneered the application of diffusion models, particularly Stable Diffusion, for custom image generation tasks. This involved fine-tuning models, creating specialized image generation solutions, and curating datasets to support diverse image generation needs, expanding the company's generative AI capabilities.
• Automation & Infrastructure Development: Developed and maintained critical automation scripts to streamline AI development workflows. This included automation for LLM hosting, comprehensive benchmarking processes, and synthetic data generation, significantly improving team efficiency and development lifecycle management.
• Text-to-Speech (TTS) Innovation: Designed and implemented a Text-to-Speech (TTS) system for dialog applications, incorporating voice cloning technology and leveraging FastAPI for efficient deployment and scalability.
• Worked with a globally distributed team of 15 developers to build ReVanced, an open-source Android modification framework empowering users to customize apps to suit their needs.
• I helped spearhead the overall architecture, design, and roadmap, devising an adaptable core that enables extensive customization of Dalvik-based apps such as YouTube. I also contributed extensive code across the codebase.
• I led development of the ReVanced Manager app installed by over 170K users. The app streamlined installations and made core functionality easily accessible.
• Initially adopted by a few hundred users, ReVanced gathered incredible traction on platforms like Reddit, Discord and Telegram from folks who really liked what we built.
• As installs grew exponentially to 170,000+, I took on expanded duties - overseeing roadmap priorities, debugging complex issues, liaising with user communities, and ensuring stability through rigorous testing.
• The project gave me hands-on experience of how coordinated remote teams can build delightful products loved by users globally. Debugging performance problems taught me how to approach issues methodically. Ultimately, the ability to solve real user problems at scale was extremely fulfilling.
Kepler Systems aims to contribute to fundamental AI research and development across various domains. Current projects within Kepler Systems include:
Poetry-Llama: State-of-the-Art Urdu Poetry Model: Our most recent project is Poetry-Llama, a cutting-edge LLM specifically fine-tuned for understanding and generating Urdu poetry. This 70-billion-parameter model is trained on a diverse corpus of Urdu poetry and is openly available under the Llama license on Hugging Face.
Open Datasets for Urdu Poetry Research: Recognizing the importance of quality data in AI research, Kepler Systems curates and shares valuable datasets:
UrduShers-10k: A meticulously curated collection of 10,000 classical Urdu poetry couplets (shers) from diverse poets and eras, ensuring high quality and cultural representation. This dataset is licensed under CC BY-SA 4.0 for open use in research and creative projects.
UrduGhazals-25k: A comprehensive dataset featuring 25,000 complete Urdu ghazals, encompassing a wide range of poets, eras, and dialects. This resource is also released under the CC BY-SA 4.0 license to promote open access and collaboration.
The ReVanced Manager is an Android application that allows you to modify any Dalvik Android application to add, remove, and/or modify existing functionality. It disassembles the APK locally on your device, makes the required changes using our in-house patcher library, and then reassembles it into an APK. You can find it on GitHub.
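To give a rough feel for the unpack, patch, and repack flow described above, here is a minimal Python sketch. It is not the ReVanced patcher and does not use its API: it relies only on the fact that an APK is a ZIP-based archive, and it treats a "patch" as a plain function over an entry's raw bytes. The names `Patch`, `repack_with_patches`, and `build_demo_archive` are hypothetical, and the demo archive contains placeholder bytes rather than real DEX or manifest data. Real patching would also involve disassembling Dalvik bytecode and re-signing the result, both of which are omitted here.

```python
import io
import zipfile
from typing import Callable, Dict

# Hypothetical patch type: takes an entry's raw bytes, returns modified bytes.
Patch = Callable[[bytes], bytes]


def repack_with_patches(archive_bytes: bytes, patches: Dict[str, Patch]) -> bytes:
    """Unpack a ZIP-based archive in memory, patch selected entries, and repack it.

    Entries without a matching patch are copied through unchanged.
    """
    out_buffer = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as src, \
         zipfile.ZipFile(out_buffer, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            patch = patches.get(info.filename)
            if patch is not None:
                data = patch(data)  # apply the change to this entry only
            dst.writestr(info.filename, data)
    return out_buffer.getvalue()


def build_demo_archive() -> bytes:
    """Build a tiny stand-in archive with placeholder contents (not a real APK)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("classes.dex", b"feature_enabled=false")
        zf.writestr("AndroidManifest.xml", b"<manifest/>")
    return buf.getvalue()


if __name__ == "__main__":
    original = build_demo_archive()
    patched = repack_with_patches(
        original,
        {"classes.dex": lambda raw: raw.replace(b"false", b"true")},
    )
    with zipfile.ZipFile(io.BytesIO(patched)) as zf:
        print(zf.read("classes.dex"))
```

Keeping each patch as a small function keyed by entry name means every other entry passes through untouched, which loosely mirrors the idea of applying targeted changes to an otherwise unmodified app.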