Google’s Gemini Ultra: Is it better than ChatGPT4?
Google has recently released its largest and most advanced LLM (large language model), Gemini Ultra. Also, they are rebranding their LLM…
Feb 12, 2024
Designing Data-intensive Applications: Chapter II
Summary of Chapter 2: Data Models and Query Languages
Sep 27, 2024
Designing Data-intensive Applications: Chapter I
So far, my experience has centered on computer vision and machine learning. I started to make attempts to expand knowledge…
Sep 23, 2024
Document Image Understanding with OpenAI’s GPT4-Vision
As I work on several document understanding projects, I wanted to test the document reading capabilities of the GPT-4-Vision model from OpenAI.
Feb 20, 2024
PaLM2-VAdapter from Google Research
Google presents a progressively aligned large vision-and-language model with higher performance, stronger scalability, and faster training…
Feb 19, 2024
OpenAI strikes again with an insanely realistic text-to-video generation model
OpenAI just released a text-to-video model, Sora. It can generate videos up to 1 minute long with impressive realism and relevance to text…
Feb 15, 2024
Google announced Gemini 1.5 Pro
It comes with long-context understanding capabilities, meaning it can understand very long documents, code, and long videos.
Feb 15, 2024
Direct Preference Optimization to align LLMs to your preferences
RLHF (Reinforcement Learning from Human Feedback) and Direct Preference Optimization (DPO) are two popular methods to fine-tune LLMs (Large…
Feb 15, 2024
YOLO-World: Real-Time Open-Vocabulary Object Detection
A summary of YOLO-World’s research paper
Feb 7, 2024