Dr.Pixel (pinned) | Google’s Gemini Ultra: Is it better than ChatGPT4? | Google has recently released its largest and most advanced LLM (large language model) Gemini Ultra. Also, they are rebranding their LLM… | Feb 12
Dr.Pixel | Designing Data-intensive Applications: Chapter II | Summary of Chapter 2: Data Models and Query Languages | Sep 27
Dr.Pixel | Designing Data-intensive Applications: Chapter I | So far, my experience was centered around computer vision and machine learning fields. I started to make attempts to expand knowledge… | Sep 23
Dr.Pixel | Document Image Understanding with OpenAI’s GPT4-Vision | As I work on several document understanding projects, I wanted to test the document reading capabilities of the GPT-4-Vision model from OpenAI. | Feb 20
Dr.Pixel | PaLM2-VAdapter from Google Research | Google presents Progressively-aligned-Large-Vision-and-Language-model with higher performance, Stronger scalability and Faster training… | Feb 19
Dr.Pixel | OpenAI strikes again with an Insanely realistic video generation from text model. | OpenAI just released a text-to-video model, Sora. It can generate videos up to 1 minute long with impressive realism and relevance to text… | Feb 15
Dr.Pixel | Google announced Gemini 1.5 Pro | It comes with long-context understanding capabilities, meaning it can understand really long documents, code, and long videos. | Feb 15
Dr.Pixel | Direct Preference Optimization to align LLMs to your preferences | RLHF (Reinforcement Learning from Human Feedback) and Direct Preference Optimization (DPO) are two popular methods to finetune LLMs (Large… | Feb 15
Dr.Pixel | YOLO-World: A Real-Time Open-Vocabulary Object Detection | A summary of YOLO-World’s research paper | Feb 7