ChatGPT-Level Performance By Fine-Tuning LLaMa With Only 1000 Samples, Musk Chasing OpenAI, And An Overview Of Multi-Modal Models
Machine learning progress stays as rapid as a thieving magpie swooping towards a gleaming silver spoon. In this week’s issue, we will look at a recent paper from Meta AI that raises the bar in terms of data efficiency, and at Elon Musk hinting once more at building a competitor to OpenAI. Last but not least, we have a quick review of the state of multi-modal models in store for you. Let’s jump in!

DeepTalent Refactors The Job Search For Machine Learning Engineers (Sponsor)

Usually, each company has its own long interview process, and for every company, applicants need to start all over again. This is not only annoying but also wastes a lot of time and money for everyone involved. The DeepTalent platform validates your skills only once. Companies then start reaching out to you and fast-track your application. They can do that because, as a member of the exclusive network, you have already proven your skills. For companies, the service can significantly shorten the time-to-hire and reduce the cost of technical interviews by 80%. Full disclosure: I am a co-founder of DeepTalent. So, if you are looking for a new job in machine learning or are hiring engineers in the field, check out DeepTalent.io!

ChatGPT-Level Performance By Fine-Tuning LLaMa With Only 1000 Samples

In a recent paper, researchers found that fine-tuning a LLaMa model on only 1000 samples was enough to create a state-of-the-art conversational AI. Their instruction-tuned model rivals popular models such as GPT-4, ChatGPT, and Google’s Bard. What is particularly interesting about their finding is that they did not use any reinforcement learning from human feedback (RLHF), which was central to the creation of ChatGPT. In the paper, they conclude that instruction fine-tuning works so well with comparatively few samples because most of the model’s capabilities are learned during pre-training.

Why Is This Important?

Over the last few months, we have seen the cost of creating a state-of-the-art LLM drop considerably. The model used in this study has only 65B parameters. Though this is still a massive model, it has roughly a third as many parameters as the 175B-parameter GPT-3, and, if the unconfirmed size estimates for GPT-4 are anywhere near accurate, well over an order of magnitude fewer than GPT-4. Reaching state-of-the-art performance by fine-tuning on no more than 1000 samples of dialogue lowers the barrier to entry once more. Curating 1000 examples of specific dialogue is well within the realm of what is possible for, e.g., a startup without a boatload of funding. A minimal code sketch of what such a fine-tuning run can look like follows after the next story.

Elon Musk Wants to Challenge Google and Microsoft in AI

During The Wall Street Journal’s CEO Council Summit in London, Elon Musk expressed his desire to establish an AI business that can compete with industry giants Google and Microsoft. Musk hinted that this endeavor could involve various parts of his corporate empire, including Twitter, which he believes could become cash-flow positive by next month. He suggested the possibility of Twitter and Tesla partnering with an AI company, similar to the Microsoft and OpenAI collaboration. Musk’s existing AI company, X.AI, could play a role in this ambitious plan.

Why Is This Important?

On the one hand, a company such as X.AI could function as a platform that unifies the AI efforts of Tesla and Twitter, which would likely help Musk foster more innovation internally. On the other hand, it would likely create spillover effects for Twitter and help fend off the stiff competition that other car manufacturers pose to Tesla’s self-driving systems.
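Coming back to the fine-tuning result above: here is a minimal sketch of what such a supervised instruction fine-tuning run can look like with Hugging Face’s transformers and datasets libraries. The model checkpoint, data file, prompt template, and hyperparameters are illustrative assumptions, not the authors’ exact setup (the paper fine-tuned a 65B model on its own curated data; the epoch count and learning rate below only loosely follow the reported values).

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder checkpoint: the paper used a 65B LLaMa, which needs a multi-GPU setup.
MODEL_NAME = "huggyllama/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # LLaMa ships without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

# Assumed data format: ~1000 JSON-lines records like {"prompt": ..., "response": ...}.
dataset = load_dataset("json", data_files="instructions.jsonl", split="train")

def format_and_tokenize(example):
    # Concatenate prompt and response into one training sequence (template is assumed).
    text = (
        f"### Instruction:\n{example['prompt']}\n\n"
        f"### Response:\n{example['response']}{tokenizer.eos_token}"
    )
    return tokenizer(text, truncation=True, max_length=2048)

tokenized = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-instruct",
        num_train_epochs=15,              # many epochs over tiny data
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=1e-5,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Causal-LM collator: copies input ids to labels; the model shifts them internally.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Note what is absent here: no reward model and no RLHF stage anywhere in the pipeline. It is a plain cross-entropy fine-tune on roughly 1000 curated examples, which is exactly the paper’s point.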
An Overview Of The State Of Multi-Modal Models

Multimodal models, capable of processing various types of input data, have made significant progress in recent years. Meta AI’s ImageBind is one such model: it embeds six modalities (images, text, audio, depth, thermal, and IMU data) into a joint embedding space. It uses a CLIP-like contrastive approach to train encoders for each modality. ImageBind demonstrates strong performance in tasks such as few-shot classification, object detection, and embedding space arithmetic. These abilities extend across modalities: because everything lives in the same embedding space, the model can, for example, retrieve sounds that match an image or combine an image and a text prompt into a single cross-modal query. A toy sketch of the contrastive objective behind this follows below.

Why Is This Important?

Multimodal models have the potential to revolutionize AI systems, enabling new tasks and impacting our understanding of the world. Development in this area has already led to impressive results. Further, the ability to operate across different modalities has the potential to blow some of the current use cases out of the water. A simple example is OCR-free document processing, which would let us extract information from scanned documents without an additional, complex OCR pipeline.
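To make the “CLIP-like contrastive approach” above concrete, here is a toy sketch of the symmetric InfoNCE objective that pulls matching pairs from two modalities together in a shared embedding space. The stand-in MLP encoders, feature dimensions, and temperature are illustrative assumptions; ImageBind’s actual encoders are modality-specific transformers.

```python
import torch
import torch.nn.functional as F
from torch import nn

class Encoder(nn.Module):
    """Stand-in encoder mapping pre-pooled features into the joint embedding space."""
    def __init__(self, in_dim: int, embed_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(), nn.Linear(1024, embed_dim)
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

image_enc = Encoder(in_dim=768)   # e.g. pooled ViT features (assumed shape)
audio_enc = Encoder(in_dim=128)   # e.g. pooled spectrogram features (assumed shape)
temperature = 0.07                # common CLIP-style starting value

def info_nce(img_emb: torch.Tensor, aud_emb: torch.Tensor) -> torch.Tensor:
    # Cosine-similarity logits between every image and every audio clip in the batch.
    logits = img_emb @ aud_emb.t() / temperature
    targets = torch.arange(len(img_emb))  # matching pairs sit on the diagonal
    # Symmetric cross-entropy: image-to-audio and audio-to-image directions.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# One training step on a batch of 32 paired (image, audio) examples.
images, audio = torch.randn(32, 768), torch.randn(32, 128)
loss = info_nce(image_enc(images), audio_enc(audio))
loss.backward()

# Since all modalities share one space, embeddings can be combined arithmetically,
# e.g. adding an image and an audio embedding to form a single retrieval query:
query = F.normalize(image_enc(images[:1]) + audio_enc(audio[:1]), dim=-1)
```

ImageBind trains one such pairing per modality, always against images. Because every encoder targets the same space, modalities that are never observed together, such as audio and text, end up aligned as well, which is what enables the embedding space arithmetic shown in the last line.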
Thank you for reading! As always, I really enjoyed making this for you and sincerely hope you found it useful! See you next week!

If you are not subscribed yet: At The Decoding ⭕, I send out a thoughtful 5-minute email every week that keeps you in the loop about machine learning research and the data economy. Click here to subscribe!