How to get started with LLMs as a product manager
Photo by JJ Ying on Unsplash

LLMs (large language models) like GPT-4 and Databricks Dolly are taking the world by storm. For product managers, this is an exciting time because it feels like an opportunity to define the next generation of product experiences. It has been said many times that this feels like an "iPhone moment". It is no surprise that product managers often ask me: how do I work on LLMs? In this blog post, I will discuss my answer to this question.

Before we talk about what to do, let's talk about a basic fact: LLMs and their surrounding ecosystem are changing incredibly fast. At this point, it's not clear to me who the winners in this race will be, or whether today's exciting technologies will still be around in a few months and in what shape. I suspect even people much smarter than me are quietly watching the ecosystem evolve. Will rapidly expanding context windows make vector databases less valuable? Perhaps, but it's too early to tell. Prompt engineering seems to work and produce great results, but will it be just as useful 6 months from now with newer models? Again, no one really knows.

So, here is the philosophy that guides the rest of this blog post: when technology is changing really fast, product managers should focus on its fundamentals and invariants and learn them really well. By fundamentals, I mean the foundations on which everything is built, and by invariants I mean the aspects that will not change 6 months or even years into the future.

Note: there's a great argument to be made that you should learn about vector databases, LangChain, etc. Build something, and have fun doing it. Just don’t skip leg day.

Do your coding on Databricks

There are many places to write code for deep learning, with new startups popping up every day. Personally, I strongly recommend using Databricks to simplify your life. You can use notebooks to explore and visualize data, and mix Python and SQL. MLflow is great for tracking experiments, including for LLMs. I recommend spinning up a single-node cluster for exploration, and switching to GPUs when training and doing inference with larger models. I also love the fact that I can use Visual Studio Code to write code locally and run it on Databricks-managed compute with a single command.

If you don’t have a Databricks account, you can try it for free.

Learn the math behind deep learning

Deep learning is the technology behind LLMs and a host of other models, including the diffusion models that power Stable Diffusion, Midjourney and DALL·E 2. Here's the interesting thing - the math behind deep learning is surprisingly elementary! You may even have learned it in high school or in a first-year university course. I am, of course, referring to calculus (differentiation) and matrix arithmetic. If this sounds intimidating, don't worry - you can do calculus if you can draw a line and you can do matrix math if you can add and multiply numbers.

This basic starter course at Khan Academy contains most of what you need. If you find it too easy, you can take the free and open Linear Algebra course at MIT. Focus on learning these key concepts:

  • Matrix addition, multiplication and inversion
  • Differentiation and the chain rule
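
To make these two ideas concrete, here is a minimal sketch in plain Python (no libraries). It multiplies two small matrices by hand and checks the chain rule on a made-up function, y = (3x + 1)^2. The function names and the example numbers are illustrative assumptions, not taken from any course mentioned here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)  # [[19, 22], [43, 50]]

# Chain rule: for y = f(g(x)), dy/dx = f'(g(x)) * g'(x).
# Illustrative choice: g(x) = 3x + 1 and f(u) = u ** 2.
def g(x):
    return 3 * x + 1

def f(u):
    return u ** 2

def dy_dx(x):
    # f'(u) = 2u evaluated at u = g(x), times g'(x) = 3
    return 2 * g(x) * 3

def numeric_derivative(fn, x, h=1e-6):
    """Central finite difference, used only to sanity-check the analytic result."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

x = 2.0
analytic = dy_dx(x)                                  # 2 * 7 * 3 = 42.0
numeric = numeric_derivative(lambda t: f(g(t)), x)   # approximately 42

print(C, analytic, numeric)
```

The finite-difference check is worth remembering: comparing a hand-derived gradient against a numerical one is a common way to catch mistakes.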

Will you be able to read research papers with this knowledge? Not so fast! But you are now armed and dangerous, and can follow along with much of the basics of deep learning.

Train your own deep neural network in PyTorch

Deep learning is implemented with neural networks. These are not a new concept; they have been around for decades. What's new is the sheer amount of data and computational power we can throw at them! As a first step, I recommend watching Andrej Karpathy code a neural network. I found it really useful to watch Andrej use simple Python to implement back-propagation, the key "trick" behind deep learning.

I recommend watching all of Karpathy's videos in this series, including the video where he implements a GPT using the transformer architecture.
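
For a taste of what back-propagation looks like, here is a minimal sketch in plain Python. It is not Karpathy's code. It is a hypothetical two-parameter-per-layer network (one tanh hidden unit feeding one linear output) with made-up starting weights, where the gradients are computed by applying the chain rule backwards from the loss and then checked against finite differences.

```python
import math

def forward(params, x, y):
    """x -> tanh hidden unit -> linear output -> squared error."""
    w1, b1, w2, b2 = params
    z = w1 * x + b1
    h = math.tanh(z)
    pred = w2 * h + b2
    loss = (pred - y) ** 2
    return z, h, pred, loss

def backward(params, x, y):
    """Back-propagation: walk from the loss back to each parameter."""
    w1, b1, w2, b2 = params
    z, h, pred, loss = forward(params, x, y)
    dloss_dpred = 2 * (pred - y)            # loss = (pred - y) ** 2
    grad_w2 = dloss_dpred * h               # pred = w2 * h + b2
    grad_b2 = dloss_dpred
    dloss_dh = dloss_dpred * w2
    dloss_dz = dloss_dh * (1 - h ** 2)      # derivative of tanh(z) is 1 - tanh(z)^2
    grad_w1 = dloss_dz * x                  # z = w1 * x + b1
    grad_b1 = dloss_dz
    return [grad_w1, grad_b1, grad_w2, grad_b2]

def numeric_gradients(params, x, y, eps=1e-6):
    """Finite-difference gradients, used only to verify the analytic ones."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((forward(up, x, y)[3] - forward(down, x, y)[3]) / (2 * eps))
    return grads

params = [0.5, -0.1, 1.5, 0.2]   # illustrative starting weights: w1, b1, w2, b2
x, y = 1.0, 2.0                  # a single made-up training example

analytic = backward(params, x, y)
numeric = numeric_gradients(params, x, y)

# One step of gradient descent: nudge each parameter against its gradient.
learning_rate = 0.1
new_params = [p - learning_rate * g for p, g in zip(params, analytic)]
loss_before = forward(params, x, y)[3]
loss_after = forward(new_params, x, y)[3]

print(analytic, numeric, loss_before, loss_after)
```

Real networks do exactly this, only with matrices instead of single numbers and with a library computing the backward pass for you.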

Once you understand back-propagation and have implemented a basic neural network, it is time to learn how deep learning is implemented in the real world. The vast majority of research and production deep learning today is done in PyTorch, the world's most popular deep learning library. PyTorch's home page looks intimidating, but the basic concepts behind it are simple. PyTorch offers a powerful Python API for manipulating "tensors", which are multidimensional arrays of numbers. PyTorch makes it easy to build deep neural networks, perform automatic differentiation on them and accelerate computations with GPUs.
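
The three ideas in that last sentence (tensors, automatic differentiation, GPU acceleration) fit in a few lines. The following is a small sketch that assumes PyTorch is installed; the numbers are arbitrary and chosen only so the result is easy to check by hand.

```python
import torch

# A tensor is a multidimensional array of numbers; this one has shape (2, 3).
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# Parameters we want gradients for (requires_grad=True turns on tracking).
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)

# Forward pass: a matrix-vector product followed by a scalar loss.
pred = x @ w                 # shape (2,), values [4.5, 9.0]
loss = (pred ** 2).sum()     # 4.5^2 + 9.0^2 = 101.25

# Automatic differentiation: PyTorch runs back-propagation for us.
loss.backward()
print(w.grad)                # gradient of loss with respect to w: [81., 108., 135.]

# GPU acceleration: move data to a GPU if one is available, else stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
x_on_device = x.to(device)
```

Note that no derivative was written by hand here: `loss.backward()` fills in `w.grad` from the recorded forward computation.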

The best way to learn PyTorch is to take the Deep Learning Fundamentals course on lightning.ai by Dr. Sebastian Raschka. Fire up a Databricks notebook as you follow the course, and you will soon learn the PyTorch API and how to build your own deep neural networks for regression, classification and more. I also found the LLM reading list by Dr. Raschka to be really useful, especially the part about transformer architecture.

Fine-tune an open model

Training a foundation model is very expensive, and is likely to remain that way for a while. Chances are your business will either use an off-the-shelf model directly or fine-tune one with your own data. Product managers should learn about both data generation (how to create task-specific data) and how fine-tuning works.

Read this HuggingFace blog post about parameter-efficient fine-tuning. The key takeaway is that you can use lighter compute to tune only a small set of additional weights. This is both cheaper and more efficient, and the original LLM weights are not forgotten. This YouTube video is also a great explanation of the process. Of course, fine-tuning is easy to do on Databricks.

Once the model is tuned on your data, you will want to set up a monitoring mechanism for it. LLMs cannot really use normal ML evaluation and monitoring tools, because there is no objective metric to monitor. So another exciting space going forward is how to curate a test dataset that is representative of what your customers use the LLM for. I recommend watching this webinar on LLMOps.
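
To see why tuning only additional weights is so much cheaper, here is a back-of-the-envelope sketch in plain Python. It assumes a low-rank adapter in the style of LoRA, which is one common parameter-efficient method and not necessarily the one the linked post focuses on. The layer size and rank are hypothetical numbers picked for illustration.

```python
def full_finetune_params(d_in, d_out):
    """Weights updated when fine-tuning one dense layer in full."""
    return d_in * d_out

def low_rank_adapter_params(d_in, d_out, rank):
    """Weights in a low-rank adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

d_in = d_out = 4096   # hypothetical layer size
rank = 8              # hypothetical adapter rank

full = full_finetune_params(d_in, d_out)               # 16,777,216
adapter = low_rank_adapter_params(d_in, d_out, rank)   # 65,536
fraction = adapter / full                              # 1/256, about 0.4%

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Toy version of the forward pass: the base weights W stay frozen,
# and only the small matrices A and B would be trained.
W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weights (2 x 2)
A = [[0.1, -0.2]]              # trainable, shape (1 x 2), i.e. rank 1
B = [[0.0], [0.0]]             # trainable, shape (2 x 1), starts at zero

def adapted_forward(x):
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))   # the low-rank correction B(Ax)
    return [b + u for b, u in zip(base, update)]

x = [1.0, 1.0]
print(full, adapter, fraction)
print(adapted_forward(x), matvec(W, x))
```

Because B starts at zero, the adapted layer initially behaves exactly like the original one, and W itself is never modified. That is the sense in which the original weights are "not forgotten".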

Follow quality newsletters and websites

I am pretty picky about what news I choose to follow about LLMs. After all, there is an amazing amount of hype, snake oil and nonsense about AI these days. I recommend regularly reading the following sources:

  • Ahead of AI by Dr. Sebastian Raschka. I recommend becoming a paid subscriber, as paid content includes Dr. Raschka diving into code.
  • Papers with Code is pretty much what it says. It is a great way of staying on top of groundbreaking papers, with code you can hopefully follow along with.
  • Simon Willison's blog. Simon doesn't focus on theory, but he does provide an incredible hands-on resource for working with LLMs.
  • Databricks blog. Keep an eye on this blog for upcoming announcements on AI.

Help democratize AI at Databricks!

Lastly, I would be remiss if I didn't mention that probably the best way to learn about LLMs is to be a product manager (or engineer!) at Databricks. We are excited about how AI will shape the future, and are looking for product managers who will be part of that journey. You can check out our Careers page, but I am also happy to chat to product managers directly (email me at bilal dot aslam at databricks dot com). We have exciting roles in San Francisco, Seattle and Amsterdam.

PS: I am thinking of writing a follow-up blog post series showing how to do deep learning on Databricks (from model training to fine-tuning). If you are interested, please leave a comment!
