We had a wonderful night at the annual ANDRO Christmas Party! 🎄 Happy Holidays from ANDRO Computational Solutions, LLC!
-
🤗 Launching today! A new short course made in collaboration with Hugging Face: Quantization Fundamentals with Hugging Face. In this course, you will learn how to compress models with the quantization technique to make them more efficient, faster, and accessible, allowing them to run on a wide variety of devices. Join in, and:
- Learn to quantize any open source model with linear quantization using the Quanto library.
- Get an overview of how linear quantization is implemented to compress any model, including LLMs and vision models.
- Apply downcasting, another form of quantization, with the Transformers library, which enables you to load models in about half their normal size in the BFloat16 data type.
Learn for free: https://hubs.la/Q02sTcJC0
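For a taste of the Quanto workflow the course covers, here is a minimal sketch, assuming the `optimum-quanto` package (the same API has also shipped as plain `quanto` in older releases); the model name is only an illustration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.quanto import quantize, freeze, qint8

# Illustrative model choice; any Hugging Face causal LM should work similarly.
model_id = "EleutherAI/pythia-410m"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Swap linear layers for quantized versions with 8-bit integer weights...
quantize(model, weights=qint8)
# ...then freeze to materialize the integer weights and drop the fp32 copies.
freeze(model)

inputs = tokenizer("Quantization makes models", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))
```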
-
Completed "Quantization Fundamentals" from deeplearning.ai This good course gives intuition on how quantization affects the outcome of the layers of LLMs and how the parameter weights outcome will be affected by pre and post quantization also memory and computing affects by quantization mainly on FP16, FP32, FP64 tensors. This can help in saving computational cost for running open source LLMs.
-
PennyLane was built by researchers, for research. 📖 We explain (through code!) the latest and most relevant results from the arXiv. With over 160 demos (and counting), let’s revisit some of our most popular research demos, starting off with the Haar measure demo👇 https://lnkd.in/g8yuN6s8
-
🌟 Excited to share some learnings with you all! 🌟 Learning the basics of data types, model downcasting, quantization, and memory usage pays off. 📊💻
I learned how data types are represented. For example, fp32 is laid out as:
- sign: 1 bit, 0 (+ve) or 1 (-ve)
- exponent: 8 bits
- mantissa (fraction): 23 bits
🛠 When representing a number in binary, we first normalize it. Types of normalization for (101.01)₂:
- Explicit: 0.10101 × 2³ (the leading 1 is kept explicitly)
- Implicit: 1.0101 × 2² (the leading 1 is implied, as in IEEE 754)
🚀 Moreover, we can downcast a model, for example from fp32 to fp16 or brain-float16 (bf16). The model then fits in the smaller data type, but there is a catch: a naively downcasted model often performs worse at inference.
🤗 Fortunately, there is a solution: quantization. A model quantized this way holds up better at inference, because the compressed weights are derived from, and can be mapped back toward, the original fp32 weights when generating responses.
🔗 Source links:
Representations of Floating Point Numbers: https://lnkd.in/dDKT_uZx
More about Datatypes: https://lnkd.in/dbT6X2KP
Thanks, DeepLearning.AI, for providing short courses that build better intuition about AI. Looking forward to hearing your thoughts in the comments! Let's chat! 🗨️💬 #generativeai #ai #quantization #deeplearningai #mlengineer
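As a quick illustration of that fp32 layout (my own Python sketch, not course code), you can pull the three fields out of a float's raw bits:

```python
import struct

def fp32_fields(x: float):
    # Reinterpret the float's 32 bits as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF      # 23 bits, leading 1 implicit
    return sign, exponent, mantissa

# 5.25 is 101.01 in binary, i.e. 1.0101 x 2^2,
# so the stored exponent is 2 + 127 = 129.
print(fp32_fields(5.25))  # (0, 129, 2621440)
```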
-
Lately I've been experimenting with running models locally with LM Studio. It's a great way to try out many models and understand their nuances, limitations, and how they compare with the cloud-based frontier models. Quantization makes all that possible, shrinking model size by 4x or more so you can run them on your laptop. In "Quantization Fundamentals," you'll learn how to quantize nearly any open source model, which is a key technique in making powerful ML models more accessible and practical. Props to Marc Sun, Younes Belkada and Eddy Shyu for putting together such an interesting course on this technical and fascinating topic!
-
🖥️ 🔬 Hackathon update 1: Machine Learning for Electron and Scanning Probe Microscopy (December 16-17, hybrid)
Dear colleagues, we are starting to build the teams for the hackathon on Machine Learning for Electron and Scanning Probe Microscopy: https://lnkd.in/eH4bpDWy
We are currently at 170 registrations, and it's high time to start building teams around the seed problems we have prepared, and to submit your own problems and data sets. To discuss these, join the hackathon Slack channel: https://lnkd.in/eRQmP4Jm
As before, please also spread the word!
-
𝐐𝐮𝐚𝐧𝐭𝐢𝐳𝐚𝐭𝐢𝐨𝐧 𝐨𝐟 𝐋𝐋𝐌 𝐦𝐨𝐝𝐞𝐥𝐬
👍🧠 Reduced memory footprint:
- 💾 More efficient GPU memory use.
- 🏋️ Makes it possible to train larger models.
- 📦 Allows bigger batch sizes.
👍⚡ Increased compute speed:
- 🚀 Faster calculations with fp16/bf16.
- 🛠️ Works best on specific hardware like Google TPU and NVIDIA A100.
👎🔍 Less precise:
- 🤏 Lower-precision formats use less memory but can cost accuracy.
https://lnkd.in/dGz3HyKb
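A small sketch of the speed side (assumes PyTorch and a CUDA GPU with bf16 support, e.g. an A100; the layer sizes are arbitrary):

```python
import torch

model = torch.nn.Linear(4096, 4096).cuda()
x = torch.randn(8, 4096, device="cuda")

# Mixed precision: matmuls run in bf16 where supported, cutting memory
# traffic and raising throughput on hardware with bf16 units.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)
print(y.dtype)  # torch.bfloat16
```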
-
Revisiting my previous post on NRK's puzzle game “Former” 🎮
After an insightful chat with Morten Blørstad at this year’s ICT PhD-gathering, I explored a new approach: Monte Carlo Tree Search (MCTS). Inspired by the paper Single-Player Monte-Carlo Tree Search (Schadd et al., 2008), I implemented a variant using their modified Upper Confidence Bounds for Trees (UCT).
🚀 The result? My implementation consistently achieves the best score of the day, in just seconds!
Here’s a 13-move solution to today’s puzzle if you want to test it yourself:
(2, 5), (0, 1), (0, 6), (3, 8), (3, 5), (4, 3), (0, 5), (6, 4), (5, 3), (5, 3), (5, 6), (6, 4), (0, 8)
The top-left corner of the board is (0, 0).
🕹️ Play the game here: 👉 https://lnkd.in/d8f5N94e
💻 Check out the code on GitHub: 👉 https://lnkd.in/dVqqmyaM
Feedback is always welcome, so let me know your thoughts! 😊
(Don’t worry, this will be my last post about Former, I promise)
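For the curious: the Schadd et al. modification adds a variance term to the standard UCT selection rule. A rough Python sketch of that formula (the constants C and D are tuning parameters; the values here are only illustrative, not the ones used in my solver):

```python
import math

def sp_uct(total, sq_total, visits, parent_visits, C=0.5, D=10_000.0):
    """Single-player UCT (Schadd et al., 2008).

    total    -- sum of playout scores through this child
    sq_total -- sum of squared playout scores (for the variance term)
    """
    mean = total / visits
    exploration = C * math.sqrt(math.log(parent_visits) / visits)
    # Third term: favors children whose results vary a lot or are rarely
    # visited; D keeps the term large at low visit counts.
    variance = math.sqrt((sq_total - visits * mean**2 + D) / visits)
    return mean + exploration + variance
```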