Roofline News

Updates and insights on how Roofline enables easy edge AI deployment
March 25, 2025

llama.cpp is often the go-to solution for running LLMs on edge devices, leveraging handwritten kernels for optimized execution. But while this approach delivers speed, it lacks flexibility when supporting new models and quantization techniques.

For instance, highly relevant models like Apple's OpenELM took months to be integrated into llama.cpp. Similarly, the GitHub issue to fix support for DeepSeek AI's multimodal janus-pro-1b has been open for nearly two months, highlighting the challenges of adapting to emerging architectures.

That’s why roofline is building an MLIR-based edge AI compiler with a model-agnostic LLM pipeline. Our approach enables day-0 deployment for DeepSeek’s janus-pro-1b and ensures rapid support for the latest quantization methods. Below, we showcase how our solution delivers ~8× performance gains over the native PyTorch compiler (TorchInductor) — and also why orange might just be the perfect company color!
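
For context, TorchInductor is simply PyTorch's built-in torch.compile backend, so the baseline side of this comparison can be sketched with public APIs. The following is a minimal illustration, not our benchmark setup; the model id is a placeholder, since Janus-Pro's multimodal loading needs its own processor code:

```python
# Minimal sketch of a TorchInductor (native PyTorch compiler) baseline.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-1b-llm"  # placeholder; substitute the checkpoint under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

# backend="inductor" selects TorchInductor, the default torch.compile backend.
model = torch.compile(model, backend="inductor")

inputs = tokenizer("Edge AI compilers", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```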

Curious about how generic and scalable our approach is? Let’s chat! And stay tuned: janus-pro-1b also supports image-to-text and text-to-image capabilities.

March 3, 2025

Excited to be attending HiPEAC and the CODAI Workshop 2025 in Barcelona this week!

Today, I will present roofline's AI compiler technology, showcasing how it makes edge AI deployment more flexible and easier. If you want to learn more or get a demo, let me know.

Roofline is also hiring! If you have a background in compilers, computer architecture, and C++ and are looking for opportunities in Germany, let’s connect during the event.

February 27, 2025

The first major proof point is here, and roofline is ready. DeepSeek R1 marks a significant step toward enabling capable LLMs on constrained edge devices.

We ran DeepSeek R1 Distill Qwen 1.5B through our pipeline, and it worked out of the box, thanks to the flexibility of our MLIR-based compiler. The performance of our CPU compiler for the edge is shown below: this sneak peek into our early LLM pipeline already matches llama.cpp in tokens/s, yet, unlike llama.cpp, it is model agnostic. Without any model-specific optimization, we achieved 4x memory savings over TorchInductor, highlighting our flexible and efficient edge AI deployment solution.
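
For readers who want to reproduce this kind of measurement, a rough harness built on the public transformers API might look like the sketch below. This is illustrative only (the eager-mode PyTorch run is just a reference point, and peak RSS is a coarse proxy for memory use); it is not our benchmark code:

```python
# Illustrative tokens/s and peak-memory harness; Linux-only (resource module).
import time
import resource

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

prompt = tokenizer("Explain edge AI in one sentence.", return_tensors="pt")
start = time.perf_counter()
with torch.no_grad():
    output = model.generate(**prompt, max_new_tokens=64, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = output.shape[1] - prompt["input_ids"].shape[1]
peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # KiB on Linux
print(f"{new_tokens / elapsed:.1f} tokens/s, peak RSS ~ {peak_kib / 1024:.0f} MiB")
```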

February 26, 2025

I am heading to Austin this week to discuss the future of edge AI and present roofline's work in a lightning talk on Thursday. My talk will focus on how we enable day-0 support for novel AI models through our flexible compiler approach.

Feb 27, 2:00 PM | Edge AI Stage

Can't make it or want to see a demo? Feel free to reach out!

January 23, 2025

At the recent oneAPI DevSummit, Intel Liftoff member Roofline AI showcased their innovative approach to AI and HPC performance, addressing the need for adaptable AI compilers that seamlessly integrate across devices. By leveraging the open-source Multi-Level Intermediate Representation (MLIR) and Intel's Level Zero API, they’ve made significant strides in hardware-agnostic AI processing, offering a streamlined solution for top performance across multiple platforms.

Want to learn more about how Roofline AI is shaping the future of scalable AI applications? Read the full blog! https://intel.ly/4gpDJ9S

December 8, 2024

Roofline is featured in the latest issue of the HiPEAC magazine. Contact us for more insights into our retargetable software development kit for edge AI hardware accelerators.

November 12, 2024

Our CEO, Jan Moritz Joseph, will hold a lightning talk today at the oneAPI DevSummit, hosted by the UXL Foundation. He will present how roofline's MLIR toolchain can be used together with Level Zero to build an end-to-end compiler toolchain for edge AI.

Where? Join online, or reach out to us directly if you cannot attend.

When? Today, from 17:00h

Interested?
