A comprehensive guide to running Llama 2 locally
2023-07-22 · via Replicate's blog

We’ve been talking a lot about how to run and fine-tune Llama 2 on Replicate. But you can also run Llama 2 locally on your M1/M2 Mac, on Windows, on Linux, or even on your phone. The cool thing about running Llama 2 locally is that, once the model is downloaded, you don’t even need an internet connection.

Here’s an example using a locally-running Llama 2 to whip up a website about why llamas are cool:

It’s only been a couple days since Llama 2 was released, but there are already a handful of techniques for running it locally. In this blog post we’ll cover three open-source tools you can use to run Llama 2 on your own devices:

  • Llama.cpp (Mac/Windows/Linux)
  • Ollama (Mac)
  • MLC LLM (iOS/Android)

Llama.cpp (Mac/Windows/Linux)

Llama.cpp is a port of Llama in C/C++ that makes it possible to run Llama 2 locally on Macs using 4-bit integer quantization. It also supports Linux and Windows.
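
To see why 4-bit quantization matters, a rough back-of-the-envelope calculation helps. The sketch below estimates the size of the weights alone from the parameter count and the bits per weight. The helper name `approx_weights_mb` is made up for illustration, and the figures ignore quantization metadata and runtime memory, so real model files and memory use will be somewhat larger.

```shell
#!/bin/sh
# Rough size of the model weights in megabytes (decimal), ignoring metadata.
# $1 = parameters in billions, $2 = bits per weight.
# approx_weights_mb is a hypothetical helper, not part of llama.cpp.
approx_weights_mb() {
    # billions * 1e9 params * bits / 8 bits-per-byte / 1e6 bytes-per-MB
    echo $(( $1 * $2 * 1000 / 8 ))
}

approx_weights_mb 7 16   # 7B at 16 bits per weight: prints 14000
approx_weights_mb 7 4    # 7B at 4 bits per weight:  prints 3500
approx_weights_mb 13 4   # 13B at 4 bits per weight: prints 6500
```

By this estimate, a 7B model drops from roughly 14 GB to roughly 3.5 GB of weights, which is what brings it within reach of a laptop.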

Here’s a one-liner you can use to install it on your M1/M2 Mac:

Here’s what that one-liner does:
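
The script itself is not reproduced in this copy of the post. As a rough sketch of the steps such an install script might perform (not the original script), assuming the upstream llama.cpp repository and its Makefile build as of mid-2023: clone the source, build with Metal enabled, download a quantized model, and start a chat. `MODEL_URL` and `MODEL_FILE` are placeholders you would supply yourself, and `run` and `install_llama_cpp` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only. Assumes the llama.cpp repo URL, `make` build, and `main`
# binary as they existed in mid-2023; newer releases may differ.
# MODEL_URL and MODEL_FILE are placeholders, not values from the post.

# Print each command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_llama_cpp() {
    : "${MODEL_URL:?set MODEL_URL to a quantized Llama 2 model file}"
    MODEL_FILE="${MODEL_FILE:-model.bin}"

    # 1. Fetch the source.
    run git clone https://github.com/ggerganov/llama.cpp.git || return 1
    run cd llama.cpp || return 1

    # 2. Build with Metal so computation can run on the Apple GPU.
    run env LLAMA_METAL=1 make || return 1

    # 3. Download the model weights into models/.
    run curl -L "$MODEL_URL" -o "models/$MODEL_FILE" || return 1

    # 4. Start an interactive session (see ./main --help for current flags).
    run ./main -m "models/$MODEL_FILE" --color -i
}
```

Setting `DRY_RUN=1` before calling `install_llama_cpp` prints the commands without running them, which is a cheap way to review the steps before piping anything into a shell.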

Here’s a one-liner for your Intel Mac or Linux machine. It’s the same as above, but without the LLAMA_METAL=1 flag:
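
The only difference between the two variants is the build flag, so the choice can be made from the platform. A minimal sketch, assuming a Makefile-based llama.cpp build; `build_command` is a hypothetical helper that takes the output of `uname -s` and `uname -m`:

```shell
#!/bin/sh
# Pick the build command from the OS name and CPU architecture.
# build_command is a hypothetical helper, not part of llama.cpp.
build_command() {
    if [ "$1" = "Darwin" ] && [ "$2" = "arm64" ]; then
        echo "LLAMA_METAL=1 make"   # Apple Silicon: enable Metal
    else
        echo "make"                 # Intel Mac or Linux: CPU-only build
    fi
}

# Show the command that would be used on this machine.
build_command "$(uname -s)" "$(uname -m)"
```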

Here’s a one-liner to run on Windows on WSL:

Ollama (Mac)

Ollama is an open-source macOS app (for Apple Silicon) that lets you run, create, and share large language models with a command-line interface. Ollama already has support for Llama 2.

To use the Ollama CLI, download the macOS app at ollama.ai/download. Once you’ve got it installed, you can download Llama 2 without having to register for an account or join any waiting lists. Run this in your terminal:

Then you can run the model and chat with it:
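
The commands themselves are missing from this copy of the post. With the Ollama CLI, the two steps above (download, then chat) usually look like the sketch below, assuming the model is published under the tag `llama2`. The wrapper function `chat_with_llama2` is hypothetical and only exists to fail early when the CLI is not installed.

```shell
#!/bin/sh
# Assumes the Ollama app is installed and the model tag is `llama2`.
# chat_with_llama2 is a hypothetical wrapper, not an Ollama command.
chat_with_llama2() {
    if ! command -v ollama >/dev/null 2>&1; then
        echo "ollama not found: install the macOS app first" >&2
        return 1
    fi
    ollama pull llama2 || return 1   # download the model weights
    ollama run llama2                # open an interactive chat session
}
```

Call `chat_with_llama2` from your terminal, or run the two `ollama` commands directly.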

Note: Ollama recommends that you have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
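
The thresholds in this note can be turned into a quick check. A minimal sketch, using only the figures quoted above; `largest_model_for_ram` is a hypothetical helper that takes the installed RAM in whole gigabytes:

```shell
#!/bin/sh
# Map installed RAM (in GB) to the largest model size from the note above.
# largest_model_for_ram is a hypothetical helper, not an Ollama command.
largest_model_for_ram() {
    if [ "$1" -ge 32 ]; then
        echo "13B"
    elif [ "$1" -ge 16 ]; then
        echo "7B"
    elif [ "$1" -ge 8 ]; then
        echo "3B"
    else
        echo "none"
    fi
}

largest_model_for_ram 16   # prints 7B
```

On macOS, `sysctl -n hw.memsize` reports installed memory in bytes, so `largest_model_for_ram $(( $(sysctl -n hw.memsize) / 1073741824 ))` applies the check to the current machine.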

MLC LLM (Llama on your phone)

MLC LLM is an open-source project that makes it possible to run language models locally on a variety of devices and platforms, including iOS and Android.

For iPhone users, there’s an MLC chat app on the App Store. MLC now supports the 7B, 13B, and 70B versions of Llama 2, but that support is still in beta and not yet in the App Store version, so you’ll need to install TestFlight to try it out. Check out the instructions for installing the beta version here.

Next steps

Happy hacking! 🦙