Local AI Models with LM Studio and Spring AI
piotr.minkowski · 2026-03-16 · via Piotr's TechBlog

This article explains how to use LM Studio to run AI models locally and use them in a Java application with Spring AI and Spring Boot. LM Studio is one of the most popular alternatives to Ollama, making it easy to run LLMs directly on your laptop. With a clean UI and a built-in OpenAI-compatible API server, you can easily integrate it with your applications. For those of you experimenting with LLMs and Java applications, the Spring AI framework seems like the natural choice. Today, you’ll learn how to use models launched via LM Studio in your application using Spring AI. For Mac users, it’s even more interesting because LM Studio supports MLX models, which can bring significant memory and performance improvements on macOS.

Spring AI is a framework that helps developers easily integrate artificial intelligence features, such as large language models, embeddings, and vector databases, into Spring Boot applications. On my blog, you will find a collection of articles that guide you step by step through working with Spring AI. I cover both simple examples that help you get started quickly and more advanced, complex use cases. I suggest starting with this article.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow the instructions below. The repository contains several sample applications. The one used in this article is in the spring-ai-openai-compatibility directory.

Run AI Models with LM Studio

Using LM Studio, we can run many different models locally in GGUF or MLX formats. Here we need a general-purpose chat model to demonstrate how a Spring AI application talks to a model on LM Studio through its OpenAI-compatible API server. To pick a suitable model for my hardware in this category, I will use the llmfit tool. It is a simple command-line tool that lists the models recommended for a given hardware configuration, organized by category and with an overall score. The model I’ve chosen for today’s experiments is shown in the figure below. This is DeepSeek-V2-Lite-Chat. It may not be the highest-rated model in the Chat category, but it strikes a reasonable balance between token-processing speed and overall rating.

[Figure: lm-studio-spring-ai-llmfit]

Then, let’s look for our model in LM Studio. Here I found the model mlx-community/DeepSeek-V2-Lite-Chat-4bit-mlx. It’s even better because I want to run a model converted to the MLX format, which is optimized for macOS. Let’s download that model.

[Figure: lm-studio-spring-ai-install-model]

Finally, we can run the model locally. In the Local Server section, click the Load Model button, find the DeepSeek-V2-Lite-Chat-4bit-mlx model, and click it to run it with LM Studio. The local server must be enabled. By default, it listens on port 1234. You can load and run several AI models this way, in which case each model is identified by its name. In our case, the model name is deepseek-v2-lite-chat-mlx. Note both the local server address and the model identifier, because you will need them later in the Spring AI application settings.

[Figure: lm-studio-spring-ai-run-model]
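Before wiring up Spring AI, it can help to confirm that the local server is reachable. The sketch below is not part of the sample repository. It assumes the server listens on localhost:1234 (the default mentioned above) and that it exposes the model-listing route of the OpenAI-compatible API, GET /v1/models. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of the article's repository.
class LmStudioModelsProbe {

    // Builds GET {baseUrl}/v1/models; a trailing slash on baseUrl is tolerated.
    static HttpRequest modelsRequest(String baseUrl) {
        String root = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(root + "/v1/models"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: LM Studio runs on this machine with the default port.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:1234";

        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // plain HTTP/1.1 request
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        HttpResponse<String> response =
                client.send(modelsRequest(baseUrl), HttpResponse.BodyHandlers.ofString());

        // The JSON body should contain the identifiers of the loaded models.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

If the model identifier from LM Studio shows up in the response body, the address and the model name are the two values to carry over to the application settings.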

Integrate Spring AI with the Model on LM Studio

Using OpenAI-Compatible API

As I mentioned before, LM Studio provides an OpenAI-compatible API server. So, we need to include the Spring AI OpenAI starter in our Spring Boot app dependencies.

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>

XML

Here are the properties from the Spring Boot application.properties file. The API server is not protected with any API key, but we must still set some value so that the property isn’t empty. Then, we must set the API URL and the model name using the values previously read for that model in LM Studio. The address below points to the machine running LM Studio in my network, so replace it with your own host.

spring.ai.openai.api-key = ${OPENAI_API_KEY:lm-studio}
spring.ai.openai.chat.base-url = http://192.168.0.16:1234
spring.ai.openai.chat.options.model = deepseek-v2-lite-chat-mlx

logging.level.org.springframework.ai.chat.client.advisor = DEBUG

Plaintext
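The ${OPENAI_API_KEY:lm-studio} placeholder means "use OPENAI_API_KEY if it is defined, otherwise fall back to lm-studio". Here is a minimal sketch of that fallback rule in plain Java. It is a simplification that looks only at environment variables, while Spring resolves placeholders against all of its property sources. The class and method names are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration of the placeholder's default-value rule.
class ApiKeyFallback {

    // Returns the variable's value when it is defined, else the given fallback.
    static String resolve(Map<String, String> env, String name, String fallback) {
        return env.getOrDefault(name, fallback);
    }

    public static void main(String[] args) {
        // Prints the real key when OPENAI_API_KEY is exported, "lm-studio" otherwise.
        System.out.println(resolve(System.getenv(), "OPENAI_API_KEY", "lm-studio"));
    }
}
```

This way the same configuration works against a local LM Studio server and, with the variable exported, against a hosted API that does require a key.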

Our application exposes one REST endpoint for demo purposes. The GET /simple/{country} endpoint asks the model for the capital of the given country and for a brief description of that city’s history.

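import java.util.Map;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/simple")
public class SimpleController {

    private final ChatClient chatClient;

    public SimpleController(ChatClient.Builder chatClientBuilder) {
        // SimpleLoggerAdvisor logs requests and responses at DEBUG level
        this.chatClient = chatClientBuilder
                .defaultAdvisors(SimpleLoggerAdvisor.builder().build())
                .build();
    }

    @GetMapping("/{country}")
    public String ping(@PathVariable String country) {
        // {country} is filled in from the path variable
        PromptTemplate pt = new PromptTemplate("""
                What's the capital of {country}?
                Describe the history of that city briefly.
                """);

        return chatClient.prompt(pt.create(Map.of("country", country)))
                .call()
                .content();
    }
}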

Java
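To make it clearer what travels over the wire, here is a rough sketch of the kind of request an OpenAI-compatible chat call boils down to: a POST to /v1/chat/completions with a model name and a list of messages. It is a hand-built approximation using only the JDK, not the payload Spring AI actually generates, which contains more fields. The class name, helper methods, and base URL are assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch; Spring AI builds its own, richer request.
class ChatCompletionSketch {

    // Same wording as the controller's prompt; {country} is the placeholder.
    static final String TEMPLATE =
            "What's the capital of {country}?\nDescribe the history of that city briefly.";

    static String renderPrompt(String country) {
        return TEMPLATE.replace("{country}", country);
    }

    // Minimal JSON string escaping for quotes, backslashes and control characters.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Builds the smallest useful chat-completions body: model + one user message.
    static String requestBody(String model, String prompt) {
        return "{\"model\":" + jsonString(model)
                + ",\"messages\":[{\"role\":\"user\",\"content\":" + jsonString(prompt) + "}]}";
    }

    static HttpRequest chatRequest(String baseUrl, String apiKey, String model, String country) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey) // any non-empty key, per the article
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, renderPrompt(country))))
                .build();
    }

    public static void main(String[] args) {
        System.out.println(requestBody("deepseek-v2-lite-chat-mlx", renderPrompt("Poland")));
    }
}
```

Comparing this shape with the request shown in the LM Studio server logs is a quick way to check which model name and prompt the application really sends.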

Then, go to the repository root directory and run the following command to start the app:

mvn spring-boot:run

ShellSession

Once the app starts successfully, you can make the following request by entering the name of the country whose capital you want to learn more about.

curl http://localhost:9080/simple/Poland 

ShellSession

Now for something a bit unexpected. The app doesn’t throw any errors, yet it never gets a response from the model running on LM Studio. From the app’s perspective, the request is sent successfully, but the connection hangs indefinitely. The root cause is that LM Studio currently doesn’t support HTTP/2, which the RestClient used by Spring AI in Spring Boot apps prefers by default. To resolve this issue, we must force the HTTP/1.1 protocol by overriding the default RestClient builder in the following way:

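import java.net.http.HttpClient;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

@SpringBootApplication
public class SpringAIOpenAICompatibility {
    public static void main(String[] args) {
        SpringApplication.run(SpringAIOpenAICompatibility.class, args);
    }

    // Replaces the default RestClient.Builder with one backed by an HTTP/1.1-only JDK client
    @Bean
    public RestClient.Builder restClientBuilder() {
        HttpClient httpClient = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // force HTTP/1.1
                .build();

        return RestClient.builder()
                .requestFactory(new JdkClientHttpRequestFactory(httpClient));
    }
}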

Java
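To see where the HTTP/2 preference comes from, here is a small standalone check that is not part of the sample repository. It relies only on the JDK's java.net.http.HttpClient, which prefers HTTP/2 unless a version is set explicitly. The class and method names are made up for illustration.

```java
import java.net.http.HttpClient;

// Hypothetical standalone check of the JDK client's protocol preference.
class HttpVersionCheck {

    // A client built without an explicit version.
    static HttpClient defaultClient() {
        return HttpClient.newHttpClient();
    }

    // The same kind of client as in the bean above, pinned to HTTP/1.1.
    static HttpClient http11Client() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
    }

    public static void main(String[] args) {
        // version() reports the preferred protocol, not one negotiated with a server.
        System.out.println("default preference: " + defaultClient().version());
        System.out.println("pinned preference:  " + http11Client().version());
    }
}
```

Pinning the version on the client is a local, explicit fix: it changes nothing on the LM Studio side and only affects HTTP calls made through this RestClient builder.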

Now, you can restart the app and repeat the test call. My model works quite well. It responds fairly quickly and accurately describes the history of the selected capital.

[Figure: lm-studio-spring-ai-response]

Using Anthropic-Compatible API

LM Studio also provides an endpoint compatible with the Anthropic API. To use it, replace the previous starter with the one that supports the Anthropic API:

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-model-anthropic</artifactId>
</dependency>

XML

Then, you can keep the same values in the configuration properties, but under a different prefix: spring.ai.anthropic.*.

spring.ai.anthropic.api-key = ${OPENAI_API_KEY:lm-studio}
spring.ai.anthropic.chat.base-url = http://192.168.0.16:1234
spring.ai.anthropic.chat.options.model = deepseek-v2-lite-chat-mlx

logging.level.org.springframework.ai.chat.client.advisor = DEBUG

Plaintext

Finally, you can restart the app and call the same app endpoints. This time, the app will communicate with the exact same model, but via the Anthropic API.

You can enable several features specific to the Anthropic API. Of course, the selected feature must be supported by the target AI model. Below is how to create a Spring AI client that enables the “thinking” mechanism using the AnthropicChatOptions object.

@GetMapping("/{country}")
public String ping(@PathVariable String country) {
   PromptTemplate pt = new PromptTemplate("""
          What's the capital of {country} ?
          Describe the history of that city briefly.
   """);

   return chatClient.prompt(pt.create(Map.of("country", country)))
          .options(AnthropicChatOptions.builder()
                  .temperature(1.0)
                  .thinking(AnthropicApi.ThinkingType.ENABLED, 2048)
                  .build())
          .call()
          .content();
}

Java

LM Studio lets you inspect the logs of its API server. This makes it easy to verify, for example, that a message is being sent in a format compatible with the Anthropic API (the POST /v1/messages endpoint).
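For comparison with what appears in those logs, below is a hand-built sketch of an Anthropic-style Messages request with thinking enabled. The field names (model, max_tokens, thinking with budget_tokens, messages) and the x-api-key and anthropic-version headers follow the public Anthropic Messages API. Whether LM Studio checks those headers is an assumption I have not verified. The class name, helpers, and base URL are hypothetical, and the payload Spring AI produces may contain more fields.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch of an Anthropic-style request; not Spring AI's own payload.
class AnthropicMessagesSketch {

    // Minimal JSON string escaping for quotes, backslashes and control characters.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // The thinking budget must stay below max_tokens, so reject other combinations early.
    static String requestBody(String model, String prompt, int maxTokens, int thinkingBudget) {
        if (thinkingBudget >= maxTokens) {
            throw new IllegalArgumentException("thinkingBudget must be smaller than maxTokens");
        }
        return "{\"model\":" + jsonString(model)
                + ",\"max_tokens\":" + maxTokens
                + ",\"thinking\":{\"type\":\"enabled\",\"budget_tokens\":" + thinkingBudget + "}"
                + ",\"messages\":[{\"role\":\"user\",\"content\":" + jsonString(prompt) + "}]}";
    }

    static HttpRequest messagesRequest(String baseUrl, String apiKey, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/messages"))
                .header("content-type", "application/json")
                .header("x-api-key", apiKey)
                .header("anthropic-version", "2023-06-01")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // 2048 matches the thinking budget used in the controller example above.
        System.out.println(requestBody("deepseek-v2-lite-chat-mlx",
                "What's the capital of Poland?", 4096, 2048));
    }
}
```

The main differences from the OpenAI-compatible call are the path, the key header, and the mandatory max_tokens field, which is why the logs are a convenient place to tell the two APIs apart.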

Conclusion

In my experience, LM Studio offers more features for working with local models than Ollama. I find it particularly useful that I can easily run a model in MLX format and view the logs of messages sent to the API server. While experimenting with Spring AI and LM Studio, I ran into an unexpected and quite confusing issue with HTTP/2 support. However, it was quick to solve using standard Spring features.