Implement RAG Using Weaviate, LangChain4j and LocalAI
Published by mydeveloperplanet · 2024-03-06

In this blog, you will learn how to implement Retrieval Augmented Generation (RAG) using Weaviate, LangChain4j and LocalAI. This implementation allows you to ask questions about your documents using natural language. Enjoy!

1. Introduction

In the previous post, Weaviate was used as a vector database in order to perform a semantic search. The source documents are two Wikipedia documents: the discography of Bruce Springsteen and the list of songs he recorded. The interesting part of these documents is that they contain facts, mainly in a table format. Parts of these documents are converted to Markdown in order to have a better representation. The Markdown files are embedded in Collections in Weaviate. The result was amazing: every question asked returned the correct segment. You still needed to extract the answer yourself, but this was quite easy.

However, can this be solved by providing the Weaviate search results to an LLM (Large Language Model) by creating the right prompt? Will the LLM be able to extract the correct answer to the questions?

The setup consists of the following steps:

  1. The documents are embedded and stored in Weaviate;
  2. The question is embedded and a semantic search is performed using Weaviate;
  3. Weaviate returns the semantic search results;
  4. The result is added to a prompt and fed to LocalAI which runs an LLM using LangChain4j;
  5. The LLM returns the answer to the question.
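
The five steps above can be sketched in plain Java as two pluggable functions. This is a minimal sketch and not the code used in this blog: Retriever, Llm and ask are hypothetical names, and the stub lambdas stand in for Weaviate and LocalAI so that the flow can be run without either service.

```java
import java.util.function.Function;

class RagPipelineSketch {

    // Steps 2 and 3: embed the question and return the best matching segment.
    // Hypothetical stand-in for the Weaviate semantic search.
    interface Retriever extends Function<String, String> {}

    // Steps 4 and 5: send a prompt to an LLM and return its answer.
    // Hypothetical stand-in for LocalAI accessed through LangChain4j.
    interface Llm extends Function<String, String> {}

    static String ask(String question, Retriever retriever, Llm llm) {
        String context = retriever.apply(question);
        String prompt = "Answer the following question: " + question + "\n"
                + "Use the following data to answer the question: " + context;
        return llm.apply(prompt);
    }

    public static void main(String[] args) {
        // Stubs: a fixed search result and an "LLM" that echoes the data line of its prompt.
        Retriever retriever = q -> "song=\"Highway Patrolman\", year=1982";
        Llm llm = prompt -> prompt.substring(prompt.lastIndexOf('\n') + 1);

        System.out.println(ask("in which year was \"Highway Patrolman\" released?", retriever, llm));
    }
}
```

Keeping retrieval and generation behind separate functions makes it possible to test the prompt construction without a running vector database or LLM.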

Weaviate also supports RAG, so why bother using LocalAI and LangChain4j? Unfortunately, Weaviate does not support an integration with LocalAI and only cloud LLMs can be used. If your documents contain sensitive information or information you do not want to send to a cloud-based LLM, you need to run a local LLM, and this can be done using LocalAI and LangChain4j.

If you want to run the examples in this blog, you need to read the previous blog.

The sources used in this blog can be found at GitHub.

2. Prerequisites

The prerequisites for this blog are:

  • Basic knowledge of embedding and vector stores;
  • Basic Java knowledge (Java 21 is used);
  • Basic knowledge of Docker;
  • Basic knowledge of LangChain4j;
  • You need Weaviate and the documents need to be embedded, see the previous blog for how to do so;
  • You need LocalAI if you want to run the examples, see a previous blog for how you can make use of LocalAI. Version 2.2.0 is used for this blog;
  • If you want to learn more about RAG, read this blog.

3. Create the Setup

Before getting started, there is some setup to do.

3.1 Setup LocalAI

LocalAI must be running and configured. How to do so is explained in the blog Running LLM’s Locally: A Step-by-Step Guide.

3.2 Setup Weaviate

Weaviate must be started. The only difference from the Weaviate blog is that you will run it on port 8081 instead of port 8080, because LocalAI is already running on port 8080.

Start the compose file from the root of the repository.

$ docker compose -f docker/compose-embed-8081.yaml up

Run class EmbedMarkdown in order to embed the documents (change the port to 8081!). Three collections are created:

  1. CompilationAlbum: a list of all compilation albums of Bruce Springsteen;
  2. Song: a list of all songs of Bruce Springsteen;
  3. StudioAlbum: a list of all studio albums of Bruce Springsteen.

4.1 Semantic Search

The first part of the implementation is based on the semantic search implementation of class SearchCollectionNearText. It is assumed here that you know which collection (argument className) to search in.

In the previous post, you noticed that, strictly speaking, you do not need to know which collection to search in. However, at this moment, it makes the implementation a bit easier and the result remains identical.

The code takes the question and embeds it with the help of NearTextArgument. The GraphQL API of Weaviate is used to perform the search.

private static void askQuestion(String className, Field[] fields, String question, String extraInstruction) {
    Config config = new Config("http", "localhost:8081");
    WeaviateClient client = new WeaviateClient(config);

    Field additional = Field.builder()
            .name("_additional")
            .fields(Field.builder().name("certainty").build(), // only supported if distance==cosine
                    Field.builder().name("distance").build()   // always supported
            ).build();
    Field[] allFields = Arrays.copyOf(fields, fields.length + 1);
    allFields[fields.length] = additional;

    // Embed the question
    NearTextArgument nearText = NearTextArgument.builder()
            .concepts(new String[]{question})
            .build();

    Result<GraphQLResponse> result = client.graphQL().get()
            .withClassName(className)
            .withFields(allFields)
            .withNearText(nearText)
            .withLimit(1)
            .run();

    if (result.hasErrors()) {
        System.out.println(result.getError());
        return;
    }
    ...

4.2 Create Prompt

The result of the semantic search needs to be fed to the LLM including the question itself. A prompt is created which will instruct the LLM to answer the question using the result of the semantic search. Also, the option to add extra instructions is implemented. Later on, you will see what to do with that.

private static String createPrompt(String question, String inputData, String extraInstruction) {
    return "Answer the following question: " + question + "\n" +
            extraInstruction + "\n" +
            "Use the following data to answer the question: " + inputData;
}
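
To see exactly what the LLM receives, it can help to print the prompt. The sketch below reuses createPrompt unchanged. The input data is a shortened version of the Weaviate response shown in section 6.1, and the extra instruction in the second call is a hypothetical one, chosen only for illustration.

```java
class PromptDemo {

    // Same method as above.
    static String createPrompt(String question, String inputData, String extraInstruction) {
        return "Answer the following question: " + question + "\n" +
                extraInstruction + "\n" +
                "Use the following data to answer the question: " + inputData;
    }

    public static void main(String[] args) {
        String question = "on which album was \"adam raised a cain\" originally released?";
        // Shortened search result, taken from the Weaviate response shown in section 6.1.
        String data = "{originalRelease=Darkness on the Edge of Town, year=1978}";

        // With an empty extra instruction, the second line of the prompt is blank.
        System.out.println(createPrompt(question, data, ""));
        System.out.println("---");
        // Hypothetical extra instruction, for illustration only.
        System.out.println(createPrompt(question, data, "Answer with the album title only."));
    }
}
```

Note that an empty extra instruction still leaves an empty line in the prompt, so the prompt always consists of three lines.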

4.3 Use LLM

The last thing to do is to feed the prompt to the LLM and to print the question and answer to the console.

private static void askQuestion(String className, Field[] fields, String question, String extraInstruction) {
    ...
    ChatLanguageModel model = LocalAiChatModel.builder()
            .baseUrl("http://localhost:8080")
            .modelName("lunademo")
            .temperature(0.0)
            .build();

    String answer = model.generate(createPrompt(question, result.getResult().getData().toString(), extraInstruction));

    System.out.println(question);
    System.out.println(answer);
}
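
LocalAI exposes an OpenAI-compatible HTTP API, which is what makes it usable from LangChain4j. As a rough illustration of what a chat request looks like on the wire, the sketch below builds (but does not send) such a request with the JDK HTTP client. The endpoint path /v1/chat/completions and the JSON shape follow the OpenAI chat API and are assumptions about your LocalAI setup. The method escapeJson is a hypothetical minimal helper, not a full JSON encoder, and this is not necessarily the exact request LangChain4j sends.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class LocalAiRequestSketch {

    // Minimal escaping for backslashes, double quotes and newlines only.
    // Hypothetical helper; use a JSON library for anything beyond a sketch.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Builds the JSON body for a single user message.
    static String chatBody(String modelName, double temperature, String prompt) {
        return "{\"model\":\"" + escapeJson(modelName) + "\","
                + "\"temperature\":" + temperature + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escapeJson(prompt) + "\"}]}";
    }

    // Builds the POST request; the path is an assumption based on the OpenAI chat API.
    static HttpRequest chatRequest(String baseUrl, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Model name, temperature and base URL are the values used in the code above.
        String body = chatBody("lunademo", 0.0, "Answer the following question: ...");
        HttpRequest request = chatRequest("http://localhost:8080", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```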

4.4 Questions

The questions to be asked are the same as in the previous posts. They will invoke the code above.

public static void main(String[] args) {
    askQuestion(Song.NAME, Song.getFields(), "on which album was \"adam raised a cain\" originally released?", "");
    askQuestion(StudioAlbum.NAME, StudioAlbum.getFields(), "what is the highest chart position of \"Greetings from Asbury Park, N.J.\" in the US?", "");
    askQuestion(CompilationAlbum.NAME, CompilationAlbum.getFields(), "what is the highest chart position of the album \"tracks\" in canada?", "");
    askQuestion(Song.NAME, Song.getFields(), "in which year was \"Highway Patrolman\" released?", "");
    askQuestion(Song.NAME, Song.getFields(), "who produced \"all or nothin' at all?\"", "");
}

The complete source code can be viewed here.

5. Results

Run the code and the result is the following:

  1. on which album was “adam raised a cain” originally released?
    The album “Darkness on the Edge of Town” was originally released in 1978, and the song “Adam Raised a Cain” was included on that album.
  2. what is the highest chart position of “Greetings from Asbury Park, N.J.” in the US?
    The highest chart position of “Greetings from Asbury Park, N.J.” in the US is 60.
  3. what is the highest chart position of the album “tracks” in canada?
    Based on the provided data, the highest chart position of the album “Tracks” in Canada is -. This is because the data does not include any Canadian chart positions for this album.
  4. in which year was “Highway Patrolman” released?
    The song “Highway Patrolman” was released in 1982.
  5. who produced “all or nothin’ at all?”
    The song “All or Nothin’ at All” was produced by Bruce Springsteen, Roy Bittan, Jon Landau, and Chuck Plotkin.

All answers to the questions are correct. The most important job has been done in the previous post, where embedding the documents in the correct way resulted in finding the correct segments. An LLM is able to extract the answer to the question when it is fed with the correct data.

6. Caveats

During the implementation, I ran into some strange behavior which is quite important to know when you are starting to implement your use case.

6.1 Format of Weaviate Results

The Weaviate response contains a GraphQLResponse object, something like the following:

GraphQLResponse(
  data={
    Get={
      Songs=[
        {_additional={certainty=0.7534831166267395, distance=0.49303377}, 
         originalRelease=Darkness on the Edge of Town, 
         producers=Jon Landau Bruce Springsteen Steven Van Zandt (assistant), 
         song="Adam Raised a Cain", writers=Bruce Springsteen, year=1978}
      ]
     }
  }, 
  errors=null)

In the code, the data part is used to add to the prompt.

String answer = model.generate(createPrompt(question, result.getResult().getData().toString(), extraInstruction));

What happens when you add the response as-is to the prompt?

String answer = model.generate(createPrompt(question, result.getResult().toString(), extraInstruction));

Running the code returns the following wrong answer for question 3 and some unnecessary additional information for question 4. The other questions are answered correctly.

  • what is the highest chart position of the album “tracks” in canada?
    Based on the provided data, the highest chart position of the album “Tracks” in Canada is 50.
  • in which year was “Highway Patrolman” released?
    Based on the provided GraphQLResponse, “Highway Patrolman” was released in 1982.
    who produced “all or nothin’ at all?”
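
The comparison above suggests that less noise in the prompt helps. Going one step further, you could pass only the properties of the matched object to the LLM. The sketch below assumes that the data part is a nested Map and List structure, as its printed form suggests. The methods firstObject and toPromptData are hypothetical helpers, and whether this improves the answers for your model is something you need to verify yourself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ResultFormatSketch {

    // Returns the first object found under data -> Get -> <className>, or an empty map.
    @SuppressWarnings("unchecked")
    static Map<String, Object> firstObject(Object data, String className) {
        if (!(data instanceof Map)) return Map.of();
        Object get = ((Map<String, Object>) data).get("Get");
        if (!(get instanceof Map)) return Map.of();
        Object hits = ((Map<String, Object>) get).get(className);
        if (!(hits instanceof List) || ((List<?>) hits).isEmpty()) return Map.of();
        Object first = ((List<?>) hits).get(0);
        return first instanceof Map ? (Map<String, Object>) first : Map.of();
    }

    // Renders the properties as "key: value" lines and skips the _additional metadata.
    static String toPromptData(Map<String, Object> object) {
        return object.entrySet().stream()
                .filter(e -> !e.getKey().equals("_additional"))
                .map(e -> e.getKey() + ": " + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the data part of the response shown above.
        Map<String, Object> song = new LinkedHashMap<>();
        song.put("_additional", Map.of("distance", 0.49303377));
        song.put("originalRelease", "Darkness on the Edge of Town");
        song.put("song", "\"Adam Raised a Cain\"");
        song.put("year", 1978);
        Object data = Map.of("Get", Map.of("Songs", List.of(song)));

        System.out.println(toPromptData(firstObject(data, "Songs")));
    }
}
```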

6.2 Format of Prompt

The code contains functionality to add extra instructions to the prompt. As you have probably noticed, this functionality is not used. Let’s see what happens when you remove this from the prompt. The createPrompt method becomes the following (I did not remove everything so that only a minor code change is needed).

private static String createPrompt(String question, String inputData, String extraInstruction) {
    return "Answer the following question: " + question + "\n" +
             "Use the following data to answer the question: " + inputData;
}

Running the code adds some extra information to the answer of question 3 which is not entirely correct. It is correct that the album has chart positions for the United States, United Kingdom, Germany and Sweden. It is not correct that the album reached the top 10 in the UK and US charts. All other questions are answered correctly.

  • what is the highest chart position of the album “tracks” in canada?
    Based on the provided data, the highest chart position of the album “Tracks” in Canada is not specified. The data only includes chart positions for other countries such as the United States, United Kingdom, Germany, and Sweden. However, the album did reach the top 10 in the UK and US charts.

Using an LLM remains a bit brittle. You cannot always trust the answer it gives. By adjusting the prompt, it seems possible to minimize the hallucinations of an LLM. It is therefore important that you collect feedback from your users in order to identify when an LLM seems to hallucinate. This way, you will be able to improve the responses to the users. Fiddler has written an interesting blog which addresses these kinds of issues.
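
As a first, cheap signal in addition to user feedback, you could flag answers that mention numbers which do not occur in the question or in the data given to the LLM, such as an invented chart position. This is only a heuristic sketch: ungroundedNumbers is a hypothetical helper, the data and answer in main are made up, and the check will miss hallucinations that are not numeric or whose number happens to appear in the data.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroundingCheckSketch {

    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Collects all runs of digits in the text, in order of first appearance.
    static Set<String> numbersIn(String text) {
        Set<String> numbers = new LinkedHashSet<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    // Returns the numbers that occur in the answer but not in the question or the data.
    static Set<String> ungroundedNumbers(String question, String data, String answer) {
        Set<String> result = numbersIn(answer);
        result.removeAll(numbersIn(question));
        result.removeAll(numbersIn(data));
        return result;
    }

    public static void main(String[] args) {
        // Made-up data and answer, for illustration only.
        String question = "what is the highest chart position of the album \"tracks\" in canada?";
        String data = "{album=Tracks, CAN=-}";
        String answer = "The highest chart position of the album \"Tracks\" in Canada is 50.";

        Set<String> suspicious = ungroundedNumbers(question, data, answer);
        if (!suspicious.isEmpty()) {
            System.out.println("Check this answer, numbers not found in the data: " + suspicious);
        }
    }
}
```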

7. Conclusion

In this blog, you learned how to implement RAG using Weaviate, LangChain4j and LocalAI. The results are quite amazing. Embedding documents the right way, filtering the results and feeding them to an LLM is a very powerful combination which can be used in many use cases.

