Build a question-answering service with AI and vector databases in JavaScript

Do you need to extract information from a massive number of documents? Would it be cool if AI could answer questions using data from those documents? Do you know JavaScript and don’t want to learn Python just to build one service? Good. You can do it all with Langchain.js.

Table of Contents

  1. What are we building, and what do we need?
  2. How to use a vector database for question answering in JavaScript?
    1. Loading documents
    2. Answering questions
  3. Building the REST service
    1. API usage

What are we building, and what do we need?

We will build a REST service that receives a request with the user’s question, finds the relevant documents in a vector database, and uses an AI model to generate the answer from the retrieved data.

We need a collection of documents. For the sake of the tutorial, I load only one document: David Perell’s The Ultimate Guide to Writing Online.

We must split the document into chunks while preserving paragraphs and sentences. We use the RecursiveCharacterTextSplitter. The splitter first tries to break the text at the end of a paragraph. If the chunk is still too big, it tries to split the text at a line break, which usually coincides with the end of a sentence. If that isn’t possible either, the splitter breaks the text at a word boundary. It won’t cut words in half unless there is no other option.
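Here is a minimal sketch of how you could configure the splitter. The separators listed mirror the splitter’s defaults, and the chunkOverlap value is an assumption for illustration, not a setting used later in this tutorial.

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

// Separators are tried in order: paragraph break, line break, word boundary,
// and finally a single character as a last resort.
const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,    // maximum number of characters per chunk
    chunkOverlap: 100,  // hypothetical overlap to preserve context across chunks
    separators: ["\n\n", "\n", " ", ""],
});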

Naturally, we need a vector database to store the embeddings of the document chunks. HNSWLib is good enough for in-memory storage. In production, you can use pretty much any database you want (I like Milvus).

Finally, we need access to an AI model (the OpenAI API) to synthesize the answer from the retrieved document chunks.

How to use a vector database for question answering in JavaScript?

Let’s start with all of the required imports. I load the document from a file stored on disk, so I also use the fs module.

import { OpenAI } from "langchain/llms/openai";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { VectorDBQAChain } from "langchain/chains";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import * as fs from "fs";

We must pass the OpenAI API key to the service to run the code. In the following steps, I assume you have set the OPENAI_API_KEY environment variable (for example, with export OPENAI_API_KEY=... in your shell).

Loading documents

The following function loads the article, creates an instance of the splitter, splits the document into chunks, and puts them (and their embeddings) into the vector database.

After populating the vector database with the documents we want to use, we build the question-answering chain. The chain retrieves the relevant chunks from the database and passes them to an LLM to generate the answer.

const prepareData = async () => {
    // Load the article and split it into chunks of at most 1000 characters.
    const text = fs.readFileSync("article.txt", "utf8");
    const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
    const docs = await textSplitter.createDocuments([text]);

    // Embed the chunks and store them in the in-memory vector database.
    const vectorStore = await HNSWLib.fromDocuments(
        docs,
        new OpenAIEmbeddings({ openAIApiKey: process.env.OPENAI_API_KEY })
    );

    // Build the question-answering chain on top of the vector store.
    const model = new OpenAI({
        openAIApiKey: process.env.OPENAI_API_KEY,
    });
    const chain = VectorDBQAChain.fromLLM(model, vectorStore, {
        returnSourceDocuments: true,
    });

    return chain;
};

Answering questions

To answer the question, we pass the user’s input to the chain as the query parameter and wait for the result. When we get the response, we extract the text property.

const answer = async (chain, query) => {
    // Run the chain: retrieve the relevant chunks and generate the answer.
    const result = await chain.call({ query });
    return result.text;
};

I stored those functions in a separate module, so I have to export them:

export default {
    answer,
    prepareData,
};
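With the module saved as ai.js (the file name the REST service below imports), a quick usage sketch could look like this; the question is only an example:

import ai from "./ai.js";

const chain = await ai.prepareData();
const response = await ai.answer(chain, "What are the pillars of writing?");
console.log(response);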

Building the REST service

In the REST service module, we create a standard Express REST API, but we also call the prepareData function to load the documents and build the question-answering chain. Remember to store the chain in a variable so you don’t have to load the documents every time the user asks a question. The prepareData function is asynchronous, and we have to wait for the data preparation to finish before we can start the server, so we use a top-level await.

In the endpoint implementation, we call the answer function and return the result to the user.

import express from 'express';
import bodyParser from 'body-parser';
import ai from './ai.js';

const app = express();

app.use(bodyParser.json());

// Load the documents and build the chain once, before the server starts.
// Note: top-level await requires running the project as an ES module
// ("type": "module" in package.json).
app.locals.data = await ai.prepareData();

app.post('/api/answer', async (req, res) => {
  const question = req.body.question;
  const answer = await ai.answer(app.locals.data, question);

  res.json({ answer });
});

app.listen(3000, () => {
  console.log('Server is running on port 3000');
});

API usage

After starting the service, we can send a request using curl:

curl -X POST -H "Content-Type: application/json" -d '{"question": "How to write well"}' http://localhost:3000/api/answer
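If you prefer calling the endpoint from JavaScript, a minimal sketch using the fetch API built into Node.js 18+ could look like this:

const response = await fetch("http://localhost:3000/api/answer", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question: "How to write well" }),
});
const { answer } = await response.json();
console.log(answer);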

In my case, I got an answer that sounds very much like David Perell:

{"answer":" To write well, you should follow the three pillars of writing: write from abundance, write from conversation, and write in public. [TRUNCATED TO SHORTEN THE EXAMPLE]"}

Do you need help building AI-powered applications for your business?
You can hire me!
