When I launched version 0.1 of my DIY chatbot last week, my goal was to create a conversational agent that could synthesize information across my blog posts to answer questions. While v0.1 was a big first step for me as a coding novice, it had significant limitations: the knowledge base only included summaries of my posts, and the chatbot struggled to retrieve specific details like numbers or dates accurately. My motivation has been to incrementally improve the chatbot’s capabilities through hands-on learning. This post is an update after one week of optimizations, focused on expanding the knowledge base to include full article content, strengthening security measures, and optimizing the AI model for cost-effectiveness. While the chatbot still has a long way to go, these initial upgrades represent promising progress.
Here is what has been done since the last version:
The chatbot includes all the latest posts up to the end of October 2023
I published six blog posts in October, all of which are now included in the chatbot’s knowledge base. For example, you can ask questions like: “Tell me the key insights about Coursera that Chandler wrote” and the chatbot can provide an answer based on the articles I wrote in Oct 2023.
The knowledge base includes full blog posts rather than just summaries
For the first version, because I lacked the knowledge and experience to work around the OpenAI API’s context-window limit, I generated a summary of each blog post and embedded it using OpenAI’s embedding API endpoint. I didn’t know how to chunk a full article into smaller sections and embed each one while keeping the context intact (i.e. metadata, URL, title, etc.).
The first iteration of chunking was done this week, so the chatbot’s knowledge base now holds the full text of each blog post rather than just a summary. There is still room for improvement as I continue to refine the chunking limit and approach for long articles. For now, I split long articles into sections of no more than 800 tokens, using paragraph boundaries as natural break points.
This way, the next version should find it much easier to answer questions about specific numbers mentioned in a post.
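The paragraph-based chunking described above can be sketched roughly as follows. This is my own illustrative version, not the post’s actual code, and it uses a crude character-count heuristic (~4 characters per token) in place of a real tokenizer such as tiktoken:

```python
MAX_TOKENS = 800  # the chunk limit mentioned in the post

def estimate_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer: ~4 characters per token in English.
    return max(1, len(text) // 4)

def chunk_article(article: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Split an article into chunks of at most max_tokens,
    using blank-line paragraph boundaries as natural break points."""
    paragraphs = [p.strip() for p in article.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for para in paragraphs:
        para_tokens = estimate_tokens(para)
        # Start a new chunk when adding this paragraph would exceed the limit.
        if current and current_tokens + para_tokens > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += para_tokens
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each chunk would then be sent to the embedding endpoint separately; an oversized single paragraph still forms its own chunk, which is one of the edge cases worth refining.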
The blog post’s publish date, title, and URL are included
After the user enters a question, the retrieval process now passes the blog post title, publish date, and URL to the chatbot along with the text, so the chatbot can return the specific URL or publish date when users ask for it.
This helps with validation in case users want to double-check an answer or read the entire blog post.
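One simple way to do this, assuming each retrieved chunk is stored with its post’s metadata (the field names here are my own, hypothetical ones), is to prefix every chunk with its title, date, and URL when building the context string for the model:

```python
def format_context(chunks: list[dict]) -> str:
    """Build the context string handed to the model at retrieval time,
    prefixing each chunk with its post's title, publish date, and URL
    so the model can cite them back to the user."""
    blocks = []
    for c in chunks:
        header = (
            f"Title: {c['title']}\n"
            f"Published: {c['publish_date']}\n"
            f"URL: {c['url']}"
        )
        blocks.append(f"{header}\n{c['text']}")
    # Separate chunks clearly so the model doesn't blend sources together.
    return "\n\n---\n\n".join(blocks)
```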
Super basic security measures are implemented:
- Inputs are validated before they are sent to the API
- A basic rate limiter caps how many questions you can ask the chatbot per minute
- Queries and chatbot responses are screened with the OpenAI moderation API
GPT-4 vs GPT-3.5
I will continue to use the GPT-3.5 model for the chatbot because of the cost. GPT-4’s answers are much better than GPT-3.5’s, but I haven’t yet found the best way to work within the context window while maintaining conversation history. For complex questions that require the chatbot to look through multiple posts, GPT-4 can cost around $0.15–$0.20 per answer, which I can’t afford yet for this pet project.
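As a back-of-the-envelope check on that figure, assuming GPT-4’s list prices at the time (roughly $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens) and illustrative, not measured, token counts:

```python
def answer_cost(prompt_tokens: int, completion_tokens: int,
                prompt_price: float = 0.03,
                completion_price: float = 0.06) -> float:
    """Cost in USD for one answer; prices are per 1K tokens."""
    return (prompt_tokens / 1000 * prompt_price
            + completion_tokens / 1000 * completion_price)

# A complex question pulling chunks from several posts plus conversation
# history can easily reach ~4,000 prompt tokens and ~500 completion tokens:
# answer_cost(4000, 500) -> 0.15
```

Trimming retrieved chunks and conversation history is the obvious lever for bringing that number down.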
P.S.: The new Generative AI for Everyone course, taught by Andrew Ng, does not disappoint. It is a free course that gives you a general understanding of generative AI and how typical Gen AI software and web applications are built. It is pretty short too, so you can watch all the videos within a weekend.