
chat-thyme

February 10, 2025

We do these things not because they are easy, but because we thought they were going to be easy. - Unknown

In the early morning hours of February 7th, I publicly released chat-thyme, a system for setting up Discord bots that interface with large language model (LLM) services offering OpenAI chat completions API compatibility. A couple of hours later that same day, I was scheduled for a small surgical procedure - the first surgery I’ve ever had in my life. The fear of having my first surgery was weirdly subdued by the fear of releasing my first project in public. Would people like it? Would people even use it? For better or worse, my post on X didn’t get much traction (in hindsight, I should have waited for my blue checkmark to finish processing before making the post). As I sit here with a small reprieve from work while I recover from surgery, I want to reflect on the development process for chat-thyme and put some of my thoughts to paper.

Background

I started chat-thyme as a small project over Christmas, as a way to fill downtime between visiting friends and family when I flew back home to Calgary. My original goals for the project were a bit broader in scope than what ended up making it into the initial release - I had envisioned being able to describe chat-thyme as an “omni-interface” for LLMs. Interact with models in one interface, pick up and go in another! The original plan was to have three user-facing interfaces: a command-line interface (CLI), a Discord chatbot, and a web user interface (UI) similar to Open WebUI. After all, how long could it possibly take to make an application that sends and receives messages and presents them through a UI? As it turns out, while that line of thinking isn’t necessarily wrong, there were a couple of external factors I did not initially take into account:

  • Some experimental components don’t play well when you try to glue them together
  • Some experimental components are, well, experimental and may literally not work yet

As my (perceived) progress on the project dragged on over the course of January (ripping my hair out trying to get Gemini to work, only to find out that my use case literally wasn’t supported yet) and work ramped up again in the new year, I decided to cut my losses and aggressively descope the project, limiting it to just a Discord bot for the initial release. With this being my first public open source project, a lot of doubt crept into my mind. Will it gain traction? Will I find enough time outside of work to continue progress, review contributions, etc.? Well, no answers to those questions for now. We’ll just have to see if it even gains any usage outside of my own. Even if it doesn’t, I get enough use out of it myself, and overall the time spent and the learning experience were worthwhile.

Bun and TypeScript

I mainly work with Python at work, so when it came to side projects, I wanted to build with something I don’t work with often just to keep things fresh. chat-thyme was a good opportunity to dive a bit deeper into Bun and TypeScript. While I have occasionally used JavaScript/TypeScript in the past whenever I stepped into frontend-land on an as-needed basis, this is the first time I’m building a full application in TypeScript. Overall I had a really good experience! Bun is incredibly fast as a package manager compared to alternatives, and it comes well-stocked with functionality for an all-inclusive experience.

One minor detraction from the experience came from using the Bun test runner, though this may be more of an artifact of me not being familiar with how unit testing works in JavaScript/TypeScript. As it turns out, if a module is mocked, calling mock.restore() does not restore the original module, so module mocks can leak into and interfere with subsequent test suites that use the same module. This was counter to my past experience in Python with pytest, where monkeypatching is relatively isolated at the unit test level.
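
To illustrate, here is a minimal sketch of the behaviour I’m describing, using Bun’s test APIs; the ./llm-client module and its sendMessage export are hypothetical stand-ins, not chat-thyme code:

import { afterEach, expect, mock, test } from "bun:test";

// Replace the module for this test file.
mock.module("./llm-client", () => ({
  sendMessage: () => Promise.resolve("mocked response"),
}));

afterEach(() => {
  // Restores mocks created with mock()/spyOn(), but the module registered
  // with mock.module() above stays swapped out for anything else that
  // imports it later in the test run.
  mock.restore();
});

test("uses the mocked client", async () => {
  const { sendMessage } = await import("./llm-client");
  expect(await sendMessage()).toBe("mocked response");
});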

Nix

Mitchell Hashimoto’s blog post on using Nix with Dockerfiles was a hugely influential factor in the choice to try out Nix for this project. The premise of environments “just working” with a single source of truth was too enticing to ignore, even with the fear of Nix being a bit complex for newcomers. Fortunately, the Determinate Systems Nix installer and their Zero to Nix resource helped a lot with getting started and made the whole experience as pain-free as it could be.

That being said, I did have some difficulty setting up the Nix flake for chat-thyme. Through trial and error, I learned enough to get around the file structure (setting up dev shells and default package outputs), but I’m still far from an expert. Another issue I weirdly kept encountering was that the Determinate Systems Nix installer GitHub Action just straight up did not work with my CI/CD workflows, but swapping it out for another Nix install action was relatively straightforward.

But once everything is set up, it really just works! So magical that it’s legitimately cathartic. I would highly recommend that anyone on the fence try out Nix (keeping in mind the caveat that some initial tinkering may be required to get things just right).

OpenAI Compatibility

The original intention was to have chat-thyme be a router of sorts - support a limited number of providers to start off, with bespoke processing logic for each provider. This was because I started off only using Ollama, as it was easy to test locally on an M1 MacBook Pro (the woes of being GPU poor 😢). I had initially assumed I would need to make HTTP requests to each provider by hand, but this changed when I realized most providers support the OpenAI client! Once I realized that OpenAI chat completions API compatibility was broadly supported, I pivoted to have chat-thyme act more as an adapter, channeling everything through the OpenAI client. That being said, although chat-thyme is touted to have almost universal compatibility thanks to the broad OpenAI compatibility support, I have not been able to validate myself whether there would be any issues with model responses served through vLLM and SGLang.
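
The adapter idea essentially boils down to pointing the OpenAI client at a different base URL. As a rough sketch, assuming the openai npm package and a locally running Ollama instance (the model name is just an example):

import OpenAI from "openai";

const client = new OpenAI({
  // Ollama exposes an OpenAI-compatible endpoint at this path by default.
  baseURL: "http://localhost:11434/v1",
  // Ollama ignores the API key, but the client requires some value.
  apiKey: "ollama",
});

const completion = await client.chat.completions.create({
  model: "llama3.1",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);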

Around the time DeepSeek R1 was released, a number of new conventions emerged for how reasoning content is included in model responses. Unfortunately, the OpenRouter response format is slightly different from the DeepSeek response format (reasoning vs reasoning_content). For now, these two formats are the only ones supported. Hopefully there’s some standardization in the near future.
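
Handling both formats boils down to checking for either field on the response message. A small sketch, where the field names are the only assumption carried over from the providers’ formats:

type MessageWithReasoning = {
  content: string | null;
  reasoning?: string; // OpenRouter-style field
  reasoning_content?: string; // DeepSeek-style field
};

function extractReasoning(message: MessageWithReasoning): string | undefined {
  return message.reasoning ?? message.reasoning_content;
}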

Tool use, especially through a more layered provider like OpenRouter, has also had some compatibility issues. There were a number of times where a 400 error would come back with no further information. It’s unknown whether the request formatting error comes from how chat-thyme formats the request or from some sort of additional post-processing that OpenRouter applies. Either way, it’s a bit spotty at the moment, and I can only really recommend enabling tool use for locally hosted models (works like a charm through Ollama :chef’s kiss:).
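
For context, a tool-use request through the chat completions API attaches a tools array to the request. A minimal sketch, assuming an OpenAI client instance like the one in the earlier snippet; the web_search tool definition here is purely illustrative:

const response = await client.chat.completions.create({
  model: "llama3.1",
  messages: [{ role: "user", content: "What's new in LLM tooling?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "web_search",
        description: "Search the web for relevant results",
        parameters: {
          type: "object",
          properties: { query: { type: "string" } },
          required: ["query"],
        },
      },
    },
  ],
});

// If the model decides to call the tool, the calls show up here.
console.log(response.choices[0].message.tool_calls);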

Another thing I wanted to touch on…Google. The experience was…subpar to say the least. The most surprising thing was just how sparse and lacking the documentation was. There is a brief section on tool use/function calling, but there is no indication anywhere that multi-turn tool use is not supported yet. Though to be fair, maybe I as the user should have taken the empty tool_call_id field in each tool call response as an indication that this was the case. That being said, I trust @OfficialLoganK, and Google’s AI products and tooling have been improving at a rapid pace.

Lastly, there doesn’t seem to be a consensus yet on what the format of the tool call output should be when passing it back to the model. I’ve decided to format it like this for now and the results seem decent:

{
	"role": "tool",
	"content": [
		{
			"type": "text",
			"text": {
				"name": <tool/function name>,
				"response": <tool/function output>
			}
		}
	]
}
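
As a small bridge between that shape and code, here is a sketch of a helper that builds such a message; the function and variable names are illustrative rather than chat-thyme’s actual internals:

function formatToolResult(name: string, output: unknown) {
  return {
    role: "tool" as const,
    content: [
      {
        type: "text" as const,
        text: { name, response: output },
      },
    ],
  };
}

// e.g. messages.push(formatToolResult("web_search", searchResults));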

Exa

While we’re on the topic of tool use, the functionality to interface with Exa was something I really wanted to include as part of chat-thyme. Conceptually, adding semantic search seemed like a no-brainer and an easy extension to LLM functionality. Pricing is pretty reasonable too - I never even got close to burning through the initial free credits through all of my testing.

In practice, Exa is great for semantically relevant informational searches, but it can be a bit hit or miss for current events. Since chat-thyme does not force the model to use tools as part of every response, users need to prompt the model with exact timeframes when it comes to current events (i.e. “Which teams are playing in the Super Bowl this year?” vs “Which teams are playing in the Super Bowl this year? The current year is 2025”). Even then, the results are still a bit random. For example, I have not had any success when asking for what price a stock ticker closed at for the day.

One thing I definitely want to incorporate in the future is an “arXiv mode” of sorts to have Exa specifically search for relevant literature on arXiv! In my opinion, this is probably one of the biggest use cases for Exa. Stay tuned on this!
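
For the curious, a rough sketch of what such a search call might look like, assuming the exa-js client; the includeDomains restriction is the part doing the “arXiv mode” work, and the query is just an example:

import Exa from "exa-js";

const exa = new Exa(process.env.EXA_API_KEY);

const results = await exa.searchAndContents("recent work on LLM tool use", {
  numResults: 5,
  // Restricting results to arxiv.org is roughly what an "arXiv mode" could be.
  includeDomains: ["arxiv.org"],
});

for (const result of results.results) {
  console.log(result.title, result.url);
}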

Robustness

A small note on robustness: chat-thyme has a relatively simple implementation of an in-memory cache for database connections, acting as a connection pool, with least recently used (LRU) eviction plus a time-to-live (TTL). There is also a simple in-memory message queue for each chat session to ensure user messages are processed in order. A rough sketch of the cache idea follows below.
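
This is not chat-thyme’s actual implementation, just a minimal sketch of an LRU + TTL cache along those lines; a Map preserves insertion order, which gives the LRU bookkeeping almost for free:

type Entry<V> = { value: V; expiresAt: number };

class LruTtlCache<K, V> {
  private entries = new Map<K, Entry<V>>();

  constructor(private maxSize: number, private ttlMs: number) {}

  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      // Expired entries are evicted lazily on access.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    // Evict the least recently used entry (first key in insertion order).
    if (this.entries.size >= this.maxSize) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}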

SQLite

This was probably the one choice I regret with this project - not because there’s anything wrong with SQLite, but because a NoSQL document database would probably have been the more correct choice for storing LLM chat responses.

Starting out, my initial idea was just dumping JSON files to disk - one output for each chat session for each user. The choice to persist chat session data to a database came from a desire for something more robust, and SQLite seemed sufficient for the current use case. At first, it seemed like all that needed to be stored for each message was the role and the content. Easy, right?

The problem came when I started implementing tool use in chat-thyme - now I needed to migrate the schema to add a column for tool_call_id. But wait, what about assistant tool calls themselves? Okay, let’s add another column…and so on and so forth. As the structure of responses can be fairly variable, I realized that I should have used a NoSQL document DB instead. It’s no wonder that OpenAI themselves use Cosmos DB.
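
The schema creep looks roughly like the sketch below, using bun:sqlite; the table and column names here are illustrative, not chat-thyme’s actual schema:

import { Database } from "bun:sqlite";

const db = new Database("chat-sessions.db");

// The original "easy" shape: just role and content per message.
db.run(`
  CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    role TEXT NOT NULL,
    content TEXT
  )
`);

// Then tool use arrives and the columns start piling on
// (each of these would need a guarded migration in practice).
db.run(`ALTER TABLE messages ADD COLUMN tool_call_id TEXT`);
db.run(`ALTER TABLE messages ADD COLUMN tool_calls TEXT`); // serialized JSON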

For now, the chat-thyme setup I have with SQLite works, but switching databases would definitely be a top priority for a 2.0 release.

Concluding Thoughts

In the end, I wanted to write this blog post not just to put some of my thoughts on the development process to paper, but also to take the opportunity to plug chat-thyme a bit more 😂. In the future, I still want to explore CLI and web UI interfaces - maybe I can try to integrate Open WebUI somehow. I definitely want to implement an “arXiv mode” for Exa search tool use, and I definitely want to migrate from SQLite over to a document DB for persistence - though the latter would be a rather large overhaul and might be a 2.0 thing.

Overall, this was a fun project to work on. I definitely need to take better care of my health though. There were probably diminishing returns in staying up late to make sure CI/CD worked smoothly - it would have been better to just get a full night’s sleep so I could be refreshed and make fewer typos the next day 😴.

If you’re reading this and haven’t had a chance to take a look at chat-thyme yet, please check it out at https://github.com/chilir/chat-thyme!