Multi-model orchestrator system to solve complex problems (Agora 3.0)

I built a multi-model orchestrator system to solve complex problems, on the premise that two or more heads think better than one. Watching the models correct each other also reveals the weaknesses of each one.

Following my previous article, where I had them debate who would be the next world soccer champion, many ideas came up about how to make them interact, with different dynamics and different objectives. So I decided to stop staring at Python code and CSV outputs and turned the system into a friendlier web app. I also increased the number of available models to seven: five are among the most powerful on the market, and two run locally, for now on my personal computer, using an NVIDIA RTX 4060 Ti graphics card I have at home.

The web app is very simple, but I would say it meets the objective and works as follows:

First, select the models you want to work with. Each of them has a direct API connection, except ChatGPT, which I connected through an Azure OpenAI environment.
Then select the mode you want to use. There are three:

a) Multi-model chat: You can hold a normal conversation with one or more models at the same time. When you choose more than one, every selected model reads your message and the conversation history, and one of them takes the synthesizer role to make your life easier as a user.

b) Debate mode: One of my favourites and the reason I started this. You define a topic and a framework, and the system assigns roles to the participating models so the debate covers different angles. The debate runs in rounds, with a summary created at the end of each one; when the maximum of four rounds is reached, a global summary of the debate and its conclusions is generated.

c) Cooperative mode: The newest one. Instead of having them debate, I have them work as a team toward the same goal. You define a framework, and I also added an option to inject information through txt, doc, pdf files, etc., so each model has enough context on the topic you want to solve and the research doesn’t come out flat. When you call the models through their APIs from Python, they don’t get internet access out of the box the way they do on their own platforms, so I had to work out a way for each of them to search for information every round and answer with current context and proper backing.
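To make the round structure of the debate mode in (b) more concrete, here is a minimal sketch of how such a loop could look. It is not the app's actual code: `run_debate`, the `Model` type, the role names, and the prompt wording are all hypothetical, and the models are replaced by plain stub functions so the example runs offline without any API keys.

```python
from typing import Callable, Dict, List

MAX_ROUNDS = 4  # the article states a maximum of four rounds

# Hypothetical abstraction: a "model" is any function that maps a prompt to text.
Model = Callable[[str], str]


def run_debate(topic: str, models: Dict[str, Model], roles: List[str],
               summarizer: Model, max_rounds: int = MAX_ROUNDS) -> dict:
    """Run a round-based debate; return per-round summaries and a global summary."""
    # Assign roles round-robin so each model argues from some angle (an assumption).
    assigned = {name: roles[i % len(roles)] for i, name in enumerate(models)}
    history: List[str] = []
    round_summaries: List[str] = []

    for round_no in range(1, max_rounds + 1):
        round_turns: List[str] = []
        for name, model in models.items():
            prompt = (f"Topic: {topic}\nYour role: {assigned[name]}\n"
                      f"Round {round_no}. Debate so far:\n" + "\n".join(history))
            turn = f"[{name} as {assigned[name]}] {model(prompt)}"
            history.append(turn)       # later speakers can read earlier turns
            round_turns.append(turn)
        # One summary at the end of every round.
        round_summaries.append(
            summarizer("Summarize this round:\n" + "\n".join(round_turns)))

    # After the last round, one global summary built from the round summaries.
    global_summary = summarizer(
        "Summarize the debate and its conclusions:\n" + "\n".join(round_summaries))
    return {"roles": assigned,
            "round_summaries": round_summaries,
            "global_summary": global_summary}


# Offline usage with stub models (real API calls would replace these lambdas).
stub_models = {"model_a": lambda p: "argument A", "model_b": lambda p: "argument B"}
result = run_debate("Who wins the next World Cup?", stub_models,
                    roles=["advocate", "skeptic"],
                    summarizer=lambda p: "summary: " + p.splitlines()[0])
print(result["global_summary"])
```

Keeping the models behind a plain function type is only one possible design; it makes it easy to mix hosted APIs and local models in the same loop, since each just needs a wrapper that takes a prompt and returns text.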

As a concrete example of the cooperative mode, I asked it to investigate when a multi-model architecture really pays off in business processes versus a single well-used model. The result was a decision framework with matrices and clear rules. Multi-model pays off only when four conditions converge: costly errors, real specialization between models, the need for adversarial review, and the operational maturity to measure the results. For most workflows, a single model with a good prompt is enough. The fact that the system itself reached this conclusion suggests that the cases are very specific, mainly in large organizations, where this kind of multi-model development can be useful and even necessary.
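As a rough illustration, the four-condition rule can be written as a simple check. This is only a sketch of the rule as stated above, not the framework the system produced: the `Workflow` class, its field names, and `multi_model_pays_off` are hypothetical labels I am using for the four conditions.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical description of a business workflow under evaluation."""
    costly_errors: bool             # mistakes in this process are expensive
    real_specialization: bool       # the models genuinely differ in strengths
    needs_adversarial_review: bool  # outputs benefit from being challenged
    operational_maturity: bool      # the team can measure the outcome


def multi_model_pays_off(w: Workflow) -> bool:
    """Multi-model is justified only when all four conditions hold together."""
    return all((w.costly_errors, w.real_specialization,
                w.needs_adversarial_review, w.operational_maturity))


# A workflow that meets every condition, and one that misses a single condition.
critical_review = Workflow(True, True, True, True)
routine_report = Workflow(True, True, True, False)
print(multi_model_pays_off(critical_review), multi_model_pays_off(routine_report))
```

The point of the `all(...)` is that the conditions are conjunctive: missing any one of them puts the workflow back in the "single model with a good prompt" category.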

What does this app give me?

Better knowledge of the architecture needed to build this kind of application: Python, Azure, SQL Server, APIs. Plus refreshing some basic knowledge of HTML/CSS and JS.
Understanding that there’s no perfect model: we already see them “hallucinate” from time to time, and it becomes even more evident when they interact with each other.
Once you’ve had them work in different ways, and on top of that made each of them able to read the others through a SQL Server table that simulates persistent memory, you realize you have a virtual super-team working for you, and the range of possible uses is very broad.
I now think the next step is to go back to building agent systems, where they interact beyond a conversational sense, with the goal of simplifying a process. I already have several ideas for it.
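The shared-table idea mentioned above, where each model reads what the others wrote, can be sketched in a few lines. Note the assumptions: the app uses SQL Server, but this example swaps in Python's built-in `sqlite3` so it runs anywhere, and the `messages` table, its columns, and the helper names are hypothetical, not the app's real schema.

```python
import sqlite3
from typing import List, Tuple


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the shared table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               model TEXT NOT NULL,
               content TEXT NOT NULL
           )"""
    )
    return conn


def post_message(conn: sqlite3.Connection, session_id: str,
                 model: str, content: str) -> None:
    """Persist one model's output so the other models can read it later."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (session_id, model, content) VALUES (?, ?, ?)",
            (session_id, model, content),
        )


def read_others(conn: sqlite3.Connection, session_id: str,
                model: str) -> List[Tuple[str, str]]:
    """Return what every other model wrote in this session, oldest first."""
    return conn.execute(
        "SELECT model, content FROM messages "
        "WHERE session_id = ? AND model <> ? ORDER BY id",
        (session_id, model),
    ).fetchall()


# Usage: two models write to the same session, then one reads the other.
conn = open_memory()
post_message(conn, "s1", "model_a", "First position")
post_message(conn, "s1", "model_b", "Counterpoint")
print(read_others(conn, "s1", "model_a"))
```

Because the history lives in a table rather than in process memory, it survives between requests, which is what makes it behave like persistent memory for the models.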

My personal take on this kind of development in mid-size and large companies is that it is being limited to tech teams, when in reality every team should be preparing to have someone who can build apps like this for very specific cases in each department. Not long ago, companies hired analysts to build macros in Excel/Access or to put together dashboards. So why not hire people to solve specific problems for each team? I think the answer is that the human capital needed to implement this is still very expensive, and so is access to the tools needed to put this architecture together, so only mid-size and large companies really have the budget to invest in it. And within those companies, senior executives still don’t have a clear picture of how to adopt all this, so they’re leaving it to the “IT guys”.

View in GitHub
