LLMs are getting smarter, and quantization algorithms are enabling them to be trained and served with ever smaller hardware resources. At the same time, the architectures and pipelines for building new multimodal LLMs and ensembles of collaborating LLMs are becoming increasingly complex. Tree of Thoughts is one example of such a custom architecture, and it shows strong potential to improve LLM accuracy significantly.
Building the data infrastructure for such customizable LLM architectures remains an open yet extremely important problem. We will discuss how such data architectures can be built at scale.