RAG is great for glossary-like, structured information. Arbitrarily chunking prose feels like turning otherwise good-quality text into garbage. Ideally, prose would be used to generate glossary-like documents (LLM-aided). Performance plays a role here (you'd need to process a lot of text up front), so maybe smaller models could be used?
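A minimal sketch of that preprocessing pass, assuming a small model can be prompted to emit "term: definition" lines (the prompt format, the parsing, and the function names here are all my assumptions, not any particular library's API; the model call itself is left out):

```python
# Sketch: turn prose into glossary-style entries before indexing.
# The actual small-model call is omitted; build_glossary_prompt() produces
# the input you'd send it, and parse_glossary() handles the assumed
# "term: definition" output format. Hypothetical names throughout.

def build_glossary_prompt(text: str) -> str:
    """Ask the model for one 'term: definition' line per key term."""
    return (
        "Extract the key terms from the text below and define each in one "
        "sentence. Output exactly one 'term: definition' line per term.\n\n"
        + text
    )

def parse_glossary(raw: str) -> dict[str, str]:
    """Turn the model's 'term: definition' lines into a dict of entries."""
    entries: dict[str, str] = {}
    for line in raw.splitlines():
        if ":" in line:
            term, _, definition = line.partition(":")
            entries[term.strip()] = definition.strip()
    return entries
```

Each parsed entry then becomes its own chunk, so retrieval hits self-contained definitions instead of arbitrary slices of prose, and a cheap model is plausible since the task is extraction rather than reasoning.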
This is something that has always bothered me about RAG. It seems fine for first-order relevance, like a search engine, but for knowledge there needs to be some kind of rumination stage where it revisits the entire corpus to find information with second-order relevance to round out its 'understanding'.
You might be able to approximate this by chunking, globbing the chunks together, and searching over those, as well as having the LLM summarize and extract data and indexing those items too.
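A toy sketch of that multi-granularity idea: index the raw chunks alongside globbed (merged) chunks and any LLM-produced summaries, then search all of them together. The keyword-overlap scoring below is a stand-in for embeddings, and every name is made up for illustration:

```python
# Index raw chunks, globbed chunks, and summaries side by side,
# then search across all granularities at once. Toy scoring only.

def chunk(text: str, size: int = 5) -> list[str]:
    """Split text into fixed-size word chunks (the 'arbitrary' step)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def glob_chunks(chunks: list[str], window: int = 2) -> list[str]:
    """Merge neighbouring chunks so wider context survives chunking."""
    return [" ".join(chunks[i:i + window])
            for i in range(0, len(chunks), window)]

def score(query: str, doc: str) -> int:
    """Toy relevance: count shared lowercase words (use embeddings for real)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def search(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the top-k corpus items by the toy score."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]
```

Usage would be something like `index = chunks + glob_chunks(chunks) + summaries`, then `search(query, index)`, so a hit can come from whichever granularity happens to capture the second-order connection.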