Related tags
ocr, pdf-parser, kie, document-translation, rag, chineseocr, ai4science, pp-ocr, document-parsing, pp-structure

Here are 157 public repositories matching this topic...

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

  • Updated Jun 4, 2026
  • HTML

Hybrid RAG system combining vector search, knowledge graph (LightRAG), and cross-encoder reranking — with Docling document parsing, visual intelligence (image/table captioning), agentic streaming chat, and inline citations. Powered by Gemini or local Ollama models.

  • Updated Apr 20, 2026
  • Python