Welcome to pykomodo's Documentation!
=====================================

GitHub
-------

.. image:: https://img.shields.io/badge/GitHub-View%20on%20GitHub-blue?logo=github
   :target: https://github.com/duriantaco/pykomodo
   :alt: View on GitHub

Introduction
-------------

Welcome to pykomodo, a Python-based parallel file chunking system. Our goal is to convert massive codebases or mixed-file directories into bite-sized, LLM-ready chunks. You get semantic chunking for Python, PDF text extraction, ignore/unignore patterns, and multi-threaded speed. Whether you're prepping a dataset for machine learning or just organizing chaos, pykomodo's got your back.

**Key Features:**

- Parallel (multi-threaded) processing
- File filtering with custom ignore/unignore patterns
- Chunking styles: equal splits, size caps, semantic (AST-based), and PDF-specific
- LLM-oriented options such as metadata and deduplication
- Dry-run mode to test your setup

Contents
---------

.. toctree::
   :maxdepth: 2
   :caption: Table of Contents

   quickstart
   installation
   usage
   chunking_guide
   cli_reference
   api_reference
   contribution
   troubleshooting

Indices and tables
-------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`