19 Apr 2025
Branching my simple converter script out into a full-fledged project.
The goal is to build a powerful document-to-Markdown converter that can handle various input formats and turn them into the cleanest Markdown possible, testing several approaches along the way.
Possibly leveraging AI for post-processing.
This project matters because it is the foundation for feeding clean data into AI workflows.
GitHub
Project code here:

Libraries
Here are some libraries to test.
Microsoft's MarkItDown
Currently implemented.
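Basic usage is a one-liner; here's a minimal sketch of how I'm calling it, following the package's documented basic API (the file path is a placeholder):

```python
from markitdown import MarkItDown

converter = MarkItDown()
# Convert a local document (PDF, DOCX, PPTX, XLSX, HTML, ...) to Markdown text
result = converter.convert("/path/to/input/document.pdf")
print(result.text_content)
```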

QuivrHQ/MegaParse

Pandoc - CLI document converter
https://pandoc.org/

Pandoc is a universal document converter that translates files between markup formats, including Markdown, HTML, LaTeX, and more. It also supports automatic citations and bibliographies, plus customization through templates and filters.
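Since Pandoc is a CLI tool, the easiest way to test it alongside the Python libraries is probably to shell out to it; a minimal sketch (paths are placeholders, and pandoc must be installed and on PATH):

```python
import subprocess

# Convert an HTML file to GitHub-Flavored Markdown with pandoc.
# --wrap=none avoids hard-wrapping lines in the output.
subprocess.run(
    ["pandoc", "input.html", "-f", "html", "-t", "gfm", "--wrap=none", "-o", "output.md"],
    check=True,
)
```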
ppt2desc: Convert PowerPoint files into semantically rich text using vision language models

MinerU: A High-Quality PDF-to-Markdown/JSON Converter Worth Checking Out

markpdfdown: A high-quality PDF to Markdown tool based on large language model visual recognition.
27 Jul 2025

E2M API: converts everything to Markdown (an LLM-friendly format).

Google LangExtract
It does not extract directly to Markdown, but it might be worth exploring for trustworthy, higher-quality structured output in JSONL, which could later be converted to Markdown.
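If the extraction lands in JSONL, the later conversion to Markdown is the easy part; a rough sketch, assuming a hypothetical record schema with `title` and `text` fields (the actual LangExtract output schema would need checking):

```python
import json

def jsonl_to_markdown(jsonl_path: str, md_path: str) -> None:
    # Each input line is one JSON record; 'title' and 'text' are assumed
    # field names, not LangExtract's real schema.
    with open(jsonl_path, encoding="utf-8") as src, \
         open(md_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue
            record = json.loads(line)
            dst.write(f"## {record.get('title', 'Untitled')}\n\n")
            dst.write(f"{record.get('text', '')}\n\n")
```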

Dots OCR - Multilingual Document Text Extraction
A state-of-the-art image/pdf-to-markdown vision language model for intelligent document processing.

# Code
website2md.py
14 Jan 2026
In order to feed a prospect's entire website into an LLM, I need to download the text content of the full site into a single Markdown (.md) file.
This version of the script works, but it downloads each web page to its own .md file (in a folder structure mirroring the website). I still need to add logic to concatenate everything into a single file while demoting the Markdown headings to keep the hierarchy; a rough sketch of that step follows the script below.
```python
#!/usr/bin/env python3
####################
# Download local copy of a website
import asyncio
import sys
import os
import re
from urllib.parse import urljoin, urlparse
from playwright.async_api import async_playwright
from bs4 import BeautifulSoup
from markdownify import markdownify as md

# MAIN
START_URL = "https://website-to-crawl.com"
OUT_DIR = "/path/to/output/folder"
MAX_PAGES = 5000  # safety limit

count = 0
count_total = 0


def clean_filename(url: str) -> str:
    parsed = urlparse(url)
    path = parsed.path.strip("/")
    if not path:
        path = "index"
    path = re.sub(r"[^\w\-\/]", "_", path)
    return path.rstrip("/") + ".md"


def extract_links(html: str, base_url: str, domain: str) -> set[str]:
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for a in soup.find_all("a", href=True):
        href = a["href"].strip()
        if href.startswith(("mailto:", "tel:", "#", "javascript:")):
            continue
        abs_url = urljoin(base_url, href)
        parsed = urlparse(abs_url)
        if parsed.netloc == domain:
            links.add(parsed.scheme + "://" + parsed.netloc + parsed.path)
    return links


def extract_main_content(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    # Remove junk tags
    for tag in soup(["script", "style", "noscript", "svg", "iframe"]):
        tag.decompose()
    # Remove header/footer/nav elements
    for tag in soup.find_all(["header", "footer", "nav"]):
        tag.decompose()
    # Remove common header/footer classes and IDs
    for selector in [
        {"id": re.compile(r"(header|footer|nav|menu|sidebar|cookie|banner)", re.I)},
        {"class_": re.compile(r"(header|footer|nav|menu|sidebar|cookie|banner|top-bar|bottom-bar)", re.I)},
        {"role": re.compile(r"(banner|navigation|contentinfo)", re.I)},
    ]:
        for tag in soup.find_all(**selector):
            tag.decompose()
    # Remove elements with common footer/header data attributes
    for tag in soup.find_all(attrs={"data-section": re.compile(r"(header|footer)", re.I)}):
        tag.decompose()
    # Prefer semantic main content containers
    main_content = (
        soup.find("main") or
        soup.find("article") or
        soup.find(id=re.compile(r"(main|content|primary)", re.I)) or
        soup.find(class_=re.compile(r"(main-content|page-content|entry-content|post-content)", re.I)) or
        soup.find(role="main") or
        soup.body
    )
    if not main_content:
        return ""
    # Final cleanup: remove any remaining nav-like elements inside main
    for tag in main_content.find_all(class_=re.compile(r"(breadcrumb|pagination|share|social)", re.I)):
        tag.decompose()
    return str(main_content)


async def crawl():
    global count, count_total
    os.makedirs(OUT_DIR, exist_ok=True)
    visited = set()
    to_visit = [START_URL]
    domain = urlparse(START_URL).netloc
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context()
        page = await context.new_page()
        while to_visit and len(visited) < MAX_PAGES:
            url = to_visit.pop(0)
            if url in visited:
                continue
            count_total += 1
            print(f"→ Fetching #{count_total}: {url}")
            visited.add(url)
            try:
                await page.goto(url, wait_until="networkidle", timeout=60000)
                html = await page.content()
            except Exception as e:
                print(f"  ! Failed: {e}")
                continue
            main_html = extract_main_content(html)
            markdown = md(main_html, heading_style="ATX", strip=["a"])  # Optional: strip links
            # Clean up excessive whitespace
            markdown = re.sub(r"\n{3,}", "\n\n", markdown).strip()
            outfile = os.path.join(OUT_DIR, clean_filename(url))
            os.makedirs(os.path.dirname(outfile), exist_ok=True)
            with open(outfile, "w", encoding="utf-8") as f:
                f.write(f"<!-- Source: {url} -->\n\n{markdown}")
            count += 1
            new_links = extract_links(html, url, domain)
            for link in new_links:
                if link not in visited:
                    to_visit.append(link)
        await browser.close()
    print(f"\n✅ Done! Saved {count}/{count_total} pages to {OUT_DIR}")


asyncio.run(crawl())
```
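Rough sketch of the missing concatenation step: walk the output folder, demote each page's headings by one level, and append everything to a single file under a heading derived from the page's relative path. Untested; paths are placeholders and the heading scheme may need tuning.

```python
#!/usr/bin/env python3
# Merge the per-page .md files produced by website2md.py into one file.
import os
import re

IN_DIR = "/path/to/output/folder"   # same as OUT_DIR in website2md.py
MERGED_FILE = "/path/to/website.md"

def demote_headings(markdown: str, levels: int = 1) -> str:
    # "# Title" -> "## Title", capped at h6, so each page's headings nest
    # under the per-page heading added below.
    return re.sub(
        r"^(#{1,6})(?=\s)",
        lambda m: "#" * min(6, len(m.group(1)) + levels),
        markdown,
        flags=re.M,
    )

with open(MERGED_FILE, "w", encoding="utf-8") as out:
    for root, _dirs, files in os.walk(IN_DIR):
        for name in sorted(files):
            if not name.endswith(".md"):
                continue
            path = os.path.join(root, name)
            rel = os.path.relpath(path, IN_DIR)
            with open(path, encoding="utf-8") as f:
                content = f.read()
            # One top-level heading per page, then the demoted page content.
            out.write(f"# {rel[:-3]}\n\n")
            out.write(demote_headings(content) + "\n\n")

print(f"Merged into {MERGED_FILE}")
```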