19 Apr 2025
I am branching my simple converter script out into a full-fledged project.
The goal is to build a powerful Markdown parser that can handle various input formats and convert them into the cleanest Markdown possible, testing several approaches along the way.
AI may be used for post-processing.
This project matters because clean Markdown is the foundation for feeding clean data into AI workflows.
GitHub
Project code here:

Libraries
Microsoft's MarkItDown

QuivrHQ/MegaParse

Pandoc - CLI document converter
https://pandoc.org/

Pandoc is a universal document converter that translates files between markup formats, including Markdown, HTML, LaTeX, and more. It also supports automatic citations and bibliographies, and it can be customized through templates and filters.
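As a rough sketch of how Pandoc could slot into this project, the snippet below wraps the CLI from Python. The function names `build_pandoc_command` and `convert_to_markdown` are hypothetical helpers of mine, and choosing `gfm` (GitHub-Flavored Markdown) as the target is an assumption; `--to` and `--output` are standard Pandoc options.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(src: str, dst: str, to_format: str = "gfm") -> list[str]:
    """Build the argument list for a Pandoc conversion (hypothetical helper).

    Pandoc infers the input format from the extension of `src`;
    `--to` selects the output format and `--output` the target file.
    """
    return ["pandoc", src, "--to", to_format, "--output", dst]


def convert_to_markdown(src: str, dst: str) -> bool:
    """Run Pandoc if it is installed; return True when the output file exists."""
    if shutil.which("pandoc") is None:
        return False  # Pandoc is not on PATH, so nothing is converted
    subprocess.run(build_pandoc_command(src, dst), check=True)
    return Path(dst).exists()
```

Keeping the command construction separate from the call makes it easy to inspect or log the exact command before running it, for example `build_pandoc_command("report.docx", "report.md")`.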
ppt2desc: Convert PowerPoint files into semantically rich text using vision language models

The notes below summarize the project's README on GitHub.
Project Overview
This project aims to convert PowerPoint (.pptx) files into semantically rich text using Vision Language Models (VLLMs). The goal is to provide detailed descriptions of the content in each slide, including text, images, diagrams, and other elements.
How It Works
- LibreOffice Conversion: The project uses LibreOffice to convert PowerPoint (.pptx) files into a format that can be processed by the vision language models. This step is crucial for handling the complex structures and layouts found in presentation slides.
- Vision Language Models (VLLMs): After conversion, the project employs VLLMs to analyze the content of each slide. These models can understand both textual and visual elements within an image or document and generate detailed descriptions of what they "see."
- JSON Output: The output of the VLLM analysis is structured as JSON. Each JSON file contains information about the presentation deck, including the model used for analysis, and a detailed description of each slide's content.
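To make the steps above concrete for my own converter, here is a minimal sketch of the two ends of such a pipeline. It is not ppt2desc's actual code: the JSON field names (`deck`, `model`, `slides`, `number`, `content`) are my guesses at a plausible shape, PDF as the intermediate format is an assumption, and both function names are hypothetical. The `soffice` flags shown are standard LibreOffice headless options.

```python
import json


def build_soffice_command(pptx_path: str, out_dir: str) -> list[str]:
    """Headless LibreOffice command converting a deck to PDF (assumed target format)."""
    return ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, pptx_path]


def deck_to_markdown(deck: dict) -> str:
    """Flatten a per-deck JSON description into Markdown, one section per slide."""
    lines = [f"# {deck['deck']}", "", f"Model: {deck['model']}", ""]
    for slide in deck["slides"]:
        lines.append(f"## Slide {slide['number']}")
        lines.append("")
        lines.append(slide["content"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical output of the vision-model step; the schema is an assumption.
example = {
    "deck": "quarterly.pptx",
    "model": "example-vision-model",
    "slides": [
        {"number": 1, "content": "Title slide introducing the quarterly review."},
        {"number": 2, "content": "Bar chart comparing revenue across four regions."},
    ],
}

# Round-trip through JSON text, as the data would arrive from a file on disk.
markdown = deck_to_markdown(json.loads(json.dumps(example)))
```

Rendering the JSON into Markdown this way would let slide descriptions flow into the same clean-Markdown output as the other converters.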