
Using Our TYPO3 Extension ai_filemetadata with Mittwald’s AI Hosting
When we first developed our ai_filemetadata extension, we relied on the OpenAI API because it was one of the earliest stable and widely accessible options.
Today, multiple providers offer hosted open-source LLM models. One of them is our partner Mittwald, which recently launched its AI-Hosting. At the time of writing, the service is still in beta.
In this article, we show how to connect and use our extension with Mittwald’s AI Hosting — and how the results compare to OpenAI.

Requirements
To follow this guide, you will need:
- A TYPO3 installation v12+ with the extension ai_filemetadata 1.3.5+
- An mStudio account with an active project at Mittwald
If you don’t have one yet, you can create a project during registration. Once the project is active, AI Hosting can be enabled.
Setting Up AI Hosting
Inside your project dashboard under Components, activate AI Hosting by creating an API key. We recommend enabling the Open WebUI container at the same time for easier testing.

Once logged in, select a model and run a quick test.
The documentation provides guidance on the available models and their strengths.
TYPO3 Extension Configuration
In TYPO3 → Settings → Extension Configuration → ai_filemetadata, enter:
| Extension Parameter | Configuration value | 
|---|---|
| apiBaseUri | Base-URLfrom mittwald Backend | 
| apiKey | API-Keyfrom mittwald Backend | 
| projectId | Key-Namefrom mittwald Backend | 
| model | Mistral-Small-3.2-24B-Instruct | 
During our tests, we occasionally encountered issues with very large images. Therefore, we recommend setting the value for imageResizing. We use 1024 so that TYPO3 scales images down to a maximum of 1024×1024 pixels before they are passed to the LLM.
[Translate to English:] Vergleich der Ergebnisse
[Translate to English:] Für unseren Test haben wir sämtliche Alt-Texte der Bilder unserer Website in Deutsch und Englisch sowohl mit OpenAI (gpt-4o-mini) als auch mit Mittwald (Mistral-Small-3.2-24B-Instruct) erzeugen lassen und anschließend inhaltlich verglichen.
Der Test war zugleich ein kleiner Benchmark für das mittwald KI-Hosting. 649 Bilder mit jeweils zwei Alttexten wurden in etwa zweieinhalb Stunden verarbeitet, das ist die gleiche Größenordnung wie bei openAi.
Wie zu erwarten, unterscheiden sich die Ergebnisse. Schon bei wiederholten Aufrufen mit demselben Modell erhält man leicht unterschiedliche Beschreibungen. Das ist normal.
Fazit
Beide Modelle liefern für unseren Zweck gut nutzbare Ergebnisse: Beschreibende Alt-Texte für Menschen mit Sehbeeinträchtigung. Keine Beschreibung war vollständig falsch oder unbrauchbar. Es gibt daher keinen zwingenden Grund, ein bestimmtes Modell zu bevorzugen.
Beim genaueren Hinsehen fällt auf, dass gpt-4o-mini auf einen größeren Wissensbestand zugreift. Unser neues Bürogebäude wird dort korrekt als „Arkadengebäude“ erkannt, während Mistral schlicht „modernes Bürogebäude“ schreibt.
OpenAI ergänzt gelegentlich kleine Details, etwa den Hinweis, dass es sich um einen Screenshot handelt. Solche Zusatzinformationen sind für die Barrierefreiheit jedoch nicht wesentlich.
Result comparison (examples)
| gpt-4o-mini | Mistral-Small-3.2-24B-Instruct | 
|---|---|
| The image displays a web interface for managing a Solr container. It shows the hostname, short ID, last activity date, and image details, including the specific version of Solr being used. Action buttons for recreating or stopping the container are also present. | A black magnifying glass lies on a light-colored surface next to a partially visible closed laptop. The magnifying glass symbolizes investigation or scrutiny, often associated with research, analysis, or problem-solving. | 
| The photo shows the Arcade Building of the Factory Campus in Düsseldorf with large windows, surrounded by manicured greenery and bushes. The sky is mostly blue with a few clouds. | A modern, multi-story office building with large glass windows and white columns. The building is surrounded by a well-maintained lawn with trees and shrubs. The sky is partly cloudy. The structure has a sleek, contemporary design. | 
| Screenshot showing the extension with an english text | A screenshot of a software interface displaying file metadata. The interface includes fields for editing metadata such as alternative text, categories, and picture credits. The file shown is an image named "screenshot_ai_filemetadata_text_aus_typo3_12_backend.png. | 
| The image contains a form interface for editing image metadata, highlighting fields for image title, creator's name, and relevant links. The primary focus is on the input sections related to an image titled "Skyline von Düsseldorf" and the creator "Michael Mustermann." | Skyline of Düsseldorf showcasing the city's prominent buildings and architecture. The image captures the urban landscape, highlighting the city's distinctive skyline. | 
| A workflow pipeline is displayed, showing various job stages: 'prepare,' 'analysis,' 'build,' and 'deployment.' Each stage lists specific jobs with checkmarks indicating completion, such as 'collect-build-dependencies' and 'frontend-build,' while some jobs are in progress. | The image shows a CI/CD pipeline with four stages: prepare, analysis, build, and deployment. Each stage contains multiple jobs, such as collecting build dependencies, running code analysis tools, building the frontend, and deploying the application. All jobs have been successfully completed, indicated by green checkmarks. | 
| The image displays a grid of line charts illustrating HTTP average latency metrics for different IP addresses over time. Each chart features time on the horizontal axis and latency in milliseconds on the vertical axis, with current latency values highlighted for each IP. | The image displays twelve graphs showing HTTP average latency measurements over time. Each graph represents different websites or endpoints, with latency values ranging from milliseconds to seconds. The graphs illustrate fluctuations in latency, indicating varying response times for each endpoint. | 
| The image depicts miniature crime scene investigators in white suits examining a yellow smiley face with a red mark on it. They are using tools and equipment in a wooden environment, suggesting a humorous take on a serious investigation. | A group of forensic investigators in white suits examine a large yellow emoji with a sad face and a red tongue, which appears to be lying on the ground. The scene is set up like a crime investigation, with equipment and tools around the emoji. | 
Please feel free to share this article.

Comments
No comments yet.