10.10.2025

Check the container! – Solr is moving


Solr configuration set from the TYPO3 extension “solr” by dkd

The Apache Solr search server and its integration into TYPO3 have long been part of the technology stack in the majority of our customer projects. Solr offers a powerful full-text search with phonetic matching that can index not only the editorial page content in TYPO3, but also news, glossary entries, and any other record types. Technically, the search consists of two parts: the actual search server, a Java application based on the Lucene library, and a TYPO3 extension based on the PHP library Solarium, which communicates with the search server and pushes data changes from TYPO3 to it. This keeps the search index up to date. The architecture can cause confusion because in conversation it is often unclear whether “Solr” means the Apache Solr search server or the TYPO3 extension “solr.” The distinction matters in particular when it comes to version numbers, since the two are versioned independently.

The language of the content is, of course, crucial to how Apache Solr works, because a number of parameters that influence the behavior of the search depend on it. For example, Solr applies automatic stemming, which reduces plural forms such as “trees” to the singular “tree.” It also takes phonetic similarity into account, so that a search for “Stephan” also finds entries with “Stefan.” To make this possible, at least one so-called core is created in Solr for each language. A core behaves similarly to a MySQL database and stores data in exactly one defined language, each with its own phonetic rules and stop word list.

We also create separate cores for TYPO3 installations with multiple sites, even if the language is otherwise the same. This has advantages when it comes to non-public areas such as style guides or playgrounds.

The running Solr container of our own website as displayed in mStudio

Solr and Mittwald

Many of our projects are hosted by our partner Mittwald. From the outset, we have relied on Mittwald's hosted Solr service, which operates the Solr search server on Mittwald's infrastructure. This service, however, runs on Mittwald's older (non-cloud-based) infrastructure. Although existing services remain available, that infrastructure is gradually being phased out, and new plans can no longer be booked there. A new solution was therefore needed.

Our customer projects at Mittwald, by contrast, run on the newer cloud infrastructure, which is managed via the mStudio management interface. One of the newer features available there is container hosting, which allows a project to start Docker containers and access them. Since Solr is available as a Docker image, it made sense for us to run Solr as a container alongside the projects in order to continue providing the functionality. This also has the advantage that the search server runs closer to the application, which reduces latency and promises slightly higher performance overall. In addition, booking separate Solr packages has always incurred additional costs; with container hosting, these costs are significantly lower.

Our general approach

The idea of moving the Solr instances was thus born. The next step was to develop a way to recreate the various Solr plans as containers. Subsequently, each affected TYPO3 installation had to be given the new URL for the Solr server and new credentials, and then completely reindexed once so that the new Solr search server could be filled with content. We took the opportunity to upgrade all Solr instances to the latest software version at the same time.

Operating Solr as a container involves a little extra work. For example, we have to make sure that the configuration required by TYPO3 is loaded into the search server, and the required Solr cores must be created. To do this, we start from the official Solr Docker image and store several XML files in its data directory, containing the field configuration and the basic configuration of Solr itself.
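Whether the required cores actually exist after the container has started can be checked with Solr's CoreAdmin API. The following is a minimal sketch in Python, not part of our setup: the function names are hypothetical, and it assumes that the Solr instance is reachable under a base URL and protected with HTTP Basic authentication.

```python
import base64
import json
import urllib.request
from typing import Dict, Iterable, List


def core_status_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build a CoreAdmin STATUS request with an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = base_url.rstrip("/") + "/solr/admin/cores?action=STATUS&wt=json"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def missing_cores(status_response: Dict, expected: Iterable[str]) -> List[str]:
    """Return the expected core names that the STATUS response does not report.

    A core counts as present only if its status entry carries a "name",
    because Solr answers with an empty entry for an unknown core.
    """
    status = status_response.get("status", {})
    return [name for name in expected if not status.get(name, {}).get("name")]


def check_cores(base_url: str, user: str, password: str, expected: Iterable[str]) -> List[str]:
    """Query the running Solr instance and list the cores that are still missing."""
    request = core_status_request(base_url, user, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return missing_cores(json.load(response), expected)
```

Calling `check_cores` with the host name of the container and the list of expected core names returns an empty list once everything has been set up. Only this function needs network access; the other two can be tested offline.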

We are already successfully using Terraform to automatically manage part of our infrastructure. Fortunately, there is a Terraform provider for mStudio that allows us to programmatically create resources there. This will enable us to later place the Solr administration interface on internally used host names and make it accessible, including authentication. This approach provides us with infrastructure as code and creates a solution that we can continue to use in the long term, even when new site languages – and thus Solr cores – are added.

Mittwald itself has recognized the need to create Solr instances programmatically on the cloud infrastructure and provides a corresponding Terraform module that shows one possible way to implement this. The module starts a container based on the official Solr Docker image and then adds the configuration set from the repository of the TYPO3 extension “solr,” which means the configuration set can also be updated later. As a result, a typical project configuration now looks like this:

locals {
  # UUID of the Mittwald project in which the container is created
  mittwald_project_id = "02159b81-c0bb-4f62-ae57-8dd2e6bd5113"

  solr_version  = "9.9"
  solr_heap     = "1g"
  solr_hostname = "solr.live.mfc.invalid"

  # Users with access to Solr; each value is a salted password hash
  solr_users = {
    cs = "9PxfZq9si08ebsxB1pwOylsB6uw3nCAkKz4dlKPzfws= ZGtkNHhyNWE5czAxN2xhZA=="
    mp = "2/JqksIj3gA3lXgA7E9BuW9D4t8Ki+x/yWLRUGcFFBY= bjljYzdzbnYwNGVxcTB5ag=="
    sk = "wE0pHBx3d0osQUP2i2LSDiGeqfJhSuc1viaWB5+y6Ao= cWIwZTY3cmNkbDh2aXB1Ng=="

    # User that TYPO3 itself uses
    mfd = "Slw0Dts2bGMS7hKEyuXSxYMpahP8eaAF/3wAXyluwiQ= ZHJlcHlhcGwwcHlpZXBvcQ=="
  }

  # Solr cores to set up, each with its language
  solr_cores = {
    "www_marketing_factory_de_de_de" = {
      language = "german"
    }
    "www_marketing_factory_com_en_us" = {
      language = "english"
    }
    "sg_marketing_factory_de_de_de" = {
      language = "german"
    }
    "sg_marketing_factory_com_en_us" = {
      language = "english"
    }
  }
}

It defines the Solr version, the Solr cores to be set up together with their language, and the users who should have access. The last user in the list is the one that TYPO3 itself uses. The first entry specifies the Mittwald project in which everything is created: the project ID, a UUID that uniquely identifies each project in the cloud infrastructure.

Security considerations

One aspect must not be forgotten: security. The Solr instances previously provided by Mittwald were operated behind a reverse proxy, which also handled authentication via HTTP Basic. The Solr container itself does not currently provide this functionality out of the box. TYPO3 communicates with the container via an internal, private network, so this is not an issue there. However, you will usually want to access the Solr admin interface from the outside as well. Since the cloud infrastructure has no built-in option to enable authentication for domains (Mittwald delegates this task to the underlying application), we have enabled authentication in Solr itself.

This is done using a security.json file, which is stored in the Solr data directory. The file would even allow fine-grained role and rights management, but since we do not need this level of complexity, we have decided not to use it. The user passwords are stored as salted hashes. There are useful online tools that generate the required format, including the salt, locally via JavaScript in the browser. Each user can then hash their own password and commit only the hash to the configuration repository, so the plain-text password remains confidential.
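If you prefer not to use an online tool, the hash can also be generated locally. The following is a minimal sketch in Python, assuming the format used by Solr's BasicAuthPlugin: the SHA-256 hash of the SHA-256 hash of salt plus password, base64-encoded, followed by a space and the base64-encoded salt. The function names are our own and purely illustrative.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Optional


def solr_credential(password: str, salt: Optional[bytes] = None) -> str:
    """Return "<base64 hash> <base64 salt>" for use in security.json.

    Assumed scheme: sha256(sha256(salt + password)), as used by Solr's
    BasicAuthPlugin. A random 32-byte salt is generated if none is given.
    """
    if salt is None:
        salt = secrets.token_bytes(32)
    inner = hashlib.sha256(salt + password.encode("utf-8")).digest()
    outer = hashlib.sha256(inner).digest()
    return f"{base64.b64encode(outer).decode('ascii')} {base64.b64encode(salt).decode('ascii')}"


def verify(password: str, credential: str) -> bool:
    """Check a password against a stored credential by re-hashing with its salt."""
    _, salt_b64 = credential.split(" ")
    candidate = solr_credential(password, base64.b64decode(salt_b64))
    return hmac.compare_digest(candidate, credential)


if __name__ == "__main__":
    import getpass

    # Read the password without echoing it and print only the salted hash.
    print(solr_credential(getpass.getpass("Solr password: ")))
```

Because the salt is random, two runs with the same password produce different values. `verify` can be used to confirm that a committed hash still matches a known password.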

In the example, the security.json file looks as follows (the hashes are not real, of course, and were newly generated for this example 😉):

{
  "authentication": {
    "blockUnknown": true,
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "cs": "9PxfZq9si08ebsxB1pwOylsB6uw3nCAkKz4dlKPzfws= ZGtkNHhyNWE5czAxN2xhZA==",
      "mfd": "Slw0Dts2bGMS7hKEyuXSxYMpahP8eaAF/3wAXyluwiQ= ZHJlcHlhcGwwcHlpZXBvcQ==",
      "mp": "2/JqksIj3gA3lXgA7E9BuW9D4t8Ki+x/yWLRUGcFFBY= bjljYzdzbnYwNGVxcTB5ag==",
      "sk": "wE0pHBx3d0osQUP2i2LSDiGeqfJhSuc1viaWB5+y6Ao= cWIwZTY3cmNkbDh2aXB1Ng=="
    },
    "forwardCredentials": false,
    "realm": "Solr administration",
    "scheme": "basic"
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {
        "name": "security-edit",
        "role": "admin"
      }
    ],
    "user-role": {
      "cs": "admin",
      "mfd": "admin",
      "mp": "admin",
      "sk": "admin"
    }
  }
}
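How the Terraform module builds this file internally is not covered here. To illustrate the relationship between the user map in the project configuration and the resulting file, the following Python sketch renders an equivalent security.json from a dictionary of user names and hashes. The function name is hypothetical, and every user is assigned the admin role, as in the example above.

```python
import json
from typing import Dict


def render_security_json(credentials: Dict[str, str], realm: str = "Solr administration") -> str:
    """Render a security.json document that gives every listed user the admin role."""
    users = sorted(credentials)
    config = {
        "authentication": {
            "blockUnknown": True,  # reject requests without valid credentials
            "class": "solr.BasicAuthPlugin",
            "credentials": {user: credentials[user] for user in users},
            "forwardCredentials": False,
            "realm": realm,
            "scheme": "basic",
        },
        "authorization": {
            "class": "solr.RuleBasedAuthorizationPlugin",
            "permissions": [{"name": "security-edit", "role": "admin"}],
            "user-role": {user: "admin" for user in users},
        },
    }
    return json.dumps(config, indent=2)
```

Sorting the user names keeps the output stable, so that adding a user produces a small, readable diff in the configuration repository.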

Verdict

All previous Solr plans have now been migrated, and the migration went smoothly. The new setup allows us to run the services related to a project together with the project itself on the same infrastructure. Using Terraform has the additional advantage that we can also automate related tasks such as entering the IP addresses for access to the Solr admin interface, provided that a Terraform provider exists for the respective domain registrar. This further reduces manual work and makes the overall process less error-prone.

With container hosting, we can now also move other services that currently run alongside the projects, in addition to Solr, provided they are available as Docker images. We already have some ideas in this regard, so stay tuned!

Christian Spoo

"Mr. Fix-It" likes to impose his will on software and hardware. Speaks fluent meme and picdump. Responsible for development and technical design at Marketing Factory.


Picture Credits
  1. "The magnifying glass sits next to the laptop on a table": MJ Duford / License: Unsplash License
  2. Picture: © Christian Spoo / Marketing Factory Digital GmbH
  3. Picture: © Christian Spoo / Marketing Factory Digital GmbH