
RedactManager Documentation

Requirements

The main prerequisite is a working, well-prepared Kubernetes environment. As not all companies run a Kubernetes environment yet, we are providing a link to a website that explains in detail what Kubernetes is and what is required to set up, run, and maintain a k8s environment: https://kubernetes.io/docs/setup/production-environment/

Kubernetes can be run on-premises or used as a cloud service, e.g., provided by Google, AWS, or Azure.

There are also tools for managing Kubernetes clusters, like Rancher and Red Hat OpenShift. For further information, please see here:

For evaluation environments, we recommend choosing a cloud service (we have extensive experience with Azure Kubernetes Service, AKS) or a Rancher-orchestrated Kubernetes cluster.

To run a minimum Rancher server, you need at least 1 Linux server with at least 1 CPU and 4 GB RAM. You will also need full (root) permissions on the server to configure it and an internet connection to install software from online repositories.

  • Linux operating system

  • 1 vCPU

  • 4 GB RAM

  • 100 GB free disk space

  • Internet connection
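The minimum Rancher server specification above can be checked on a candidate Linux host before installation. The following Python sketch is only an illustration and is not part of the Rancher or RedactManager tooling: the function names are hypothetical, and checking the root filesystem (`/`) for free disk space is an assumption. A machine sold with 4 GB RAM usually reports slightly less usable memory, so treat a narrow miss as a reason to look closer rather than a hard failure.

```python
import os
import shutil

# Minimums taken from the list above: 1 vCPU, 4 GB RAM, 100 GB free disk space.
MIN_VCPU = 1
MIN_RAM_GB = 4
MIN_FREE_DISK_GB = 100


def unmet_minimums(vcpu, ram_gb, free_disk_gb):
    """Return a list of shortfall messages; an empty list means all minimums are met."""
    problems = []
    if vcpu < MIN_VCPU:
        problems.append(f"vCPU: have {vcpu}, need {MIN_VCPU}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: have {ram_gb:.1f} GB, need {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Disk: have {free_disk_gb:.1f} GB free, need {MIN_FREE_DISK_GB} GB")
    return problems


def probe_local_host(path="/"):
    """Read CPU count, physical RAM, and free disk space of this Linux host."""
    gib = 1024 ** 3
    vcpu = os.cpu_count() or 0
    # SC_PAGE_SIZE and SC_PHYS_PAGES are available on Linux.
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free  # assumption: data lives under `path`
    return vcpu, ram_bytes / gib, free_bytes / gib


if __name__ == "__main__":
    for line in unmet_minimums(*probe_local_host()) or ["All minimums met."]:
        print(line)
```

The check only covers the numeric minimums; root permissions and the internet connection still need to be verified separately.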

The Rancher server is required for managing k8s clusters. You need additional servers (nodes) to build the k8s cluster. The required resources are described in the respective product sections below.

Please consult the Rancher documentation for further details if you want to run a Rancher server in production.

To deploy RedactManager, the k8s cluster should be able to retrieve the Docker images of our services from our Docker registry: docker.iqser.com (port 5000)
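Before deploying, it can help to confirm that the registry is reachable from the network where the cluster nodes run. The following Python sketch is a minimal illustration, assuming a plain TCP connection test is enough as a first check; the function name `can_connect` is hypothetical. It does not verify registry credentials or that an image pull would succeed.

```python
import socket

# Host and port as named in the text above.
REGISTRY_HOST = "docker.iqser.com"
REGISTRY_PORT = 5000


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    ok = can_connect(REGISTRY_HOST, REGISTRY_PORT)
    print(f"{REGISTRY_HOST}:{REGISTRY_PORT} reachable: {ok}")
```

Run the script from a cluster node (or a machine in the same network segment), since a result from an operator workstation says little about the cluster's own firewall and proxy rules.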

The operator also needs a Linux computer with the following command-line interfaces:

And, of course, that computer needs to be able to access the following destinations:

RedactManager requires resources like processing power (CPU) and available memory on the Kubernetes cluster. The application also requires external resources such as hostnames (FQDNs, fully qualified domain names), networking and routing, and TLS certificates to encrypt the communication.

Please find below the minimum requirements for installing RedactManager:

Table 1. EVALUATION ENVIRONMENT

K8s version: 1.19.x or newer

Resources:

  • 4 cluster nodes

  • 20 vCPU

  • 60 GB RAM

Storage:

Databases

3 databases in a PostgreSQL DBMS

  • User DB: 1 GB

  • Main DB: 1 GB

  • Data DB: 10 GB

Either provide storage as available space on the k8s cluster for Persistent Volumes (PVs) or provide access to 3 external databases.

If run in a cloud environment, the cloud providers also provide PostgreSQL services, e.g. Azure Database for PostgreSQL, Amazon RDS for PostgreSQL, or Amazon Aurora.

Index

In case of a local deployment, you can use the integrated Elasticsearch cluster that requires 5 PVs:

3x 5 GB available space for PVs

In case of a cloud deployment, you can consider using the managed Elastic Cloud that is available via https://www.elastic.co/ or the cloud provider marketplaces.

Queue

For queuing, we rely on an integrated RabbitMQ backed by a PV. As this is only temporary processing data, there is currently no need to connect to a managed service.

1x 5 GB available space for the PV

External resources:

  • 1 FQDN/DNS entry

  • 1 TLS certificate



Table 2. PRODUCTION ENVIRONMENT

K8s version: 1.19.x or newer

Resources:

  • 6 cluster nodes or more

  • 30 vCPU

  • 180 GB RAM

Storage:

Databases

  • User DB: 5 GB

  • Main DB: 2 GB

  • Data DB: 50 GB

File Storage

4x 200 GB available space for PVs

or

400 GB cloud storage

Index

5x 20 GB available space for PVs

or

3 Elastic.co data nodes with 20+ GB

Queue

1x 20 GB available space for the PV

External resources:

  • 1 FQDN/DNS entry

  • 1 TLS certificate
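The compute minimums from Table 1 and Table 2 can be compared against a planned cluster before installation. The following Python sketch is an illustration only: the dictionary layout, the key names, and the function `shortfalls` are hypothetical, and the example cluster figures are made up rather than measured.

```python
# Minimum compute resources as listed in Table 1 (evaluation) and Table 2 (production).
MINIMUMS = {
    "evaluation": {"nodes": 4, "vcpu": 20, "ram_gb": 60},
    "production": {"nodes": 6, "vcpu": 30, "ram_gb": 180},
}


def shortfalls(available, environment):
    """Return {resource: missing_amount} for every resource below the minimum."""
    required = MINIMUMS[environment]
    return {
        name: need - available.get(name, 0)
        for name, need in required.items()
        if available.get(name, 0) < need
    }


if __name__ == "__main__":
    cluster = {"nodes": 4, "vcpu": 24, "ram_gb": 64}  # example figures, not measured
    for env in MINIMUMS:
        print(env, shortfalls(cluster, env) or "OK")
```

An empty result means the cluster meets the listed compute minimums; storage and external resources (FQDN, TLS certificate) are not covered by this check.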



More resources allow for a higher processing throughput (more parallel processing), more simultaneous users, and the processing of more data. As a rule of thumb, you can calculate the required file storage space by multiplying the uploaded data by four.
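The rule of thumb above can be written as a one-line calculation. The following Python sketch is a minimal illustration; the function name `estimate_file_storage_gb` is hypothetical, and the factor of four is taken directly from the text.

```python
STORAGE_FACTOR = 4  # rule of thumb from the text: uploaded data multiplied by four


def estimate_file_storage_gb(uploaded_gb, factor=STORAGE_FACTOR):
    """Estimate the required file storage (GB) for a given volume of uploaded data (GB)."""
    if uploaded_gb < 0:
        raise ValueError("uploaded_gb must not be negative")
    return uploaded_gb * factor


if __name__ == "__main__":
    print(estimate_file_storage_gb(100))
```

For 100 GB of uploaded files the estimate is 400 GB, which matches the 400 GB cloud storage figure listed for the production environment.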

The production minimums above are sized for 5 simultaneous users (i.e., users processing files at the same time, with 100+ users in total) and 100 GB of uploaded PDF files. However, we recommend monitoring the production environment to determine the actual scaling requirements.