Merge in RR/pyinfra from RED-6366-refactor to master
Squashed commit of the following:
commit 8807cda514b5cc24b1be208173283275d87dcb97
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 13:15:15 2023 +0100
enable docker-compose autouse for automatic tests
commit c4579581d3e9a885ef387ee97f3f3a5cf4731193
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 12:35:49 2023 +0100
black
commit ac2b754c5624ef37ce310fce7196c9ea11bbca03
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 12:30:23 2023 +0100
refactor storage url parsing
- move parsing and validation to config where the connection url is
actually read in
- improve readability of parsing fn
commit 371802cc10b6d946c4939ff6839571002a2cb9f4
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 10:48:00 2023 +0100
refactor
commit e8c381c29deebf663e665920752c2965d7abce16
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 09:57:34 2023 +0100
rename
commit c8628a509316a651960dfa806d5fe6aacb7a91c1
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 09:37:01 2023 +0100
renaming and refactoring
commit 4974d4f56fd73bc55bd76aa7a9bbb16babee19f4
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 10 08:53:09 2023 +0100
refactor payload processor
- limit make_uploader and make_downloader cache
- partially apply them when the class is initialized with storage and
bucket to make the logic and behaviour more comprehensive
- renaming functional pipeline steps to be more expressive
commit f8d51bfcad2b815c8293ab27dd66b256255c5414
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Mar 9 15:30:32 2023 +0100
remove monitor and rename Payload
commit 412ddaa207a08aff1229d7acd5d95402ac8cd578
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Mar 2 10:15:39 2023 +0100
remove azure connection string and disable respective test for now for security reasons
commit 7922a2d9d325f3b9008ad4e3e56b241ba179f52c
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Mar 1 13:30:58 2023 +0100
make payload formatting function names more expressive
commit 7517e544b0f5a434579cc9bada3a37e7ac04059f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Mar 1 13:24:57 2023 +0100
add some type hints
commit 095410d3009f2dcbd374680dd0f7b55de94c9e76
Author: Matthias Bisping <matthias.bisping@axbit.com>
Date: Wed Mar 1 10:54:58 2023 +0100
Refactoring
- Renaming
- Docstring adjustments
commit e992f0715fc2636eb13eb5ffc4de0bcc5d433fc8
Author: Matthias Bisping <matthias.bisping@axbit.com>
Date: Wed Mar 1 09:43:26 2023 +0100
Re-wording and typo fixes
commit 3c2d698f9bf980bc4b378a44dc20c2badc407b3e
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Feb 28 14:59:59 2023 +0100
enable auto startup for docker compose in tests
commit 55773b4fb0b624ca4745e5b8aeafa6f6a0ae6436
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Feb 28 14:59:37 2023 +0100
Extended tests for queue manager
commit 14f7f943f60b9bfb9fe77fa3cef99a1e7d094333
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Feb 28 13:39:00 2023 +0100
enable auto startup for docker compose in tests
commit 7caf354491c84c6e0b0e09ad4d41cb5dfbfdb225
Merge: 49d47ba d0277b8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Feb 28 13:32:52 2023 +0100
Merge branch 'RED-6205-prometheus' of ssh://git.iqser.com:2222/rr/pyinfra into RED-6205-prometheus
commit 49d47baba8ccf11dee48a4c1cbddc3bbd12471e5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Feb 28 13:32:42 2023 +0100
adjust Payload Processor signature
commit d0277b86bc54994b6032774bf0ec2d7b19d7f517
Merge: 5184a18 f6b35d6
Author: Christoph Schabert <christoph.schabert@iqser.com>
Date: Tue Feb 28 11:07:16 2023 +0100
Pull request #61: Change Sec Trigger to PR
Merge in RR/pyinfra from cschabert/PlanSpecjava-1677578703647 to RED-6205-prometheus
* commit 'f6b35d648c88ddbce1856445c3b887bce669265c':
Change Sec Trigger to PR
commit f6b35d648c88ddbce1856445c3b887bce669265c
Author: Christoph Schabert <christoph.schabert@iqser.com>
Date: Tue Feb 28 11:05:13 2023 +0100
Change Sec Trigger to PR
... and 20 more commits
# PyInfra

1. [ About ](#about)
2. [ Configuration ](#configuration)
3. [ Response Format ](#response-format)
4. [ Usage & API ](#usage--api)
5. [ Scripts ](#scripts)
6. [ Tests ](#tests)

## About

PyInfra is a common module that provides the infrastructure for deploying research projects.
It expects to be deployed in the same pod or local environment as the analysis container and handles all outbound communication.

## Configuration

The configuration is located in `/config.yaml`. All relevant variables can be set by exporting the corresponding environment variables.

| Environment Variable          | Default                          | Description                                                              |
|-------------------------------|----------------------------------|--------------------------------------------------------------------------|
| LOGGING_LEVEL_ROOT            | "DEBUG"                          | Logging level for the service logger                                     |
| RABBITMQ_HOST                 | "localhost"                      | RabbitMQ host address                                                    |
| RABBITMQ_PORT                 | "5672"                           | RabbitMQ port                                                            |
| RABBITMQ_USERNAME             | "user"                           | RabbitMQ username                                                        |
| RABBITMQ_PASSWORD             | "bitnami"                        | RabbitMQ password                                                        |
| RABBITMQ_HEARTBEAT            | 60                               | AMQP heartbeat timeout in seconds                                        |
| RABBITMQ_CONNECTION_SLEEP     | 5                                | AMQP connection sleep timer in seconds                                   |
| REQUEST_QUEUE                 | "request_queue"                  | Requests to the service                                                  |
| RESPONSE_QUEUE                | "response_queue"                 | Responses by the service                                                 |
| DEAD_LETTER_QUEUE             | "dead_letter_queue"              | Messages that failed to process                                          |
| STORAGE_BACKEND               | "s3"                             | The type of storage to use {s3, azure}                                   |
| STORAGE_BUCKET                | "redaction"                      | The bucket / container to pull files specified in queue requests from    |
| STORAGE_ENDPOINT              | "http://127.0.0.1:9000"          | Endpoint for S3 storage                                                  |
| STORAGE_KEY                   | "root"                           | User for S3 storage                                                      |
| STORAGE_SECRET                | "password"                       | Password for S3 storage                                                  |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..."   | Connection string for Azure storage                                      |
| STORAGE_AZURECONTAINERNAME    | "redaction"                      | Container name for Azure storage                                         |
| WRITE_CONSUMER_TOKEN          | "False"                          | Whether to write a consumer token to a file                              |

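
The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not pyinfra's actual config loader: the `DEFAULTS` dict and the `read_settings` helper are hypothetical names, and only a subset of the variables from the table is shown.

```python
import os

# Hypothetical subset of the defaults listed in the table above.
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_HEARTBEAT": "60",
    "STORAGE_BACKEND": "s3",
    "STORAGE_BUCKET": "redaction",
}


def read_settings(environ=None):
    """Return settings, preferring exported environment variables over defaults."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


# Simulate `export RABBITMQ_HOST=rabbitmq.internal` with an explicit mapping.
settings = read_settings({"RABBITMQ_HOST": "rabbitmq.internal"})
print(settings["RABBITMQ_HOST"])  # rabbitmq.internal
print(settings["RABBITMQ_PORT"])  # 5672
```

Note that values read from the environment are always strings, so numeric settings such as the heartbeat would still need to be converted by the caller.
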
## Response Format

### Expected AMQP input message:

```json
{
  "dossierId": "",
  "fileId": "",
  "targetFileExtension": "",
  "responseFileExtension": ""
}
```

Optionally, the input message can contain a field with the key `"operations"`.

### AMQP output message:

```json
{
  "dossierId": "",
  "fileId": ""
}
```

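
To illustrate how the two message shapes relate, here is a small sketch that parses an input message and derives the output message from it. The helpers `parse_request` and `build_response` are hypothetical and not part of pyinfra. Treating all four listed keys as required is an assumption, and the example values are made up.

```python
import json

# Assumption: all four keys of the documented input message are required.
REQUIRED_KEYS = ("dossierId", "fileId", "targetFileExtension", "responseFileExtension")


def parse_request(body):
    """Decode a JSON message body and check that the documented keys are present."""
    message = json.loads(body)
    missing = [key for key in REQUIRED_KEYS if key not in message]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return message


def build_response(request):
    """Build the output message, which carries only the two identifiers."""
    return {"dossierId": request["dossierId"], "fileId": request["fileId"]}


# Made-up example values for illustration only.
body = json.dumps(
    {
        "dossierId": "dossier-1",
        "fileId": "file-1",
        "targetFileExtension": "pdf",
        "responseFileExtension": "json",
    }
)
print(build_response(parse_request(body)))
# {'dossierId': 'dossier-1', 'fileId': 'file-1'}
```
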
## Usage & API

### Setup

Install the project dependencies:

```bash
make poetry
```

Within the project repo, no separate installation is needed; just `import pyinfra` in any `.py` file.

To install it from another project:

```bash
poetry add git+ssh://git@git.iqser.com:2222/rr/pyinfra.git#TAG-NUMBER
```

### API

```python
from pyinfra.config import get_config
from pyinfra.payload_processing.processor import make_payload_processor
from pyinfra.queue.queue_manager import QueueManager

# data_processor is the callable you supply (see the description below).
queue_manager = QueueManager(get_config())
queue_manager.start_consuming(make_payload_processor(data_processor))
```

The `data_processor` should accept a dict or bytes (PDF) as input and return a list of results.

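
A `data_processor` that satisfies this contract might look like the sketch below. It is a hypothetical example: the single-argument signature and the shape of the result entries are assumptions, since only "dict or bytes in, list of results out" is specified above.

```python
def data_processor(data):
    """Hypothetical processor: describe the input and return it as a one-item list."""
    if isinstance(data, bytes):
        # For example the raw bytes of a PDF file.
        return [{"type": "bytes", "size": len(data)}]
    if isinstance(data, dict):
        return [{"type": "dict", "keys": sorted(data)}]
    raise TypeError(f"unsupported input type: {type(data).__name__}")


print(data_processor(b"%PDF-1.7"))
# [{'type': 'bytes', 'size': 8}]
print(data_processor({"b": 1, "a": 2}))
# [{'type': 'dict', 'keys': ['a', 'b']}]
```

Such a function would then be passed to `make_payload_processor` as shown in the API snippet.
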
## Scripts

### Run pyinfra locally

**Shell 1**: Start the MinIO and RabbitMQ containers
```bash
$ cd tests && docker-compose up
```

**Shell 2**: Start pyinfra with a callback mock
```bash
$ python scripts/start_pyinfra.py
```

**Shell 3**: Upload dummy content to storage and publish a message
```bash
$ python scripts/mock_process_request.py
```

## Tests

The tests take longer than you are probably used to, mainly because the required startup times are
quite high. To speed up test runs, set `autouse` to `False`. In that case, run `docker-compose up`
in the `tests` directory manually before running the tests.