Pull request #1: Setup

Merge in RR/fb_detr_prediction_container from setup to master

Squashed commit of the following:

commit 7fae4878d4250676367b7201fa163a4b67f79f84
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Feb 3 11:22:12 2022 +0100

    readded annotation to client

commit ff788030f6b3b342919a7fd31dfa66940033d7e1
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Feb 3 11:15:16 2022 +0100

    applied black

commit 3521444f678950a2772b725c6964751e0e655736
Merge: 4080aff 51d6597
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Feb 3 10:39:11 2022 +0100

    Merge branch 'setup' of ssh://git.iqser.com:2222/rr/fb_detr_prediction_container into setup

commit 4080affd21a02ad32c61fbd2027511f51a202d63
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Feb 3 10:39:02 2022 +0100

    added poppler-utils download to Dockerfile, since pdf2image only is a wrapper for it

commit 51d6597b056ae9ac693280f65a3f37d46b1276cf
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 09:43:35 2022 +0100

    Structure change for local backbone lookup (working now)

commit ac314d5148d6e026c67f00df45a8bbc70c15b52d
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 09:35:41 2022 +0100

    env bug fixed

commit 1c3221fe4956911b29fd8fede8d07dcdefad06d8
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 09:23:55 2022 +0100

    ENV correctly set now

commit 58069440583f1f78cfb2fb796fa4dc4a63e2916a
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 08:41:29 2022 +0100

    ENV for local torch model lookup set

commit f0501cf0bf904793e8e04afbd3d80ee84af9d981
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 18:28:44 2022 +0100

    changed host and port for flask

commit 986fda22f6656b10930628d0d284995b33ea2df5
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 17:33:07 2022 +0100

    added debug webserver method

commit 64b857ce53757ec2b7e7c327962fa65b551603a0
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 16:59:11 2022 +0100

    moved utils into module; fixed open-cv (maybe)

commit c62ada183135e12b41a29c6822472e33698f947f
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 15:55:10 2022 +0100

    made bash scripts executable

commit 982bdd7503c14fcf1776ae10c38589475199545e
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 15:35:16 2022 +0100

    service building logic added (WIP)

commit 46e5e3b8e67e54ecedaeee4765a3437f08fa4b17
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 14:37:28 2022 +0100

    applied black

commit ad93130e66d2e87bc86b2bf1de6234f3c037df48
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 14:36:09 2022 +0100

    fixed formatting (w, h -> x2, y2); added drawing logic to caller mock

commit df76f033599e66aaa52143f5e2b156530f643df9
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 13:54:34 2022 +0100

    page indices in predictions

commit 5e87c57dff752419486d1a44de9a734e3f840816
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 13:17:34 2022 +0100

    service main loop WIP (working in basic version)

commit ba5ec3d57621d090201413309126955940602be9
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 13:03:52 2022 +0100

    service main loop WIP

commit 77266f6982ec826eadcdd8a18c5ccf0fc380611b
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 11:24:27 2022 +0100

    fixed bug for self.classes == None

commit 858ef7589d6914ad503660a3ddc5e75bf72a6bb7
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Wed Feb 2 11:09:11 2022 +0100

    removed 'postprocessors' argument and attribute

... and 32 more commits
Matthias Bisping 2022-02-03 11:44:11 +01:00
parent 42075bf2a0
commit e4dc6631b5
32 changed files with 989 additions and 0 deletions

3
.dvc/.gitignore vendored Normal file

@ -0,0 +1,3 @@
/config.local
/tmp
/cache

6
.dvc/config Normal file

@ -0,0 +1,6 @@
[core]
remote = vector
autostage = true
['remote "vector"']
url = ssh://vector.iqser.com/research/detr_server/
port = 22

107
.dvc/plots/confusion.json Normal file

@ -0,0 +1,107 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"facet": {
"field": "rev",
"type": "nominal"
},
"spec": {
"transform": [
{
"aggregate": [
{
"op": "count",
"as": "xy_count"
}
],
"groupby": [
"<DVC_METRIC_Y>",
"<DVC_METRIC_X>"
]
},
{
"impute": "xy_count",
"groupby": [
"rev",
"<DVC_METRIC_Y>"
],
"key": "<DVC_METRIC_X>",
"value": 0
},
{
"impute": "xy_count",
"groupby": [
"rev",
"<DVC_METRIC_X>"
],
"key": "<DVC_METRIC_Y>",
"value": 0
},
{
"joinaggregate": [
{
"op": "max",
"field": "xy_count",
"as": "max_count"
}
],
"groupby": []
},
{
"calculate": "datum.xy_count / datum.max_count",
"as": "percent_of_max"
}
],
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "nominal",
"sort": "ascending",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "nominal",
"sort": "ascending",
"title": "<DVC_METRIC_Y_LABEL>"
}
},
"layer": [
{
"mark": "rect",
"width": 300,
"height": 300,
"encoding": {
"color": {
"field": "xy_count",
"type": "quantitative",
"title": "",
"scale": {
"domainMin": 0,
"nice": true
}
}
}
},
{
"mark": "text",
"encoding": {
"text": {
"field": "xy_count",
"type": "quantitative"
},
"color": {
"condition": {
"test": "datum.percent_of_max > 0.5",
"value": "white"
},
"value": "black"
}
}
}
]
}
}

View File

@ -0,0 +1,112 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"facet": {
"field": "rev",
"type": "nominal"
},
"spec": {
"transform": [
{
"aggregate": [
{
"op": "count",
"as": "xy_count"
}
],
"groupby": [
"<DVC_METRIC_Y>",
"<DVC_METRIC_X>"
]
},
{
"impute": "xy_count",
"groupby": [
"rev",
"<DVC_METRIC_Y>"
],
"key": "<DVC_METRIC_X>",
"value": 0
},
{
"impute": "xy_count",
"groupby": [
"rev",
"<DVC_METRIC_X>"
],
"key": "<DVC_METRIC_Y>",
"value": 0
},
{
"joinaggregate": [
{
"op": "sum",
"field": "xy_count",
"as": "sum_y"
}
],
"groupby": [
"<DVC_METRIC_Y>"
]
},
{
"calculate": "datum.xy_count / datum.sum_y",
"as": "percent_of_y"
}
],
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "nominal",
"sort": "ascending",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "nominal",
"sort": "ascending",
"title": "<DVC_METRIC_Y_LABEL>"
}
},
"layer": [
{
"mark": "rect",
"width": 300,
"height": 300,
"encoding": {
"color": {
"field": "percent_of_y",
"type": "quantitative",
"title": "",
"scale": {
"domain": [
0,
1
]
}
}
}
},
{
"mark": "text",
"encoding": {
"text": {
"field": "percent_of_y",
"type": "quantitative",
"format": ".2f"
},
"color": {
"condition": {
"test": "datum.percent_of_y > 0.5",
"value": "white"
},
"value": "black"
}
}
}
]
}
}

116
.dvc/plots/linear.json Normal file

@ -0,0 +1,116 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"width": 300,
"height": 300,
"layer": [
{
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
},
"layer": [
{
"mark": "line"
},
{
"selection": {
"label": {
"type": "single",
"nearest": true,
"on": "mouseover",
"encodings": [
"x"
],
"empty": "none",
"clear": "mouseout"
}
},
"mark": "point",
"encoding": {
"opacity": {
"condition": {
"selection": "label",
"value": 1
},
"value": 0
}
}
}
]
},
{
"transform": [
{
"filter": {
"selection": "label"
}
}
],
"layer": [
{
"mark": {
"type": "rule",
"color": "gray"
},
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative"
}
}
},
{
"encoding": {
"text": {
"type": "quantitative",
"field": "<DVC_METRIC_Y>"
},
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative"
}
},
"layer": [
{
"mark": {
"type": "text",
"align": "left",
"dx": 5,
"dy": -5
},
"encoding": {
"color": {
"type": "nominal",
"field": "rev"
}
}
}
]
}
]
}
]
}

104
.dvc/plots/scatter.json Normal file

@ -0,0 +1,104 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"width": 300,
"height": 300,
"layer": [
{
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
},
"layer": [
{
"mark": "point"
},
{
"selection": {
"label": {
"type": "single",
"nearest": true,
"on": "mouseover",
"encodings": [
"x"
],
"empty": "none",
"clear": "mouseout"
}
},
"mark": "point",
"encoding": {
"opacity": {
"condition": {
"selection": "label",
"value": 1
},
"value": 0
}
}
}
]
},
{
"transform": [
{
"filter": {
"selection": "label"
}
}
],
"layer": [
{
"encoding": {
"text": {
"type": "quantitative",
"field": "<DVC_METRIC_Y>"
},
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative"
}
},
"layer": [
{
"mark": {
"type": "text",
"align": "left",
"dx": 5,
"dy": -5
},
"encoding": {
"color": {
"type": "nominal",
"field": "rev"
}
}
}
]
}
]
}
]
}

31
.dvc/plots/simple.json Normal file

@ -0,0 +1,31 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"width": 300,
"height": 300,
"mark": {
"type": "line"
},
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
}
}

39
.dvc/plots/smooth.json Normal file

@ -0,0 +1,39 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"mark": {
"type": "line"
},
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
},
"transform": [
{
"loess": "<DVC_METRIC_Y>",
"on": "<DVC_METRIC_X>",
"groupby": [
"rev"
],
"bandwidth": 0.3
}
]
}

3
.dvcignore Normal file

@ -0,0 +1,3 @@
# Add patterns of files dvc should ignore, which could improve
# the performance. Learn more at
# https://dvc.org/doc/user-guide/dvcignore

3
.gitmodules vendored Normal file

@ -0,0 +1,3 @@
[submodule "incl/detr"]
path = incl/detr
url = ssh://git@git.iqser.com:2222/rr/detr.git

35
Dockerfile Normal file

@ -0,0 +1,35 @@
FROM python:3.8 as builder1
# Use a virtual environment.
RUN python -m venv /app/venv
ENV PATH="/app/venv/bin:$PATH"
# Upgrade pip.
RUN python -m pip install --upgrade pip
# Make a directory for the service files and copy the service repo into the container.
WORKDIR /app/service
COPY . ./
# Set up service as a module and install all its dependencies.
RUN bash setup/docker_local.sh
# Make a new container and copy all relevant files over to filter out temporary files
# produced during setup to reduce the final container's size.
FROM python:3.8
WORKDIR /app/
COPY --from=builder1 /app .
ENV PATH="/app/venv/bin:$PATH"
WORKDIR /app/service
RUN apt update --yes
RUN apt install vim --yes
RUN apt install poppler-utils --yes
EXPOSE 5000
EXPOSE 8080
# Run the service loop.
CMD ["python3", "src/run_service.py"]

0
__init__.py Normal file

7
config.yaml Normal file

@ -0,0 +1,7 @@
device: cpu
threshold: .5
classes: ["logo", "other", "formula", "signature", "handwriting_other"]
rejection_class: "other"
checkpoint: checkpoint.pth

1
data/.gitignore vendored Normal file

@ -0,0 +1 @@
/checkpoint.pth

4
data/checkpoint.pth.dvc Normal file

@ -0,0 +1,4 @@
outs:
- md5: 9face65530febd41a0722e0513da2264
size: 496696129
path: checkpoint.pth

1
data/hub/checkpoints/.gitignore vendored Normal file

@ -0,0 +1 @@
/resnet50-0676ba61.pth

View File

@ -0,0 +1,4 @@
outs:
- md5: b94941323912291bb67db6fdb1d80c11
size: 102530333
path: resnet50-0676ba61.pth

0
fb_detr/__init__.py Normal file

7
fb_detr/locations.py Normal file

@ -0,0 +1,7 @@
from pathlib import Path
MODULE_ROOT = Path(__file__).resolve().parents[1]
CONFIG_FILE = MODULE_ROOT / "config.yaml"
DATA_DIR = MODULE_ROOT / "data"
TORCH_HOME = DATA_DIR
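
Pointing `TORCH_HOME` at the data directory lets torch.hub find the ResNet-50 backbone checkpoint locally instead of downloading it. The following is a minimal sketch of the path that torch.hub is expected to check, assuming its default cache layout `<TORCH_HOME>/hub/checkpoints/`. The helper `expected_backbone_path` and the example directory are hypothetical and not part of this commit.

```python
import os
from pathlib import Path


def expected_backbone_path(torch_home, filename="resnet50-0676ba61.pth"):
    """Return where torch.hub would look for a cached checkpoint.

    Assumes the default torch.hub layout: <TORCH_HOME>/hub/checkpoints/<file>.
    """
    return Path(torch_home) / "hub" / "checkpoints" / filename


# Hypothetical location standing in for DATA_DIR from fb_detr/locations.py.
data_dir = Path("/app/service/data")
os.environ["TORCH_HOME"] = str(data_dir)
print(expected_backbone_path(os.environ["TORCH_HOME"]))
```

This matches the `data/hub/checkpoints/` directory tracked by DVC in this commit, which is presumably why the backbone file is stored under that path.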

121
fb_detr/predictor.py Normal file

@ -0,0 +1,121 @@
import argparse
from itertools import compress, starmap
from operator import itemgetter
from pathlib import Path
from typing import Iterable

import torch
from detr.models import build_model
from detr.test import get_args_parser, infer
from iteration_utilities import starfilter

from fb_detr.utils.config import read_config


def load_model(checkpoint_path):
    # Reuse the DETR argument parser so the model is built with its defaults.
    parser = argparse.ArgumentParser(parents=[get_args_parser()])
    args = parser.parse_args()
    if args.output_dir:
        Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    device = torch.device(read_config("device"))
    model, _, _ = build_model(args)
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    model.to(device)
    return model


class Predictor:
    def __init__(self, checkpoint_path, classes=None, rejection_class=None):
        self.model = load_model(checkpoint_path)
        self.classes = classes
        self.rejection_class = rejection_class

    @staticmethod
    def __format_boxes(boxes):
        # Turn an (N, 4) box tensor into a list of corner-format dicts.
        keys = "x1", "y1", "x2", "y2"
        x1s = boxes[:, 0].tolist()
        y1s = boxes[:, 1].tolist()
        x2s = boxes[:, 2].tolist()
        y2s = boxes[:, 3].tolist()
        boxes = [dict(zip(keys, vs)) for vs in zip(x1s, y1s, x2s, y2s)]
        return boxes

    @staticmethod
    def __normalize_to_list(maybe_multiple):
        # itemgetter returns a bare value for a single key; wrap it in a tuple.
        return maybe_multiple if isinstance(maybe_multiple, tuple) else tuple([maybe_multiple])

    def __format_classes(self, classes):
        if self.classes:
            return self.__normalize_to_list(itemgetter(*classes.tolist())(self.classes))
        else:
            return classes.tolist()

    def __format_prediction(self, output: dict):
        boxes, classes = itemgetter("bboxes", "classes")(output)
        if len(boxes):
            boxes = self.__format_boxes(boxes)
            classes = self.__format_classes(classes)
        else:
            boxes, classes = [], []
        output["bboxes"] = boxes
        output["classes"] = classes
        return output

    def __filter_predictions_for_image(self, predictions):
        # Drop all detections of the rejection class for a single page.
        boxes, classes = itemgetter("bboxes", "classes")(predictions)
        if boxes:
            keep = map(lambda c: c != self.rejection_class, classes)
            compressed = list(compress(zip(boxes, classes), keep))
            boxes, classes = map(list, zip(*compressed)) if compressed else ([], [])
            predictions["bboxes"] = boxes
            predictions["classes"] = classes
        return predictions

    def filter_predictions(self, predictions):
        def detections_present(_, prediction):
            return bool(prediction["classes"])

        def build_return_dict(page_idx, predictions):
            return {"page_idx": page_idx, **predictions}

        filtered_rejections = map(self.__filter_predictions_for_image, predictions)
        filtered_no_detections = starfilter(detections_present, enumerate(filtered_rejections))
        filtered_no_detections = starmap(build_return_dict, filtered_no_detections)
        return filtered_no_detections

    def format_predictions(self, outputs: Iterable):
        return map(self.__format_prediction, outputs)

    def predict(self, images, threshold=None, format_output=False):
        # Compare against None so that an explicit threshold of 0 is respected.
        if threshold is None:
            threshold = read_config("threshold")
        predictions = infer(images, self.model, read_config("device"), threshold)
        if format_output:
            predictions = self.format_predictions(predictions)
            # Filtering operates on formatted predictions only.
            if self.rejection_class:
                predictions = self.filter_predictions(predictions)
        return predictions
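
The per-page rejection filter above can be tried without a model. This is a minimal sketch of the same `compress`-based technique on plain lists; the function name `drop_rejected` and the sample boxes are made up for illustration and do not appear in this commit.

```python
from itertools import compress


def drop_rejected(boxes, classes, rejection_class):
    """Remove every (box, class) pair whose class equals rejection_class."""
    keep = [c != rejection_class for c in classes]
    kept = list(compress(zip(boxes, classes), keep))
    if not kept:
        # zip(*[]) would yield nothing to unpack, so handle the empty case.
        return [], []
    kept_boxes, kept_classes = map(list, zip(*kept))
    return kept_boxes, kept_classes


# Illustrative data only.
boxes = [
    {"x1": 0, "y1": 0, "x2": 10, "y2": 10},
    {"x1": 5, "y1": 5, "x2": 20, "y2": 20},
]
classes = ["logo", "other"]
print(drop_rejected(boxes, classes, "other"))
```

The explicit empty-case branch mirrors the `if compressed else ([], [])` guard in `__filter_predictions_for_image`.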

View File

18
fb_detr/utils/config.py Normal file

@ -0,0 +1,18 @@
import yaml

from fb_detr.locations import CONFIG_FILE


def read_config(key, config_path: str = CONFIG_FILE):
    """Reads the value associated with a key from a config.

    Args:
        key: Key whose value should be looked up.
        config_path: Path to config.

    Returns:
        The value associated with `key`.
    """
    with open(config_path) as f:
        config = yaml.load(f, Loader=yaml.FullLoader)
    return config[key]

0
incl/__init__.py Normal file

1
incl/detr Submodule

@ -0,0 +1 @@
Subproject commit 7e3258ccc1fa2be7a9d8ab333873b79de7005809

14
requirements.txt Normal file

@ -0,0 +1,14 @@
torch==1.10.2
numpy==1.22.1
#opencv-python==4.5.5.62
opencv-python-headless==4.5.5.62
torchvision==0.11.3
pycocotools==2.0.4
scipy==1.7.3
pdf2image==1.16.0
PyYAML==6.0
Flask==2.0.2
requests==2.27.1
iteration-utilities==0.11.0
dvc==2.9.3
dvc[ssh]

58
scripts/client_mock.py Normal file

@ -0,0 +1,58 @@
import argparse
import json
import os
from operator import itemgetter

import pdf2image
import requests
from PIL import ImageDraw

OUTPUT_DIR = "/tmp/serv_out"


def draw_coco_box(draw: ImageDraw.Draw, bbox, klass):
    x1, y1, x2, y2 = itemgetter("x1", "y1", "x2", "y2")(bbox)
    draw.rectangle(((x1, y1), (x2, y2)), outline="red")
    draw.text((x1, y1), text=klass, fill=(0, 0, 0, 100))


def draw_coco_boxes(image, bboxes, classes):
    draw = ImageDraw.Draw(image)
    for bbox, klass in zip(bboxes, classes):
        draw_coco_box(draw, bbox, klass)
    return image


def annotate(pdf_path, predictions):
    pages = pdf2image.convert_from_path(pdf_path)
    # Make sure the output directory exists before saving annotated pages.
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    for prd in predictions:
        page_idx, boxes, classes = itemgetter("page_idx", "bboxes", "classes")(prd)
        page = pages[page_idx]
        image = draw_coco_boxes(page, boxes, classes)
        image.save(f"{OUTPUT_DIR}/{page_idx}.png")


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--pdf_path", required=True)
    args = parser.parse_args()
    return args


def main(args):
    # Close the PDF file handle once the upload has finished.
    with open(args.pdf_path, "rb") as f:
        response = requests.post("http://0.0.0.0:8080", data=f)
    response.raise_for_status()
    predictions = response.json()
    print(json.dumps(predictions, indent=2))
    annotate(args.pdf_path, predictions)


if __name__ == "__main__":
    args = parse_args()
    main(args)
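
One of the squashed commits notes a format change "(w, h -> x2, y2)", and the drawing code above expects corner-format boxes. As a hedged illustration of that conversion, the sketch below assumes the input is a top-left `(x, y)` plus width and height; the helper `xywh_to_corners` is hypothetical and not part of this commit.

```python
def xywh_to_corners(box):
    """Convert an (x, y, w, h) box to the corner format used by the service.

    Assumes (x, y) is the top-left corner and w, h are width and height.
    """
    x, y, w, h = box
    return {"x1": x, "y1": y, "x2": x + w, "y2": y + h}


print(xywh_to_corners((10, 20, 30, 40)))
```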

35
scripts/flask_test.py Normal file

@ -0,0 +1,35 @@
import argparse
from PIL import Image
from flask import Flask, request, jsonify
from pathlib import Path

app = Flask(__name__)


@app.before_first_request
def init():
    from fb_detr.predictor import Predictor

    global PRED
    PRED = Predictor(args.resume)


@app.route("/", methods=["GET", "POST"])
def predict_request():
    if request.method == "POST":
        image_folder_path = request.form.get("image_folder_path")
        images = list(map(Image.open, Path(image_folder_path).glob("*.png")))
        results = PRED.predict(images, format_output=True)
        # Return all results at once; returning inside a loop would send
        # only the first one and fail on an empty result.
        return jsonify(list(results))
    if request.method == "GET":
        return "Not implemented"


parser = argparse.ArgumentParser()
parser.add_argument("--resume", required=True)
args = parser.parse_args()
app.run()

58
scripts/predict.py Normal file

@ -0,0 +1,58 @@
import argparse
import json
from pathlib import Path

from detr.test import draw_boxes
from pdf2image import pdf2image

from fb_detr.predictor import Predictor


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--resume", required=True)
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--pdf_path")
    parser.add_argument("--draw_boxes", default=False, action="store_true")
    args = parser.parse_args()
    return args


def build_image_paths(image_root_dir):
    return [*map(str, Path(image_root_dir).glob("*.png"))]


def pdf_to_pages(pdf_path):
    pages = pdf2image.convert_from_path(pdf_path)
    return pages


def main():
    # TODO: de-hardcode
    classes = {1: "logo", 2: "other", 3: "formula", 4: "signature", 5: "handwriting_other"}
    args = parse_args()
    predictor = Predictor(args.resume, classes=classes, rejection_class="other")
    images = pdf_to_pages(args.pdf_path)
    outputs = predictor.predict(images, 0.5)
    if args.draw_boxes:
        for im, o in zip(images, outputs):
            if len(o["bboxes"]):
                draw_boxes(image=im, **o, output_path=args.output_dir)
    else:
        outputs = predictor.format_predictions(outputs)
        outputs = predictor.filter_predictions(outputs)
        for o in outputs:
            print(json.dumps(o, indent=2))


if __name__ == "__main__":
    main()

13
setup.py Normal file

@ -0,0 +1,13 @@
#!/usr/bin/env python
from distutils.core import setup
setup(
name="fb_detr",
version="0.1.0",
description="",
author="",
author_email="",
url="",
packages=["fb_detr"],
)

14
setup/docker.sh Executable file

@ -0,0 +1,14 @@
#!/bin/bash
set -e
python3 -m venv build_venv
source build_venv/bin/activate
python3 -m pip install --upgrade pip
pip install dvc
pip install 'dvc[ssh]'
dvc pull
git submodule update --init --recursive
docker build -t detr-server .

8
setup/docker_local.sh Executable file

@ -0,0 +1,8 @@
#!/bin/bash
set -e
pip install -e .
pip install -r requirements.txt
cd incl/detr
pip install -e .

66
src/run_service.py Normal file

@ -0,0 +1,66 @@
import argparse
import os

from fb_detr.locations import DATA_DIR
from fb_detr.locations import TORCH_HOME
from fb_detr.predictor import Predictor
from flask import Flask, request, jsonify
from pdf2image import pdf2image

from fb_detr.utils.config import read_config


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--resume")
    args = parser.parse_args()
    return args


def load_classes():
    classes = read_config("classes")
    id2class = dict(zip(range(1, len(classes) + 1), classes))
    return id2class


def get_checkpoint():
    return DATA_DIR / read_config("checkpoint")


def set_torch_env():
    os.environ["TORCH_HOME"] = str(TORCH_HOME)


def main(args):
    set_torch_env()

    def initialize_predictor():
        checkpoint = get_checkpoint() if not args.resume else args.resume
        predictor = Predictor(checkpoint, classes=load_classes(), rejection_class=read_config("rejection_class"))
        return predictor

    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def predict_request():
        pdf = request.data
        pages = pdf2image.convert_from_bytes(pdf)
        predictions = predictor.predict(pages, format_output=True)
        return jsonify(list(predictions))

    @app.route("/status", methods=["GET"])
    def status():
        response = "OK"
        return jsonify(response)

    predictor = initialize_predictor()
    app.run(host="0.0.0.0", port=8080)


if __name__ == "__main__":
    args = parse_args()
    main(args)
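
`load_classes` turns the `classes` list from `config.yaml` into a 1-based id-to-name mapping, which matches the hard-coded dictionary in `scripts/predict.py`. A minimal standalone sketch of that mapping follows; `build_id2class` is a hypothetical name, and `enumerate(..., start=1)` is used here as an equivalent to the `zip(range(...))` form in the service.

```python
def build_id2class(classes):
    """Map 1-based class ids to class names, in list order."""
    return dict(enumerate(classes, start=1))


# Class list copied from config.yaml in this commit.
classes = ["logo", "other", "formula", "signature", "handwriting_other"]
id2class = build_id2class(classes)
print(id2class)
```

Starting at 1 rather than 0 presumably reflects how the model's label ids are numbered; the source does not say so explicitly.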