Matthias Bisping 8432cfe514 Pull request #8: figure detection
Merge in RR/vidocp from text_removal to master

Squashed commit of the following:

commit b65374c512ce9ba07fa522d591c83db3de5d7d55
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:03:12 2022 +0100

    readme updated

commit 1c1f7a395a00fa505cf19e1ad87d8c34faa6ef5b
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:00:46 2022 +0100

    figure detection version 1 completed

commit f257660823ef8682e9fedda9921ad946ef2ade76
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:37:03 2022 +0100

    wip

commit 2e89b28f4a69da80570597c823b3b7a591788d0a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:23:56 2022 +0100

    wip
2022-02-06 01:04:15 +01:00

import argparse

from vidocp.table_parsing import annotate_tables_in_pdf
from vidocp.redaction_detection import annotate_boxes_in_pdf
from vidocp.layout_parsing import annotate_layout_in_pdf
from vidocp.figure_detection import remove_text_in_pdf


def parse_args():
    """Parse the PDF path, the page index, and the kind of processing to run."""
    parser = argparse.ArgumentParser()
    parser.add_argument("pdf_path")
    parser.add_argument("page_index", type=int)
    # Required: if --type is omitted, args.type is None and no branch below runs.
    parser.add_argument("--type", choices=["table", "redaction", "layout", "figure"], required=True)
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    # Dispatch to the function that matches the requested type.
    if args.type == "table":
        annotate_tables_in_pdf(args.pdf_path, page_index=args.page_index)
    elif args.type == "redaction":
        annotate_boxes_in_pdf(args.pdf_path, page_index=args.page_index)
    elif args.type == "layout":
        annotate_layout_in_pdf(args.pdf_path, page_index=args.page_index)
    elif args.type == "figure":
        remove_text_in_pdf(args.pdf_path, page_index=args.page_index)
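
The if/elif chain above maps each `--type` value to one function, so it could also be expressed as a lookup table. The following is only a sketch of that idea, not the repository's code: the four handler functions are hypothetical stand-ins that return a string instead of touching a PDF, and `HANDLERS`, `build_parser` and `run` are names introduced here for illustration.

```python
import argparse


# Hypothetical stand-ins for the vidocp functions; they only report
# what they were asked to do, so the sketch runs without vidocp.
def annotate_tables(pdf_path, page_index):
    return f"table:{pdf_path}:{page_index}"


def annotate_boxes(pdf_path, page_index):
    return f"redaction:{pdf_path}:{page_index}"


def annotate_layout(pdf_path, page_index):
    return f"layout:{pdf_path}:{page_index}"


def remove_text(pdf_path, page_index):
    return f"figure:{pdf_path}:{page_index}"


# One entry per --type value; the keys double as the argparse choices.
HANDLERS = {
    "table": annotate_tables,
    "redaction": annotate_boxes,
    "layout": annotate_layout,
    "figure": remove_text,
}


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("pdf_path")
    parser.add_argument("page_index", type=int)
    parser.add_argument("--type", choices=sorted(HANDLERS), required=True)
    return parser


def run(argv):
    """Parse argv and call the handler selected by --type."""
    args = build_parser().parse_args(argv)
    return HANDLERS[args.type](args.pdf_path, page_index=args.page_index)


if __name__ == "__main__":
    print(run(["document.pdf", "0", "--type", "figure"]))
```

With this layout, supporting a new type would mean adding one dictionary entry, and the `choices` list cannot drift out of sync with the dispatch logic.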