Authenticate

dai_auth()

Get service account token

dai_token()

Produce token

dai_has_token()

Check that token is available

dai_user()

Get user info associated with service account key

get_project_id()

Get project ID from service account file

dai_deauth()

Delete token from environment

Send for processing

dai_sync()

OCR page synchronously

dai_async()

OCR documents asynchronously

dai_status()

Check the status of an asynchronous processing job

dai_sync_tab()

OCR page synchronously and extract table data

dai_async_tab()

OCR documents asynchronously and extract table data

Inspect output

text_from_dai_response()

Get document text from a Document AI response object

text_from_dai_file()

Get document text from a Document AI JSON file

draw_blocks()

Inspect block bounding boxes in Google Document AI JSON output

draw_paragraphs()

Inspect paragraph bounding boxes in Google Document AI JSON output

draw_lines()

Inspect line bounding boxes in Google Document AI JSON output

draw_tokens()

Inspect token bounding boxes in Google Document AI JSON output

Process output

build_token_df()

Build data frame with token location data

build_block_df()

Build data frame with block location data

split_block()

Split a block bounding box

reassign_tokens()

Assign tokens to new blocks

reassign_tokens2()

Assign tokens to a single new block

from_labelme()

Extract block coordinates from labelme files

Utilities

image_to_pdf()

Convert images to PDF

create_folder()

Create folder in Google Cloud Storage

is_pdf()

Check that a file is a PDF

is_json()

Check that a file is a JSON file

pdf_to_binbase()

Convert PDF to base64-encoded binary TIFF

img_to_binbase()

Convert image file to base64-encoded binary TIFF