This documentation covers two Hangul API endpoints, hosted at https://d4gumsi.pythonanywhere.com, for document processing and summarization. It outlines the endpoints, the payload structure, and the sequence of API calls.
URL: https://d4gumsi.pythonanywhere.com/api/v2/products/hangul
Method: POST
Description: This endpoint processes a given PDF document based on the specified payload parameters.
File: A PDF file to be sent along with the payload.
kw_num (str): Number of keywords to extract.
Return_ALL (bool): A flag indicating whether all available data should be returned. If it is True, it overrides all the other flags.
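Document-related flags (bool): Flags that specify which document attributes should be extracted. The example payload below uses document_language, document_title, document_summary, content, report_type, locations, full_content, markdown_text, document_theme, and new_detected_disasters.
PDF metadata flags (bool): These values are extracted from the metadata of the PDF file and are returned in the metadata dictionary of the response. The example payload below uses Author, doc_created_date, doc_modified_date, num_of_pages, and charsPerPage.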
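Because the payload is mostly a long list of boolean flags, it can be convenient to build it programmatically. The following is a minimal sketch, not part of the Hangul API: `build_payload`, `DOCUMENT_FLAGS`, and `METADATA_FLAGS` are hypothetical names, and the flag lists are taken only from the example payload in this document, so the API may accept flags that are not listed here.

```python
# Hypothetical helper (not part of the Hangul API): builds the form payload
# from the flag names that appear in this documentation's example payload.
DOCUMENT_FLAGS = (
    "document_language", "document_title", "document_summary", "content",
    "report_type", "locations", "full_content", "markdown_text",
    "document_theme", "new_detected_disasters",
)
METADATA_FLAGS = (
    "Author", "doc_created_date", "doc_modified_date",
    "num_of_pages", "charsPerPage",
)


def build_payload(kw_num, return_all=False, enabled=()):
    """Return a payload dict with every flag False except those in `enabled`."""
    known = DOCUMENT_FLAGS + METADATA_FLAGS
    unknown = set(enabled) - set(known)
    if unknown:
        # Catch misspelled flag names before sending the request.
        raise ValueError(f"Unknown flag(s): {sorted(unknown)}")
    # kw_num is documented as a string, so convert it explicitly.
    payload = {"kw_num": str(kw_num), "Return_ALL": return_all}
    payload.update({name: name in enabled for name in known})
    return payload


payload = build_payload(5, enabled=["document_title", "locations"])
print(payload["kw_num"], payload["document_title"], payload["content"])
```

Rejecting unknown names is a design choice of this sketch: a misspelled flag fails immediately on the client side rather than being sent to the server as an unrecognized field.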
import requests

# Define API URL
api_url_1 = 'https://d4gumsi.pythonanywhere.com/api/v2/products/hangul'
# Define Payload
payload = {
    'kw_num': "5",
    'Return_ALL': False,
    'document_language': False,
    'document_title': True,
    'document_summary': False,
    'content': False,
    'report_type': True,
    'locations': True,
    'full_content': True,
    'markdown_text': True,
    'document_theme': False,
    'new_detected_disasters': True,
    'Author': False,
    'doc_created_date': False,
    'doc_modified_date': False,
    'num_of_pages': False,
    'charsPerPage': False,
}
# Open the file to be uploaded; the context manager closes it after the request
with open("filepath/filename.pdf", "rb") as pdf_file:
    files = {
        "file": ("filename.pdf", pdf_file),
    }
    # Make the Request
    response = requests.post(api_url_1, files=files, data=payload)
# Print the response
print(response.json())
The response will be a dictionary which contains various keys, one of which is document_summary_parameters that will be used for the next API call.
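Since the next call depends on document_summary_parameters, it may be worth checking that the key is present before continuing. The sketch below assumes only what is stated above, namely that the decoded response is a dictionary containing that key. `get_summary_parameters` is a hypothetical helper, and the sample dictionary is a placeholder, not real API output.

```python
def get_summary_parameters(response_data):
    """Return the value stored under 'document_summary_parameters'."""
    if not isinstance(response_data, dict):
        raise TypeError("Expected the decoded JSON response to be a dict")
    try:
        return response_data["document_summary_parameters"]
    except KeyError:
        raise KeyError(
            "Response has no 'document_summary_parameters' key"
        ) from None


# Placeholder standing in for response.json(); the real structure of the
# value is not described in this section, so it is left opaque here.
sample_response = {"document_summary_parameters": {"placeholder": True}}
summary_params = get_summary_parameters(sample_response)
```

In real use, the result of `response.json()` from the request above would be passed in instead of `sample_response`.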
The response keys are the following: