saifisvibinn committed
Commit · 27a4694
1 Parent(s): 07b3913
Add /api/predict REST endpoint with CORS support for Postman testing
- POSTMAN_EXAMPLE.md +131 -0
- app.py +4 -0
- requirements.txt +1 -0
POSTMAN_EXAMPLE.md
ADDED
@@ -0,0 +1,131 @@
+# Postman API Testing Guide
+
+## Endpoint: `/api/predict`
+
+### Request Configuration
+
+**Method:** `POST`
+**URL:** `https://saifisvibin-volaris-pdf-tool.hf.space/api/predict`
+**Headers:** None required (multipart/form-data is set automatically)
+
+### Body Configuration
+
+1. Select **Body** tab
+2. Select **form-data** (not raw)
+3. Add a key named `file`
+4. Change the type from "Text" to **"File"** (dropdown on the right)
+5. Click **"Select Files"** and choose a PDF file
+
+### Example Request in Postman
+
+```
+POST https://saifisvibin-volaris-pdf-tool.hf.space/api/predict
+Content-Type: multipart/form-data
+
+Body (form-data):
+  file: [Select PDF file]
+```
+
+### Expected Response
+
+```json
+{
+  "status": "success",
+  "filename": "document.pdf",
+  "text": "# Document Title\n\nThis is the extracted markdown text...",
+  "tables": [
+    {
+      "page": 1,
+      "bbox": [100, 200, 500, 400],
+      "confidence": 0.95,
+      "width": 400,
+      "height": 200,
+      "image_base64": "data:image/png;base64,iVBORw0KGgoAAAANS...",
+      "image_path": "tables/page_1_tab_0.png"
+    }
+  ],
+  "figures": [
+    {
+      "page": 2,
+      "bbox": [150, 250, 550, 650],
+      "confidence": 0.92,
+      "width": 400,
+      "height": 400,
+      "image_base64": "data:image/png;base64,iVBORw0KGgoAAAANS...",
+      "image_path": "figures/page_2_fig_0.png"
+    }
+  ],
+  "summary": {
+    "total_pages": 42,
+    "figures_count": 5,
+    "tables_count": 3,
+    "elements_count": 8
+  }
+}
+```
+
+### Error Response
+
+```json
+{
+  "status": "error",
+  "error": "Error message here"
+}
+```
+
+### cURL Example
+
+```bash
+curl -X POST https://saifisvibin-volaris-pdf-tool.hf.space/api/predict \
+  -F "file=@/path/to/your/document.pdf"
+```
+
+### Python Example
+
+```python
+import requests
+
+url = "https://saifisvibin-volaris-pdf-tool.hf.space/api/predict"
+
+with open("document.pdf", "rb") as f:
+    files = {"file": f}
+    response = requests.post(url, files=files)
+
+result = response.json()
+print(f"Status: {result['status']}")
+print(f"Figures: {result['summary']['figures_count']}")
+print(f"Tables: {result['summary']['tables_count']}")
+print(f"Text length: {len(result['text'])} characters")
+```
+
+### JavaScript/Fetch Example
+
+```javascript
+const formData = new FormData();
+formData.append('file', fileInput.files[0]);
+
+fetch('https://saifisvibin-volaris-pdf-tool.hf.space/api/predict', {
+  method: 'POST',
+  body: formData
+})
+.then(response => response.json())
+.then(data => {
+  console.log('Status:', data.status);
+  console.log('Figures:', data.summary.figures_count);
+  console.log('Tables:', data.summary.tables_count);
+  console.log('Text:', data.text);
+})
+.catch(error => console.error('Error:', error));
+```
+
+### Testing Checklist
+
+- [ ] Upload a PDF file
+- [ ] Verify response status is "success"
+- [ ] Check that "text" field contains extracted markdown
+- [ ] Verify "tables" array contains table data with base64 images
+- [ ] Verify "figures" array contains figure data with base64 images
+- [ ] Check "summary" contains correct counts
+- [ ] Test with invalid file type (should return error)
+- [ ] Test without file (should return error)
+
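Reviewer note (not part of the commit): the guide documents each table and figure as carrying an `image_base64` data URI next to an `image_path`. As a minimal sketch of how a client might turn those fields back into image bytes, the following uses only the standard library. The helper names `decode_data_uri` and `extract_images` are hypothetical, and the `sample` payload is a stand-in shaped like the documented response, not real API output.

```python
import base64
from typing import Dict, List, Tuple


def decode_data_uri(data_uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def extract_images(result: Dict) -> List[Tuple[str, bytes]]:
    """Collect (image_path, raw_bytes) for every table and figure."""
    # The guide documents an error shape of {"status": "error", "error": "..."}.
    if result.get("status") != "success":
        raise RuntimeError(result.get("error", "unknown error"))
    images = []
    for item in result.get("tables", []) + result.get("figures", []):
        _, raw = decode_data_uri(item["image_base64"])
        images.append((item["image_path"], raw))
    return images


# Stand-in payload (hypothetical bytes, not a real PNG).
sample = {
    "status": "success",
    "tables": [{
        "image_path": "tables/page_1_tab_0.png",
        "image_base64": "data:image/png;base64,"
        + base64.b64encode(b"fake-png").decode("ascii"),
    }],
    "figures": [],
}
images = extract_images(sample)
print(images[0][0], len(images[0][1]))
```

In real use, `result` would be the parsed JSON from the endpoint, and each `raw` value could be written to disk under its `image_path`.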
app.py
CHANGED
@@ -6,13 +6,17 @@ import uuid
 from pathlib import Path
 from typing import Dict, List, Optional
 from flask import Flask, render_template, request, jsonify, send_file, send_from_directory
+from flask_cors import CORS
 from werkzeug.utils import secure_filename
 import torch
+import base64
 
 import main as extractor
 from loguru import logger
 
 app = Flask(__name__)
+# Enable CORS for all routes
+CORS(app, resources={r"/api/*": {"origins": "*"}})
 app.config['MAX_CONTENT_LENGTH'] = 500 * 1024 * 1024 # 500MB max file size
 app.config['UPLOAD_FOLDER'] = './uploads'
 app.config['OUTPUT_FOLDER'] = './output'
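Reviewer note (not part of the commit): this hunk adds `import base64` but does not show where it is used. One plausible use, given the `image_base64` fields in POSTMAN_EXAMPLE.md, is encoding extracted PNG bytes as a data URI. The sketch below illustrates that assumption only; `to_data_uri` is a hypothetical helper and does not appear in `app.py`.

```python
import base64


def to_data_uri(raw: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI string."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG signature stands in for real image bytes here.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_uri(png_signature))
```

The output begins with `data:image/png;base64,iVBORw0KGgo`, the same prefix as the sample values in the guide, because every PNG file starts with that signature.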
requirements.txt
CHANGED
@@ -10,3 +10,4 @@ loguru==0.7.3
 numpy==1.26.4
 flask==3.0.0
 werkzeug==3.0.1
+flask-cors==4.0.0