env file
This commit is contained in:
parent
b84bdf17f0
commit
cb2a56f70d
178
cf-plan.md
Normal file
@@ -0,0 +1,178 @@
# Cross-Reference Tab Implementation Plan

## Overview

Create a standalone tab that allows users to upload a document, process it, and find related documents in the existing database.

## Current System Analysis

### Backend (FastAPI)

- ✅ `/search` endpoint exists - can find related documents
- ✅ `/documents` endpoint exists - can retrieve documents
- ❌ No document upload endpoint
- ❌ No document processing for uploaded files

### Frontend

- ✅ Tab system exists
- ✅ Basic cross-reference function exists (hardcoded)
- ❌ No file upload functionality
- ❌ No dedicated cross-reference tab

## Implementation Plan

### Phase 1: Backend API Extensions

#### New Endpoints Needed

1. **`POST /upload-document`**

   - Accept file upload (PDF, DOC, DOCX, TXT)
   - Extract text content from uploaded file
   - Return processed text and document metadata
   - **No database storage** - temporary processing only

2. **`POST /find-cross-references`**
   - Accept processed document text
   - Use existing search functionality internally
   - Return related documents with similarity scores
   - Include cross-reference analysis
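
The validation step behind `POST /upload-document` could look roughly like the sketch below. It uses only the standard library so it stays independent of the web framework. The names `validate_upload`, `UploadMetadata`, and `UploadRejected`, and the 10 MiB size cap, are assumptions made for illustration; the plan does not specify them.

```python
from dataclasses import dataclass
from pathlib import PurePath

# File types named in the plan; the size cap is a hypothetical value.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


@dataclass
class UploadMetadata:
    """Metadata returned alongside the extracted text (illustrative shape)."""
    filename: str
    extension: str
    size_bytes: int


class UploadRejected(ValueError):
    """Raised when an upload fails type or size validation."""


def validate_upload(filename: str, content: bytes) -> UploadMetadata:
    """Check extension and size; return metadata for an accepted file."""
    path = PurePath(filename)
    extension = path.suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise UploadRejected(f"unsupported file type: {extension or '(none)'}")
    if not content:
        raise UploadRejected("file is empty")
    if len(content) > MAX_UPLOAD_BYTES:
        raise UploadRejected("file exceeds size limit")
    # Keep only the base name so client-supplied directories are ignored.
    return UploadMetadata(path.name, extension, len(content))
```

Because nothing is stored, the endpoint handler can call a function like this on the in-memory bytes and discard them once the text has been extracted.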

#### Leverage Existing APIs

- Use existing `/search` endpoint logic for finding related documents
- Use existing `/documents` endpoint to fetch full related documents
- Use existing database connection and document retrieval functions

### Phase 2: Frontend Implementation

#### New Tab Structure

1. **Upload Section**

   - File drop zone
   - File type validation (PDF, DOC, DOCX, TXT)
   - Upload progress indicator
   - File preview/summary

2. **Processing Section**

   - Processing status indicator
   - Document analysis summary
   - Key terms extraction display

3. **Results Section**
   - Related documents list
   - Similarity scores
   - Cross-reference details
   - Document preview capability

#### UI Components Needed

- File upload widget
- Progress bars
- Results grid/list
- Document preview modal
- Cross-reference visualization

### Phase 3: Processing Logic

#### Document Processing Pipeline

1. **File Upload & Validation**

   - Validate file type and size
   - Extract text content using appropriate libraries
   - Clean and normalize text

2. **Content Analysis**

   - Extract key terms and phrases
   - Identify legal concepts
   - Generate search queries from content

3. **Cross-Reference Matching**

   - Use existing search service (enhanced_rag_service or simple_search_service)
   - Multiple search strategies:
     - Full text similarity
     - Key terms matching
     - Legal concept matching
   - Rank results by relevance

4. **Results Processing**
   - Format cross-reference results
   - Include similarity metrics
   - Group by document type or relevance

## Technical Approach

### Backend Dependencies

```
# New libraries needed
python-multipart   # For file uploads
pdfplumber         # PDF text extraction (PyPDF2 is an alternative)
python-docx        # Word (.docx) document processing
```
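
One way to wire these libraries together is a small dispatch table keyed by file extension, sketched below. The function names and the table are assumptions for illustration. The third-party imports are done lazily inside each extractor so the module still loads when only some libraries are installed. Note that python-docx reads `.docx` files only, so legacy `.doc` uploads would need a separate converter that this sketch does not cover.

```python
import io


def extract_txt(data: bytes) -> str:
    """Decode plain text, replacing undecodable bytes instead of failing."""
    return data.decode("utf-8", errors="replace")


def extract_pdf(data: bytes) -> str:
    """Extract text page by page with pdfplumber."""
    import pdfplumber  # third-party, imported lazily

    with pdfplumber.open(io.BytesIO(data)) as pdf:
        # extract_text() can return None for pages with no text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def extract_docx(data: bytes) -> str:
    """Extract paragraph text with python-docx (.docx only)."""
    import docx  # provided by the python-docx package, imported lazily

    document = docx.Document(io.BytesIO(data))
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


# Hypothetical registry mapping extensions to extractor functions.
EXTRACTORS = {".txt": extract_txt, ".pdf": extract_pdf, ".docx": extract_docx}


def normalize(text: str) -> str:
    """Collapse all runs of whitespace into single spaces."""
    return " ".join(text.split())


def extract_text(extension: str, data: bytes) -> str:
    """Pick an extractor by extension and return normalized text."""
    extractor = EXTRACTORS.get(extension.lower())
    if extractor is None:
        raise ValueError(f"no extractor registered for {extension!r}")
    return normalize(extractor(data))
```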

### API Strategy

**Recommendation: Create new endpoints** because:

- Current `/search` expects a text query, not document content
- Need specialized document processing logic
- Need different response format for cross-references
- Upload functionality is entirely new
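
To illustrate the first point: since the existing search takes a short text query, a new endpoint has to turn a whole document into queries before it can reuse that logic. The sketch below splits the text into fixed-size word chunks and keeps each document's best score across chunks. `search_fn` stands in for the existing search service; its signature, like the chunk size and the function names, is an assumption.

```python
from typing import Callable, Iterable

# Assumed shape of the existing search: query in, (doc_id, score) pairs out.
SearchFn = Callable[[str], Iterable[tuple[str, float]]]


def split_into_queries(text: str, max_words: int = 50) -> list[str]:
    """Split document text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def search_document(
    text: str, search_fn: SearchFn, max_words: int = 50
) -> list[tuple[str, float]]:
    """Run every chunk through search_fn and keep each document's best score."""
    best: dict[str, float] = {}
    for query in split_into_queries(text, max_words):
        for doc_id, score in search_fn(query):
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Highest score first; ties are broken by document id for stable output.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

Taking the maximum rather than the average means a document that strongly matches one passage of the upload is still surfaced, which suits cross-referencing; averaging would favor documents that are loosely similar throughout.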

### Frontend Strategy

- Add new tab to existing tab system
- Use existing styling and components where possible
- Implement file upload using HTML5 File API
- Use existing API calling patterns

## File Structure

### New Backend Files

```
embedding/
├── document_processor.py        # Handle file uploads and text extraction
└── cross_reference_service.py   # Cross-reference logic
```

### New Frontend Components

```
frontend/
├── js/
│   ├── cross-reference.js   # Cross-reference tab logic
│   └── file-upload.js       # File upload utilities
└── css/
    └── cross-reference.css  # Specific styling
```

### API Endpoints Summary

1. **`POST /upload-document`** - New endpoint needed
2. **`POST /find-cross-references`** - New endpoint needed
3. **`GET /documents`** - Use existing
4. **`GET /documents/{id}`** - Use existing

## Development Priority

1. Backend document upload and processing
2. Cross-reference matching logic
3. Frontend tab and upload interface
4. Results display and formatting
5. Error handling and validation

## Benefits of This Approach

- Leverages existing search infrastructure
- Maintains separation of concerns
- Scalable and maintainable
- Consistent with current API patterns
- No database changes needed
20
demoEnv.txt
Normal file
@@ -0,0 +1,20 @@
# MySQL Database Configuration
MYSQL_HOST=47.130.80.140
MYSQL_PORT=3333
MYSQL_USER=root
MYSQL_PASSWORD=<redacted>
MYSQL_DATABASE=agc

# App Configuration
DEBUG=True
STREAMLIT_SERVER_PORT=8501

# OpenAI API Configuration
OPENAI_API_KEY=<redacted>
OPENAI_CHAT_MODEL=gpt-4o

# Application Settings
MAX_SEARCH_RESULTS=5
SIMILARITY_THRESHOLD=0.7
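
For reference, a file in this `KEY=VALUE` format can be read with the standard library alone. The sketch below parses the text and converts the application settings to typed values. The names `parse_env`, `SearchSettings`, and `load_search_settings` are hypothetical, and the fallback defaults simply mirror the values in the file above.

```python
from dataclasses import dataclass


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class SearchSettings:
    """Typed view of the application settings (illustrative)."""
    max_search_results: int
    similarity_threshold: float
    debug: bool


def load_search_settings(env: dict[str, str]) -> SearchSettings:
    """Convert raw strings to typed settings, with assumed defaults."""
    return SearchSettings(
        max_search_results=int(env.get("MAX_SEARCH_RESULTS", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.7")),
        debug=env.get("DEBUG", "False").lower() == "true",
    )
```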
@@ -2267,7 +2267,7 @@

<div class="footer-bottom">
<p>
- © 2023 Attorney General's Chambers of Malaysia. All Rights
+ © 2025 AGC Malaysia. All Rights
Reserved.
</p>
</div>