ai-benchmark/CHANGES_SUMMARY.md
second_constantine 8ef3a16e3a feat: add MongoDB test generation and update dependencies
2026-01-22 20:11:52 +03:00

# Summary of Changes: JSON to TXT Test Format Conversion
## Overview
All test files have been converted from JSON to a plain-text (TXT) format in which the prompt and the expected result are separated by the marker `==============`. The new format is easier to read and maintain.
## Changes Made
### 1. Updated Benchmark Modules (src/benchmarks/*.py)
**Files modified:**
- `src/benchmarks/translation.py`
- `src/benchmarks/summarization.py`
- `src/benchmarks/codegen.py`
**Changes:**
- Modified `load_test_cases()` method to read TXT files instead of JSON
- Each TXT file is parsed by splitting its contents on the separator `==============`
- The first part is the prompt; the second part is the expected result
- The existing test logic is unchanged; only the loading step differs
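
The parsing described above could look roughly like the sketch below. It is an illustration, not the code in `src/benchmarks/*.py`: the helper `parse_test_case`, the dictionary keys `prompt`, `expected` and `name`, the `*.txt` glob, and the standalone (non-method) form of `load_test_cases` are all assumptions made for the example.

```python
from pathlib import Path

SEPARATOR = "=============="  # separator string used in the TXT test files


def parse_test_case(text: str) -> dict:
    """Split one TXT test case into its prompt and expected result."""
    # partition() splits on the first occurrence only, so a later run of '='
    # inside the expected result does not break the parse.
    prompt, sep, expected = text.partition(SEPARATOR)
    if not sep:
        raise ValueError("separator not found in test case")
    return {"prompt": prompt.strip(), "expected": expected.strip()}


def load_test_cases(test_dir: str) -> list:
    """Read every *.txt file in test_dir (hypothetical layout) as a test case."""
    cases = []
    for path in sorted(Path(test_dir).glob("*.txt")):
        case = parse_test_case(path.read_text(encoding="utf-8"))
        case["name"] = path.stem  # assumed: the file name identifies the test
        cases.append(case)
    return cases
```

Splitting on the first separator only is a design choice of this sketch; the actual modules may treat repeated separators differently.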
### 2. Updated Test Generator (scripts/generate_tests.py)
**Changes:**
- Modified `generate_tests()` to create TXT files instead of JSON
- TXT files use format: `prompt\n==============\nexpected`
- Updated validation logic to work with TXT files
- Made `--model` and `--ollama-url` optional when the `--validate` flag is used
- Added proper error handling for validation mode
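
One way to make `--model` and `--ollama-url` required only for generation is to declare them optional and check them after parsing. The sketch below assumes `argparse` is used and that `--validate` takes the test directory as its value (consistent with the validation command in this document); the function name `parse_args` and the help strings are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI arguments; --model/--ollama-url matter only when generating."""
    parser = argparse.ArgumentParser(description="Generate or validate TXT test files")
    parser.add_argument("--validate", metavar="DIR",
                        help="validate the TXT files in DIR instead of generating")
    parser.add_argument("--model", help="model name (generation only)")
    parser.add_argument("--ollama-url", help="Ollama server URL (generation only)")
    args = parser.parse_args(argv)

    # Enforce the generation-only options by hand, since argparse cannot
    # express "required unless another flag is present" declaratively.
    if args.validate is None and not (args.model and args.ollama_url):
        parser.error("--model and --ollama-url are required unless --validate is given")
    return args
```

With this shape, `parse_args(["--validate", "tests/translation"])` succeeds without a model, while an empty argument list exits with a usage error.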
### 3. Created Conversion Script (scripts/convert_json_to_txt.py)
**New file:** `scripts/convert_json_to_txt.py`
**Features:**
- Converts existing JSON test files to TXT format
- Preserves all test data
- Uses the same separator format
- Can be run on any test directory
**Usage:**
```bash
python scripts/convert_json_to_txt.py tests/translation
python scripts/convert_json_to_txt.py tests/summarization
python scripts/convert_json_to_txt.py tests/codegen
```
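
For orientation, a conversion of this kind can be sketched as follows. This is not the contents of `scripts/convert_json_to_txt.py`: the JSON key names `prompt` and `expected`, the one-test-per-file layout, and the function names are assumptions for illustration.

```python
import json
from pathlib import Path

SEPARATOR = "=============="  # same separator as the TXT test format


def json_to_txt(data: dict) -> str:
    """Render one test case as 'prompt\\n<separator>\\nexpected'."""
    # "prompt" and "expected" are assumed key names in the old JSON files.
    return f"{data['prompt']}\n{SEPARATOR}\n{data['expected']}"


def convert_file(json_path: Path) -> Path:
    """Write a .txt file next to json_path and return its path."""
    data = json.loads(json_path.read_text(encoding="utf-8"))
    txt_path = json_path.with_suffix(".txt")
    txt_path.write_text(json_to_txt(data), encoding="utf-8")
    return txt_path


def convert_directory(test_dir: str) -> list:
    """Convert every *.json file in test_dir; the JSON originals are kept."""
    return [convert_file(p) for p in sorted(Path(test_dir).glob("*.json"))]
```

Keeping the JSON originals in place makes the conversion easy to re-run or roll back; whether the real script deletes them is not stated here.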
### 4. Test Data Conversion
**Converted directories:**
- `tests/translation/` - 5 test files
- `tests/summarization/` - 4 test files
- `tests/codegen/` - 3 test files
**Format:**
```
Prompt text here
==============
Expected result here
```
### 5. Validation
**Validation script:**
```bash
python scripts/generate_tests.py --validate tests/translation
```
**Results:**
- All 12 test files successfully converted
- All tests pass validation
- Benchmark script works correctly with new format
- Report generation works as expected
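
The format check behind `--validate` is not spelled out in this summary. A minimal sketch of what such a check might verify is shown below, assuming a valid file has exactly one separator line with non-empty text on both sides; `validate_text` and `validate_directory` are hypothetical names.

```python
from pathlib import Path

SEPARATOR = "=============="


def validate_text(text: str) -> list:
    """Return the problems found in one TXT test case (empty list if valid)."""
    lines = [line.strip() for line in text.splitlines()]
    if lines.count(SEPARATOR) != 1:
        return ["expected exactly one separator line"]
    idx = lines.index(SEPARATOR)
    problems = []
    if not any(lines[:idx]):       # no non-blank line before the separator
        problems.append("prompt is empty")
    if not any(lines[idx + 1:]):   # no non-blank line after the separator
        problems.append("expected result is empty")
    return problems


def validate_directory(test_dir: str) -> dict:
    """Map each invalid file name in test_dir to its list of problems."""
    report = {}
    for path in sorted(Path(test_dir).glob("*.txt")):
        problems = validate_text(path.read_text(encoding="utf-8"))
        if problems:
            report[path.name] = problems
    return report
```

An empty dictionary from `validate_directory` would correspond to all files in that directory passing validation.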
## Benefits
1. **Better readability** - Human-readable format without JSON syntax
2. **Simpler editing** - No need to deal with JSON structure
3. **Clear separation** - The explicit separator makes it obvious which part is the prompt and which is the expected result
4. **Backward compatible** - All existing functionality preserved
5. **Easy migration** - Conversion script handles existing tests
## Testing
All changes have been tested:
- ✅ Validation script works correctly
- ✅ Benchmark script runs successfully
- ✅ Report generation works
- ✅ All test files converted successfully
- ✅ New TXT format is properly read by all benchmark modules
## Migration Complete
The system now:
- ✅ Generates TXT files instead of JSON
- ✅ Reads TXT files instead of JSON
- ✅ Validates TXT files with proper format
- ✅ Maintains all existing functionality