- Added pymongo==3.13.0 to requirements.txt for MongoDB connectivity
- Implemented generate_summarization_from_mongo.py script to generate summarization tests from MongoDB
- Updated run.sh to support 'gen-mongo' command for MongoDB test generation
- Enhanced scripts/README.md with documentation for new MongoDB functionality
- Improved help text in run.sh to clarify available commands and usage examples
```
This commit adds MongoDB integration for test generation and updates the documentation and scripts accordingly.
Summary of Changes: JSON to TXT Test Format Conversion
Overview
All test files have been converted from JSON format to TXT format with a clear separator ============== for better readability and maintainability.
Changes Made
1. Updated Benchmark Modules (src/benchmarks/*.py)
Files modified:
- src/benchmarks/translation.py
- src/benchmarks/summarization.py
- src/benchmarks/codegen.py
Changes:
- Modified load_test_cases() method to read TXT files instead of JSON
- TXT files are parsed by splitting on the separator ==============
- Prompt is in the first part, expected result is in the second part
- Maintains backward compatibility with existing test logic
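The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the actual load_test_cases() code: the helper name parse_test_case is hypothetical, and it assumes the separator sits on its own line between the prompt and the expected result.

```python
from typing import Tuple

# Separator line used in the TXT test format (taken from this summary).
SEPARATOR = "=============="


def parse_test_case(text: str) -> Tuple[str, str]:
    """Split one TXT test case into (prompt, expected).

    Hypothetical helper: assumes the separator appears on its own line.
    Only the first separator line is treated as the boundary.
    """
    prompt, sep, expected = text.partition(f"\n{SEPARATOR}\n")
    if not sep:
        raise ValueError("separator line not found in test case")
    return prompt.strip(), expected.strip()
```

Splitting on the first separator only means an expected result that itself contains the separator line is kept intact, which may or may not match the real loader's behavior.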
2. Updated Test Generator (scripts/generate_tests.py)
Changes:
- Modified generate_tests() to create TXT files instead of JSON
- TXT files use format: prompt\n==============\nexpected
- Updated validation logic to work with TXT files
- Made --model and --ollama-url optional when using --validate flag
- Added proper error handling for validation mode
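One way to make --model and --ollama-url optional only in validation mode is to declare them as optional in argparse and enforce them manually. The sketch below is an assumption about how this could be wired, not the script's actual code: build_parser and parse_args are hypothetical names, and --validate is assumed to take the test directory as its value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate or validate TXT test files")
    # Assumption: --validate takes the directory to check as its value.
    parser.add_argument("--validate", metavar="DIR", help="validate TXT tests in DIR")
    # Declared optional here; enforced in parse_args() for generation mode only.
    parser.add_argument("--model", help="model name (generation mode)")
    parser.add_argument("--ollama-url", help="Ollama endpoint URL (generation mode)")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Generation needs both options; validation needs neither.
    if args.validate is None and not (args.model and args.ollama_url):
        parser.error("--model and --ollama-url are required unless --validate is given")
    return args
```

parser.error() prints a usage message and exits with status 2, which gives validation mode and generation mode the same error-reporting path.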
3. Created Conversion Script (scripts/convert_json_to_txt.py)
New file: scripts/convert_json_to_txt.py
Features:
- Converts existing JSON test files to TXT format
- Preserves all test data
- Uses the same separator format
- Can be run on any test directory
Usage:
python scripts/convert_json_to_txt.py tests/translation
python scripts/convert_json_to_txt.py tests/summarization
python scripts/convert_json_to_txt.py tests/codegen
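A conversion along these lines could look like the sketch below. It is illustrative only: the JSON key names "prompt" and "expected" are assumptions (the original JSON schema is not shown here), and convert_file and convert_directory are hypothetical names rather than the functions in scripts/convert_json_to_txt.py.

```python
import json
from pathlib import Path
from typing import List

# Separator line used in the TXT test format (taken from this summary).
SEPARATOR = "=============="


def convert_file(json_path: Path) -> Path:
    """Write a .txt file next to json_path and return its path."""
    data = json.loads(json_path.read_text(encoding="utf-8"))
    # Assumed schema: {"prompt": ..., "expected": ...}
    body = f"{data['prompt']}\n{SEPARATOR}\n{data['expected']}\n"
    txt_path = json_path.with_suffix(".txt")
    txt_path.write_text(body, encoding="utf-8")
    return txt_path


def convert_directory(test_dir: str) -> List[Path]:
    """Convert every *.json file directly inside test_dir."""
    return [convert_file(p) for p in sorted(Path(test_dir).glob("*.json"))]
```

This version leaves the source JSON files in place, so a conversion can be checked before the originals are removed.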
4. Test Data Conversion
Converted directories:
- tests/translation/ - 5 test files
- tests/summarization/ - 4 test files
- tests/codegen/ - 3 test files
Format:
Prompt text here
==============
Expected result here
5. Validation
Validation script:
python scripts/generate_tests.py --validate tests/translation
Results:
- All 12 test files successfully converted
- All tests pass validation
- Benchmark script works correctly with new format
- Report generation works as expected
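The kind of check a format validation performs might be sketched as follows. The rules here are assumptions rather than the actual logic in scripts/generate_tests.py: a file is treated as valid when it contains exactly one separator line with non-empty text on both sides, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Separator line used in the TXT test format (taken from this summary).
SEPARATOR = "=============="


def validate_text(text: str) -> List[str]:
    """Return a list of problems found in one test case (empty if valid)."""
    lines = text.splitlines()
    sep_indexes = [i for i, line in enumerate(lines) if line.strip() == SEPARATOR]
    # Assumed rule: exactly one separator line per file.
    if len(sep_indexes) != 1:
        return [f"expected exactly one separator line, found {len(sep_indexes)}"]
    idx = sep_indexes[0]
    problems = []
    if not "\n".join(lines[:idx]).strip():
        problems.append("prompt is empty")
    if not "\n".join(lines[idx + 1:]).strip():
        problems.append("expected result is empty")
    return problems


def validate_directory(test_dir: str) -> Dict[str, List[str]]:
    """Map each invalid *.txt file name in test_dir to its problems."""
    report = {}
    for path in sorted(Path(test_dir).glob("*.txt")):
        problems = validate_text(path.read_text(encoding="utf-8"))
        if problems:
            report[path.name] = problems
    return report
```

An empty dictionary from validate_directory means every TXT file in the directory passed these checks.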
Benefits
- Better readability - Human-readable format without JSON syntax
- Simpler editing - No need to deal with JSON structure
- Clear separation - Explicit separator makes it obvious what's prompt vs expected
- Backward compatible - Existing test logic and functionality are preserved; only the on-disk file format changes
- Easy migration - Conversion script handles existing tests
Testing
All changes have been tested:
- ✅ Validation script works correctly
- ✅ Benchmark script runs successfully
- ✅ Report generation works
- ✅ All test files converted successfully
- ✅ New TXT format is properly read by all benchmark modules
Migration Complete
The system now:
- ✅ Generates TXT files instead of JSON
- ✅ Reads TXT files instead of JSON
- ✅ Validates TXT files with proper format
- ✅ Maintains all existing functionality