Added .DS_Store to the ignore list to prevent macOS temporary files from being tracked in the repository. This follows the existing pattern of ignoring Python cache files and results directories.
Moved TEST_SEPARATOR import from benchmarks.base to constants module in codegen, summarization, and translation benchmarks for better modularity and maintainability. This change improves code organization by centralizing constants in a dedicated module.
- Removed unused imports and constants
- Simplified run method signature by swapping num_ctx and context_size parameters
- Added test case name logging for better traceability
- Updated main function to pass context_size to benchmark run method
- Improved code clarity and maintainability
Remove the "Лог файл" (Log file) column from the report generation as it's no longer needed. This simplifies the report structure and removes unused functionality.
The commit corrects the argument name used for logging the context size from `num_ctx` to `context_size` to match the actual parameter name, ensuring accurate logging output. This change improves code consistency and makes the log messages more readable.
This commit adds support for specifying context size when running benchmarks, which is passed to the Ollama client as the `num_ctx` option. The changes include:
- Updated the `run` method in the base benchmark class to accept an optional `context_size` parameter
- Modified the Ollama client call to include context size in the options when provided
- Updated the `run_benchmarks` function to accept and pass through the context size
- Added example usage to the help output showing how to use the new context size parameter
- Fixed prompt formatting in the summarization benchmark to use `text` instead of `task`
The changes enable running benchmarks with custom context sizes, which is useful for testing models with different context window limitations.
- Updated summarization prompt to require Russian output and exclude non-textual elements
- Upgraded ollama dependency to v0.6.1
- Enhanced run.sh script to support both single record and file-based ID input for MongoDB test generation
- Updated documentation in scripts/README.md to reflect new functionality
- Added verbose flag to generate_summarization_from_mongo.py for better debugging
```
This commit message follows the conventional commit format with a short title (50-72 characters) and provides a clear description of the changes made and their purpose.
- Added scripts directory with generate_tests.py for automated test generation
- Added prompts directory with category-specific prompts for test generation
- Updated README with documentation for test generation workflow
- Modified test data format to TXT with '=== разделитель ===' separator
- Enhanced documentation with sections on test generation, validation, and reporting
- Added detailed instructions for using the new test generation capabilities
Update documentation to reflect new TXT format with separator for summarization tests instead of JSON format. Clarify that expected field may be empty if summary generation fails.
feat: change test generation to TXT format with separator
Change test generation from JSON to TXT format with TEST_SEPARATOR. Add filename sanitization function to handle MongoDB record IDs. Update output path and file naming logic. Add attempt to generate expected summary through LLM with fallback to empty string.
Previously, report filenames included a timestamp (e.g., `benchmark_20231015_143022.md`), which caused issues when regenerating reports as it would create duplicate files. The timestamp is no longer included in the filenames to ensure consistent naming and avoid overwriting conflicts. This change affects both benchmark and summary report generation in `src/utils/report.py`.
- Added pymongo==3.13.0 to requirements.txt for MongoDB connectivity
- Implemented generate_summarization_from_mongo.py script to generate summarization tests from MongoDB
- Updated run.sh to support 'gen-mongo' command for MongoDB test generation
- Enhanced scripts/README.md with documentation for new MongoDB functionality
- Improved help text in run.sh to clarify available commands and usage examples
```
This commit adds MongoDB integration for test generation and updates the documentation and scripts accordingly.
Added documentation for test generation through Ollama, including new command-line arguments for `generate_tests.py` and updated `run.sh` script. Also added a new `gen` command to `run.sh` for generating tests via Ollama. This improves usability by providing clear instructions and automation for test generation.
Added **/__pycache__/ to .gitignore to prevent Python cache directories from being tracked across all directories. This improves repository cleanliness and reduces unnecessary files in version control.
- Added run.sh script with init, upd, run, and clean commands
- Updated README.md to document run.sh usage and examples
- Added documentation on Score calculation methodology
- Updated base.py to include score calculation logic
```
This commit message follows the conventional commit format with a short title and a detailed description of the changes made. It explains what was changed and why, making it clear and informative.