feat: add MongoDB test generation and update dependencies

- Added pymongo==3.13.0 to requirements.txt for MongoDB connectivity
- Implemented generate_summarization_from_mongo.py script to generate summarization tests from MongoDB
- Updated run.sh to support 'gen-mongo' command for MongoDB test generation
- Enhanced scripts/README.md with documentation for new MongoDB functionality
- Improved help text in run.sh to clarify available commands and usage examples

This commit adds MongoDB integration for test generation and updates the documentation and scripts accordingly.
2026-01-22 20:11:52 +03:00
parent f117c7b23c
commit 8ef3a16e3a
41 changed files with 728 additions and 164 deletions
--- a/CHANGES_SUMMARY.md
+++ b/CHANGES_SUMMARY.md
@@ -0,0 +1,97 @@
 # Summary of Changes: JSON to TXT Test Format Conversion
 ## Overview
 All test files have been converted from JSON format to TXT format with a clear separator `==============` for better readability and maintainability.
 ## Changes Made
 ### 1. Updated Benchmark Modules (src/benchmarks/*.py)
 **Files modified:**
 - `src/benchmarks/translation.py`
 - `src/benchmarks/summarization.py`
 - `src/benchmarks/codegen.py`
 **Changes:**
 - Modified `load_test_cases()` method to read TXT files instead of JSON
 - TXT files are parsed by splitting on the separator `==============`
 - Prompt is in the first part, expected result is in the second part
 - Maintains backward compatibility with existing test logic
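The parsing rule described above can be sketched as follows. This is a minimal illustration, not the project's actual `load_test_cases()` code: the helper name `parse_test_text` is hypothetical, and the separator value is copied from the `TEST_SEPARATOR` constant that this commit adds in `src/constants.py`.

```python
from typing import Optional, Tuple

# Same value as TEST_SEPARATOR in src/constants.py from this commit.
TEST_SEPARATOR = "\n==============\n"


def parse_test_text(content: str) -> Optional[Tuple[str, str]]:
    """Split TXT test content into (prompt, expected).

    Returns None when the separator is missing. The name parse_test_text
    is hypothetical and used only for this illustration.
    """
    # maxsplit=1: only the first separator divides prompt from expected,
    # so any later separator-like line stays inside the expected part.
    parts = content.split(TEST_SEPARATOR, 1)
    if len(parts) != 2:
        return None
    return parts[0], parts[1]
```

Because the split happens on the first separator only, a prompt that itself contains the separator line would be cut short, so prompts are assumed not to contain it.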
 ### 2. Updated Test Generator (scripts/generate_tests.py)
 **Changes:**
 - Modified `generate_tests()` to create TXT files instead of JSON
 - TXT files use format: `prompt\n==============\nexpected`
 - Updated validation logic to work with TXT files
 - Made `--model` and `--ollama-url` optional when using `--validate` flag
 - Added proper error handling for validation mode
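One way to make `--model` and `--ollama-url` optional only in validation mode is sketched below. It is a variant, not the committed code: the commit prints an error and calls `sys.exit(1)`, whereas this sketch uses `argparse`'s own `parser.error()`, which exits with status 2. The function name `parse_args` is hypothetical.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse CLI arguments; generation flags are only enforced without --validate."""
    parser = argparse.ArgumentParser(description="Test generator (sketch)")
    parser.add_argument("--count", type=int, default=1)
    parser.add_argument("--model", type=str, help="required when generating tests")
    parser.add_argument("--ollama-url", type=str, help="required when generating tests")
    parser.add_argument("--validate", type=str, help="directory with tests to validate")
    args = parser.parse_args(argv)

    if not args.validate:
        # Generation mode: both connection parameters must be present.
        required = (("--model", args.model), ("--ollama-url", args.ollama_url))
        missing = [flag for flag, value in required if not value]
        if missing:
            # parser.error() prints usage and raises SystemExit(2).
            parser.error("missing for generation: " + ", ".join(missing))
    return args
```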
 ### 3. Created Conversion Script (scripts/convert_json_to_txt.py)
 **New file:** `scripts/convert_json_to_txt.py`
 **Features:**
 - Converts existing JSON test files to TXT format
 - Preserves all test data
 - Uses the same separator format
 - Can be run on any test directory
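The "preserves all test data" claim can be checked per file with a round trip. The sketch below is an assumption-labelled illustration, not the committed script: `convert_and_check` is a hypothetical helper that writes the TXT file and then reads it back to confirm that prompt and expected survive unchanged.

```python
import json
from pathlib import Path

# Same value as TEST_SEPARATOR in src/constants.py from this commit.
TEST_SEPARATOR = "\n==============\n"


def convert_and_check(json_path: Path) -> Path:
    """Convert one JSON test to TXT and verify it parses back identically.

    Hypothetical helper for illustration only.
    """
    data = json.loads(json_path.read_text(encoding="utf-8"))
    txt_path = json_path.with_suffix(".txt")
    txt_path.write_text(
        f"{data['prompt']}{TEST_SEPARATOR}{data['expected']}", encoding="utf-8"
    )

    # Read back and split the same way the benchmark loaders do.
    parts = txt_path.read_text(encoding="utf-8").split(TEST_SEPARATOR, 1)
    if parts != [data["prompt"], data["expected"]]:
        # Happens, for example, when the prompt itself contains the separator.
        raise ValueError(f"round trip changed the data in {json_path.name}")
    return txt_path
```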
 **Usage:**
 ```bash
 python scripts/convert_json_to_txt.py tests/translation
 python scripts/convert_json_to_txt.py tests/summarization
 python scripts/convert_json_to_txt.py tests/codegen
 ```
 ### 4. Test Data Conversion
 **Converted directories:**
 - `tests/translation/` - 5 test files
 - `tests/summarization/` - 4 test files
 - `tests/codegen/` - 3 test files
 **Format:**
 ```
 Prompt text here
 ==============
 Expected result here
 ```
 ### 5. Validation
 **Validation script:**
 ```bash
 python scripts/generate_tests.py --validate tests/translation
 ```
 **Results:**
 - All 12 test files successfully converted
 - All tests pass validation
 - Benchmark script works correctly with new format
 - Report generation works as expected
 ## Benefits
 1. **Better readability** - Human-readable format without JSON syntax
 2. **Simpler editing** - No need to deal with JSON structure
 3. **Clear separation** - Explicit separator makes it obvious what's prompt vs expected
 4. **Backward compatible** - All existing functionality preserved
 5. **Easy migration** - Conversion script handles existing tests
 ## Testing
 All changes have been tested:
 - ✅ Validation script works correctly
 - ✅ Benchmark script runs successfully
 - ✅ Report generation works
 - ✅ All test files converted successfully
 - ✅ New TXT format is properly read by all benchmark modules
 ## Migration Complete
 The system now:
 - ✅ Generates TXT files instead of JSON
 - ✅ Reads TXT files instead of JSON
 - ✅ Validates TXT files with proper format
 - ✅ Maintains all existing functionality
--- a/CONVERSION_GUIDE.md
+++ b/CONVERSION_GUIDE.md
@@ -0,0 +1,135 @@
 # Guide to Converting Tests from JSON to TXT Format
 ## Overview of Changes
 To make the test data easier to read, all tests were converted from JSON to a TXT format that uses a separator.
 ## New TXT File Format
 ### File Structure
 Each TXT file contains two sections separated by a constant:
 ```
 prompt
 ==============
 expected
 ```
 Where:
 - `prompt` - the request text sent to the model
 - `expected` - the expected response from the model
 - `==============` - the separator (the `TEST_SEPARATOR` constant)
 ### Example
 **test1.txt:**
 ```
 Translate the following English text to Russian: 'Hello, how are you?'
 ==============
 Привет, как дела?
 ```
 ## Changed Files
 ### 1. Test generation scripts
 **File:** `scripts/generate_tests.py`
 **Changes:**
 - Now generates TXT files instead of JSON
 - Uses the `TEST_SEPARATOR` constant to separate prompt and expected
 - Validation now checks that TXT files contain the separator
 ### 2. Benchmarks
 **Files:**
 - `src/benchmarks/translation.py`
 - `src/benchmarks/summarization.py`
 - `src/benchmarks/codegen.py`
 **Changes:**
 - All benchmarks now read TXT files instead of JSON
 - They use the `TEST_SEPARATOR` constant to parse files
 - The test-loading logic was updated to work with the TXT format
 ### 3. Base benchmark class
 **File:** `src/benchmarks/base.py`
 **Changes:**
 - Imports the `TEST_SEPARATOR` constant from `constants.py`, so benchmark modules can import it from `benchmarks.base`
 - Updated imports to support the new format
 ### 4. Conversion script
 **File:** `scripts/convert_json_to_txt.py` (new)
 **Purpose:**
 - Converts existing JSON tests to the new TXT format
 - Saves each test under the same name with a .txt extension
 - Uses the `TEST_SEPARATOR` constant to separate the data
 ## How to Use
 ### 1. Converting existing tests
 ```bash
 python scripts/convert_json_to_txt.py tests/translation
 python scripts/convert_json_to_txt.py tests/summarization
 python scripts/convert_json_to_txt.py tests/codegen
 ```
 ### 2. Generating new tests
 ```bash
 python scripts/generate_tests.py --count 5 --category translation --model second_constantine/t-lite-it-1.0:7b --ollama-url http://localhost:11434
 ```
 ### 3. Validating tests
 ```bash
 python scripts/generate_tests.py --validate tests/translation
 ```
 ### 4. Running benchmarks
 ```bash
 python src/main.py -m second_constantine/t-lite-it-1.0:7b -u http://localhost:11434 -b translation
 ```
 ## Advantages of the New Format
 1. **Readability**: Test contents are easier to read and understand
 2. **Easy editing**: Tests can be edited in any text editor
 3. **Version control**: Changes to tests are easier to track in version control systems
 4. **Universality**: The format is more universal and easier to understand
 ## Technical Details
 ### The TEST_SEPARATOR constant
 ```python
 TEST_SEPARATOR = "\n==============\n"
 ```
 This constant is used throughout the system for:
 - Generating TXT files
 - Parsing TXT files
 - Validating tests
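The validation use of the constant can be sketched as below. This is a simplified illustration rather than the project's `validate_test()`: the helper name `find_problems` is hypothetical, and treating an empty `expected` part as a problem is an assumption (the MongoDB generator in this commit deliberately leaves it empty).

```python
from typing import List

# Same value as TEST_SEPARATOR in src/constants.py from this commit.
TEST_SEPARATOR = "\n==============\n"


def find_problems(content: str) -> List[str]:
    """Return a list of format problems found in TXT test content.

    Hypothetical helper; an empty list means the content looks valid.
    """
    parts = content.split(TEST_SEPARATOR, 1)
    if len(parts) != 2:
        return ["missing separator"]

    prompt, expected = parts
    problems = []
    if not prompt.strip():
        problems.append("empty prompt")
    # Assumption: an empty expected part is reported, not silently accepted.
    if not expected.strip():
        problems.append("empty expected")
    return problems
```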
 ### Imports
 All files that use `TEST_SEPARATOR` import it from:
 - `src/constants.py` (for scripts)
 - `constants.py` (for benchmarks, since they are run from src/)
 ## Migration
 1. Run the conversion script for all existing tests
 2. Delete the old JSON files (optional)
 3. Update any external scripts that may refer to the old format
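For the optional cleanup step, a cautious approach is to remove a JSON file only when its TXT counterpart already exists. The sketch below assumes that naming convention (same stem, `.txt` extension); `remove_converted_json` and its `dry_run` parameter are hypothetical and not part of this commit.

```python
from pathlib import Path
from typing import List


def remove_converted_json(test_dir: Path, dry_run: bool = True) -> List[Path]:
    """List (and optionally delete) JSON tests that already have a TXT twin.

    Hypothetical helper. With dry_run=True nothing is deleted.
    """
    removable = []
    for json_file in sorted(test_dir.glob("*.json")):
        # Only touch files whose converted counterpart is present.
        if json_file.with_suffix(".txt").exists():
            removable.append(json_file)
            if not dry_run:
                json_file.unlink()
    return removable
```

Running it first with the default `dry_run=True` shows what would be deleted before anything is removed.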
 ## Support
 If you have questions or run into problems, contact the project developers.
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,3 @@
 ollama>=0.1.0
-py-markdown-table>=1.3.0
+pymongo==3.13.0
 tqdm>=4.60.0
--- a/run.sh
+++ b/run.sh
@@ -44,13 +44,24 @@ if [ -n "$1" ]; then
    echo "🤖 Generating tests via Ollama..."
    python scripts/generate_tests.py --count 1 --category all --model second_constantine/t-lite-it-1.0:7b --ollama-url http://10.0.0.4:11434
    echo "✅ Tests generated successfully"
  elif [[ "$1" == "gen-mongo" ]]; then
    activate
    echo "🔍 Generating summarization tests from MongoDB... "
    python scripts/generate_summarization_from_mongo.py --record-id "$2"
    echo "✅ Tests from MongoDB generated successfully"
  fi
 else
    echo "  Pass the script name as an argument (optionally followed by the script's arguments)"
    echo "Scripts:"
    echo " * init - initialization, sets up the env"
    echo " * upd - update dependencies"
-    echo " * run - run benchmarks"
+    echo " * run - run benchmarks (translation, summarization, codegen)"
    echo " * clean - clean up reports"
-    echo " * gen - generate tests via Ollama"
+    echo " * gen - generate tests via Ollama (translation, summarization, codegen)"
-fi
+    echo " * gen-mongo - generate summarization tests from MongoDB (usage: ./run.sh gen-mongo <record-id> [output-dir])"
    echo ""
    echo "Usage examples:"
    echo " * ./run.sh run -m second_constantine/t-lite-it-1.0:7b -b translation summarization"
    echo " * ./run.sh gen"
    echo " * ./run.sh gen-mongo 507f1f77bcf86cd799439011"
  fi
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -1,86 +1,69 @@
 # Test generation scripts
-This directory contains utilities for automated generation and validation of test data using Ollama.
+This directory contains scripts for generating test data for the AI benchmark.
-## generate_tests.py
+## Available scripts
-Script for generating test data for the AI benchmark via an LLM.
+### 1. `generate_tests.py`
-### Usage
+Script for generating test data via an LLM (Ollama).
 **Functionality:**
 - Generation of translation tests (translation)
 - Generation of summarization tests (summarization)
 - Generation of code generation tests (codegen)
 - Validation of generated tests
 **Usage:**
 ```bash
-# Generate 2 translation tests via Ollama
+python scripts/generate_tests.py --count 2 --category translation --model second_constantine/t-lite-it-1.0:7b --ollama-url http://10.0.0.4:11434
 python scripts/generate_tests.py --count 2 --category translation --model llama3 --ollama-url http://localhost:11434
 # Generate 1 test for each type
 python scripts/generate_tests.py --count 1 --category all --model llama3 --ollama-url http://localhost:11434
 # Generate 3 summarization tests
 python scripts/generate_tests.py --count 3 --category summarization --model llama3 --ollama-url http://localhost:11434
 # Validate existing tests
 python scripts/generate_tests.py --validate tests/translation
 ```
-### Arguments
+**Parameters:**
 - `--count`: Number of tests to generate (default: 1)
 - `--category`: Test category (translation, summarization, codegen, or all) (default: all)
- `--model`: Name of the model used to generate tests (required parameter, for example: llama3)
+- `--model`: Name of the model used to generate tests (required parameter)
- `--ollama-url`: URL of the Ollama server (required parameter, for example: http://localhost:11434)
+- `--ollama-url`: URL of the Ollama server (required parameter)
 - `--validate`: Validate the tests in the given directory
-### Supported categories
+### 2. `generate_summarization_from_mongo.py`
-1. **translation** - English-to-Russian translation tests (the LLM generates an English text and its translation)
+Script for generating summarization tests from MongoDB.
 2. **summarization** - text summarization tests (the LLM generates a text and its summary)
 3. **codegen** - Python code generation tests (the LLM generates a task and the code)
-### How generation works
+**Functionality:**
 - Extracts the article text from the `rssNotification` collection (field `.meta.topicContent`)
 - Generates test data in JSON format for the AI benchmark
 - Validates the generated tests
-The script uses an LLM to generate tests dynamically:
+**Usage:**
- **Translation**: the LLM creates an English text, then translates it into Russian
+```bash
- **Summarization**: the LLM generates a text about technology, then writes its summary
+python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011
- **Codegen**: the LLM formulates a programming task, then writes the solution
+```
-### Examples of generated tests
+**Parameters:**
 - `--record-id`: ID of the record in MongoDB (required parameter)
 - `--output-dir`: Directory for saving generated tests (default: tests/summarization)
-#### Translation
+**Requirements:**
 - Access to the MongoDB cluster (10.0.0.3, 10.0.0.4, 10.0.0.5)
 - The `pymongo` package installed (installed automatically on first run)
 **Format of generated tests:**
 ```json
 {
-  "prompt": "Translate the following English text to Russian: 'Hello, how are you today?'",
+  "prompt": "Summarize the following text in 1-2 sentences: 'Article text from MongoDB'",
-  "expected": "Привет, как дела сегодня?"
+  "expected": ""
 }
 ```
-#### Summarization
+**Note:** The "expected" field will be empty, because the expected result has to be generated separately, either by an LLM or manually.
-```json
+
-{
+## Installing dependencies
-  "prompt": "Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog...'",
+
-  "expected": "A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks."
+The scripts require the following dependencies:
-}
+
 ```bash
 pip install pymongo
 ```
-#### Codegen
+All dependencies are listed in the `requirements.txt` file in the project root.
 ```json
 {
  "prompt": "Write a Python function that calculates the factorial of a number using recursion.",
  "expected": "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n    else:\n        return n * factorial(n-1)"
 }
 ```
 ### Validation
 The script automatically validates generated tests:
 - Checks that the required fields (`prompt`, `expected`) are present
 - Checks that the values are strings
 - Checks that the strings are not empty
 - Supports manual validation of existing tests via `--validate`
 ### Technical details
 - The script uses ollama_client.py to connect to the Ollama server
 - Each generated test gets a unique number (test1.json, test2.json, and so on)
 - If a test with that number already exists, the next available number is used
 - All tests are saved as JSON with UTF-8 encoding
 - Any model available in Ollama is supported
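The numbering rule from the technical details (use the next free test number) can be sketched like this. `next_test_path` is a hypothetical helper name; the `.txt` default reflects the format the generator writes after this commit, while the README text above still mentions `.json`.

```python
from pathlib import Path


def next_test_path(directory: Path, suffix: str = ".txt") -> Path:
    """Return the first testN<suffix> path in directory that does not exist yet.

    Hypothetical helper mirroring the while-loop used in the generators.
    """
    test_num = 1
    # Count upwards from 1, so gaps left by deleted tests are reused first.
    while (directory / f"test{test_num}{suffix}").exists():
        test_num += 1
    return directory / f"test{test_num}{suffix}"
```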
--- a/scripts/convert_json_to_txt.py
+++ b/scripts/convert_json_to_txt.py
@@ -0,0 +1,73 @@
 #!/usr/bin/env python3
 """
 Script for converting existing JSON tests to the new TXT format.
 Converts all tests from JSON to TXT using the separator ==============
 """
 import json
 import os
 import sys
 from pathlib import Path
 # Add the project root to the path so that base can be imported
 sys.path.insert(0, str(Path(__file__).parent.parent))
 from src.benchmarks.base import TEST_SEPARATOR
 def convert_tests(test_dir: str) -> None:
    """Converts all tests in the given directory."""
    test_dir_path = Path(test_dir)
    if not test_dir_path.exists():
        print(f"❌ Directory {test_dir} does not exist")
        return
    converted_count = 0
    for json_file in test_dir_path.glob("*.json"):
        try:
            # Read the JSON file
            with open(json_file, "r", encoding="utf-8") as f:
                data = json.load(f)
            # Build the TXT file name
            txt_file = test_dir_path / f"{json_file.stem}.txt"
            # Save in TXT format
            with open(txt_file, "w", encoding="utf-8") as f:
                f.write(f"{data['prompt']}{TEST_SEPARATOR}{data['expected']}")
            converted_count += 1
            print(f"✅ Converted {json_file.name} to {txt_file.name}")
        except Exception as e:
            print(f"❌ Error while converting {json_file.name}: {str(e)}")
    print(f"\nConversion results:")
    print(f"Tests converted: {converted_count}")
 def main():
    """Main function of the script."""
    import argparse
    parser = argparse.ArgumentParser(
        description="Converter of test data from JSON to TXT format",
        epilog="Usage examples:\n"
               "  python scripts/convert_json_to_txt.py tests/translation\n"
               "  python scripts/convert_json_to_txt.py tests/summarization\n"
               "  python scripts/convert_json_to_txt.py tests/codegen"
    )
    parser.add_argument(
        "test_dir",
        type=str,
        help="Directory with tests to convert (for example: tests/translation)"
    )
    args = parser.parse_args()
    print(f"🔄 Converting tests in {args.test_dir}")
    convert_tests(args.test_dir)
    print("\n✨ Done!")
 if __name__ == "__main__":
    main()
--- a/scripts/generate_summarization_from_mongo.py
+++ b/scripts/generate_summarization_from_mongo.py
@@ -0,0 +1,192 @@
 #!/usr/bin/env python3
 """
 Script for generating summarization tests from MongoDB.
 Extracts the article text from the rssNotification collection (field .meta.topicContent)
 and generates test data in JSON format for the AI benchmark.
 """
 import argparse
 import json
 import sys
 from pathlib import Path
 from typing import Dict, Optional
 import pymongo
 from pymongo import MongoClient
 def connect_to_mongo() -> MongoClient:
    """Connects to the MongoDB cluster."""
    client = MongoClient(
        "mongodb://10.0.0.3:27017,10.0.0.4:27017,10.0.0.5:27017/",
        connectTimeoutMS=30000,
        socketTimeoutMS=30000,
        serverSelectionTimeoutMS=30000,
        retryWrites=True,
        retryReads=True
    )
    return client
 def extract_text_from_topic_content(topic_content: Dict) -> Optional[str]:
    """
    Extracts the article text from .meta.topicContent.
    Args:
        topic_content: Contents of the .meta.topicContent field from MongoDB
    Returns:
        The article text, or None if it could not be extracted
    """
    if not topic_content:
        return None
    # Convert to a string if it is not one already
    content_str = str(topic_content)
    return content_str
 def generate_test_from_mongo_record(record_id: str) -> bool:
    """
    Generates a summarization test from a MongoDB record.
    Args:
        record_id: ID of the record in MongoDB
    Returns:
        True if the test was generated successfully, False on error
    """
    try:
        client = connect_to_mongo()
        db = client['tracker_conbot']
        collection = db['rssNotification']
        # Fetch the record by ID
        record = collection.find_one({"_id": record_id})
        if not record:
            print(f"❌ Record with ID {record_id} was not found in the collection")
            return False
        # Debug information
        print(f"🔍 Record found: {record_id}")
        print(f"📋 Full record structure:")
        print(json.dumps(record, ensure_ascii=False, indent=2, default=str))
        # Extract the text from meta.topicContent
        meta_data = record.get('meta', {})
        topic_content = meta_data.get('topicContent')
        if not topic_content:
            print(f"❌ Record {record_id} has no meta.topicContent field")
            return False
        print(f"📝 Type of meta.topicContent: {type(topic_content)}")
        print(f"📝 Contents of meta.topicContent (first 500 characters):")
        print(str(topic_content)[:500])
        # Extract the text
        article_text = extract_text_from_topic_content(topic_content)
        if not article_text:
            print(f"❌ Could not extract text from meta.topicContent of record {record_id}")
            return False
        print(f"📝 Final text (first 500 characters): {article_text[:500]}")
        # Build the test
        test_data = {
            "prompt": f"Summarize the following text in 1-2 sentences: '{article_text}'",
            "expected": ""  # Left empty because the expected result has to be generated separately
        }
        # Create the directory for the test (always tests/summarization)
        output_path = Path("tests") / "summarization"
        output_path.mkdir(parents=True, exist_ok=True)
        # Find the next available test number
        test_num = 1
        while True:
            test_file = output_path / f"test{test_num}.json"
            if not test_file.exists():
                break
            test_num += 1
        # Save the test
        with open(test_file, "w", encoding="utf-8") as f:
            json.dump(test_data, f, ensure_ascii=False, indent=2)
        print(f"✅ Created test tests/summarization/test{test_num}.json")
        print(f"   Source: MongoDB record {record_id}")
        print(f"   Article text (first 100 characters): {article_text[:100]}...")
        return True
    except Exception as e:
        print(f"❌ Error while generating the test: {e}")
        return False
    finally:
        if 'client' in locals():
            client.close()
 def validate_test(test_data: Dict[str, str]) -> bool:
    """Validates test data."""
    if not isinstance(test_data, dict):
        print("❌ The test must be a dictionary (JSON object)")
        return False
    if "prompt" not in test_data:
        print("❌ Missing field 'prompt'")
        return False
    if "expected" not in test_data:
        print("❌ Missing field 'expected'")
        return False
    if not isinstance(test_data["prompt"], str):
        print("❌ Field 'prompt' must be a string")
        return False
    if not isinstance(test_data["expected"], str):
        print("❌ Field 'expected' must be a string")
        return False
    if not test_data["prompt"].strip():
        print("❌ Field 'prompt' must not be empty")
        return False
    return True
 def main():
    """Main function of the script."""
    parser = argparse.ArgumentParser(
        description="Generator of summarization tests from MongoDB",
        epilog="Usage examples:\n"
               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011 --output-dir tests\n"
               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011 --output-dir results"
    )
    parser.add_argument(
        "--record-id",
        type=str,
        required=True,
        help="ID of the record in MongoDB (required parameter)"
    )
    parser.add_argument(
        "--output-dir",
        type=str,
        default="tests",
        help="Directory for saving generated tests (default: tests)"
    )
    args = parser.parse_args()
    print(f"🔍 Connecting to the MongoDB cluster...")
    print(f"📄 Fetching the record with ID: {args.record_id}")
    print(f"💾 Saving the test to: tests/summarization/")
    success = generate_test_from_mongo_record(args.record_id)
    if success:
        print("\n✨ Done! The test was generated successfully.")
    else:
        print("\n❌ Failed to generate the test.")
        sys.exit(1)
 if __name__ == "__main__":
    main()
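As committed, this script writes `test{N}.json` into `tests/summarization` and does not use the parsed `--output-dir` value, while the benchmark loaders changed in the same commit only pick up `.txt` files. A possible follow-up is sketched below under those observations; `save_summarization_test` is a hypothetical helper, not code from the commit, and it only covers the file-writing step (no MongoDB access).

```python
from pathlib import Path

# Same value as TEST_SEPARATOR in src/constants.py from this commit.
TEST_SEPARATOR = "\n==============\n"


def save_summarization_test(article_text: str, output_dir: Path, expected: str = "") -> Path:
    """Write a summarization test in the TXT format into output_dir.

    Hypothetical helper: uses the caller's directory instead of a fixed path
    and the TXT layout that the benchmark loaders read.
    """
    output_dir.mkdir(parents=True, exist_ok=True)

    # Find the next available test number, as the generators do.
    test_num = 1
    while (output_dir / f"test{test_num}.txt").exists():
        test_num += 1
    test_file = output_dir / f"test{test_num}.txt"

    prompt = f"Summarize the following text in 1-2 sentences: '{article_text}'"
    # expected stays empty by default; it is meant to be filled in later.
    test_file.write_text(f"{prompt}{TEST_SEPARATOR}{expected}", encoding="utf-8")
    return test_file
```

Note that a file with an empty expected part still splits into two parts, so the loaders would accept it; whether such a test is meaningful to score is a separate question.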
--- a/scripts/generate_tests.py
+++ b/scripts/generate_tests.py
@@ -21,6 +21,7 @@ from typing import Dict, List, Optional
 sys.path.insert(0, str(Path(__file__).parent.parent))
 from src.models.ollama_client import OllamaClient
 from src.constants import TEST_SEPARATOR
 def generate_translation_test(ollama: OllamaClient, model: str) -> Dict[str, str]:
    """Generates a translation test via an LLM."""
@@ -126,23 +127,33 @@ def validate_all_tests(test_dir: str) -> None:
    valid_count = 0
    invalid_count = 0
-    for json_file in test_dir_path.glob("*.json"):
+    for txt_file in test_dir_path.glob("*.txt"):
        try:
-            with open(json_file, "r", encoding="utf-8") as f:
+            with open(txt_file, "r", encoding="utf-8") as f:
-                test_data = json.load(f)
+                content = f.read()
            # Split on the separator
            parts = content.split(TEST_SEPARATOR, 1)
            if len(parts) != 2:
                invalid_count += 1
                print(f"❌ {txt_file.name} - invalid format (separator missing)")
                continue
            prompt, expected = parts
            test_data = {
                "prompt": prompt,
                "expected": expected
            }
            if validate_test(test_data):
                valid_count += 1
-                print(f"✅ {json_file.name} - valid")
+                print(f"✅ {txt_file.name} - valid")
            else:
                invalid_count += 1
-                print(f"❌ {json_file.name} - not valid")
+                print(f"❌ {txt_file.name} - not valid")
        except json.JSONDecodeError:
            invalid_count += 1
            print(f"❌ {json_file.name} - invalid JSON")
        except Exception as e:
            invalid_count += 1
-            print(f"❌ {json_file.name} - error: {str(e)}")
+            print(f"❌ {txt_file.name} - error: {str(e)}")
    print(f"\nValidation results:")
    print(f"Valid tests: {valid_count}")
@@ -165,12 +176,12 @@ def generate_tests(ollama: OllamaClient, model: str, count: int, category: str,
            # Check whether a test with this number already exists
            test_num = 1
            while True:
-                test_file = cat_dir / f"test{test_num}.json"
+                test_file = cat_dir / f"test{test_num}.txt"
                if not test_file.exists():
                    break
                test_num += 1
-            print(f"🤖 Generating test {cat}/test{test_num}.json...")
+            print(f"🤖 Generating test {cat}/test{test_num}.txt...")
            # Generate the test via the LLM
            test_data = generate_test(ollama, model, cat)
@@ -180,11 +191,11 @@ def generate_tests(ollama: OllamaClient, model: str, count: int, category: str,
                print(f"❌ Generated an invalid test for {cat}, test number {test_num}")
                continue
-            # Save the test
+            # Save the test in TXT format with the separator
            with open(test_file, "w", encoding="utf-8") as f:
-                json.dump(test_data, f, ensure_ascii=False, indent=2)
+                f.write(f"{test_data['prompt']}{TEST_SEPARATOR}{test_data['expected']}")
-            print(f"✅ Created test {cat}/test{test_num}.json")
+            print(f"✅ Created test {cat}/test{test_num}.txt")
 def main():
    """Main function of the script."""
@@ -211,14 +222,12 @@ def main():
    parser.add_argument(
        "--model",
        type=str,
-        required=True,
+        help="Name of the model used to generate tests (required when generating tests)"
        help="Name of the model used to generate tests (required parameter)"
    )
    parser.add_argument(
        "--ollama-url",
        type=str,
-        required=True,
+        help="URL of the Ollama server (required when generating tests)"
        help="URL of the Ollama server (required parameter)"
    )
    parser.add_argument(
        "--validate",
@@ -231,17 +240,28 @@ def main():
    if args.validate:
        print(f"🔍 Starting validation of tests in {args.validate}")
        validate_all_tests(args.validate)
-    else:
+        print("\n✨ Done!")
-        print(f"🤖 Connecting to the Ollama server: {args.ollama_url}")
+        sys.exit(0)
        print(f"📝 Generating {args.count} test(s) for category: {args.category}")
        print(f"🎯 Model in use: {args.model}")
-        try:
+    # Check the parameters required for test generation
-            ollama = OllamaClient(args.ollama_url)
+    if not args.model:
-            generate_tests(ollama, args.model, args.count, args.category, "tests")
+        print("❌ Error: the --model parameter is required when generating tests")
-        except Exception as e:
+        sys.exit(1)
-            print(f"❌ Error while generating tests: {e}")
+
-            sys.exit(1)
+    if not args.ollama_url:
        print("❌ Error: the --ollama-url parameter is required when generating tests")
        sys.exit(1)
    print(f"🤖 Connecting to the Ollama server: {args.ollama_url}")
    print(f"📝 Generating {args.count} test(s) for category: {args.category}")
    print(f"🎯 Model in use: {args.model}")
    try:
        ollama = OllamaClient(args.ollama_url)
        generate_tests(ollama, args.model, args.count, args.category, "tests")
    except Exception as e:
        print(f"❌ Error while generating tests: {e}")
        sys.exit(1)
    print("\n✨ Done!")
--- a/src/benchmarks/pycache/base.cpython-313.pyc
+++ b/src/benchmarks/pycache/base.cpython-313.pyc
--- a/src/benchmarks/pycache/codegen.cpython-313.pyc
+++ b/src/benchmarks/pycache/codegen.cpython-313.pyc
--- a/src/benchmarks/pycache/summarization.cpython-313.pyc
+++ b/src/benchmarks/pycache/summarization.cpython-313.pyc
--- a/src/benchmarks/pycache/translation.cpython-313.pyc
+++ b/src/benchmarks/pycache/translation.cpython-313.pyc
--- a/src/benchmarks/base.py
+++ b/src/benchmarks/base.py
@@ -6,6 +6,9 @@ from typing import Dict, Any, List
 from abc import ABC, abstractmethod
 from models.ollama_client import OllamaClient
 # Import constants
 from constants import TEST_SEPARATOR
 class Benchmark(ABC):
    """Base class for all benchmarks."""
--- a/src/benchmarks/codegen.py
+++ b/src/benchmarks/codegen.py
@@ -2,9 +2,9 @@ import logging
 import json
 import os
 from typing import Dict, Any, List
-from benchmarks.base import Benchmark
+from benchmarks.base import Benchmark, TEST_SEPARATOR
-class CodeGenBenchmark(Benchmark):
+class CodegenBenchmark(Benchmark):
    """Benchmark for testing code generation."""
    def __init__(self):
@@ -21,14 +21,17 @@ class CodeGenBenchmark(Benchmark):
        data_dir = "tests/codegen"
        for filename in os.listdir(data_dir):
-            if filename.endswith('.json'):
+            if filename.endswith('.txt'):
                with open(os.path.join(data_dir, filename), 'r', encoding='utf-8') as f:
-                    data = json.load(f)
+                    content = f.read()
-                    test_data.append({
+                    # Split on the separator
-                        'name': filename.replace('.json', ''),
+                    parts = content.split(TEST_SEPARATOR, 1)
-                        'prompt': data['prompt'],
+                    if len(parts) == 2:
-                        'expected': data['expected']
+                        test_data.append({
-                    })
+                            'name': filename.replace('.txt', ''),
                            'prompt': parts[0],
                            'expected': parts[1]
                        })
        return test_data
--- a/src/benchmarks/summarization.py
+++ b/src/benchmarks/summarization.py
@@ -2,7 +2,7 @@ import logging
 import json
 import os
 from typing import Dict, Any, List
-from benchmarks.base import Benchmark
+from benchmarks.base import Benchmark, TEST_SEPARATOR
 class SummarizationBenchmark(Benchmark):
    """Benchmark for testing summarization."""
@@ -21,14 +21,17 @@ class SummarizationBenchmark(Benchmark):
        data_dir = "tests/summarization"
        for filename in os.listdir(data_dir):
-            if filename.endswith('.json'):
+            if filename.endswith('.txt'):
                with open(os.path.join(data_dir, filename), 'r', encoding='utf-8') as f:
-                    data = json.load(f)
+                    content = f.read()
-                    test_data.append({
+                    # Split on the separator
-                        'name': filename.replace('.json', ''),
+                    parts = content.split(TEST_SEPARATOR, 1)
-                        'prompt': data['prompt'],
+                    if len(parts) == 2:
-                        'expected': data['expected']
+                        test_data.append({
-                    })
+                            'name': filename.replace('.txt', ''),
                            'prompt': parts[0],
                            'expected': parts[1]
                        })
        return test_data
--- a/src/benchmarks/translation.py
+++ b/src/benchmarks/translation.py
@@ -2,7 +2,7 @@ import logging
 import json
 import os
 from typing import Dict, Any, List
-from benchmarks.base import Benchmark
+from benchmarks.base import Benchmark, TEST_SEPARATOR
 class TranslationBenchmark(Benchmark):
    """Benchmark for testing translation."""
@@ -21,14 +21,17 @@ class TranslationBenchmark(Benchmark):
        data_dir = "tests/translation"
        for filename in os.listdir(data_dir):
-            if filename.endswith('.json'):
+            if filename.endswith('.txt'):
                with open(os.path.join(data_dir, filename), 'r', encoding='utf-8') as f:
-                    data = json.load(f)
+                    content = f.read()
-                    test_data.append({
+                    # Split on the separator
-                        'name': filename.replace('.json', ''),
+                    parts = content.split(TEST_SEPARATOR, 1)
-                        'prompt': data['prompt'],
+                    if len(parts) == 2:
-                        'expected': data['expected']
+                        test_data.append({
-                    })
+                            'name': filename.replace('.txt', ''),
                            'prompt': parts[0],
                            'expected': parts[1]
                        })
        return test_data
--- a/src/constants.py
+++ b/src/constants.py
@@ -0,0 +1,4 @@
 """Project constants."""
 # Separator constant for TXT test files
 TEST_SEPARATOR = "\n==============\n"
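For reference, a minimal sketch of how a test file in this format splits into its prompt and expected parts, mirroring the `content.split(TEST_SEPARATOR, 1)` call in the benchmark loaders. The helper name `parse_test_text` is hypothetical and is not part of this commit.

```python
from typing import Dict, Optional

# Same value as the constant defined in src/constants.py.
TEST_SEPARATOR = "\n==============\n"


def parse_test_text(name: str, content: str) -> Optional[Dict[str, str]]:
    """Hypothetical helper: split raw file text into prompt and expected parts."""
    # maxsplit=1 keeps any later separator-like lines inside the expected text.
    parts = content.split(TEST_SEPARATOR, 1)
    if len(parts) != 2:
        # No separator found; the loaders in this commit skip such files.
        return None
    return {"name": name, "prompt": parts[0], "expected": parts[1]}


sample = (
    "Translate the following Russian text to English: 'Как ваши дела?'"
    "\n==============\n"
    "How are you?"
)
case = parse_test_text("test2", sample)
```

Because the separator includes the newline on both sides, a file that ends immediately after the `==============` line with no trailing newline does not match and would be skipped, while one that ends with a newline yields an empty `expected` string.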
--- a/src/main.py
+++ b/src/main.py
@@ -4,7 +4,7 @@ from typing import List
 from models.ollama_client import OllamaClient
 from benchmarks.translation import TranslationBenchmark
 from benchmarks.summarization import SummarizationBenchmark
-from benchmarks.codegen import CodeGenBenchmark
+from benchmarks.codegen import CodegenBenchmark
 from utils.report import ReportGenerator
 def setup_logging(verbose: bool = False):
@@ -33,7 +33,7 @@ def run_benchmarks(ollama_client: OllamaClient, model_name: str, benchmarks: Lis
    benchmark_classes = {
        'translation': TranslationBenchmark,
        'summarization': SummarizationBenchmark,
-        'codegen': CodeGenBenchmark
+        'codegen': CodegenBenchmark
    }
    results = []
--- a/tests/codegen/test1.json
+++ b/tests/codegen/test1.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Write a Python function that calculates the factorial of a number using recursion.",
  "expected": "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n    else:\n        return n * factorial(n-1)"
 }
--- a/tests/codegen/test1.txt
+++ b/tests/codegen/test1.txt
@@ -0,0 +1,7 @@
 Write a Python function that calculates the factorial of a number using recursion.
 ==============
 def factorial(n):
    if n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n-1)
--- a/tests/codegen/test2.json
+++ b/tests/codegen/test2.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Write a Python function that reverses a string.",
  "expected": "def reverse_string(s):\n    return s[::-1]"
 }
--- a/tests/codegen/test2.txt
+++ b/tests/codegen/test2.txt
@@ -0,0 +1,4 @@
 Write a Python function that reverses a string.
 ==============
 def reverse_string(s):
    return s[::-1]
--- a/tests/codegen/test3.json
+++ b/tests/codegen/test3.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Here's a simple Python programming task:\n\n**Task:** Write a Python function that checks if a given string is a palindrome or not. A palindrome is a word, phrase, number, or other sequences of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization).\n\n**Function Signature:**\n```python\ndef is_palindrome(s: str) -> bool:\n    \"\"\"\n    Check if the given string `s` is a palindrome.\n\n    Args:\n        s (str): The input string to check.\n\n    Returns:\n        bool: True if `s` is a palindrome, False otherwise.\n    \"\"\"\n```\n\n**Example:**\n\n```python\nassert is_palindrome(\"racecar\") == True\nassert is_palindrome(\"hello\") == False\nassert is_palindrome(\"A man, a plan, a canal: Panama\") == True  # Ignoring spaces and punctuation\n```\n\n**Hint:** You can use the `str.lower()` method to convert the string to lowercase and the `re` module to remove non-alphanumeric characters.",
  "expected": "```python\nimport re\n\ndef is_palindrome(s: str) -> bool:\n    \"\"\"\n    Check if the given string `s` is a palindrome.\n\n    Args:\n        s (str): The input string to check.\n\n    Returns:\n        bool: True if `s` is a palindrome, False otherwise.\n    \"\"\"\n    cleaned = re.sub(r'\\W+', '', s.lower())\n    return cleaned == cleaned[::-1]\n```"
 }
--- a/tests/codegen/test3.txt
+++ b/tests/codegen/test3.txt
@@ -0,0 +1,44 @@
 Here's a simple Python programming task:
 **Task:** Write a Python function that checks if a given string is a palindrome or not. A palindrome is a word, phrase, number, or other sequences of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization).
 **Function Signature:**
 ```python
 def is_palindrome(s: str) -> bool:
    """
    Check if the given string `s` is a palindrome.
    Args:
        s (str): The input string to check.
    Returns:
        bool: True if `s` is a palindrome, False otherwise.
    """
 ```
 **Example:**
 ```python
 assert is_palindrome("racecar") == True
 assert is_palindrome("hello") == False
 assert is_palindrome("A man, a plan, a canal: Panama") == True  # Ignoring spaces and punctuation
 ```
 **Hint:** You can use the `str.lower()` method to convert the string to lowercase and the `re` module to remove non-alphanumeric characters.
 ==============
 ```python
 import re
 def is_palindrome(s: str) -> bool:
    """
    Check if the given string `s` is a palindrome.
    Args:
        s (str): The input string to check.
    Returns:
        bool: True if `s` is a palindrome, False otherwise.
    """
    cleaned = re.sub(r'\W+', '', s.lower())
    return cleaned == cleaned[::-1]
 ```
--- a/tests/summarization/test1.json
+++ b/tests/summarization/test1.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'",
  "expected": "A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks."
 }
--- a/tests/summarization/test1.txt
+++ b/tests/summarization/test1.txt
@@ -0,0 +1,3 @@
 Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
 ==============
 A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.
--- a/tests/summarization/test2.json
+++ b/tests/summarization/test2.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'",
  "expected": "A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks."
 }
--- a/tests/summarization/test2.txt
+++ b/tests/summarization/test2.txt
@@ -0,0 +1,3 @@
 Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
 ==============
 A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.
--- a/tests/summarization/test3.json
+++ b/tests/summarization/test3.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Summarize the following text in 1-2 sentences: 'In the realm of programming, machine learning algorithms enable computers to improve their performance on a specific task without being explicitly programmed for each step. These algorithms learn from data, allowing them to identify patterns and make predictions or decisions with increasing accuracy over time. For instance, deep learning models, which are part of artificial intelligence, use neural networks to process vast amounts of information, making significant advancements in areas such as image recognition and natural language processing. As technology advances, these capabilities are being integrated into various sectors, from healthcare to autonomous vehicles, transforming the way we interact with digital systems and enhancing our understanding of complex data sets.'",
  "expected": "Machine learning algorithms allow computers to improve their performance on specific tasks through data-driven pattern recognition, leading to advancements in areas like image recognition and natural language processing, and being increasingly integrated into sectors such as healthcare and autonomous vehicles."
 }
--- a/tests/summarization/test3.txt
+++ b/tests/summarization/test3.txt
@@ -0,0 +1,3 @@
 Summarize the following text in 1-2 sentences: 'In the realm of programming, machine learning algorithms enable computers to improve their performance on a specific task without being explicitly programmed for each step. These algorithms learn from data, allowing them to identify patterns and make predictions or decisions with increasing accuracy over time. For instance, deep learning models, which are part of artificial intelligence, use neural networks to process vast amounts of information, making significant advancements in areas such as image recognition and natural language processing. As technology advances, these capabilities are being integrated into various sectors, from healthcare to autonomous vehicles, transforming the way we interact with digital systems and enhancing our understanding of complex data sets.'
 ==============
 Machine learning algorithms allow computers to improve their performance on specific tasks through data-driven pattern recognition, leading to advancements in areas like image recognition and natural language processing, and being increasingly integrated into sectors such as healthcare and autonomous vehicles.
--- a/tests/summarization/test4.txt
+++ b/tests/summarization/test4.txt
@@ -0,0 +1,2 @@
 Summarize the following text in 1-2 sentences: '<img src="https://res.infoq.com/news/2025/09/linkedin-edge-recommendations/en/headerimage/generatedHeaderImage-1756360053031.jpg"/><p>LinkedIn has detailed its re-architected edge-building system, an evolution designed to support diverse inference workflows for delivering fresher and more personalized recommendations to members worldwide. The new architecture addresses growing demands for real-time scalability, cost efficiency, and flexibility across its global platform.</p> <i>By Leela Kumili</i>'
 ==============
--- a/tests/translation/test1.json
+++ b/tests/translation/test1.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Translate the following English text to Russian: 'Hello, how are you today?'",
  "expected": "Привет, как дела сегодня?"
 }
--- a/tests/translation/test1.txt
+++ b/tests/translation/test1.txt
@@ -0,0 +1,3 @@
 Translate the following English text to Russian: 'Hello, how are you today?'
 ==============
 Привет, как дела сегодня?
--- a/tests/translation/test2.json
+++ b/tests/translation/test2.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Translate the following Russian text to English: 'Как ваши дела?'",
  "expected": "How are you?"
 }
--- a/tests/translation/test2.txt
+++ b/tests/translation/test2.txt
@@ -0,0 +1,3 @@
 Translate the following Russian text to English: 'Как ваши дела?'
 ==============
 How are you?
--- a/tests/translation/test3.json
+++ b/tests/translation/test3.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Translate the following English text to Russian: 'What time is it right now?'",
  "expected": "Который сейчас час?"
 }
--- a/tests/translation/test3.txt
+++ b/tests/translation/test3.txt
@@ -0,0 +1,3 @@
 Translate the following English text to Russian: 'What time is it right now?'
 ==============
 Который сейчас час?
--- a/tests/translation/test4.json
+++ b/tests/translation/test4.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Translate the following English text to Russian: 'What time is it right now?'",
  "expected": "Который сейчас час?"
 }
--- a/tests/translation/test4.txt
+++ b/tests/translation/test4.txt
@@ -0,0 +1,3 @@
 Translate the following English text to Russian: 'What time is it right now?'
 ==============
 Который сейчас час?
--- a/tests/translation/test5.json
+++ b/tests/translation/test5.json
@@ -1,4 +0,0 @@
 {
  "prompt": "Translate the following English text to Russian: '\"The sun is shining brightly.\"'",
  "expected": "Солнце светит ярко."
 }
--- a/tests/translation/test5.txt
+++ b/tests/translation/test5.txt
@@ -0,0 +1,3 @@
 Translate the following English text to Russian: '"The sun is shining brightly."'
 ==============
 Солнце светит ярко.
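The JSON and TXT pairs in this diff follow a single mapping: the `prompt` value, the separator line, then the `expected` value. A rough sketch of that mapping is below. The function name `json_test_to_txt` is hypothetical, and the actual `scripts/convert_json_to_txt.py` is not shown in this section and may differ.

```python
import json

# Same value as the constant defined in src/constants.py.
TEST_SEPARATOR = "\n==============\n"


def json_test_to_txt(json_text: str) -> str:
    """Hypothetical helper: render one JSON test case in the TXT layout."""
    data = json.loads(json_text)
    # Escaped "\n" sequences in the JSON strings become real newlines here.
    return f"{data['prompt']}{TEST_SEPARATOR}{data['expected']}"


old = (
    '{"prompt": "Write a Python function that reverses a string.", '
    '"expected": "def reverse_string(s):\\n    return s[::-1]"}'
)
new = json_test_to_txt(old)
```

Splitting `new` on `TEST_SEPARATOR` with a maximum of one split recovers the original `prompt` and `expected` strings, provided the prompt itself does not contain the separator.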