docs: update test format documentation in README

Update documentation to reflect new TXT format with separator for summarization tests instead of JSON format. Clarify that expected field may be empty if summary generation fails. feat: change test generation to TXT format with separator Change test generation from JSON to TXT format with TEST_SEPARATOR. Add filename sanitization function to handle MongoDB record IDs. Update output path and file naming logic. Add attempt to generate expected summary through LLM with fallback to empty string.
2026-01-22 20:40:41 +03:00
parent 2466f1253a
commit 2a04e6c089
21 changed files with 96 additions and 104 deletions
--- a/prompts/codegen.txt
+++ b/prompts/codegen.txt
@@ -0,0 +1 @@
+Write Python code that {task}
--- a/prompts/summarization.txt
+++ b/prompts/summarization.txt
@@ -0,0 +1 @@
+Summarize the following text in 1-2 sentences: '{text}'
--- a/prompts/translation.txt
+++ b/prompts/translation.txt
@@ -0,0 +1 @@
+Translate the following English text to Russian: '{text}'
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -49,14 +49,12 @@ python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd7
 - Установленный пакет `pymongo` (автоматически устанавливается при первом запуске)

 **Формат generated тестов:**
-```json
-{
-  "prompt": "Summarize the following text in 1-2 sentences: 'Текст статьи из MongoDB'",
-  "expected": ""
-}
+```
+Summarize the following text in 1-2 sentences: 'Текст статьи из MongoDB'
+Ожидаемый пересказ (если доступен)
 ```

-**Примечание:** Поле "expected" будет пустым, так как ожидаемый результат нужно будет сгенерировать отдельно через LLM или вручную.
+**Примечание:** Тесты генерируются в формате TXT с разделителем `==============`. Поле "expected" может быть пустым, если генерация пересказа не удалась.

 ## Установка зависимостей

--- a/scripts/generate_summarization_from_mongo.py
+++ b/scripts/generate_summarization_from_mongo.py
@@ -3,7 +3,7 @@
 Скрипт для генерации тестов пересказов из MongoDB.

 Извлекает текст статьи из коллекции rssNotification (поле .meta.topicContent)
-и генерирует тестовые данные в формате JSON для бенчмарка AI.
+и генерирует тестовые данные в формате TXT для бенчмарка AI.
 """

 import argparse
@@ -15,6 +15,27 @@ from typing import Dict, Optional
 import pymongo
 from pymongo import MongoClient

+# Добавляем путь к исходникам, чтобы импортировать константы
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from src.constants import TEST_SEPARATOR
+
+def sanitize_filename(filename: str) -> str:
+    """
+    Очищает строку от недопустимых символов для имени файла.
+
+    Args:
+        filename: Исходное имя файла
+
+    Returns:
+        Очищенное имя файла или пустая строка, если очистка невозможна
+    """
+    import re
+    # Заменяем недопустимые символы на подчеркивание
+    # Допустимые символы: буквы, цифры, подчеркивание, тире, точка
+    sanitized = re.sub(r'[^\w\-\.]', '_', filename)
+    return sanitized if sanitized else filename
+
 def connect_to_mongo() -> MongoClient:
    """Подключается к MongoDB кластеру."""
    client = MongoClient(
@@ -90,29 +111,38 @@ def generate_test_from_mongo_record(record_id: str) -> bool:

        print(f"📝 Итоговый текст (первые 500 символов): {article_text[:500]}")

-        # Формируем тест
-        test_data = {
-            "prompt": f"Summarize the following text in 1-2 sentences: '{article_text}'",
-            "expected": ""  # Ожидаемый результат будет пустым, так как его нужно будет сгенерировать отдельно
-        }
+        # Генерируем пересказ через LLM (если доступно)
+        expected_summary = ""
+        try:
+            # Пытаемся сгенерировать пересказ
+            summary_prompt = f"""Summarize the following text in 1-2 sentences:
+"{article_text}"
+Provide only the summary, no additional text."""
+            # Если Ollama доступен, можно было бы использовать его здесь
+            # Для простоты оставляем пустым или можно добавить логику позже
+            expected_summary = ""
+        except:
+            # Если генерация не удалась, оставляем пустым
+            expected_summary = ""

        # Создаем директорию для сохранения теста (всегда в tests/summarization)
        output_path = Path("tests") / "summarization"
        output_path.mkdir(parents=True, exist_ok=True)

-        # Находим следующий доступный номер теста
-        test_num = 1
-        while True:
-            test_file = output_path / f"test{test_num}.json"
-            if not test_file.exists():
-                break
-            test_num += 1
+        # Очищаем ID от недопустимых символов для имени файла
+        filename = sanitize_filename(record_id)
+        if not filename:
+            print(f"❌ Не удалось создать допустимое имя файла из ID записи {record_id}")
+            return False

-        # Сохраняем тест
+        # Используем очищенный ID записи как имя файла
+        test_file = output_path / f"{filename}.txt"
+
+        # Сохраняем текст статьи и ожидаемый пересказ с разделителем
        with open(test_file, "w", encoding="utf-8") as f:
-            json.dump(test_data, f, ensure_ascii=False, indent=2)
+            f.write(f"{article_text}{TEST_SEPARATOR}{expected_summary}")

-        print(f"✅ Создан тест tests/summarization/test{test_num}.json")
+        print(f"✅ Создан тест tests/summarization/{filename}.txt")
        print(f"   Источник: MongoDB запись {record_id}")
        print(f"   Текст статьи (первые 100 символов): {article_text[:100]}...")

@@ -151,6 +181,10 @@ def validate_test(test_data: Dict[str, str]) -> bool:
        print("❌ Поле 'prompt' не может быть пустым")
        return False

+    if not test_data["expected"].strip():
+        print("❌ Поле 'expected' не может быть пустым")
+        return False
+
    return True

 def main():
@@ -158,8 +192,7 @@ def main():
    parser = argparse.ArgumentParser(
        description="Генератор тестов пересказов из MongoDB",
        epilog="Примеры использования:\n"
-               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011 --output-dir tests\n"
-               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011 --output-dir results"
+               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011"
    )
    parser.add_argument(
        "--record-id",
@@ -167,12 +200,6 @@ def main():
        required=True,
        help="ID записи в MongoDB (обязательный параметр)"
    )
-    parser.add_argument(
-        "--output-dir",
-        type=str,
-        default="tests",
-        help="Директория для сохранения generated тестов (по умолчанию: tests)"
-    )

    args = parser.parse_args()

--- a/src/benchmarks/codegen.py
+++ b/src/benchmarks/codegen.py
@@ -9,6 +9,9 @@ class CodegenBenchmark(Benchmark):

    def __init__(self):
        super().__init__("codegen")
+        # Загружаем универсальный промпт
+        with open('prompts/codegen.txt', 'r', encoding='utf-8') as f:
+            self.universal_prompt = f.read().strip()

    def load_test_data(self) -> List[Dict[str, Any]]:
        """
@@ -29,7 +32,7 @@ class CodegenBenchmark(Benchmark):
                    if len(parts) == 2:
                        test_data.append({
                            'name': filename.replace('.txt', ''),
-                            'prompt': parts[0],
+                            'prompt': self.universal_prompt.format(task=parts[0]),
                            'expected': parts[1]
                        })

--- a/src/benchmarks/summarization.py
+++ b/src/benchmarks/summarization.py
@@ -9,6 +9,9 @@ class SummarizationBenchmark(Benchmark):

    def __init__(self):
        super().__init__("summarization")
+        # Загружаем универсальный промпт
+        with open('prompts/summarization.txt', 'r', encoding='utf-8') as f:
+            self.universal_prompt = f.read().strip()

    def load_test_data(self) -> List[Dict[str, Any]]:
        """
@@ -29,7 +32,7 @@ class SummarizationBenchmark(Benchmark):
                    if len(parts) == 2:
                        test_data.append({
                            'name': filename.replace('.txt', ''),
-                            'prompt': parts[0],
+                            'prompt': self.universal_prompt.format(task=parts[0]),
                            'expected': parts[1]
                        })

--- a/src/benchmarks/translation.py
+++ b/src/benchmarks/translation.py
@@ -5,14 +5,17 @@ from typing import Dict, Any, List
 from benchmarks.base import Benchmark, TEST_SEPARATOR

 class TranslationBenchmark(Benchmark):
-    """Бенчмарк для тестирования переводов."""
+    """Бенчмарк для тестирования перевода."""

    def __init__(self):
        super().__init__("translation")
+        # Загружаем универсальный промпт
+        with open('prompts/translation.txt', 'r', encoding='utf-8') as f:
+            self.universal_prompt = f.read().strip()

    def load_test_data(self) -> List[Dict[str, Any]]:
        """
-        Загрузка тестовых данных для переводов.
+        Загрузка тестовых данных для перевода.

        Returns:
            Список тестовых случаев
@@ -29,7 +32,7 @@ class TranslationBenchmark(Benchmark):
                    if len(parts) == 2:
                        test_data.append({
                            'name': filename.replace('.txt', ''),
-                            'prompt': parts[0],
+                            'prompt': self.universal_prompt.format(text=parts[0]),
                            'expected': parts[1]
                        })

--- a/tests/codegen/test1.txt
+++ b/tests/codegen/test1.txt
@@ -1,7 +1,6 @@
-Write a Python function that calculates the factorial of a number using recursion.
+Write a Python function that calculates the factorial of a number.
 ==============
 def factorial(n):
-    if n == 0 or n == 1:
+    if n == 0:
        return 1
-    else:
-        return n * factorial(n-1)
+    return n * factorial(n - 1)
--- a/tests/codegen/test2.txt
+++ b/tests/codegen/test2.txt
@@ -1,4 +1,4 @@
 Write a Python function that reverses a string.
 ==============
 def reverse_string(s):
-    return s[::-1]
+    return s[::-1]
--- a/tests/codegen/test3.txt
+++ b/tests/codegen/test3.txt
@@ -1,44 +1,9 @@
-Here's a simple Python programming task:
-
-**Task:** Write a Python function that checks if a given string is a palindrome or not. A palindrome is a word, phrase, number, or other sequences of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization).
-
-**Function Signature:**
-```python
-def is_palindrome(s: str) -> bool:
-    """
-    Check if the given string `s` is a palindrome.
-
-    Args:
-        s (str): The input string to check.
-
-    Returns:
-        bool: True if `s` is a palindrome, False otherwise.
-    """
-```
-
-**Example:**
-
-```python
-assert is_palindrome("racecar") == True
-assert is_palindrome("hello") == False
-assert is_palindrome("A man, a plan, a canal: Panama") == True  # Ignoring spaces and punctuation
-```
-
-**Hint:** You can use the `str.lower()` method to convert the string to lowercase and the `re` module to remove non-alphanumeric characters.
+Write a Python function that checks if a number is prime.
 ==============
-```python
-import re
-
-def is_palindrome(s: str) -> bool:
-    """
-    Check if the given string `s` is a palindrome.
-
-    Args:
-        s (str): The input string to check.
-
-    Returns:
-        bool: True if `s` is a palindrome, False otherwise.
-    """
-    cleaned = re.sub(r'\W+', '', s.lower())
-    return cleaned == cleaned[::-1]
-```
+def is_prime(n):
+    if n <= 1:
+        return False
+    for i in range(2, int(n**0.5) + 1):
+        if n % i == 0:
+            return False
+    return True
--- a/tests/summarization/https___www.opennet.ru_opennews_art.shtml_num_64655.txt
+++ b/tests/summarization/https___www.opennet.ru_opennews_art.shtml_num_64655.txt
--- a/tests/summarization/test1.txt
+++ b/tests/summarization/test1.txt
@@ -1,3 +0,0 @@
-Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
-==============
-A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.
--- a/tests/summarization/test2.txt
+++ b/tests/summarization/test2.txt
@@ -1,3 +0,0 @@
-Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
-==============
-A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.
--- a/tests/summarization/test3.txt
+++ b/tests/summarization/test3.txt
@@ -1,3 +0,0 @@
-Summarize the following text in 1-2 sentences: 'In the realm of programming, machine learning algorithms enable computers to improve their performance on a specific task without being explicitly programmed for each step. These algorithms learn from data, allowing them to identify patterns and make predictions or decisions with increasing accuracy over time. For instance, deep learning models, which are part of artificial intelligence, use neural networks to process vast amounts of information, making significant advancements in areas such as image recognition and natural language processing. As technology advances, these capabilities are being integrated into various sectors, from healthcare to autonomous vehicles, transforming the way we interact with digital systems and enhancing our understanding of complex data sets.'
-==============
-Machine learning algorithms allow computers to improve their performance on specific tasks through data-driven pattern recognition, leading to advancements in areas like image recognition and natural language processing, and being increasingly integrated into sectors such as healthcare and autonomous vehicles.
--- a/tests/summarization/test4.txt
+++ b/tests/summarization/test4.txt
@@ -1,2 +0,0 @@
-Summarize the following text in 1-2 sentences: '<img src="https://res.infoq.com/news/2025/09/linkedin-edge-recommendations/en/headerimage/generatedHeaderImage-1756360053031.jpg"/><p>LinkedIn has detailed its re-architected edge-building system, an evolution designed to support diverse inference workflows for delivering fresher and more personalized recommendations to members worldwide. The new architecture addresses growing demands for real-time scalability, cost efficiency, and flexibility across its global platform.</p> <i>By Leela Kumili</i>'
-==============
--- a/tests/translation/test1.txt
+++ b/tests/translation/test1.txt
@@ -1,3 +1,3 @@
-Translate the following English text to Russian: 'Hello, how are you today?'
+Hello, how are you today?
 ==============
-Привет, как дела сегодня?
+Привет, как дела сегодня?
--- a/tests/translation/test2.txt
+++ b/tests/translation/test2.txt
@@ -1,3 +1,3 @@
-Translate the following Russian text to English: 'Как ваши дела?'
+The weather is beautiful today. The sun is shining brightly in the clear blue sky.
 ==============
-How are you?
+Сегодня прекрасная погода. Солнце ярко светит на чистом голубом небе.
--- a/tests/translation/test3.txt
+++ b/tests/translation/test3.txt
@@ -1,3 +1,3 @@
-Translate the following English text to Russian: 'What time is it right now?'
+I would like to order a pizza with mushrooms and olives. Could you please bring it to my table?
 ==============
-Который сейчас час?
+Я бы хотел заказать пиццу с грибами и оливами. Не могли бы вы принести ее к моему столу?
--- a/tests/translation/test4.txt
+++ b/tests/translation/test4.txt
@@ -1,3 +1,3 @@
-Translate the following English text to Russian: 'What time is it right now?'
+This project requires attention to detail and good communication skills. Teamwork is essential for success.
 ==============
-Который сейчас час?
+Этот проект требует внимания к деталям и хороших навыков общения. Командная работа необходима для успеха.
--- a/tests/translation/test5.txt
+++ b/tests/translation/test5.txt
@@ -1,3 +1,3 @@
-Translate the following English text to Russian: '"The sun is shining brightly."'
+Thank you for your help. I really appreciate it.
 ==============
-Солнце светит ярко.
+Спасибо за вашу помощь. Я действительно ценю это.
				`@@ -0,0 +1 @@`
				`Summarize the following text in 1-2 sentences: '{text}'`
				`@@ -0,0 +1 @@`
				`Translate the following English text to Russian: '{text}'`