feat: add MongoDB test generation and update dependencies

- Added pymongo==3.13.0 to requirements.txt for MongoDB connectivity
- Implemented generate_summarization_from_mongo.py script to generate summarization tests from MongoDB
- Updated run.sh to support 'gen-mongo' command for MongoDB test generation
- Enhanced scripts/README.md with documentation for new MongoDB functionality
- Improved help text in run.sh to clarify available commands and usage examples

This commit adds MongoDB integration for test generation and updates the documentation and scripts accordingly.
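
The source of `generate_summarization_from_mongo.py` is not shown on this page, so the following is only a minimal sketch of what such a generator could look like, not the script's actual code. It assumes the prompt template and the `==============` separator visible in the summarization tests of this commit. The connection URI, database, collection, and field names passed to `fetch_texts` are hypothetical, as are both function names.

```python
from pathlib import Path
from typing import Iterable, Iterator, List

SEPARATOR = "=============="  # delimiter line used by the .txt tests in this commit
PROMPT_TEMPLATE = "Summarize the following text in 1-2 sentences: '{text}'"


def fetch_texts(uri: str, db_name: str, coll_name: str, field: str,
                limit: int = 10) -> Iterator[str]:
    """Yield `field` from up to `limit` documents (all names are hypothetical)."""
    # Imported lazily so the file-writing part runs without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        cursor = (
            client[db_name][coll_name]
            .find({field: {"$exists": True}}, {field: 1})
            .limit(limit)
        )
        for doc in cursor:
            yield doc[field]
    finally:
        client.close()


def write_summarization_tests(texts: Iterable[str], out_dir: Path,
                              start: int = 1) -> List[Path]:
    """Write one test file per text, leaving the expected section empty."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for offset, text in enumerate(texts):
        path = out_dir / f"test{start + offset}.txt"
        prompt = PROMPT_TEMPLATE.format(text=text)
        path.write_text(f"{prompt}\n{SEPARATOR}\n", encoding="utf-8")
        written.append(path)
    return written
```

A call might look like `write_summarization_tests(fetch_texts("mongodb://localhost:27017", "news", "articles", "description"), Path("tests/summarization"), start=4)`, where every argument is a placeholder rather than a value taken from this repository.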
2026-01-22 20:11:52 +03:00
parent f117c7b23c
commit 8ef3a16e3a
41 changed files with 728 additions and 164 deletions

@@ -1,4 +0,0 @@
{
"prompt": "Write a Python function that calculates the factorial of a number using recursion.",
"expected": "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n    else:\n        return n * factorial(n-1)"
}

tests/codegen/test1.txt Normal file

@@ -0,0 +1,7 @@
Write a Python function that calculates the factorial of a number using recursion.
==============
def factorial(n):
    if n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n-1)
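
The hunks on this page replace each JSON test (`prompt` and `expected` keys) with a plain-text file: the prompt, a `==============` line, then the expected output. A conversion of that kind could be scripted roughly as below. This is a sketch under assumptions: the diff does not show the old file names, so the `.json` suffix and the `json_test_to_txt` helper are hypothetical.

```python
import json
from pathlib import Path

SEPARATOR = "=============="  # delimiter line seen in the new .txt tests


def json_test_to_txt(json_path: Path) -> Path:
    """Rewrite {"prompt": ..., "expected": ...} as prompt / separator / expected."""
    data = json.loads(json_path.read_text(encoding="utf-8"))
    txt_path = json_path.with_suffix(".txt")
    txt_path.write_text(
        f"{data['prompt']}\n{SEPARATOR}\n{data['expected']}\n",
        encoding="utf-8",
    )
    return txt_path
```

Because `json.loads` decodes the `\n` escapes, multi-line expected code lands in the text file with its real line breaks and indentation, which is what makes the new layout easier to read and edit by hand.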

@@ -1,4 +0,0 @@
{
"prompt": "Write a Python function that reverses a string.",
"expected": "def reverse_string(s):\n    return s[::-1]"
}

tests/codegen/test2.txt Normal file

@@ -0,0 +1,4 @@
Write a Python function that reverses a string.
==============
def reverse_string(s):
    return s[::-1]

@@ -1,4 +0,0 @@
{
"prompt": "Here's a simple Python programming task:\n\n**Task:** Write a Python function that checks if a given string is a palindrome or not. A palindrome is a word, phrase, number, or other sequences of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization).\n\n**Function Signature:**\n```python\ndef is_palindrome(s: str) -> bool:\n    \"\"\"\n    Check if the given string `s` is a palindrome.\n\n    Args:\n        s (str): The input string to check.\n\n    Returns:\n        bool: True if `s` is a palindrome, False otherwise.\n    \"\"\"\n```\n\n**Example:**\n\n```python\nassert is_palindrome(\"racecar\") == True\nassert is_palindrome(\"hello\") == False\nassert is_palindrome(\"A man, a plan, a canal: Panama\") == True # Ignoring spaces and punctuation\n```\n\n**Hint:** You can use the `str.lower()` method to convert the string to lowercase and the `re` module to remove non-alphanumeric characters.",
"expected": "```python\nimport re\n\ndef is_palindrome(s: str) -> bool:\n    \"\"\"\n    Check if the given string `s` is a palindrome.\n\n    Args:\n        s (str): The input string to check.\n\n    Returns:\n        bool: True if `s` is a palindrome, False otherwise.\n    \"\"\"\n    cleaned = re.sub(r'\\W+', '', s.lower())\n    return cleaned == cleaned[::-1]\n```"
}

tests/codegen/test3.txt Normal file

@@ -0,0 +1,44 @@
Here's a simple Python programming task:

**Task:** Write a Python function that checks if a given string is a palindrome or not. A palindrome is a word, phrase, number, or other sequences of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization).

**Function Signature:**
```python
def is_palindrome(s: str) -> bool:
    """
    Check if the given string `s` is a palindrome.

    Args:
        s (str): The input string to check.

    Returns:
        bool: True if `s` is a palindrome, False otherwise.
    """
```

**Example:**

```python
assert is_palindrome("racecar") == True
assert is_palindrome("hello") == False
assert is_palindrome("A man, a plan, a canal: Panama") == True # Ignoring spaces and punctuation
```

**Hint:** You can use the `str.lower()` method to convert the string to lowercase and the `re` module to remove non-alphanumeric characters.
==============
```python
import re

def is_palindrome(s: str) -> bool:
    """
    Check if the given string `s` is a palindrome.

    Args:
        s (str): The input string to check.

    Returns:
        bool: True if `s` is a palindrome, False otherwise.
    """
    cleaned = re.sub(r'\W+', '', s.lower())
    return cleaned == cleaned[::-1]
```

@@ -1,4 +0,0 @@
{
"prompt": "Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'",
"expected": "A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks."
}

@@ -0,0 +1,3 @@
Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
==============
A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.

@@ -1,4 +0,0 @@
{
"prompt": "Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'",
"expected": "A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks."
}

@@ -0,0 +1,3 @@
Summarize the following text in 1-2 sentences: 'The quick brown fox jumps over the lazy dog. The dog, surprised by the fox's agility, barks loudly. The fox continues running without looking back.'
==============
A quick fox jumps over a lazy dog, surprising it. The fox keeps running while the dog barks.

@@ -1,4 +0,0 @@
{
"prompt": "Summarize the following text in 1-2 sentences: 'In the realm of programming, machine learning algorithms enable computers to improve their performance on a specific task without being explicitly programmed for each step. These algorithms learn from data, allowing them to identify patterns and make predictions or decisions with increasing accuracy over time. For instance, deep learning models, which are part of artificial intelligence, use neural networks to process vast amounts of information, making significant advancements in areas such as image recognition and natural language processing. As technology advances, these capabilities are being integrated into various sectors, from healthcare to autonomous vehicles, transforming the way we interact with digital systems and enhancing our understanding of complex data sets.'",
"expected": "Machine learning algorithms allow computers to improve their performance on specific tasks through data-driven pattern recognition, leading to advancements in areas like image recognition and natural language processing, and being increasingly integrated into sectors such as healthcare and autonomous vehicles."
}

@@ -0,0 +1,3 @@
Summarize the following text in 1-2 sentences: 'In the realm of programming, machine learning algorithms enable computers to improve their performance on a specific task without being explicitly programmed for each step. These algorithms learn from data, allowing them to identify patterns and make predictions or decisions with increasing accuracy over time. For instance, deep learning models, which are part of artificial intelligence, use neural networks to process vast amounts of information, making significant advancements in areas such as image recognition and natural language processing. As technology advances, these capabilities are being integrated into various sectors, from healthcare to autonomous vehicles, transforming the way we interact with digital systems and enhancing our understanding of complex data sets.'
==============
Machine learning algorithms allow computers to improve their performance on specific tasks through data-driven pattern recognition, leading to advancements in areas like image recognition and natural language processing, and being increasingly integrated into sectors such as healthcare and autonomous vehicles.

@@ -0,0 +1,2 @@
Summarize the following text in 1-2 sentences: '<img src="https://res.infoq.com/news/2025/09/linkedin-edge-recommendations/en/headerimage/generatedHeaderImage-1756360053031.jpg"/><p>LinkedIn has detailed its re-architected edge-building system, an evolution designed to support diverse inference workflows for delivering fresher and more personalized recommendations to members worldwide. The new architecture addresses growing demands for real-time scalability, cost efficiency, and flexibility across its global platform.</p> <i>By Leela Kumili</i>'
==============
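
This newly added file stops at the separator, so it has a prompt but no expected output yet. Presumably the summary is meant to be filled in later, though the diff does not say so. A loader for this layout would need to tolerate that case. The sketch below is one possible parser, with `ParsedTest` and `parse_test` as hypothetical names; it splits on the first line that consists solely of the separator.

```python
from typing import NamedTuple

SEPARATOR = "=============="  # delimiter line seen in the .txt tests


class ParsedTest(NamedTuple):
    prompt: str
    expected: str  # empty string when nothing follows the separator


def parse_test(text: str) -> ParsedTest:
    """Split a test file's text into prompt and expected output."""
    lines = text.splitlines()
    try:
        idx = lines.index(SEPARATOR)
    except ValueError:
        raise ValueError("separator line not found") from None
    # Strip only surrounding newlines so code indentation is preserved.
    prompt = "\n".join(lines[:idx]).strip("\n")
    expected = "\n".join(lines[idx + 1:]).strip("\n")
    return ParsedTest(prompt, expected)
```

A test runner could then skip or flag cases where `expected` is empty instead of comparing model output against an empty string.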

@@ -1,4 +0,0 @@
{
"prompt": "Translate the following English text to Russian: 'Hello, how are you today?'",
"expected": "Привет, как дела сегодня?"
}

@@ -0,0 +1,3 @@
Translate the following English text to Russian: 'Hello, how are you today?'
==============
Привет, как дела сегодня?

@@ -1,4 +0,0 @@
{
"prompt": "Translate the following Russian text to English: 'Как ваши дела?'",
"expected": "How are you?"
}

@@ -0,0 +1,3 @@
Translate the following Russian text to English: 'Как ваши дела?'
==============
How are you?

@@ -1,4 +0,0 @@
{
"prompt": "Translate the following English text to Russian: 'What time is it right now?'",
"expected": "Который сейчас час?"
}

@@ -0,0 +1,3 @@
Translate the following English text to Russian: 'What time is it right now?'
==============
Который сейчас час?

@@ -1,4 +0,0 @@
{
"prompt": "Translate the following English text to Russian: 'What time is it right now?'",
"expected": "Который сейчас час?"
}

@@ -0,0 +1,3 @@
Translate the following English text to Russian: 'What time is it right now?'
==============
Который сейчас час?

@@ -1,4 +0,0 @@
{
"prompt": "Translate the following English text to Russian: '\"The sun is shining brightly.\"'",
"expected": "Солнце светит ярко."
}

@@ -0,0 +1,3 @@
Translate the following English text to Russian: '"The sun is shining brightly."'
==============
Солнце светит ярко.