feat: enhance summarization prompt and improve MongoDB test generation

- Updated summarization prompt to require Russian output and exclude non-textual elements
- Raised the minimum ollama dependency version to 0.6.1
- Enhanced run.sh script to support both single record and file-based ID input for MongoDB test generation
- Updated documentation in scripts/README.md to reflect new functionality
- Added verbose flag to generate_summarization_from_mongo.py for better debugging
2026-01-23 03:49:22 +03:00
parent d8785ada8a
commit 2048e4e40d
234 changed files with 3268 additions and 72 deletions
--- a/ids.txt
+++ b/ids.txt
@@ -0,0 +1,220 @@
+https://habr.com/ru/articles/987826/
+https://habr.com/ru/articles/985300/
+https://habr.com/ru/articles/987848/
+https://habr.com/ru/articles/987854/
+https://habr.com/ru/companies/croc/articles/987856/
+https://habr.com/ru/articles/987862/
+https://habr.com/ru/companies/perco/articles/987866/
+https://habr.com/ru/companies/habr_career/articles/987870/
+https://habr.com/ru/articles/987876/
+https://habr.com/ru/articles/987882/
+https://t-cadet.github.io/programming-wisdom/#2026-01-17-gathering-linux-syscall-numbers
+https://www.valueaddedresource.net/ebay-bans-ai-agents-updates-arbitration-user-agreement-feb-2026/
+https://www.jamf.com/blog/threat-actors-expand-abuse-of-visual-studio-code/
+https://www.projectlumen.app/
+https://www.ycombinator.com/companies/flowtel/jobs/LaddaEz-founding-engineer-staff-senior
+https://www.media.mit.edu/publications/your-brain-on-chatgpt/
+https://lambdaland.org/posts/2026-01-21_tree-sitter_vs_lsp/
+https://reactos.org/blogs/30yrs-of-ros/
+https://www.jsoftware.com/papers/perlis77.htm
+https://www.pbs.org/newshour/health/brazilian-city-uses-tilapia-fish-skin-treat-burn-victims
+https://venturebeat.com/security/salesforce-research-across-the-c-suite-trust-is-the-key-to-scaling-agentic
+https://venturebeat.com/orchestration/what-servicenow-and-openai-signal-for-enterprises-as-ai-moves-from-advice-to
+https://venturebeat.com/data/cfos-are-now-getting-their-own-vibe-coding-moment-thanks-to-datarails
+https://venturebeat.com/infrastructure/why-linkedin-says-prompting-was-a-non-starter-and-small-models-was-the
+https://venturebeat.com/infrastructure/claude-code-costs-up-to-usd200-a-month-goose-does-the-same-thing-for-free
+https://venturebeat.com/data/x-open-sources-its-algorithm-5-ways-businesses-can-benefit
+https://venturebeat.com/orchestration/mits-new-recursive-framework-lets-llms-process-10-million-tokens-without
+https://venturebeat.com/infrastructure/truefoundry-launches-truefailover-to-automatically-reroute-enterprise-ai
+https://venturebeat.com/infrastructure/stop-calling-it-the-ai-bubble-its-actually-multiple-bubbles-each-with-a
+https://venturebeat.com/orchestration/why-reinforcement-learning-plateaus-without-representation-depth-and-other
+https://nuancesprog.ru/p/31409/
+https://nuancesprog.ru/p/30941/
+https://nuancesprog.ru/p/30511/
+https://nuancesprog.ru/p/30448/
+https://nuancesprog.ru/p/30828/
+https://nuancesprog.ru/p/31317/
+https://nuancesprog.ru/p/30453/
+https://nuancesprog.ru/p/31340/
+https://nuancesprog.ru/p/30546/
+https://nuancesprog.ru/p/30526/
+https://vc.ru/services/2699471-biklain-youtube-bez-vpn
+https://vc.ru/marketplace/2700808-wildberries-rasshiril-testirovanie-neyropereskazov
+https://vc.ru/money/2700754-prodazhi-noutbukov-v-rossii-sokratilis-na-15-30-protsentov
+https://vc.ru/money/2700907-sem-altman-openai-investitsii-50-mlrd-abudabi
+https://vc.ru/hr/2701502-profsoyuz-hyundai-yuzhnaya-koreya-protiv-robotov
+https://vc.ru/tech/2701445-sony-novye-naushniki-linkbuds-clip
+https://vc.ru/tech/2701146-sozdanie-obedinyonnoy-mikroelektronnoy-kompanii
+https://vc.ru/services/2701052-prilozhenie-nonusa-boykot-amerikanskih-tovarov
+https://vc.ru/tech/2699508-tesla-vozobnovlyaet-proekt-dojo3
+https://vc.ru/offline/2699877-obvineniya-muzhchine-v-ssha-za-moshennichestvo-s-poddelnym-udostovereniem-pilota
+https://www.bitdegree.org/crypto/news/what-ethereum-does-that-bitcoin-cant
+https://www.bitdegree.org/crypto/news/bitdegree-rolls-out-latest-mission-exploring-global-money-transfers-in-brazil
+https://www.bitdegree.org/crypto/news/bhutan-to-launch-sei-network-validator-in-q1-blockchain-push
+https://www.bitdegree.org/crypto/news/solana-mobile-rewards-seeker-owners-with-26-million-skr-airdrop
+https://www.bitdegree.org/crypto/news/aave-hands-lens-control-to-mask-network-refocuses-on-defi
+https://www.bitdegree.org/crypto/news/sec-reviews-fresh-calls-for-defi-and-self-custody-clarity
+https://www.bitdegree.org/crypto/news/slowmist-uncovers-snap-store-exploit-targeting-crypto-users
+https://www.bitdegree.org/crypto/news/iran-buys-507-million-in-tether-to-defend-rial-says-elliptic
+https://www.bitdegree.org/crypto/news/ethereums-vitalik-buterin-shifts-entire-online-activity-to-firefly-in-2026
+https://www.bitdegree.org/crypto/news/circle-teams-up-with-un-to-modernize-38-billion-in-global-aid-transfers
+https://ip-calculator.ru/blog/ask/zachem-programmistu-razbiratsya-v-zheleze/
+https://ip-calculator.ru/blog/ask/kak-ajtishniku-vybrat-nejroset/
+https://ip-calculator.ru/blog/ask/kak-it-biznesu-zashhitit-svoyu-informatsionnuyu-infrastrukturu/
+https://ip-calculator.ru/blog/ask/kak-rossijskim-ajti-predprinimatelyam-zapuskat-biznes-v-azii-bez-problem-s-zaderzhkoj-nadyozhnostyu-i-bezopasnostyu/
+https://ip-calculator.ru/blog/ask/kak-it-kompanii-zashhitit-svoi-dannye/
+https://ip-calculator.ru/blog/ask/kak-sozdat-internet-magazin-ponyatnoe-rukovodstvo-dlya-predprinimatelej/
+https://ip-calculator.ru/blog/ask/integratsiya-1s-s-it-sistemami-prakticheskoe-rukovodstvo/
+https://ip-calculator.ru/blog/ask/nejroseti-v-seo-i-marketinge-telegram-kak-novyj-poiskovik/
+https://ip-calculator.ru/blog/ask/vizualnyj-dizajn-i-sotsialnaya-aktivnost-kak-pikseli-i-oformlenie-vliyayut-na-lajki-i-podpischikov/
+https://ip-calculator.ru/blog/ask/how-to-enable-use-or-remove-the-developer-tab-in-microsoft-excel/
+https://www.infoq.com/presentations/autonomous-driving-data/
+https://www.infoq.com/news/2026/01/aws-european-sovereign-cloud/
+https://www.infoq.com/news/2026/01/data-sovereignty-trust-framework/
+https://www.infoq.com/news/2026/01/meta-pai-genai-data-flows/
+https://www.infoq.com/news/2026/01/claude-cowork/
+https://www.infoq.com/news/2026/01/cyberark-agents-defenses/
+https://www.infoq.com/articles/ai-assisted-development-series/
+https://www.infoq.com/news/2026/01/google-agentic-commerce-ucp/
+https://www.infoq.com/news/2026/01/prisma-7-performance/
+https://www.infoq.com/news/2026/01/salesforce-eks-karpenter/
+https://kod.ru/reestr-kuriery
+https://kod.ru/open-letter-to-pavel-durov
+https://kod.ru/mincifri-analog-call-of-duty
+https://kod.ru/blue-origin-sputnikovaya-svayz
+https://kod.ru/huawei-matepad-11-5-s-2026-predzakaz
+https://kod.ru/obzor-huawei-matepad-11-5-s-2026
+https://kod.ru/gosduma-shtrafy-vpn
+https://kod.ru/caviar-aladdin
+https://kod.ru/yandex-b2btech-postgresql
+https://kod.ru/telegram-op-ogranicheniya
+https://www.404media.co/ham-radio-operators-in-belarus-arrested-face-the-death-penalty/
+https://www.404media.co/podcast-heres-what-palantir-is-really-building/
+https://www.404media.co/feds-create-drone-no-fly-zone-that-would-stop-people-filming-ice/
+https://www.404media.co/ohio-mail-theft-postal-worker-robbery/
+https://www.404media.co/how-wikipedia-will-survive-in-the-age-of-ai-with-wikipedias-cto-selena-deckelmann/
+https://www.404media.co/comic-con-bans-ai-art-after-artist-pushback/
+https://www.404media.co/ices-facial-recognition-app-misidentified-a-woman-twice/
+https://www.404media.co/behind-the-blog-putting-the-puzzle-together/
+https://www.404media.co/scientists-make-stunning-find-inside-prehistoric-wolfs-stomach/
+https://www.404media.co/theres-a-lootbox-with-rare-pokemon-cards-sitting-in-the-pentagon-food-court/
+https://dzone.com/articles/java-high-availability-failures
+https://dzone.com/articles/merge-liquid-clustering-common-issues
+https://dzone.com/articles/no-buffering-strategy-streaming-search-results
+https://dzone.com/articles/rag-ai-for-ai-builders
+https://dzone.com/articles/where-ai-fits-and-fails-in-workday-integrations
+https://dzone.com/articles/automated-inventory-pattern-for-managing-aws-ec2
+https://dzone.com/articles/future-of-data-streaming-apache-flink-for-agentic-ai
+https://dzone.com/articles/refactoring-react-monolith-with-autonomous-agents
+https://dzone.com/articles/mcp-security-governance-opportunity
+https://dzone.com/articles/build-ai-tools-go-mcp-sdk-databases
+https://www.reddit.com/r/MachineLearning/comments/1qj3t98/d_do_you_feel_like_companies_are_scooping_abusing/
+https://www.reddit.com/r/ollama/comments/1qjtnqm/how_to_implement_a_rag_retrieval_augmented/
+https://www.reddit.com/r/LLM/comments/1qjmi9f/using_ai_for_product_mockups/
+https://www.reddit.com/r/ollama/comments/1qjtqyr/what_do_you_guys_test_llms_in_cicd/
+https://www.reddit.com/r/datasets/comments/1qjiok5/i_finetuned_llama_32_1b_brazilian_address_parser/
+https://www.reddit.com/r/LLM/comments/1qjd8b0/i_liked_this_paper_251004226_epistemic_diversity/
+https://www.reddit.com/r/MachineLearning/comments/1qjmqy8/d_which_data_design_patterns_have_held_up_for_you/
+https://www.reddit.com/r/datasets/comments/1qjmvso/how_to_get_dfdc_dataset_access_is_the_website/
+https://www.reddit.com/r/technology/comments/1qji6cy/job_applicants_sue_to_open_black_box_of_ai_hiring/
+https://www.reddit.com/r/technology/comments/1qjh5o1/president_fcc_threatens_to_enforce_equaltime_rule/
+https://www.opennet.ru/opennews/art.shtml?num=64655
+https://www.opennet.ru/opennews/art.shtml?num=64657
+https://www.opennet.ru/opennews/art.shtml?num=64658
+https://www.opennet.ru/opennews/art.shtml?num=64649
+https://www.opennet.ru/opennews/art.shtml?num=64650
+https://www.opennet.ru/opennews/art.shtml?num=64651
+https://www.opennet.ru/opennews/art.shtml?num=64652
+https://www.opennet.ru/opennews/art.shtml?num=64653
+https://www.opennet.ru/opennews/art.shtml?num=64642
+https://www.opennet.ru/opennews/art.shtml?num=64644
+https://4pda.to/2026/01/22/452001/qwerty_smartfon_unihertz_titan_2_elite_pokazali_vzhivuyu_video/
+https://4pda.to/2026/01/22/452021/smena_komandy_prodyuser_world_of_warcraft_teper_rabotaet_nad_mmo_po_league_of_legends/
+https://4pda.to/2026/01/22/452008/kuler_igrovogo_smartfona_iqoo_15_ultra_pokazali_na_rendere/
+https://4pda.to/2026/01/22/452006/smi_era_nedorogikh_ssd_zakonchilas_i_dalshe_budet_eschyo_khuzhe/
+https://4pda.to/2026/01/22/452044/portativka_gpd_win_5_poluchit_ofitsialnuyu_podderzhku_os_bazzite/
+https://4pda.to/2026/01/22/452026/pokhozhe_marathon_ne_svetit_sudba_concord_gejmery_skupayut_predzakazy/
+https://4pda.to/2026/01/22/452045/dizajn_modulnoj_kamery_insta360_pocket_raskryt_do_anonsa_foto/
+https://4pda.to/2026/01/22/452051/asus_predstavila_ochen_dorogoj_8k_monitor_proart_pa32kcx_dlya_professionalov/
+https://4pda.to/2026/01/22/452052/smi_igrovaya_proizvoditelnost_linux_i_windows_pochti_ravna_no_ne_dlya_vsekh/
+https://4pda.to/2026/01/22/452025/eto_uspekh_vsego_za_nedelyu_gejmery_zagruzili_svyshe_5_mln_modov_dlya_hytale/
+https://techcrunch.com/2026/01/20/x-open-sources-its-algorithm-while-facing-a-transparency-fine-and-grok-controversies/
+https://techcrunch.com/2026/01/20/elon-musk-says-teslas-restarted-dojo3-will-be-for-space-based-ai-compute/
+https://techcrunch.com/2026/01/20/in-an-effort-to-protect-young-users-chatgpt-will-now-predict-how-old-you-are/
+https://techcrunch.com/2026/01/20/one-time-hot-insurance-tech-ethos-poised-to-be-first-tech-ipo-of-the-year/
+https://techcrunch.com/2026/01/20/netflix-to-redesign-its-app-as-it-competes-with-social-platforms-for-daily-engagement/
+https://techcrunch.com/2026/01/20/anthropics-ceo-stuns-davos-with-nvidia-criticism/
+https://techcrunch.com/2026/01/20/bolna-nabs-6-3-million-from-general-catalyst-for-its-india-focused-voice-orchestration-platform/
+https://techcrunch.com/2026/01/20/amagi-slides-in-india-debut-as-cloud-tv-software-firm-tests-investor-appetite/
+https://techcrunch.com/2026/01/20/snap-reaches-settlement-in-social-media-addiction-lawsuit/
+https://techcrunch.com/2026/01/21/consumers-spent-more-on-mobile-apps-than-games-in-2025-driven-by-ai-app-adoption/
+https://xakep.ru/2026/01/20/stackwarp/
+https://xakep.ru/2026/01/20/modular-ds/
+https://xakep.ru/2026/01/21/shadowrelay/
+https://xakep.ru/2026/01/21/almaty-meetup-jan/
+https://xakep.ru/2026/01/21/sideloading-restrictions/
+https://xakep.ru/2026/01/21/voidlink-ai/
+https://xakep.ru/2026/01/21/aisuru-kimwolf/
+https://xakep.ru/2026/01/21/vigi-patch/
+https://xakep.ru/2026/01/19/0days-win/
+https://xakep.ru/2026/01/19/gootloader-zip/
+https://uproger.com/podrobnyj-vvodnyj-kurs-po-parsingu-na-python-2026-godav-etom-besplatnom-kurse-v/
+https://uproger.com/anthropicai-vypustili-claude-cowork-po-suti-eto-claude-code-no-dlya-netehnare/
+https://uproger.com/ceo-cursor-zayavil-chto-oni-skoordinirovali-sotni-gpt-5-2-agentov-chtoby-avtonom/
+https://uproger.com/nvidia-kvzap-zhmem-kv-kesh-v-4-raza-vse-lyubyat-dlinnyj-kontekst-no-dlya-gpu-eto-b/
+https://uproger.com/google-problema-data-czentrov-uzhe-ne-v-kupit-elektrichestvo-problema-podkl/
+https://uproger.com/%f0%9f%a4%96-luchshie-github-repozitorii-chtoby-vyuchit-ai-s-nulya-v-2026/
+https://uproger.com/google-pokazala-interesnyj-primer-togo-kak-multimodeli-uzhe-pomogayut-v-gumanit/
+https://uproger.com/glavnye-novosti-ii-i-ml-openai-zapustila-chatgpt-health-chatgpt-health-otdel/
+https://uproger.com/u-deepseek-mozhet-byt-odin-iz-samyh-silnyh-skrytyh-istochnikov-finansirovaniya/
+https://uproger.com/grok-4-20-ii-nashyol-novuyu-bellman-funkcziyu-i-prodvinul-slozhnuyu-zadachu-v-analizep/
+https://gopractice.ru/product/kano-model/
+https://gopractice.ru/product/segmentation-method/
+https://gopractice.ru/product/the-north-star-metric-guide/
+https://gopractice.ru/product/focus-on-the-job-not-the-customer/
+https://gopractice.ru/product/finding-potential-ai-applications/
+https://gopractice.ru/product/large-language-models/
+https://gopractice.ru/product/ai-products/
+https://gopractice.ru/product/jtbd-the-theory-and-the-frameworks/
+https://gopractice.ru/product/metrics/
+https://gopractice.ru/product/ai-products-mazes/
+https://vas3k.club/post/29747/
+https://vas3k.blog/notes/invest/
+https://vas3k.blog/world/normandy/
+https://vas3k.blog/blog/unstoppable_web/
+https://vas3k.blog/world/south_africa/
+https://vas3k.blog/notes/flipperzero/
+https://vas3k.blog/notes/bow/
+https://vas3k.blog/blog/bus_2022/
+https://vas3k.blog/notes/homelab_2022/
+https://vas3k.blog/blog/2022/
+https://vas3k.blog/world/japan/
+https://www.theinformation.com/briefings/china-vows-retaliate-trump-imposes-new-100-tariff
+https://www.theinformation.com/articles/ai-ad-tech-land-grab-pits-salesforce-google-microsoft-amazon
+https://www.theinformation.com/briefings/nvidia-announces-chip-design-deal-broadcom
+https://www.theinformation.com/articles/robots-beat-optimus-space
+https://www.theinformation.com/briefings/data-startups-fivetran-dbt-labs-announced-merger
+https://www.theinformation.com/briefings/jpmorgan-commits-10-billion-strategic-tech-investments
+https://www.theinformation.com/briefings/apple-rebrands-streaming-service-apple-tv
+https://www.theinformation.com/articles/borrowed-money-fueled-cryptos-700-billion-sell
+https://www.theinformation.com/articles/openai-working-softbanks-arm-broadcom-ai-chip-effort
+https://techrocks.ru/2025/02/20/20-useful-typescript-tricks/
+https://techrocks.ru/2025/07/16/queues-in-typescript/
+https://techrocks.ru/2024/12/27/black-box-testing/
+https://techrocks.ru/2025/01/10/how-to-add-watermarks-to-images/
+https://techrocks.ru/2025/01/19/how-to-build-accessible-modals/
+https://techrocks.ru/2025/01/23/how-to-merge-word-documents-in-python/
+https://techrocks.ru/2025/01/29/10-react-one-liners/
+https://techrocks.ru/2025/02/07/how-to-merge-word-documents-using-nodejs/
+https://techrocks.ru/2025/02/20/20-useful-typescript-tricks/
+https://techrocks.ru/2025/03/14/20-useful-js-tricks/
+https://polarsparc.com/2025/05/31/open-webui/
+https://polarsparc.com/2025/06/15/langchain-recipes/
+https://polarsparc.com/2025/06/15/langchain/
+https://polarsparc.com/2025/06/21/pragmatic-spring-ai/
+https://polarsparc.com/2025/06/29/llama-cpp/
+https://polarsparc.com/2025/07/04/hyperledger-besu-docker/
+https://polarsparc.com/2025/07/05/docker-model-runner/
+https://polarsparc.com/2025/07/12/cross-entropy/
+https://polarsparc.com/2025/07/19/anvil-solidity-python/
+https://polarsparc.com/2025/08/01/polarsparc-retire/
--- a/prompts/summarization.txt
+++ b/prompts/summarization.txt
@@ -1 +1,18 @@
-Summarize the following text in 1-2 sentences: '{text}'
+Task: Produce a concise summary in Russian.   
+
+Input: A block of text that may be written in Russian or any other language. The text can contain code snippets, tables, and other non‑textual elements.   
+
+Requirements:   
+
+    Summarize the content of the input text.  
+    The output must be only the summary, written entirely in Russian.  
+    Exclude all code blocks, tables, images, links, and any formatting (no indentations, no bullet points, no headings).  
+    The summary should be a single paragraph, as short as possible while still covering the main ideas.  
+    Do not add any commentary, explanations, or additional text before or after the summary.
+         
+
+Example:
+[User supplies text]
+[Model outputs only the summary in Russian, no code or tables] 
+
+'{text}'
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,3 @@
-ollama>=0.1.0
+ollama>=0.6.1
 pymongo==3.13.0
 tqdm>=4.60.0
--- a/run.sh
+++ b/run.sh
@@ -47,7 +47,18 @@ if [ -n "$1" ]; then
  elif [[ "$1" == "gen-mongo" ]]; then
    activate
    echo "🔍 Генерирую тесты пересказов из MongoDB... "
-    python scripts/generate_summarization_from_mongo.py --record-id "$2"
+    if [[ -n "$2" ]] && [[ "$2" != "--id-file" ]]; then
+      # Старый формат: ./run.sh gen-mongo <record-id>
+      python scripts/generate_summarization_from_mongo.py --record-id "$2"
+    elif [[ -n "$2" ]] && [[ "$2" == "--id-file" ]]; then
+      # Новый формат: ./run.sh gen-mongo --id-file <file-path>
+      shift 2
+      python scripts/generate_summarization_from_mongo.py --id-file "$1"
+    else
+      echo "❌ Ошибка: Укажите либо <record-id>, либо --id-file <file-path>"
+      echo "Использование: ./run.sh gen-mongo <record-id> или ./run.sh gen-mongo --id-file <file-path>"
+      exit 1
+    fi
    echo "✅ Тесты из MongoDB успешно сгенерированы"
  fi
 else
@@ -58,10 +69,12 @@ else
    echo " * run - запуск бенчмарков (translation, summarization, codegen)"
    echo " * clean - очистка отчетов"
    echo " * gen - генерация тестов через Ollama (translation, summarization, codegen)"
-    echo " * gen-mongo - генерация тестов пересказов из MongoDB (использование: ./run.sh gen-mongo <record-id> [output-dir])"
+    echo " * gen-mongo - генерация тестов пересказов из MongoDB (использование: ./run.sh gen-mongo <record-id> или ./run.sh gen-mongo --id-file <file-path>)"
    echo ""
    echo "Примеры использования:"
    echo " * ./run.sh run -m second_constantine/t-lite-it-1.0:7b -b translation summarization"
+    echo " * ./run.sh run -m second_constantine/t-lite-it-1.0:7b --num-ctx 16000"
    echo " * ./run.sh gen"
    echo " * ./run.sh gen-mongo 507f1f77bcf86cd799439011"
+    echo " * ./run.sh gen-mongo --id-file ids.txt"
  fi
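The either/or rule that run.sh enforces by hand above is the same rule the Python script in this commit delegates to argparse's mutually exclusive group. The following is a minimal sketch of that behaviour in isolation. The `prog` name and the `parse_or_none` helper are hypothetical and exist only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser that requires exactly one of --record-id / --id-file."""
    parser = argparse.ArgumentParser(prog="gen-mongo-sketch")  # hypothetical name
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--record-id", type=str, help="single record ID")
    group.add_argument("--id-file", type=str, help="file with one ID per line")
    return parser


def parse_or_none(argv):
    """Return parsed args, or None when argparse rejects the input."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage message to stderr and exits with status 2
        return None


if __name__ == "__main__":
    print(parse_or_none(["--record-id", "507f1f77bcf86cd799439011"]))
    print(parse_or_none(["--id-file", "ids.txt"]))
    print(parse_or_none([]))  # neither option given: rejected
    print(parse_or_none(["--record-id", "x", "--id-file", "ids.txt"]))  # both: rejected
```

Because the group is declared with `required=True`, the "neither given" and "both given" cases are rejected by the parser itself, so the shell wrapper only needs to forward its arguments.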
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -32,17 +32,23 @@ python scripts/generate_tests.py --count 2 --category translation --model second

 **Функциональность:**
 - Извлекает текст статьи из коллекции `rssNotification` (поле `.meta.topicContent`)
-- Генерирует тестовые данные в формате JSON для бенчмарка AI
+- Генерирует тестовые данные в формате TXT для бенчмарка AI
 - Валидирует generated тесты
+- Поддерживает обработку как одной записи, так и нескольких записей из файла

 **Использование:**
 ```bash
+# Для обработки одной записи
 python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011
+
+# Для обработки нескольких записей из файла
+python scripts/generate_summarization_from_mongo.py --id-file ids.txt
 ```

 **Параметры:**
-- `--record-id`: ID записи в MongoDB (обязательный параметр)
-- `--output-dir`: Директория для сохранения generated тестов (по умолчанию: tests/summarization)
+- `--record-id`: ID записи в MongoDB (для обработки одной записи)
+- `--id-file`: Файл с ID записей (по одной на строку, для обработки нескольких записей)
+  * Примечание: Укажите либо `--record-id`, либо `--id-file`, но не оба одновременно

 **Требования:**
 - Доступ к MongoDB кластеру (10.0.0.3, 10.0.0.4, 10.0.0.5)
@@ -56,6 +62,13 @@ Summarize the following text in 1-2 sentences: 'Текст статьи из Mon

 **Примечание:** Тесты генерируются в формате TXT с разделителем `==============`. Поле "expected" может быть пустым, если генерация пересказа не удалась.

+**Обработка файла с ID:**
+- Скрипт читает ID из файла построчно
+- Обрабатывает каждую запись по очереди
+- Выводит прогресс и статистику по обработке
+- Продолжает обработку остальных записей даже при ошибках отдельных записей
+- Выводит подробные логи об ошибках для каждой неудачной записи
+
 ## Установка зависимостей

 Для работы скриптов требуются следующие зависимости:
--- a/scripts/generate_summarization_from_mongo.py
+++ b/scripts/generate_summarization_from_mongo.py
@@ -66,12 +66,13 @@ def extract_text_from_topic_content(topic_content: Dict) -> Optional[str]:

    return content_str

-def generate_test_from_mongo_record(record_id: str) -> bool:
+def generate_test_from_mongo_record(record_id: str, verbose: bool = True) -> bool:
    """
    Генерирует тест пересказа из записи MongoDB.

    Args:
        record_id: ID записи в MongoDB
+        verbose: Выводить подробную отладочную информацию (по умолчанию: True)

    Returns:
        True, если тест успешно generated, False в случае ошибки
@@ -84,32 +85,42 @@ def generate_test_from_mongo_record(record_id: str) -> bool:
        # Извлекаем запись по ID
        record = collection.find_one({"_id": record_id})
        if not record:
-            print(f"❌ Запись с ID {record_id} не найдена в коллекции")
+            if verbose:
+                print(f"❌ Запись с ID {record_id} не найдена в коллекции")
            return False

-        # Отладочная информация
-        print(f"🔍 Найдена запись: {record_id}")
-        print(f"📋 Полная структура записи:")
-        print(json.dumps(record, ensure_ascii=False, indent=2, default=str))
+        # Отладочная информация (только при verbose=True)
+        if verbose:
+            print(f"🔍 Найдена запись: {record_id}")
+            print(f"📋 Полная структура записи:")
+            print(json.dumps(record, ensure_ascii=False, indent=2, default=str))

        # Извлекаем текст из meta.topicContent
        meta_data = record.get('meta', {})
        topic_content = meta_data.get('topicContent')
        if not topic_content:
-            print(f"❌ В записи {record_id} отсутствует поле meta.topicContent")
+            if verbose:
+                print(f"❌ В записи {record_id} отсутствует поле meta.topicContent")
+                print(f"📋 Полная структура записи:")
+                print(json.dumps(record, ensure_ascii=False, indent=2, default=str))
            return False

-        print(f"📝 Тип поля meta.topicContent: {type(topic_content)}")
-        print(f"📝 Содержимое meta.topicContent (первые 500 символов):")
-        print(str(topic_content)[:500])
+        if verbose:
+            print(f"📝 Тип поля meta.topicContent: {type(topic_content)}")
+            print(f"📝 Содержимое meta.topicContent (первые 500 символов):")
+            print(str(topic_content)[:500])

        # Извлекаем текст
        article_text = extract_text_from_topic_content(topic_content)
        if not article_text:
-            print(f"❌ Не удалось извлечь текст из meta.topicContent записи {record_id}")
+            if verbose:
+                print(f"❌ Не удалось извлечь текст из meta.topicContent записи {record_id}")
+                print(f"📋 Полная структура записи:")
+                print(json.dumps(record, ensure_ascii=False, indent=2, default=str))
            return False

-        print(f"📝 Итоговый текст (первые 500 символов): {article_text[:500]}")
+        if verbose:
+            print(f"📝 Итоговый текст (первые 500 символов): {article_text[:500]}")

        # Генерируем пересказ через LLM (если доступно)
        expected_summary = ""
@@ -132,7 +143,8 @@ Provide only the summary, no additional text."""
        # Очищаем ID от недопустимых символов для имени файла
        filename = sanitize_filename(record_id)
        if not filename:
-            print(f"❌ Не удалось создать допустимое имя файла из ID записи {record_id}")
+            if verbose:
+                print(f"❌ Не удалось создать допустимое имя файла из ID записи {record_id}")
            return False

        # Используем очищенный ID записи как имя файла
@@ -142,14 +154,23 @@ Provide only the summary, no additional text."""
        with open(test_file, "w", encoding="utf-8") as f:
            f.write(f"{article_text}{TEST_SEPARATOR}{expected_summary}")

-        print(f"✅ Создан тест tests/summarization/{filename}.txt")
-        print(f"   Источник: MongoDB запись {record_id}")
-        print(f"   Текст статьи (первые 100 символов): {article_text[:100]}...")
+        if verbose:
+            print(f"✅ Создан тест tests/summarization/{filename}.txt")
+            print(f"   Источник: MongoDB запись {record_id}")
+            print(f"   Текст статьи (первые 100 символов): {article_text[:100]}...")

        return True

    except Exception as e:
-        print(f"❌ Ошибка при генерации теста: {e}")
+        if verbose:
+            print(f"❌ Ошибка при генерации теста: {e}")
+            try:
+                record = collection.find_one({"_id": record_id})
+                if record:
+                    print(f"📋 Полная структура записи:")
+                    print(json.dumps(record, ensure_ascii=False, indent=2, default=str))
+            except Exception:
+                pass
        return False
    finally:
        if 'client' in locals():
@@ -187,33 +208,93 @@ def validate_test(test_data: Dict[str, str]) -> bool:

    return True

+def read_ids_from_file(file_path: str) -> list:
+    """
+    Читает ID записей из файла.
+
+    Args:
+        file_path: Путь к файлу с ID записей (по одной на строку)
+
+    Returns:
+        Список ID записей
+    """
+    try:
+        with open(file_path, 'r', encoding='utf-8') as f:
+            ids = [line.strip() for line in f if line.strip()]
+        return ids
+    except Exception as e:
+        print(f"❌ Ошибка при чтении файла {file_path}: {e}")
+        return []
+
 def main():
    """Основная функция скрипта."""
    parser = argparse.ArgumentParser(
        description="Генератор тестов пересказов из MongoDB",
        epilog="Примеры использования:\n"
-               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011"
+               "  python scripts/generate_summarization_from_mongo.py --record-id 507f1f77bcf86cd799439011\n"
+               "  python scripts/generate_summarization_from_mongo.py --id-file ids.txt"
    )
-    parser.add_argument(
+    group = parser.add_mutually_exclusive_group(required=True)
+    group.add_argument(
        "--record-id",
        type=str,
-        required=True,
-        help="ID записи в MongoDB (обязательный параметр)"
+        help="ID записи в MongoDB (для обработки одной записи)"
+    )
+    group.add_argument(
+        "--id-file",
+        type=str,
+        help="Файл с ID записей (по одной на строку, для обработки нескольких записей)"
    )

    args = parser.parse_args()

-    print(f"🔍 Подключаюсь к MongoDB кластеру...")
-    print(f"📄 Извлекаю запись с ID: {args.record_id}")
-    print(f"💾 Сохраняю тест в: tests/summarization/")
+    if args.record_id:
+        print(f"🔍 Подключаюсь к MongoDB кластеру...")
+        print(f"📄 Извлекаю запись с ID: {args.record_id}")
+        print(f"💾 Сохраняю тест в: tests/summarization/")

-    success = generate_test_from_mongo_record(args.record_id)
+        success = generate_test_from_mongo_record(args.record_id)

-    if success:
-        print("\n✨ Готово! Тест успешно generated.")
-    else:
-        print("\n❌ Не удалось generated тест.")
-        sys.exit(1)
+        if success:
+            print("\n✨ Готово! Тест успешно generated.")
+        else:
+            print("\n❌ Не удалось generated тест.")
+            sys.exit(1)
+    elif args.id_file:
+        print(f"🔍 Подключаюсь к MongoDB кластеру...")
+        print(f"📄 Извлекаю ID записи из файла: {args.id_file}")
+        print(f"💾 Сохраняю тесты в: tests/summarization/")
+
+        # Читаем ID из файла
+        record_ids = read_ids_from_file(args.id_file)
+        if not record_ids:
+            print("❌ Файл с ID записей пуст или недействителен.")
+            sys.exit(1)
+
+        print(f"📊 Найдено {len(record_ids)} ID записей в файле")
+        print("🔄 Начинаю обработку записей...\n")
+
+        success_count = 0
+        error_count = 0
+
+        for i, record_id in enumerate(record_ids, 1):
+            print(f"[{i}/{len(record_ids)}] Обрабатываю запись: {record_id}")
+            success = generate_test_from_mongo_record(record_id, verbose=True)
+            if success:
+                success_count += 1
+            else:
+                error_count += 1
+            print()  # Пустая строка для разделения логов
+
+        print(f"\n📊 Итог:")
+        print(f"   ✅ Успешно generated: {success_count}")
+        print(f"   ❌ Ошибки: {error_count}")
+
+        if error_count > 0:
+            print(f"\n⚠️  Некоторые записи были обработаны с ошибками. Проверьте логи выше.")
+            sys.exit(1)
+        else:
+            print("\n✨ Готово! Все тесты успешно generated.")

 if __name__ == "__main__":
    main()
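The batch flow added to `main()` (read IDs, process each one, keep going after a failure, report counts) can be exercised without a MongoDB connection. Below is a rough sketch under that assumption. `read_ids`, `process_all` and `fake_handler` are hypothetical names, and `fake_handler` merely stands in for `generate_test_from_mongo_record`.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple


def read_ids(path: str) -> List[str]:
    """Return the non-empty, stripped lines of the file."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def process_all(ids: List[str], handler: Callable[[str], bool]) -> Tuple[int, int]:
    """Run handler on every ID and return (success_count, error_count)."""
    ok = failed = 0
    for record_id in ids:
        try:
            succeeded = handler(record_id)
        except Exception:
            # One bad record must not stop the rest of the batch.
            succeeded = False
        if succeeded:
            ok += 1
        else:
            failed += 1
    return ok, failed


def fake_handler(record_id: str) -> bool:
    """Hypothetical stand-in: IDs containing 'missing' count as not found."""
    return "missing" not in record_id


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        id_file = Path(tmp) / "ids.txt"
        id_file.write_text(
            "https://example.com/a\n\n  https://example.com/missing  \nhttps://example.com/b\n",
            encoding="utf-8",
        )
        ids = read_ids(str(id_file))
        ok, failed = process_all(ids, fake_handler)
        print(f"{len(ids)} ids, {ok} succeeded, {failed} failed")
```

Returning the two counters instead of exiting inside the loop keeps the exit-status decision (non-zero when any record failed, as the script does) in one place at the call site.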
--- a/src/benchmarks/base.py
+++ b/src/benchmarks/base.py
@@ -46,13 +46,14 @@ class Benchmark(ABC):
        """
        pass

-    def run(self, ollama_client: OllamaClient, model_name: str) -> Dict[str, Any]:
+    def run(self, ollama_client: OllamaClient, model_name: str, num_ctx: int = 32000) -> Dict[str, Any]:
        """
        Запуск бенчмарка.

        Args:
            ollama_client: Клиент для работы с Ollama
            model_name: Название модели
+            num_ctx: Размер контекста

        Returns:
            Результаты бенчмарка
@@ -71,9 +72,11 @@ class Benchmark(ABC):

                # Получение ответа от модели
                prompt = test_case['prompt']
+                self.logger.debug(f"Prompt: {prompt[:200]}...")  # Логируем начало промпта
                model_response = ollama_client.generate(
                    model=model_name,
                    prompt=prompt,
+                    num_ctx=num_ctx,
                    options={'temperature': 0.7}
                )

@@ -101,7 +104,10 @@ class Benchmark(ABC):
                })

            except Exception as e:
-                self.logger.error(f"Error in test case {i}: {e}")
+                error_msg = f"Error in test case {i} (name: {test_case['name']}): {e}"
+                self.logger.error(error_msg)
+                if 'prompt' in locals():
+                    self.logger.debug(f"Failed prompt: {prompt[:500]}")
                results.append({
                    'test_case': test_case['name'],
                    'error': str(e)
--- a/src/benchmarks/summarization.py
+++ b/src/benchmarks/summarization.py
@@ -32,7 +32,7 @@ class SummarizationBenchmark(Benchmark):
                    if len(parts) == 2:
                        test_data.append({
                            'name': filename.replace('.txt', ''),
-                            'prompt': self.universal_prompt.format(task=parts[0]),
+                            'prompt': self.universal_prompt.format(text=parts[0]),
                            'expected': parts[1]
                        })
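Review note on the `task=` to `text=` rename above: `str.format` raises `KeyError` when the template names a placeholder that is not supplied, so the keyword has to match the placeholder exactly. The sketch below assumes the summarization template uses a `{text}` placeholder, which the rename suggests but the diff does not show; the template string and function names are hypothetical.

```python
# Hypothetical template; the real universal_prompt is not shown in this diff.
TEMPLATE = "Summarize the following text in Russian:\n{text}"


def build_prompt(source_text):
    """Keyword matches the {text} placeholder, so formatting succeeds."""
    return TEMPLATE.format(text=source_text)


def build_prompt_old(source_text):
    """Old keyword: {text} is never supplied, so format() raises KeyError."""
    return TEMPLATE.format(task=source_text)


print(build_prompt("Article body"))
try:
    build_prompt_old("Article body")
except KeyError as exc:
    print(f"KeyError for placeholder: {exc}")
```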

--- a/src/main.py
+++ b/src/main.py
@@ -18,7 +18,7 @@ def setup_logging(verbose: bool = False):
        ]
    )

-def run_benchmarks(ollama_client: OllamaClient, model_name: str, benchmarks: List[str]) -> List[dict]:
+def run_benchmarks(ollama_client: OllamaClient, model_name: str, benchmarks: List[str], num_ctx: int) -> List[dict]:
    """
    Run the selected benchmarks.

@@ -26,6 +26,7 @@ def run_benchmarks(ollama_client: OllamaClient, model_name: str, benchmarks: Lis
        ollama_client: Client for talking to Ollama
        model_name: Model name
        benchmarks: List of benchmark names to run
+        num_ctx: Context size

    Returns:
        List of benchmark results
@@ -45,7 +46,7 @@ def run_benchmarks(ollama_client: OllamaClient, model_name: str, benchmarks: Lis

        logging.info(f"Running {benchmark_name} benchmark...")
        benchmark = benchmark_classes[benchmark_name]()
-        result = benchmark.run(ollama_client, model_name)
+        result = benchmark.run(ollama_client, model_name, num_ctx)
        results.append(result)

    return results
@@ -59,6 +60,7 @@ def main():
                       help='List of benchmarks to run (translation, summarization, codegen)')
    parser.add_argument('-o', '--output', default='results', help='Directory for saving results')
    parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output mode')
+    parser.add_argument('--num-ctx', type=int, default=32000, help='Context size for the model (default: 32000)')

    args = parser.parse_args()

@@ -68,13 +70,14 @@ def main():
    logging.info(f"Starting benchmarking for model: {args.model}")
    logging.info(f"Ollama URL: {args.ollama_url}")
    logging.info(f"Benchmarks to run: {', '.join(args.benchmarks)}")
+    logging.info(f"Context size: {args.num_ctx}")

    try:
        # Initialize the client
        ollama_client = OllamaClient(args.ollama_url)

        # Run the benchmarks
-        results = run_benchmarks(ollama_client, args.model, args.benchmarks)
+        results = run_benchmarks(ollama_client, args.model, args.benchmarks, args.num_ctx)

        # Generate reports
        report_generator = ReportGenerator()
@@ -88,7 +91,7 @@ def main():
        logging.info("Benchmarking completed successfully!")

    except Exception as e:
-        logging.error(f"Error during benchmarking: {e}")
+        logging.error(f"Error during benchmarking: {e}", exc_info=True)
        return 1

    return 0
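Review note on the new `--num-ctx` flag: argparse converts the dashed option name into the attribute `num_ctx`, which is why the diff reads `args.num_ctx`. A minimal sketch, assuming only the one argument shown in the diff (the `build_parser` helper is hypothetical and omits the other options):

```python
import argparse


def build_parser():
    """Hypothetical reduced parser containing only the flag added in this commit."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--num-ctx', type=int, default=32000,
                        help='Context size for the model (default: 32000)')
    return parser


# The dash in the flag becomes an underscore in the namespace attribute.
print(build_parser().parse_args([]).num_ctx)
print(build_parser().parse_args(['--num-ctx', '8192']).num_ctx)
```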
--- a/src/models/ollama_client.py
+++ b/src/models/ollama_client.py
@@ -16,13 +16,14 @@ class OllamaClient:
        self.client = Client(host=base_url)
        self.logger = logging.getLogger(__name__)

-    def generate(self, model: str, prompt: str, **kwargs) -> str:
+    def generate(self, model: str, prompt: str, num_ctx: int = 32000, **kwargs) -> str:
        """
        Generate a response from the model.

        Args:
            model: Model name
            prompt: Input prompt
+            num_ctx: Context size (default: 32000)
            **kwargs: Additional request parameters

        Returns:
@@ -33,23 +34,32 @@ class OllamaClient:
        """
        try:
            self.logger.info(f"Generating response for model {model}")
+            self.logger.debug(f"Prompt: {prompt[:200]}...")  # Log the start of the prompt
+            # Merge options from kwargs with num_ctx
+            options = {'num_ctx': num_ctx}
+            if 'options' in kwargs:
+                options.update(kwargs.pop('options'))
            response = self.client.generate(
                model=model,
                prompt=prompt,
+                options=options,
                **kwargs
            )
            return response['response']
        except Exception as e:
-            self.logger.error(f"Error generating response: {e}")
+            error_msg = f"Error generating response for model {model}: {e}"
+            self.logger.error(error_msg)
+            self.logger.debug(f"Failed prompt: {prompt[:500]}")
            raise

-    def chat(self, model: str, messages: list, **kwargs) -> str:
+    def chat(self, model: str, messages: list, num_ctx: int = 32000, **kwargs) -> str:
        """
        Chat with the model.

        Args:
            model: Model name
            messages: List of messages in the format [{'role': 'user', 'content': '...'}, ...]
+            num_ctx: Context size (default: 32000)
            **kwargs: Additional request parameters

        Returns:
@@ -60,9 +70,14 @@ class OllamaClient:
        """
        try:
            self.logger.info(f"Chatting with model {model}")
+            # Merge options from kwargs with num_ctx
+            options = {'num_ctx': num_ctx}
+            if 'options' in kwargs:
+                options.update(kwargs.pop('options'))
            response = self.client.chat(
                model=model,
                messages=messages,
+                options=options,
                **kwargs
            )
            return response['message']['content']
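Review note on the options merge added to `generate` and `chat`: the dictionary starts with `num_ctx` and is then updated with any caller-supplied `options`, so a caller that passes its own `num_ctx` inside `options` overrides the explicit parameter. The sketch below isolates that merge in a standalone helper; `merge_options` is a hypothetical name and no Ollama call is made.

```python
def merge_options(num_ctx=32000, **kwargs):
    """Reproduce the merge from the diff and return (options, remaining kwargs)."""
    options = {'num_ctx': num_ctx}
    if 'options' in kwargs:
        # Caller-supplied options are applied last, so they win on key conflicts.
        options.update(kwargs.pop('options'))
    return options, kwargs


# Temperature is preserved next to the context size.
print(merge_options(8192, options={'temperature': 0.7}))
# A num_ctx inside options overrides the explicit argument.
print(merge_options(8192, options={'num_ctx': 4096}))
```

If the explicit parameter should take precedence instead, the update order would need to be reversed; the diff as written favors the `options` dictionary.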
--- a/tests/codegen/test2.txt
+++ b/tests/codegen/test2.txt
@@ -1,4 +0,0 @@
-Write a Python function that reverses a string.
-==============
-def reverse_string(s):
-    return s[::-1]
--- a/tests/codegen/test3.txt
+++ b/tests/codegen/test3.txt
@@ -1,9 +0,0 @@
-Write a Python function that checks if a number is prime.
-==============
-def is_prime(n):
-    if n <= 1:
-        return False
-    for i in range(2, int(n**0.5) + 1):
-        if n % i == 0:
-            return False
-    return True
--- a/tests/summarization/https___4pda.to_2026_01_22_452001_qwerty_smartfon_unihertz_titan_2_elite_pokazali_vzhivuyu_video_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452001_qwerty_smartfon_unihertz_titan_2_elite_pokazali_vzhivuyu_video_.txt
@@ -0,0 +1,2 @@
+QWERTY-смартфон Unihertz Titan 2 Elite показали «вживую» [ВИДЕО] 22.01.26 15 На YouTube-канале компании Unihertz появился ролик с демонстрацией недавно анонсированного QWERTY-смартфона Titan 2 Elite. Видеоряд даёт более наглядное представление о размерах устройства и его функциональных возможностях. В частности, производитель показал пример использования физической клавиатуры — запуск приложения Google Keep через удержание клавиши K и выход из него по кнопке Home, расположенной рядом с пробелом. Клавиша fn, в свою очередь, намекает на множество вариантов комбинаций для запуска тех или иных действий. 1 / 3 Список характеристик Unihertz Titan 2 Elite по-прежнему не опубликован, поскольку гаджет всё ещё находится на стадии тестирования. Подробности о новинке станут известны на выставке MWC 2026, которая пройдёт в Барселоне со 2 по 5 марта 2026 года. Источник: youtube.com Автор: Шамиль Алиуллов # Unihertz Unihertz Titan 2 Elite QWERTY-смартфон Unihertz Titan 2 Elite показали «вживую» [ВИДЕО] 22.01.26 15 На YouTube-канале компании Unihertz появился ролик с демонстрацией недавно анонсированного QWERTY-смартфона Titan 2 Elite. Видеоряд даёт более наглядное представление о размерах устройства и его функциональных возможностях. 22.01.26 15 На YouTube-канале компании Unihertz появился ролик с демонстрацией недавно анонсированного QWERTY-смартфона Titan 2 Elite. Видеоряд даёт более наглядное представление о размерах устройства и его функциональных возможностях. 22.01.26 15 15 1 / 3 1 / 3 1 / 3           Источник: youtube.com Автор: Шамиль Алиуллов # Unihertz Unihertz Titan 2 Elite Источник: youtube.com Автор: Шамиль Алиуллов Источник: youtube.com Автор: Шамиль Алиуллов # Unihertz Unihertz Titan 2 Elite
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452006_smi_era_nedorogikh_ssd_zakonchilas_i_dalshe_budet_eschyo_khuzhe_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452006_smi_era_nedorogikh_ssd_zakonchilas_i_dalshe_budet_eschyo_khuzhe_.txt
@@ -0,0 +1,2 @@
+СМИ: эра недорогих SSD закончилась — и дальше будет ещё хуже 22.01.26 67 Вслед за оперативной памятью по всему миру подорожали и накопители. И это, похоже, надолго: представитель одного из крупнейших поставщиков NAND уже заявил о завершении эпохи дешёвых SSD, а два других бренда сокращают производство флеш-памяти в пользу более выгодной DRAM. По заявлению топ-менеджера Kioxia Шунсуке Накато, компания уже распродала все чипы, которые сойдут с её конвейеров в текущем году. Он утверждает, что это приведёт к дефициту и дальнейшему росту цен, а дни недорогих SSD по $45 за терабайт, по его мнению, уже закончились. Кроме того, по информации корейского издания Chosun Biz, производство NAND-памяти в этом году заметно сокращают Samsung и SK Hynix, переключившись на нужды NVIDIA в сфере ИИ-ускорителей. Это уже в ближайшее время скажется на потребительском рынке. По мнению профильных СМИ, на рынке твердотельных накопителей реализуется тот же сценарий, что и с оперативной памятью. По прогнозам журналистов, SSD продолжат скачкообразно дорожать минимум до 2027 года. Источник: wccftech.com Автор: Виктория Анисимова СМИ: эра недорогих SSD закончилась — и дальше будет ещё хуже 22.01.26 67 Вслед за оперативной памятью по всему миру подорожали и накопители. И это, похоже, надолго: представитель одного из крупнейших поставщиков NAND уже заявил о завершении эпохи дешёвых SSD, а два других бренда сокращают производство флеш-памяти в пользу более выгодной DRAM. 22.01.26 67 Вслед за оперативной памятью по всему миру подорожали и накопители. И это, похоже, надолго: представитель одного из крупнейших поставщиков NAND уже заявил о завершении эпохи дешёвых SSD, а два других бренда сокращают производство флеш-памяти в пользу более выгодной DRAM. 22.01.26 67 67 Источник: wccftech.com Автор: Виктория Анисимова Источник: wccftech.com Автор: Виктория Анисимова Источник: wccftech.com Автор: Виктория Анисимова
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452008_kuler_igrovogo_smartfona_iqoo_15_ultra_pokazali_na_rendere_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452008_kuler_igrovogo_smartfona_iqoo_15_ultra_pokazali_na_rendere_.txt
@@ -0,0 +1,2 @@
+Кулер игрового смартфона iQOO 15 Ultra показали на рендере 22.01.26 13 Инсайдер Digital Chat Station опубликовал схематическое изображение ещё не представленного смартфона iQOO 15 Ultra. Рендер демонстрирует одну из ключевых особенностей модели, отчасти благодаря которой она установила рекорд производительности в AnTuTu. Судя по эскизу, конструкция системы активного охлаждения смартфона включает воздуховод с отверстиями в блоке камер, через которые в корпус попадает холодный воздух. Горячий, в свою очередь, выбрасывается через решётку на боковой грани гаджета. По слухам, размеры вентилятора составят 17 x 17 x 4 мм. Эффективная площадь испарительной камеры, согласно источнику, составит 8000 мм². Заодно информатор сообщил, что смартфон будет представлен в конфигурациях памяти 16 + 512 ГБ и 24 ГБ + 1 ТБ. Информацией об устройстве ранее поделился и директор по продуктам iQOO Галант Ви. По его словам, новинка ориентирована не только на пиковую производительность (60 fps на максимальных настройках графики «в одной из самых требовательных игр»), но и на стабильность фреймрейта. Премьера iQOO 15 Ultra ожидается в начале февраля 2026 года, цена смартфона по-прежнему неизвестна. Источник: gizmochina.com Автор: Шамиль Алиуллов # iQOO iQOO 15 Ultra Кулер игрового смартфона iQOO 15 Ultra показали на рендере 22.01.26 13 Инсайдер Digital Chat Station опубликовал схематическое изображение ещё не представленного смартфона iQOO 15 Ultra. Рендер демонстрирует одну из ключевых особенностей модели, отчасти благодаря которой она установила рекорд производительности в AnTuTu. 22.01.26 13 Инсайдер Digital Chat Station опубликовал схематическое изображение ещё не представленного смартфона iQOO 15 Ultra. Рендер демонстрирует одну из ключевых особенностей модели, отчасти благодаря которой она установила рекорд производительности в AnTuTu. 
22.01.26 13 13 Источник: gizmochina.com Автор: Шамиль Алиуллов # iQOO iQOO 15 Ultra Источник: gizmochina.com Автор: Шамиль Алиуллов Источник: gizmochina.com Автор: Шамиль Алиуллов # iQOO iQOO 15 Ultra
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452021_smena_komandy_prodyuser_world_of_warcraft_teper_rabotaet_nad_mmo_po_league_of_legends_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452021_smena_komandy_prodyuser_world_of_warcraft_teper_rabotaet_nad_mmo_po_league_of_legends_.txt
@@ -0,0 +1,2 @@
+Смена команды. Продюсер World of Warcraft теперь работает над MMO по League of Legends 22.01.26 6 В игровой индустрии (западной её части во всяком случае) мало кто засиживается в одной компании надолго. Из-за этого порой возникают любопытные ситуации, когда сотрудников одной студии в какой-то момент нанимают её конкуренты. Рэймонд Бартос Например, ведущий продюсер World of Warcraft Рэймонд Бартос променял офис Blizzard Entertainment на уютную должность в Riot Games. Там он будет трудиться над MMO по мотивам League of Legends вместе с Орландо Сальваторе, другим ветераном Blizzard. Riot, отметим, уже очень давно вынашивает свою MMORPG. Разработка проекта держится в строжайшем секрете — известно лишь, что в 2024-м её перезапустили с нуля. Ранее Марк Меррилл, со-основатель компании, объяснил тотальную смену концепции очень просто: ему и его коллегам не нужен стандартный представитель жанра, они жаждут реализовать весь потенциал мира Рунтерра. Увы, это означает, что у амбициозного тайтла нет даже приблизительной даты релиза: Меррилл надеется, что им удастся выпустить MMO до колонизации Марса. Источник: massivelyop.com Автор: Валентин Карузов Смена команды. Продюсер World of Warcraft теперь работает над MMO по League of Legends 22.01.26 6 В игровой индустрии (западной её части во всяком случае) мало кто засиживается в одной компании надолго. Из-за этого порой возникают любопытные ситуации, когда сотрудников одной студии в какой-то момент нанимают её конкуренты. 22.01.26 6 В игровой индустрии (западной её части во всяком случае) мало кто засиживается в одной компании надолго. Из-за этого порой возникают любопытные ситуации, когда сотрудников одной студии в какой-то момент нанимают её конкуренты. 22.01.26 6 6 Источник: massivelyop.com Автор: Валентин Карузов Источник: massivelyop.com Автор: Валентин Карузов Источник: massivelyop.com Автор: Валентин Карузов
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452025_eto_uspekh_vsego_za_nedelyu_gejmery_zagruzili_svyshe_5_mln_modov_dlya_hytale_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452025_eto_uspekh_vsego_za_nedelyu_gejmery_zagruzili_svyshe_5_mln_modov_dlya_hytale_.txt
@@ -0,0 +1,2 @@
+Это успех. Всего за неделю геймеры загрузили свыше 5 млн модов для Hytale 22.01.26 12 В прошлом году Riot Games отменила Hytale, которую затем выкупил её оригинальный создатель — и выпустил на рынок в имеющемся виде. Игру ждал бешеный успех. Поскольку Hytale не представлена в Steam, измерить её показатели онлайна невозможно — однако о высоком интересе публики говорят различные косвенные свидетельства. К примеру, всего за неделю с момента выпуска игры в раннем доступе пользователи создали для неё свыше двух тысяч модификаций и суммарно скачали их свыше пяти миллионов раз. Публика конкурента Minecraft действительно измеряется миллионами — вот настолько популярным оказался проект, пускай это пока не сопоставимо с аудиторией самой популярной блочной песочницы. Среди всего обилия модификаций сильнее прочих оказались востребованы изменения в интерфейсе, различные ассистенты, новые предметы оружия и мебели, расширители максимального стека предметов, увеличители добычи руды и улучшатели карты. Изучить список модификаций можно по этой ссылке. Источник: curseforge.com Автор: Александр Козьяков Это успех. Всего за неделю геймеры загрузили свыше 5 млн модов для Hytale 22.01.26 12 В прошлом году Riot Games отменила Hytale, которую затем выкупил её оригинальный создатель — и выпустил на рынок в имеющемся виде. Игру ждал бешеный успех. 22.01.26 12 В прошлом году Riot Games отменила Hytale, которую затем выкупил её оригинальный создатель — и выпустил на рынок в имеющемся виде. Игру ждал бешеный успех. 22.01.26 12 12 Источник: curseforge.com Автор: Александр Козьяков Источник: curseforge.com Автор: Александр Козьяков Источник: curseforge.com Автор: Александр Козьяков
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452026_pokhozhe_marathon_ne_svetit_sudba_concord_gejmery_skupayut_predzakazy_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452026_pokhozhe_marathon_ne_svetit_sudba_concord_gejmery_skupayut_predzakazy_.txt
@@ -0,0 +1,2 @@
+Похоже, Marathon не светит судьба Concord. Геймеры скупают предзаказы 22.01.26 13 При анонсе Marathon вместо праздника студия Bungie получила много головной боли: геймеры раскритиковали проект и вынудили разработчиков отложить его релиз. Кажется, дополнительное время они провели с пользой. Недавно Bungie открыла платные предзаказы Marathon — и уже на этом основании можно утверждать, что игре не светит судьба провальной Concord. В мировом чарте лидеров продаж Steam проект дебютировал в топ-10 (правда, парой дней спустя оказался на 11 позиции среди платных игр). Волна ненависти к проекту тем временем сходит на нет: трейлер предзаказа собрал 13 тысяч лайков и 2,7 тысячи дизлайков. Для сравнения, в ролике с первой демонстрацией геймплея реакции разделились ровно пополам, а секция комментариев была заполнена негативными замечаниями. Теперь же игроки в основном выражают желание поскорее поиграть в Marathon. Кажется, создателям видеоигр стоит почаще прислушиваться к мнению сообщества — такое отношение окупает себя с лихвой. Источник: youtube.com Автор: Александр Козьяков Похоже, Marathon не светит судьба Concord. Геймеры скупают предзаказы 22.01.26 13 При анонсе Marathon вместо праздника студия Bungie получила много головной боли: геймеры раскритиковали проект и вынудили разработчиков отложить его релиз. Кажется, дополнительное время они провели с пользой. 22.01.26 13 При анонсе Marathon вместо праздника студия Bungie получила много головной боли: геймеры раскритиковали проект и вынудили разработчиков отложить его релиз. Кажется, дополнительное время они провели с пользой. 22.01.26 13 13 Источник: youtube.com Автор: Александр Козьяков Источник: youtube.com Автор: Александр Козьяков Источник: youtube.com Автор: Александр Козьяков
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452044_portativka_gpd_win_5_poluchit_ofitsialnuyu_podderzhku_os_bazzite_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452044_portativka_gpd_win_5_poluchit_ofitsialnuyu_podderzhku_os_bazzite_.txt
@@ -0,0 +1,2 @@
+«Портативка» GPD Win 5 получит официальную поддержку ОС Bazzite 22.01.26 21 Представитель бренда игровых консолей GPD заявил о начале работ по адаптации ОС Bazzite под фирменную «портативку» Win 5. Смена операционной системы, как ожидается, повысит игровую производительность устройства по сравнению с версией на Windows. Bazzite — это дистрибутив на ядре Linux, оптимизированный для запуска игр. Он включает в себя клиенты Steam и других геймерских платформ, поддерживает HDR и VRR, а также содержит ряд инструментов и системных настроек, призванных снизить фоновую нагрузку на железо. О готовящемся нововведении компания сообщила на Reddit, предложив всем желающим принять участие в тестировании. Судя по первым отзывам, у текущей модификации ОС есть проблемы с качеством звука, работой некоторых кнопок и выходом из спящего режима. Разработчики обещают оперативно «чинить» дистрибутив, выпуская соответствующие патчи. Сроки релиза стабильной версии операционной системы, полностью «заточенной» под консоль, пока не объявлены. Напомним, что GPD Win 5 изначально работает под управлением Windows 11. Консоль оснащена 7-дюймовым IPS-дисплеем с разрешением 1080p и частотой обновления 120 Гц. За производительность отвечает процессор AMD Ryzen Al Max 385 или Max+ 395 со «встройкой» AMD Radeon 8050S или 8060S соответственно. Источник: videocardz.com Автор: Шамиль Алиуллов # GPD GPD Win 5 «Портативка» GPD Win 5 получит официальную поддержку ОС Bazzite 22.01.26 21 Представитель бренда игровых консолей GPD заявил о начале работ по адаптации ОС Bazzite под фирменную «портативку» Win 5. Смена операционной системы, как ожидается, повысит игровую производительность устройства по сравнению с версией на Windows. 22.01.26 21 Представитель бренда игровых консолей GPD заявил о начале работ по адаптации ОС Bazzite под фирменную «портативку» Win 5. Смена операционной системы, как ожидается, повысит игровую производительность устройства по сравнению с версией на Windows. 
22.01.26 21 21 Источник: videocardz.com Автор: Шамиль Алиуллов # GPD GPD Win 5 Источник: videocardz.com Автор: Шамиль Алиуллов Источник: videocardz.com Автор: Шамиль Алиуллов # GPD GPD Win 5
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452045_dizajn_modulnoj_kamery_insta360_pocket_raskryt_do_anonsa_foto_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452045_dizajn_modulnoj_kamery_insta360_pocket_raskryt_do_anonsa_foto_.txt
@@ -0,0 +1,2 @@
+Дизайн модульной камеры Insta360 Pocket раскрыт до анонса [ФОТО] 22.01.26 5 Insta360 и DJI соперничают на рынке экшн-камер, и, судя по свежему «сливу», первая хочет перенести конкуренцию и в сферу камер для блогинга, представив альтернативу модели DJI OSMO Pocket 3 и ещё не выпущенной Pocket 4. Одной из особенностей будущей новинки должна стать модульная конструкция. 1 / 2 По данным профильных СМИ, гаджет под названием Insta360 Pocket будет состоять из нескольких раздельных блоков: Модуль камеры со сменными объективами Блок управления с экраном и набором кнопок Модуль питания (предположительно со сменной АКБ) Соединительный модуль Судя по опубликованным снимкам и рендерам, все блоки будут быстросъёмными. На инсайдерских фотографиях устройство также показано в частично разобранном виде. Подробные характеристики, дата анонса и возможная цена Insta360 Pocket пока неизвестны. Источник: notebookcheck.net Автор: Шамиль Алиуллов # Insta360 Insta360 Pocket Дизайн модульной камеры Insta360 Pocket раскрыт до анонса [ФОТО] 22.01.26 5 Insta360 и DJI соперничают на рынке экшн-камер, и, судя по свежему «сливу», первая хочет перенести конкуренцию и в сферу камер для блогинга, представив альтернативу модели DJI OSMO Pocket 3 и ещё не выпущенной Pocket 4. Одной из особенностей будущей новинки должна стать модульная конструкция. 22.01.26 5 Insta360 и DJI соперничают на рынке экшн-камер, и, судя по свежему «сливу», первая хочет перенести конкуренцию и в сферу камер для блогинга, представив альтернативу модели DJI OSMO Pocket 3 и ещё не выпущенной Pocket 4. Одной из особенностей будущей новинки должна стать модульная конструкция. 22.01.26 5 5 1 / 2 1 / 2 1 / 2         Источник: notebookcheck.net Автор: Шамиль Алиуллов # Insta360 Insta360 Pocket Источник: notebookcheck.net Автор: Шамиль Алиуллов Источник: notebookcheck.net Автор: Шамиль Алиуллов # Insta360 Insta360 Pocket
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452051_asus_predstavila_ochen_dorogoj_8k_monitor_proart_pa32kcx_dlya_professionalov_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452051_asus_predstavila_ochen_dorogoj_8k_monitor_proart_pa32kcx_dlya_professionalov_.txt
@@ -0,0 +1,2 @@
+ASUS представила очень дорогой 8K-монитор ProArt PA32KCX для профессионалов 22.01.26 21 Компания ASUS привезла на европейский рынок профессиональный монитор ProArt PA32KCX. Новинка, ориентированная на специалистов, работающих с графикой и видео, сочетает высокое разрешение, точную цветопередачу и хорошую яркость подсветки, а также оснащается встроенным колориметром для точной калибровки. 1 / 3 32-дюймовый дисплей монитора с разрешением 7680x4320 пикселей (275 ppi) имеет 4032 зоны локального затемнения. Эта особенность, по заверению бренда значительно уменьшает визуальные артефакты и эффект ореола вокруг мелких элементов на экране. Время отклика панели составляет 5 мс (GtG). Монитор поставляется с заводской калибровкой (Delta E < 1), а встроенный колориметр позволяет настроить цветопередачу без долгих танцев с бубном и стороннего ПО. Из коробки заявлен охват 100% цветовой гаммы sRGB, 97% P3 и 95% Adobe RGB. 1 / 4 Новинка комплектуется блендой, снижающей количество бликов на экране. Встроенные датчики также мониторят окружающее освещение и присутствие пользователя, автоматически регулируя яркость подсветки (до 1200 кд/м² в режиме HDR). Набор интерфейсов модули включает разъём DisplayPort 2.1, два HDMI 2.1, Thunderbolt 4, пару USB Type-C и три USB Type-A. Эргономичная подставка позволяет менять угол наклона, вращать экрана и регулировать высоту. ASUS ProArt PA32KCX поступил в продажу на рынке Европы по цене €9199. Источник: computerbase.de Автор: Виктория Анисимова # ASUS ASUS представила очень дорогой 8K-монитор ProArt PA32KCX для профессионалов 22.01.26 21 Компания ASUS привезла на европейский рынок профессиональный монитор ProArt PA32KCX. Новинка, ориентированная на специалистов, работающих с графикой и видео, сочетает высокое разрешение, точную цветопередачу и хорошую яркость подсветки, а также оснащается встроенным колориметром для точной калибровки. 22.01.26 21 Компания ASUS привезла на европейский рынок профессиональный монитор ProArt PA32KCX. 
Новинка, ориентированная на специалистов, работающих с графикой и видео, сочетает высокое разрешение, точную цветопередачу и хорошую яркость подсветки, а также оснащается встроенным колориметром для точной калибровки. 22.01.26 21 21 1 / 3 1 / 3 1 / 3           1 / 4 1 / 4 1 / 4             Источник: computerbase.de Автор: Виктория Анисимова # ASUS Источник: computerbase.de Автор: Виктория Анисимова Источник: computerbase.de Автор: Виктория Анисимова # ASUS
+==============
--- a/tests/summarization/https___4pda.to_2026_01_22_452052_smi_igrovaya_proizvoditelnost_linux_i_windows_pochti_ravna_no_ne_dlya_vsekh_.txt
+++ b/tests/summarization/https___4pda.to_2026_01_22_452052_smi_igrovaya_proizvoditelnost_linux_i_windows_pochti_ravna_no_ne_dlya_vsekh_.txt
@@ -0,0 +1,2 @@
+СМИ: игровая производительность Linux и Windows почти равна. Но не для всех 22.01.26 87 Портал PC Games Hardware сравнил быстродействие игр в Windows 11 и Linux (через Proton) в конфигурациях с 10 различными видеокартами AMD и NVIDIA. Результаты оказались любопытными: хотя в среднем операционка Microsoft была быстрее, на определённых конфигурациях она уступала альтернативной ОС. Журналисты протестировали 10 игр на 10 видеокартах, поочерёдно устанавливаемых в системный блок на базе Ryzen 79800X3D с 48 ГБ оперативной памяти DDR5-6000. В качестве альтернативы «Окнам» была выбрана CachyOS на основе Arch Linux. Она, как выяснилось, практически не отставала от соперницы на конфигурациях с видеокартами AMD, а вот «зелёные» модели были заметно быстрее именно при работе с Windows. К примеру, игра Anno 117: Pax Romana оказалась на 1–5% быстрее на Linux с видеокартами RX 7800 XT и серией RX 9000, тогда как на железе NVIDIA в этом сценарии наблюдалась просадка быстродействия до 15%. Примерно ту же картину продемонстрировали A Plague Tale: Requiem и The Outer Worlds 2. Kingdom Come: Deliverance II сложилась любопытная ситуация. На платформе Linux с видеокартами NVIDIA RTX 5080, 5070 Ti и 5070 она ускорилась на 1–2% по сравнению с Windows. И наоборот, модели серии Radeon RX 9000 отстали примерно на 7% при запуске под Linux. В большинстве оставшихся игр, включая Baldur's Gate 3 и Clair Obscur: Expedition 33, «Радеоны» теряли около 5% игровой производительности при переходе на Linux, а видеокарты NVIDIA — до 20% в зависимости от модели. По заключению издания, ситуация может измениться, когда Valve выпустит Steam Machine и начнёт оптимизировать собственный дистрибутив для ПК. Сейчас Proton может запускать около 90% Windows-игр на Linux. Основными проблемами по-прежнему остаются Secure Boot и античит-системы на уровне ядра. Источник: techspot.com Автор: Шамиль Алиуллов # Linux Microsoft NVIDIA AMD Microsoft Windows 11 СМИ: игровая производительность Linux и Windows почти равна. 
Но не для всех 22.01.26 87 Портал PC Games Hardware сравнил быстродействие игр в Windows 11 и Linux (через Proton) в конфигурациях с 10 различными видеокартами AMD и NVIDIA. Результаты оказались любопытными: хотя в среднем операционка Microsoft была быстрее, на определённых конфигурациях она уступала альтернативной ОС. 22.01.26 87 Портал PC Games Hardware сравнил быстродействие игр в Windows 11 и Linux (через Proton) в конфигурациях с 10 различными видеокартами AMD и NVIDIA. Результаты оказались любопытными: хотя в среднем операционка Microsoft была быстрее, на определённых конфигурациях она уступала альтернативной ОС. 22.01.26 87 87 Источник: techspot.com Автор: Шамиль Алиуллов # Linux Microsoft NVIDIA AMD Microsoft Windows 11 Источник: techspot.com Автор: Шамиль Алиуллов Источник: techspot.com Автор: Шамиль Алиуллов # Linux Microsoft NVIDIA AMD Microsoft Windows 11
+==============
--- a/tests/summarization/https___dzone.com_articles_automated-inventory-pattern-for-managing-aws-ec2.txt
+++ b/tests/summarization/https___dzone.com_articles_automated-inventory-pattern-for-managing-aws-ec2.txt
@@ -0,0 +1,94 @@
+In the hybrid cloud era, managing infrastructure visibility is a constant battle. We spin up EC2 instances for testing, leave them running, and forget about them. Security groups become bloated, and cost management turns into a guessing game. While high-end tools like Datadog or CloudHealth offer solutions, they often come with significant licensing costs and integration overhead. Sometimes, you just need a lightweight, customizable way to see exactly what is running in your environment. Based on a case study involving hybrid infrastructure management, this article outlines a low-cost automation architecture to retrieve, visualize, and analyze EC2 parameters. While the original implementation relied on legacy Excel VBA, we have modernized the stack to use Python. By combining Boto3 (the AWS SDK) and Pandas, you can build a self-updating inventory system that reduces audit time by 98%. The Problem: The Cloud “Black Box” When you manage hundreds of instances across multiple regions, three critical issues arise: Over-Provisioning: Resources are sized for peak load but run idle 90% of the time. Zombie Resources: Development environments are abandoned but left running. Security Drift: Who opened port 22 on the database server? When was the last OS patch applied? Manual audits are impossible at scale. You need an automated snapshot of your infrastructure’s health. The Architecture: A Python Automation Pipeline We replace the fragile CSV-to-VBA workflow with a robust Python script. This enables better error handling, type safety, and easier scheduling via Cron or Jenkins. The Workflow: Data Extraction: Python (boto3) queries the AWS API across all target regions. Data Processing: Python (pandas) flattens the JSON response into a structured DataFrame and filters for anomalies. Visualization: Python (openpyxl / xlsxwriter) exports a formatted Excel dashboard for management reporting. 
Step 1: The “VBA Killer” Python Script In legacy workflows, engineers often used VBA to parse CSVs line by line to avoid Excel crashing on large datasets. Python’s Pandas library handles this natively using vectorized operations, processing hundreds of thousands of rows in milliseconds. Below is the complete script to fetch EC2 data and generate a formatted report. Python import boto3
+import pandas as pd
+from datetime import datetime
+
+def get_ec2_inventory(regions):
+    inventory_list = []
+    
+    for region in regions:
+        print(f"Scanning region: {region}...")
+        ec2 = boto3.client('ec2', region_name=region)
+        
+        # Paginator handles API limits automatically
+        paginator = ec2.get_paginator('describe_instances')
+        
+        for page in paginator.paginate():
+            for reservation in page['Reservations']:
+                for instance in reservation['Instances']:
+                    # Extract Tags safely
+                    tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}
+                    
+                    # Build the record
+                    record = {
+                        'Region': region,
+                        'InstanceId': instance['InstanceId'],
+                        'Name': tags.get('Name', 'N/A'),
+                        'Type': instance['InstanceType'],
+                        'State': instance['State']['Name'],
+                        'PublicIP': instance.get('PublicIpAddress', 'N/A'),
+                        'PrivateIP': instance.get('PrivateIpAddress', 'N/A'),
+                        'LaunchTime': instance['LaunchTime'].replace(tzinfo=None), # Fix TZ for Excel
+                        'CostCenter': tags.get('CostCenter', 'Unknown')
+                    }
+                    inventory_list.append(record)
+                    
+    return pd.DataFrame(inventory_list)
+
+def generate_excel_report(df, filename):
+    """
+    Replaces VBA formatting logic. 
+    Writes data to Excel and adds Conditional Formatting.
+    """
+    with pd.ExcelWriter(filename, engine='xlsxwriter') as writer:
+        # Write raw data
+        df.to_excel(writer, sheet_name='EC2_Inventory', index=False)
+        
+        workbook = writer.book
+        worksheet = writer.sheets['EC2_Inventory']
+        
+        # Format 1: Header styling
+        header_fmt = workbook.add_format({'bold': True, 'bg_color': '#4F81BD', 'font_color': 'white'})
+        for col_num, value in enumerate(df.columns.values):
+            worksheet.write(0, col_num, value, header_fmt)
+            
+        # Format 2: Highlight "Stopped" instances in Red
+        red_fmt = workbook.add_format({'bg_color': '#FFC7CE', 'font_color': '#9C0006'})
+        
+        # Apply conditional formatting to the 'State' column (Column E)
+        row_count = len(df) + 1
+        worksheet.conditional_format(f'E2:E{row_count}', {
+            'type': 'text',
+            'criteria': 'containing',
+            'value': 'stopped',
+            'format': red_fmt
+        })
+        
+        # Set a fixed width of 20 for columns A through I (set_column does not auto-fit to content)
+        worksheet.set_column(0, 8, 20)
+
+    print(f"Report generated: {filename}")
+
+if __name__ == "__main__":
+    # Define scope
+    target_regions = ['us-east-1', 'us-west-2']
+    
+    # 1. Fetch
+    df_instances = get_ec2_inventory(target_regions)
+    
+    # 2. Analyze (Simple Pandas Logic)
+    print(f"Total Instances Found: {len(df_instances)}")
+    print(df_instances['State'].value_counts())
+    
+    # 3. Report
+    timestamp = datetime.now().strftime("%Y%m%d")
+    generate_excel_report(df_instances, f"aws_inventory_{timestamp}.xlsx") Step 2: Why Python Beats VBA for Ops The shift from Excel VBA to Python provides three architectural advantages: Maintainability: VBA is locked inside a .xlsm binary file. Python scripts are plain text, version-controlled in Git, and easily peer-reviewed. API integration: VBA requires complex HTTP requests or external shell calls to interact with AWS. Python uses boto3, a native and well-maintained SDK. Speed: The VBA approach in the original study relied on memory arrays to speed up cell writing. Pandas abstracts this entirely, writing binary Excel files directly from memory without the overhead of the Excel GUI. Step 3: Automated Analysis Once the data is in a DataFrame, you can run logic checks before a human ever sees the report. Example: Detecting Zombie Instances Python # Identify instances running for > 30 days in 'Dev' environment
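+# LaunchTime was stored as naive UTC, so compare it against a naive UTC "now"
+now_utc = pd.Timestamp.now(tz='UTC').tz_localize(None)
+zombies = df_instances[
+    (df_instances['State'] == 'running') &
+    (df_instances['CostCenter'] == 'Dev') &
+    (df_instances['LaunchTime'] < now_utc - pd.Timedelta(days=30))
+]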
+
+if not zombies.empty:
+    print(f"WARNING: {len(zombies)} potential zombie instances detected.")
+    # Optional: Send Slack alert Results: The Impact of Automation Implementing this automated inventory pattern yielded significant operational improvements: Cost reduction: Identified and removed unused storage volumes and zombie instances, saving thousands in monthly spend. Time savings: Reduced the monthly inventory audit from 288 hours (manual) to zero hours (fully automated). Data freshness: Moved from a monthly manual snapshot to a daily automated feed, allowing operations teams to react to security risks in near real time. Conclusion You don’t always need a SaaS subscription to solve cloud management problems. By chaining together standard administrative tools — Boto3, Pandas, and Excel — you can build a robust, no-cost inventory system that fits your exact needs. Next Steps: Clone the script above. Schedule it to run every Monday morning via GitHub Actions or Jenkins. Email the report automatically to your FinOps team. The best observability tool is the one you actually look at.
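The zombie rule from Step 3 can also be checked against synthetic data before it is pointed at a real account. This is a sketch in plain Python rather than pandas, assuming each record carries the State, CostCenter, and LaunchTime fields used in the article; the function name find_zombies, the fixed reference date, and the sample records are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_zombies(records, now, max_age_days=30, cost_center="Dev"):
    """Return running instances in one cost center launched before the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        r for r in records
        if r["State"] == "running"
        and r["CostCenter"] == cost_center
        and r["LaunchTime"] < cutoff
    ]


# A fixed reference time keeps the example deterministic
now = datetime(2026, 1, 23, tzinfo=timezone.utc)

# Hypothetical inventory rows
records = [
    {"InstanceId": "i-old-dev", "State": "running", "CostCenter": "Dev",
     "LaunchTime": now - timedelta(days=45)},
    {"InstanceId": "i-new-dev", "State": "running", "CostCenter": "Dev",
     "LaunchTime": now - timedelta(days=3)},
    {"InstanceId": "i-old-prod", "State": "running", "CostCenter": "Prod",
     "LaunchTime": now - timedelta(days=400)},
    {"InstanceId": "i-stopped-dev", "State": "stopped", "CostCenter": "Dev",
     "LaunchTime": now - timedelta(days=90)},
]

zombies = find_zombies(records, now)
for z in zombies:
    print(f"Potential zombie: {z['InstanceId']}")
```

Passing the current time in as an argument, instead of reading the clock inside the function, keeps the check reproducible and avoids mixing local time with the UTC launch timestamps.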
+==============
--- a/tests/summarization/https___dzone.com_articles_build-ai-tools-go-mcp-sdk-databases.txt
+++ b/tests/summarization/https___dzone.com_articles_build-ai-tools-go-mcp-sdk-databases.txt
@@ -0,0 +1,257 @@
+The Model Context Protocol (MCP) has established itself as the ubiquitous standard for connecting AI applications to external systems. Since its release, there have been implementations across various programming languages and frameworks, enabling developers to build solutions that expose data sources, tools, and workflows to AI applications. For Go developers, however, the journey to an official MCP SDK took longer (compared to other SDKs like Python and TypeScript). Discussions and design/implementation work on the official Go implementation began during early to mid 2025. At the time of writing (January 2026), it stands at version 1.2.0. As a Gopher, I'm excited (and relieved!) to finally have a stable, official MCP Go SDK that the Go community can rely on. To explore its capabilities, I built an MCP server for Azure Cosmos DB. This blog post will dive into the MCP Go SDK fundamentals by walking through its specifics and exploring concepts like tools and servers. By the end, you'll understand how to use the MCP Go SDK to build your own MCP servers, with Azure Cosmos DB serving as a practical example. Note: This project is not intended to replace the Azure MCP Server or Azure Cosmos DB MCP Toolkit. Rather, it serves as an experimental learning tool that demonstrates how to combine the Azure and MCP Go SDKs to build AI tooling for Azure Cosmos DB. MCP Basics Let's briefly cover what MCP is and how the MCP Go SDK works. What Is the Model Context Protocol? The Model Context Protocol (MCP) is an open-source standard for connecting AI applications to external systems. It's often referred to as a USB-C port for AI applications — just as USB-C provides a standardized way to connect devices, MCP provides a standardized way to connect AI applications to data sources, tools, and workflows. 
With MCP, AI applications (ranging from IDEs like VS Code, CLI coding tools like GitHub Copilot or apps like Claude web/desktop) can: Access data sources (local files, databases, APIs) Use tools (search engines, calculators, external services) Execute workflows (specialized prompts, multi-step operations) This standardization means developers can build MCP servers once and have them work with any MCP-compatible AI application, rather than creating custom integrations for each platform. MCP Go SDK The official Go MCP SDK provides the building blocks to create MCP servers and clients in Go. Here's a minimal example of an MCP server with a simple tool: Go package main
+
+import (
+    "context"
+    "log"
+    "strings"
+
+    "github.com/modelcontextprotocol/go-sdk/mcp"
+)
+
+type ReverseInput struct {
+    Text string `json:"text" jsonschema:"the text to reverse"`
+}
+
+type ReverseOutput struct {
+    Reversed string `json:"reversed" jsonschema:"the reversed text"`
+}
+
+func ReverseText(ctx context.Context, req *mcp.CallToolRequest, input ReverseInput) (
+    *mcp.CallToolResult,
+    ReverseOutput,
+    error,
+) {
+    runes := []rune(input.Text)
+    for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
+        runes[i], runes[j] = runes[j], runes[i]
+    }
+    return nil, ReverseOutput{Reversed: string(runes)}, nil
+}
+
+func main() {
+    // Create server
+    server := mcp.NewServer(&mcp.Implementation{
+        Name:    "text-tools",
+        Version: "v1.0.0",
+    }, nil)
+
+    // Add a tool
+    mcp.AddTool(server, &mcp.Tool{
+        Name:        "reverse",
+        Description: "reverses the input text",
+    }, ReverseText)
+
+    // Run over stdio
+    if err := server.Run(context.Background(), &mcp.StdioTransport{}); err != nil {
+        log.Fatal(err)
+    }
+}
+This example demonstrates the key concepts: Tool definition: An mcp.Tool with a name and description. Input/output types: Structs with JSON schema tags that define the tool's interface. Handler function: The actual logic that executes when the tool is called. Server: Created with mcp.NewServer() and configured with tools. Transport: How the server communicates (here using stdio). These concepts are covered in more detail later in this post. MCP Server in Action ▶️ To get a sense of what the server can do, take a look at this short demo of using the MCP server with Agent Mode in Visual Studio Code: This server exposes several tools that enable AI applications to interact with Azure Cosmos DB: list_databases – List all databases in a Cosmos DB account list_containers – List all containers in a specific database read_item – Read a specific item using its ID and partition key execute_query – Execute SQL queries against containers create_container – Create new containers with partition keys add_item_to_container – Add items to containers read_container_metadata – Retrieve container configuration details If you want to set up and configure the server, check out the GitHub repository. Alright, let's dive into how it's built. Understanding the Implementation Tools are the building blocks of an MCP server. Each tool represents a specific operation that the server can perform. Let's use the read_item tool as an example to understand the fundamental concepts of the MCP Go SDK and how it integrates with the Azure Cosmos DB Go SDK. MCP Tools: Definition, Handler, and Execution Flow An MCP tool consists of these components: Tool Definition The tool definition describes the tool to the AI application. Here's how we define the read_item tool: Go func ReadItem() *mcp.Tool {
+    return &mcp.Tool{
+        Name:        "read_item",
+        Description: "Read a specific item from a container in an Azure Cosmos DB database using the item ID and partition key",
+    }
+}
+The Tool struct contains: Name: A unique identifier for the tool. Description: Helps the AI understand when to use this tool. The SDK can automatically infer input and output schemas from your handler function's types, which we'll see next. Input and Output Types Type-safe input and output structures define the tool's interface: Go type ReadItemToolInput struct {
+    Account      string `json:"account" jsonschema:"Azure Cosmos DB account name"`
+    Database     string `json:"database" jsonschema:"Name of the database"`
+    Container    string `json:"container" jsonschema:"Name of the container to read data from"`
+    ItemID       string `json:"itemID" jsonschema:"ID of the item to read"`
+    PartitionKey string `json:"partitionKey" jsonschema:"Partition key value of the item"`
+}
+
+type ReadItemToolResult struct {
+    Item string `json:"item" jsonschema:"The item data as JSON string"`
+} The SDK uses these types to automatically generate JSON schemas and handle validation. JSON tags define how fields are serialized, and jsonschema tags provide descriptions that help AI applications understand what each field represents. Tool Handler The handler is where the actual work happens. The MCP Go SDK provides a generic AddTool function that can bind tools to functions with this signature: Go func(ctx context.Context, request *CallToolRequest, input InputType) (result *CallToolResult, output OutputType, err error) Here's the read_item handler: Go func ReadItemToolHandler(ctx context.Context, _ *mcp.CallToolRequest, input ReadItemToolInput) (*mcp.CallToolResult, ReadItemToolResult, error) {
+    // 1. Validate inputs
+    if input.Account == "" {
+        return nil, ReadItemToolResult{}, errors.New("cosmos db account name missing")
+    }
+    if input.Database == "" {
+        return nil, ReadItemToolResult{}, errors.New("database name missing")
+    }
+    // ... more validation
+
+    // 2. Get Cosmos DB client
+    client, err := GetCosmosClientFunc(input.Account)
+    if err != nil {
+        return nil, ReadItemToolResult{}, err
+    }
+
+    // 3. Navigate to the container
+    databaseClient, err := client.NewDatabase(input.Database)
+    if err != nil {
+        return nil, ReadItemToolResult{}, fmt.Errorf("error creating database client: %v", err)
+    }
+
+    containerClient, err := databaseClient.NewContainer(input.Container)
+    if err != nil {
+        return nil, ReadItemToolResult{}, fmt.Errorf("error creating container client: %v", err)
+    }
+
+    // 4. Read the item using Cosmos DB SDK
+    partitionKey := azcosmos.NewPartitionKeyString(input.PartitionKey)
+    itemResponse, err := containerClient.ReadItem(ctx, partitionKey, input.ItemID, nil)
+    if err != nil {
+        return nil, ReadItemToolResult{}, fmt.Errorf("error reading item: %v", err)
+    }
+
+    // 5. Return the result
+    return nil, ReadItemToolResult{Item: string(itemResponse.Value)}, nil
+} The handler handles (pun intended!) several things: Validates input parameters Interacts with Azure Cosmos DB Returns structured output Notice we return nil for the *mcp.CallToolResult. The SDK automatically handles the response marshaling for us. If we return an error, the SDK sets IsError: true in the result automatically. Authenticating With Azure Cosmos DB The MCP server uses NewDefaultAzureCredential from the Azure Identity SDK, which automatically handles multiple authentication methods, such as Azure CLI credentials (for local development), Managed Identity (for production), environment variables, and more: Go func GetCosmosDBClient(accountName string) (*azcosmos.Client, error) {
+    endpoint := fmt.Sprintf("https://%s.documents.azure.com:443/", accountName)
+
+    cred, err := azidentity.NewDefaultAzureCredential(nil)
+    if err != nil {
+        return nil, fmt.Errorf("error creating credential: %v", err)
+    }
+
+    client, err := azcosmos.NewClient(endpoint, cred, nil)
+    if err != nil {
+        return nil, fmt.Errorf("error creating Cosmos client: %v", err)
+    }
+
+    return client, nil
+} Once we have the client, we use the standard Azure Cosmos DB SDK patterns: client.NewDatabase() to get a database client databaseClient.NewContainer() to get a container client containerClient.ReadItem() to perform the actual read operation MCP Server: Bringing Tools Together The beauty here is that MCP provides the standardized interface for AI interactions, while the Azure Cosmos DB SDK handles all the database operations — the handler acts as the bridge between these two worlds. Now that we understand individual tools, let's see how they're organized within an MCP server. An MCP server exposes specific capabilities (tools, resources, prompts) to AI applications through the standardized MCP protocol. Creating the Server Here's how we create and configure the MCP server in main.go: Go func main() {
+    // Create the server with metadata
+    server := mcp.NewServer(&mcp.Implementation{
+        Name:       "mcp_azure_cosmosdb_go",
+        Title:      "Go based MCP server for Azure Cosmos DB using the Azure SDK for Go and the MCP Go SDK",
+        Version:    "0.0.1",
+        WebsiteURL: "https://github.com/abhirockzz/mcp_cosmosdb_go",
+    }, nil)
+
+    // Register all tools with their handlers
+    mcp.AddTool(server, tools.ListDatabases(), tools.ListDatabasesToolHandler)
+    mcp.AddTool(server, tools.ListContainers(), tools.ListContainersToolHandler)
+    mcp.AddTool(server, tools.ReadContainerMetadata(), tools.ReadContainerMetadataToolHandler)
+    mcp.AddTool(server, tools.CreateContainer(), tools.CreateContainerToolHandler)
+    mcp.AddTool(server, tools.AddItemToContainer(), tools.AddItemToContainerToolHandler)
+    mcp.AddTool(server, tools.ReadItem(), tools.ReadItemToolHandler)
+    mcp.AddTool(server, tools.ExecuteQuery(), tools.ExecuteQueryToolHandler)
+
+    // ... transport setup (covered next)
+} Breaking this down: mcp.NewServer() creates a new server instance with: Implementation metadata: Name, title, and version that identify the server ServerOptions: Additional configuration (we use nil for defaults) mcp.AddTool() registers each tool with the server: Takes the server instance The tool definition (from functions like tools.ReadItem()) The handler function (like tools.ReadItemToolHandler) When the server connects to a client, it automatically advertises the tools capability, making all registered tools discoverable by the AI application. Transport: Connecting Server to Client A transport defines how the server communicates with clients. It's the communication channel that carries JSON-RPC messages between the server and client. The MCP Go SDK supports multiple transport types. HTTP Streamable Transport The server also supports http transport, which is ideal for web-based AI applications. Here's how we set it up: Go // Create the streamable HTTP handler
+handler := mcp.NewStreamableHTTPHandler(func(req *http.Request) *mcp.Server {
+    return server
+}, nil)
+
+// Start the HTTP server
+if err := http.ListenAndServe(":9090", handler); err != nil {
+    log.Fatalf("Server failed: %v", err)
+}
+NewStreamableHTTPHandler creates an HTTP handler that accepts incoming HTTP requests from MCP clients and returns the appropriate server instance for each request. It handles the streamable transport protocol automatically and supports multiple concurrent client sessions. This transport is ideal when you want to support web-based AI applications and the server needs to be accessible over HTTP/HTTPS, with multiple clients connected simultaneously. Stdio Transport Another common MCP transport is stdio, used when the server runs as a subprocess: Go err := server.Run(context.Background(), &mcp.StdioTransport{})
+if err != nil {
+    log.Fatal(err)
+} With the stdio transport, the server runs as a subprocess started by the client and communicates via standard input/output streams. It's perfect for local MCP clients like GitHub Copilot CLI, Claude Code (or Desktop), etc. Both transports implement the same MCP protocol, so the server's tools work identically regardless of which transport you choose. The difference is purely in how the server connects to and communicates with clients. With the server created, tools registered, and transport configured, the MCP server is ready to accept connections from AI applications and execute operations against Azure Cosmos DB. Testing the MCP Server Testing involves verifying functionality at different layers of the stack. This server uses integration tests at two levels: tests that verify the MCP protocol aspects, and tests that focus on handler logic with database interactions. Let's explore both approaches. Before diving into testing, let's briefly understand what an MCP client is. Understanding MCP Clients An MCP client is the component that connects to an MCP server to consume its capabilities. In the context of this MCP server: In production: The client is typically an AI application (like Claude Desktop or VS Code) that discovers and calls our tools. In testing: We create programmatic clients to verify that our server works correctly. The MCP Go SDK provides a Client type that we can use to connect to our server and call its tools, simulating how a real AI application would interact with it. Handler-Level Integration Testing With Azure Cosmos DB vNext Emulator Let's start by looking at tests that focus on handler logic and database interactions. They use the Azure Cosmos DB vNext Emulator with testcontainers-go. From tools_test.go: Go func TestListDatabases(t *testing.T) {
+    tests := []struct {
+        name           string
+        input          ListDatabasesToolInput
+        expectError    bool
+        expectedResult string
+        expectedErrMsg string
+    }{
+        {
+            name: "valid account name",
+            input: ListDatabasesToolInput{
+                Account: "dummy_account_does_not_matter",
+            },
+            expectError:    false,
+            expectedResult: testOperationDBName,
+        },
+        {
+            name: "empty account name",
+            input: ListDatabasesToolInput{
+                Account: "",
+            },
+            expectError:    true,
+            expectedErrMsg: "cosmos db account name missing",
+        },
+    }
+
+    for _, test := range tests {
+        t.Run(test.name, func(t *testing.T) {
+            _, response, err := ListDatabasesToolHandler(
+                context.Background(), 
+                nil, 
+                test.input,
+            )
+
+            if test.expectError {
+                require.Error(t, err)
+                assert.Contains(t, err.Error(), test.expectedErrMsg)
+                return
+            }
+
+            require.NoError(t, err)
+            assert.Contains(t, response.Databases, test.expectedResult)
+        })
+    }
+} These tests call handlers directly (bypassing the MCP protocol layer) and use table-driven tests to cover input validation and error handling, business logic correctness, database operations, and edge cases. The emulator container is started with testcontainers-go: Go func setupCosmosEmulator(ctx context.Context) (testcontainers.Container, error) {
+    req := testcontainers.ContainerRequest{
+        Image:        "mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview",
+        ExposedPorts: []string{"8081:8081", "8080:8080"},
+        WaitingFor:   wait.ForListeningPort(nat.Port("8080")),
+        Env: map[string]string{
+            "PROTOCOL": "http",
+        },
+    }
+
+    container, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
+        ContainerRequest: req,
+        Started:          true,
+    })
+    if err != nil {
+        return nil, err
+    }
+
+    return container, nil
+} The testcontainers-go library automatically pulls the emulator image, starts the container, and cleans up after tests complete. This is set up once in the TestMain function and shared across all tests. MCP Protocol Integration Testing Beyond handler testing, we also verify the complete MCP protocol stack, from client request through server processing to response. Here's an example from mcp_integration_test.go: Go func TestMCPIntegration_ReadItem(t *testing.T) {
+    ctx := context.Background()
+
+    // 1. Create MCP server and register the read_item tool
+    server := mcp.NewServer(&mcp.Implementation{
+        Name:    "test-cosmosdb-server",
+        Version: "0.0.1",
+    }, nil)
+
+    mcp.AddTool(server, ReadItem(), ReadItemToolHandler)
+
+    // 2. Create in-memory transports for testing
+    serverTransport, clientTransport := mcp.NewInMemoryTransports()
+
+    // 3. Connect server
+    serverSession, err := server.Connect(ctx, serverTransport, nil)
+    require.NoError(t, err)
+    defer serverSession.Close()
+
+    // 4. Create and connect client
+    client := mcp.NewClient(&mcp.Implementation{
+        Name:    "test-client",
+        Version: "0.0.1",
+    }, nil)
+
+    clientSession, err := client.Connect(ctx, clientTransport, nil)
+    require.NoError(t, err)
+    defer clientSession.Close()
+
+    // 5. Call the tool via MCP protocol
+    result, err := clientSession.CallTool(ctx, &mcp.CallToolParams{
+        Name: "read_item",
+        Arguments: map[string]any{
+            "account":      "dummy_account_does_not_matter",
+            "database":     testOperationDBName,
+            "container":    testOperationContainerName,
+            "itemID":       id,
+            "partitionKey": partitionKeyValue,
+        },
+    })
+
+    // 6. Verify the response
+    require.NoError(t, err)
+    require.False(t, result.IsError)
+    require.NotEmpty(t, result.Content)
+
+    // 7. Parse and validate the JSON response
+    textContent, ok := result.Content[0].(*mcp.TextContent)
+    require.True(t, ok)
+
+    var response ReadItemToolResult
+    err = json.Unmarshal([]byte(textContent.Text), &response)
+    require.NoError(t, err)
+
+    assert.NotEmpty(t, response.Item)
+} This test demonstrates several key concepts: In-memory transports: mcp.NewInMemoryTransports() creates a pair of connected transports without requiring actual network communication — perfect for testing Client-server connection: Both server and client connect to their respective transports, establishing a session Tool invocation: clientSession.CallTool() sends a properly formatted MCP request Response handling: The result is parsed from the MCP protocol format back to our domain types Full protocol verification: This tests the complete round trip: request serialization → tool execution → response serialization → client parsing Both handler-level and protocol-level tests use the Azure Cosmos DB vNext emulator, not mocks. Handler-level tests provide feedback on business logic, while protocol-level tests ensure MCP compliance and end-to-end functionality. Wrap Up With the MCP Go SDK, building MCP servers has become significantly more accessible for Go developers. You don't have to go for Python anymore (sorry, Pythonistas, pun intended!). This MCP server demonstrates how to combine the MCP Go SDK with domain-specific tools — in this case, the Azure Cosmos DB Go SDK. While this server provides useful functionality for interacting with Cosmos DB from AI applications, its primary purpose is educational. As mentioned before, this is a learning tool that shows how to integrate MCP with real-world services, not a replacement for solutions like the Azure MCP Server or the Azure Cosmos DB MCP Toolkit. The specific patterns we covered (defining tools, implementing handlers, managing authentication, choosing transports, and writing integration tests) apply to any MCP server you might build. The same concepts apply, whether you're exposing APIs, databases, file systems, or custom business logic. Next Steps Ready to build your own MCP server? Here are some resources to get you started: MCP Go SDK resources: Documentation, design, and examples. 
MCP Specification: https://modelcontextprotocol.io/ Azure Cosmos DB Go SDK: https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/data/azcosmos Azure Cosmos DB vNext Emulator and Testcontainers Guide: Integration Testing for Go Applications The MCP ecosystem is growing rapidly, and I am excited for Go developers who now have first-class support for participating in this evolution!
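The round trip exercised by the protocol-level test (request serialization, tool execution, response serialization, client parsing) comes down to JSON-RPC messages. The following is a language-neutral illustration written in Python, not code from the server or the SDK: it assumes the tools/call request and result shapes described in the MCP specification, and the helper names, the argument values, and the sample response are all hypothetical.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_item(response):
    """Pull the read_item payload out of a tools/call result."""
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError(result["content"][0]["text"])
    # The first content entry is text holding the tool's JSON output
    output = json.loads(result["content"][0]["text"])
    # The "item" field is itself a JSON string, so decode it a second time
    return json.loads(output["item"])


# Hypothetical argument values; the keys match the read_item input struct
request = build_tool_call(1, "read_item", {
    "account": "demo-account",
    "database": "demo-db",
    "container": "demo-container",
    "itemID": "42",
    "partitionKey": "42",
})
wire = json.dumps(request)  # what a transport would carry to the server

# Hand-written sample response, assumed for illustration only
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{
            "type": "text",
            "text": json.dumps({"item": json.dumps({"id": "42", "name": "widget"})}),
        }],
        "isError": False,
    },
}

item = extract_item(sample_response)
print(wire)
print(item)
```

The double decoding mirrors the tool's output type, where the item is returned as a JSON string inside the structured result.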
+==============
--- a/tests/summarization/https___dzone.com_articles_future-of-data-streaming-apache-flink-for-agentic-ai.txt
+++ b/tests/summarization/https___dzone.com_articles_future-of-data-streaming-apache-flink-for-agentic-ai.txt
--- a/tests/summarization/https___dzone.com_articles_java-high-availability-failures.txt
+++ b/tests/summarization/https___dzone.com_articles_java-high-availability-failures.txt
--- a/tests/summarization/https___dzone.com_articles_mcp-security-governance-opportunity.txt
+++ b/tests/summarization/https___dzone.com_articles_mcp-security-governance-opportunity.txt
--- a/tests/summarization/https___dzone.com_articles_merge-liquid-clustering-common-issues.txt
+++ b/tests/summarization/https___dzone.com_articles_merge-liquid-clustering-common-issues.txt
--- a/tests/summarization/https___dzone.com_articles_no-buffering-strategy-streaming-search-results.txt
+++ b/tests/summarization/https___dzone.com_articles_no-buffering-strategy-streaming-search-results.txt
--- a/tests/summarization/https___dzone.com_articles_rag-ai-for-ai-builders.txt
+++ b/tests/summarization/https___dzone.com_articles_rag-ai-for-ai-builders.txt
--- a/tests/summarization/https___dzone.com_articles_refactoring-react-monolith-with-autonomous-agents.txt
+++ b/tests/summarization/https___dzone.com_articles_refactoring-react-monolith-with-autonomous-agents.txt
@@ -0,0 +1,121 @@
+I've been wrangling React codebases professionally for well over ten years now, and honestly, the story is always the same in 2026: teams inherit these massive, everything-in-one-place apps built back when Create React App felt like the future. All the logic — auth, shopping cart, product lists, user profiles — lives in a handful of giant files. Props get drilled six levels deep, the state is scattered, and nobody wants to touch it because one wrong move brings the whole thing down. Last year, I led a refactor on a five-year-old dashboard exactly like that. We managed to break it into proper feature slices and even laid the groundwork for microfrontends. The thing that made the biggest difference? A multi-agent AI setup that did a lot of the heavy lifting for us. It wasn't magic — it still needed human eyes — but it turned a three-month nightmare into something we wrapped in five weeks. In this piece, I'll walk you through how I built that system. We'll take a messy little React monolith (the kind you see everywhere) and let a team of AI agents analyze it, plan the refactor, write the new modular code, add tests, and review everything. We'll use LangGraph to orchestrate the agents and Claude 3.5 Sonnet as the LLM (though GPT-4o works fine too). What You'll Need Nothing exotic: Node 20+ and your package manager of choice. Python for the agent orchestration (LangChain/LangGraph live there — it's still the most reliable option). An Anthropic API key (or OpenAI). Just export it as ANTHROPIC_API_KEY. Git and VS Code. I lean heavily on the Cursor extension these days for quick diff reviews. Grab the sample app we'll be working with — a tiny e-commerce dashboard where login, product list, and cart are all crammed into src/App.js. It's deliberately ugly, but painfully realistic. Here's the heart of the mess: JavaScript import React, { useState } from 'react';
+import './App.css';
+
+function App() {
+  const [user, setUser] = useState(null);
+  const [cart, setCart] = useState([]);
+  const [products] = useState([{ id: 1, name: 'Widget', price: 10 }]);
+
+  const login = (username, password) => {
+    if (username === 'admin') setUser({ username });
+  };
+
+  const addToCart = (product) => {
+    setCart([...cart, product]);
+  };
+
+  return (
+    <div className="App">
+      {!user ? (
+        <form onSubmit={(e) => { e.preventDefault(); login(e.target.username.value, e.target.password.value); }}>
+          <input name="username" placeholder="Username" />
+          <input name="password" type="password" />
+          <button>Login</button>
+        </form>
+      ) : (
+        <>
+          <h1>Welcome, {user.username}</h1>
+          <div>
+            <h2>Products</h2>
+            {products.map(p => (
+              <div key={p.id}>
+                {p.name} - ${p.price}
+                <button onClick={() => addToCart(p)}>Add to Cart</button>
+              </div>
+            ))}
+          </div>
+          <div>
+            <h2>Cart ({cart.length})</h2>
+            {/* cart items would go here */}
+          </div>
+        </>
+      )}
+    </div>
+  );
+}
+
+export default App; You get the idea: everything lives in one component, auth is fake and insecure, no routing, no code splitting. Why Legacy React Apps Are Such a Pain Most big companies are still running apps that started life pre-React 18. Giant components, prop drilling everywhere, bundle sizes that make mobile users cry. Adding a new feature means touching half the codebase and praying the tests (if they exist) still pass. Agentic workflows help because they can read the whole thing at once, spot patterns we miss when we're deep in the weeds, and churn out consistent modular code faster than any human could. The Agent Team I run five specialized agents that hand work off to each other: Analyzer – reads the code and produces a structured report. Planner – turns that report into concrete steps. Coder – writes the actual refactored files. Tester – generates meaningful tests. Reviewer – catches anything that slipped through. The Analyzer we already made pretty thorough in the last version. Let's spend more time on the two that do the real work: Coder and Tester. Coder Agent This is the one that actually moves code around. I've learned the hard way that vague prompts lead to broken imports and forgotten lazy loading, so I lock it down pretty tight. Here's the system prompt I use: Python coder_prompt = ChatPromptTemplate.from_messages([
+    ("system", """You're a senior React engineer whose specialty is cleaning up old monoliths.
+
+Implement the refactor plan exactly—no creative detours. Rules I always follow:
+- Functional components and hooks only.
+- Feature-sliced layout: src/features/auth/, src/features/products/, src/features/cart/
+- React Router v6+ with proper <Routes> and <Route>
+- Every route component wrapped in React.lazy() + Suspense for code splitting
+- Shared state lives in dedicated contexts under src/context/
+- Forms are fully controlled (no e.target.username nonsense)
+- Components stay small and focused
+- Relative imports must be correct in the new structure
+- Don't add new dependencies unless the plan explicitly says so
+
+Output must be a JSON object: keys are full file paths, values are complete file contents. Include every new or changed file. Nothing else."""),
+    ("user", """Analysis JSON: {analysis_json}
+Original files: {original_files}
+Plan: {plan}""")
+]) Tester Agent Good tests are what keep me from losing sleep after a refactor. The tester prompt forces realistic RTL/Jest tests: Python tester_prompt = ChatPromptTemplate.from_messages([
+    ("system", """You're a frontend testing specialist. Write clean, useful tests with React Testing Library and Jest.
+
+For every important new or changed component:
+- Test rendering and key interactions
+- Use proper roles and accessible queries
+- Mock contexts when needed
+- Include at least one error/empty state test where it makes sense
+- Keep tests focused—aim for meaningful coverage, not 100% theater
+
+Output JSON: keys are test file paths (e.g. src/features/auth/LoginForm.test.jsx), values are full test files."""),
+    ("user", "Refactored files: {refactored_files}")
+]) What Happens When We Run It Feed the original App.js into the workflow. The Analyzer spots the usual suspects — high-severity coupling, oversized component, no code splitting, insecure auth — and gives us a nice JSON plan. Coder takes that plan and produces things like: A proper LoginForm.jsx with controlled inputs Separate ProductsList.jsx and Cart.jsx Context providers for auth and cart An AppRoutes.jsx that looks roughly like this: JavaScript import React, { Suspense } from 'react';
+import { BrowserRouter, Routes, Route, Navigate } from 'react-router-dom';
+
+const LoginForm = React.lazy(() => import('./features/auth/LoginForm'));
+const ProductsList = React.lazy(() => import('./features/products/ProductsList'));
+const Cart = React.lazy(() => import('./features/cart/Cart'));
+
+function AppRoutes() {
+  return (
+    <BrowserRouter>
+      <Suspense fallback={<div>Loading...</div>}>
+        <Routes>
+          <Route path="/login" element={<LoginForm />} />
+          <Route path="/products" element={<ProductsList />} />
+          <Route path="/cart" element={<Cart />} />
+          <Route path="*" element={<Navigate to="/login" />} />
+        </Routes>
+      </Suspense>
+    </BrowserRouter>
+  );
+}
+
+export default AppRoutes; Tester then writes solid tests — one of my favorites from a real run: JavaScript import { render, screen, fireEvent } from '@testing-library/react';
+import LoginForm from './LoginForm';
+import { AuthContext } from '../../context/AuthContext';
+
+const renderWithContext = (ui, { user = null, login = jest.fn() } = {}) => {
+  return render(
+    <AuthContext.Provider value={{ user, login }}>
+      {ui}
+    </AuthContext.Provider>
+  );
+};
+
+test('submits credentials correctly', () => {
+  const mockLogin = jest.fn();
+  renderWithContext(<LoginForm />, { login: mockLogin });
+
+  fireEvent.change(screen.getByPlaceholderText('Username'), { target: { value: 'admin' } });
+  fireEvent.change(screen.getByLabelText(/password/i), { target: { value: 'secret' } });
+  fireEvent.click(screen.getByRole('button', { name: /login/i }));
+
+  expect(mockLogin).toHaveBeenCalledWith('admin', 'secret');
+}); The Reviewer usually asks for one or two small tweaks (like adding a redirect after login), we loop back to Coder, and we're done. Running the Tests and Shipping npm test on the generated suite usually passes after the first or second iteration. Bundle size drops noticeably once the lazy loading is in place. I still review every diff in Cursor — AI doesn't get a free pass — but the volume of clean, consistent code it produces is night-and-day compared to doing it all manually. Lessons From the Trenches The detailed, structured prompts are what make this actually usable in real projects. Loose instructions = chaos. JSON output with file paths = easy automation. We've used this pattern on much larger apps (10–15k lines) and consistently needed only minor manual fixes afterward. Important Caveats If You're Thinking of Running This on Your Own Monolith Look, this setup works great on small-to-medium apps (a few hundred to a couple thousand lines), and it's a fantastic way to prototype a refactor or clean up a prototype. But before you point it at your company's million-line dashboard, here are the realities I've run into: Token limits are real. Even Claude 3.5's 200k context window fills up fast on anything bigger than a modest app. You'll need to chunk the codebase — feed in one feature or directory at a time — or build smarter retrieval tools (like vector search over your repo). Full-app refactors in one shot just aren't feasible yet. Hallucinations and subtle bugs happen. The agents are good, but they can invent imports that don't exist, miss edge cases in business logic, or subtly change behavior. Never merge without a thorough human diff review. In our bigger projects, we treat the AI output as a very smart PR draft, not final code. Costs add up. Running multiple agents with long contexts on a large codebase can burn through hundreds of dollars in API credits quickly. Start small and monitor usage. Non-code concerns get ignored. 
Package.json changes, build config, environment variables, and custom webpack setups — these agents won't touch them unless you explicitly add tools for it. It's best for mechanical refactors. Extracting components, adding routing, introducing contexts, code splitting — these are where it shines. Complex domain logic migrations or performance optimizations still need heavy human involvement. Top-tier companies are experimenting, not relying. Places like Meta, Google, and Amazon are piloting agentic workflows internally, but they're wrapping them in heavy guardrails, custom retrieval systems, and mandatory review gates. Full autonomy on critical monoliths isn't happening yet — think 30–50% productivity boost on targeted tasks, not full replacement. Use this as an accelerator, not a silver bullet. Start with one bounded feature, let the agents propose the changes, review and tweak, then expand. That's how we've gotten real wins without disasters. Wrapping Up If you're staring at a legacy 0 right now, give this approach a shot. It's not about replacing engineers — it's about letting us focus on the hard problems instead of endless boilerplate and busywork. I'd love to hear what your biggest React refactor headache is at the moment. Drop it in the comments — maybe we can figure out how to tackle it next. Happy (and much less painful) refactoring!
+==============
--- a/tests/summarization/https___dzone.com_articles_where-ai-fits-and-fails-in-workday-integrations.txt
+++ b/tests/summarization/https___dzone.com_articles_where-ai-fits-and-fails-in-workday-integrations.txt
--- a/tests/summarization/https___gopractice.ru_product_ai-products-mazes_.txt
+++ b/tests/summarization/https___gopractice.ru_product_ai-products-mazes_.txt
--- a/tests/summarization/https___gopractice.ru_product_ai-products_.txt
+++ b/tests/summarization/https___gopractice.ru_product_ai-products_.txt
--- a/tests/summarization/https___gopractice.ru_product_finding-potential-ai-applications_.txt
+++ b/tests/summarization/https___gopractice.ru_product_finding-potential-ai-applications_.txt
--- a/tests/summarization/https___gopractice.ru_product_focus-on-the-job-not-the-customer_.txt
+++ b/tests/summarization/https___gopractice.ru_product_focus-on-the-job-not-the-customer_.txt
--- a/tests/summarization/https___gopractice.ru_product_jtbd-the-theory-and-the-frameworks_.txt
+++ b/tests/summarization/https___gopractice.ru_product_jtbd-the-theory-and-the-frameworks_.txt
--- a/tests/summarization/https___gopractice.ru_product_kano-model_.txt
+++ b/tests/summarization/https___gopractice.ru_product_kano-model_.txt
--- a/tests/summarization/https___gopractice.ru_product_large-language-models_.txt
+++ b/tests/summarization/https___gopractice.ru_product_large-language-models_.txt
--- a/tests/summarization/https___gopractice.ru_product_metrics_.txt
+++ b/tests/summarization/https___gopractice.ru_product_metrics_.txt
--- a/tests/summarization/https___gopractice.ru_product_segmentation-method_.txt
+++ b/tests/summarization/https___gopractice.ru_product_segmentation-method_.txt
--- a/tests/summarization/https___gopractice.ru_product_the-north-star-metric-guide_.txt
+++ b/tests/summarization/https___gopractice.ru_product_the-north-star-metric-guide_.txt
--- a/tests/summarization/https___habr.com_ru_articles_985300_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_985300_.txt
--- a/tests/summarization/https___habr.com_ru_articles_987826_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987826_.txt
--- a/tests/summarization/https___habr.com_ru_articles_987848_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987848_.txt
--- a/tests/summarization/https___habr.com_ru_articles_987854_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987854_.txt
@@ -0,0 +1,2 @@
+Translucent iPhone Air. Out of the box, the Air looks a little odd. A perfectionist's worst nightmare is a mismatch between the corner radii of the phone's edge and the camera plateau. The phone may not have top-tier specs, but at least it feels new in the hand. And to make it even more exclusive, it needs a design. Fully transparent back covers have been done before. But the Air has one peculiarity: the camera plateau is made of clear glass, while the rest of the back cover is frosted. All that remains is to split the phone in half and look inside. [Image: iPhone Air teardown] At first glance everything is simple: take the cover, remove the top metal plate that the camera lens attaches to, and scrape off the film. [Image: metal plate of the iPhone Air camera lens] In practice, there is also a plastic trim piece under the metal plate. It holds the upper clips that snap into the body. There are two options: (1) remove the plastic trim completely, which makes the film easy to remove but loses the upper mount; (2) carefully cut the plastic away so that the film can be removed along the contour of the plateau while the mount is kept. [Image: plastic frame with the clips] The second method is more elegant but more laborious. It took about 4 blades. They dull quickly, and the cut has to be neat, because the cut edges will be visible through the clear glass one way or another. [Image: cutting out the iPhone Air camera plateau] Only the details remain: gluing in the camera lens, the flash, and the microphone mesh. The factory method, where the camera lens is welded to the metal plate, looks more reliable by comparison. But over several months of use, nothing has happened to the polyurethane glue used for the lens and the flash, and nothing has fallen off. No dust has gotten inside. It may seem that all of this goes quickly. In reality, carefully cutting out that plastic backing took several hours, and that is not counting the removal of the metal backing and the reassembly of the device. There is also an answer to why the cover became black. 
A black display on light-colored iPhones stands out sharply from the overall look. And since the metal internals are visible here, they go perfectly with the light metal frame. [Image: custom iPhone Air] Is it practical? Hardly. Does it make the cut-down Air better? No. But it is now lighter and looks unusual. Video of the full customization process: https://www.youtube.com/watch?v=I-cz7i4ErMc Yes! When a cat has nothing to do, he ...
+==============
--- a/tests/summarization/https___habr.com_ru_articles_987862_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987862_.txt
--- a/tests/summarization/https___habr.com_ru_articles_987876_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987876_.txt
@@ -0,0 +1,2 @@
+According to The Information, citing internal OpenAI documents, the company expects a loss of $14 billion in 2026, three times as much as in 2025. Cumulative losses for 2023–2028 will total $44 billion, after which the company plans to reach a profit of $14 billion in 2029 on revenue of $100 billion. Deutsche Bank's estimate is harsher: negative free cash flow of $143 billion between 2024 and 2029. The analysts write: "No startup in history has operated with losses on this scale. We are in absolutely uncharted territory." What about Sora? Forbes estimates that generating one 10-second video costs $1.30. At current volumes that is $15 million a day, or $5.4 billion a year. Sora head Bill Peebles has publicly acknowledged that "the economics are completely unsustainable right now." Good evening. ChatGPT's share fell from 87% to 68% over the year. Google Gemini grew from 5.4% to 18.2%. In enterprise it is even worse: OpenAI lost half of the market (from 50% in 2023 to 27% now), and Anthropic's Claude is now the leader with 32%. As former Fidelity manager George Noble said last week: "OpenAI is falling apart before our eyes in real time. I have watched companies collapse for decades. All the warning signs are here."
+==============
--- a/tests/summarization/https___habr.com_ru_articles_987882_.txt
+++ b/tests/summarization/https___habr.com_ru_articles_987882_.txt
--- a/tests/summarization/https___habr.com_ru_companies_croc_articles_987856_.txt
+++ b/tests/summarization/https___habr.com_ru_companies_croc_articles_987856_.txt
--- a/tests/summarization/https___habr.com_ru_companies_habr_career_articles_987870_.txt
+++ b/tests/summarization/https___habr.com_ru_companies_habr_career_articles_987870_.txt
--- a/tests/summarization/https___habr.com_ru_companies_perco_articles_987866_.txt
+++ b/tests/summarization/https___habr.com_ru_companies_perco_articles_987866_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_how-to-enable-use-or-remove-the-developer-tab-in-microsoft-excel_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_how-to-enable-use-or-remove-the-developer-tab-in-microsoft-excel_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_integratsiya-1s-s-it-sistemami-prakticheskoe-rukovodstvo_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_integratsiya-1s-s-it-sistemami-prakticheskoe-rukovodstvo_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_kak-ajtishniku-vybrat-nejroset_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_kak-ajtishniku-vybrat-nejroset_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_kak-it-biznesu-zashhitit-svoyu-informatsionnuyu-infrastrukturu_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_kak-it-biznesu-zashhitit-svoyu-informatsionnuyu-infrastrukturu_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_kak-it-kompanii-zashhitit-svoi-dannye_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_kak-it-kompanii-zashhitit-svoi-dannye_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_kak-rossijskim-ajti-predprinimatelyam-zapuskat-biznes-v-azii-bez-problem-s-zaderzhkoj-nadyozhnostyu-i-bezopasnostyu_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_kak-rossijskim-ajti-predprinimatelyam-zapuskat-biznes-v-azii-bez-problem-s-zaderzhkoj-nadyozhnostyu-i-bezopasnostyu_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_kak-sozdat-internet-magazin-ponyatnoe-rukovodstvo-dlya-predprinimatelej_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_kak-sozdat-internet-magazin-ponyatnoe-rukovodstvo-dlya-predprinimatelej_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_nejroseti-v-seo-i-marketinge-telegram-kak-novyj-poiskovik_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_nejroseti-v-seo-i-marketinge-telegram-kak-novyj-poiskovik_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_vizualnyj-dizajn-i-sotsialnaya-aktivnost-kak-pikseli-i-oformlenie-vliyayut-na-lajki-i-podpischikov_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_vizualnyj-dizajn-i-sotsialnaya-aktivnost-kak-pikseli-i-oformlenie-vliyayut-na-lajki-i-podpischikov_.txt
--- a/tests/summarization/https___ip-calculator.ru_blog_ask_zachem-programmistu-razbiratsya-v-zheleze_.txt
+++ b/tests/summarization/https___ip-calculator.ru_blog_ask_zachem-programmistu-razbiratsya-v-zheleze_.txt
--- a/tests/summarization/https___kod.ru_blue-origin-sputnikovaya-svayz.txt
+++ b/tests/summarization/https___kod.ru_blue-origin-sputnikovaya-svayz.txt
@@ -0,0 +1,2 @@
+Jeff Bezos's Blue Origin has announced the launch of the TeraWave satellite communications project. The network is aimed not at the mass-market user but at corporate customers: data centers, government agencies, and large businesses. The constellation will include 5,280 satellites in low Earth orbit with data rates of up to 144 Gbit/s and another 128 satellites in medium Earth orbit with up to 6 Tbit/s. Deployment will begin in the fourth quarter of 2027; the timeline for completing the project has not been disclosed. The TeraWave announcement came a few months after the rebranding of another Bezos satellite project, Amazon Leo (formerly Project Kuiper), which targets the mass market and more than 3,000 satellites in low Earth orbit. Blue Origin stresses that the projects do not compete directly: TeraWave targets customers who need ultra-high speeds and rapid scalability. Blue Origin also continues to develop its launch services. In November 2025 the company successfully launched its heavy-lift New Glenn rocket and landed its reusable first stage for the first time.
+==============
--- a/tests/summarization/https___kod.ru_caviar-aladdin.txt
+++ b/tests/summarization/https___kod.ru_caviar-aladdin.txt
@@ -0,0 +1,2 @@
+The Russian brand Caviar has announced the development of a custom robot, Aladdin, styled in an Eastern aesthetic, the company's press office told Kod Durova. It is based on the G1 model from the Chinese company Unitree Robotics. This humanoid robot is intended for research and demonstrations, is equipped with a stabilization system, and can perform complex movements. Caviar has completely reworked the exterior of the base model. The body is to be styled after Eastern aristocratic robes, with gold ornaments and gemstone inlays. The design is inspired by Arabic motifs and refers to the character from the collection One Thousand and One Nights. The tales in this collection took shape between the 8th and 14th centuries and bring together the folklore of the Middle East, Persia, and India. In Caviar's interpretation, the image of Aladdin is stripped of the fairy tale: magic is replaced by technology. The robot is presented as a symbol of a new aristocracy based on control over high technology. The project is positioned as conceptual and is made exclusively to individual order. Mass production is not planned; it is more of a design and cultural object. Robot Aladdin costs $100,000. The company provides production lead times at customers' request.
+==============
--- a/tests/summarization/https___kod.ru_gosduma-shtrafy-vpn.txt
+++ b/tests/summarization/https___kod.ru_gosduma-shtrafy-vpn.txt
@@ -0,0 +1,2 @@
+The State Duma has given assurances that fines for the use of VPN services by ordinary users are not planned, Interfax reports. Anton Gorelkin, first deputy chairman of the State Duma Committee on Information Policy, stressed that such measures are not even being considered. According to him, there are no plans to introduce any additional fines for citizens. The deputy recalled that in Russia criminal liability connected with VPNs arises only when the technology is used to commit crimes. "In Russia, liability for VPN use threatens only those who use it to commit crimes: this is considered an aggravating circumstance. It is also prohibited to advertise ways of bypassing blocks and to call for their use. No other fines are planned or under discussion." Earlier, in the summer of 2025, a law was passed on administrative liability for owners of VPN services. The document also provides for fines for deliberately searching for materials banned in Russia, but it does not affect ordinary users.
+==============
--- a/tests/summarization/https___kod.ru_huawei-matepad-11-5-s-2026-predzakaz.txt
+++ b/tests/summarization/https___kod.ru_huawei-matepad-11-5-s-2026-predzakaz.txt
@@ -0,0 +1,2 @@
+Pre-orders have opened for the updated Huawei MatePad 11.5 S tablet with an eye-friendly PaperMatte display, the MTS press office told Kod Durova. Along with the PaperMatte display, the new model gets a smart magnetic keyboard and AI features in the Notes app. The tablet is coming to Russia in green and space gray. What does the HUAWEI MatePad 11.5 S offer? The 2026 tablet, with an all-metal body and an 11.5" display, is positioned as a device for study, quick work tasks, and creative work. It is 6.1 mm thick and weighs 515 g. Image: Huawei. The display has 2.8K resolution, a 144 Hz refresh rate, and peak brightness of up to 500 nits. The version with the PaperMatte display uses nano-etching technology, which reduces glare and creates the feel of writing on paper. The Huawei M-Pencil Pro stylus with NearLink technology and the smart magnetic keyboard are also supported. The keyboard has keys with 1.5 mm of travel and a size of 15 mm. The tablet's battery has a capacity of 8,800 mAh and supports Huawei SuperCharge fast charging at up to 40 W. The AI features announced for Notes include recognizing and solving equations for study, text and audio synchronization, a shared clipboard across devices, and split-screen mode. A resource center with paper templates, stickers, and covers is also available. The tablet comes with the desktop version of WPS Office, which lets it serve as a mobile office for working with documents and presentations. How much does the Huawei MatePad 11.5 S cost in Russia? MatePad 11.5 S 12+256 GB PaperMatte with keyboard: 45,990 rubles; MatePad 11.5 S 12+256 GB (space gray) with keyboard: 40,990 rubles. During the pre-order period in the online store, from January 22 to 29, MTS is offering a Huawei M-Pencil 3 stylus, a mouse, and 1 year of additional warranty as a gift. 
Alexey Pomozov, commercial director of the MTS retail chain, said that based on 2025 results Huawei tablets account for 41% of unit sales in the MTS retail chain. According to him, this confirms high demand: "We are confident that the new Huawei MatePad 11.5 S will strengthen this position and become one of the most in-demand tablets in its segment." As a reminder, Kod Durova highlighted five features of the 2024 version of the Huawei MatePad 11.5"S PaperMatte in a separate review, "Huawei MatePad 11.5″ PaperMatte: the secret of the paper-like screen and hidden features". Cover image: Huawei
+==============
--- a/tests/summarization/https___kod.ru_mincifri-analog-call-of-duty.txt
+++ b/tests/summarization/https___kod.ru_mincifri-analog-call-of-duty.txt
@@ -0,0 +1,2 @@
+Developers of a domestic analogue of Call of Duty will be able to apply for tax breaks and funding through the Internet Development Institute (IRI). The Ministry of Digital Development said so in its reply to State Duma deputy Mikhail Delyagin, Gazeta.Ru writes. By the deputy's own estimate, developing such a project could cost up to 10 billion rubles. The ministry specified that if an application is submitted to IRI for funding a game on this theme, it will be reviewed under the existing competitive procedures. The ministry also recalled that Russia already has state support measures for IT companies, including game developers. These include a reduced profit tax rate of 5%, reduced insurance contribution rates, and partial exemption from VAT in some cases. Earlier, Delyagin proposed creating a domestic AAA shooter, an analogue of the Call of Duty series, in which the enemies would be Americans, Ukrainians, and British rather than Russians, as, in his words, is the case in foreign games. He noted that the project would be extremely expensive and that without substantial tax and credit benefits it is unlikely to be realized. The deputy believes that unfriendly countries use video games to spread "Russophobic propaganda", so developing a Russian analogue is, in his view, of strategic importance.
+==============
--- a/tests/summarization/https___kod.ru_obzor-huawei-matepad-11-5-s-2026.txt
+++ b/tests/summarization/https___kod.ru_obzor-huawei-matepad-11-5-s-2026.txt
--- a/tests/summarization/https___kod.ru_open-letter-to-pavel-durov.txt
+++ b/tests/summarization/https___kod.ru_open-letter-to-pavel-durov.txt
@@ -0,0 +1,2 @@
+State Duma deputy speaker Vladislav Davankov has called on citizens to sign an open letter to Pavel Durov urging him to open a Telegram representative office in Russia. According to Davankov, opening a representative office is a formal but key requirement of Russian law. "Most often, officials explain blocks by saying that the social network has no representative office in Russia, and therefore there is no one to build a dialogue with. If Telegram meets this requirement, it will deprive Roskomnadzor of grounds for blocking," the deputy explains. The deputy speaker is collecting signatures for the open letter to the head of Telegram through the Telegram bot @newpeople_vote_bot, named #ТелеграмЖиви. Within just half an hour of the start of signature collection, several thousand people had left their signatures in the bot. In the same bot, almost 40,000 people had by then already signed against the blocking of Telegram. Davankov stressed that he regards Pavel Durov, and what he has done for the development of the internet around the world, "with great respect": "I think our values regarding freedom of speech and digital rights coincide. And if a Telegram office in Russia protects these values, it will be the right step." On the evening of January 21, Artem Sheikin, deputy chairman of the Council for the Development of the Digital Economy under the Federation Council, said that Russia is consistently introducing measures against Telegram: "Telegram does not comply with requirements aimed at preventing and suppressing crimes on the territory of the Russian Federation," the senator said. In a comment to Ostorozhno Media, Roskomnadzor called the statement about the consistent introduction of measures against the messenger "exhaustive". 
On Friday, Kod Durova demonstrated that uploading and sending a large file is substantially faster after switching to a foreign IP address than on a Russian one. By that day, complaints from Russians about file download and upload speeds had noticeably increased.
+==============
--- a/tests/summarization/https___kod.ru_reestr-kuriery.txt
+++ b/tests/summarization/https___kod.ru_reestr-kuriery.txt
@@ -0,0 +1,2 @@
+The Moscow Department of Transport is drafting a bill to regulate courier activity, RBC reports. The document provides for the creation of federal and regional registries of market participants. Under the new system, two types of registries will be created. The federal registry will cover courier organizations, while the regional ones will cover the couriers themselves, with their passport data and information about their vehicles. Each courier will receive a personal identifier and a rating. Inclusion in the registry will remain free but will require meeting a number of conditions. In particular, couriers will be obliged to obey traffic rules and to undergo medical examinations and photo checks before shifts. Traffic rule violators will be blocked. Couriers who do not wish to register will be prohibited from delivering orders. Courier companies will be obliged to allow only registered workers to work. They will have to keep order logs and track couriers' locations in real time. Ordering platforms will receive additional obligations to verify courier registration and organize document flow. They will also be required to insure liability and block violators. Penalties are planned for non-compliance with the new requirements. When working with unregistered couriers, aggregators will bear joint and several liability for compensation of damage. The article notes that the regulation will take effect on September 1, 2027, but does not specify the size of the fines.
+==============
--- a/tests/summarization/https___kod.ru_telegram-op-ogranicheniya.txt
+++ b/tests/summarization/https___kod.ru_telegram-op-ogranicheniya.txt
@@ -0,0 +1,2 @@
+Evgeny Masharov, a member of a commission of the Civic Chamber of the Russian Federation, has called for restrictive measures against the Telegram messenger to continue if it ignores Russian law, TASS reports. Masharov stressed that the messenger's administration must comply with the requirements of domestic law. According to him, the problem is not limited to ordinary messages and calls. Groups with illegal content, including online casinos and sports betting, operate actively in the app. The Civic Chamber representative recommended using the Russian messenger MAX as widely as possible. Masharov named security and high connection quality as the domestic platform's main advantages. Earlier, Roskomnadzor had already taken measures against Telegram, restricting voice calls in August of last year. The agency explained the decision by the messenger's use for fraud and cybercrime.
+==============
--- a/tests/summarization/https___kod.ru_yandex-b2btech-postgresql.txt
+++ b/tests/summarization/https___kod.ru_yandex-b2btech-postgresql.txt
@@ -0,0 +1,2 @@
+Yandex B2B Tech has introduced a managed Sharded PostgreSQL service that addresses the problem of horizontally scaling the popular database, the company's press office told Kod Durova. By default, PostgreSQL does not support horizontal scaling, that is, adding new servers to distribute load. This creates limitations for companies that need to process large volumes of data. The new service lets banks and e-commerce companies scale their systems as the business grows. The technology is already used in Yandex's own products: Yandex ID, Yandex Pay, and the Edadeal service. According to the 2025 Stack Overflow survey, PostgreSQL remains the most popular database management system, named by 55.6% of professional developers. More than 5,000 companies use the managed PostgreSQL service from Yandex Cloud. Customers include the developer Samolet, the marketplace Cian, the car service Pango Cars, and the metallurgical company Ruspolimet. In 2025 the number of users of the service grew by 15%. The new solution makes it possible to cut time to market by a factor of 3 to 4. Companies can save up to 15 million rubles at the launch stage of high-load products. The service is available on the Yandex Cloud platform on request.
+==============
--- a/tests/summarization/https___lambdaland.org_posts_2026-01-21_tree-sitter_vs_lsp_.txt
+++ b/tests/summarization/https___lambdaland.org_posts_2026-01-21_tree-sitter_vs_lsp_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30448_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30448_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30453_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30453_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30511_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30511_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30526_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30526_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30546_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30546_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_30828_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30828_.txt
@@ -0,0 +1,65 @@
+When building SwiftUI views that must adapt to the size of their container, many developers reach for GeometryReader. It is the standard solution, but it has costs: it affects the view hierarchy and can complicate layout logic. In this article I will show how to build a proper modifier with onGeometryChange and contentMargins, two powerful iOS 17+ APIs that let you track geometry changes without wrapping views in GeometryReader. Note: onGeometryChange shipped with the iOS 17 SDK but was back-ported to iOS 16. The problem: the hidden costs of GeometryReader Suppose you need to center content horizontally with a maximum-width constraint. This is a common task for iPad layouts or wide-screen designs. The traditional approach looks roughly like this: var body: some View {
+  GeometryReader { geometry in
+    ScrollView {
+      LazyVStack {
+        // Content
+      }
+      .frame(maxWidth: 768)
+    }
+    .contentMargins(
+      .horizontal,
+      max(0, (geometry.size.width - 768) / 2),
+      for: .scrollContent
+    )
+  }
+} This works, but GeometryReader has several drawbacks: it affects layout: it takes up all available space, which can disrupt the view hierarchy; it is verbose: the entire view structure has to be wrapped; it is not reusable: the pattern has to be repeated across many views. The solution: a proper ViewModifier Instead of wrapping views in GeometryReader, you can create a reusable modifier that uses onGeometryChange to track size changes without affecting layout. Here is how: import SwiftUI
+
+struct MaxWidthContentMargins: ViewModifier {
+  @State private var containerWidth: CGFloat = 0
+
+  func body(content: Content) -> some View {
+    content
+      .onGeometryChange(
+        for: CGFloat.self,
+        of: { geometry in
+          geometry.size.width
+        }
+      ) { width in
+        containerWidth = width
+      }
+      .contentMargins(
+        .horizontal,
+        max(0, (containerWidth - .editorMaxColumnWidth) / 2),
+        for: .scrollContent
+      )
+  }
+}
+
+extension View {
+  func maxWidthContentMargins() -> some View {
+    modifier(MaxWidthContentMargins())
+  }
+}
+ How it works 1. onGeometryChange: tracking without interference The onGeometryChange modifier lets you track geometry changes without wrapping the view in GeometryReader. It takes three parameters: for: the type of the tracked value (here CGFloat, for the width); of: a closure that extracts the value from the GeometryProxy; action: a closure that runs when the tracked value changes. .onGeometryChange(
+  for: CGFloat.self,
+  of: { geometry in
+    geometry.size.width
+  }
+) { width in
+  containerWidth = width
+} Here is the key difference: onGeometryChange observes geometry without affecting the view hierarchy. Your views stay clean and unchanged. 2. contentMargins: precise content positioning The contentMargins modifier adjusts the margins around scrollable content without touching the scroll indicators. This is ideal for centering content while keeping the scroll indicators at the screen edges. The empty space that becomes margins also remains scrollable and interactive. .contentMargins(
+  .horizontal,
+  max(0, (containerWidth - .editorMaxColumnWidth) / 2),
+  for: .scrollContent
+) The calculation max(0, (containerWidth - .editorMaxColumnWidth) / 2) ensures: on wide screens: the content is centered with equal margins on both sides; on narrow screens: no extra margins are added (max(0, …) prevents negative values). Applying the modifier Applying the modifier is now extremely simple: struct LibraryView: View {
+  var body: some View {
+    ScrollView {
+      LazyVStack {
+        // Content
+      }
+      .frame(maxWidth: .editorMaxColumnWidth)
+    }
+    .maxWidthContentMargins()
+  }
+} No GeometryReader wrappers. No repeated geometry calculations. Just a clean, declarative modifier. Translation of the article by Thomas Ricouard: Beyond GeometryReader: Building Better SwiftUI Modifiers with onGeometryChange
+==============
--- a/tests/summarization/https___nuancesprog.ru_p_30941_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_30941_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_31317_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_31317_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_31340_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_31340_.txt
--- a/tests/summarization/https___nuancesprog.ru_p_31409_.txt
+++ b/tests/summarization/https___nuancesprog.ru_p_31409_.txt
--- a/tests/summarization/https___polarsparc.com_2025_05_31_open-webui_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_05_31_open-webui_.txt
@@ -0,0 +1,2 @@
+Updated: 05/31/2025 In this primer, we will provide an overview of the Open WebUI platform as well as get our hands dirty interacting with it, in combination with the Ollama platform. Here is the link to the article on Open WebUI: Quick Primer on Open WebUI with Ollama Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_06_15_langchain-recipes_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_06_15_langchain-recipes_.txt
@@ -0,0 +1,2 @@
+Updated: 06/15/2025 In this article, we will provide code snippets for commonly performed tasks using the popular LLM framework LangChain. Here is the link to the article on common LangChain recipes: Common LangChain Recipes Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_06_15_langchain_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_06_15_langchain_.txt
@@ -0,0 +1,2 @@
+Updated: 06/15/2025 In this primer, we will provide an overview of the popular LLM framework LangChain AND get our hands dirty with some code samples on the various core components. Here is the link to the article on LangChain: Quick Primer on LangChain Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_06_21_pragmatic-spring-ai_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_06_21_pragmatic-spring-ai_.txt
@@ -0,0 +1,2 @@
+In this article, we will provide an overview of the Spring AI framework that went GA in May 2025 and get our hands dirty with some code samples on the various core capabilities of the framework in combination with Spring Boot. Here is the link to the article on Spring AI: Pragmatic Bytes on Spring AI Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_06_29_llama-cpp_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_06_29_llama-cpp_.txt
@@ -0,0 +1,2 @@
+In this primer, we will provide an overview of the llama.cpp inference platform and get our hands dirty in using it. We will use the Langchain OpenAI module in Python to test the platform. Here is the link to the article on llama.cpp: Quick Primer on llama.cpp Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_07_04_hyperledger-besu-docker_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_07_04_hyperledger-besu-docker_.txt
@@ -0,0 +1,2 @@
+Updated: 07/04/2025 In this article, we will explore the use of Hyperledger Besu (an open source Ethereum client built for enterprises). We will demonstrate how to set up and test a multi-node private Blockchain network using Hyperledger Besu in Docker containers. Here is the link to the article: Hyperledger Besu Private Network using Docker Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_07_05_docker-model-runner_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_07_05_docker-model-runner_.txt
@@ -0,0 +1,2 @@
+In this article, we will introduce how one can get their hands dirty using the Docker Model Runner. Here is the link to the article: Quick Bytes on Docker Model Runner Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_07_12_cross-entropy_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_07_12_cross-entropy_.txt
@@ -0,0 +1,2 @@
+In this article, we will develop an intuition for the Cross Entropy loss function that is often used for AI/ML Classification tasks during model training. Here is the link to the article: Understanding Cross Entropy Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_07_19_anvil-solidity-python_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_07_19_anvil-solidity-python_.txt
@@ -0,0 +1,2 @@
+Updated: 07/19/2025 Note that as of 2023, the local Ethereum development environment Ganache (part of TruffleSuite) has been sunset. In this updated article, we will show how to set up a local Ethereum development environment using Anvil (part of the Foundry toolchain) and Solidity for building, deploying, and testing Decentralized Applications (dApps) and follow through with a simple demonstration using Python. Here is the link to the article: Using Anvil and Solidity with Python Enjoy 🙂 !!!
+==============
--- a/tests/summarization/https___polarsparc.com_2025_08_01_polarsparc-retire_.txt
+++ b/tests/summarization/https___polarsparc.com_2025_08_01_polarsparc-retire_.txt
@@ -0,0 +1,2 @@
+!!! ATTENTION !!! The domain polarsparc.com will be RETIRED and go OFFLINE in early Oct 2025. Most of the relevant content, about 300 articles, has been migrated over to GitHub Pages at polarsparc.github.io !!! Going forward, please feel free to visit the above-mentioned GitHub Pages site for newer content !!!
+==============
--- a/tests/summarization/https___reactos.org_blogs_30yrs-of-ros_.txt
+++ b/tests/summarization/https___reactos.org_blogs_30yrs-of-ros_.txt
--- a/tests/summarization/https___t-cadet.github.io_programming-wisdom__2026-01-17-gathering-linux-syscall-numbers.txt
+++ b/tests/summarization/https___t-cadet.github.io_programming-wisdom__2026-01-17-gathering-linux-syscall-numbers.txt
@@ -0,0 +1,80 @@
+Gathering Linux Syscall Numbers in a C Table 2026-01-17 @op #linux #syscalls #c I've been trying to program without libc, and on Linux that means calling syscalls directly. Syscalls are the lowest userland layer; they are basically the ground of the Linux userland. In an ideal world, there would be a header-only C library provided by the Linux kernel; we would include that file and be done with it. As it turns out, there is no such file, and interfacing with syscalls is complicated. Syscalls are special; to syscall, one has to put the syscall number in a register, the arguments in other registers, and issue an assembly instruction. Okay, that said, how hard can it be to create my own header-only syscall library? First things first: I need to get all the syscall numbers. My Linux syscall table. Organized thematically for browsing. Valid C code, cross-architecture. /*╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
+/*║                                                  LINUX SYSCALL TABLE                                                   ║
+/*╠════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
+/*║                                                      Section List                                                      ║
+/*╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
+/*║  1. PROCESS & THREAD LIFECYCLE         11. SIGNALS                            21. NAMESPACES & CONTAINERS              ║
+/*║  2. PROCESS ATTRIBUTES & CONTROL       12. PIPES & FIFOs                      22. PROCESS INSPECTION & CONTROL         ║
+/*║  3. SCHEDULING & PRIORITIES            13. INTER-PROCESS COMMUNICATION        23. SYSTEM INFORMATION                   ║
+/*║  4. MEMORY MANAGEMENT                  14. SOCKETS & NETWORKING               24. KERNEL MODULES                       ║
+/*║  5. FILE I/O OPERATIONS                15. ASYNCHRONOUS I/O                   25. SYSTEM CONTROL & ADMINISTRATION      ║
+/*║  6. FILE DESCRIPTOR MANAGEMENT         16. TIME & CLOCKS                      26. PERFORMANCE MONITORING & TRACING     ║
+/*║  7. FILE METADATA                      17. RANDOM NUMBERS                     27. DEVICE & HARDWARE ACCESS             ║
+/*║  8. DIRECTORY & NAMESPACE OPERATIONS   18. USER & GROUP IDENTITY              28. ARCHITECTURE-SPECIFIC OPERATIONS     ║
+/*║  9. FILE SYSTEM OPERATIONS             19. CAPABILITIES & SECURITY            29. ADVANCED EXECUTION CONTROL           ║
+/*║ 10. FILE SYSTEM MONITORING             20. RESOURCE LIMITS & ACCOUNTING       30. LEGACY, OBSOLETE & UNIMPLEMENTED     ║
+/*╠════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
+/*║                                             1. PROCESS & THREAD LIFECYCLE                                              ║
+/*║                           Creation, execution, termination, and reaping of processes/threads                           ║
+/*╠════════════════════════════════════════════════════════╦═════════╤═════════╤═════════╤═════════╤═════════╤═════════════╣
+/*║                      Syscall Name                      ║ x86_64  │  arm64  │ riscv64 │ x86_32  │  arm32  │   riscv32   ║
+/*╟────────────────────────────────────────────────────────╨─────────┴─────────┴─────────┴─────────┴─────────┴─────────────╢
+/*║*/ #define NR_fork_linux                         BY_ARCH(       57,     void,     void,        2,        2,     void) //║
+/*║*/ #define NR_vfork_linux                        BY_ARCH(       58,     void,     void,      190,      190,     void) //║
+/*║*/ #define NR_clone_linux                        BY_ARCH(       56,      220,      220,      120,      120,      220) //║
+/*║*/ #define NR_clone3_linux                       BY_ARCH(      435,      435,      435,      435,      435,      435) //║
+/*║*/ #define NR_execve_linux                       BY_ARCH(       59,      221,      221,       11,       11,      221) //║
+/*║*/ #define NR_execveat_linux                     BY_ARCH(      322,      281,      281,      358,      387,      281) //║
+/*║*/ #define NR_exit_linux                         BY_ARCH(       60,       93,       93,        1,        1,       93) //║
+/*║*/ #define NR_exit_group_linux                   BY_ARCH(      231,       94,       94,      252,      248,       94) //║
+/*║*/ #define NR_wait4_linux                        BY_ARCH(       61,      260,      260,      114,      114,     void) //║
+/*║*/ #define NR_waitid_linux                       BY_ARCH(      247,       95,       95,      284,      280,       95) //║
+/*║*/ #define NR_waitpid_linux                      BY_ARCH(     void,     void,     void,        7,     void,     void) //║
+/*╠════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣ Full table available at: https://github.com/t-cadet/c/blob/main/linux.h (Note: I am still early in my exploration of raw syscalls; there may be inaccuracies or other mistakes.) To my surprise, gathering Linux syscall numbers is a rather tortuous process. I started my journey by googling "linux syscall numbers" with the following results:1 Searchable Linux Syscall Table for x86_64 Chromium OS Docs - Linux System Call Table Linux kernel system calls for all architectures You may notice that all the search results are third-party. Is it a matter of going directly to the kernel docs website? I tried, but found nothing very relevant to gathering syscall numbers on the kernel docs website, nor in the manual. Okay, so back to third-party results. The syscall tables they provide look promising, until I notice something amiss: there are multiple tables, each with different syscall numbers. What's going on there? Sure enough, the Chromium link has a stark warning: [Syscall numbers] vary significantly across architectures/ABIs, both in mappings and in actual name. Really? Different syscall numbers on different architectures? I had to cross-check various resources to convince myself that this is not an artifact introduced by third-party resources. It is not: the answer on this Stack Exchange question suggests that at least for the discrepancies between x86 32 and 64 bits, it is a matter of cacheline usage optimization. For other architectures, AI suggests that in the 90s architecture ports (like Alpha, MIPS, or SPARC) ignored Linus' original x86 numbering and instead copied the syscall tables of proprietary Unixes (like OSF/1, IRIX, or Solaris) to allow them to run those non-Linux binaries natively (but I could not find any source to corroborate these AI claims). 
Anyway, after a detour through the libc syscall tables (musl, glibc): $ head -n 10 musl/arch/arm/bits/syscall.h.in #define __NR_restart_syscall	0
+#define __NR_exit	1
+#define __NR_fork	2
+#define __NR_read	3
+#define __NR_write	4
+#define __NR_open	5
+#define __NR_close	6
+#define __NR_creat	8
+#define __NR_link	9
+#define __NR_unlink	10 I ended up finding the primary source in the kernel: .tbl files $ find linux/arch -name *.tbl linux/arch/microblaze/kernel/syscalls/syscall.tbl
+linux/arch/sparc/kernel/syscalls/syscall.tbl
+linux/arch/x86/entry/syscalls/syscall_64.tbl
+linux/arch/x86/entry/syscalls/syscall_32.tbl
+linux/arch/xtensa/kernel/syscalls/syscall.tbl
+linux/arch/m68k/kernel/syscalls/syscall.tbl
+linux/arch/sh/kernel/syscalls/syscall.tbl
+linux/arch/mips/kernel/syscalls/syscall_n64.tbl
+linux/arch/mips/kernel/syscalls/syscall_n32.tbl
+linux/arch/mips/kernel/syscalls/syscall_o32.tbl
+linux/arch/s390/kernel/syscalls/syscall.tbl
+linux/arch/arm64/tools/syscall_64.tbl
+linux/arch/arm64/tools/syscall_32.tbl
+linux/arch/alpha/kernel/syscalls/syscall.tbl
+linux/arch/arm/tools/syscall.tbl
+linux/arch/parisc/kernel/syscalls/syscall.tbl
+linux/arch/powerpc/kernel/syscalls/syscall.tbl This is basically a tab-separated table format, but the optional columns, occasional space instead of tab, legacy ABIs, and duplicated legacy syscalls make it messy to parse: $ head -n22 linux/arch/x86/entry/syscalls/syscall_64.tbl # SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# 64-bit system call numbers and entry vectors
+#
+# The format is:
+# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
+#
+# The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
+#
+# The abi is "common", "64" or "x32" for this file.
+#
+0	common	read			sys_read
+1	common	write			sys_write
+2	common	open			sys_open
+3	common	close			sys_close
+4	common	stat			sys_newstat
+5	common	fstat			sys_newfstat
+6	common	lstat			sys_newlstat
+7	common	poll			sys_poll
+8	common	lseek			sys_lseek
+9	common	mmap			sys_mmap
+10	common	mprotect		sys_mprotect Also, there are no .tbl files for RISC-V 32 and 64 bits.2 If I understand correctly, the RISC-V tables are generated from a stock list based on x86 numbering with various #defines that enable or disable some syscalls. Since I do not know the right combination of #defines, I went back to glibc and used their RISC-V files; it seemed safer. With all the files thus gathered, I could finally parse them with a C script and generate my own implementation of a syscall number table, of which you saw a snippet earlier in the article. The decision of what syscall goes to what section was largely left to an AI, peppered with some of my nitpicks, which seems to have worked out well thanks to the AI's encyclopedic knowledge of all syscalls. For reference, here is the only taxonomy that a search for "linux syscall taxonomy" turns up. I ran into one last complication: not all architectures implement all syscalls, and so some syscall numbers are missing. That was pretty surprising: I was expecting a unified interface across all architectures, with perhaps one or two architecture-specific syscalls to access architecture-specific capabilities; but Linux syscalls are more like Swiss cheese. So I encoded these holes as void in my table to break compilation if they are ever used on the wrong architecture. ➜ Next time: implementing syscall wrappers in C. Footnotes I have since found another 3rd party syscall table that appears more reliable. Great effort went into generating it, as the description of the systrack tool by its author on its Hacker News post attests: I am using static analysis of kernel images (vmlinux ELF) that are built with debug information. Each table you see was extracted from a kernel built by my tool, Systrack, that can configure and build kernels that have all the syscalls available. 
The code is heavily commented and available on GitHub if you are interested: https://github.com/mebeim/systrack I realized soon in the process that simply looking at kernel sources was not enough to extract everything accurately, especially definition locations. I also wanted this to be a tool to extract syscalls actually implemented from a given kernel image, so that's what it does. ↩ I later found out that the kernel sources contain a generic scripts/syscall.tbl file: "modern" architectures such as RISC-V have a Makefile that lists the relevant ABIs for the architecture, and calls a shell script to filter the table. The Makefile and the script were pretty complicated, so I ended up parsing the generic .tbl file and doing the filtering myself, thus not relying on glibc for syscall numbers. ↩
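The idea of encoding missing syscalls as void, described in the article above, can be sketched roughly as follows. This is only an illustration under stated assumptions: the BY_ARCH definition here is hypothetical (the real macro in linux.h may look different), and the helper functions clone3_number and exit_number are invented for the example. The three table rows are copied from the excerpt shown earlier.

```c
/*
 * Hypothetical BY_ARCH: keep only the column that matches the target
 * architecture. Column order follows the table header:
 * x86_64, arm64, riscv64, x86_32, arm32, riscv32.
 * This is an assumption, not the macro from the author's linux.h.
 */
#if defined(__x86_64__)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) x86_64
#elif defined(__aarch64__)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) arm64
#elif defined(__riscv) && (__riscv_xlen == 64)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) riscv64
#elif defined(__i386__)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) x86_32
#elif defined(__arm__)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) arm32
#elif defined(__riscv) && (__riscv_xlen == 32)
#  define BY_ARCH(x86_64, arm64, riscv64, x86_32, arm32, riscv32) riscv32
#else
#  error "BY_ARCH sketch: unsupported target architecture"
#endif

/* Rows copied from the table excerpt in the article. */
#define NR_fork_linux   BY_ARCH( 57, void, void,   2,   2, void)
#define NR_clone3_linux BY_ARCH(435,  435,  435, 435, 435,  435)
#define NR_exit_linux   BY_ARCH( 60,   93,   93,   1,   1,   93)

/* clone3 has a number in every column, so this compiles everywhere. */
static long clone3_number(void) { return NR_clone3_linux; }

/* exit also exists everywhere, but its number depends on the target. */
static long exit_number(void) { return NR_exit_linux; }

/*
 * A function using NR_fork_linux is left out on purpose: on arm64 or
 * RISC-V the macro expands to the keyword `void`, so a statement such as
 * `return NR_fork_linux;` becomes `return void;` and fails to compile.
 */
```

With this sketch, clone3_number() yields 435 on every listed architecture, while exit_number() yields whichever column matched the compiler's target. The hole only costs anything when code actually references a syscall that the target lacks, which is the compile-time failure the article is after.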
+==============
--- a/tests/summarization/https___techcrunch.com_2026_01_20_amagi-slides-in-india-debut-as-cloud-tv-software-firm-tests-investor-appetite_.txt
+++ b/tests/summarization/https___techcrunch.com_2026_01_20_amagi-slides-in-india-debut-as-cloud-tv-software-firm-tests-investor-appetite_.txt
--- a/tests/summarization/https___techcrunch.com_2026_01_20_anthropics-ceo-stuns-davos-with-nvidia-criticism_.txt
+++ b/tests/summarization/https___techcrunch.com_2026_01_20_anthropics-ceo-stuns-davos-with-nvidia-criticism_.txt
@@ -0,0 +1,2 @@
+Last week, after reversing an earlier ban, the U.S. administration officially approved the sale of Nvidia’s H200 chips, along with a chip line by AMD, to approved Chinese customers. Maybe they aren’t these chipmakers’ shiniest, most advanced chips, but they’re high-performance processors used for AI, making the export controversial. And at the World Economic Forum in Davos on Tuesday, Anthropic CEO Dario Amodei unloaded on both the administration and the chip companies over the decision. The criticism was particularly notable because one of those chipmakers, Nvidia, is a major partner and investor in Anthropic. “The CEOs of these companies say, ‘It’s the embargo on chips that’s holding us back,’” Amodei said, incredulous, in response to a question about the new rules. The decision is going to come back to bite the U.S., he warned. “We are many years ahead of China in terms of our ability to make chips,” he told Bloomberg’s editor-in-chief, who was interviewing him. “So I think it would be a big mistake to ship these chips.” Amodei then painted an alarming picture of what’s at stake. He talked about the “incredible national security implications” of AI models that represent “essentially cognition, that are essentially intelligence.” He likened future AI to a “country of geniuses in a data center,” saying to imagine “100 million people smarter than any Nobel Prize winner,” all under the control of one country or another. The image underscored why he thinks chip exports matter so much. But then came the biggest blow. “I think this is crazy,” Amodei said of the administration’s latest move. “It’s a bit like selling nuclear weapons to North Korea and [bragging that] Boeing made the casings.” That sound you hear? The team at Nvidia, screaming into their phones. Nvidia isn’t just another chip company. 
While Anthropic runs on the servers of Microsoft and Amazon and Google, Nvidia alone supplies the GPUs that power Anthropic’s AI models (every cloud provider needs Nvidia’s GPUs). Not only does Nvidia sit at the center of everything, but it also recently announced it was investing in Anthropic to the tune of up to $10 billion. Just two months ago, the companies announced that financial relationship, along with a “deep technology partnership” with cheery promises to optimize each other’s technology. Fast-forward to Davos, and Amodei is comparing his partner to an arms dealer. Maybe it was just an unguarded moment; it’s possible he got swept up in his own rhetoric and blurted out the analogy. But given Anthropic’s strong position in the AI market, it seems more likely he felt comfortable speaking with confidence. 
The company has raised billions, is valued in the hundreds of billions, and its Claude coding assistant has developed a reputation as a highly beloved and top-tier AI coding tool, particularly among developers working on complex, real-world projects. It’s also entirely possible that Anthropic genuinely fears Chinese AI labs and wants Washington to act. If you want to get someone’s attention, nuclear proliferation comparisons are probably a pretty effective way to do it. But what’s perhaps most remarkable is that Amodei could sit onstage at Davos, drop a bomb like that, and walk away to some other gathering without fear that he just adversely impacted his business. News cycles move on, sure. Anthropic is also on solid footing right now. But it does feel that the AI race has grown so existential in the minds of its leaders that the usual constraints — investor relations, strategic partnerships, diplomatic niceties — don’t apply anymore. Amodei isn’t concerned about what he can and can’t say. More than anything else he said on that stage, that fearlessness is worth paying attention to.
+==============
--- a/tests/summarization/https___techcrunch.com_2026_01_20_bolna-nabs-6-3-million-from-general-catalyst-for-its-india-focused-voice-orchestration-platform_.txt
+++ b/tests/summarization/https___techcrunch.com_2026_01_20_bolna-nabs-6-3-million-from-general-catalyst-for-its-india-focused-voice-orchestration-platform_.txt
--- a/tests/summarization/https___techcrunch.com_2026_01_20_elon-musk-says-teslas-restarted-dojo3-will-be-for-space-based-ai-compute_.txt
+++ b/tests/summarization/https___techcrunch.com_2026_01_20_elon-musk-says-teslas-restarted-dojo3-will-be-for-space-based-ai-compute_.txt
@@ -0,0 +1,2 @@
+Elon Musk said over the long weekend that Tesla aims to restart work on Dojo3, the electric vehicle company’s previously abandoned third-generation AI chip. Only this time, Dojo3 won’t be aimed at training self-driving models on Earth. Instead, Musk says it will be dedicated to “space-based AI compute.” The move comes five months after Tesla effectively shut down its Dojo effort. The company disbanded the team behind its Dojo supercomputer following the departure of Dojo lead Peter Bannon. Around 20 Dojo workers also left to join DensityAI, a new AI infrastructure startup founded by former Dojo head Ganesh Venkataramanan and ex-Tesla employees Bill Chang and Ben Floering. At the time of Dojo’s shutdown, Bloomberg reported Tesla planned to increase its reliance on Nvidia and other partners like AMD for compute and Samsung for chip manufacturing, rather than continue developing its own custom silicon. Musk’s latest comments suggest the strategy has shifted again. The billionaire executive and Republican megadonor said in a post on X the decision to revive Dojo was based on the state of its in-house chip roadmap, noting that Tesla’s AI5 chip design was “in good shape.” Tesla’s AI5 chip, made by TSMC, was designed to power the automaker’s automated driving features and Optimus humanoid robots. Last summer, Tesla signed a $16.5 billion deal with Samsung to build its AI6 chips that promise to power Tesla vehicles and Optimus, as well as enable high-performance AI training in data centers. “AI7/Dojo3 will be for space-based AI compute,” Musk said on Sunday, positioning the resurrected project as more of a moonshot. To achieve that, Tesla is now gearing up to rebuild the team it dismantled months ago. 
Musk used the same post to recruit engineers directly, writing: “If you’re interested in working on what will be the highest volume chips in the world, send a note to AI_Chips@Tesla.com with 3 bullet points on the toughest technical problems you’ve solved.” The timing of the announcement is notable. At CES 2026, Nvidia unveiled Alpamayo, an open source AI model for autonomous driving that directly challenges Tesla’s FSD software. Musk commented on X that solving the long tail of rare edge cases in driving is “super hard,” adding: “I honestly hope they succeed.” Musk and several other AI executives have argued the future of data centers may lie off-planet, since Earth’s power grids are already strained to the max. Axios recently reported Musk rival and OpenAI CEO Sam Altman is also excited by the prospect of putting data centers into orbit. Musk has an edge over his peers because he already controls the launch vehicles. 
Per Axios, Musk plans to use SpaceX’s upcoming IPO to help finance his vision of using Starship to launch a constellation of compute satellites that can operate in constant sunlight, harvesting solar power 24/7. Still, there are many roadblocks to making AI data centers in space a possibility, not least the challenge of cooling high-power compute in a vacuum. Musk’s comments about Tesla building “space-based AI compute” fit a familiar pattern: float an idea that sounds far-fetched, then try to brute-force it into reality.
+==============
--- a/tests/summarization/https___techcrunch.com_2026_01_20_in-an-effort-to-protect-young-users-chatgpt-will-now-predict-how-old-you-are_.txt
+++ b/tests/summarization/https___techcrunch.com_2026_01_20_in-an-effort-to-protect-young-users-chatgpt-will-now-predict-how-old-you-are_.txt
@@ -0,0 +1,2 @@
+As concern for AI’s impact on young people continues to mount, OpenAI has introduced an “age prediction” feature into ChatGPT that is designed to help identify minors and put sensible content constraints on their conversations. OpenAI has been heavily criticized in recent years for the impacts that ChatGPT can have on children. A number of teen suicides have been linked to the chatbot, and, like other AI vendors, OpenAI has also been criticized for allowing ChatGPT to discuss sexual topics with young users. Last April, the company was forced to address a bug that allowed its chatbot to generate erotica for users who were under the age of 18. The company has already been working on its underage user problem for some time, and its new “age prediction” feature merely adds to protections already in place. The new feature leverages an AI algorithm that assesses user accounts for particular “behavioral and account-level signals,” in an effort to identify young users, OpenAI said in a blog post Tuesday. Those “signals” include things like the user’s stated age, the length of time an account has existed, and the times of day that the account is usually active, the company said. The company already has content filters designed to weed out discussions of sex, violence, and other potentially problematic topics for users who are under age 18. If the age prediction mechanism identifies an account as under 18, those filters are automatically applied. If a user is mistakenly designated as underage, there is a way for them to reestablish their “adult” account. They can submit a selfie to OpenAI’s ID verification partner Persona, OpenAI says.
+==============