ai-benchmark

second_constantine/ai-benchmark

Commit Graph

Author

SHA1

Message

Date

second_constantine

f60dbf49f1

feat: Add context size support for benchmarks and update example usage

This commit adds support for specifying context size when running benchmarks, which is passed to the Ollama client as the `num_ctx` option. The changes include:

- Updated the `run` method in the base benchmark class to accept an optional `context_size` parameter
- Modified the Ollama client call to include context size in the options when provided
- Updated the `run_benchmarks` function to accept and pass through the context size
- Added example usage to the help output showing how to use the new context size parameter
- Fixed prompt formatting in the summarization benchmark to use `text` instead of `task`

The changes enable running benchmarks with custom context sizes, which is useful for testing models with different context window limitations.

2026-01-26 15:27:37 +03:00
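
The context-size change described in commit f60dbf49f1 can be illustrated roughly as follows. This is a minimal sketch, not the repository's code: `build_options`, the `Benchmark` class shape, and `RecordingClient` are hypothetical names introduced for illustration. The only details taken from the commit message are the optional `context_size` parameter on `run` and the `num_ctx` option key. The client is assumed to expose a `generate(model=..., prompt=..., options=...)` method, as the `ollama` Python client does.

```python
from typing import Any, Dict, List, Optional


def build_options(context_size: Optional[int] = None) -> Dict[str, Any]:
    """Build request options, adding num_ctx only when a size is given."""
    options: Dict[str, Any] = {}
    if context_size is not None:
        # num_ctx is the Ollama option that sets the context window size.
        options["num_ctx"] = context_size
    return options


class Benchmark:
    """Hypothetical stand-in for the base benchmark class."""

    def __init__(self, client: Any, model: str) -> None:
        self.client = client
        self.model = model

    def run(self, prompt: str, context_size: Optional[int] = None) -> Any:
        # Assumption: the client accepts model, prompt, and options keywords.
        return self.client.generate(
            model=self.model,
            prompt=prompt,
            options=build_options(context_size),
        )


class RecordingClient:
    """Test double that records the keyword arguments it receives."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def generate(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(kwargs)
        return {"response": ""}
```

Leaving `num_ctx` out of the options when no size is given lets the server fall back to its own default, which seems to match the "when provided" wording in the commit message.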

second_constantine

774d8fed1d

feat: add run.sh script and update documentation

- Added run.sh script with init, upd, run, and clean commands
- Updated README.md to document run.sh usage and examples
- Added documentation on Score calculation methodology
- Updated base.py to include score calculation logic

2026-01-16 22:30:48 +03:00

second_constantine

1a59adf5a5

feat: vibe code done

2026-01-16 19:58:29 +03:00

3 Commits