This commit adds support for specifying a context size when running benchmarks. The value is passed to the Ollama client as the `num_ctx` option. The changes include:
- Updated the `run` method in the base benchmark class to accept an optional `context_size` parameter
- Modified the Ollama client call to include context size in the options when provided
- Updated the `run_benchmarks` function to accept and pass through the context size
- Added example usage to the help output showing how to use the new context size parameter
- Fixed prompt formatting in the summarization benchmark to use `text` instead of `task`
These changes make it possible to run benchmarks with a custom context size, which is useful for testing models with different context window limits.
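
The pass-through described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `build_options`, `Benchmark`, and `RecordingClient` are hypothetical names, and the use of a `generate`-style call is an assumption about how the benchmark talks to the client. The only detail taken from the description is that `context_size` ends up in the options as `num_ctx`, and only when provided.

```python
from typing import Any, Dict, Optional


def build_options(context_size: Optional[int] = None) -> Dict[str, Any]:
    """Build the options dict, adding num_ctx only when a size is given."""
    options: Dict[str, Any] = {}
    if context_size is not None:
        options["num_ctx"] = context_size
    return options


class Benchmark:
    """Hypothetical stand-in for the base benchmark class."""

    def __init__(self, client: Any, prompt: str) -> None:
        self.client = client
        self.prompt = prompt

    def run(self, model: str, context_size: Optional[int] = None) -> Any:
        # Assumption: the client exposes generate(model=..., prompt=..., options=...).
        return self.client.generate(
            model=model,
            prompt=self.prompt,
            options=build_options(context_size),
        )


class RecordingClient:
    """Fake client (illustration only) that records the arguments it receives."""

    def __init__(self) -> None:
        self.calls = []

    def generate(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(kwargs)
        return {"response": ""}


if __name__ == "__main__":
    client = RecordingClient()
    Benchmark(client, "Summarize this text.").run("some-model", context_size=8192)
    print(client.calls[0]["options"])
```

Leaving `num_ctx` out of the options when no size is given lets the model's default context length apply, so existing benchmark runs behave as before.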
- Added run.sh script with init, upd, run, and clean commands
- Updated README.md to document run.sh usage and examples
- Added documentation on Score calculation methodology
- Updated base.py to include score calculation logic