Choose your role
BenchFlow aims to be the bridge between benchmark users and benchmark developers. Please choose a role to start your BenchFlow journey.
For benchmark users
1
Install the benchflow sdk
2
Select your benchmarks
Discover benchmarks tailored to your needs on Benchmark Hub.
3
Implement your call_api
Extend the BaseAgent interface. call_api is used to call your intelligence (LLM, Agent …).
YourAgent.py
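As a minimal sketch of this step, the agent below subclasses a local stand-in for BaseAgent so the example runs without the SDK installed. The call_api signature (a dictionary of step inputs in, a string out) and the "prompt" field name are assumptions for illustration; check the SDK and your chosen benchmark for the actual shapes.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseAgent(ABC):
    """Local stand-in for the SDK's BaseAgent (assumed shape, not the real class)."""

    @abstractmethod
    def call_api(self, task_step_inputs: Dict[str, Any]) -> str:
        """Return the agent's answer for one benchmark step."""


class YourAgent(BaseAgent):
    def call_api(self, task_step_inputs: Dict[str, Any]) -> str:
        # "prompt" is a hypothetical field name; each benchmark defines its own inputs.
        prompt = str(task_step_inputs.get("prompt", ""))
        # Replace this echo with a call to your LLM or agent.
        return f"echo: {prompt}"


agent = YourAgent()
reply = agent.call_api({"prompt": "2 + 2 = ?"})
print(reply)
```

In a real project you would import BaseAgent from the benchflow package instead of defining the stand-in, and keep only the YourAgent class in YourAgent.py.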
4
Run your benchmark
Run your benchmark in a separate script.
Kickstart your free BenchFlow trial on BenchFlow.ai to unlock benchmarking insights.
For benchmark developers
1
Install the benchflow sdk
2
Make your benchmark a client
3
Containerize your benchmark
Package your benchmark as an image and provide an entry point to run the benchmark.
Please configure your Docker image to target the Linux platform. We plan to support additional platforms in future releases.
4
Extend BaseBench to run your benchmarks
Implement your subclass in benchflow_interface.py and upload it to Benchmark Hub.
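A rough sketch of what such a subclass might look like follows. It uses a local stand-in for BaseBench so it runs without the SDK, and every method name except get_result, along with the image name, container paths, and result-file layout, is an assumption made for illustration. The point is the shape: most methods are one-line returns, while get_result has to locate and parse whatever your benchmark writes out.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List


class BaseBench(ABC):
    """Local stand-in; method names other than get_result are assumed."""

    @abstractmethod
    def get_args(self, task_id: str) -> Dict[str, Any]: ...
    @abstractmethod
    def get_image_name(self) -> str: ...
    @abstractmethod
    def get_results_dir_in_container(self) -> str: ...
    @abstractmethod
    def get_log_files_dir_in_container(self) -> str: ...
    @abstractmethod
    def get_all_tasks(self, split: str) -> List[str]: ...
    @abstractmethod
    def get_result(self, task_id: str) -> Dict[str, Any]: ...


class MyBench(BaseBench):
    def __init__(self, host_results_dir: str) -> None:
        # Directory on the host where the container's results end up (assumed).
        self.host_results_dir = Path(host_results_dir)

    # The simple methods: each is a single return statement.
    def get_args(self, task_id: str) -> Dict[str, Any]:
        return {"TASK_ID": task_id}

    def get_image_name(self) -> str:
        return "example/my-benchmark:latest"  # hypothetical image name

    def get_results_dir_in_container(self) -> str:
        return "/app/results"  # hypothetical path

    def get_log_files_dir_in_container(self) -> str:
        return "/app/logs"  # hypothetical path

    def get_all_tasks(self, split: str) -> List[str]:
        return ["0", "1", "2"]

    # The one method with real logic: read and interpret the result file.
    def get_result(self, task_id: str) -> Dict[str, Any]:
        result_file = self.host_results_dir / f"{task_id}.json"
        if not result_file.exists():
            return {"task_id": task_id, "completed": False, "score": None}
        data = json.loads(result_file.read_text())
        return {"task_id": task_id, "completed": True, "score": data.get("score")}


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "0.json").write_text(json.dumps({"score": 1.0}))
    bench = MyBench(tmp)
    found = bench.get_result("0")
    missing = bench.get_result("1")
```

Handling the missing-file case explicitly is a design choice worth keeping: a crashed or timed-out run then shows up as an incomplete task rather than as an exception in result collection.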
There are six methods to implement. Five of them are simple and can often be implemented with a single return statement. The only one that might take some time is the get_result method.
5
Upload your benchmark to Benchmark Hub
Here’s your checklist:
- benchflow_interface.py - ensure your file is named correctly.
- readme.md - This should showcase the field formats provided in the prepare_input method from Step 2, along with detailed descriptions.
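To illustrate what "field formats" means for the readme, here is a hedged sketch. In the SDK, prepare_input is a method on your benchmark client; it is shown as a plain function here for brevity. The raw input keys and output field names are hypothetical. The idea is that every field prepare_input returns gets a matching type and description in readme.md, so agent authors know what their call_api will receive.

```python
from typing import Any, Dict


def prepare_input(raw_step_inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a raw benchmark step to the fields passed on to the agent."""
    return {
        "task_id": str(raw_step_inputs["id"]),
        "question": raw_step_inputs["question"].strip(),
        # Note: the gold answer is deliberately not forwarded to the agent.
    }


# The same fields, described the way a readme.md could document them.
README_FIELDS = {
    "task_id": "str, unique identifier of the task",
    "question": "str, the prompt text shown to the agent",
}

step = prepare_input({"id": 7, "question": "  What is 2 + 2?  ", "gold": "4"})
print(step)
```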