Quickstart
Start your journey on BenchFlow!
Choose your role
BenchFlow aims to be the bridge between benchmark users and benchmark developers. Please choose a role to start your BenchFlow journey.
For benchmark users
Install the benchflow SDK
Select your benchmarks
Discover benchmarks tailored to your needs on Benchmark Hub.
Implement your call_api
Extend the BaseAgent interface. call_api is used to call your intelligence (LLM, agent, …).
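A minimal sketch of this step follows. The BaseAgent class here is a local stand-in defined so the example runs on its own; the real SDK's base class and the exact signature of call_api may differ. The EchoAgent name, the task_step_inputs parameter, and its "instruction" key are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseAgent(ABC):
    """Local stand-in for the SDK's BaseAgent (assumed shape, not the real class)."""

    @abstractmethod
    def call_api(self, task_step_inputs: Dict[str, Any]) -> str:
        """Return the agent's answer for one benchmark step."""


class EchoAgent(BaseAgent):
    """Hypothetical agent; replace the body of call_api with a call to your LLM or agent."""

    def call_api(self, task_step_inputs: Dict[str, Any]) -> str:
        # "instruction" is an assumed field name; each benchmark documents its own fields.
        instruction = task_step_inputs.get("instruction", "")
        return f"echo: {instruction}"


agent = EchoAgent()
reply = agent.call_api({"instruction": "Say hello"})
```

Keeping all model access inside call_api means the rest of the agent does not need to know which benchmark is calling it.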
Run your benchmark
Run your benchmark in a separate script.
Kickstart your free BenchFlow trial on BenchFlow.ai to unlock benchmarking insights.
For benchmark developers
Install the benchflow SDK
Make your benchmark a client
Containerize your benchmark
Package your benchmark as an image and provide an entry point to run the benchmark.
Please configure your Docker image to target the Linux platform. We plan to support additional platforms in future releases.
Extend BaseBench to run your benchmarks
Implement your subclass in benchflow_interface.py and upload it to Benchmark Hub.
There are six methods to implement. Five of them are simple and can often be implemented with a single return statement. The only one that may take some time is the get_result method.
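The split between one-line methods and get_result might look like the sketch below. BaseBench here is a local stand-in so the example runs on its own, and only two of the six methods are shown. The get_image_name method, the raw_output parameter, the image name, and the result fields (is_resolved, score, message) are all assumptions for illustration; check the SDK for the real method names and signatures.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseBench(ABC):
    """Local stand-in for the SDK's BaseBench (assumed shape; only two methods shown)."""

    @abstractmethod
    def get_image_name(self) -> str:
        """Hypothetical one-line method: name of the benchmark's container image."""

    @abstractmethod
    def get_result(self, raw_output: str) -> Dict[str, Any]:
        """Turn whatever the benchmark run produced into a result record."""


class MyBench(BaseBench):
    def get_image_name(self) -> str:
        # Placeholder image name, not a real image.
        return "example/my-benchmark:latest"

    def get_result(self, raw_output: str) -> Dict[str, Any]:
        # Assumes the benchmark emits a JSON object with a numeric "score" field.
        try:
            data = json.loads(raw_output)
            return {"is_resolved": True, "score": float(data["score"]), "message": ""}
        except (ValueError, KeyError, TypeError) as exc:
            # Report malformed or missing output instead of raising.
            return {"is_resolved": False, "score": 0.0, "message": str(exc)}
```

Returning a failure record rather than raising keeps one broken task from aborting a whole run, which is one reasonable design choice among several.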
Upload your benchmark to Benchmark Hub
Here’s your checklist:
- benchflow_interface.py - ensure your file is named correctly.
- readme.md - this should showcase the field formats provided in the prepare_input method from Step 2, along with detailed descriptions.