llm-evaluation
by wshobson
Implements robust evaluation workflows for LLM applications using automated metrics, human feedback, and benchmarking. Ideal for teams testing LLM performance, comparing models, or validating AI improvements.
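As a rough illustration of the automated-metrics part of such a workflow, the sketch below scores a model's answers against references and averages the results. The dataset, `toy_model` stand-in, and exact-match metric are illustrative assumptions, not part of this skill's actual implementation.

```python
# Minimal sketch of an automated-metric evaluation loop for an LLM app.
# All names here (exact_match, evaluate, toy_model) are hypothetical.

def exact_match(prediction: str, reference: str) -> float:
    """Score 1.0 when the normalized prediction equals the reference."""
    return float(prediction.strip().lower() == reference.strip().lower())

def evaluate(predict, dataset, metric=exact_match) -> float:
    """Run `predict` over (prompt, reference) pairs and average the metric."""
    scores = [metric(predict(prompt), ref) for prompt, ref in dataset]
    return sum(scores) / len(scores)

# Stand-in for a real model call; a production harness would call an LLM API.
def toy_model(prompt: str) -> str:
    return {"capital of France?": "Paris"}.get(prompt, "unknown")

dataset = [("capital of France?", "Paris"), ("capital of Peru?", "Lima")]
print(evaluate(toy_model, dataset))  # 0.5: one exact match out of two
```

Swapping the metric function lets the same loop drive other automated scores (e.g., token overlap or an LLM-as-judge call), and running it against two `predict` functions gives a simple model comparison.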
Tags: Skill, Testing
