Today's approaches to serving models are static, resource-blind, and unaware of AI workload characteristics, making it hard to observe and scale LLM inference.

Introduction to llm-d

An open-source, Kubernetes-native framework for distributed LLM inference
