• turmoil@feddit.org
      1 day ago

      The only explanation I can think of is that a higher number of users causes the load balancer to route requests to lower-parameter models. That way you'd get less reliable answers at peak usage times, but the service could handle more simultaneous users.
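
To make the idea concrete, here's a rough sketch of what that kind of load-based routing could look like. It's purely hypothetical: the function `pick_model`, the tier names, and the user thresholds are all made up for illustration, and nothing here reflects how any real provider actually routes requests.

```python
def pick_model(active_users, tiers):
    """Return the largest model whose user threshold covers the current load.

    tiers: list of (max_users, model_name) pairs, ordered from the largest
    model (lowest threshold) to the smallest model (highest threshold).
    """
    for max_users, model_name in tiers:
        if active_users <= max_users:
            return model_name
    # Load exceeds every threshold: fall back to the smallest model.
    return tiers[-1][1]


# Hypothetical tiers; names and thresholds are assumptions, not real values.
TIERS = [
    (1_000, "large-model"),
    (10_000, "medium-model"),
    (100_000, "small-model"),
]

print(pick_model(500, TIERS))     # quiet period
print(pick_model(50_000, TIERS))  # peak usage
```

With these made-up thresholds, a quiet period gets the large model and a peak gets the small one, which would match the "worse answers when it's busy" pattern.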