• spongebue@lemmy.world
    2 days ago

    So, I get that 256 is a power of 2. But we’re not running 8-bit servers or whatever here (and yes, I understand that’s not what 8-bit generally refers to). Is there some kind of technical limitation I’m not thinking of that would make 257 any more difficult to implement, or is it really just that 256 has a special place in someone’s heart because it’s a power of 2?

    • AbsolutelyNotAVelociraptor@sh.itjust.works
      2 days ago

      Because 256 is exactly the number of values one byte can hold. If you want to add a 257th member, you need a whole second byte just for that one person. That’s a waste of memory, unless you want to raise the limit all the way to the 64k (65,536) users per chat that two bytes allow.

      • Zagorath@aussie.zone
        2 days ago

        Except that they’re almost certainly just using int, which is almost certainly at least 32 bits.

        256 is chosen because the people writing the code are programmers. And just like regular people like multiples of 10, programmers like powers of 2. They feel like nice round numbers.

        • verstra@programming.dev
          2 days ago

          Well, no. It’s not certain that they’re using int; they might be using a more efficient data type.

          This might be for legacy reasons or it might be intentional because it might actually matter a lot. If I make up an example, chat_participant_id is definitely stored with each message and probably also in some index, so you can search the messages. Multiply this over all chats on WhatsApp, even the ones with only two people in, and the difference between u8 and u16 might matter a lot.
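
          To make the size difference concrete, here is a minimal sketch using Python's struct module. The header layout (a 64-bit message ID followed by the participant ID) and all names are made up for illustration; nothing here is WhatsApp's actual format.

```python
import struct

# Hypothetical per-message header: 64-bit message ID + participant ID.
# "<" means little-endian with no padding; Q = uint64, B = uint8,
# H = uint16, I = uint32.
HEADER_U8 = struct.Struct("<QB")
HEADER_U16 = struct.Struct("<QH")
HEADER_U32 = struct.Struct("<QI")


def header_sizes():
    """Return the packed header size in bytes for each participant-ID width."""
    return {
        "u8": HEADER_U8.size,
        "u16": HEADER_U16.size,
        "u32": HEADER_U32.size,
    }


if __name__ == "__main__":
    packed = HEADER_U8.pack(123456789, 255)  # 255 is the largest u8 value
    print(len(packed), header_sizes())
```

          With these assumed layouts the header is 9, 10, or 12 bytes, so the ID width alone changes every stored or transmitted message by one to three bytes.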

          But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

          • MyBrainHurts@lemmy.ca
            2 days ago

            > But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

            Shots fired.

            • jaybone@lemmy.zip
              2 days ago

              All these tough guys think you can’t bit shift in Java, never worked on a project with more than two people. Many such cases.

            • ByteJunk@lemmy.world
              2 days ago

              Fair point, but still better than wasting a nuclear power plant worth of electricity to solve math homework with an LLM

          • Zagorath@aussie.zone
            2 days ago

            > They are not certainly using int

            Probably why I said “almost certainly”. And I stand by that. We’re not talking about chat_participant_id, we’re talking about GROUP_CHAT_LIMIT, probably a constant somewhere. And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.
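
            A quick sketch of the bit-width point, plus what the "0 means 256" encoding could look like. The helper names are hypothetical and only illustrate the idea; this is not taken from any real codebase.

```python
import ctypes

GROUP_CHAT_LIMIT = 256


def bits_needed(value: int) -> int:
    """Minimum number of bits needed to store a non-negative integer."""
    return max(1, value.bit_length())


def encode_limit_u8(limit: int) -> int:
    """Bespoke encoding: squeeze a limit of 1..256 into one byte (256 -> 0)."""
    if not 1 <= limit <= 256:
        raise ValueError("limit must be in 1..256")
    return limit % 256


def decode_limit_u8(stored: int) -> int:
    """Inverse of encode_limit_u8: a stored 0 is read back as 256."""
    return stored if stored != 0 else 256


if __name__ == "__main__":
    print(bits_needed(255), bits_needed(GROUP_CHAT_LIMIT))  # 8 9
    print(ctypes.c_uint8(GROUP_CHAT_LIMIT).value)  # 0, because 256 wraps around
    print(decode_limit_u8(encode_limit_u8(GROUP_CHAT_LIMIT)))  # 256
```

            Every comparison against the limit would have to go through something like decode_limit_u8 first, which is the bespoke handling mentioned above.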

            • boonhet@sopuli.xyz
              2 days ago

              Orrrr they have a u8 chat_participant_id of some kind and a binary data format for message passing. The GROUP_CHAT_LIMIT const may have a bigger data type, but they may very well be trying to conserve 3 bytes per message. Ids can easily start at 0.

              150 gigs of bandwidth saved per day doesn’t seem like a whole lot at their scale, but if they archive all the metadata, that’s over 50 terabytes a year saved on storage - multiplied by how many copies they have of their data. Still not a lot tbh, but if they also conserve data in every other place they can, they could be saving petabytes per year in storage.
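
              The arithmetic behind those figures, as a rough sketch. The message volume is an assumption back-derived from the numbers above (150 GB divided by 3 bytes), not a published statistic, and units are decimal (1 GB = 10**9 bytes).

```python
BYTES_SAVED_PER_MESSAGE = 3  # 4-byte ID shrunk to 1 byte, as described above
# Assumption: 50 billion messages per day, implied by 150 GB / 3 bytes.
MESSAGES_PER_DAY = 50_000_000_000


def daily_savings_gb(messages_per_day: int = MESSAGES_PER_DAY,
                     bytes_saved: int = BYTES_SAVED_PER_MESSAGE) -> float:
    """Bytes saved per day, expressed in decimal gigabytes."""
    return messages_per_day * bytes_saved / 10**9


def yearly_savings_tb(copies: int = 1) -> float:
    """Yearly savings in decimal terabytes, times the number of stored copies."""
    return daily_savings_gb() * 365 * copies / 1000


if __name__ == "__main__":
    print(daily_savings_gb())    # 150.0
    print(yearly_savings_tb())   # 54.75
    print(yearly_savings_tb(3))  # 164.25
```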

              Still weird because then they’d have to reuse ids when people leave, otherwise you could join and leave 255 times to disable a group lol

            • Passerby6497@lemmy.world
              2 days ago

              > And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.

              I think you’re just very confused, friend, or misunderstanding how binary counting works, because why in the 9 hells would they be using 9 bits (512 possible values) to store 8 bits (256 possible members) of data?

              I think you’re confusing indexing (0-255) with counting (0-256), and mistakenly including a negation state (counting 0, which would be a null state for the variable) in your conception of the process. Because yes, index 255 is in fact count 256, and index 0 would actually be count 1. Index = count - 1.

              • Zagorath@aussie.zone
                2 days ago

                I’m imagining something like this:

                def add_member(group, user):
                    # Admit a new member only while the group is below the limit.
                    if len(group.members) < GROUP_CHAT_LIMIT:
                        ...
                

                If GROUP_CHAT_LIMIT is stored in 8 bits, this does not work, because the largest value 8 bits can hold is 255, not 256.

                • Passerby6497@lemmy.world
                  2 days ago

                  So add a +1 like you would for any index to count comparison?

                  I guess I’m failing to see how this doesn’t work as long as you properly handle the comparison logic. Maybe you can explain how this doesn’t work…
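
                  For what it's worth, here is a minimal sketch of the "+1" idea: keep the highest zero-based member index (255) in a single byte and compare against that. The names are hypothetical, and this is only one way such a check could be written.

```python
import ctypes

# 255 fits in one unsigned byte; the member limit is this index plus one.
MAX_MEMBER_INDEX = ctypes.c_uint8(255)


def can_add_member(current_count: int) -> bool:
    """True if admitting one more member keeps the group at 256 or fewer."""
    # A new member would receive the zero-based index `current_count`.
    return current_count <= MAX_MEMBER_INDEX.value


if __name__ == "__main__":
    print(can_add_member(255))  # True: the 256th member gets index 255
    print(can_add_member(256))  # False: index 256 does not fit in a byte
```

                  The trade-off is that the stored value no longer reads as the limit itself, so every place that uses it has to remember the off-by-one.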

        • Lodespawn@aussie.zone
          2 days ago

          It’ll have to do with packet headers; 8 bits is a lot for an instant message packet header.

        • ViatorOmnium@piefed.social
          2 days ago

          For high-volume wire formats, using uint8 instead of uint32 can make a huge difference when considering the big picture. Not everyone is working on bootcamp-level software.

        • jaybone@lemmy.zip
          2 days ago

          It’s not that they “like it”. It’s ultimately a hardware limitation. Of course we can have 64-bit integers, or however many bits. It’s an appealing optimization.

      • spongebue@lemmy.world
        2 days ago

        If each user is assigned a number as to where they’re placed in the group, I guess. But what happens when people are added and removed? If #145 leaves a full group, do #146 and beyond get decremented to make room for the new #256? (or #255 if zero-indexed). It just doesn’t seem like something you’d actually see in code not designed by a first-semester CS student.

        Also, more importantly, memory is cheap AF now 🤷‍♂️

        • SandmanXC@lemmy.world
          2 days ago

          While I completely agree with the sentiment, snorting too much “memory is cheap AF” could lead to terminal cases of Electron.

        • morphballganon@lemmynsfw.com
          2 days ago

          There would be no need to decrement later people because they’re definitely referred to using pointers. You’d just need to update the previous person’s pointer to the new next person.

          • spongebue@lemmy.world
            2 days ago

            If it’s a numeric ID (0-255) assigned to each person in the group, you’d either need to decrement later people or assign based on some kind of lowest available method, in which case you’d get kinda funny UX when new-member-Jerry can be #3 on the list because he’s taking over for old-member-Gerry, or he can be #255 because that’s the last spot.

            If we’re talking about pointers, I assume you mean a collection with up to 256 of them. In which case, there are plenty of collection data structures out there that wouldn’t really have a hard limit (and if you go with a basic array, wouldn’t that have a size limit of far more than 256 natively in pretty much any language?)
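
            The "lowest available" assignment for numeric IDs could be sketched like this. SlotAllocator is a hypothetical helper for illustration, not something any chat app is known to use.

```python
import heapq


class SlotAllocator:
    """Hands out the lowest free ID in 0..capacity-1 and recycles freed IDs."""

    def __init__(self, capacity: int = 256):
        # A sorted list already satisfies the min-heap invariant.
        self._free = list(range(capacity))
        self._used = set()

    def join(self) -> int:
        """Assign the lowest free slot to a new member."""
        if not self._free:
            raise RuntimeError("group is full")
        slot = heapq.heappop(self._free)
        self._used.add(slot)
        return slot

    def leave(self, slot: int) -> None:
        """Release a slot so a later member can reuse it."""
        if slot not in self._used:
            raise KeyError(slot)
        self._used.remove(slot)
        heapq.heappush(self._free, slot)


if __name__ == "__main__":
    group = SlotAllocator()
    ids = [group.join() for _ in range(256)]
    group.leave(3)       # old-member-Gerry leaves
    print(group.join())  # 3: new-member-Jerry takes over the freed slot
```

            Because freed IDs go back into the pool, repeatedly joining and leaving never uses up the ID space, at the cost of the slightly odd ordering described above.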

        • ViatorOmnium@piefed.social
          2 days ago

          Memory and network stop being cheap AF when you multiply them by a billion users. And WhatsApp is a mobile app that’s expected to work on the crappiest of networks and connections.

          • spongebue@lemmy.world
            2 days ago

            It is also used to transmit data including video. I don’t think an additional byte is noticeable on that kind of scale

    • mEEGal@lemmy.world
      2 days ago

      when writing somewhat low-level code, you always make assumptions about things. in this case, they chose to manage 256 entries in some array; the bound used to be lower.

      but implicitly there’s a tradeoff, probably memory / CPU utilisation in the server.

      it’s always about the tradeoff between what the users want, what is easier for you to maintain, what your infrastructure can provide, etc.

    • SparroHawc@lemmy.zip
      2 days ago

      There are often a lot of fun cheats you can use - bitwise operators, etc - if your numbers are small powers of two.

      Also, if you’re doing funky memory management tricks, it’s easier to organize memory when what you’re allocating fits nicely into the blocks available to you, which typically come in power-of-two sizes.

      They’re not necessarily great reasons if you’re using a language with sufficient abstraction, but it’s still easier in most instances to use powers of two anyways if you’re getting into the guts of things.
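
      A few of the power-of-two shortcuts mentioned above, as a small sketch. The function names are just illustrative.

```python
def is_power_of_two(n: int) -> bool:
    """A power of two has exactly one bit set, so n & (n - 1) clears it to 0."""
    return n > 0 and (n & (n - 1)) == 0


def wrap_index(i: int, size: int = 256) -> int:
    """Compute i % size with a bit mask; only valid when size is a power of two."""
    if not is_power_of_two(size):
        raise ValueError("size must be a power of two")
    return i & (size - 1)


def round_up_to_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, e.g. for picking a block size."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


if __name__ == "__main__":
    print(is_power_of_two(256), is_power_of_two(257))  # True False
    print(wrap_index(257))                             # 1
    print(round_up_to_power_of_two(200))               # 256
```

      None of this works for a limit like 257, where you would fall back to ordinary modulo and division.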

    • jaaake@lemmy.world
      2 days ago

      The issue isn’t storing each individual ID, it’s all of the networking operations that are done and total things that are stored/cached per user in each chat. All of those things are handled and stored as efficiently as possible. Sure they could set it to any number, but 256 is a nice round one when considering everything that is happening and the use cases involved. They have user research data and probably see that 128 is too close to a group size that happens with some regularity, but group sizes very rarely get close to 256, and 512 is right out.