• tempest@lemmy.ca
      English
      arrow-up
      10
      ·
      4 hours ago

      CloudFlare has become an Internet protection racket and I’m not happy about it.

      • Laser@feddit.org
        English
        arrow-up
        4
        ·
        3 hours ago

        It’s been this from the very beginning. But they don’t fit the definition of a protection racket as they’re not the ones attacking you if you don’t pay up. So they’re more like a security company that has no competitors due to the needed investment to operate.

  • Amberskin@europe.pub
    English
    arrow-up
    45
    ·
    10 hours ago

    Uh, are they admitting they are trying to circumvent technological protections set up to restrict access to a system?

    Isn’t that a literal computer crime?

  • Electricd@lemmybefree.net
    English
    arrow-up
    4
    arrow-down
    2
    ·
    edit-2
    7 hours ago

    They do have a point though. It would be great to let per-prompt searches go through, but not mass scraping.

    I believe a lot of websites don’t want either, though.

      • Electricd@lemmybefree.net
        English
        arrow-up
        1
        ·
        3 hours ago

        I assume their script does some search engine stuff, like querying Google or Bing, and then “scrapes” the links it visits.

        Some Selenium stuff.

    • sunbeam60@lemmy.ml
      English
      arrow-up
      3
      ·
      2 hours ago

      They’re not. They’re using this as an excuse to become paid gatekeepers of the internet as we know it. All that’s happening is that Cloudflare is using this to maneuver into a position where they can say “nice traffic you’ve got there - would be a shame if something happened to it”.

      AI companies are crap.

      What Cloudflare is doing here is also crap.

      And we’re cheering it on.

  • kreskin@lemmy.world
    English
    arrow-up
    12
    arrow-down
    3
    ·
    edit-2
    13 hours ago

    They can’t get their AI to check a box that says “I am not a robot”? I’d think that’d be a first-year comp sci student level task. And robots.txt files were basically always voluntary compliance anyway.

    • Dr. Moose@lemmy.world
      English
      arrow-up
      15
      arrow-down
      1
      ·
      9 hours ago

      Cloudflare actually fully fingerprints your browser and even sells that data. That’s your IP, TLS, operating system, full browser environment, installed extensions, GPU capabilities, etc. It’s all tracked before the box even shows up; in fact, the box is there to give the runtime more time to fingerprint you.

      • tempest@lemmy.ca
        English
        arrow-up
        4
        arrow-down
        1
        ·
        4 hours ago

        Yeah and the worst part is it doesn’t fucking work for the one thing it’s supposed to do.

        The only thing it does is stop the stupidest low-effort scrapers and force the good ones to use a browser.

  • Wispy2891@lemmy.world
    English
    arrow-up
    10
    ·
    edit-2
    16 hours ago

    Here comes the ridiculous offer to buy Google Chrome with money they don’t have: easy, delicious scraping directly from the user source.

  • Kissaki@feddit.org
    English
    arrow-up
    98
    arrow-down
    1
    ·
    edit-2
    1 day ago

    Perplexity argues that a platform’s inability to differentiate between helpful AI assistants and harmful bots causes misclassification of legitimate web traffic.

    So, I assume Perplexity uses appropriate identifiable user-agent headers, to allow hosters to decide whether to serve them one way or another?
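
For illustration, a minimal sketch of what an identifiable user agent looks like on the client side, using only Python's standard library. The bot name, version, and info URL are hypothetical placeholders, not what Perplexity or any other vendor actually sends.

```python
from urllib.request import Request

# Hypothetical identity string: product name, version, and a contact/info URL
# so that a site operator can recognize the client and decide how to treat it.
BOT_USER_AGENT = "ExampleAssistantBot/1.0 (+https://example.com/bot-info)"


def build_request(url: str) -> Request:
    """Build (but do not send) a request that identifies the client honestly."""
    return Request(url, headers={"User-Agent": BOT_USER_AGENT})


if __name__ == "__main__":
    req = build_request("https://example.com/article")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

With a stable, documented string like this, a site operator can allow, rate-limit, or block the client by name instead of guessing from traffic patterns.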

    • ubergeek@lemmy.today
      English
      arrow-up
      7
      ·
      8 hours ago

      And I’m assuming that if the robots.txt states their User-Agent isn’t allowed to crawl, it obeys it, right? :P

      • Kissaki@feddit.org
        English
        arrow-up
        2
        ·
        7 hours ago

        No, as per the article, their argument is that they are not web crawlers generating an index; they are user-action-triggered agents working live for the user.

        • ubergeek@lemmy.today
          English
          arrow-up
          1
          ·
          5 hours ago

          Except, it’s not a live user hitting 10 sites all at the same time, trying to crawl the entire site… Live users cannot do that.

          That said, if my robots.txt forbids them from hitting my site, as a proxy, they obey that, right?
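
For reference, a minimal sketch of what honoring robots.txt looks like from the client side, using Python's standard `urllib.robotparser`. The robots.txt content and the bot name `ExampleAssistantBot` are invented for illustration; compliance is voluntary, so this only shows what a well-behaved client would check before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is banned entirely,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAssistantBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAssistantBot", "SomeOtherBot"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

In a real client the rules would be fetched from the site's own `/robots.txt` (for example via `RobotFileParser.set_url` and `read`) rather than hard-coded.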

    • Dr. Moose@lemmy.world
      English
      arrow-up
      2
      arrow-down
      4
      ·
      8 hours ago

      It’s not up to the hoster to decide whom to serve content to. The web is intended to be user-agent agnostic.

    • lime!@feddit.nu
      English
      arrow-up
      34
      ·
      1 day ago

      yeah it’s almost like there was already a system for this in place

  • Dr. Moose@lemmy.world
    English
    arrow-up
    7
    arrow-down
    15
    ·
    9 hours ago

    It’s insane that anyone would side with Cloudflare here. To this day I can’t visit many websites like nexusmods just because I run Firefox on Linux. The Cloudflare Turnstile just refreshes infinitely and has been doing so for months now.

    Cloudflare is the biggest cancer on the web, fucking burn it.

    • CatDogL0ver@lemmy.world
      English
      arrow-up
      1
      ·
      3 hours ago

      It happened to me before, until I did a Google search. It was my VPN’s web protection. It was too “overprotective”.

      Check your security settings, antivirus, and VPN.

    • Dremor@lemmy.world
      English
      arrow-up
      22
      arrow-down
      1
      ·
      8 hours ago

      Linux and Firefox here. No problem at all with Cloudflare, despite having more or less as many privacy-preserving add-ons as possible. I even spoof my user agent to the latest Firefox ESR on Linux.

      Something must be wrong with your setup.

      • COASTER1921@lemmy.ml
        English
        arrow-up
        3
        arrow-down
        1
        ·
        4 hours ago

        I suspect a lot of it comes down to your ISP. Like the original commenter, I also frequently can’t pass the Cloudflare Turnstile when on Wi-Fi, although refreshing the page a few times usually gets me through. Worst case, I switch to my phone’s hotspot, where I pass much more consistently. It’s super annoying and, combined with their recent DNS outage, has totally ruined any respect I had for Cloudflare.

        Interesting video on the subject: https://youtu.be/SasXJwyKkMI

      • Dr. Moose@lemmy.world
        English
        arrow-up
        5
        arrow-down
        5
        ·
        7 hours ago

        That’s not how it works. Cloudflare uses thousands of variables to estimate a trust score and block people, so just because it works for you doesn’t mean it works for everyone.

        • Dremor@lemmy.world
          English
          arrow-up
          6
          arrow-down
          1
          ·
          edit-2
          6 hours ago

          Same goes the other way. Just because it doesn’t work for you doesn’t mean it should go away.

          That technology has its uses, and Cloudflare is probably aware that there are still some false positives, and is probably working on it as we write.

          The decision is the website owner’s to make, weighing the advantage of filtering out a majority of bots against the disadvantage of losing some legitimate traffic to false positives. If you get a Cloudflare challenge, chances are that they decided the former vastly outweighs the latter.

          Now there are some self-hosted alternatives, like Anubis, but business clients prefer SaaS like Cloudflare to having to maintain their own software. Once again, it is their choice and their right to do so.

          • Dr. Moose@lemmy.world
            English
            arrow-up
            4
            arrow-down
            4
            ·
            6 hours ago

            lmao imagine shilling for corporate Cloudflare like this. Also, false positives and false negatives are fundamentally not equal.

            “Cloudflare is probably aware that there are still some false positives, and is probably working on it as we write.”

            The main issue with Cloudflare is that it’s mostly bullshit. It does not report any stats to the admins on how many users were rejected or any false positive rates, and happily puts everyone under the “evil bot” umbrella. So people from low trust score environments like Linux, or with IPs from poorer countries, are at a significant disadvantage and left without a voice.

            I’m literally a security dev working with Cloudflare anti-bot myself (not by choice). It’s a useful tool for corporate but a really fucking bad one for the health of the web, much worse than any LLM agent or crawler, period.

            • Laser@feddit.org
              English
              arrow-up
              1
              arrow-down
              1
              ·
              3 hours ago

              “So people from low trust score environments like Linux”

              Linux user here, Cloudflare hasn’t blocked access to a single page for me unless I use a VPN, which then can trigger it.

    • dodos@lemmy.world
      English
      arrow-up
      15
      arrow-down
      1
      ·
      8 hours ago

      I’m on Linux with Firefox and have never had that issue before (particularly with nexusmods, which I use regularly). Something else is probably wrong with your setup.

      • Dr. Moose@lemmy.world
        English
        arrow-up
        1
        arrow-down
        7
        ·
        7 hours ago

        “Wrong with my setup” - that’s not how the internet works.

        I’m based in Southeast Asia and often work on the road, so IP rating is probably the final straw in my fingerprint score.

        Either way, this should in no way be acceptable.

        • JcbAzPx@lemmy.world
          English
          arrow-up
          1
          arrow-down
          1
          ·
          4 hours ago

          That is exactly how the internet works. That’s always how the internet has worked.

    • Leon@pawb.social
      English
      arrow-up
      15
      ·
      1 day ago

      I’m still holding out for Stephen Hawking to mail out Demon Summoning programs.