Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse

FòrumCAT

  1. Home
  2. Uncategorized
  3. LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

Scheduled Pinned Locked Moved Uncategorized
fedipactmetathreads
2 Posts 2 Posters 0 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • fedipact@cyberpunk.lolF This user is from outside of this forum
    fedipact@cyberpunk.lolF This user is from outside of this forum
    fedipact@cyberpunk.lol
    wrote last edited by
    #1

    LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

    "The tech giant is sidestepping guardrails that websites use to prevent being scraped, data show, in a move whistleblowers say is unethical and potentially illegal."

    ARTICLE: https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower

    FULL PDF: https://www.dropsitenews.com/api/v1/file/b3555944-e204-4f5e-9a64-e44281b19a82.pdf

    #FediPact #meta #threads #AI

    spla@mastodont.catS 1 Reply Last reply
    0
    • fedipact@cyberpunk.lolF fedipact@cyberpunk.lol

      LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

      "The tech giant is sidestepping guardrails that websites use to prevent being scraped, data show, in a move whistleblowers say is unethical and potentially illegal."

      ARTICLE: https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower

      FULL PDF: https://www.dropsitenews.com/api/v1/file/b3555944-e204-4f5e-9a64-e44281b19a82.pdf

      #FediPact #meta #threads #AI

      spla@mastodont.catS This user is from outside of this forum
      spla@mastodont.catS This user is from outside of this forum
      spla@mastodont.cat
      wrote last edited by spla@mastodont.cat
      #2

      @FediPact I did apply this nginx config to fight against it and many other IA bots and scrappers:

      https://github.com/kurren/ai-bots-crawlers

      returning 444 to them seems a good way to confuse them and decrease server load.

      1 Reply Last reply
      0
      Reply
      • Reply as topic
      Log in to reply
      • Oldest to Newest
      • Newest to Oldest
      • Most Votes


      • Login

      • First post
        Last post
      0
      • Categories
      • Recent
      • Tags
      • Popular
      • World
      • Users
      • Groups