FòrumCAT

Can anyone tell me what is #NodeBB and why is it scraping and republishing fediverse content without consent?

nodebb
41 Posts 10 Posters 1 Views
  • dentangle@chaos.social

    @onepict

    Hi @ordnung

    Are you aware that posts are being scraped and reblogged by community.nodebb.org?

    The #nodebb software is reposting content, including unlisted posts and effectively making the fediverse searchable.

    This looks like a #Fediblock to me.

    alex@anarres.family
    wrote last edited by
    #21

    @dentangle @onepict @ordnung

    Forum software NodeBB joins the fediverse

    This might have something to do with it.

    I'm pretty sure a Fedi instance is supposed to cache posts. That is literally what one is supposed to do.

    (So I searched for some of my previously deleted accounts. They didn't cache any of those. It seems they're being reasonably good Fedi citizens and respecting deletes.)

    • dentangle@chaos.social

      @Gargron @jonny @onepict As far as I can tell all public and unlisted posts are being posted publicly on the web by nodebb and have been picked up by search engines.

      I realise everything on here is effectively "public" including DMs, but there has been strong resistance until now from the community to making the fediverse searchable.

      NodeBB has broken that expectation.

      dentangle@chaos.social
      wrote last edited by
      #22

      @Gargron @jonny @onepict Instantly updated and searchable 🙂

        onepict@chaos.social
        wrote last edited by
        #23

        @alex @dentangle @ordnung Well that explains why they just did it.

        It's just another thing to connect 🙄

          dentangle@chaos.social
          wrote last edited by
          #24

          @alex yes, it appears to be a forum that has recently patched in fediverse support without understanding or respecting our conventions.

            gargron@mastodon.social
            wrote last edited by
            #25

            @dentangle @jonny @onepict As I said, that page should have a noindex tag on it (if you know what that is), and I consider it an oversight that it doesn't. I let the NodeBB folks know about it a few minutes ago. However, the existence of this page is completely normal. The equivalent page on mastodon.social is mastodon.social/@dentangle@chaos.social, and it is how I can talk to you despite not having an account on chaos.social.
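
For readers unfamiliar with the noindex tag mentioned above, here is a minimal sketch of the idea in Python. It is not NodeBB's code: the function names and the `is_remote_profile` flag are hypothetical, made up for illustration. The only real-world pieces are the `<meta name="robots" content="noindex">` tag and the equivalent `X-Robots-Tag: noindex` response header, both of which ask search engines not to index a page.

```python
def robots_meta_tag(is_remote_profile: bool) -> str:
    """Return an HTML meta tag asking crawlers not to index the page.

    `is_remote_profile` is a hypothetical flag: True when the page shows
    an account that lives on another server (a cached, federated profile).
    """
    if is_remote_profile:
        return '<meta name="robots" content="noindex">'
    # Local profiles are left indexable in this sketch.
    return ""


def robots_headers(is_remote_profile: bool) -> dict[str, str]:
    """Return extra HTTP response headers carrying the same signal."""
    if is_remote_profile:
        return {"X-Robots-Tag": "noindex"}
    return {}
```

A page for a remote account would then carry the tag or header, while the page itself stays reachable for people following a link to it.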

              onepict@chaos.social
              wrote last edited by
              #26

              @Gargron @dentangle @jonny I'm aware of backfilling and profiles existing on fediverse instances. So are other folks.

              My main issue is it being searchable on search engines. Plus mushing everything together without respecting the public/quiet public stuff.

              Thank you Eugen for making them aware.

                dentangle@chaos.social
                wrote last edited by
                #27

                @Gargron @jonny @onepict Thanks. Yes, I understand. I do hope it is merely an "oversight" as you put it.

                Given the number of times we've had to slap down attempts to make the fediverse searchable it's astonishing that a fediverse developer wouldn't take more care. Mistake or not, it's a huge breach of trust.

                  thisismissem@hachyderm.io
                  wrote last edited by
                  #28

                  @dentangle @Gargron @jonny @onepict so at a protocol level "quiet public" doesn't really exist, all that happens in Mastodon is that as:Public gets moved from `to` to `cc`, so they're effectively the same audience being addressed.

                  So NodeBB is actually right, at a protocol level, to treat public and "quiet public" as the same.

                  Though it sounds like steps will be taken to prevent indexing & display (when unauthenticated) of remote content outside of the context of a thread (you can't exactly mark sections of a page as noindex)
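
The `to`/`cc` point above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from Mastodon or NodeBB: the function names are hypothetical, and the labels "public" and "quiet public" are just the terms used in this thread. The public collection URI and its compact forms `as:Public` and `Public` come from the ActivityPub specification.

```python
# The special "public" collection, in full and in its compact forms.
AS_PUBLIC = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def _as_list(value):
    """Addressing fields may be absent, a single string, or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def classify_visibility(obj: dict) -> str:
    """Label an object by where the public collection appears."""
    to = _as_list(obj.get("to"))
    cc = _as_list(obj.get("cc"))
    if any(addr in AS_PUBLIC for addr in to):
        return "public"
    if any(addr in AS_PUBLIC for addr in cc):
        return "quiet public"
    return "not public"


def addressed_to_public(obj: dict) -> bool:
    """True for both labels: the audience includes the public either way."""
    return classify_visibility(obj) != "not public"
```

A receiver that only asks `addressed_to_public` sees no difference between the two cases, which is the behaviour described in the post. Telling them apart takes the extra step of checking which field carries the public address.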

                    dentangle@chaos.social
                    wrote last edited by
                    #29

                    @thisismissem @Gargron @jonny @onepict

                    The problem, as Gargron identified, appears to be the lack of a "noindex" tag, which in Fediverse terms is like running an SMTP open relay - a misconfiguration rather than a fault in protocol - but which should not be the default in any software, and which will get you instablocked by the entire Internet.

                      thisismissem@hachyderm.io
                      wrote last edited by
                      #30

                      @dentangle @Gargron @jonny @onepict right, best practice is to not make remote content directly viewable without authentication (but it may still appear in thread/reply views without authentication)

                        dentangle@chaos.social
                        wrote last edited by
                        #31

                        @thisismissem @Gargron @jonny @onepict yes, where "best practice" == "if I don't want my instance defederated by the majority of the fediverse"

                          thisismissem@hachyderm.io
                          wrote last edited by
                          #32

                          @dentangle @Gargron @jonny @onepict the source of that best practice is more around rehosting random content and consequently having liability for that content.

                            dentangle@chaos.social
                            wrote last edited by
                            #33

                            @thisismissem @Gargron @jonny @onepict

                            That may be the case for some instance admins, but most users are not admins.

                            The bigger issue is that feeding fediverse toots into search engines violates conventions and the expectations of most users. That's what causes fedi-riots every time some bright spark does it.

                              dentangle@chaos.social
                              wrote last edited by
                              #34

                              Hi @julian

                              I know you're very busy sitting on panels at #Fedicon and talking about how to make the fediverse better. Great.

                              Unfortunately you are still running a scraper that is feeding search engines.

                              You've been posting from the con (tip: we use alt text on pictures here on the fediverse), so I know you're online.

                              You're following me, so you'll have seen my question. @Gargron has spoken to you too I believe.

                              A day later, no acknowledgement or apology or fix or promise of a fix. Why?

                                julian@community.nodebb.org
                                wrote last edited by
                                #35

                                Hi dentangle@chaos.social, I haven't been at a laptop all day, since 7am this morning.

                                Around then I added a change to the link tags sent for remote profiles so that they point to the canonical source (your actual profile).

                                I'll likely just put in a redirect to your profile so it won't be accessible.
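
The two measures described above, a canonical link tag and a redirect, might look roughly like this in Python. It is a hedged sketch and not the actual NodeBB change: both function names are hypothetical, the example URL is made up, and the 302 status is an assumption since the post does not say which redirect type would be used.

```python
import html


def canonical_link_tag(origin_url: str) -> str:
    """Build a <link rel="canonical"> tag pointing at the origin profile."""
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{html.escape(origin_url, quote=True)}">'


def redirect_to_origin(origin_url: str) -> tuple[int, dict[str, str]]:
    """Return a (status, headers) pair that sends visitors to the origin.

    The 302 status is an assumption made for this sketch.
    """
    return 302, {"Location": origin_url}
```

The canonical tag tells search engines which copy of a page is the original. The redirect goes further, because the cached copy is no longer served at all.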

                                  julian@community.nodebb.org
                                  wrote last edited by
                                  #36

                                  dentangle@chaos.social I appreciate your civility so far while I work through what needs to be done about this.

                                    deadsuperhero@social.wedistribute.org
                                    wrote last edited by
                                    #37

                                    @dentangle@chaos.social Quick question, what makes you think this is a scraper? NodeBB is forum software that implements ActivityPub and federates using the protocol.

                                      dentangle@chaos.social
                                      wrote last edited by
                                      #38

                                      @deadsuperhero It doesn't matter where the data is coming from, the effect is the same. Scraping done over AP is still scraping. The data (retrieved over AP in this case) is being republished without a "noindex" tag so it is being fed into search engines, including posts on your peertube server.

                                        dentangle@chaos.social
                                        wrote last edited by
                                        #39

                                        @julian Thank you for your response and taking this seriously.

                                        Please keep everyone informed. Feeding fediverse data to search engines (even accidentally, as this appears to be) is a breach of trust. How you handle this now is likely to be remembered by the fediverse for a long time.

                                          julian@community.nodebb.org
                                          wrote last edited by
                                          #40

                                          dentangle@chaos.social the noindex tag has been added to all remote profiles.
