https://hub.docker.com/r/sciactive/nephele

In the latest version of Nephele, you can now create a WebDAV server that deduplicates files that you add to it.

I created this feature because every night at midnight, my Minecraft world that my friends and I play on gets backed up. Our world has grown to about 5 GB, but every night, the same files get backed up over and over. It’s a waste of space to store the same files again and again, but I want the ability to roll back our world to any day in the past.

So with this new feature of Nephele, I can upload the Minecraft backup and only the files that have changed will take up additional space. It’s like having infinite incremental backups that never need a full backup after the first time, and can be accessed instantly.

Nephele will only delete a file from the file storage once all copies that share the same file contents have been deleted, so unlike with most incremental backup solutions, you can delete previous backups easily and regain space.
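
Under the hood, the idea is roughly this (a simplified sketch for illustration only, not Nephele’s actual code): file contents are stored once under a content hash, every path that holds those contents is just a reference to that blob, and the blob is only removed when its last reference goes away.

```typescript
// Illustrative sketch only -- hypothetical names, and an in-memory Map standing
// in for Nephele's real on-disk or S3 blob storage.
import { createHash } from "node:crypto";

class DedupStore {
  private blobs = new Map<string, Buffer>();     // content hash -> file contents
  private refCounts = new Map<string, number>(); // content hash -> how many paths use it
  private paths = new Map<string, string>();     // logical path -> content hash

  put(path: string, contents: Buffer): void {
    const hash = createHash("sha256").update(contents).digest("hex");
    this.remove(path); // overwriting a path drops its old reference first
    if (!this.blobs.has(hash)) {
      this.blobs.set(hash, contents); // only the first copy actually stores data
    }
    this.refCounts.set(hash, (this.refCounts.get(hash) ?? 0) + 1);
    this.paths.set(path, hash);
  }

  get(path: string): Buffer | undefined {
    const hash = this.paths.get(path);
    return hash ? this.blobs.get(hash) : undefined;
  }

  remove(path: string): void {
    const hash = this.paths.get(path);
    if (!hash) return;
    this.paths.delete(path);
    const remaining = (this.refCounts.get(hash) ?? 1) - 1;
    if (remaining <= 0) {
      this.refCounts.delete(hash);
      this.blobs.delete(hash); // last reference gone: reclaim the space
    } else {
      this.refCounts.set(hash, remaining);
    }
  }
}

// Two nightly backups of the same region file cost the space of one blob,
// and deleting either backup leaves the other readable.
const store = new DedupStore();
const region = Buffer.from("minecraft region data");
store.put("backups/2024-01-01/r.0.0.mca", region);
store.put("backups/2024-01-02/r.0.0.mca", region);
store.remove("backups/2024-01-01/r.0.0.mca");
console.log(store.get("backups/2024-01-02/r.0.0.mca")?.length); // still readable
```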

Edit: So, I think my post is causing some confusion. I should make it clear that my use case is specific to me. This is a general-purpose deduplicating file server. It will take any files you give it and deduplicate them in its storage. It’s not a backup system, and it’s not a versioning system. My use case is only one of many you can use a deduplicating file server for.

  • oni@lemmy.ml · 6 months ago

    That’s a great feature! Currently I am way too invested in the Nextcloud ecosystem to try anything else. But if I ever want to leave Nextcloud for a simple WebDAV server, this will be the one I consider!

    • oni@lemmy.ml · 6 months ago

      I looked into it a bit more, and the fact that I can put it on top of S3-compatible storage, with encryption and deduplication, is huge. I am seriously considering it as a backup system now.

    • hperrin@lemmy.world (OP) · 6 months ago

      Not at all. Btrfs snapshots:

      • aren’t accessible unless you revert to them
      • only happen when you manually trigger them
      • don’t deduplicate files in the file system, just across snapshots
      • are handled at the file-system level (meaning you’d have to create a separate file system, or at least a separate subvolume if you’re already using btrfs, to make them with an exclusive set of files)
      • don’t have access controls beyond Linux’s basic file permissions (so sharing a server will be complicated)
      • aren’t served across the network (you can serve a btrfs file system, but then you can’t access a previous snapshot)
      • aren’t portable (you can’t just copy a set of files to a new server, you have to image the partition)

      They serve a very different purpose than a deduplicating file server. Now, there are other deduplicating file servers, but I don’t know of any that are open source and run on Linux.

      • poVoq@slrpnk.net · 6 months ago

        Uhm, I think you need to do better research as most of the above isn’t true.

          • poVoq@slrpnk.net · 6 months ago

            Points 1, 2, 6, and 7 are wrong, and the others are partially wrong and/or can be easily solved with other existing tools.

            • hperrin@lemmy.world (OP) · 6 months ago

              Can you explain to me then:

              • How do you access the files in a previous snapshot without reverting to it?
              • How does btrfs automatically make its own snapshots?
              • How does btrfs serve the contents of previous snapshots across the network?
              • How can I copy the contents of all previous snapshots at once without imaging the partition?

              If you’re using other tools on top of btrfs to implement a deduplicating file server, then you can’t say I reinvented btrfs snapshots, can you?

              I don’t know how much clearer I can make the distinction between a copy-on-write file system and a deduplicating file server. They are completely different things for completely different purposes. The only thing they have in common is that they will deduplicate data, but a COW FS only deduplicates data under certain conditions. My server will deduplicate every file across its entire file store.

              I get that people on Lemmy love to shit on other people’s accomplishments. I’ve never posted anything on here without it being criticized, but saying I “reinvented btrfs snapshots” is quite possibly the worst, most inaccurate take anyone has ever had on any of my posts.

              • poVoq@slrpnk.net · 6 months ago

                Snapshots are accessible in read-only mode without reverting to them, snapshots can easily be configured to be taken automatically with a simple cron job, btrfs allows full control of snapshots over SSH, and you can easily copy a snapshot to another btrfs filesystem on the same or a remote server.

                Also, btrfs follows the Unix philosophy, so of course you will be using additional tools with it, but btrbk, for example, makes all of the above really easy on its own.
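
                Something like this in a cron file is all it takes (the paths are just an example; btrbk adds retention and transfer handling on top):

                ```
                # /etc/cron.d/btrfs-snapshots -- example only, adjust subvolume paths to your setup.
                # Take a read-only, dated snapshot of the data subvolume every night at midnight.
                0 0 * * * root btrfs subvolume snapshot -r /srv/data /srv/.snapshots/$(date +\%F)
                ```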

                Obviously there are differences, but serving WebDAV on top of a btrfs filesystem is very similar to what you have made.

                • hperrin@lemmy.world (OP) · 6 months ago

                  It very much is not. Again, btrfs will only deduplicate data under certain circumstances, like if you copy a file to a new location. If I take a USB stick with an 8 GB movie file on it and copy that to btrfs twice, it will take up 16 GB on disk. If I copy it to btrfs once, then copy it from there to a new location, it will take up 8 GB on disk. Btrfs does not deduplicate files, it deduplicates copies. I want something that deduplicates files.
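
                  To illustrate (hypothetical paths, with /mnt/btrfs being a btrfs mount):

                  ```sh
                  # Copying in from the USB stick writes the full data both times, because the
                  # source is on another filesystem and there are no extents to share:
                  cp /media/usb/movie.mkv /mnt/btrfs/movie-1.mkv                    # ~8 GB of new data
                  cp /media/usb/movie.mkv /mnt/btrfs/movie-2.mkv                    # another ~8 GB

                  # Copying within btrfs can share extents instead of duplicating them:
                  cp --reflink=always /mnt/btrfs/movie-1.mkv /mnt/btrfs/movie-3.mkv # ~no new data
                  ```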

                  If you run WebDAV on top of btrfs and try what I’m using it for, it literally will not deduplicate anything, because you’re always writing new files to it, not copying existing files.

                  Triggering a snapshot with a cron job doesn’t mean it’s automatic to btrfs. The action still happens only when triggered. Btrfs doesn’t take snapshots for you.

                  What good is management through SSH? I want a deduplicating file server, not a versioning file system I have to manage over SSH. If I wanted versioning like that, I would just use git.

                  And again, adding tools on top of btrfs to recreate something similar to what I’ve made here does not mean I reinvented btrfs. Btrfs is a COW FS. I wrote a deduplicating file server. I honestly can’t believe you don’t see the difference here. Like, are you trolling?

                  I feel like you misinterpreted my post to mean that my use case is the only thing you could use my server for, and you’re just running with it, even though I’ve told you multiple times, I wrote a deduplicating file server, not an incremental backup system, and not a versioning system. The fact that I’m using it for incremental backups is inconsequential to what it actually does. It deduplicates files and serves them from WebDAV. AFAIK, there’s no other open source server that does that.