Distributed block and object storage

samuel · 17 March 2014 at 02:23

A very interesting and detailed discussion of the question: how do you organize the structures of an endless, ever-growing stream of data? Short summary: without hierarchies, via objects that are coupled to metadata and can therefore be linked and indexed arbitrarily, and all of that without a database directly at the storage-medium level. http://chucksblog.emc.com/chucks_blog/2009/08/the-future-doesnt-have-a-file-system.html
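To make the idea of hierarchy-free objects with attached metadata a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the names `FlatObjectStore`, `put`, `get` and `find` are hypothetical, the store lives in memory only, and deriving the object ID from a SHA-256 content hash is just one possible choice, not something the linked article prescribes.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical in-memory store: one flat namespace, no directories."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: the object ID is derived from the content, not a path.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self._objects[object_id]

    def find(self, **criteria):
        # Linear scan over the metadata; a real system would keep an index.
        return sorted(
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = FlatObjectStore()
oid = store.put(b"hello", kind="note", owner="samuel")
print(store.get(oid).metadata)
print(store.find(kind="note") == [oid])
```

The point of the sketch is only that an object is located by its ID or by its metadata, never by its position in a folder tree.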

samuel · 17 March 2014 at 08:05

Here the whole thing is presented nicely in visual form. I would like to touch on all of this again in my blog post, though, which will certainly take me a while yet, since the thoughts should be well organized.

daniel · 19 March 2014 at 16:47

There is already a widely used variant: http://www.openstack.org/software/openstack-storage/

We should take a closer look at it. It could take care of several of our core problems. Interesting aspect: Amazon S3 compatibility.

The discussion under the article you linked gives a very nice example of why a paradigm shift is needed:

I agree with the general sentiment of the article. Most hierarchies are inherently flawed due to cross-cutting concerns. (Maybe a misuse of the term, but I think it works) If I’m organizing my pictures, I might choose date as the primary dimension and create a folder structure like: 2009/August/Vision Scuba Trip

But what if I want to view all pictures of a certain person, or all my Scuba Diving pictures. These are separate hierarchies that cut across my primary one. I could try creating these other directory hierarchies and symbolically linking to files from them, but this is cumbersome. Besides, you want a real database behind the scenes so you can efficiently sort and search your data.

You might say this problem is up to your photo software to solve. But I disagree. There is a need for a standard file database that is core to the operating system. Once you develop such a standard a whole new world of computing opens up. Suddenly you have the ability to create and explore data relationships in much more powerful ways.
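The commenter's photo example can be sketched with a small inverted index over metadata. This is only an illustration under stated assumptions: `TagIndex`, its methods, the file names and the attribute names are all hypothetical, and an in-memory dictionary stands in for the "real database behind the scenes" that the quote asks for.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical inverted index: (key, value) -> set of file names."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, name: str, **attributes):
        for key, value in attributes.items():
            # An attribute may carry one value or several (e.g. people).
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for v in values:
                self._index[(key, v)].add(name)

    def query(self, **attributes):
        # Intersect the sets of files matching each requested attribute.
        sets = [self._index.get((k, v), set()) for k, v in attributes.items()]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = TagIndex()
index.add("img_001.jpg", year=2009, month="August", activity="scuba", person=["Ann"])
index.add("img_002.jpg", year=2009, month="August", activity="scuba", person=["Ann", "Bob"])
index.add("img_003.jpg", year=2010, month="May", activity="hiking", person=["Bob"])

print(index.query(activity="scuba"))         # ['img_001.jpg', 'img_002.jpg']
print(index.query(person="Bob"))             # ['img_002.jpg', 'img_003.jpg']
print(index.query(person="Bob", year=2009))  # ['img_002.jpg']
```

Date, person and activity are equal dimensions here; none of them has to be chosen as the primary folder structure, and no symbolic links are needed to cut across them.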

Microsoft, for example, also once made an effort to develop something similar: http://de.wikipedia.org/wiki/WinFS The project failed, and some of its features can be found today in SQL Server, for example.

almereyda · 31 May 2014 at 00:32

I would like us to think about experimenting with http://ceph.com/ or http://basho.com/riak/ sometime, within the scope of allmende.io.

almereyda · 4 June 2014 at 13:33

A somewhat more detailed roundup of the topic, ten pages long, is available at Admin Magazin, comparing GlusterFS and Ceph. git-annex and Tahoe-LAFS should not go unmentioned, though.

I also find Pydio exciting, by the way, since it supports multiple backends.

samuel · 4 June 2014 at 23:17

To these I would add: XtreemFS & OriFS

almereyda · 6 June 2014 at 00:29

And also Camlistore

almereyda · 9 June 2014 at 20:56

That somehow relates to http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html.

almereyda · 13 June 2014 at 07:39

Yeah:

Found via:

samuel · 13 June 2014 at 11:32

This looks really interesting!

almereyda · 25 June 2014 at 20:06

Have we mentioned Nimbus.IO already?

almereyda · 30 June 2014 at 16:08

In the end, if we ever integrate one of those object-based storage systems behind NDN as the federation engine (and Telehash to WebFist through firewalls), that would be a big leap towards socially aware cloud storage [via]

In a sense, we could even call the Federated Wiki an object storage system, but we’re mostly referring to BLOB data instead of structured data here, right?

almereyda · 7 August 2014 at 01:45

Never stop, this time:

Sheepdog: S3- and Swift-compatible API, plus block and object abstraction layers for QEMU.

almereyda · 9 August 2014 at 00:34

RFC 6392: A Survey of In-Network Storage Systems

almereyda · 18 November 2014 at 22:36

https://twitter.com/NewzSec/status/534450978494619649

Now that BitTorrent Sync can be considered compromised, I have come back to investigating distributed storage systems. Some new findings:

- Someone at TU Graz is working with large amounts of distributed data. @species, maybe you want to contact him? And perhaps also Markus Lanthaler, once the time is right?
- LizardFS, a MooseFS fork
- RozoFS from Nantes, which relies on the Mojette Transform algorithm.
- This great shootout on Ceph, Sheepdog and GlusterFS is fun to read. If you want object storage, why not use Swift?

Anything else to mention? Yes. I may build on Ceph or RozoFS.

almereyda · 31 May 2015 at 21:59

These experiments on docker-ipfs may shed new light on some of the questions here.

almereyda · 19 June 2017 at 11:06

There is also a Ceph frontend called https://rook.io/

almereyda · 3 August 2017 at 16:28