πFS
Check out https://github.com/philipl/inferencefs/ for the latest in data-free filesystems! πfs: Never worry about data again! πfs is a revolutionary new file system that, instead of wasting space storing your data on your hard drive, stores your data in π! You’ll never run out of space again - π holds every file that could possibly exist! They said 100% compression was impossible? You’re looking at it!
πfs is dead simple to build: First, install the autoconf, automake, and libfuse packages on your system. For example, on Debian:
sudo apt-get install autotools-dev
sudo apt-get install automake
sudo apt-get install libfuse-dev
Then build and install πfs:
./autogen.sh
./configure
make
make install
πfs is dead simple to use:
πfs -o mdd=<metadata directory> <mountpoint>
Here, the metadata directory is where πfs should store its metadata (such as filenames or the locations of your files in π), and mountpoint is your usual filesystem mountpoint.
What does π have to do with my data? π (or pi) is one of the most important constants in mathematics and has a variety of interesting properties (which you can read about on Wikipedia). One property that π is conjectured to have is that it is normal, which is to say that every finite block of digits appears as often as every other block of the same length. This implies that its digits form a disjunctive sequence, meaning that every possible finite sequence of digits is present somewhere in it. If we consider π in base 16 (hexadecimal), it is easy to see that if this conjecture is true, then every possible finite file must exist within π. The first record of this observation dates back to 2001.
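For readers who want the conjecture stated precisely, the standard definition of a normal number can be written as below. The symbols x, b, w, k, and N_x(w, n) are notation chosen for this note, not something taken from πfs.

```latex
% x is normal in base b if, for every k >= 1 and every digit string w of length k,
\lim_{n \to \infty} \frac{N_x(w, n)}{n} = \frac{1}{b^{k}},
% where N_x(w, n) counts the occurrences of w among the first n base-b digits of x.
```

With b = 16, a file of m bytes is a string of k = 2m hexadecimal digits. A positive limiting frequency means that string must occur somewhere in the expansion, provided π really is normal in base 16, which remains unproven.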
From here, it is a small leap to ask: if π contains all possible files, why are we wasting exabytes of space storing those files when we could just look them up in π? Every file that could possibly exist? That’s right! Every file you’ve ever created, or anyone else has created or will create! Copyright infringement? It’s just a few digits of π! They were always there!
But how do I look up my data in π? As long as you know the index of your file in π and its length, it’s a simple task to extract the file using the Bailey–Borwein–Plouffe formula. Similarly, you can use the formula to find the index of your file in the first place. Now, we all know that it can take a while to find a long sequence of digits in π, so for practical reasons, we should break the files up into smaller chunks that can be more readily found. In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.
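As a rough illustration of the lookup described above, the following Python sketch extracts hexadecimal digits of π at an arbitrary position with the Bailey–Borwein–Plouffe formula and then searches for the first position of a given byte. It is not πfs's own code: the names `pi_hex_digit`, `pi_byte`, and `find_byte` are made up for this example, positions are assumed to count hex digits after the point starting at 0, and a byte is assumed to be two consecutive hex digits. Because it relies on double-precision floats, it should only be trusted for modest positions.

```python
def _bbp_series(j: int, n: int) -> float:
    """Fractional part of the sum over k >= 0 of 16**(n - k) / (8*k + j)."""
    total = 0.0
    # Terms with k <= n: modular exponentiation keeps the numerators small.
    for k in range(n + 1):
        denom = 8 * k + j
        total = (total + pow(16, n - k, denom) / denom) % 1.0
    # Terms with k > n shrink quickly; stop once they no longer matter.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        total += term
        k += 1
    return total % 1.0


def pi_hex_digit(n: int) -> int:
    """Hex digit of pi at 0-based position n after the point (BBP formula)."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return int(x * 16)


def pi_byte(index: int) -> int:
    """Byte formed by the two hex digits starting at the given position."""
    return (pi_hex_digit(index) << 4) | pi_hex_digit(index + 1)


def find_byte(value: int, limit: int = 4096) -> int:
    """First position below `limit` whose two hex digits equal `value`."""
    for index in range(limit):
        if pi_byte(index) == value:
            return index
    raise ValueError(f"byte {value:#04x} not found in the first {limit} positions")


if __name__ == "__main__":
    # pi = 3.243F6A88... in hexadecimal
    print("".join(format(pi_hex_digit(i), "x") for i in range(8)))
```

Each digit costs time proportional to its position, so a linear search like `find_byte` grows expensive as the positions get larger.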
So I’ve looked up my bytes in π, but how do I remember where they are? Well, you’ve obviously got to write them down somewhere; you could use a piece of paper, but remember all that storage space we saved by moving our data into π? Why don’t we store our file locations there!?! Even better, the location of our files in π is metadata, and as we all know, metadata is becoming more and more important in everything we do. Doesn’t it feel great to have generated so much metadata? Why waste time with old-fashioned data when you can just deal with metadata, and lots of it!
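To make the bookkeeping concrete, here is a hedged Python sketch of how per-byte locations could be recorded and replayed. It does not reproduce πfs's actual metadata format, which this README does not describe: the list-of-offsets layout and the names `pi_hex_prefix`, `encode`, and `decode` are assumptions for illustration. To keep the example fast, it also swaps the Bailey–Borwein–Plouffe lookup for a precomputed prefix of π's hex digits, obtained from Machin's formula in integer arithmetic, and assumes every byte value it needs occurs within that prefix.

```python
from functools import lru_cache
from typing import Iterable, List

GUARD_DIGITS = 8  # extra hex digits that absorb truncation error


def _arctan_inverse(x: int, scale: int) -> int:
    """arctan(1/x) * scale, from the alternating Taylor series."""
    power = scale // x  # scale / x**1, the first term of the series
    total = power
    x_squared = x * x
    sign = -1
    k = 3
    while power:
        power //= x_squared
        total += sign * (power // k)
        sign = -sign
        k += 2
    return total


@lru_cache(maxsize=None)
def pi_hex_prefix(n_digits: int = 8192) -> str:
    """First n_digits hex digits of pi after the point, as a lowercase string."""
    scale = 16 ** (n_digits + GUARD_DIGITS)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * _arctan_inverse(5, scale) - 4 * _arctan_inverse(239, scale)
    fraction = pi_scaled - 3 * scale
    return format(fraction, "x").zfill(n_digits + GUARD_DIGITS)[:n_digits]


def encode(data: bytes) -> List[int]:
    """Replace each byte with the first hex-digit offset where it occurs in pi."""
    digits = pi_hex_prefix()
    offsets = []
    for byte in data:
        offset = digits.find(format(byte, "02x"))
        if offset < 0:
            raise ValueError(f"byte {byte:#04x} not found in the precomputed prefix")
        offsets.append(offset)
    return offsets


def decode(offsets: Iterable[int]) -> bytes:
    """Rebuild the data by reading two hex digits at each recorded offset."""
    digits = pi_hex_prefix()
    return bytes(int(digits[o:o + 2], 16) for o in offsets)


if __name__ == "__main__":
    metadata = encode(b"hello")
    print(metadata, decode(metadata))
```

Every byte is replaced by an integer offset, and an offset can exceed 255, so under these assumptions the metadata may need more room than the data it describes.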
Yeah, but what happens if I lose my file locations? No problem, the locations are just metadata! Your files are still there, sitting in π - they’re never going away, are they?
Why is this thing so slow? It took me five minutes to store a 400-line text file! Well, this is just an initial prototype, and don’t worry, there’s always Moore’s law!
Where do we go from here? There’s lots of potential for the future! Variable run length search and lookup! Arithmetic Coding! Parallelizable lookup! Cloud based π lookup! πfs for Hadoop!