Advantages of Kosmix's KFS vs. HDFS

A post about KFS vs. HDFS

October 02, 2007

I was excited to learn last week that my friends at Kosmix have decided to open source a project long in the works: the Kosmix Distributed File System, or KFS (see the official blog post). A number of people have commented on this release, including Ethan Stock of zVents, who plans to use KFS along with their HyperTable clone of BigTable, and Rich Skrenta, who gives an excellent list of features of KFS.

Now, as a dumb product manager, my biggest questions were about KFS vs. HDFS, the distributed file system built by the Hadoop project. Powerset already makes extensive use of the Hadoop stack, including HDFS. So I asked Sriram Rao, the lead engineer of KFS, if he could explain to me what the difference is between HDFS and KFS. Here are some of his answers, which I think give more insight into why Kosmix chose to build KFS.

So why did Kosmix build KFS instead of using HDFS? Apparently, KFS and HDFS were developed in parallel. KFS was implemented during 2006-2007, and Kosmix now feels it is in a releasable state. One of the reasons to stick with KFS over HDFS is that HDFS is written in Java while Kosmix's back-end is written in C++, and they were worried about the speed of the JNI interface.
File writing - HDFS files are written once and read many times. When writing a file, you have to write from the start to the end, and that is it. Conversely, in KFS you can write to a file as many times as you want, write anywhere in the file (i.e., seek and write), and append to an existing file. I've heard that Yahoo is working to fix this problem in HDFS, but it still isn't implemented.
Data integrity - Currently, with HDFS, after you write to a file, the data becomes “visible” to other apps only when the application closes the file. So, if the process were to crash before closing, the data written is lost. With KFS, the data becomes visible when it gets pushed out to the chunkservers. For performance, clients cache data; when the cache is full or when the application chooses, the data gets flushed out.
Data rebalancing - KFS has rudimentary support for automatic rebalancing. When you add new nodes or space utilization changes among nodes, the system may migrate chunks from over-utilized nodes to under-utilized nodes. HDFS doesn't have such support now.
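
To make the rebalancing idea concrete, here is a toy sketch in Python. It is not KFS's actual policy, which Sriram didn't spell out; the `Node` class, the `rebalance` function, and the `tolerance` parameter are all hypothetical names made up for illustration. It simply keeps moving one chunk at a time from the most-utilized node to the least-utilized one until the gap is small, and it ignores replication and network cost entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical storage node: a name, a capacity in chunk slots, and its chunks."""
    name: str
    capacity: int
    chunks: list = field(default_factory=list)

    @property
    def utilization(self) -> float:
        # Fraction of this node's chunk slots currently in use.
        return len(self.chunks) / self.capacity


def rebalance(nodes, tolerance=0.1):
    """Greedy toy rebalancer (an assumption, not the KFS algorithm).

    Repeatedly moves one chunk from the most-utilized node to the
    least-utilized node. Stops when the utilization gap is within
    `tolerance`, or when one more move would overshoot and leave the
    destination fuller than the source.
    Returns the list of (chunk, source_name, destination_name) moves.
    """
    moves = []
    while True:
        src = max(nodes, key=lambda n: n.utilization)
        dst = min(nodes, key=lambda n: n.utilization)
        if src.utilization - dst.utilization <= tolerance:
            break
        # Utilizations that would result from moving a single chunk.
        new_src = (len(src.chunks) - 1) / src.capacity
        new_dst = (len(dst.chunks) + 1) / dst.capacity
        if new_dst > new_src:
            break  # the move would not improve the balance
        chunk = src.chunks.pop()
        dst.chunks.append(chunk)
        moves.append((chunk, src.name, dst.name))
    return moves


if __name__ == "__main__":
    # Two existing nodes plus a freshly added, empty one.
    cluster = [
        Node("a", 10, [f"chunk-{i}" for i in range(8)]),
        Node("b", 10, ["chunk-8", "chunk-9"]),
        Node("c", 10),
    ]
    for chunk, src, dst in rebalance(cluster):
        print(f"move {chunk}: {src} -> {dst}")
```

A real system would also have to respect replica placement and throttle migration traffic; the sketch only shows the "drain the fullest, fill the emptiest" loop.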

Hopefully I transcribed these accurately! Definitely check out the KFS project, as the more people contributing, the better. Powerset will be evaluating KFS in the coming weeks to see if it has any features that can propel us ahead of using HDFS.

2009-07-21 17:12
Category: open-source software