flashcache – Robin on Linux

Books I read in year 2015

The first book is about network hardware, such as routers and switches. As a coder, I usually use servers in the cloud, so I haven’t seen real high-performance routers (I have only seen bare-metal servers and 1Gb switches). This book opened my eyes.
The second book is about how to build a datacenter. It’s really a job for architects, not IT guys.
About two years ago, I worked with the MySQL team in my company as a kernel developer. We used PCIe NAND flash cards and flashcache as our solution for MySQL to handle high-throughput pressure. But it was not until this year that I read through the architecture of the InnoDB engine, which is the most powerful and effective engine in MySQL. Actually, it’s not so difficult to get an overview of the InnoDB engine from a book. But it is still very hard to understand its code 🙂
I didn’t go to the cinema to watch “The Martian” because I had already read it on my Kindle during my daily commute. It is really a sci-fi story for geeks who like doing research in computing, chemistry, physics, etc. The only question I want to ask the author is: “How could you invent so many troubles on Mars to torture Mark Watney?”

China Linux Storage & Filesystem 2015 workshop (second day)

Zheng Liu from Alibaba led the topic on ext4. The most important change in the ext-series filesystems this year is that ext3 is gone: in the latest kernel, people can only use ext3 by mounting it through ext4 with special arguments (actually, this is already so in CentOS 7.0). The encryption feature has been completed in ext4.
Robin Dong (yes, it’s me) from Alibaba gave a presentation about cold storage (Slide is here). We developed a distributed storage system based on a small open-source project called “sheepdog”, and modified it heavily to improve data recovery performance and to make sure it could run on low-end but high-density storage servers.

Discussion in tea break

Yanhai Zhu from Alibaba (we have done so much work on storage) led a topic about caching in virtual machine environments. Alibaba chose Bcache as the code base for developing new cache software.
Robin: Why Bcache? Why not flashcache?
Yanhai: I started my work on flashcache first, but flashcache is not suitable for the production environment. First, flashcache is unfriendly to sequential writes. Second, it uses a hash data structure to distribute IO requests at the beginning, which splits up the cache data in a multi-tenant environment. Bcache uses a B-tree instead of a hash table to store data, which fits our requirements better.
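To make Yanhai's point a bit more concrete, here is a toy sketch contrasting the two index styles. It is not flashcache or Bcache code: `hash_slot`, `SortedIndex`, and `NUM_SLOTS` are names I made up for illustration, and a sorted list stands in for a real B-tree. The idea is only that hashing scatters a tenant's consecutive blocks across slots, while an ordered index keeps them next to each other so they can be found as a range.

```python
import bisect
import hashlib

NUM_SLOTS = 64  # hypothetical number of cache slots, chosen for illustration


def hash_slot(tenant: str, block: int) -> int:
    """Place a block by hashing its key (toy model of a hash-indexed cache)."""
    digest = hashlib.md5(f"{tenant}:{block}".encode()).digest()
    return int.from_bytes(digest[:4], "little") % NUM_SLOTS


class SortedIndex:
    """Ordered index of (tenant, block) keys, standing in for a B-tree."""

    def __init__(self):
        self._keys = []  # kept sorted at all times

    def insert(self, tenant: str, block: int) -> None:
        key = (tenant, block)
        pos = bisect.bisect_left(self._keys, key)
        # Skip duplicates so each cached block appears once.
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def range(self, tenant: str, start: int, end: int):
        """Return one tenant's cached blocks in [start, end), in block order."""
        lo = bisect.bisect_left(self._keys, (tenant, start))
        hi = bisect.bisect_left(self._keys, (tenant, end))
        return [block for _, block in self._keys[lo:hi]]
```

With the ordered index, all cached blocks of one virtual machine in a given range come back from a single lookup, already in order. With `hash_slot`, neighbouring blocks of the same tenant usually land in unrelated slots.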
They use an aggressive write-back strategy on the cache. It works very well because the cache sequentializes the write IOs and makes it easy for the backend to absorb pressure peaks.
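The write-back idea can be sketched in a few lines. This is a minimal toy model under my own assumptions, not the Alibaba or Bcache implementation: the class name `WriteBackCache`, the dict used as a "backend", and the `flush_log` list are all hypothetical. Random writes are acknowledged from the cache, and the backend later receives them in ascending block order.

```python
class WriteBackCache:
    """Toy write-back cache: absorb random writes, flush them in block order."""

    def __init__(self, backend: dict):
        self.backend = backend  # stands in for the slow backing device
        self.dirty = {}         # block number -> latest data not yet written back
        self.flush_log = []     # order in which blocks reached the backend

    def write(self, block: int, data: bytes) -> None:
        # Acknowledge immediately; the backend is not touched yet.
        # Rewriting the same block only replaces the cached copy.
        self.dirty[block] = data

    def read(self, block: int):
        # Dirty data in the cache is newer than whatever the backend holds.
        if block in self.dirty:
            return self.dirty[block]
        return self.backend.get(block)

    def flush(self) -> None:
        # Write back in ascending block order, so the backend sees
        # a sequential stream instead of the original random pattern.
        for block in sorted(self.dirty):
            self.backend[block] = self.dirty[block]
            self.flush_log.append(block)
        self.dirty.clear()
```

For example, writing blocks 42, 7, 19, 7, 3 and then calling `flush()` sends blocks 3, 7, 19, 42 to the backend, with block 7 written only once.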
The last topic was led by Zhongjie Wu from Memblaze, a famous startup in China working on flash storage technology. It was about NVDIMM, the hottest hardware technology in recent years. An NVDIMM is not expensive; it is just a DDR DIMM with a capacitor. Memblaze has developed a new 1U storage server with an NVDIMM and many flash cards. It runs their own OS and can use Fibre Channel/Ethernet to connect to clients. The main purpose of the NVDIMM is to reduce latency, and they use a write-back strategy (surely).
The big problem they face with NVDIMM is that the CPU can’t flush the data in its L1 cache to the NVDIMM when the whole server loses power. To solve this problem, Memblaze uses write-combining across the CPU cores; it hurts performance a little but ultimately avoids data loss.

Everyone at CLSF 2015

Articles from other attendees:
https://blogs.oracle.com/linuxkernel/entry/china_linux_storage_and_file1