D-thinker combines many physical hosts in a datacenter into one large virtual machine through a new instruction set architecture (ISA), and provides programmers with system software for datacenter-scale programming, such as programming languages, compilers, and application frameworks.
In performance comparisons on sort, K-means, and logistic regression, D-thinker is 2 to 72 times faster than Hadoop and 1.6 to 5 times faster than Spark.
D-thinker provides technologies and tools for data storage, indexing, data mining, and machine learning on datasets ranging from gigabytes to petabytes.
- Big data indexing and querying.
- Efficient graph processing.
- SQL subset (to be supported).
- Consistent data updates.
- BSP programming framework (also supports Pregel-style programs).
- Rich programming libraries/frameworks for data mining and machine learning.
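
To make the BSP and Pregel-style programming model in the list above more concrete, here is a minimal single-process sketch in Python. It is not D-thinker's API: the names `run_bsp` and `min_label` are hypothetical and chosen only for illustration. Each superstep runs a per-vertex compute function, buffers the outgoing messages, and delivers them only after a barrier; a vertex stays inactive until it receives a message. The example labels connected components by propagating the smallest vertex id.

```python
from collections import defaultdict


def run_bsp(graph, init_value, compute, max_supersteps=50):
    """Simulate a vertex-centric BSP loop on one process (illustration only).

    graph:      dict mapping each vertex to a list of its neighbours
    init_value: function vertex -> initial value
    compute:    function (superstep, vertex, value, messages, neighbours)
                returning (new_value, [(target, message), ...])
    """
    values = {v: init_value(v) for v in graph}
    inbox = {v: [] for v in graph}
    active = set(graph)  # every vertex is active in superstep 0
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = defaultdict(list)
        for v in active:
            new_value, messages = compute(
                superstep, v, values[v], inbox[v], graph[v]
            )
            values[v] = new_value
            for target, message in messages:
                outbox[target].append(message)
        # Barrier: messages sent in this superstep become visible only now.
        inbox = {v: outbox.get(v, []) for v in graph}
        # A vertex is woken up again only if it received a message.
        active = {v for v in graph if inbox[v]}
        superstep += 1
    return values


def min_label(superstep, vertex, value, messages, neighbours):
    """Adopt the smallest label seen so far; notify neighbours on change."""
    candidate = min([value] + messages)
    if superstep == 0 or candidate < value:
        return candidate, [(n, candidate) for n in neighbours]
    return value, []  # no change: send nothing and go idle


if __name__ == "__main__":
    # Undirected graph with two components: {1, 2, 3} and {4, 5}.
    graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
    print(run_bsp(graph, lambda v: v, min_label))
```

In a real distributed setting the vertices would be partitioned across hosts and the barrier would be a cluster-wide synchronization point; the sketch keeps everything in one process to show only the control flow.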