정보실

웹학교

정보실

기타 BigData 분산 프로그래밍

본문

  • AddThis Hydra - distributed data processing and storage system originally developed at AddThis.
  • AMPLab SIMR - run Spark on Hadoop MapReduce v1.
  • Apache APEX - a unified, enterprise platform for big data stream and batch processing.
  • Apache Beam - an unified model and set of language-specific SDKs for defining and executing data processing workflows.
  • Apache Crunch - a simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce.
  • Apache DataFu - collection of user-defined functions for Hadoop and Pig developed by LinkedIn.
  • Apache Flink - high-performance runtime, and automatic program optimization.
  • Apache Gearpump - real-time big data streaming engine based on Akka.
  • Apache Gora - framework for in-memory data model and persistence.
  • Apache Hama - BSP (Bulk Synchronous Parallel) computing framework.
  • Apache MapReduce - programming model for processing large data sets with a parallel, distributed algorithm on a cluster.
  • Apache Pig - high level language to express data analysis programs for Hadoop.
  • Apache REEF - retainable evaluator execution framework to simplify and unify the lower layers of big data systems.
  • Apache S4 - framework for stream processing, implementation of S4.
  • Apache Spark - framework for in-memory cluster computing.
  • Apache Spark Streaming - framework for stream processing, part of Spark.
  • Apache Storm - framework for stream processing by Twitter also on YARN.
  • Apache Samza - stream processing framework, based on Kafka and YARN.
  • Apache Tez - application framework for executing a complex DAG (directed acyclic graph) of tasks, built on YARN.
  • Apache Twill - abstraction over YARN that reduces the complexity of developing distributed applications.
  • Baidu Bigflow - an interface that allows for writing distributed computing programs providing lots of simple, flexible, powerful APIs to easily handle data of any scale.
  • Cascalog - data processing and querying library.
  • Cheetah - High Performance, Custom Data Warehouse on Top of MapReduce.
  • Concurrent Cascading - framework for data management/analytics on Hadoop.
  • Damballa Parkour - MapReduce library for Clojure.
  • Datasalt Pangool - alternative MapReduce paradigm.
  • DataTorrent StrAM - real-time engine is designed to enable distributed, asynchronous, real time in-memory big-data computations in as unblocked a way as possible, with minimal overhead and impact on performance.
  • Facebook Corona - Hadoop enhancement which removes single point of failure.
  • Facebook Peregrine - Map Reduce framework.
  • Facebook Scuba - distributed in-memory datastore.
  • Google Dataflow - create data pipelines to help themæingest, transform and analyze data.
  • Google MapReduce - map reduce framework.
  • Google MillWheel - fault tolerant stream processing framework.
  • IBM Streams - platform for distributed processing and real-time analytics. Provides toolkits for advanced analytics like geospatial, time series, etc. out of the box.
  • JAQL - declarative programming language for working with structured, semi-structured and unstructured data.
  • Kite - is a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem.
  • Metamarkets Druid - framework for real-time analysis of large datasets.
  • Netflix PigPen - map-reduce for Clojure which compiles to Apache Pig.
  • Nokia Disco - MapReduce framework developed by Nokia.
  • Onyx - Distributed computation for the cloud.
  • Pinterest Pinlater - asynchronous job execution system.
  • Pydoop - Python MapReduce and HDFS API for Hadoop.
  • Ray - A fast and simple framework for building and running distributed applications.
  • Rackerlabs Blueflood - multi-tenant distributed metric processing system
  • Skale - High performance distributed data processing in NodeJS.
  • Stratosphere - general purpose cluster computing framework.
  • Streamdrill - useful for counting activities of event streams over different time windows and finding the most active one.
  • streamsx.topology - Libraries to enable building IBM Streams application in Java, Python or Scala.
  • Tuktu - Easy-to-use platform for batch and streaming computation, built using Scala, Akka and Play!
  • Twitter Heron - Heron is a realtime, distributed, fault-tolerant stream processing engine from Twitter replacing Storm.
  • Twitter Scalding - Scala library for Map Reduce jobs, built on Cascading.
  • Twitter Summingbird - Streaming MapReduce with Scalding and Storm, by Twitter.
  • Twitter TSAR - TimeSeries AggregatoR by Twitter.
  • Wallaroo - The ultrafast and elastic data processing engine. Big or fast data - no fuss, no Java needed.
  • 트위터로 보내기
  • 페이스북으로 보내기
  • 구글플러스로 보내기
  • 카카오톡으로 보내기

페이지 정보

조회 18회 ]  작성일19-11-14 15:48

웹학교