Architecture overview v2
Hadoop is a framework that allows you to store a large data set in a distributed file system.
The Hadoop data wrapper provides an interface between a Hadoop file system and a Postgres database. The Hadoop data wrapper transforms a Postgres SELECT
statement into a query that is understood by the HiveQL or Spark SQL interface.
When possible, the foreign data wrapper asks the Hive or Spark server to perform the actions associated with the WHERE
clause of a SELECT
statement. Pushing down the WHERE
clause improves performance by decreasing the amount of data moving across the network.