Spark’s primary abstraction is a distributed collection of items called a Dataset. Datasets can be created from Hadoop InputFormats (such as HDFS files) or by transforming other Datasets.