Commit 710f62f

Merge pull request #2981 from cjgronlund/cleanup-cg
delete giraph article; fix HDI intro TOC
2 parents 8176977 + 9790465 commit 710f62f

2 files changed: +7 -234 lines changed

articles/hdinsight-giraph.md

-219
This file was deleted.

articles/hdinsight-hadoop-introduction.md

+7 -15
@@ -1,5 +1,5 @@
 <properties
-pageTitle="Introduction to Hadoop in HDInsight: Big data analysis in the cloud | Azure"
+pageTitle="Introduction to Hadoop in the cloud: Big data analysis | Azure"
 description="An introduction to the Hadoop components on HDInsight. Learn how HDInsight uses Hadoop clusters in the cloud to manage, analyze, and report on big data."
 services="hdinsight"
 documentationCenter=""
@@ -19,33 +19,25 @@
 
 # Introduction to Hadoop in HDInsight: Big-data processing and analysis in the cloud
 
-Get an introduction to the Hadoop ecosystem in Azure HDInsight - components, common terminology, and scenarios. Also, find out about tutorials and resources for using Hadoop in HDInsight.
+Get an introduction to the Hadoop ecosystem in Azure HDInsight - components, common terminology, and solutions. Also, find out about documentation, tutorials, and resources for using Hadoop in HDInsight.
+
+## What is Hadoop in HDInsight?
 
 Azure HDInsight deploys and provisions Apache Hadoop clusters in the cloud, providing a software framework designed to manage, analyze, and report on big data. The Hadoop core provides reliable data storage with the Hadoop Distributed File System (HDFS), and a simple MapReduce programming model to process and analyze, in parallel, the data stored in this distributed system.
 
 
-### What is big data?
+## What is big data?
 Big data refers to data being collected in ever-escalating volumes, at increasingly high velocities, and for a widening variety of unstructured formats and variable semantic contexts.
 
 Big data describes any large body of digital information, from the text in a Twitter feed, to the sensor information from industrial equipment, to information about customer browsing and purchases on an online catalog. Big data can be historical (meaning stored data) or real-time (meaning streamed directly from the source).
 
 For big data to provide actionable intelligence or insight, not only must the right questions be asked and data be relevant to the issues be collected, the data must be accessible, cleaned, analyzed, and then presented in a useful way. That's where Hadoop in HDInsight can help.
 
-## In this article
-
-This article provides an overview of Hadoop on HDInsight, including:
-
-* **[Overview of the Hadoop ecosystem on HDInsight](#overview)** - HDInsight is the Hadoop solution on Azure and provides implementations of Storm, HBase, Pig, Hive, Sqoop, Oozie, Ambari, and so on. HDInsight also integrates with business intelligence (BI) tools such as Excel, SQL Server Analysis Services, and SQL Server Reporting Services.
-
-* **[Advantages of Hadoop in the cloud](#advantage)** - Reasons you should consider the HDInsight cloud implementation of Hadoop.
-
-* **[HDInsight solutions for big-data analysis](#solutions)** - Some practical ways you can use HDInsight to answer questions for your organization, from analyzing Twitter sentiment to analyzing HVAC system effectiveness.
-
-* **[Resources for learning more about big-data analysis, Hadoop, and HDInsight](#resources)** - Links to additional information.
 
 ## <a name="overview"></a>Overview of the Hadoop ecosystem on HDInsight
 
-Apache Hadoop is the rapidly expanding technology stack that is the go-to solution for big-data analysis. HDInsight is the framework for the Microsoft Azure cloud implementation of Hadoop.
+Apache Hadoop is the rapidly expanding technology stack that is the go-to solution for big-data analysis. HDInsight is framework for the Microsoft Azure cloud implementation of Hadoop. It includes implementations of Storm, HBase, Pig, Hive, Sqoop, Oozie, Ambari, and so on. HDInsight also integrates with business intelligence (BI) tools such as Excel, SQL Server Analysis Services, and SQL Server Reporting Services.
+
 
 * Azure HDInsight deploys and provisions Hadoop clusters in the cloud, by using either **Linux** or **Windows** as the underlying OS.
 